Releasing TUNET - an ML training tool

Check my post; attached is my hot-rodded version (I probably broke something).

I messed with it a bunch to get step times down, especially the data-read part. There's also slicer.py, my version of a pre-slicer that takes care of the cropping ahead of time so we don't have to do it at runtime.
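For anyone curious, this is roughly the idea (a minimal sketch, not the actual slicer.py; the paired src/gt folder layout and tile size are my assumptions):

```python
# Rough sketch of a dataset pre-slicer: cut matching source / ground-truth
# frames into fixed tiles on disk so the training loop doesn't crop at runtime.
# NOT the actual slicer.py -- folder layout and tile size are assumptions.
import os
from PIL import Image

TILE = 512  # assumed training crop size

def slice_pair(src_path, gt_path, out_dir):
    os.makedirs(out_dir, exist_ok=True)
    src, gt = Image.open(src_path), Image.open(gt_path)
    w, h = src.size
    base = os.path.splitext(os.path.basename(src_path))[0]
    n = 0
    for y in range(0, h - TILE + 1, TILE):
        for x in range(0, w - TILE + 1, TILE):
            box = (x, y, x + TILE, y + TILE)
            src.crop(box).save(os.path.join(out_dir, f"{base}_{n:04d}_src.png"))
            gt.crop(box).save(os.path.join(out_dir, f"{base}_{n:04d}_gt.png"))
            n += 1
```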

23:29:28 [INFO] Epoch[3601] Step[1800100] (100/500), L1:0.0008(Avg:0.0009), LR:1.0e-04, T/Step:0.592s (D:0.113 T:0.002 C:0.490)

That's on a single 4090.


Nice. Yes, exactly that @cnoellert @mlheureux

It was recently discovered that the Cosmos family of models released by Nvidia are great video foundation models because of the way they are trained; one use case is extracting AOV passes from videos. It's a very new technique that will (I think) soon be out of the labs.
TUNET was specifically combined with those techniques.

It was not my intent to share pre-trained models that could work on in-the-wild videos; that's why they won't (if I did things right) work outside the example shots I built. They may try, but they won't be able to.

You can have a look at the most recent research if you want to learn more and try to reproduce it on your end. Sadly, no code or repo is out yet.


Am I the only one who doesn’t understand why noise pattern reduction is not a bedrock process?

Particularly because diffusion training is based on adding noise and then asking machine learning tools to find the cat.


Thanks for the information! Good to understand what's possible in the future, nonetheless!

Reading through this discussion, it seems the exported (ONNX) Flame inference should give similar results to the Nuke inference.
I'm getting very different results: the Flame inference gives something of very poor quality, while it looks fine in Nuke.
Is there something about ONNX and input resolution? Or other things to consider?

Exported and tested at epoch 110, then 405; trained on Windows (4090) with 10-frame datasets (the latest might actually be 20).
1164x1580 crops for training and checking.
Attaching screenshots.

1 - source
2 - ground truth
3 - Nuke inference
4 - Flame inference

For info, I started the test trying to add stuff to the source and got the same discrepancies; then I swapped things to test again, so the image with the “after” between the eyebrows became the before (source) footage.

Yes, Flame does not do tiled inference the way regular PyTorch and Nuke's inference do; Flame does a full resize.
I've opened a ticket with Autodesk (FP-03392) to add a tiled inference option, so we could choose between full pre-resize and tiled; that way it would work the same.
The current workaround is to crop the image to the same resolution as training and infer on the crop.

https://raw.githubusercontent.com/obss/sahi/main/resources/sliced_inference.gif
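For anyone wondering what tiled inference means in practice, here's a minimal sketch (assumes a PyTorch model, a CHW float tensor, and sides that divide evenly by the tile size; real implementations add overlap and blending):

```python
# Minimal tiled-inference sketch: run the model patch by patch at the
# training resolution instead of resizing the whole frame.
# Assumes model(input) maps (1, C, t, t) -> (1, C, t, t); no overlap blending.
import torch

def infer_tiled(model, img, tile=512):
    c, h, w = img.shape  # (C, H, W); H and W assumed divisible by tile
    out = torch.zeros_like(img)
    with torch.no_grad():
        for y in range(0, h, tile):
            for x in range(0, w, tile):
                patch = img[:, y:y+tile, x:x+tile].unsqueeze(0)
                out[:, y:y+tile, x:x+tile] = model(patch).squeeze(0)
    return out
```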


@tpo - I definitely upvoted it


Is that the torch-to-ONNX export limitation?
I was reading about the dynamic shapes problem (and ran into it myself) and stumbled upon this: GitHub - fabio-sim/Depth-Anything-ONNX: ONNX-compatible Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
The dynamo dynamic shapes stuff?
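For reference, the classic way to bake variable height/width into an ONNX export is the dynamic_axes argument of torch.onnx.export; a minimal sketch with a stand-in one-layer model (whether Flame's inference then honours dynamic shapes is a separate question):

```python
# Sketch: exporting with dynamic spatial axes so the ONNX graph accepts
# any H/W. The Conv2d stand-in is just here to make the example runnable.
import torch

model = torch.nn.Conv2d(3, 3, kernel_size=3, padding=1)  # stand-in model
dummy = torch.randn(1, 3, 512, 512)                      # assumed 3-channel input

torch.onnx.export(
    model, dummy, "model.onnx",
    input_names=["input"], output_names=["output"],
    dynamic_axes={"input":  {2: "height", 3: "width"},
                  "output": {2: "height", 3: "width"}},
)
```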

The current workaround is to crop the image to the same resolution as training and infer on the crop.

Not sure I understand.
Split the dataset images into 512x512 crops, train with those, then run inference on other takes similarly cropped and recombine the results? I must be misunderstanding.

Pre-crop, train, run inference at the cropped size, then stitch back into full res. CopyCat handles the cropping to model size and the stitching back (effectively tiling the source before inference and rebuilding the tiles afterwards) automatically, whereas Flame does not; a rough sketch of the stitch-back is below.

There is also a noticeable performance increase from training at the cropped image size, as the risk of training on large areas of unaffected source is lessened dramatically.

At least this was my experience.
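For the curious, here's one common way the stitch-back can work, with a feathered blend over tile overlaps to hide seams (my sketch; not necessarily what CopyCat does internally):

```python
# Sketch: accumulate overlapping tiles with a pyramid weight window so
# seams blend instead of showing hard edges. Single-channel for brevity.
import numpy as np

def stitch(tiles, coords, h, w, tile=512):
    # tiles: list of (tile, tile) float arrays; coords: list of (y, x) origins
    acc = np.zeros((h, w), dtype=np.float32)
    weight = np.zeros((h, w), dtype=np.float32)
    ramp = np.minimum(np.arange(tile) + 1, np.arange(tile)[::-1] + 1)
    win = np.outer(ramp, ramp).astype(np.float32)  # peaks at the tile centre
    for t, (y, x) in zip(tiles, coords):
        acc[y:y+tile, x:x+tile] += t * win
        weight[y:y+tile, x:x+tile] += win
    return acc / np.maximum(weight, 1e-8)
```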


Thanks @cnoellert & @tpo.

Upvoted FP-03392


upvoted


To everybody (is there a handle for this?):
Please vote this up. As it stands today, this is a key step toward being able to use current ML/AI tools and tech in Flame, imho.
FP-03392


Here's the link, but it's a beta defect. It should be moved to being a general improvement.


If anyone is thinking about doing a tutorial on how to deploy this on some easy-to-use cloud system (I don't even know where to start with that), that would also be extremely fantastic. I'd love to be able to spin up some massive H200 node or whatever, but I lack the experience to know where to even start…


It's exactly the same thing. I have my own node locally, so it would be hard for me to rent one just to do a tutorial, but it's basically the same: remote in.

Make sure the node runs Ubuntu or Rocky with CUDA 12 and you're fine; it works out of the box.
TUNET is no friend of CUDA 11: it works, but expect degraded performance or minor issues, so book with CUDA 12+.
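If you want to sanity-check a rented node before committing, a couple of plain PyTorch calls will tell you what you're running on (assumes PyTorch is already installed):

```python
# Quick node sanity check: CUDA build, driver visibility, and GPU name.
import torch

print(torch.version.cuda)            # CUDA version PyTorch was built against
print(torch.cuda.is_available())     # True if the driver and a GPU are visible
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```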

I guess I'll just jump in at the deep end. What hardware should I be looking for in a rental node? I'm not that fluent with AI cards and their respective performance.

This was submitted as a defect. We have converted it to a General Improvement request:

FI-03536 Inference: Tiling support


Aah… that's why I couldn't find it… +1


Thank you, Fred! :raising_hands:

Thanks, this looks great… I thought I'd try it out on Windows, and got stuck here trying to start the training.

Any ideas?

Many Thanks

M