Releasing TUNET - An ML training tool

Yes, directly. A good way of thinking about it is:

Let's say you have a feature film where you need to change something on the main characters that requires roto.
So you will have hundreds of shots, millions of frames. With TUNET you can simply train on all of it at once with many, many example samples; for that, you need an extremely large batch size that would never fit on a single GPU.
To make sure the model has a chance to analyze enough samples per step, you choose a batch size of 64 per GPU; on an 8-GPU machine, you are effectively training with a batch size of 512.
That is never possible on a single GPU.
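As a rough sketch of that idea (not TUNET's actual code; the dummy dataset and model below are placeholders), this is how the per-GPU batch multiplies into the effective batch under PyTorch DistributedDataParallel:

```python
import torch
import torch.nn as nn
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler, TensorDataset

# Launch with: torchrun --nproc_per_node=8 this_script.py  (one process per GPU)
dist.init_process_group("nccl")
local_rank = dist.get_rank() % torch.cuda.device_count()
torch.cuda.set_device(local_rank)

# Dummy stand-ins for the real data/model, just to show the batching math.
dataset = TensorDataset(torch.randn(4096, 3), torch.randn(4096, 3))
model = DDP(nn.Linear(3, 3).cuda(), device_ids=[local_rank])

per_gpu_batch = 64
sampler = DistributedSampler(dataset)              # each rank sees a different shard
loader = DataLoader(dataset, batch_size=per_gpu_batch, sampler=sampler)

# Gradients are averaged across ranks every step, so the effective batch is
# per-GPU batch * number of GPUs: 64 * 8 = 512 on an 8-GPU machine.
effective_batch = per_gpu_batch * dist.get_world_size()
```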


3 Likes

These scripts are amazing, thanks @tpo! I wanted to build a UI for them to make them easier to use.

With the scripts below installed, you can right click a folder in the MediaHub (ideally a folder called “src” with a “dst” folder next to it) to bring up Tunet → Tunet UI. Simply pick your source and destination folders, if they haven’t been selected for you already. It’ll make the model folder automatically. Then, after the training is done, back in the MediaHub you can right click the checkpoint and hit Tunet → Convert Checkpoint and Import, which will run Thiago’s convert_flame script. Next, it waits for the ONNX to be generated and then automatically imports it into Batch as an Inference node.
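For anyone curious how the right-click integration side of this usually works, here is a bare-bones sketch of a Flame MediaHub custom-action hook. The hook name follows Flame's python-hook convention, but the launcher path and the attribute read off the selection are placeholders, not John's actual script:

```python
import subprocess

# Flame looks for this hook name in files placed in its python hooks folders.
def get_mediahub_files_custom_ui_actions():

    def launch_tunet_ui(selection):
        # "selection" is the list of items right-clicked in the MediaHub.
        # Attribute access here is illustrative; check Flame's hook docs for
        # the selection object API. The launcher path is hypothetical.
        paths = [str(item.path) for item in selection]
        subprocess.Popen(["python", "/opt/tunet/tunet_ui.py"] + paths)

    return [
        {
            "name": "Tunet",
            "actions": [
                {
                    "name": "Tunet UI",
                    "execute": launch_tunet_ui,
                },
            ],
        }
    ]
```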

If anyone wants to test these scripts, copy them wherever you keep your scripts, and then modify the last 3 lines in the tunet/config/config.json file. I used `find /usr -name conda.sh 2>/dev/null` in a terminal to get the conda_init path. I figured if you can install something from GitHub, modifying 3 lines of a JSON will be easy.

Obviously, you’ll need Tunet installed first. I don’t think the scripts are quite Logik Portal worthy just yet, but the plan is to get them on there soon.

I only have a UI for the “simple” YAML, but could build one for the “advanced” version if needed.

Lastly, I’m assuming this will only work on Linux. DM me if you have any issues.





tunet_ui.zip (769.1 KB)

12 Likes

Like I told you John, just amazing! You nailed it. So cool.

1 Like

Hey @ALan, I've updated the multi-gpu branch.

I changed things with PCIe GPUs in mind as well: how the weights are updated and how data is synced between GPUs. You should get better performance.
Now multi-GPU and single-GPU are merged into one, so you don't need anything else.

Tested on multiple 6000 Ada cards and it worked great: almost the same speed per step, but double the batch size.

Make sure to git clone from the multi-gpu branch; I'm keeping the main branch as the original for now.

Since I mainly use SXM cards, and those automatically deal with P2P between them, I ended up not paying attention to PCIe, but now that is fixed.
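Not necessarily what changed in the branch, but for anyone debugging multi-GPU on PCIe boxes where peer-to-peer misbehaves, one common knob is NCCL's standard NCCL_P2P_DISABLE environment variable, which makes NCCL fall back to copies through host memory:

```python
import os
import torch.distributed as dist

# NCCL_P2P_DISABLE is a standard NCCL environment variable; "1" tells NCCL to
# route GPU-to-GPU traffic through host memory instead of PCIe peer-to-peer.
os.environ["NCCL_P2P_DISABLE"] = "1"

# Assumes the script is launched via torchrun so the rendezvous env vars exist.
dist.init_process_group("nccl")
```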

3 Likes

@tpo does the training allow for isolation of some sort? An alpha in source that denotes where the learning should focus?

1 Like

Sorry Chris! No, currently it is Front (RGB) only training.
While the Tunet model will eventually learn the difference, it’s not guaranteed to focus on it quickly or efficiently, especially if the differing region is small relative to the fullframe.

So better to pre-crop them before feeding them in for training then. Thanks @tpo
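A minimal sketch of that pre-crop idea (the folders, file pattern and crop box below are made up, and this is not part of Tunet): cut the same region-of-interest box out of each src/dst pair before training, so the differing area fills more of the frame:

```python
from pathlib import Path
from PIL import Image

# Hypothetical folders and crop box (left, top, right, bottom) in pixels.
SRC, DST, OUT = Path("src"), Path("dst"), Path("cropped")
BOX = (800, 200, 1824, 1224)

for src_file in sorted(SRC.glob("*.png")):
    dst_file = DST / src_file.name
    for folder, f in (("src", src_file), ("dst", dst_file)):
        out_dir = OUT / folder
        out_dir.mkdir(parents=True, exist_ok=True)
        Image.open(f).crop(BOX).save(out_dir / f.name)
```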

1 Like

Amazing work @tpo @john-geehreng and all you script-heads out there. This is amazing work you’re doing

1 Like

About to give this a try on Windows / WSL2 Ubuntu, huge thanks!

EDIT: Just did a 30-minute train, enough to say: OH MY GOD THIAGO

4 Likes

Just to let you know, for people who want to try this on Windows: I just made it work. By default this does not work on Windows since it needs NCCL, which is a Linux-only lib, but I made it work by tweaking a bit of code and using PyTorch 2.3. I also think this can open the door to using it in Mac-only environments. I have to try, but I have no Macs around; I'm a PC guy.
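I can't speak to the exact tweak, but the general idea is that NCCL is Linux-only while the Windows builds of PyTorch ship the gloo backend, so backend selection roughly like this (illustrative only, not the actual patch) covers the distributed path:

```python
import platform
import torch.distributed as dist

# NCCL only exists on Linux; Windows builds of PyTorch ship gloo instead.
backend = "nccl" if platform.system() == "Linux" else "gloo"

# Assumes launch via torchrun so the rendezvous env vars are already set.
dist.init_process_group(backend=backend)

# Single-GPU Windows training can skip process groups entirely and just use cuda:0.
```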

See screenshot below. THIS IS NOT WSL, pure Windows. Windows 10.
Also, @tpo, thank you, thank you very much for this.

I could post it here if anyone wants to try it, but first I need permission from @tpo; he has the final word.

4 Likes

Make a pull request to his GitHub repo with the change.

It is so nice to see this, @cristhiancordoba @febreroflame, that is great.
Glad you liked it @febreroflame : )

I've got lots of messages after the release asking for a Mac version. Also from the Nuke community.

I'm working on a multi-OS version; Tunet will automatically detect the OS and just work. Same for the YAML: paths can be Windows, Mac or Linux.
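Not necessarily how Tunet implements it, but OS/device auto-detection in PyTorch usually boils down to something like this: CUDA where an NVIDIA GPU is visible, Metal (MPS) on Apple silicon, plain CPU otherwise:

```python
import platform
import torch

def pick_device() -> torch.device:
    # CUDA on Linux/Windows machines with an NVIDIA GPU, Metal (MPS) on Apple
    # silicon, plain CPU as the fallback.
    if torch.cuda.is_available():
        return torch.device("cuda")
    if torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")

print(platform.system(), pick_device())
```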

It is already working; just cleaning up some stuff and then I will release it.

Attached: running on native Windows conda.

6 Likes

Hey everyone! Cross-platform support is now live.
Git clone again and you should be good to go on any OS.

Multi-GPU is Linux only.

Benchmark:
On Mac, training is a bit meh, as expected:
Win with RTX 6000 Blackwell: 0.3 seconds per step
Mac with M1 Max: 3.9 seconds per step

Meaning a training that would take 1 week on the NVIDIA card would take 13 weeks on an Apple M1 Max.


5 Likes

@tpo Is there a way to implement the ability to resume training from the latest checkpoint? I was looking at the flags and didn't find one for this. Check Chris' comment below. Thanks

It does this automatically based on the naming I believe.
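For the curious, "based on the naming" usually means something like the sketch below (the file pattern here is a guess, not Tunet's actual scheme): sort the checkpoint files by the number embedded in the name and reload the newest one instead of starting from scratch.

```python
import re
from pathlib import Path
import torch

def latest_checkpoint(model_dir: str):
    # Assumes names like "tunet_step_012000.pth"; the pattern is hypothetical.
    def step_of(p: Path) -> int:
        nums = re.findall(r"\d+", p.stem)
        return int(nums[-1]) if nums else -1

    ckpts = sorted(Path(model_dir).glob("*.pth"), key=step_of)
    return ckpts[-1] if ckpts else None

ckpt = latest_checkpoint("model")
if ckpt is not None:
    state = torch.load(ckpt, map_location="cpu")   # resume from here instead of scratch
```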

2 Likes

Thank you Chris. You’re right! Fantastic!!!

I just tested this, and for some reason with the new/updated scripts the converted ONNX model is not working: it loads, but it just gives a solid-color output. Fortunately I had a backup of the previous scripts. With the previous ones it was working, but the output was quite blurry, as if it was low res. Nuke was working fine, but Flame was not.
Could anyone try this with the new/updated scripts?

Try starting a training from scratch with the new version. Mixing old and new is a no-go; I believe that is why.
Make sure the convert scripts are also updated.

You still have the older trainer, which I'm calling the legacy trainer; it is under the util folder. If you use that, it is the same as before.

@tpo for multi-gpu, do we still use the separate branch, or was that merged into master?

Thanks.

Separate branch: multi-gpu, Linux only. Same for the converters.