I’d like to give a bit more info on models and training in v0.4.5
When you run it in Flame, it has the ability to point to an arbitrary “.pth” file.
Those .pth files are standard PyTorch serialized dictionaries that can be saved and loaded and can contain basically any data. In the case of ML Timewarp, each file now contains the model weights (the “knowledge” it has, so to speak) and also the name of the actual Python file to use that knowledge with.
These Python files are the actual “ML batches”, so to speak: they describe the way images are processed, and they can differ a lot in structure and complexity. The knowledge in the .pth file is essentially the set of parameters for the nodes described in these Python files.
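To make that concrete, here is a minimal sketch of what such a checkpoint could look like; the key names (“state_dict”, “model_info”) and the tiny network are illustrative assumptions, not the actual ML Timewarp layout:

```python
import torch
import torch.nn as nn

# A stand-in network; the real flownet models are of course far bigger
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.LeakyReLU())

# Hypothetical checkpoint layout: weights plus the name of the model file
checkpoint = {
    'state_dict': model.state_dict(),           # the "knowledge"
    'model_info': {'name': 'flownet4_v002'},    # which Python file uses it
}
torch.save(checkpoint, 'flownet4_v002.pth')

# Loading: read the dict, look up the model file name, apply the weights
restored = torch.load('flownet4_v002.pth', map_location='cpu')
print(restored['model_info']['name'])           # -> flownet4_v002
model.load_state_dict(restored['state_dict'])
```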
In dev001 I made flownet4_v004 the main model, but it has not been widely tested and is a bit of an overkill in terms of processing. In dev002 the model previously named “light” became the main one. It might perform slightly worse on certain shots, but it runs much faster and also trains much faster.
At the moment I’m running a set of tests on a small selection of challenging shots and making small, gradual changes between flownet4_v001 and flownet4_v004, trying to find a sweet spot between speed and accuracy, especially when it comes to large blurry edges.
You can use any of the .pth files as a starting point to fine-tune on a specific shot or set of shots, and feel free to compare the speed and quality differences between them.
If you would like to try training from scratch, I would recommend using “--model flownet4_v001e” as an argument for a fresh start. The only substantial difference between v001 and v001e is a different activation function in the image feature encoder; it seems to work better and converge faster, but feel free to run some tests between the two and let me know.
The other versions in the pytorch/models folder are rather experimental; they are there as a result of a set of tests I’m running to find out which changes work best.
v0.4.4 and earlier ran strictly in CPU mode and tried to split processing across several CPU threads to speed things up.
v0.4.5 uses the new Metal backend in PyTorch (called “mps”), which in theory should allow running the same code as the “cuda” backend at decent speed, and should also allow training and fine-tuning.
In practice it seems to be a bit buggy and not all operations are implemented at the moment, so in order to make it work I have to switch to CPU and back to MPS on the fly in a couple of places. I’m also not sure whether training actually works at the moment, because I haven’t had a chance to really check it yet.
PyTorch is an open-source tool and its Metal backend is slowly improving, so chances are that at some point all the missing bits will be implemented and optimized for greater speedups. This won’t require changing any code; it should be enough to simply remove those fallbacks to CPU.
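To illustrate the kind of on-the-fly switch involved (a simplified sketch, not the actual ML Timewarp code): PyTorch raises NotImplementedError when an operator has no MPS kernel, so a tensor can be bounced through the CPU for just that call:

```python
import torch

# Prefer MPS on Apple silicon when available, otherwise stay on CPU
device = torch.device('mps') if torch.backends.mps.is_available() else torch.device('cpu')

def run_with_cpu_fallback(op, *tensors):
    # Try the op on the current device first; if the MPS kernel is
    # missing, run it on CPU copies and move the result back.
    try:
        return op(*tensors)
    except NotImplementedError:
        return op(*(t.cpu() for t in tensors)).to(device)

x = torch.randn(4, 4, device=device)
y = run_with_cpu_fallback(torch.tanh, x)
```

PyTorch also offers a blanket version of this via the PYTORCH_ENABLE_MPS_FALLBACK=1 environment variable, at the cost of doing the switch for every missing operator.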
Unfortunately, Intel Macs are left behind here: PyTorch 2.2.2 is the latest version available for them, and new versions are only being made for Apple silicon chips. Intel machines will probably stay where they are.
There is also Apple’s own ML framework for Apple silicon chips, called “MLX” (GitHub - ml-explore/mlx: MLX: An array framework for Apple silicon).
It closely follows the common NumPy and PyTorch conventions, and it should be possible to amend the current model code to run on it with minimal changes.
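Just to show how close those conventions are, a minimal sketch of the MLX array API (requires Apple silicon and the mlx package):

```python
import mlx.core as mx

# Reads almost like NumPy/PyTorch code
a = mx.random.normal((4, 4))
b = mx.random.normal((4, 4))
c = mx.matmul(a, b) + mx.tanh(a)  # builds a lazy computation graph
mx.eval(c)                        # evaluation happens here, on the GPU by default
print(c.shape)
```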
If someone uses Flame on Mac, has some time and can code, or has the resources, I would love some help getting this done, so please contact me. As the MLX library comes directly from Apple’s very own ML research team, chances are it might bring a great speedup to ML Timewarp on Mac; we just need someone who can test it and translate the current code into a form it understands.
It does have a new option to fine-tune the model based on a given shot or set of shots. It might be a rare case, but once in a while, when there’s a nasty shot to respeed, it might make a difference if you set it to learn from that shot overnight.
If you need more in terms of training, there is a command-line script as well.
I’ve tested it on an Apple silicon M2 Mac and it seems to work; let me know if it does for you.
There should be a menu item for fine-tuning when you right-click a clip or a selection of clips. This should bring up a dialog where one can select a folder to export to, the source weights, and the target file to put the new weights in. Does this dialog come up?
Also, are you on Mac or Linux? I was not able to test it properly on Mac in terms of results, as I only have an 8GB M2 to try it on and that is a bit too little RAM for this.
So after you click Apply, it should export the clip or clips to a folder and fire up a window with some info on what’s going on during startup; at some point there should be two lines of statistics that change as it looks at training samples. Is it getting there?
Hi guys, there seems to be a bug in v0.4.5 dev 003 that might lead to wrong data being saved when exiting the training loop. I’ll try to fix it ASAP and let you know. In the meantime, if you just close the training window with the mouse, it will most likely save correctly.
Hi Alan, I’ve just made a fresh install of Rocky 8.7 and Flame 2025.1 and it works just fine on that setup; I was not able to reproduce the problem. Maybe PM me
@talosh Is there a way to treat multiple channels within an exr? We are working on a feature where some external vendors are creating comps that they have subsequently decided to do respeeds/morph cuts on. Loving what ML Timewarp does to the main picture but considering there are several embedded mattes in other channels, can it apply the same effect to match what is happening in the foreground?
It is possible but not implemented at the moment. I have pure Python code to read and write EXRs, to avoid a complicated dependency rabbit hole, and so far it has been tested with 4 channels (RGBA).
Do you mind sharing an example of a multi-channel EXR (it can be any image that just replicates the structure) so I can rewrite this part to handle multiple channels?
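For reference, this is roughly how the channel list can be pulled out of an EXR header in pure Python (a simplified sketch for single-part files, not the actual ML Timewarp reader):

```python
import struct

def exr_channel_names(path):
    # Simplified sketch: single-part EXR files only, no error handling
    with open(path, 'rb') as f:
        magic, _version = struct.unpack('<II', f.read(8))
        assert magic == 20000630, 'not an OpenEXR file'

        def cstring():
            # Read a null-terminated string from the header
            s = b''
            while (c := f.read(1)) != b'\x00':
                s += c
            return s.decode('ascii')

        while True:
            name = cstring()
            if not name:                      # empty name ends the header
                return []
            attr_type = cstring()
            size = struct.unpack('<I', f.read(4))[0]
            data = f.read(size)
            if name == 'channels' and attr_type == 'chlist':
                names, pos = [], 0
                while data[pos] != 0:         # chlist ends with a null byte
                    end = data.index(b'\x00', pos)
                    names.append(data[pos:end].decode('ascii'))
                    pos = end + 1 + 16        # skip pixel type/pLinear/sampling
                return names
```

Channels in the header are stored alphabetically, so a comp with embedded mattes would list something like A, B, G, R followed by the matte layers; that list is where multi-channel handling would hook in.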
@talosh Good morning genius,
I’ll be in front of a workstation in about half an hour and will send you a batch setup so that you can manufacture a multi-channel OpenEXR.