Just noticed one thing while trying to prepare the video - while exporting exr’s make sure to uncheck the “include alpha” box so they have 3 channels. This type of model can take any number of input and output channels, but at the moment there are a lot of assumptions of having only 3 RGB channels for both input and output. I’ll try to work around it in the next release.
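If you want to double-check an export, here’s a rough sketch (not part of the tool, and the file path is just an example) that reads an uncompressed exr header in pure Python and lists the channels - if an ‘A’ shows up, alpha is still being included:

```python
# Rough sketch, not part of flameSimpleML: list the channels of an
# uncompressed EXR by parsing its header in pure Python (no extra deps).
import struct

def exr_channel_names(path):
    with open(path, 'rb') as f:
        if f.read(4) != b'\x76\x2f\x31\x01':
            raise ValueError(f'{path} is not an exr file')
        f.read(4)  # version / flags, not needed here

        def read_cstring():
            chars = bytearray()
            while (c := f.read(1)) not in (b'', b'\x00'):
                chars += c
            return chars.decode('ascii')

        while True:  # walk header attributes until the empty name that ends the header
            name = read_cstring()
            if not name:
                return []
            attr_type = read_cstring()
            size = struct.unpack('<i', f.read(4))[0]
            value = f.read(size)
            if name == 'channels' and attr_type == 'chlist':
                names, pos = [], 0
                while value[pos] != 0:  # channel list is terminated by a null byte
                    end = value.index(b'\x00', pos)
                    names.append(value[pos:end].decode('ascii'))
                    pos = end + 1 + 16  # skip pixel type, pLinear and sampling fields
                return names

# example path, replace with one of your exported frames
print(exr_channel_names('/var/tmp/dataset/source.0001.exr'))  # e.g. ['B', 'G', 'R']
```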
OMG!!! Wanna try that. Great work @talosh
There is a quick bugfix release:
https://github.com/talosh/flameSimpleML/releases/tag/v0.0.2
- Default learning rate lowered from 0.004 to 0.0034
- Fixed reading images other than 16-bit float in the ApplyModel script
- Fixed training images corruption on Mac
- Removed Debug flag to reduce clutter in Flame console
How about that screen record demo?
This is awesome!! Quick question… I managed to train a model, easy enough… now I’m wondering if I can use other / external checkpoints from the likes of Hugging Face et al… guessing I can’t, can I?
TL;DR: Coming hopefully in the middle of this week
Hi Alan, I’ve actually started with one and then I’ve bumped into something that I would like to address first. Since Timewarp 0.5 I’ve been trying to use Wiretap to pull the actual image data, and it did mostly work. Wiretap is a great tool but on the other hand it is a bit hardcore, and its Python bindings didn’t seem to be carefully updated to Python 3. Also you need to use shared libraries in order to write things back, and that might mess up workflows where shared libraries are actively used.
This might change in future releases, but the reality is that (at least here in London) there are many houses running the latest flavours of Flame 2023, sometimes still on CentOS 7 with Pascal Nvidia cards, so this sort of setup seems to be a fair baseline in terms of an API.
Considering this I’ve decided to move away from Wiretap for now and get back to the old model where exr’s are exported to a folder and the result is imported back.
Uncompressed exr’s are relatively easy to read and write in pure Python, so there are no additional dependencies involved and I believe this makes the tool more robust. It would also make it easier to create a pybox for this, to be able to use it right in batch setups.
Therefore I’ve spent some time on moving the code over to the older “export” → “process” → “import back” workflow and it is almost complete now. I hope I will be able to release a new version with a video at some point this week.
Whilst this tool is at a very early stage of development, it is built in a way that should let one integrate other models. There is a models folder and it should be possible to adapt the code so it can be picked up by the tool in the same way. I’m planning to add two more flavours of the model soon, and if you have something particular you would like to bring in and are willing to code a bit, DM me.
Hi guys, here is a new version of flameSimpleML:
https://github.com/talosh/flameSimpleML/releases/tag/v0.0.3
Here is a video with a walkthrough of the process (sorry for the late-night mumbling in broken English):
https://vimeo.com/905784396?share=copy
The main change is that it no longer uses Wiretap and Shared Libraries, and instead works by exporting uncompressed exrs, processing them and importing them back into Flame. It is now possible to train and apply models using several clips as input (for example Front / Back / Matte scenarios). There’s also a little addition that helps to export a dataset for training.
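A rough illustration of the multi-clip idea (not the tool’s actual code): every selected clip contributes its channels and they are stacked along the channel axis before being fed to the network.

```python
# Rough illustration of the multi-clip idea, not flameSimpleML's actual code:
# each selected clip contributes its channels and they are concatenated along
# the channel axis before being fed to the network.
import torch

front = torch.rand(1, 3, 1080, 1920)  # RGB of the Front clip
back  = torch.rand(1, 3, 1080, 1920)  # RGB of the Back clip
matte = torch.rand(1, 1, 1080, 1920)  # single-channel Matte

model_input = torch.cat([front, back, matte], dim=1)
print(model_input.shape)  # torch.Size([1, 7, 1080, 1920])
```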
While training, the main parameter to play with is the Learning Rate: it can be adjusted with the --lr argument on the command line. It is also possible to pick up already trained model data and continue training with --model_path. Please run train.py with --help to explore the options.
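In case it helps, here is roughly how those options look wired up with argparse. Only --lr, --model_path and --help are from the post above; the dataset argument, defaults and help text are assumptions (0.0034 is the default learning rate from the v0.0.2 notes).

```python
# Sketch of how the train.py options mentioned above are typically wired with
# argparse. Only --lr, --model_path and --help are confirmed; the dataset
# argument, defaults and help text are assumptions.
import argparse

parser = argparse.ArgumentParser(description='Train a flameSimpleML model')
parser.add_argument('dataset_path', help='folder with the exported training exrs (assumed)')
parser.add_argument('--lr', type=float, default=0.0034, help='learning rate')
parser.add_argument('--model_path', default=None,
                    help='previously trained model data to continue training from')
args = parser.parse_args()
print(args)
```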
In light of the recent discussions about ML tools and licensing - there is a license file that contains the licenses of the components used in the tool that are not included in the Autodesk Python distribution bundled with Flame. As far as I’m aware those licenses are fully permissive. Additionally, it’s important to note that no pre-trained data of any kind is included. This should help facilities with reviewing the tool and determining its compliance with their policies.
Give it a try, let me know if it works for you, and please give suggestions on how to improve the overall user experience.
Thanks @talosh I was waiting for this demo. Great job.
here’s the color match / LUT generator @fredwarren !!!
amazing stuff @talosh can’t wait to see what others do with it!
@talosh
This is amazing. Would the training take advantage of multi-GPUs on Linux? Also, what about multi-machine training? This would be very cool to be able to submit as a Job on Backburner and have the whole farm crank on this.
Hi Alan, at the moment you can select a GPU to train on, though multi-GPU training can be implemented relatively easily - there’s a PyTorch framework for this. Not sure about multi-machine options though.
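For reference, this is the kind of thing that framework handles - a minimal torch.nn.DataParallel sketch with a stand-in model, not the actual flameSimpleML training code:

```python
# Minimal torch.nn.DataParallel sketch with a stand-in model, not the actual
# flameSimpleML training code: the model is replicated on every visible GPU
# and each batch is split across them automatically.
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 3, kernel_size=3, padding=1)

    def forward(self, x):
        return self.conv(x)

model = TinyNet().cuda()
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)  # use all GPUs visible to the process

batch = torch.rand(8, 3, 256, 256).cuda()
print(model(batch).shape)
```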
That would be great. I can see setting up a machine with multiple cheap data center GPUs to offload training. I have like 6 P40s on a shelf.
Legend.
Hey @talosh on first run I’m getting:
/opt/Autodesk/shared/python/flameSimpleML/packages/.lib/python3.10/site-packages/torch/cuda/__init__.py:146: UserWarning:
NVIDIA RTX A5000 with CUDA capability sm_86 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.
If you want to use the NVIDIA RTX A5000 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/
and later a little…
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
…and so the fun begins. Any thoughts before I go down the “Start Locally” rabbit hole?
The PyTorch bundled with flameSimpleML is too old for this GPU; it has to be updated. Some info is here: https://github.com/talosh/flameTimewarpML/issues/87
DM me if needed
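If anyone wants to check their setup before a long run, a quick sketch like this (run it with the Python bundled with flameSimpleML) shows whether the installed PyTorch build was compiled for your GPU’s architecture:

```python
# Quick check: does the installed PyTorch build support this GPU's
# compute capability?
import torch

major, minor = torch.cuda.get_device_capability(0)
print(f'GPU compute capability: sm_{major}{minor}')           # an RTX A5000 reports sm_86
print(f'PyTorch was built for: {torch.cuda.get_arch_list()}')  # e.g. ['sm_37', 'sm_50', ...]
# if your sm_xx is missing from that list you will hit the
# "no kernel image is available" error and need a newer wheel
```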
All good fam. We’re off to the races.
Why not just add the latest version to your repo?
That was the first thing I ran. Let it run 40 epochs and then killed it. Works a fucking treat… I do wish the inference render was a pybox though so we could use it inline but this is amazing regardless.
That should be possible. I don’t have much experience with pybox, but as long as it can give an uncompressed exr it should be pretty straightforward to implement.