Rec709 to ACEScg AI/ML/inference detail/bit depth recreation

Hi all

Are there any Flame inference models or workflows in Comfy that rebuild lost detail and colour bit depth in Rec709 footage? This has long been a general VFX problem when using Rec709 or sRGB elements in ACEScg/ACES workflows. Seems with all of the crazy things you can do these days we should be figuring this one out (if it hasn't already been done).

I'm sure we could use an inference model trained on Rec-to-ACES conversions to rebuild detail in Rec709 or sRGB images/videos, for a start.

Rebuilding detail in shadows and highlights would be a bonus! (But I guess this would need to be a ComfyUI workflow.)

3 Likes

Very good topic.

A few thoughts:

  • There are specific models to address bit depth (like this: GitHub - subraoul/ComfyUI_Bit-Depth-Enhancer: Custom nodes for bit-depth enhancement and banding removal in ComfyUI ), though it's built for still images, not footage. I've tried it out, but didn't have the right use case to evaluate it in depth.
  • There are multiple deficiencies to overcome - compression, dynamic range, color gamut. Each may require a separate approach. The model above primarily focuses on debanding low-bit-depth material (a short sketch of where banding comes from follows this list). You probably know the challenge of inverse view transforms, and the trade-off between the correct math and the correct look.
  • One challenge is that most of the public models have been trained largely on 8-bit material and are of limited use. It's made worse by the fact that most ComfyUI workflows run at lower resolutions due to hardware constraints.
  • There are a few stand-outs. For example, BorisFX is training all their models with 32-bit float math, something they're not highlighting enough. So your more likely solution would come from BorisFX or Topaz rather than any of the public models you can run in Comfy. Though this might change over time.
  • You could also look at this: FlashVSR in ComfyUI Workflow | Real-Time Video Restoration workflow. It seems more focused on restoring detail than on the full quality range that got lost, but maybe it's still an improvement.
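
To make the banding point above concrete, here's a minimal illustration in plain NumPy (no ML, and not any of the nodes linked above) of where banding comes from: quantising a smooth gradient to 8 bits leaves steps of 1/255, which read as visible bands once the material is graded or pushed into a wider-range working space. The bit-depth models are essentially trying to re-synthesise plausible values in between those steps.

```python
import numpy as np

# A smooth dark ramp in float, then what an 8-bit source reduces it to.
gradient = np.linspace(0.0, 0.1, 1024)           # continuous values
banded = np.round(gradient * 255.0) / 255.0      # 8-bit quantisation

steps = np.unique(banded)
print(len(steps))         # ~27 distinct values left to describe 1024 samples
print(np.diff(steps)[0])  # each band is one code value: 1/255 ≈ 0.0039
```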
1 Like

@PeteRypstra, curious how you are converting your sRGB/Rec709 assets to ACEScg.

Currently, I just invert a viewing LUT in the Colour Management node (ACEScg <-> Rec709). But with this, crushed shadows and particularly highlights (close to pure white) are quite unpredictable; they can be wildly out and 'break' when pushing a grade on true linear comps (with 8-bit imagery comped into them). It feels like there's an opportunity to do something a bit smarter than this.

Hey Allklier, thanks for this. I did some Google Gemini deep research and it came back with the bit-depth enhancer and an HDR VAE - here's the JSON file it made me :eyes: - I don't know enough about Comfy to get it working or test it, but I'll be looking at it soon. I guess I'm just looking for a simple inference model that works in the Flame node.

1 Like

We do a fair amount of this, comping Rec709/sRGB into device screens, and I approach it from the perspective of matching the gamut of the sensor on the acquisition camera. We've taken to shooting reference chips on the device, 100% white to 100% black in 10% increments, and matching that with a color correct node. We then convert to ACEScg with an inverted view transform.

Thinking about it as “throwing away” data seems wrong, as you're really just squishing an unrealistic sRGB gamut into something that could actually occur or be captured in real life. Even in half float, you're not really throwing much away, and most things can be massaged in by matching black and white, and then tweaking your middle gray.

But that’s for comping screens into a quasi photoreal scene (creatives always want a bit less than totally realistic, bless them). For full screen graphics in a high bitdepth/hdr sequence, we would probably do something else, but we have yet to get that request…

2 Likes

This looks interesting. I can attempt to recreate and test it tomorrow.

1 Like

Here's the deep research report - worth a read.

1 Like

That's fair. There's a distinction between the theory and the practice, especially when we're not taking originally HDR material that was compressed into Rec709 and re-mixing it with material that is still HDR inside an ACES pipeline.

Comping GFX and screens (which at times can contain what used to be HDR material) is a subset of the problem that can definitely tolerate some shortcuts.

2 Likes

Interesting solve, @kirk. Most of the work at my end has also been phone-screen comps with assets coming in from AE, so I'm curious about the sensor gamut matching.

Concur. In the meantime, if it helps in any way, what I have found to be working well enough for me is to not use the inverted view/trim LUT (for the obvious reason of losing data), but rather to use the input transform for any sRGB/Rec709-to-ACEScg conversion.

Yes, initially it looks visually out of whack. But since the input transform node spreads out and remaps the values, placing them at their mathematically correct locations within ACEScg, once the values are brought back to some visual normalcy after the conversion (using an MG/CC node, etc.), I find it easier to match the blacks/whites and do the overall match-grade to the backplate. And usually the same initial base “visual correction” node ends up working for all other sRGB/Rec709 assets.

Aside from the minimal loss of values during conversion, it could also be because the values of both the AE assets and the plates are at their technically correct locations within ACEScg space and therefore respond to the grading node's sliders/controls with some uniformity.

I haven't done any R&D on it yet, but since the input-transform conversion to ACEScg is a technically correct and repeatable math process, and specific colour values should always fall at the same locations post conversion, what if a visual-correction CC node were created that could be universally applied to any sRGB/Rec709 asset (project to project) after its input-transform ACEScg conversion, just as an initial baseline visual correction before any match-grading nodes are applied?
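
For what it's worth, here's a minimal sketch of that deterministic input-transform math outside of Flame, using the open-source colour-science Python package (my choice for illustration, not anything Flame uses internally): decode the sRGB curve, then apply the fixed sRGB-to-AP1 matrix. The same input values always land in the same ACEScg locations, which is why one baseline “visual correction” could in principle be reused across assets.

```python
import numpy as np
import colour  # pip install colour-science

# A few display-referred sRGB pixels: deep shadow, mid grey, saturated red.
srgb_pixels = np.array([
    [0.10, 0.10, 0.10],
    [0.50, 0.50, 0.50],
    [0.95, 0.20, 0.20],
])

# Input-transform route: undo the sRGB encoding curve, then remap the gamut.
acescg_pixels = colour.RGB_to_RGB(
    srgb_pixels,
    "sRGB",
    "ACEScg",
    apply_cctf_decoding=True,
)
print(acescg_pixels)  # always the same output for the same input values
```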

I also wonder why the inverted view/trim LUT loses so much data at the top/bottom. Technically it shouldn't, as you are going from a smaller to a much larger space with the correct inverted math, and there should not be any data within sRGB/Rec709 elements that falls beyond the boundaries of ACEScg space. Unless the loss is a byproduct of stripping the sRGB/Rec709 elements of their 2.2/2.4 gamma, it doesn't make sense.
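
One mechanism that can explain a lot of it: the view transform's tone-mapping shoulder squeezes a huge scene-linear range into the top few display code values, so once the source is 8-bit, inverting it makes near-white values jump around wildly. A toy Reinhard-style curve (not the actual ACES RRT/ODT, just a stand-in for illustration) shows the effect:

```python
import numpy as np

def tonescale(x):          # scene-linear -> display, toy shoulder
    return x / (x + 1.0)

def inverse_tonescale(y):  # display -> scene-linear
    return y / (1.0 - y)

scene = np.array([4.0, 8.0, 16.0, 64.0])          # bright scene-linear values
display = tonescale(scene)
code_8bit = np.round(display * 255.0)             # 204, 227, 240, 251
recovered = inverse_tonescale(code_8bit / 255.0)  # ~4.0, 8.1, 16.0, 62.8
print(code_8bit, recovered)

# One 8-bit step near white now spans a big linear range after inversion:
print(inverse_tonescale(251 / 255.0) - inverse_tonescale(250 / 255.0))  # ~12.75
```

So the loss isn't about the destination gamut being smaller; it's the tone-mapping inversion amplifying the quantisation in the 8-bit source (and anything clipped at pure white has no unique inverse at all).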

I have also messed around with just doing a straight input transform to ACES and color correcting to taste, but this has two problems: 1) it freaks people out, and 2) color correcting in ACEScg kind of sucks.

My approach is to view transform the plate (or our reference chips shot) into Rec709, attach it to the back input of a CC node, then plug my graphic into the front and, using the Match function in the curves editor, match the white and black points (if you want to avoid pissing off the Human Interface team or your grading partners by tinting your graphic at this point, you can always desaturate your reference before matching). Then I view transform the result of the color correct node into ACEScg.

The point is to do any color manipulation in the native colorspace of the graphic, then transform the result.

I’ve had very good luck with this approach, but if @andy_dill wants to chime in here and share his thoughts, I’ll give him a one time pass to do so (but I’ve got my eye on you buddy. Be careful.)

Again, I think the belief that information is being lost in a view transform is not totally accurate. Mathematically, yes, things are not perfect, but assuming you're working in 16-bit, you can recover a surprising amount of information through color correction (try gaining a graphic down to something like 0.001 and then using Levels in another CC node to pull it back up if you want to see what I mean). The issue is that color correcting in ACEScg is super unwieldy, so doing it prior to the transform eliminates a lot of the hassle.
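
A quick numeric version of that gain-down/pull-up test (NumPy half floats as a stand-in for a 16-bit float pipeline, not Flame's actual internals): because float precision is relative, most of the value survives the round trip, where an 8-bit integer format would have crushed it to nothing.

```python
import numpy as np

graphic = np.float16(0.7346)
crushed = graphic * np.float16(0.001)     # gain way down, still half float
restored = np.float32(crushed) * 1000.0   # Levels-style pull back up
print(crushed, restored)                  # ~0.000734 -> ~0.734

# For contrast, an 8-bit integer pipeline loses it entirely:
print(round(0.7346 * 0.001 * 255))        # 0 -> nothing left to recover
```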

Plus, as Jan said earlier, in practice it doesn’t matter. Anything you end up throwing away is outside the sensor gamut and doesn’t belong there anyway.

2 Likes

I’ve also considered building a LUT for this, or a matchbox, or whatever, but with our acquisition gamuts being a bit of a moving target, it’s really just easier to shoot our refs and match it manually. The Juice Is Not Worth The Squeeze, as our SWE folks like to say.

3 Likes

That’s standard color matching for comps. If you’re not already using AFX_ReverseGrade (matchbox by John Ashby), it makes it super simple to sample the matching colors.

And for the topic as a whole, there’s a fantastic set of tutorials by Victor Perez on fxphd. They’re Nuke centric, but the theory applies regardless of app. And in fact AFX_ReverseGrade is meant to emulate the Nuke tools.

1 Like

That's the thing - the standard input transform is meant to bring camera material into an ACES pipeline, i.e. something that was photographed (an sRGB JPEG) or recorded on a Rec709 camera. It looks correct in those circumstances.

However, when you bring in material that has already gone through a full color pipeline, and most importantly the tone mapping that is part of the view transform, it often doesn't look quite right (meaning that if you did a side-by-side with the original, if you have it, it won't match). Doing the reverse view transform from Rec709 Display attempts to undo the tone mapping, but since that's a lossy process, it isn't necessarily right either, nor can it be mathematically correct.

From a pipeline view you have two things to solve:

  • If the material has gone through tone mapping previously, this needs to be inverted as well as possible - whether into an ACES color space, or into whatever ‘legacy’ color space the background plate is in.
  • When it comes to screen comps, and work that has been created in After Effects, the material often uses a GFX dynamic range, not what footage would actually look like. Matching the white point, black point, and middle gray restores harmony (a rough sketch of this follows below).

And then this comped material can go through the whole pipeline again and be tone mapped yet again into whatever delivery space you have (or multiple if you do SDR + HDR).
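
As a rough sketch of the white/black/mid-grey matching mentioned in the second bullet above (plain NumPy with a hypothetical helper, not a Flame node): remap the graphic's levels onto values sampled from the plate before any creative grading.

```python
import numpy as np

def match_levels(fg, fg_black, fg_white, plate_black, plate_white, gamma=1.0):
    """Map fg's [fg_black, fg_white] range onto [plate_black, plate_white],
    with an optional gamma tweak to place middle grey."""
    norm = (fg - fg_black) / (fg_white - fg_black)
    norm = np.clip(norm, 0.0, None) ** gamma
    return plate_black + norm * (plate_white - plate_black)

# Example: an AE graphic using the full 0-1 GFX range, dropped onto a plate
# whose screen area sits between 0.02 and 0.85 in the working space
# (illustrative numbers only).
gfx = np.array([0.0, 0.18, 1.0])
print(match_levels(gfx, 0.0, 1.0, 0.02, 0.85, gamma=1.1))
```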

Last night I watched the first fifteen minutes of “47 Ronin” and “The Island”, and my takeaway is that the first fifteen minutes of a movie will define whether it's worth your time. Additionally, “The Island” does a better job of “kid idiot adults” than more recent fare like “Alien: the TV Series on Earth”.

I plan on watching the remaining 110 minutes of The Island soon, but I’m not sure when.

I haven’t read the thread.

5 Likes

Great, will check it out.

I've read the thread now. I don't have a lot to add beyond “please do your color work in log; linear is a pain.”

It's strange to me that nobody's cracked bit-depth up-res + debanding; it seems like the sort of thing ML would be great at. Turns out it's easier to render presidential diarrhea.

4 Likes

I've spent a few hours on the setup referenced by Gemini. As provided, it does not translate; it's still a work in progress to modify it and make it work. I also want to check out that other link on some footage.

There are a few LoRAs for recoloring (or restoring color), but they don't work on all models. Folks seem more interested in models that turn things into anime than in restoring material - at least on the Internet. It's quite possible that big studios have their own internal solutions, like the VaeDecodeHDR that came out of a big studio and then had to be taken down because of it.

It’s also unknown just how much high DR / camera log material you would need to train with to get reliable results. We’ve seen the report of RunwayML finding the Disney catalog too small. Now that is for full generation, not treatment.

I think technically it's totally a solvable problem. But the necessary resources - training material and funding - are not within reach of independent users.

That may in fact be a perfect opportunity for ADSK (@FrancisBouillon). Since they (and folks like BorisFX) have the resources to do private training and then make these models available as part of paid software packages, they're in a position to close that gap and make the economics work. And it would be a differentiator not met by the open-source community, which chases other styles instead. Leave the basic workflow tasks and the trend du jour to the open-source community, and solve the hard problems that productions depend on but that, by all accounts, don't exist outside of walled gardens at the moment.

4 Likes

I’m always interested in this perspective. Is there a technical reason for this as it pertains to the color correction node in Flame specifically?

Generally it seems to me that people feel comfortable in the colorspace they have spent the most time working in.