Ok! Had this reply open for like a week but kept getting distracted… I read a ton about denoising around the time I made the Ls_Dollface matchbox and mainly learnt that it’s really complicated and can’t be broken down into a batch tree most of the time. People have been trying to figure it out since the dawn of digital images and it’s still not perfect… here are some approaches I vaguely remember, from oldest to newest:
Median - one of the oldest ideas beyond just a blur: sort nearby pixels by value and take the middle one, so extreme outliers get thrown away instead of smeared into an average
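If you want to play with the idea, a toy single-channel version in numpy might look like this (the window size and edge handling are just my picks, not from any particular tool):

```python
import numpy as np

def median_denoise(img, radius=1):
    """Replace each pixel with the median of its (2*radius+1)^2 neighbourhood."""
    img = np.asarray(img, dtype=np.float64)
    padded = np.pad(img, radius, mode="edge")
    # collect every shifted copy of the image within the window...
    windows = [
        padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
        for dy in range(2 * radius + 1)
        for dx in range(2 * radius + 1)
    ]
    # ...then the per-pixel median across those copies is the filtered image
    return np.median(np.stack(windows), axis=0)
```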
Bilateral filtering/K-nearest neighbours (KNN) - look at nearby pixels and average the ones that are similar, ignoring ones that differ too much in value and are therefore probably part of a different object - like a blur that doesn’t blur across edges… this is kinda how the Dollface shader works, though I had to cut some corners to keep it fast on the GPU
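A brute-force sketch of the same idea in numpy - the sigma values are made-up knobs (assuming a 0-1 float image), and a real GPU version would be structured very differently:

```python
import numpy as np

def bilateral_denoise(img, radius=3, sigma_space=2.0, sigma_value=0.1):
    """Average neighbours, weighted by both distance and value similarity."""
    img = np.asarray(img, dtype=np.float64)
    h, w = img.shape
    padded = np.pad(img, radius, mode="edge")
    acc = np.zeros_like(img)
    weight_sum = np.zeros_like(img)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = padded[radius + dy:radius + dy + h,
                             radius + dx:radius + dx + w]
            # spatial falloff (fixed per offset) times value similarity (per pixel)
            w_spatial = np.exp(-(dx * dx + dy * dy) / (2 * sigma_space ** 2))
            w_value = np.exp(-((shifted - img) ** 2) / (2 * sigma_value ** 2))
            acc += w_spatial * w_value * shifted
            weight_sum += w_spatial * w_value
    return acc / weight_sum
```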
Non-local means - instead of comparing single nearby pixels this compares whole blocks of pixels, 7x7 or so, and looks for blocks that are similar in order to average them together, so it can take advantage of detailed repeating patterns that bilateral struggles with
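Here’s a deliberately naive version just to show the patch-comparison idea - the patch/search sizes and h are arbitrary, and skimage has a proper fast denoise_nl_means if you actually want to use it:

```python
import numpy as np

def nl_means_pixel(img, y, x, patch=3, search=10, h=0.1):
    """Denoise one pixel by averaging pixels whose surrounding patches look similar."""
    pad = patch // 2
    padded = np.pad(img, pad, mode="edge")
    ref = padded[y:y + patch, x:x + patch]  # patch around the target pixel
    num, den = 0.0, 0.0
    for yy in range(max(0, y - search), min(img.shape[0], y + search + 1)):
        for xx in range(max(0, x - search), min(img.shape[1], x + search + 1)):
            cand = padded[yy:yy + patch, xx:xx + patch]
            dist = np.mean((ref - cand) ** 2)  # how alike are the two patches?
            wgt = np.exp(-dist / (h * h))      # similar patches get high weight
            num += wgt * img[yy, xx]
            den += wgt
    return num / den

def nl_means(img, **kw):
    # painfully slow - real implementations vectorize all of this
    return np.array([[nl_means_pixel(img, y, x, **kw)
                      for x in range(img.shape[1])]
                     for y in range(img.shape[0])])
```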
BM3D - extends the non-local means idea: it stacks the similar blocks it finds into a 3D group and filters the whole group together in a transform domain, so detail the blocks share survives while the noise doesn’t. The video version (VBM3D) also searches previous and next frames for matching blocks to add to the group
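The collaborative-filtering core, massively simplified - this skips the block matching and the second Wiener pass entirely and just shows the “transform the whole group and kill the weak coefficients” step, with a made-up threshold:

```python
import numpy as np
from scipy.fft import dctn, idctn

def collaborative_filter(block_stack, threshold=0.1):
    """Toy version of BM3D's core step, for a stack of matched blocks
    shaped (num_blocks, block_size, block_size)."""
    coeffs = dctn(block_stack, norm="ortho")    # 3D DCT over the whole group
    coeffs[np.abs(coeffs) < threshold] = 0.0    # weak coefficients are mostly noise
    return idctn(coeffs, norm="ortho")          # filtered blocks go back home
```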
Recursive filtering - hardware video DNRs used to combine a few techniques, but the simplest was to average the previous output frame with the input. That works great for non-moving backgrounds but obviously leaves trails behind moving objects, just like Compound does… they would have a threshold for when to skip the filtering to preserve moving things, and the later ones had motion compensation, so they could figure out how each frame had moved and average it overlaid on its original position. You could maybe build this in batch, using Pixel Spread’s vector displace mode with a motion vector input to align each frame to the previous one. The earliest CG-specific denoisers did this using the motion vector pass, but it’s not that effective on its own. At least one hardware unit had a median filter that worked across time as well, like the TemporalMedian node in Nuke, which works well for removing sparkles and rain
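The basic recursive version with a motion threshold is only a few lines - no motion compensation here, and the strength/threshold numbers are invented:

```python
import numpy as np

def temporal_dnr(frames, strength=0.7, motion_thresh=0.05):
    """Recursive temporal average: blend each frame with the previous *output*,
    but fall back to the raw input where the difference is big (i.e. motion)."""
    prev = frames[0]
    out = [prev]
    for frame in frames[1:]:
        blended = strength * prev + (1.0 - strength) * frame
        moving = np.abs(frame - prev) > motion_thresh  # crude motion detection
        result = np.where(moving, frame, blended)
        out.append(result)
        prev = result  # recursive: the next frame blends against this output
    return out
```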
FFT denoising - these work totally differently and are way harder to understand and write… the idea is that noise looks different from image data in frequency terms, so you can split one from the other. The same way white noise in audio is constant across all frequencies, so it looks like a flat line on a spectrum analyzer, noise in an image is similarly constant after you FFT it - whereas the image itself has a lumpier, more complex spectrum. Whenever you see something that needs you to sample a flat area to get a noise profile, it’s probably doing this - calculating the spectrum of the noise from that area, then subtracting that spectrum from the whole image before converting it back from the frequency domain. Neat Video even shows you a little graph of the noise spectrum. Wavelet denoising is the same idea but replaces the FFT with a wavelet transform
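A very rough sketch of the sample-a-flat-area trick - real implementations work on overlapping windowed tiles and keep a proper per-frequency profile, this just pulls one noise level out of the flat sample and subtracts it flat across the spectrum:

```python
import numpy as np

def fft_denoise(img, flat_region, amount=1.0):
    """Estimate a noise level from a flat patch, subtract it from the image's
    magnitude spectrum, and transform back (phase is kept untouched)."""
    # magnitude spectrum of the flat sample, with its DC offset removed
    noise_mag = np.abs(np.fft.fft2(flat_region - flat_region.mean()))
    # rough rescale: FFT magnitudes grow with sqrt(number of samples)
    noise_level = noise_mag.mean() * np.sqrt(img.size / flat_region.size)
    spec = np.fft.fft2(img)
    mag, phase = np.abs(spec), np.angle(spec)
    mag = np.maximum(mag - amount * noise_level, 0.0)  # spectral subtraction
    return np.real(np.fft.ifft2(mag * np.exp(1j * phase)))
```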
Neural/ML methods - an obvious thing to do these days is train a neural net on image pairs from before and after adding noise… I think the OptiX one was trained like that. Interestingly I haven’t been that impressed with OptiX compared to Neat for denoising CG renders - the main advantage of OptiX is that it’s crazy fast. Would like to try Intel’s Open Image Denoise on real images some time, it seems more impressive on CG renders in the demo vids
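The generic training recipe looks roughly like this in PyTorch - I’ve no idea what OptiX actually trains on, and the network and numbers here are placeholders:

```python
import torch
import torch.nn as nn

# tiny made-up CNN, just to show the clean/noisy pair training setup
model = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, 3, padding=1),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(1000):
    clean = torch.rand(8, 3, 64, 64)               # stand-in for real training crops
    noisy = clean + 0.1 * torch.randn_like(clean)  # synthesize the "before"
    pred_noise = model(noisy)                      # predict the noise (residual learning)
    loss = nn.functional.mse_loss(noisy - pred_noise, clean)
    opt.zero_grad()
    loss.backward()
    opt.step()
```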
Alright sorry for the length
Haven’t been keeping up with the latest ML papers so maybe Runway or similar already have something that’s on par with Neat Video? Would love to hear if anyone’s tried…