Where is the bottleneck?

finnjaeger · February 19, 2023, 8:50am

interesting for sure,

I am going to deep-dive into this some more next week and figure out if anything in particular creates the massive cliff between Linux and Mac.

finnjaeger · February 20, 2023, 7:30am

yea same…

going to do some benchmarks today also on m1max

I'll just take 4 batches from my last job to get an idea of the speed differences.

allklier · February 20, 2023, 8:43am

Depending on what you use to see GPU usage, be mindful that some GPU loads don’t show up the same way. The most reliable GPU resource meter is actually the temp sensor, not the usage stats.

finnjaeger · February 20, 2023, 7:13pm

Interesting: I tried 4 shots, and the one I struggled with is by far the slowest on the Mac Pro, which implies some bottleneck that the other shots don't have. The sample size is too small for any conclusions though, will dive deeper.

But here are some numbers:

Shot	Linux	Mac Pro
Shot01	71	146
Shot02	28	62
Shot03	64	73
Shot04	41	267

Times are in seconds; the faster one is always the Linux box (16-core Ryzen + 3090 + 8x NVMe RAID + 128 GB RAM).

The slower one is the Mac Pro: 16-core Xeon, 256 GB RAM, dual Vega Pro Duo (32 GB VRAM).

Definitely interesting results that tell me I need to go further down this rabbit hole and create a full benchmark suite that's better than these random 4 shots from my last job.
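To make the uneven gap easier to see, the four timings above can be turned into slowdown ratios. This is only a quick sketch of the arithmetic; the `timings` dictionary and the `slowdown` helper are names made up for illustration, and the numbers are just the ones posted above.

```python
# Render times in seconds from the post above, as (linux, mac_pro).
timings = {
    "Shot01": (71, 146),
    "Shot02": (28, 62),
    "Shot03": (64, 73),
    "Shot04": (41, 267),
}


def slowdown(linux_s, mac_s):
    """Return how many times longer the Mac Pro took than the Linux box."""
    return mac_s / linux_s


ratios = {shot: slowdown(*pair) for shot, pair in timings.items()}

for shot, ratio in ratios.items():
    print(f"{shot}: {ratio:.2f}x")

# Prints:
# Shot01: 2.06x
# Shot02: 2.21x
# Shot03: 1.14x
# Shot04: 6.51x
```

Shot03 is almost on par while Shot04 is more than six times slower, so whatever Shot04 does differently is the obvious place to start looking.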

cnoellert · February 20, 2023, 11:41pm

Sure, but those findings seem pretty consistent with what the benchmarks show already… and with what most here experience anecdotally. There is a marked improvement with the Apple Silicon equipped Macs that narrows the gap, but the performance inequalities between the platforms are real and measurable.

I wonder whether, once there is a Metal-native version, the gap will narrow further…

finnjaeger · February 21, 2023, 7:01am

yes but i expected a more consistent gap tbh

allklier · February 22, 2023, 6:56am

Not necessarily, the platform performance will impact some features more than others. And in a way that's more helpful. What’s in the script of shot 4 that’s not in the others, or what is different on shot 4 in terms of content that makes it harder to scale an algorithm?

If the gap was more even then it would be harder to compare and isolate.

lewis · February 24, 2023, 3:58pm

Finn if you wanna really get into it you could try using the macOS profiling tools which come with Xcode - run Instruments.app, pick Time Profiler, choose Running Applications/Flame at the top and hit the red record button while you’re rendering… hit stop after 20 seconds or so then from the Call Tree menu at the bottom enable just Invert Call Tree and you’ll see a list of which functions were using the CPU the most.
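For anyone wondering what the inverted call tree is roughly doing: a sampling profiler grabs the call stack at intervals, and the inverted view ranks functions by how often they were the innermost frame. Here is a toy sketch of that idea. The stack samples and function names are entirely hypothetical (they are not real Flame symbols), and this is not how Instruments is implemented, just the counting idea behind it.

```python
from collections import Counter

# Hypothetical stack samples, outermost frame first.
# The function names are invented for illustration only.
samples = [
    ["main", "render_frame", "decode_media"],
    ["main", "render_frame", "decode_media"],
    ["main", "render_frame", "decode_media"],
    ["main", "render_frame", "composite"],
    ["main", "wait_for_io"],
]


def self_time_ranking(stacks):
    """Count how often each function is the innermost (leaf) frame."""
    leaves = Counter(stack[-1] for stack in stacks if stack)
    return leaves.most_common()


ranking = self_time_ranking(samples)

for name, hits in ranking:
    # Share of samples in which this function was actually on the CPU.
    print(f"{hits / len(samples):5.0%}  {name}")
```

In this made-up data `decode_media` is the leaf in 3 of 5 samples, so it lands at the top of the list, which is the same kind of "who is actually burning the CPU" answer the inverted view gives you.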

I think I remember it showing useful stuff for Flame - things like a slow network share or most of the time being spent debayering R3Ds were obvious, and I definitely picked up the problem with weirdly sized IBL maps being crazy slow on Mac like this. Can feel a bit like deciphering entrails though… looks like this for Houdini, I can see my VDB surfacing is slowest by far:

finnjaeger · February 25, 2023, 8:49pm

love it lewis thank you!

Edusanjo · February 26, 2023, 7:27pm

I agree and I don’t. The “orange” dot in the nodes is great but must be used wisely. In the end it is a cache, and having several switched “on” takes up a lot of space in RAM (or wherever it is stored), and things get slower.

It always depends on the resolution and the number of frames, but in our world, or at least mine, working at camera resolutions with 16-bit EXRs, plenty of 3D, and the number of shots we usually take on… well… it sometimes gets veeery slow.

Clearing the cache and rebooting gets things back to normal, but there should be a more specific solution to this issue.
