Where is the bottleneck?

So I've got this pretty big batch script and it is EXTREMELY slow, as in 1 frame every 10 s slow on my Mac Pro.

While rendering I watch all my stats: CPU, GPU, RAM, storage… they are all idling, and not a single core is maxed out, so it doesn't look like some single-thread limitation (it's like 20% usage across all cores).

How can that be? How do you find the bottleneck? Are there any logs or stats for a batch script, like Nuke's Profile node? Or is there a way to simply render multiple frames at once, like Nuke's frame server (or Deadline)? In Nuke it's usually a single core being maxed out, so rendering 16 frames at once is almost a 16x speed improvement. Maybe I'm just overlooking something, but this is ridiculously slow.
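The frame-server idea is easy to sketch outside of Flame: if frames are independent, farming them out to a pool of workers gives near-linear speedup. Below is a toy sketch (my own, nothing Flame-specific; `render_frame` is a hypothetical stand-in and a sleep simulates the per-frame work):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def render_frame(frame):
    # hypothetical stand-in for a real per-frame render; a sleep simulates the work
    # (a real CPU-bound render would want processes, not threads)
    time.sleep(0.3)
    return frame

frames = list(range(8))

# render one frame after another
start = time.perf_counter()
serial_results = [render_frame(f) for f in frames]
serial = time.perf_counter() - start

# render all frames at once, like a frame server would
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=8) as pool:
    parallel_results = list(pool.map(render_frame, frames))
parallel = time.perf_counter() - start

print(f"serial: {serial:.2f}s, parallel: {parallel:.2f}s")
```

With 8 workers for 8 frames, the parallel pass finishes in roughly the time of a single frame instead of eight.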

There are a bunch of heavy Cryptomatte nodes and whatnot, but still, it should at least max out some part of my system while rendering, no?


I have been pre-rendering my Cryptomattes lately, @finnjaeger.

But also, if I am trying to find the bottleneck, I find pre-rendering can be a helpful way of narrowing it down. You don't even need to use a Write node or render; you can just hit that orange dot that everyone loves so much, cache a few nodes to simulate a pre-render, and then see how things improve ¯\_(ツ)_/¯


Yeah, Cryptomattes are weirdly slow, but they should still max out some piece of hardware, or am I thinking about this wrong?

I understand that if a downstream node is waiting for an upstream node and that node is really slow, that's just how it is, but seeing my machine basically idle while everything is slow is kind of infuriating.

So let's say you look at just the Cryptomatte node: it doesn't use many resources but is still slow. How can that be? There has to be a bottleneck somewhere that I'm not able to see. Something like memory bandwidth? GPU memory speed? I have no idea. :thinking:
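That "slow but idle" pattern is what a latency-bound pipeline looks like. A toy sketch (mine, not Flame internals): the upstream stage spends its time waiting rather than computing, the downstream stage is trivially fast, and overall throughput is capped while the CPU mostly sleeps.

```python
import time
import threading
from queue import Queue

def slow_upstream(q, n):
    # simulates a node that is latency-bound (waiting on I/O, transfers, etc.)
    for i in range(n):
        time.sleep(0.1)   # the bottleneck is waiting, not computing
        q.put(i)
    q.put(None)           # sentinel: no more items

def fast_downstream(q, results):
    # trivial work; spends almost all its time blocked on q.get()
    while (item := q.get()) is not None:
        results.append(item * 2)

q, results = Queue(), []
producer = threading.Thread(target=slow_upstream, args=(q, 5))
start = time.perf_counter()
producer.start()
fast_downstream(q, results)
producer.join()
elapsed = time.perf_counter() - start
print(f"{len(results)} items in {elapsed:.2f}s")  # ~0.5s, almost all of it spent idle
```

Every system monitor would show this process near 0% CPU the whole time, even though it is "slow".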


What about the access and unpacking of the 32-bit file off the server?
Do you cache the Cryptomatte import?

The source is a 16-bit EXR that's cached on import, so it should live on my framestore as PIZ-compressed. It shouldn't be that hard to read, and even if the decompression were heavy, I'd see my CPU spike, wouldn't I?

The thing is, it's not just this comp in particular. I feel like Flame is really fast until it hits a steep drop in performance, and I can usually never tell why. I feel like a LOT of performance is left on the table.

I need to dive deeper into this. It kind of sounds to me like something just isn't optimized and is waiting on stuff unnecessarily or so :thinking:

What about GPU RAM? There are situations, mostly with MotionVector tracking, where GPU RAM leaks, the card crashes, and Flame turns into dogshit. In a terminal, type dmesg and look for NVIDIA entries. You can also monitor it via the Resource Monitor in Flame.


Interesting. This would be a Mac in this case. I checked VRAM usage and it all looked good; the GPU was doing mostly nothing.

If it is a memory leak, I don't think there is an easy way of monitoring or detecting it by looking at system analytics. More than happy to be corrected and enlightened on this, though.


I expected a memory leak to show up as excessive memory usage, but I might be wrong. I need to check some other comps and see what causes the drop-off in performance with almost no hardware usage.

Did you check VRAM in macOS or the Resource Manager in Flame?


Yeah, it's a long-done project; I'm just revisiting it to figure out why it was so damn slow.

I'm using the "Stats" tool in macOS, but I can check the Resource Manager.

I'm currently updating my Linux Flame to see if that same project runs 100x better, maybe…


It's extremely fast on my Linux Flame… so… it's just the Mac being a Mac, who could have guessed.

A 16-core Mac Pro with dual Vega II and an NVMe RAID vs. a DIY Linux Flame with a 16-core Ryzen CPU and a 3090…


It's interesting how the Mac Studio that is half the price of the Mac Pro is faster for Flame, but a gaming PC that is half the price again is faster still.


When a batch becomes unusually slow, my suspicion is always VRAM, regardless of what the Resource Manager shows. Sometimes this happens after Flame has been open for a long time, holding on to resources. I'm sure you've already done it, but restarting Flame works for me 70-80% of the time to get back to the original render time. I also try to keep the minimum possible number of batches, libraries, etc. open.

I have a custom machine with an "old" 1080 Ti (known here as a "gaming card"), a bit short on VRAM, but restarting Flame works fine for this kind of issue. On another server I use a 2080 Ti, which works really well and is solid.

But I would bet that with so many Cryptomatte nodes, it simply fills the VRAM. Flame is so VRAM-demanding.

Graphics capabilities on the Mac are a disaster. The AMD cards simply suck. It's also disappointing to see how, when rendering full batches, CPU consumption never exceeds 30-40%. That's why I switched to Linux when we renewed the machine. The Mac platform is a sad joke.


Yeah, even though the Mac has 32 GB of VRAM and the 3090 only has 24 GB :smiley:

I think it's just how the Mac is… everything just feels 10x as slow as on the Linux box.

It's not a perfect benchmark, and there were only two live Cryptomatte nodes left, but it's 5 fps playback on the Linux machine vs. 3-4 s per frame on the Mac, so… it's quite crazy, both machines having NVMe RAIDs, everything cached, etc.

Flame only uses one GPU, so if you are counting 32 GB as 2x 16 GB Vegas, well, no.

Vega II Duo: 64 GB total, 32 GB each.


Yeah, I mean, that's for finding what slows down the script. I understand that, and I know which nodes are heavy.

The thing that baffles me is that a "heavy node" should still use system resources; instead this Mac is idling, doing almost nothing, while Flame is slow as molasses.

Also, the same script on Linux uses way more resources, like 80% CPU on all 32 cores, and the GPU is doing things… So I assume there is a hardware bottleneck on the Mac that I can't see, something like moving frames from system RAM to the GPU, stuff you can't see in any kind of task manager.

Something that makes the CPU and GPU wait for data…
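One piece of that you can measure directly from userland is raw host-memory copy speed. A crude, stdlib-only timing like this (my own sketch, no GPU or PCIe involved, just a sanity check on plain RAM copies) at least rules the simplest case in or out:

```python
import time

def copy_bandwidth_gbs(size_mb=128, repeats=5):
    # time copying a large buffer to roughly estimate host memory
    # copy bandwidth in GB/s (crude: counts the copied bytes once,
    # though a copy is really one full read plus one full write)
    buf = bytes(size_mb * 1024 * 1024)   # zero-filled source buffer
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst = bytearray(buf)             # the actual copy being timed
        best = min(best, time.perf_counter() - start)
        del dst
    return (size_mb / 1024) / best       # GB copied / best elapsed seconds

print(f"~{copy_bandwidth_gbs():.1f} GB/s host copy bandwidth")
```

If two machines report wildly different numbers here, slow RAM or a throttled memory subsystem is at least plausible; if they match, the stall is more likely in driver or GPU transfer paths that this test never touches.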

I wonder if it could have something to do with the graphics API? You'd think it would either just work or not, but the Mac implementation of OpenGL is deprecated and some capabilities are limited.

That's just a wild stab in the dark, but I'm wondering if there is a node in your setup where this is the case, slowing it right down without anything showing up in the metrics.