Hey team, I ran a basic Flame GPU benchmark on a bunch of cards I’ve got in my office today.
Idea was yoinked from randy in the M1 thread!
My methodology was incredibly simple, as I wanted to exercise the GPU exclusively. I fed colour noise @ 4K and @ 8K into a motion analysis node, which fed into a render node (rendering both forward and backward vectors), and timed how long it took to render 200 frames @ 4K and 100 frames @ 8K.
The cards in question are: A5000, 3090, 3080 and 3060. These are all Ampere-based cards, so the results were quite predictable. The biggest surprises were the 3080 comfortably beating the A5000, and the 3090 still being significantly faster than the 3080.
I also thought the significantly larger VRAM of the A5000 would help it in the 8K test vs. the 3080, as VRAM usage was just starting to exceed 10GB, but it didn’t. This may be because the A5000 uses GDDR6 while the 3080/3090 use GDDR6X.
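For context, here is a rough back-of-envelope comparison of theoretical memory bandwidth. The bus widths and data rates below are from NVIDIA's published specs as I remember them, so treat them as assumptions rather than measurements; interestingly, the A5000 and 3080 come out nearly identical on paper, with the 3090 well ahead:

```python
# Theoretical memory bandwidth (GB/s) = bus width (bits) / 8 * effective data rate (Gbps).
# Spec figures below are assumed from public datasheets; verify against your own cards.
def bandwidth_gbps(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Return theoretical peak memory bandwidth in GB/s."""
    return bus_width_bits / 8 * data_rate_gbps

cards = {
    "A5000 (GDDR6)":  (384, 16.0),
    "3080 (GDDR6X)":  (320, 19.0),
    "3090 (GDDR6X)":  (384, 19.5),
}

for name, (bus, rate) in cards.items():
    print(f"{name}: {bandwidth_gbps(bus, rate):.0f} GB/s")
```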
ECC also seemed to add a solid 5% performance hit, but I confess I may not have set this up correctly; I just switched it on in the nvidia-settings window.
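If anyone wants to replicate the ECC toggle outside the GUI, the standard `nvidia-smi` driver tool can query and set it (this isn't Flame-specific, and the change only takes effect after a reboot):

```shell
# Check current ECC state and error counters per GPU
nvidia-smi -q -d ECC

# Enable ECC on GPU 0 (requires root; takes effect after reboot)
sudo nvidia-smi -i 0 -e 1

# Disable it again
sudo nvidia-smi -i 0 -e 0
```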
A final puzzling note (and one that might be solved by re-running the DKU between GPU swaps, something I was not prepared to do): my box is A LOT snappier with the 3090 inside. Loading up Flame is almost instantaneous with it; the others chug along for a bit first. Once Flame is open it seems fine, however.
I’ve got a real rinky-dink spreadsheet you can see with some real numbers.
Another useful test would be taking a heavy Action scene and timing both viewport playback and a render from it. If anyone has an Action like that to share, I would love to give it a try. I would also love to test an RDNA2 card, but I don’t have any around.
Because the methodology is so simple, please feel free to add your times or verify my own. I know where an RTX 8000 is, which I might run these tests with.
Conclusion: Surprise! The 3090 will save you a lot of time; not worth forking out more for an A5000 unless you need its blower design and/or lower TDP for a rack installation (though there are blower-design 3090s out there…).
Hey, the benchmark I made was incredibly simple and is different from the archive linked above. My benchmark only generates motion vectors, so it isolates the GPU. I can upload the two-node batch for it in a moment if you’re still interested!
I suspected that the Ultra might be around 15 based on the M1 Max tests I did; I think they were around 24 if memory serves. There was no way it was going to get to that a6000rtx/3995 bench under emulation. Not bad numbers, but definitely not Threadripper either.
Ran the bench again last night on a fully maxed-out M1 Max 16 on 2022.3.
The times are nowhere near 5… I don’t know how you’re getting those speeds @gizmorivera unless you’re doing the GPU bench and not the timeline bench. The timeline bench wouldn’t have a 4K and 8K difference; it’s 4K only.
@david-parker’s speeds seem consistent with the timeline bench given what my M1 Max is doing. Maybe we’re both somehow getting it wrong…
Probably a daft question: just running the benchmark on my M1 Max MBP, and I was wondering how you can see how long a render takes?
I know the progress bar has a countdown, but I don’t know what it started at, and I don’t remember ever having seen a figure for render times once it’s completed.
I haven’t done any benchmarking in particular but when ML timewarp first came out I was using a 2080. I bought a 3080 on release and it was significantly faster. My 3090 was a decent bump over the 3080 too.
I’ll put it on my to-do list for testing and report back!