Flame GPU Benchmarks

Hey team, I ran a basic Flame GPU benchmark on a bunch of cards I’ve got in my office today.

Idea was yoinked from randy in the M1 thread!

My methodology was incredibly simple as I wanted to 100% use the GPU exclusively. I fed colour noise @ 4K and @ 8K into a motion analysis node, which then fed into a render node (rendering both forward and backward vectors). I timed how long it took to render 200 frames @ 4K and 100 frames @ 8K.
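For anyone adding their own times, here is a minimal Python sketch of turning a stopwatch reading into throughput, so the 4K and 8K runs can be compared on the same footing. The function names and the example timings are made up for illustration; they are not measured results from this test.

```python
def parse_elapsed(text: str) -> float:
    """Convert 'ss', 'mm:ss' or 'hh:mm:ss' into a number of seconds."""
    seconds = 0.0
    for part in text.strip().split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def frames_per_second(frame_count: int, elapsed: str) -> float:
    """Average throughput for a render of frame_count frames."""
    total = parse_elapsed(elapsed)
    if total <= 0:
        raise ValueError("elapsed time must be positive")
    return frame_count / total


if __name__ == "__main__":
    # Placeholder timings for illustration only, NOT measured results.
    example_runs = {
        "4K, 200 frames": (200, "1:40"),
        "8K, 100 frames": (100, "3:20"),
    }
    for label, (frames, elapsed) in example_runs.items():
        fps = frames_per_second(frames, elapsed)
        print(f"{label}: {fps:.2f} fps, {1 / fps:.2f} s per frame")
```

Comparing seconds per frame rather than total time makes it easier to line up the 200-frame and 100-frame runs.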

The cards in question are the A5000, 3090, 3080 and 3060. These are all Ampere-based cards, so the results were quite predictable. The biggest surprises were the 3080 comfortably beating the A5000 and the 3090 still being significantly faster than the 3080.

I also thought the significantly larger VRAM of the A5000 would help it in the 8K test vs. the 3080, as VRAM usage was just starting to exceed 10GB, but it didn’t. This may be because the A5000 uses GDDR6 while the 3080/3090 use GDDR6X.

ECC also seemed to add a solid 5% performance hit but I confess that I may not have set this up correctly. I just switched it on in the nvidia-settings window.
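One way to double-check both the VRAM usage and whether ECC is actually on is to ask `nvidia-smi` directly while the render is running. Below is a rough Python sketch, assuming `nvidia-smi` is on the PATH; the helper names are my own, and cards without ECC support typically report `[N/A]` for the ECC field.

```python
import csv
import io
import subprocess
from typing import Dict, List

# Fields passed to nvidia-smi's --query-gpu option.
QUERY_FIELDS = ["name", "memory.used", "memory.total", "ecc.mode.current"]


def parse_gpu_rows(csv_text: str) -> List[Dict[str, str]]:
    """Parse 'csv,noheader,nounits' output into one dict per GPU."""
    rows = []
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    for raw in reader:
        if not raw:
            continue  # skip blank lines
        rows.append(dict(zip(QUERY_FIELDS, (value.strip() for value in raw))))
    return rows


def query_gpus() -> List[Dict[str, str]]:
    """Run nvidia-smi once and return the parsed rows (memory values in MiB)."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_gpu_rows(result.stdout)


if __name__ == "__main__":
    try:
        for gpu in query_gpus():
            print(
                f"{gpu['name']}: {gpu['memory.used']} / {gpu['memory.total']} MiB, "
                f"ECC {gpu['ecc.mode.current']}"
            )
    except (FileNotFoundError, subprocess.CalledProcessError) as err:
        print(f"Could not query nvidia-smi: {err}")
```

Running this before and after flipping the ECC switch would at least confirm whether the setting took effect.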

A final puzzling note (one that might be solved by running the DKU between GPU swaps, something I was not prepared to do): my box is A LOT snappier with the 3090 inside. Loading up Flame is almost instantaneous with it. The others chug along for a bit first. Once Flame is open, however, it seems fine.

I’ve got a real rinky-dink spreadsheet you can see with some real numbers.

Another test that would be useful to run would use a heavy Action, both for viewport testing and to render from. If anyone has an Action like that to share, I would love to give it a try. I would also love to test an RDNA2 card but I don’t have any around.

Because the methodology is so simple, please feel free to add your times or verify my own. I know where an RTX 8000 is, which I might run these tests on.

Conclusion: Surprise! The 3090 will save you a lot of time. It's not worth forking out more for an A5000 unless you need its blower design and/or lower TDP for rack installation (though there are blower design 3090s out there…).

Well that’s kind of surprising (and sad), the 3090 being almost 1K cheaper than an A5000.
Annoyingly Teradici also requires Quadro cards to work in Linux :frowning:

Hey!

Would you mind sharing the setup for this? Would love to try it on my new build.

Thanks,

Andrew

It’s on the Logik Portal as well as…

Thanks Randy!!

Hey, the benchmark I made was incredibly simple and is different from the archive linked above. My benchmark only generates motion vectors so that it can isolate the GPU. I can upload the two-node batch for it in a moment if you are still interested!

Edit: Here is the link

Flame GPU Benchmark 2022.3

I have yet to use Teradici. I recently bought an A2000 and it’s sitting inside my Flame box, as it only requires PCIe power. Would this enable Teradici?

Honestly I’m not sure, but as long as it’s in the Quadro range it should enable it, yes.

Apple M1 Ultra
flame benchmark test
14:03
WTF?

Did someone throw some numbers down?

I suspected that the Ultra might be around 15 based on the M1 Max tests I did; I think they were around 24, if memory serves. There was no way it was going to get to that RTX A6000/3995 bench under emulation. Not bad numbers, but definitely not Threadripper either.

There is a benchmark for M1 Max at 5:16 by @gizmorivera on the Google Sheet. I’m guessing that is the MacBook Pro. Did you do this test yourself, @david-parker?

yes
M1 MAX 64gb
MacBook Pro (16-inch, 2021)

Something went wrong… I am assuming your stone is a 7200 rpm external drive… a local SSD or a RAID 0 SSD is a lot faster.

Ran the bench last night again on a fully maxed out M1 Max 16 on 2022.3.

The times are nowhere near 5… I don’t know how you’re getting those speeds @gizmorivera unless you’re doing the GPU bench and not the timeline bench. The timeline bench archive wouldn’t have a 4K and 8K difference. It is only 4K.

@david-parker’s speeds seem consistent with the timeline bench given what my M1 Max is doing. Maybe we’re both somehow getting it wrong…

Something seems weird: my 2020 10-core iMac/5700 XT runs the timeline bench in 14:00 flat.

Argument for a new bench right there.

Perhaps a Logik Live on how to run the bench consistently, @andymilkis?

@friendlyape thanks for doing this! If you do another run, or have your 3090 in, how does it handle the ML timewarp? Curious if there is a difference.

Thanks
Brooks

Probably a daft question: I'm just running the benchmark on my M1 Max MBP, and I was wondering how you can see how long a render takes?
I know the progress bar has a countdown, but I don’t know what it started at, and I don’t remember ever having seen a figure for render times once it’s completed.

I haven’t done any benchmarking in particular but when ML timewarp first came out I was using a 2080. I bought a 3080 on release and it was significantly faster. My 3090 was a decent bump over the 3080 too.

I’ll put it on my to-do list for testing and report back!