Mac Studio finally in my hands: a testing surprise

I finally got my hands on a Mac Studio for testing. This isn’t my machine, so it’s not fully spec’d out the way I would have preferred, but I’m still very grateful to have been given the opportunity to test it.

It’s a Mac Studio M2 Ultra with 64GB of RAM and a 60-core GPU, running Flame 2024.2.1.

I’m reaching out to all Flame artists who have access to an M series Mac to help us test the performance of Apple’s M series computers with Flame. Our recent experience with an M2 Ultra chip has yielded some surprising results, and I need your expertise to further understand and analyse these findings.

In the past we have used the benchmark test, but I wanted to throw some of our real-world comp shots at it.

Here’s a summary of our initial testing:

Impressive Rendering Speed: The render speed on the M2 Ultra chip for basic HDTV-sized shots was exceptionally fast - nearly three times faster than our newest Intel Mac and over 10 times faster than our older Linux machines. Please allow me to indulge you with a graph :grin:

[Image: HD_compTime]

Giddy with excitement after that first result, we decided to throw some 4K at it. Our last project was a UHD vertical project for a door-sized installation. We initially ran this project on our newest Intel Mac, but soon found it to be sluggish at 4K, with better results on our aging Linux systems. This seemed like a good test for Flame 2024 on the M2.

Challenges with Complex Jobs: However, when faced with a more challenging resolution, such as one of our vertical UHD setups, the M2 machine took roughly twice as long.

These results initially blindsided me because of just how much slower the M2 was. However, a closer look at the two setups highlighted a significant difference. Our 4K setup involved lots of CG particles with LS_Airglow, LS_Glint, and some defocus.

In fact, when I simplified my setup and just used ColorNoise with a defocus blur node set to a value of 20, the results were shocking. I tested 200 frames on three of our systems: M2 (Flame 2024.2.1), one of our old Linux boxes (Flame 2023.3), and our Intel Mac (Flame 2023). The render time with the blur node applied to UHD in a vertical format skyrocketed on the M2, reaching 218.66 seconds. :scream:

[Image: blurNode]

I ran my particle comp on all available machines: the M2 test machine, two aging Linux boxes, and our Intel Mac. I modified the setup from UHD vertical to UHD and finally HDTV to see the render speeds.

[Image: disneyParticles]

The comp was consistently slower on the M2 this time. I would love your help, and I’m calling upon any Flame artists with access to Apple M series computers to test my blur defocus theory. Keep it simple: Color Noise and the blur node set to defocus 20. It might well be my M2 Ultra configuration.

[Image: simpleSetup]

I also tested a couple of other nodes: y_lensblur and Autodesk denoise. I started with my favorite defocus matchbox first and then denoise, because I know it can be quite a heavy process. In both cases I used 200 frames of Color Noise followed by the effect.

[Image: matchbox]
[Image: denoise]

In none of these cases did the change in UHD orientation have a massive effect on render time. In fact, in both cases, the M2 was much faster.

Here’s how you can help:

  1. Testing: Got an Apple M series computer? Try out Flame 2024 with blur defocus at different resolutions. Let us know how it goes - share render times, stability, and any surprises you encounter (there’s a small timing sketch just after this list to help keep the numbers comparable).
  2. Comparative Analysis: If you’re working with Flame on M series Apple hardware and other setups like Intel Mac or Linux, compare the performance. I’d be curious to see how they stack up.
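
To keep everyone’s numbers comparable, here is a minimal stopwatch-and-CSV sketch in Python. To be clear, this is not my renderTime script, just a stand-in: the render trigger is hypothetical, so wire `time_it()` up to however you kick off a render on your setup, or call the logger by hand from the Flame Python console.

```python
import csv
import time
from pathlib import Path

RESULTS = Path("render_times.csv")

def log_render(label, seconds):
    """Append one timing result to a shared CSV so results are easy to compare."""
    new_file = not RESULTS.exists()
    with RESULTS.open("a", newline="") as f:
        writer = csv.writer(f)
        if new_file:
            writer.writerow(["label", "seconds"])
        writer.writerow([label, f"{seconds:.2f}"])

def time_it(label, render_fn, *args, **kwargs):
    """Time any callable (whatever triggers your render) and log the result."""
    start = time.perf_counter()
    render_fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    log_render(label, elapsed)
    print(f"{label}: {elapsed:.2f}s")

# Usage (my_render_trigger is hypothetical - substitute your own):
# time_it("UHD vertical, Color Noise + defocus 20, 200 frames", my_render_trigger)
```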

I am trying to get a better idea of Flame’s performance on the Mac Studio to help me make the right hardware decision.

Thank you for your support and collaboration.

5 Likes

Very interesting observations.

I can’t add anything to your data points - just some basic observations. I spend more time at my M1 Mac Studio than at my Linux Flame or my big Windows Nuke/CG box, out of pure office convenience. And while there are things the Mac Studio is blazing fast at, there are also things where it’s anything but fast.

I’ve seen others write about it, but I haven’t seen any specific conclusions on where the skeletons are. My guess is it has to do with the Apple Silicon approach to GPU processing. While unified memory and some other aspects may favor their approach (e.g. no separate VRAM to worry about), I think they’re still missing some of the secret sauce that NVIDIA and AMD have built up over the decades, or may simply not be able to get there because of system architecture constraints.

All that to say: I’m not at all surprised by your findings. There are probably logical explanations for them, which will be good to get our hands on so we can make more informed decisions, but I haven’t seen anything concise that explains it with actual root causes and data.

1 Like
| CPU | RAM | GPU | OS / Model | Storage speed | Flamebench2015 (batchified) | mographAction | mographBatch | OflowAbelMilanes | CGintegration02 | CGintegration01 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| M2 Ultra | 128GB | M2 Ultra | Mac Studio (full spec) | 3GB/s | 00:07:41 | 01:30:00 | 00:00:49 | 00:05:59 | 00:01:09 | 00:00:37 |
| 16-core Xeon | 256GB | Vega II Duo | Mac Pro | 10GB/s | 00:11:20 | 01:00:00 | 00:01:26 | 00:30:00 | 00:02:17 | 00:00:58 |
| Ryzen 5950X | 128GB | RTX 3090 | Linux / DIY workstation | 3.4GB/s | 00:09:00 | 00:08:20 | 00:00:42 | 00:15:42 | 00:01:19 | 00:01:21 |

This sort of matches my observations. I tried different OFLOW setups and other things to get some data on a fully specced M2 Ultra.

It’s definitely hard to compare, because it isn’t simply “the M2 is X times faster”. It depends on WHAT you are doing: it might be faster than a Linux machine, or it might not.

1 Like

Just as an aside, I remember a couple of years ago my Intel MacBook would render particles way quicker than my specced-up Linux box - 40 minutes vs many hours.

1 Like

Hi Richard,

I have a fully spec’d Mac Pro M2 Ultra, and I did a quick test of the Color Noise/Blur Defocus 20 setup. In landscape format it took 25 seconds to render 200 frames. In portrait format it took 4 minutes and 38 seconds. Pretty surprising. Specs: 192GB RAM, 9.2GB/s storage.

1 Like

@PlaceYourBetts what are the specs on the aging Linux boxes you’re comparing to the M2?

1 Like

I’ve got an M3 Max here - 14-core CPU / 30-core GPU / 36GB unified memory / 1TB SSD storage. I can contribute to these tests if you want…

2 Likes

Ok, so they are from 2016:

Dual Xeon E5-2640 v4 (10-core)
128GB RAM
8GB GeForce GTX 1080

1 Like

It’s Monday here now, and I was able to get a bit more time testing this M2.

Ran the Flame benchmark - 05:29 (329.2s)

A very respectable time, but I just can’t tear myself away from this Autodesk blur (defocus) anomaly.

I just ran colour noise at various resolutions through a blur node set to a defocus value of 20 and recorded the times in seconds using renderTime_v2.py (842 Bytes)

I get a very unusual spike in render time around 3840px.

[Image: render time of Autodesk defocus at various resolutions on M2]
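
If anyone wants to map the spike more precisely, a simple width sweep through the suspect zone around 3840 should show exactly where the cliff starts. A rough sketch below - `render_seconds()` is a hypothetical hook, since I’m timing the renders manually; replace it with whatever triggers and times the 200-frame defocus render on your machine:

```python
# Sweep widths through the suspect zone around 3840px and print the times.
# render_seconds() is a hypothetical stand-in: swap in your own trigger/timer.
def render_seconds(width, height):
    # Stub: run the 200-frame defocus render at this size, type in the seconds.
    return float(input(f"render seconds for {width}x{height}: "))

HEIGHT = 2160
for width in range(3584, 4097, 64):
    seconds = render_seconds(width, HEIGHT)
    print(f"{width}x{HEIGHT}: {seconds:.2f}s")
```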

Throwing one more variable into the mix - which version of Flame are you testing? Pre-2024.1 you had a totally different graphics engine, which might affect blur node processing - blurring is usually a GPU-intensive algorithm.

That doesn’t explain the bump at 3840, but it should be accounted for in the comparison data.

Pure informed speculation on your 3840 anomaly - maybe, given the way this blur node is coded and adapted to Apple Silicon GPU processing, it hits some API limitation and degrades into a different processing path (e.g. falling back to the CPU) that penalizes you.
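
One way to test that theory without waiting on ADSK: watch GPU activity while the problem render runs. On macOS the built-in powermetrics tool can sample GPU power and residency; if the GPU sits mostly idle during the slow 3840 vertical render while CPU cores peg, that points at a CPU path. A quick Python wrapper (needs sudo), run alongside the render:

```python
# Sample Apple GPU power/residency while a render runs (macOS built-in tool).
import subprocess

subprocess.run([
    "sudo", "powermetrics",
    "--samplers", "gpu_power",  # GPU frequency, residency, and power samples
    "-i", "1000",               # one sample per second
    "-n", "120",                # stop after 120 samples (~2 minutes)
])
```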

If that is the case, then the conclusion changes - it’s not that the M2 is overall worse than Linux, but that there are a few specific nodes/operations that are problematic and may need to be optimized by ADSK. Finding them all will be another matter.

At least in the case of the blur node, there’s lots of alternatives to pick from. Others may not be so lucky.

@PlaceYourBetts It’s worth reporting this to ADSK.

I was just playing around and came across an interesting situation.

If you change the blur nodes on each res to 30, you get the opposite effect, i.e. vertical is faster than landscape.

If you change the blur nodes to 50 then you get the same times on each.

Something weird is going on with the way blur is handling resolutions at different blur values.

I also noticed that if you duplicate the blur node and apply it to the other res, the blur values change, again affecting render times.
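
To pin that down, I’d sweep both variables at once and look at the grid. A rough sketch - `time_render()` is a hypothetical stand-in for however you time each 200-frame render:

```python
# Time every blur value x orientation combination and print a small grid.
# time_render() is a hypothetical hook: replace it with your own timing.
def time_render(width, height, blur_value):
    return float(input(f"seconds for {width}x{height} @ blur {blur_value}: "))

ORIENTATIONS = {"landscape": (3840, 2160), "portrait": (2160, 3840)}
BLUR_VALUES = [20, 30, 50]

print("blur" + "".join(f"{name:>12}" for name in ORIENTATIONS))
for blur in BLUR_VALUES:
    row = [time_render(w, h, blur) for (w, h) in ORIENTATIONS.values()]
    print(f"{blur:>4}" + "".join(f"{t:>11.2f}s" for t in row))
```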

Yeah I have been in touch with support. Very curious :face_with_monocle:

Curious to know what the render times would be for Timewarp ML.

UHD is currently about a minute a frame on a maxed-out Mac Studio, so… brutal. This will hopefully get better with the updates discussed here. The Flame ML stuff runs as well as it does on Linux, from my testing.

In my testing with Timewarp ML on M2, HD frames take four or five minutes each. Not good at all, but we know the code is meant to run on CUDA.

I’ve set up a gaming PC (3040 GPU) to run RIFE on Windows, and it’s 1/5th the time to round-trip Timewarps out to that PC and back into Flame.

1 Like