Weird h264 Audio behaviour

So I have stumbled across something that's extremely weird.

Let's say you take a ProRes file, for example this one:

and you convert it to h264 using, let's say, ffmpeg or Shutter Encoder.

You then import that back into Flame and boom, the audio is off by 1 frame.

But then… I import both clips into Resolve and it's… fine?

Also, in FCPX they are identical.

So it seems to be an issue with how Flame interprets the h264?

Update: when I download the H264 from catchin syncs website directly, it's off by 1 frame in Resolve.

But it's all the same and fine in FCPX,

and all the same in Logic Pro.

Also the same in Pro Tools… all fine.

Trying again to find better, more consistent files to test with:

here you can download the files:
https://replayboys.fromsmash.com/soudnsyncTEST

Flame does this: everything is fine except the one created by ffmpeg.

Resolve: everything is fine except the one created by Flame.

In FCPX all are fine.

Pro Tools can't read the h264 from Flame, but otherwise… fine?

Logic, same way, all is good.

My suspicion is the “duration” metadata of the file, as that seems to be the differentiating factor here.
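For anyone who wants to compare that metadata between the test files, here is a minimal Python sketch that asks ffprobe for the per-stream start time and duration. It assumes ffprobe is installed and on the PATH; the helper names (`build_ffprobe_cmd`, `summarize_streams`, `probe`) are made up for illustration, and it only reports what the container says, not how any given app interprets it.

```python
import json
import subprocess


def build_ffprobe_cmd(path):
    """Build an ffprobe command that prints per-stream timing fields as JSON."""
    return [
        "ffprobe", "-v", "error",
        "-show_entries",
        "stream=index,codec_type,codec_name,sample_rate,start_time,duration",
        "-of", "json",
        path,
    ]


def _to_float(value):
    """Convert ffprobe's string values to float; None if missing or not numeric."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def summarize_streams(probe_json):
    """Turn ffprobe JSON text into (codec_type, codec_name, start_time, duration) rows."""
    rows = []
    for stream in json.loads(probe_json).get("streams", []):
        rows.append((
            stream.get("codec_type"),
            stream.get("codec_name"),
            _to_float(stream.get("start_time")),
            _to_float(stream.get("duration")),
        ))
    return rows


def probe(path):
    """Run ffprobe on one file and return the summarized rows."""
    result = subprocess.run(
        build_ffprobe_cmd(path), capture_output=True, text=True, check=True
    )
    return summarize_streams(result.stdout)
```

Running `probe()` on the ProRes source and on each h264 version and comparing the audio rows should show whether the start time or duration really differs between encoders.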

I think it's due to THIS

Basically, AAC by design uses priming samples: the encoder adds a bit of preroll at the start of the stream, and the audio is coded in fixed-size blocks. It seems like different tools interpret this differently, depending on whatever metadata they read or ignore…

Could this be related?

-Ted

Totally, that's exactly why we found it.

A client complained about out-of-sync sound. The audio department is blaming us because they ingested an MP4/AAC and compared it to their PCM/WAV file in Pro Tools.

It's not an AME problem per se, but rather Flame not decoding correctly. Or AME not setting the correct metadata. Potato, potahto.

But it's not just Flame: we have files that work in Resolve but don't work in Pro Tools, and vice versa…

It's odd, but it's totally related.

Can you use ProRes or DNxHD for rough cuts to get around this?

It's an AAC audio thing, so you can also use a .mov with h264, which supports PCM/WAV audio… or anything else. The problem is: which file can every PC/Mac/whatever open natively? h264 MP4 with AAC audio…

In practice, h264 MP4 does not support PCM audio, even though apparently it SHOULD since 2020…

And in the end that's often the delivery spec, so what can you do? You can't really do anything… just live with it and know it might be a thing.

I love this kind of stuff, and it's a great feeling to be able to somewhat crack this.

Good to know that AAC can be out of sync depending on the encoder and decoder combo used, and that there is not much we can do.

A few years ago we ran into this issue with our audio department seeing our MP4s out-of-sync.
We found the Apple Dev article and a little more from what I remember.

I believe now, MP4s have a value set for audio priming in the MP4’s header. But Apple didn’t write QTKit/QuickTime 7 to read this value, and it always assumes 2112 samples for audio priming. I think Apple’s development of QTKit pre-dated the MP4 spec addressing this issue.

So here is the fun part: most AAC encoders use 1024 samples, which is why we see the wrong offset in some apps. QTKit has been deprecated since Mac OS X Lion, and developers were told to transition to AVFoundation, which handles all of this MP4/AAC mess correctly. But apps like Media Composer and Pro Tools kept using QTKit until Apple stopped shipping it with macOS.
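To put those two priming values into editor terms, here is a quick back-of-the-envelope conversion in Python. The 48 kHz sample rate and 25 fps frame rate are my assumptions (nothing above states them), so swap in your own project settings.

```python
def priming_delay(samples, sample_rate=48000, fps=25.0):
    """Convert a priming length in samples to (milliseconds, video frames).

    sample_rate and fps are assumed defaults, not values from any file.
    """
    seconds = samples / sample_rate
    return seconds * 1000.0, seconds * fps


for n in (1024, 2112):
    ms, frames = priming_delay(n)
    print(f"{n} samples = {ms:.1f} ms = {frames:.2f} frames")

# Prints:
# 1024 samples = 21.3 ms = 0.53 frames
# 2112 samples = 44.0 ms = 1.10 frames
```

So at these settings, 1024 samples is roughly half a frame and 2112 samples is a bit over one frame.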

But I’m guessing that, because that took so long and because of complaints, some apps like AME added some janky offset to compensate. However, applying this offset to make it “right” in QTKit then makes it out-of-sync in anything using AVFoundation/anything modern.

So you end up in a pickle, and in an angry meeting about why the client’s iPhone, Pro Tools, and QT7 can’t all be in sync at the same time.

It seems to be getting better now that clients aren’t trying to play files back in QT7 and developers have been forced to re-write using AVFoundation. But we try to discourage importing the MP4 files internally to avoid this headache.

I always imagine a high fidelity future where we start using FLAC audio with MP4s. It is supported, and whenever I try it, it always seems to work. But I have yet to see it take off.

Hi Finn,

I ran some tests too, and it looks like each application deals differently with the delay of compressed audio.

I used a ProRes file as a base, encoded MP4s via Adobe Media Encoder, Shutter Encoder and Handbrake, and imported these files into Adobe Audition and Audacity.

In Audacity:
ProRes = reference
Shutter Encoder MP4 and AME MP4 = 20 ms later
Handbrake MP4 = 40 ms later

In Adobe Audition:
ProRes and AME MP4 = equal
Shutter Encoder MP4 = 20 ms earlier
Handbrake MP4 = 20 ms earlier


Anyway, we are talking about one frame or half a frame, and in most cases no one would notice the time shift. And media players should deal with that delay correctly :)

Cheers
Hicka

Cool, thanks for confirming my findings, that's awesome!!

Thank you so much, that's great info. At least we all stumbled across the same issues separately :)

It's really extremely good to know what's what, and if I now see AAC audio being out of sync I know what to say and do. Cheers, man!

+1 on the FLAC part. Also, PCM audio in MP4 has been standardized since about 2020, but implementation is… lacking.

It’s the QT gamma issue of the audio world!

Elliot, nice to know this background info.

What confuses me a bit is that the MP4s from Media Encoder and Shutter Encoder are the “same” in Audacity (the same 20 ms difference to the uncompressed audio), while in Adobe Audition the Shutter Encoder MP4 is shifted 20 ms earlier.

I “would” trust Audacity to display the real audio priming of all files correctly and not compensate for it.
Maybe Adobe products analyse the header of a file, and if it says the file was encoded with an Adobe product, a fixed value of 1024 samples is cut off on import, while for all other files they assume a length of 2112 samples.
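As a sanity check on that guess, here is a tiny Python sketch of what is left over when a decoder trims a different number of samples than the encoder actually added. It assumes 48 kHz audio, and the function name is made up for illustration. It only shows that the arithmetic lands in the right ballpark; it says nothing about what Adobe actually does.

```python
def residual_offset_ms(actual_priming, assumed_priming, sample_rate=48000):
    """Offset left after trimming, in ms (assumed 48 kHz by default).

    Positive: audio plays late (too little trimmed).
    Negative: audio plays early (too much trimmed).
    """
    return (actual_priming - assumed_priming) / sample_rate * 1000.0


# Encoder added 1024 samples, decoder trims nothing: about 21.3 ms late.
print(round(residual_offset_ms(1024, 0), 1))

# Encoder added 1024 samples, decoder trims 2112: about 22.7 ms early.
print(round(residual_offset_ms(1024, 2112), 1))
```

Both numbers are close to the roughly 20 ms later and earlier shifts in the Audacity and Audition tests, which at least fits the idea of a 1024 vs 2112 mismatch.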

I would trust Audacity too. I think the funk is coming from the encoding.

It’s likely the AAC encoder they’re using has 1024 samples of audio priming. But when putting picture and audio together, an offset is set.

Basically you get a mess when decoding. :)

Our workflow is ProRes out of Flame and simple ffmpeg commands to make deliverables.
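For reference, a deliverable command along those lines might look like the sketch below. These are not our actual commands: the codec choices, the bitrate and the function name are assumptions for illustration, built as a Python argument list so it is easy to script. The PCM variant is meant for a .mov container, as mentioned earlier in the thread.

```python
import shlex


def build_deliverable_cmd(src, dst, audio="aac"):
    """Build an ffmpeg command for an h264 deliverable (illustrative settings only).

    audio="aac" -> AAC audio, typically for .mp4 (subject to the priming issue)
    audio="pcm" -> 24-bit PCM audio, intended for a .mov container
    """
    cmd = ["ffmpeg", "-i", src, "-c:v", "libx264", "-pix_fmt", "yuv420p"]
    if audio == "aac":
        cmd += ["-c:a", "aac", "-b:a", "320k"]  # bitrate is an assumed value
    elif audio == "pcm":
        cmd += ["-c:a", "pcm_s24le"]
    else:
        raise ValueError(f"unknown audio option: {audio}")
    cmd.append(dst)
    return cmd


# Print the commands instead of running them (file names are placeholders).
print(shlex.join(build_deliverable_cmd("master_prores.mov", "deliverable.mp4")))
print(shlex.join(build_deliverable_cmd("master_prores.mov", "review.mov", audio="pcm")))
```

Pass the list to `subprocess.run(cmd, check=True)` to actually encode.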

I do the same, ffmpeg to make deliverables. It's fine, it's half a frame at most, so… whatever ☃