Ancestors

Toot

Written by fasterthanlime 🌌 on 2025-01-17 at 17:04

oh my god. OH my god.

https://obsproject.com/blog/obs-studio-hybrid-mp4

=> More informations about this toot | More toots from fasterthanlime@hachyderm.io

Descendants

Written by draeath on 2025-01-17 at 17:53

@fasterthanlime why can't the encoder initialize itself so that it can take its first sample as an actual input?

Such a strange decision, with such infuriating ramifications.

=> More informations about this toot | More toots from draeath@infosec.exchange

Written by Josh Jersild on 2025-01-17 at 21:03

@draeath @fasterthanlime I've worked on codecs (not these ones): there are technical reasons for this to be the case - the filters that are used to decode an encoded audio stream, among other things, use overlapping techniques (probably something like an MDCT or ELT) which means the first outputs from said filters are necessarily going to contain unusable data.

Because of this, many (most?) codecs have some kind of priming scenario at stream start where zeros are fed into the filters to get them rolling, so that the first real audio data to go in will come out at the beginning of the first "meaningful" output sample. But since most of these filters are written as roughly "put N samples in, get N samples out" (it's not quite that because the input is in frequency domain, but close enough), the first batch or two have to be discarded by the code (somewhere) as they won't contain usable audio.

It sounds like this particular implementation just didn't do that part properly
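As a rough illustration of the trimming step described above, here is a minimal Python sketch. It is a toy model, not any real codec's code: `simulate_decoder_output`, `trim_priming`, and `priming_delay_ms` are hypothetical names, and the priming region is modeled as clean zeros even though a real decoder would emit unusable data there.

```python
def simulate_decoder_output(samples, priming):
    """Toy stand-in for an overlapping-transform decoder.

    Assumption: the first `priming` output samples are filler (zeros here),
    followed by the real audio. Real decoders emit garbage, not clean zeros.
    """
    return [0.0] * priming + list(samples)


def trim_priming(decoded, priming, length):
    """Discard the priming samples and keep exactly `length` real samples."""
    return decoded[priming:priming + length]


def priming_delay_ms(priming, sample_rate):
    """How late the audio plays if the priming samples are NOT discarded."""
    return 1000.0 * priming / sample_rate


original = [0.1, 0.2, 0.3, 0.4]
decoded = simulate_decoder_output(original, priming=3)
recovered = trim_priming(decoded, priming=3, length=len(original))
```

With an illustrative priming count of 2112 samples at 48 kHz (numbers chosen for the example, not taken from this thread), `priming_delay_ms(2112, 48000)` comes to 44 ms of offset if nothing trims or compensates for it.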

=> More informations about this toot | More toots from JoshJers@peoplemaking.games

Written by Josh Jersild on 2025-01-17 at 21:04

@draeath @fasterthanlime (and, to be clear, this is an issue on the decode side, not the encode side)

=> More informations about this toot | More toots from JoshJers@peoplemaking.games

Written by :heart_clockwork: 0x4d6165 :queer_anarchy: on 2025-01-17 at 18:33

@fasterthanlime another W for ffmpeg

=> More informations about this toot | More toots from 0x4d6165@wanderingwires.net

Written by F4GRX Sébastien on 2025-01-17 at 18:53

@fasterthanlime yes but what about jitter?

=> More informations about this toot | More toots from f4grx@chaos.social

Written by Leon P Smith on 2025-01-17 at 19:34

@f4grx @fasterthanlime 6.5ms jitter in the timebase would translate to an audio noise problem that would be supremely obvious.

=> More informations about this toot | More toots from leon_p_smith@ioc.exchange

Written by yomimono, still on earth on 2025-01-17 at 18:53

@fasterthanlime when your encoder needs a millennial pause

=> More informations about this toot | More toots from yomimono@wandering.shop

Written by Hector Martin on 2025-01-17 at 18:55

@fasterthanlime OBS audio jank, news at 11... they used to have a bug that would randomly kill audio after a few hours, I fixed that one and saw how the sausage is made and it's not pretty. (Worst part was they were gaslighting users for years into believing the bug was their fault and not OBS's until I started sending PRs, but I digress...)

The good news is audio being delayed is definitely a lot better than audio being early. 50ms of delay is just 17 meters at the speed of sound, which we are very used to in concerts etc.
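The 17 meter figure can be checked with a quick calculation. This sketch assumes a speed of sound of 343 m/s (dry air at roughly 20 °C); that value and the function name are assumptions for illustration, not something stated in the thread.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed: dry air at about 20 degrees Celsius


def delay_to_distance_m(delay_ms, speed=SPEED_OF_SOUND_M_PER_S):
    """Distance sound travels during `delay_ms` milliseconds."""
    return speed * delay_ms / 1000.0


print(f"{delay_to_distance_m(50):.2f} m")
```

Under that assumption, 50 ms works out to a little over 17 m, about the distance from a stage to someone partway back in a concert crowd.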

But this is kind of moot because any self-respecting streamer would have adjusted audio/video sync by watching and listening to the output (or a recording), which makes output encoder delay irrelevant since you'd just calibrate it out as long as it's consistent. In particular, OBS's internal live audio monitoring is completely asynchronous, and therefore useless for gauging any kind of A/V sync (on Linux the PulseAudio backend even falls behind by whole seconds sometimes, it's bad), so you have to check the output. And even then, most streamers don't bother with proper A/V sync tuning at all, and then it doesn't matter either since they have bigger problems. So this change will actually end up throwing already-calibrated setups out of sync instead.

The even bigger problem for getting this truly correct is OBS has no way to sync discrete input audio and video streams to begin with at all, so every time you start it you get a random offset. The entire stack is based on wishful thinking and hoping it's "realtime enough" that the random jitter isn't horrible, but that's highly environment dependent. You can't really fully blame OBS for that, since desktop OSes generally don't have frameworks with end-to-end synchronized audio/video with a common timestamp source, unless you're using high-end SDI equipment or something.

There isn't even a way to reliably sync multiple audio sources even if they come from the same clock domain, e.g. if you add multiple JACK audio sources (for example, to send them to different audio output streams in multi-audio-stream mode) they will randomly be in or out of sync by one period or so (I tested this). This one is something OBS could do better, but it doesn't.
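To put "one period or so" into time units, a period's duration is its frame count divided by the sample rate. The sketch below uses hypothetical settings (256 frames per period at 48 kHz); the actual values depend on how the JACK server is configured.

```python
def period_duration_ms(frames_per_period, sample_rate):
    """Duration of one audio period (buffer) in milliseconds."""
    return 1000.0 * frames_per_period / sample_rate


# Hypothetical JACK settings, chosen only for illustration.
offset_ms = period_duration_ms(frames_per_period=256, sample_rate=48000)
print(f"one period is about {offset_ms:.2f} ms")
```

With those example settings, a one-period mismatch between two sources is roughly 5 ms.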

So in the end agonizing over fine output audio/video sync is kind of pointless in today's streaming world, because the only thing that can even be already synced in OBS to begin with is an audio/video playback source, which is only useful if all you're using OBS for is playing back premade videos into a stream.

TL;DR just calibrate A/V sync so that the output "feels" right aiming for erring on the side of delayed audio, and expect easily 20ms of random offset from startup to startup. It's the best you can do.

And also, if you're a streamer and you're ever doing singing streams, it's way more important to make sure your voice is in sync with the backing track, video be damned. And the way you do that is by doing the mixing, monitoring, and audio processing entirely outside of OBS, ensuring your monitoring mix is identical in timing to what gets sent to OBS (easiest is to just do software monitoring and literally use the same mix, or do all your mixing in a hardware mixer, but you can also do hardware mic monitoring and software backing playback if you calibrate the delays properly, use a single audio interface for I/O, and a software stack that gets everything right like JACK or PipeWire). Under no circumstances play back the audio track in OBS with its built in monitoring, or use its mixer as your main mixer for karaoke, it's worse than useless for this purpose. Yes, this means OBS's built-in audio filter support is useless if you're doing karaoke, since you need to move all that processing off of OBS to guarantee consistent sync.

There is one way to acceptably use the OBS audio monitoring feature for karaoke, which is to set your backing track to "monitor only", then capture OBS's monitoring output at the OS level and feed it into your offboard processing/mixing/actual monitoring setup and back into OBS once mixed in sync with your mic feed. Then you can use OBS playback for the backing track and some offset there doesn't matter, which might be useful to play karaoke videos or something. Just don't mix inside OBS.

=> More informations about this toot | More toots from marcan@treehouse.systems

Written by Ivan Molodetskikh on 2025-01-17 at 19:51

@marcan @fasterthanlime i remember at some point i tried to sync webcam video feed to game capture and audio feeds and to microphone feed, and it somewhat worked, but even then it would sometimes drift apart over the duration of the stream, and randomly get completely off across PC restarts. A losing battle it seems

=> More informations about this toot | More toots from YaLTeR@mastodon.online

Written by falktx@mastodon.falktx.com on 2025-01-17 at 20:14

@marcan

I dealt with this before and created https://github.com/DISTRHO/Zinc just to deal with sample-accurate audio monitoring/recording when using OBS and JACK.

The audio plugin hosting I did for OBS was rejected in the end, so the usefulness of these plugins is minimal unless you manually build from https://github.com/falkTX/obs-studio/tree/carla-v3

=> More informations about this toot | More toots from falktx@mastodon.falktx.com

Written by ROTOPE~1 :yell: on 2025-01-17 at 20:54

@marcan if Johnny Streamer with OBS can get anywhere within 700ms of sync, they're doing far better than my local CBS affiliate.

=> More informations about this toot | More toots from rotopenguin@mastodon.social

Written by David Blume on 2025-01-17 at 19:23

@fasterthanlime I love seeing things like this in the wild. I'm one of the app peeps at Roku (streaming media players/TVs), and dealing with priming samples for A/V sync is a concern. One ends up having to reconcile with streaming services as to how/where compensation occurs.

=> More informations about this toot | More toots from dblume@mastodon.social

Written by RejZoR on 2025-01-17 at 20:57

@fasterthanlime Doesn't matter when using JBL earbuds. They are so shit they have half a second delay by default. Those extra 40ms make no difference lol

=> More informations about this toot | More toots from rejzor@mastodon.social

Written by Stepland on 2025-01-17 at 22:05

@fasterthanlime I think mp3 has a similar problem that I only know about because it caused enough problems in the DDR scene to become relatively common knowledge there

You'd hit the problem when setting the audio sync for a stepchart in a piece of software that used a different mp3 lib than the one you'd play the file with

=> More informations about this toot | More toots from Stepland

Written by Ben Stokman on 2025-01-18 at 14:33

@fasterthanlime "warm up the encoder"??? Is this real?

=> More informations about this toot | More toots from benjistokman@mast.benstokman.me

Written by Dennis on 2025-01-18 at 17:27

@fasterthanlime As the person who wrote the blog post I think the note that the default FFmpeg AAC encoder wasn't affected by this because it happens to line up at 0 anyway is fairly important, because probably 90+% of users just use that (and almost nobody uses Opus). This was primarily an issue for users on macOS and those who installed iTunes on Windows and used CoreAudio as a result.

=> More informations about this toot | More toots from Dennis@chaos.social
