This is actually an insane amount of gaslighting from #LLM.
The #Whisper transcription model, after detecting another language or accent, automatically translates the user's speech into that language, even if the person is speaking English! Basically it outputs every transcription in the wrong language, but with the correct meaning.
I'm just trying to transcribe some subtitles for #38c3 talks with English audio and it's giving me German, with no way to turn this off because it's baked into the model.
https://community.openai.com/t/whisper-is-translating-my-audios-for-some-reason/86468
=> More information about this toot | More toots from h4sh@infosec.exchange
Can anyone recommend a local-only transcription software that doesn't use LLMs?
=> More information about this toot | More toots from h4sh@infosec.exchange
@h4sh You can use OpenAI's Whisper with their local models, so nothing leaves your device: https://github.com/openai/whisper
For Mac I can recommend https://goodsnooze.gumroad.com/l/macwhisper as an easy-to-use version of it.
If you want to avoid LLMs completely (even offline), I suppose you won't find comparable solutions.
=> More information about this toot | More toots from ron@chaos.social
@ron Yea, MacWhisper is where I encountered the bug; the OpenAI forum links are users hitting the same issue with Whisper. Using the Turbo model instead of the older "small" Whisper model fixed the issue.
Kind of sad that there aren't good non-LLM alternatives offline, but I guess transcription software has always been based on machine learning and transformers, so it makes sense that the latest ones are all LLMs.
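For anyone hitting the same thing, a minimal sketch of the fix described above, assuming the `openai-whisper` CLI is installed (`pip install openai-whisper`) and `talk.mp3` is a placeholder for your audio file:

```shell
# Pinning the language and task stops Whisper's auto-detection from
# switching the transcript into another language (German, in this case).
# "turbo" is the newer model that avoided the bug in this thread.
whisper talk.mp3 --model turbo --language en --task transcribe
```

The same options exist in the Python API (`model.transcribe("talk.mp3", language="en", task="transcribe")`) if you're scripting it rather than using the CLI.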
=> More information about this toot | More toots from h4sh@infosec.exchange
@h4sh Oh, I see. I thought the issue reports referred to the OpenAI API only. So my idea was that the older/smaller local models might be less „clever".
Good to know that using another model helps mitigate this issue.
=> More information about this toot | More toots from ron@chaos.social
@ron Yea.. I think the OpenAI API for Whisper just uses the same models they open-source online. Compared to ChatGPT, Whisper is a lot more special-purpose and lightweight (it takes less than 5% of my MacBook's battery and about 10 minutes to transcribe an hour of audio).
Whereas the new o1 and o3 models would probably burn up a third of my battery for a single query (if they even ran locally).
=> More information about this toot | More toots from h4sh@infosec.exchange