Ancestors

Toot

Written by Crow on 2025-01-07 at 21:20

Is there a FOSS alternative to Dragon NaturallySpeaking, one that hopefully does not rely on ridiculous AI/LLM technology that is burning the planet?

=> More information about this toot | More toots from Crow@pagan.plus

Descendants

Written by Nicola on 2025-01-07 at 22:44

@Crow I feel this is something that @KathyReid might be across, given her (nuanced and very cool) presentation at LCA a few years ago.

https://archive.org/details/lca2019-Open_Source_AI_and_speech_recognition

=> More information about this toot | More toots from nnye@aus.social

Written by Kathy Reid on 2025-01-07 at 23:26

@nnye @Crow Sorry, I thought I replied to this but it doesn't seem to have come through.

The one piece that I'm aware of is a project called Linux Voice Control, which uses Whisper under the hood - which isn't really open source, it's just freely available.

https://github.com/omegaui/linux-voice-control

I haven't tried this out at all so cannot vouch for usability, but it would be my starting point.

=> More information about this toot | More toots from KathyReid@aus.social

Written by Kathy Reid on 2025-01-07 at 23:11

@Crow The short answer is "sort of".

If we break down Dragon NaturallySpeaking (a Nuance product, and Microsoft announced its purchase of Nuance in April 2021 for about $16 billion, so it's Microsoft), it has two broad functions.

One is speech recognition, for which there are freely available but not really open source options, primarily Whisper.

The second is controlling the desktop through speech, where the main one I'm aware of is Linux Voice Control, although I haven't used it. I am fairly sure it uses Whisper under the hood, as evidenced by

https://github.com/omegaui/linux-voice-control/blob/c9238684d0dee6a42befb2034b4e73e007328a9d/requirements.txt#L9

https://github.com/omegaui/linux-voice-control

Happy to answer more Q's and thanks for the flag, @nnye !

=> More information about this toot | More toots from KathyReid@aus.social

Written by Benjamin Sonntag-King on 2025-01-07 at 23:11

@Crow As far as I know, Whisper (the transcription software from OpenAI) is open source

https://openai.com/index/whisper/

and as much as I hate OpenAI, this software is nice :)

DISCLAIMER: they say it should absolutely not be used to transcribe professional speech where a lot of words may not be known to the AI, since the program may invent terms...

https://www.thehindu.com/sci-tech/technology/openais-whisper-transcription-tool-used-in-hospitals-invents-things-no-one-ever-said-researchers-claim/article68805114.ece

that's a big IF for its usage...

=> More information about this toot | More toots from benjamin@piaille.fr

Written by Crow on 2025-01-07 at 23:23

@benjamin noted, thanks 👍

=> More information about this toot | More toots from Crow@pagan.plus

Written by Jeff Forcier on 2025-01-08 at 00:48

@Crow I haven’t used either so maybe this isn’t enough overlap, but: when I hear “open source speech recognition” I always see folks discussing Talon: https://talonvoice.com/

=> More information about this toot | More toots from bitprophet@social.coop

Proxy Information
Original URL
gemini://mastogem.picasoft.net/thread/113789160862779534