Need some advice:
I'm looking for an open source text-to-speech library I can run locally or on a server. It needs to be pretty pleasant to listen to, even if generation takes a bit longer. Ideally #python as the language, but I'm open to looking around.
Recommendations? Found StyleTTS2 so far.
=> More informations about this toot | More toots from mkennedy@fosstodon.org
@mkennedy I tried this myself recently, and found it harder than I expected. Lots of people saying "just use one of the many amazing modern AI generated systems, download a voice you like, and you are good". But the options I tried were all a mess and hard to install.
I settled on piper, which only works against Python <= 3.10 (eye roll). I have the deadsnakes PPA installed, so this Python is just an apt install for me...
=> More informations about this toot | More toots from tartley@mastodon.social
@mkennedy On Ubuntu/Pop!_OS, I ended up writing my own script to wrap the installation of piper and its dependencies, downloading a voice, and the ultimate invocation:
https://github.com/tartley/dotfiles/blob/main/bin/say
=> More informations about this toot | More toots from tartley@mastodon.social
@mkennedy I would love to hear if there are better options or I'm doing it wrong.
=> More informations about this toot | More toots from tartley@mastodon.social
@mkennedy It's possible to make a less clumsy invocation of piper in bash which pipes the output directly into aplay, but this fails if your text to speak includes a period, because piper does some sort of seek on the output if it contains multiple sentences.
=> More informations about this toot | More toots from tartley@mastodon.social
@mkennedy (eye roll again)
=> More informations about this toot | More toots from tartley@mastodon.social
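A minimal Python sketch of the less fragile route hinted at above: instead of piping piper straight into aplay, write a complete WAV file first and play it afterwards. This assumes the `piper` and `aplay` executables are on PATH and that the seek problem only affects piped output, as the toot suggests. The helper names (`piper_command`, `aplay_command`, `say`) are made up for illustration and are not taken from the linked script.

```python
import subprocess
import tempfile
from pathlib import Path


def piper_command(model: str, wav_path: str) -> list[str]:
    """Build the piper invocation; piper reads the text to speak from stdin."""
    return ["piper", "--model", model, "--output_file", wav_path]


def aplay_command(wav_path: str) -> list[str]:
    """Build the aplay invocation that plays a finished WAV file."""
    return ["aplay", "--quiet", wav_path]


def say(text: str, model: str) -> None:
    """Synthesize text to a temporary WAV file, then play it.

    Writing to a real file lets piper seek in its output, which a pipe
    does not allow (assumption based on the behaviour described above).
    """
    with tempfile.TemporaryDirectory() as tmp:
        wav_path = str(Path(tmp) / "speech.wav")
        subprocess.run(
            piper_command(model, wav_path),
            input=text.encode("utf-8"),
            stdout=subprocess.DEVNULL,  # piper prints the output path; hide it
            check=True,
        )
        subprocess.run(aplay_command(wav_path), check=True)
```

Calling `say("Hello there. This is a second sentence.", "en_US-lessac-medium.onnx")` would then speak both sentences, where the model path is only a placeholder for whichever voice file you downloaded.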
@tartley Thanks for all of this, Jonathan! I'm starting to get the same feeling: clumsy academic projects that are either not installable or very hard to install.
=> More informations about this toot | More toots from mkennedy@fosstodon.org