Hello, I’m looking to set up my own AI that I can talk to with my voice and that talks back to me with a generated voice. Is there a free and open-source project out there that makes this easy? It would be cool to see something like GPT4All implement something like this. I’m on Arch Linux using a 7900 XTX.
Whisper is the way to go for speech-to-text (edit: had that backwards). Whisper.cpp is decently fast too: https://github.com/ggerganov/whisper.cpp/releases/tag/v1.7.1 . Get the binaries from the link on that page (god, GitHub usability sucks).
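If you want to script it rather than run it by hand, here is a rough sketch of calling the whisper.cpp command-line binary from Python. Assumptions on my part: the binary is called `main` and sits in the current directory (the name may differ depending on the release you grab), the model file path is a placeholder, and the recording is a 16 kHz 16-bit WAV, which is what whisper.cpp expects. `-m` (model), `-f` (input file) and `-nt` (no timestamps) are whisper.cpp flags; the helper names are made up for illustration.

```python
import subprocess


def build_whisper_command(binary, model, wav):
    """Return the argument list for one whisper.cpp transcription run."""
    # -m: model file, -f: input WAV, -nt: print text without timestamps
    return [str(binary), "-m", str(model), "-f", str(wav), "-nt"]


def transcribe(binary, model, wav):
    """Run whisper.cpp once and return the text it prints to stdout."""
    cmd = build_whisper_command(binary, model, wav)
    # check=True raises CalledProcessError if the binary exits with an error
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()


if __name__ == "__main__":
    # All three paths are placeholders; point them at your own files.
    print(transcribe("./main", "models/ggml-base.en.bin", "recording.wav"))
```

The text that comes back can then be handed to whatever LLM you run locally, with a separate text-to-speech step for the reply.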
I thought Whisper was hallucinating huge chunks of text in that medical transcription app. Is it more reliable with smaller chunks?
Whisper is fantastic and comes in different-sized models, so you can zero in on whatever gives you the best mix of speed and accuracy for the hardware you’ll be running it on.
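To make the size trade-off concrete, here is a small sketch that picks the largest model fitting a memory budget. The memory figures are ballpark numbers as I recall them from the whisper.cpp README, so treat them as assumptions and check the README for current values. The function name and the budget parameter are hypothetical. Bigger models are generally more accurate but slower.

```python
# Approximate memory use per model size in MB (ballpark, assumed from the
# whisper.cpp README; verify against the current README before relying on it).
MODEL_MEMORY_MB = [
    ("tiny", 273),
    ("base", 388),
    ("small", 852),
    ("medium", 2100),
    ("large", 3900),
]


def pick_model(budget_mb):
    """Return the largest model whose approximate memory use fits the budget.

    Returns None if even the smallest model does not fit.
    """
    best = None
    # The list is ordered smallest to largest, so the last fit wins.
    for name, memory_mb in MODEL_MEMORY_MB:
        if memory_mb <= budget_mb:
            best = name
    return best


if __name__ == "__main__":
    for budget in (300, 1000, 8000):
        print(budget, "MB ->", pick_model(budget))
```

In practice you would still want to try a couple of sizes on your own recordings, since memory fit says nothing about how fast or accurate a given model is on your hardware.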