Hello, I’m looking to set up my own AI to where I can use my voice to talk to it and it will talk back to me with a generated voice. Is there free and open-source project out there where I can do this easily? It would be cool to see something like GPT4All implement something like this. I’m on Arch Linux using a 7900 XTX.

  • Scrubbles@poptalk.scrubbles.tech
    link
    fedilink
    English
    arrow-up
    29
    arrow-down
    1
    ·
    edit-2
    2 days ago

    Generally there are not LLMs that do this, but you start building up a workflow. You speak, one service reads in the audio and translates it to text. Then you feed that into an LLM, it responds in text, and you have another service translate that into audio.

    Home Assistant is the easiest way to get them all put together.

    https://www.home-assistant.io/integrations/assist_pipeline

    Edit agree with others below. Use the apps that are made for it.

    • Whisper for STT
    • Any hosted LLM can work, text-generation-webui or tabbyapi
    • I use xttsv2 for TTS