• CubitOom@infosec.pub
    English
    arrow-up
    10
    ·
    9 months ago

    So what is Home Assistant using for this?

    If I were to build it myself I’d probably overcomplicate it by using multiple LLM agents on a local server: probably Whisper to do the speech-to-text, and then Mistral fine-tuned on the Rosetta Code dataset to send the API calls to HA. However, that wouldn’t keep it from always listening to me and trying to interpret whatever I say as a command for HA. Is that just a prompting issue for Whisper, or would I need another agent to turn Whisper on?

    I could maybe get this to run without specialized hardware like a GPU, but it would be better to have something so the LLMs are a bit more responsive.
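
    A rough sketch of one way to handle the always-listening problem, assuming the transcript arrives as plain text: only treat an utterance as a command if it contains a wake phrase. The phrase `hey assistant` and the function name are made up for illustration. This still leaves Whisper transcribing everything; a dedicated wake-word detector in front of Whisper would probably be the cheaper route.

```python
import re

WAKE_PHRASE = "hey assistant"  # hypothetical wake phrase, pick your own


def extract_command(transcript, wake_phrase=WAKE_PHRASE):
    """Return the words spoken after the wake phrase, or None if it is absent."""
    # Lowercase and replace punctuation with spaces so "Hey, Assistant!" still matches.
    words = re.sub(r"[^\w\s]", " ", transcript.lower()).split()
    wake = wake_phrase.lower().split()
    for i in range(len(words) - len(wake) + 1):
        if words[i:i + len(wake)] == wake:
            command = " ".join(words[i + len(wake):])
            # A wake phrase with nothing after it is not a command.
            return command or None
    return None
```

    Anything that returns `None` would simply be dropped instead of being passed on to the model that talks to HA.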

    • redcalcium@lemmy.institute
      English
      arrow-up
      6
      ·
      9 months ago

      There is no LLM; it’s just used to recognize simple commands such as “turn on kitchen light”. What the “conversation agent” can do is very limited, though you can extend it to recognize custom commands. It’s not comparable to Google Assistant/Siri, let alone ChatGPT.
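
      To illustrate the general idea of template-based command recognition (this is not Home Assistant's actual code or intent format, just a sketch with a made-up pattern table):

```python
import re

# Hypothetical sentence templates; extending the assistant means adding rows here.
PATTERNS = [
    (re.compile(r"^turn (?P<state>on|off) (?:the )?(?P<name>.+)$"), "light"),
]


def parse_command(text):
    """Match a sentence against the templates; return an intent dict or None."""
    cleaned = text.lower().strip().rstrip(".!?")
    for pattern, domain in PATTERNS:
        match = pattern.match(cleaned)
        if match:
            return {
                "domain": domain,
                "service": "turn_" + match.group("state"),
                "name": match.group("name"),
            }
    # Anything that fits no template is simply not understood.
    return None
```

      Free-form requests fall through to `None`, which is the limitation compared to an LLM-backed assistant.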

      • 4am@lemm.ee
        English
        arrow-up
        4
        ·
        9 months ago

        I believe there is a ChatGPT integration in the works (optional, of course)

        • Serinus@lemmy.world
          English
          arrow-up
          3
          ·
          9 months ago

          If it runs locally, that’ll be awesome. I just hope it never decides to turn the heat up to 90F.

          • Saik0@lemmy.saik0.com
            English
            arrow-up
            2
            ·
            9 months ago

            There are plenty of local LLM options these days. It’s entirely feasible to run it in-house.

            And if someone can do it… I would suspect that there’ll be a HACS module up about 2 weeks ago…

          • Buddahriffic@lemmy.world
            English
            arrow-up
            2
            ·
            9 months ago

            Ideally, IMO, you’d want a system with safeties in place, like acceptable temperature ranges or maximum durations for the oven to be on, to avoid situations where the software misinterprets a command in a dangerous way.

            Something like this:

            User: Set temperature to 19 degrees. (Yeah it’s on the cold side even for Celsius, but not a crazy amount as room temperature is around 22 degrees)

            Assistant: Setting temperature to 90 degrees. (Deadly in Celsius… Water boils at around 100 degrees, depending on pressure)

            Assistant: 90 degrees is outside of the safe range defined by your configuration. Intrusion suspected. Deploying sentry guns.
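
            Sentry guns aside, the range check itself could look roughly like this. The limits, names, and exception type are assumptions for illustration, not anything Home Assistant ships:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SafeRange:
    low: float
    high: float


# Hypothetical limits in degrees Celsius; would come from user configuration.
THERMOSTAT_RANGE = SafeRange(low=16.0, high=26.0)


class UnsafeSetpoint(ValueError):
    """Raised when a requested value falls outside the configured safe range."""


def check_setpoint(requested, allowed=THERMOSTAT_RANGE):
    """Return the requested setpoint unchanged, or raise if it is out of range."""
    if not (allowed.low <= requested <= allowed.high):
        raise UnsafeSetpoint(
            f"{requested} is outside the safe range {allowed.low}-{allowed.high}"
        )
    return requested
```

            Rejecting instead of clamping seems safer here: a misheard 90 should not quietly become the maximum allowed value.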

            • AA5B@lemmy.world
              English
              arrow-up
              2
              ·
              9 months ago

              Good question - I have an allowed range configured on my thermostat but I don’t know if it applies to API calls or is just for the UI

      • CubitOom@infosec.pub
        English
        arrow-up
        1
        ·
        9 months ago

        OK, hmm, I wonder how much work it would be to implement it using open-source models. I think the hardest part would be translating the voice instructions to an API call that HA can use correctly.
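
        For reference, the API-call end of that is fairly small. As far as I understand it, HA's REST API takes service calls as a POST to `/api/services/<domain>/<service>` with a long-lived access token as a bearer token. A sketch that only builds the request (the host, token, and entity ID used with it are placeholders):

```python
import json
import urllib.request


def build_service_request(base_url, token, domain, service, entity_id):
    """Build (but do not send) a service-call request for Home Assistant."""
    url = f"{base_url.rstrip('/')}/api/services/{domain}/{service}"
    body = json.dumps({"entity_id": entity_id}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

        Sending it would be a matter of passing the result to `urllib.request.urlopen`. The hard part stays where it was: getting the model to pick the right domain, service, and entity ID from what was said.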

        Then there is the whole hardware issue to fix too. I do know that some SoCs are getting good at running 7B-parameter models locally, but the cost is still probably going to be prohibitive.