• ForgotAboutDre@lemmy.world
      4 months ago

      They are copying. These LLMs are a product of their input, and solely a product of their input. That's why they'll often directly output their training data. Training on more data reduces this effect, which is why all these companies are stealing data and getting aggressive about stopping others from stealing theirs.

    • aStonedSanta@lemm.ee
      4 months ago

      Proof? I am fairly certain I am correct, but I will gladly admit fault. This whole LLM thing is indeed new to me too.