• Dark ArcA
    1 year ago

    Furthermore, LLMs have been shown to do many things that aren’t in their training data, so the notion that they’re stochastic parrots is also false.

    And (from what I’ve seen) they get things wrong with extreme regularity, increasingly so as things diverge from the training data. I wouldn’t say they’re a “stochastic parrot”, but they don’t seem to be much better when things need to be correct… and again, based on my (admittedly limited) understanding of their design, I don’t anticipate this technology (at least without some kind of augmented approach that can reason about the substance) overcoming that.

    In order for you to learn about fishing, you had to learn a shitload about the world. Babies don’t come out of the womb able to do such tasks; there is a shitload of prerequisite knowledge needed in order to fish, and it’s unfair to expect an AI to do this without that prerequisite knowledge.

    That’s missing the forest for the trees. Of course an AI isn’t going to go fishing. However, I should be able to assert some facts about fishing and it should be able to reason based on those assertions. E.g. a child can work off of facts presented about fishing: “fish are hard to catch in muddy water” -> “the water is muddy, does that impact my chances of catching a bluegill?” -> “yes, it does, bluegill are fish, and fish don’t like muddy water”.
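A minimal sketch of the inference chain in the bluegill example, written as a hand-coded rule check. This is only an illustration of what "reasoning from asserted facts" means here, not how an LLM works internally; the fact tuples and the `harder_to_catch` function are hypothetical names made up for this example.

```python
# Hypothetical toy fact base: every tuple here is an illustrative assumption.
facts = {
    ("is_a", "bluegill", "fish"),  # asserted: bluegill are fish
    ("water", "muddy"),            # asserted: the water is muddy
}


def harder_to_catch(species, facts):
    """Chain the asserted facts and return (answer, explanation)."""
    is_fish = ("is_a", species, "fish") in facts
    muddy = ("water", "muddy") in facts
    # Asserted rule: fish are hard to catch in muddy water.
    if is_fish and muddy:
        return True, f"{species} are fish, and fish are hard to catch in muddy water"
    return False, "the asserted facts do not imply any effect"


answer, why = harder_to_catch("bluegill", facts)
print(answer, "-", why)
```

The point of the sketch is that the conclusion follows only from the stated premises: remove either fact and the answer changes, which is the behaviour being asked of the model.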

    There are also “teachings” brought about by how these are programmed that make the flaws less obvious. E.g., if I try to repeat the experiment in the post here, Google’s Bard outright refuses to continue because it doesn’t have information about Ryan McGee. I’ve also seen Bard get notably better as it’s been scaled up. Early on I tried asking it about RuneScape and it spewed absolute nonsense. Now… it’s reasonable-ish.

    I was able to reproduce a nonsense response (once again) by asking about RuneScape. I asked how to get 99 Firemaking, and it invented a mechanic that doesn’t exist: “Using a bonfire in the Charred Stump: The Charred Stump is a bonfire located in the Wilderness. It gives 150% Firemaking experience, but it is also dangerous because you can be attacked by other players.” This is a novel (if not creative) invention of Bard, likely derived from advice for training Prayer (which does have something in the Wilderness which gives 350% experience).

    • Communist@beehaw.org
      1 year ago

      And (from what I’ve seen) they get things wrong with extreme regularity, increasingly so as things diverge from the training data. I wouldn’t say they’re a “stochastic parrot”, but they don’t seem to be much better when things need to be correct… and again, based on my (admittedly limited) understanding of their design, I don’t anticipate this technology (at least without some kind of augmented approach that can reason about the substance) overcoming that.

      Keep in mind, you’re talking about a rudimentary, introductory version of this. My argument is that we don’t know what will happen when they’ve scaled up. We know for certain that hallucinations become less frequent as the model size increases (see the statistics on GPT-3 vs GPT-4 on hallucinations). Perhaps they only occur because the models haven’t reached a critical size yet? We don’t know.

      There’s so much we don’t know.

      That’s missing the forest for the trees. Of course an AI isn’t going to go fishing. However, I should be able to assert some facts about fishing and it should be able to reason based on those assertions. E.g. a child can work off of facts presented about fishing: “fish are hard to catch in muddy water” -> “the water is muddy, does that impact my chances of catching a bluegill?” -> “yes, it does, bluegill are fish, and fish don’t like muddy water”.

      https://blog.research.google/2022/05/language-models-perform-reasoning-via.html

      They do this already, albeit imperfectly, but again, this is, like, a baby LLM.

      and just to prove it:

      https://chat.openai.com/share/54455afb-3eb8-4b7f-8fcc-e144a48b6798