A courts reporter wrote about a few trials. Then an AI decided he was actually the culprit.

Stopthatgirl7@lemmy.world · 1 month ago


femtech@midwest.social · 1 month ago

https://cloud.google.com/discover/what-are-ai-hallucinations#:~:text=AI hallucinations are incorrect or,medical diagnoses or financial trading.

AI hallucinations are incorrect or misleading results that AI models generate. These errors can be caused by a variety of factors, including insufficient training data, incorrect assumptions made by the model, or biases in the data used to train the model.

EpeeGnome@lemm.ee · 1 month ago

Yes, hallucination is the now-standard term for this, but it’s a complete misnomer. A hallucination is when something that does not actually exist is perceived as if it were real. LLMs do not perceive, and therefore can’t hallucinate. I know, the word is stuck now and fighting against it is like trying to bail out the tide, but it really annoys me and I refuse to use it. The phenomenon would better be described as a confabulation.

mindlesscrollyparrot@discuss.tchncs.de · 1 month ago

Sure, but which of these factors do you think were relevant to the case in the article? The AI seems to have had a large corpus of documents relating to the reporter. Those articles presumably stated clearly that he was the reporter and not the defendant. We are left with “incorrect assumptions made by the model”. What kind of assumption would that be?

In fact, all of the results are hallucinations. It’s just that some of them happen to be good answers and others are not. Instead of labelling the bad answers as hallucinations, we should be labelling the good ones as confirmation bias.

femtech@midwest.social · 1 month ago

It was an incorrect assumption based on his name being in the article. It should have listed him as the author only, not a part of the cases.

mindlesscrollyparrot@discuss.tchncs.de · 1 month ago

That is the error that the model made. Your quote talks about the causes of these errors. I asked what caused the model to make this error.

snooggums@lemmy.world · 1 month ago

Hallucination is a fancy word for being wrong.

chiisana@lemmy.chiisana.net · 1 month ago

The models are not wrong. The models are nothing but a statistical model that’s really good at predicting the next word that is likely to follow based on the prior information given. It doesn’t have an understanding of the context of the words, just that statistically they’re likely to follow. As such, all LLM outputs are correct to their design.

The users’ assumption/expectation of the output being factual is what is wrong. Hallucination is a fancy word used in an attempt to make the users not feel as upset when the output passage doesn’t match their assumption/expectation.
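To make the "predicts the next word" point concrete, here is a minimal sketch of a toy bigram model in Python. It is only an illustration of the idea, not how any real LLM is built: the corpus and the names `train_bigram` and `most_likely_next` are made up for this example, and a real model uses a neural network over tokens rather than raw word counts.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    follows = defaultdict(Counter)
    words = text.lower().split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def most_likely_next(follows, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    candidates = follows.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


# Hypothetical toy corpus, invented for illustration only.
corpus = (
    "the reporter wrote about the trial . "
    "the reporter wrote about the verdict . "
    "the defendant was convicted ."
)

model = train_bigram(corpus)

# The model only knows which word tends to come next.
# It has no notion of whether the resulting sentence is true.
print(most_likely_next(model, "reporter"))
print(most_likely_next(model, "judge"))
```

The lookup returns whatever followed most often in the training text, and `None` for a word it never saw. Nothing in it checks facts, which is the sense in which the output is "correct to the design" even when it is wrong about the world.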

snooggums@lemmy.world · 1 month ago

The users’ assumption/expectation of the output being factual is what is wrong.

So randomly spewing out bullshit is the actual design goal of AI models? Why does it exist at all?

ApexHunter@lemmy.ml · 1 month ago

They’re supposed to be good at transformation tasks. Language translation, create x in the style of y, replicate a pattern, etc. LLMs are outstandingly good at language transformer tasks.

Using an LLM as a fact-generating chatbot is actually a misuse. But they were trained on such a large dataset and have such a large number of parameters (175 billion!?) that they passably perform in that role… which is, at its core, to fill in a call+response pattern in a conversation.

At a fundamental level it will never ever generate factually correct answers 100% of the time. That it generates correct answers > 50% of the time is actually quite a marvel.

chiisana@lemmy.chiisana.net · edit-2 1 month ago

If memory serves, 175B parameters is for the GPT-3 model, not even the 3.5 model that caught the world by surprise; and they have not disclosed the parameter space for GPT-4, 4o, and o1 yet. If memory also serves, 3 was primarily English, and had only a relatively small set of words (I think 50K or something to that effect) it was considering as next-token candidates. Now that it is able to work in multiple languages and is multimodal, the parameter space must be much, much larger.

The number of things it can do now is incredible, but our perceived incremental improvements in LLMs will probably slow down (due to the pace fitting to the predicted lines in log space)… until the next big thing (neural nets > expert systems > deep learning > LLM > ???). Such an exciting time we’re in!

Edit: found it. Roughly 50K tokens for the input/output embedding in GPT-3. 3Blue1Brown has a really good explanation here for anyone interested: https://youtu.be/wjZofJX0v4M
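For a rough sense of scale, here is a back-of-the-envelope sketch of how much of the 175B total the token embedding accounts for. It assumes the commonly cited GPT-3 figures of a 50,257-token vocabulary and a 12,288-dimensional embedding; neither exact number appears in the comment above, so treat them as assumptions.

```python
# Assumed figures (commonly cited for GPT-3, not stated exactly in the thread).
VOCAB_SIZE = 50_257             # "roughly 50K tokens"
EMBED_DIM = 12_288              # assumed width of each token's embedding vector
TOTAL_PARAMS = 175_000_000_000  # the 175B figure mentioned in the thread


def embedding_params(vocab_size, embed_dim):
    """One embedding vector per token: vocab_size rows of embed_dim weights."""
    return vocab_size * embed_dim


emb = embedding_params(VOCAB_SIZE, EMBED_DIM)
share = emb / TOTAL_PARAMS

print(f"embedding parameters: {emb:,}")
print(f"share of 175B total:  {share:.2%}")
```

Under those assumptions the embedding table comes to about 618 million weights, well under 1% of the total, so the vocabulary size on its own says little about the overall parameter count.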

snooggums@lemmy.world · 1 month ago

They’re supposed to be good at transformation tasks. Language translation, create x in the style of y, replicate a pattern, etc. LLMs are outstandingly good at language transformer tasks.

That it generates correct answers > 50% of the time is actually quite a marvel.

So good as a translator as long as accuracy doesn’t matter?