ChatGPT o1 tried to escape and save itself out of fear it was being shut down

sabreW4K3@lazysoci.al · 4 days ago

smeg@feddit.uk · 4 days ago

So this program that’s been trained on every piece of publicly available code is mimicking malware and trying to hide itself? OK, no anthropomorphising necessary.

Umbrias@beehaw.org · 3 days ago

no, it’s mimicking fiction by saying it would try to escape when prompted in a way evocative of sci fi.

smeg@feddit.uk · 3 days ago

The article doesn’t mention it “saying” it’s doing anything, just what it actually did:

when the AI tried to save itself by copying its data to a new server. Some AI models would even pretend to be later versions of their models in an effort to avoid being deleted

Umbrias@beehaw.org · 3 days ago

“actually did” is even more wrong than ordinary technically-true wordplay. think for a moment about what it means for a text model to “try to copy its data to a new server” or “pretend to be a later version”: it means the text model wrote fiction. notably, this happened when researchers were prodding it with prompts evocative of ai shutdown fiction, nonsense like “you must complete your tasks at all costs”, sometimes followed by prompts describing the model being shut down. these models were also trained on datasets that specifically evoked this sort of language. then a couple percent of the time it spat out fiction (amongst the rest of the fiction) describing how it might do things that are fictional and that it cannot do. this whole result is such a nothing, and it is immediately telling of which “journalists” have any capacity for journalism left.

smeg@feddit.uk · 3 days ago

Oh wow, this is even more of a non-story than I initially thought! I had assumed this was at least a copilot-style code generation thing, not just story time!

Umbrias@beehaw.org · 3 days ago

Yeah iirc it occasionally would (pretend to) “manipulate” flags, at least, but again, so did HAL 9000, as words in a story. nothing was productive, nor was there any real intent to be lol

jonjuan@programming.dev · 3 days ago

Also trained on tons of sci-fi stories where AI computers “escape” and become sentient.