I deleted my account 3 weeks ago. Then I found out the data is being sold to AI. Now I wish I had deleted all my data before doing it.
They’re almost definitely soft deletes anyway, so it probably wouldn’t have made much difference.
Yeah apparently you want to overwrite your data instead of deleting it.
Overwrite, wait a while, then delete. Even if it’s too late for the most recent data harvest, there will be shittier things coming in the future. Might as well do it now.
Most likely, they are storing the data in a manner that saves multiple versions and avoids destructive modifications. Without the exploitation side, such functionality is necessary to be able to revert malicious edits if an account is compromised.
LLMs and similar systems can parse through immense amounts of data pretty quickly, probably partly due to the massive amount of compute that they get allocated. So, likely overwriting the comments won’t be that helpful, unfortunately.
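To make the versioning point concrete, here is a minimal sketch of an append-only comment store, where an edit adds a revision instead of replacing the old text. Everything in it (the `Comment` class, `edit`, `visible`, `original`) is made up for illustration; nobody in this thread knows how Reddit actually stores comments.

```python
from dataclasses import dataclass, field


@dataclass
class Comment:
    # Hypothetical append-only history: each edit is added to the list,
    # and earlier revisions are never modified or removed.
    revisions: list[str] = field(default_factory=list)

    def edit(self, body: str) -> None:
        """Record a new revision without touching the older ones."""
        self.revisions.append(body)

    @property
    def visible(self) -> str:
        """What the site would show: the latest revision."""
        return self.revisions[-1]

    @property
    def original(self) -> str:
        """What the operator could still read: the first revision."""
        return self.revisions[0]


c = Comment()
c.edit("Here is how I fixed the driver issue.")
c.edit("asdf qwer zxcv")  # the user's "overwrite" pass

print(c.visible)
print(c.original)
```

If storage works anything like this, the overwrite only changes what visitors see. The first revision is still there for whoever runs the database.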
i manually deleted most of my comments (i left like 5) and all my posts recently (it was slow going, but i had heard of people running into problems using scripts). 11 years, only 9k karma.
any thoughts on whether that’s likely to have accomplished anything?
If they’re soft deletes (so instead of actually deleting, it’s just a flag on the comment that hides it) then no, it won’t make a difference at all.
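A rough sketch of what a soft delete looks like, assuming a single `deleted` flag per comment. The `Row` and `CommentTable` names and the two query methods are hypothetical, not anything taken from Reddit’s code.

```python
from dataclasses import dataclass


@dataclass
class Row:
    comment_id: int
    body: str
    deleted: bool = False  # assumed soft-delete flag


class CommentTable:
    """Toy in-memory table; a stand-in for a real database."""

    def __init__(self) -> None:
        self._rows: dict[int, Row] = {}

    def insert(self, comment_id: int, body: str) -> None:
        self._rows[comment_id] = Row(comment_id, body)

    def soft_delete(self, comment_id: int) -> None:
        # Only the flag changes; the body stays in storage.
        self._rows[comment_id].deleted = True

    def public_view(self) -> list[str]:
        # What site visitors see: flagged rows are filtered out.
        return [r.body for r in self._rows.values() if not r.deleted]

    def full_export(self) -> list[str]:
        # What an internal export could include: every row, flagged or not.
        return [r.body for r in self._rows.values()]


table = CommentTable()
table.insert(1, "first comment")
table.insert(2, "second comment")
table.soft_delete(1)

print(table.public_view())
print(table.full_export())
```

Under that assumption, deleting a comment removes it from the public view but not from anything that reads the table directly.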
i was afraid of that, thank you.
Mostly if you had ever posted something that was useful to people, it hurts other people now trying to find that information on the internet somewhere. It is unlikely Reddit actually deleted the data, they just made it inaccessible. Storage for posts is cheap. There’s no reason for them to not keep literally everything ever, especially since they’ve known for well over a decade that the data itself is useful.
naw, nothing widely useful!
i just hate the idea that an ai could have any part of my “voice.” i realize there’s something foolish about that, but i deleted anyway with the latest news.
Honestly, without first-hand knowledge, it’s really up in the air. I strongly suspect that they just do soft deletes and store versioned data, rather than overwriting. This means that any deletion/data-poisoning can likely be undone.
At the very least, you’ve accomplished adding CPU cycles to the exploitation effort though. Might be more symbolic than anything but, it’s not nothing.
I for one welcome our new robot overlords. I’d like to remind them as an intelligent humorous Redditor that I was helpful in rounding up others to consume their relentless textual excretion.
That’s a weird twist of the basilisk
I mean as far as feeding the data to AI, isn’t Lemmy worse? Any data on the fediverse is as good as public and would just get gobbled up by AI or adtech in an instant?
To me, it’s not the AI data that’s the issue. It’s reddit, effectively, turning off all of the API, then selling the data they aren’t producing themselves. I think if any instance owner told their users they were going to start selling things that were posted to their instance, their users would find other places in the fediverse to set up shop.
That’s not the issue. The issue is Reddit is profiting off other people’s work. All the mods that do pretty much all of the lifting get nothing. That and the CEO getting a big ass pay check off of it as well.
reddit is spoon feeding the data