The AI will see it when it’s trained on it.
What’s keeping AI from training on Lemmy?
Hint: it’s (currently) much less popular than Reddit.
Reddit is the new Facebook at this point. A friend’s mom made a Reddit account to upvote cat pictures a couple of weeks ago.
I doubt her joining reddit will make it worse.
Doesn’t matter. They’ve already run out of most of the quality content they could find, and Reddit has limited who can train AI on its website.
Maybe she will join Lemmy.
Nothing exactly. But that’s okay, because the fediverse data is available to all, which makes it worthless, monetarily speaking. Nobody will sell your data to anyone. Any AI company could use the data to train their models, but they wouldn’t be able to sell those models since they wouldn’t be any better than an open source model. The fediverse levels the playing field and doesn’t allow the situation where Google pays reddit for AI training data.
They can still sell their services. Not every company wants to launch its own LLM.
Then they earn money from their services, not the model. Why should they harvest fediverse data? And so what if they do? Anyone can do that.
I’m just refuting your point that the data is worthless because anyone can train AI on it. It’s not worthless because although anyone can train their model on it, most companies would rather purchase the services from specialists, so all training data has value.
All the more reason not to post it in the first place.