Google (and search engines in general) is at least providing a service by indexing and making discoverable the websites they crawl. OpenAI is just hoovering up the data and providing nothing in return. Socializing the cost, privatizing the profits.
That’s not a meaningful distinction. I spent all day using a Copilot search engine because the answers I wanted were scattered across a bunch of different documentation sites.
It was using the AI models to interpret my commands (not generation at all), and it only published content to me specifically.
It’s absolutely a meaningful distinction. Search engines push people to your website, where you can capitalize on your audience however you see fit. LLMs take your content, throw it through the mixer, and sell it back to people. It’s the difference between a movie reviewer explaining a movie and a dude in an alley selling a pirated copy of the movie.
A) An LLM does not inherently sell you anything. Some companies charge you to run and use their LLMs (OpenAI), and some companies publish their LLMs open source for anyone to use (Meta, Microsoft). With neural chips starting to pop up in PCs and phones, pretty soon anyone will be able to run an open source LLM locally on their machine, completely for free.
B) LLMs still rarely regurgitate the exact same original source. This would be more like someone in the back alley putting on their own performance of the movie and morphing it and adjusting it in real time based on your prompts and comments, which is a lot closer to parody and fair use than blatant piracy.
Google (and search engines in general) is at least providing a service by indexing and making discoverable the websites they crawl. OpenAI is just hoovering up the data and providing nothing in return. Socializing the cost, privatizing the profits.
Uh, that’s objectively false.
OpenAI also provides ChatGPT as a “free” service, and Google has made billions off of that “free” service they oh so altruistically provide you.
Google points to your content so others can find it.
OpenAI scrapes your content to use to make more content.
That’s not a meaningful distinction. I spent all day using a Copilot search engine because the answers I wanted were scattered across a bunch of different documentation sites.
It was using the AI models to interpret my commands (not generation at all), and it only published content to me specifically.
I’m talking about the training phase of LLMs. That is the portion that is doing the scraping and generation of copyrighted data.
You using an already trained LLM to do some searches is not the same thing.