• masterspace@lemmy.ca
    English
    arrow-up
    8
    arrow-down
    34
    ·
    4 months ago

    K, so Google should be shut down too?

    They can’t operate without scraping copyrighted data.

    • MoogleMaestro@lemmy.zip
      English
      arrow-up
      24
      arrow-down
      1
      ·
      edit-2
      4 months ago

      This is a false equivalency.

      Google used to act as a directory for the internet along with other web search services. In court, they argued that the content they scraped wasn’t easily accessible through the searches alone and had statistical proof that the search engine was helping bring people to more websites, not preventing them from going. At the time, they were right. This was the “good” era of Google, a different time period and company entirely.

      Since then, Google has parsed even more data, made that data easily available directly in the Google search results pages (avoiding link click-throughs), increased the number of services they provide to the degree that they have a conflict of interest on the data they collect and a vested interest in keeping people “on Google” and off the other parts of the web, and, with their Gemini project, participated in the same bullshit policies that OpenAI started. Whatever win they had in the 2000s against book publishers, it could be argued that the rights they were “afforded” back in those days were contingent on them being good-faith participants and not competitors. OpenAI and “summary” models that fail to reference sources with direct links, make hugely inaccurate statements, and generate “infinite content” by mashing together letters in the world’s most complicated Markov chain fit in this category.

      It turns out, if you’re afforded the rights to something on a technicality, it’s actually pretty dumb to become brazen and assume that you can push these rights to the breaking point.

    • Admiral Patrick@dubvee.org
      English
      arrow-up
      21
      arrow-down
      4
      ·
      4 months ago

      Google (and search engines in general) is at least providing a service by indexing and making discoverable the websites they crawl. OpenAI is just hoovering up the data and providing nothing in return. Socializing the cost, privatizing the profits.

      • masterspace@lemmy.ca
        English
        arrow-up
        7
        arrow-down
        20
        ·
        edit-2
        4 months ago

        Uh, that’s objectively false.

        OpenAI also provides ChatGPT as a “free” service, and Google has made billions off of that “free” service they oh so altruistically provide you.

        • teft@lemmy.world
          English
          arrow-up
          25
          arrow-down
          1
          ·
          4 months ago

          Google points to your content so others can find it.

          OpenAI scrapes your content to use to make more content.

          • masterspace@lemmy.ca
            English
            arrow-up
            4
            arrow-down
            25
            ·
            4 months ago

            That’s not a meaningful distinction. I spent all day using a Copilot search engine because the answers I wanted were scattered across a bunch of different documentation sites.

            It was both using the AI models to interpret my commands (not generation at all) and then only publishing content to me specifically.

            • teft@lemmy.world
              English
              arrow-up
              14
              ·
              4 months ago

              I’m talking about the training phase of LLMs. That is the portion that is doing the scraping and generation of copyrighted data.

              You using an already trained LLM to do some searches is not the same thing.

            • BakerBagel@midwest.social
              English
              arrow-up
              11
              arrow-down
              1
              ·
              4 months ago

              It’s absolutely a meaningful distinction. Search engines push people to your website, where you can capitalize on your audience however you see fit. LLMs take your content, throw it through the mixer, and sell it back to people. It’s the difference between a movie reviewer explaining a movie and a dude in an alley selling a pirated copy of the movie.

              • masterspace@lemmy.ca
                English
                arrow-up
                1
                arrow-down
                2
                ·
                edit-2
                4 months ago

                A) An LLM does not inherently sell you anything. Some companies charge you to run and use their LLMs (OpenAI), and some companies publish their LLMs open source for anyone to use (Meta, Microsoft). With neural chips starting to pop up in PCs and phones, pretty soon anyone will be able to run an open source LLM locally on their machine, completely for free.

                B) LLMs still rarely regurgitate the exact same original source. This would be more like someone in the back alley putting on their own performance of the movie and morphing it and adjusting it in real time based on your prompts and comments, which is a lot closer to parody and fair use than blatant piracy.