OpenAI now tries to hide that ChatGPT was trained on copyrighted books, including J.K. Rowling’s Harry Potter series

A new research paper laid out ways in which AI developers should try to avoid showing that LLMs have been trained on copyrighted material.

  • afraid_of_zombies@lemmy.world
    17 upvotes · 1 year ago

    I am sure they have patched it by now, but at one point I was able to get ChatGPT to give me copyrighted text from books by asking for ever larger quotations. It seemed more willing to do this with books that are out of print.

    • stewsters@lemmy.world
      4 upvotes · 1 year ago

      Yeah, it refuses to give you the first sentence from Harry Potter now.

      Which is kinda lame; you can find that on thousands of webpages, many of which the system indexed.

      If someone was looking to pirate the book, there are way easier ways than issuing thousands of queries to ChatGPT. Type “Harry Potter torrent” into Google and you will have them all in 30 seconds.

      • BURN@lemmy.world
        1 upvote · 1 year ago

        ChatGPT has a ton of extra query qualifiers added behind the scenes to ensure that specific outputs can’t happen.