• LiveLM@lemmy.zip
      9 hours ago

      It’s an attempt to stop poorly behaved AI crawlers, written by someone who had their Git server flooded by Amazon’s crawler. The worst part is that these companies seem so desperate to suck up data to feed their models that they’ll happily disregard “good etiquette” and summarily work around website owners’ attempts to slow them down.

      I don’t want to have to close off my Gitea server to the public, but I will if I have to. It’s futile to block AI crawler bots because they lie, change their user agent, use residential IP addresses as proxies, and more. I just want the requests to stop.

      Source: “Amazon’s AI crawler is making my git server unstable” by Xe Iaso

      Given that the GNOME GitLab is a rather well-known site, I can only imagine they’ve suffered the same fate, probably even worse.

    • unhrpetby@sh.itjust.works
      14 hours ago

      Gnome shenanigans.

      The most hilarious part about how Anubis is implemented is that it triggers challenges for every request with a User-Agent containing “Mozilla”.

      If you have JavaScript disabled, this “challenge” is just a wall. They might’ve stopped bots, but they’ve stopped me too.
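
To illustrate why a rule keyed on “Mozilla” catches essentially every regular browser, here is a minimal sketch of the check as described in the comment above. It is not Anubis's actual code or configuration; the function name `needs_challenge` and the sample User-Agent strings are assumptions made up for the example.

```python
# Hypothetical sketch of the rule described above: challenge any request
# whose User-Agent header contains the substring "Mozilla".
# This is NOT taken from Anubis; names and samples are illustrative only.

def needs_challenge(user_agent: str) -> bool:
    """Return True if the request would be sent to the challenge page."""
    return "Mozilla" in user_agent


# Mainstream browsers send a User-Agent that starts with "Mozilla/5.0"
# for historical compatibility reasons, so they all match the rule.
firefox_ua = "Mozilla/5.0 (X11; Linux x86_64; rv:128.0) Gecko/20100101 Firefox/128.0"
chrome_ua = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36"
)

# Command-line clients typically identify themselves differently,
# so they would pass through without a challenge under this rule.
curl_ua = "curl/8.5.0"

for ua in (firefox_ua, chrome_ua, curl_ua):
    verdict = "challenge" if needs_challenge(ua) else "pass"
    print(f"{verdict:9} {ua}")
```

Under this sketch, a human using a browser with JavaScript disabled is challenged and cannot complete the challenge, while a client that does not claim to be a browser is not challenged at all.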