Jelle van der Waa venit, vidit, dixit 2025-07-04 10:04:42:
> Hi,
>
> On 03/07/2025 22:18, Kevin Kofler via devel wrote:
> > Leigh Scott wrote:
> >> Why isn't fedora infra using Anubis to block LLM scrappers?
> >
> > Why should they? Anubis is a scourge that wastes massive energy for all
> > legitimate browsers, breaks search engines, and if configured in a
> > particularly aggressive way as on the GNOME GitLab, even entirely locks out
> > some browsers (though that is an issue with the setup at GNOME
> > specifically).
>
> I just want to point out that this is completely false, Anubis does not
> break search engines they are allowlisted to go through without a
> challenge. Only the useragents with "Mozilla" in them are being "checked".
>
> The "wasted" cycles are only incurred once per week (that's how long the
> cookie is valid). And you didn't account for the massive energy wasted
> by AI scrapers :)

I was wondering what other websites do. I mean, Fedora's sites are certainly
not the only ones being AI-scraped, and I hadn't heard of that being an issue
before. So there have to be practical solutions.

That being said, this thread is interesting background info for all AI
discussions, be it energy waste, intellectual property questions or shady
connections. Would someone have pointers for the last of these, e.g. botnet
usage by scrapers feeding AI models?

Cheers
Michael
-- 
_______________________________________________
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue