On 04.07.25 at 10:19, Michael J Gruber wrote:
I was wondering what other websites do. I mean, Fedora's are certainly
not the only ones being AI-scraped, and I hadn't heard of that being an
issue before. So there have to be practical solutions.

This is a massive issue for all websites which serve some dynamic content and which get at least a bit of traffic.

Basically, these AI scrapers do not care about restrictions such as robots.txt. They try to access every page, and they do so at a ridiculous frequency.

LWN published an article in February about their issues:
https://lwn.net/Articles/1008897/

As far as I know, there is no general "solution" for this. You can:
- serve only static content, possibly with a strong CDN
- use a service such as Cloudflare, which injects JavaScript-based verification screens
- use Anubis

Felix
