Bob - as you probably know, there are some existing fail2ban filters for
this -- {apache,nginx}-botsearch.conf are the most apropos I see at
first glance. fail2ban is the only scalable/maintainable way I can
imagine to deal with it.
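Enabling the Apache one is just a couple of lines in jail.local -- a
minimal sketch, assuming a stock fail2ban install where that jail and
filter already ship (the maxretry value is only for illustration):

  # /etc/fail2ban/jail.local
  [apache-botsearch]
  enabled  = true
  maxretry = 2

and s/apache/nginx/ for the nginx variant.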

A nonscalable/nonmaintainable way ... for tug.org, years ago I created a
robots.txt based on spammer user-agent strings I found at
projecthoneypot.org
(https://www.projecthoneypot.org/harvester_useragents.php nowadays, it
seems). It's still somewhat beneficial, though naturally it was surely
out of date the instant I put it up, let alone now. I also threw in
iptables rules by hand when the server was getting bogged down. I hope
one day I'll set up fail2ban (including recidive) for it; rough
sketches of all three are below ...
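
These are only illustrative -- the user-agent string and the IP
address are placeholders, not the real entries on tug.org:

  # robots.txt -- only honored by bots that bother to read it
  # (ExampleHarvesterBot is a placeholder, not an actual string from
  #  projecthoneypot)
  User-agent: ExampleHarvesterBot
  Disallow: /

  User-agent: *
  Disallow:

  # dropping a single misbehaving address by hand (example IP)
  iptables -I INPUT -s 192.0.2.10 -j DROP

  # and recidive is one more stanza in jail.local, re-banning
  # addresses that keep getting banned by the other jails
  [recidive]
  enabled = true

-k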

