Bob - as you probably know, there are some existing fail2ban filters for
this -- {apache,nginx}-botsearch.conf are the most apropos I see at
first glance. fail2ban is the only scalable/maintainable way I can
imagine to deal with it.
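Enabling the Apache one is just a couple of lines in jail.local -- a
minimal sketch, assuming a stock fail2ban install where that jail and
filter already ship (the maxretry value is only for illustration):

  # /etc/fail2ban/jail.local
  [apache-botsearch]
  enabled  = true
  maxretry = 2

and s/apache/nginx/ for the nginx variant.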

A nonscalable/nonmaintainable way ... for tug.org, years ago I created a
robots.txt based on spammer user-agent strings I found at
projecthoneypot.org
(https://www.projecthoneypot.org/harvester_useragents.php nowadays, it
seems). It's still somewhat beneficial, though naturally it was surely
out of date the instant I put it up, let alone now. I also threw in
iptables rules by hand when the server was getting bogged down. I hope
one day I'll set up fail2ban (including recidive) for it; rough
sketches of all three are below ...
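
These are only illustrative -- the user-agent string and the IP
address are placeholders, not the real entries on tug.org:

  # robots.txt -- only honored by bots that bother to read it
  # (ExampleHarvesterBot is a placeholder, not an actual string from
  #  projecthoneypot)
  User-agent: ExampleHarvesterBot
  Disallow: /

  User-agent: *
  Disallow:

  # dropping a single misbehaving address by hand (example IP)
  iptables -I INPUT -s 192.0.2.10 -j DROP

  # and recidive is one more stanza in jail.local, re-banning
  # addresses that keep getting banned by the other jails
  [recidive]
  enabled = true

-k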

