Toni Mueller wrote:
Pro: Every bot can access the URL exactly one time; afterwards it's
blacklisted.
Use expiretable to flush the pf table occasionally, and of course make
sure that you don't block yourself - whitelist IP addresses like your
default gateway, otherwise you may DoS yourself ;)
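For reference, a minimal pf.conf sketch of that setup (the table names,
gateway address, and expiry interval are made-up examples; the trap
script that adds bots to the table is not shown):

  table <bots> persist
  table <whitelist> persist { 192.0.2.1 }   # e.g. your default gateway

  # never touch whitelisted hosts; drop everything from known bots
  pass quick from <whitelist>
  block quick from <bots>

The table can also be expired from cron with stock pfctl instead of a
separate tool, e.g. dropping entries older than a day:

  */10 * * * * pfctl -t bots -T expire 86400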
I'm researching the same problem and so far have arrived at the
following conclusions (feedback & improvement desired!):
* Blacklisting individual IPs is a sharp-edged knife, and cumbersome
to handle.
Not really, when done automatically. I use incremental block times per
offense: the first time, you are blocked for a period of time, then
removed from the list later on. Do it again and you are blocked for
longer, then cleared again, and so on. This works very well for me, and
I can share the same SQL data between all servers.
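A rough sh sketch of that escalation (the table name, the base ban time,
and the way the offender IP and offense count arrive from the shared SQL
data are all hypothetical):

  #!/bin/sh
  # $1 = offending IP, $2 = offense count, both looked up in the
  # shared SQL store (schema not shown here)
  ip=$1 count=$2

  # double the ban per offense: 10, 20, 40, ... minutes
  minutes=$((10 << (count - 1)))

  pfctl -t offenders -T add "$ip"
  echo "pfctl -t offenders -T delete $ip" | at now + $minutes minutes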
* Some request storms appear to be triggered by an unlucky interaction
between the server sending PDF files and the client using Internet
Exploder (which often breaks; see the discussion around
range-requests).
* Use a non-forking server.
???
* Rate limiting, or at least rate limiting per network (e.g. per /16),
would "solve" the problem for me, and is maintenance-free.
* Use it with connection rate limiting in pf...
PF can handle rate limiting pretty well; just increase your table size
if you reach the limit, and use aggressive optimization. Start in
pf.conf with:
set optimization aggressive
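To flesh both points out, a pf.conf sketch of the table-size tuning plus
per-source connection rate limiting ($ext_if, the numbers, and the table
name are arbitrary examples; note that stock pf tracks sources per IP,
not per /16):

  set limit table-entries 400000   # raise when tables fill up

  table <overloaders> persist
  block quick from <overloaders>

  # kick sources that open too many connections into the table:
  # max 60 states and 15 new connections per 5 seconds, per source IP
  pass in on $ext_if proto tcp to port www keep state \
      (max-src-conn 60, max-src-conn-rate 15/5, \
       overload <overloaders> flush global)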
Any comments on this are welcome!
One obvious downside is that one apparently cannot make this work (e.g.
specifically denying range-requests from IE users) with the stock
Apache.
You can deny requests based on IE version, if need be, with stock
Apache. All the time limiting and redirecting I described earlier
affects only the IE versions; anything NOT IE passes without delay or
redirect.
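A sketch of how that denial could look with stock Apache (the MSIE
version pattern and the .pdf scope are guesses at what is actually being
matched; BrowserMatch comes from mod_setenvif, which ships with Apache,
and Order/Allow/Deny is the 1.3/2.0/2.2-style access-control syntax):

  # tag the offending browsers
  BrowserMatch "MSIE [1-6]" bad_ie

  # and keep them away from the PDFs that trigger the storms
  <FilesMatch "\.pdf$">
      Order Allow,Deny
      Allow from all
      Deny from env=bad_ie
  </FilesMatch>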
Daniel