Everyone:
I found a workaround to this non-intuitive little quirk in how Apache evaluates
Deny directives with (partial) hostnames: I'm using SetEnvIf expressions to set
or clear an environment variable, which I then reference in a single Deny directive:
# assume spammers doing recon will only perform GETs
SetEnvIfNoCase Request_Method "GET" spammer_recon
# assume spammers always use both empty Referer and U-A
SetEnvIfNoCase Referer ".+" !spammer_recon
SetEnvIfNoCase User-Agent ".+" !spammer_recon
# if the host/IP is any of these, they're definitely spammers regardless!
SetEnvIfNoCase Remote_Host "\.barak\-online\.net" spammer_recon
SetEnvIfNoCase Remote_Host "\.barak\.net\.il" spammer_recon
SetEnvIfNoCase Remote_Host "\.cable\.casema\.nl" spammer_recon
SetEnvIfNoCase Remote_Host "\.client\.bresnan\.net" spammer_recon
SetEnvIfNoCase Remote_Host "\.ctinets\.com" spammer_recon
SetEnvIfNoCase Remote_Host "\.dip\.t\-dialin\.net" spammer_recon
SetEnvIfNoCase Remote_Host "\.dsl\.ip\.tiscali\.nl" spammer_recon
SetEnvIfNoCase Remote_Host "\.easyspeedy\.com" spammer_recon
SetEnvIfNoCase Remote_Host "\.goo\.ne\.jp" spammer_recon
SetEnvIfNoCase Remote_Host "\.hostingprod\.com" spammer_recon
SetEnvIfNoCase Remote_Host "\.internetserviceteam\.com" spammer_recon
SetEnvIfNoCase Remote_Host "\.keymachine\.de" spammer_recon
SetEnvIfNoCase Remote_Host "\.knology\.net" spammer_recon
SetEnvIfNoCase Remote_Host "\.lorerweb\.net" spammer_recon
SetEnvIfNoCase Remote_Host "\.onlinehome\-server\.info" spammer_recon
SetEnvIfNoCase Remote_Host "\.pppoe\.mtu-net\.ru" spammer_recon
SetEnvIfNoCase Remote_Host "\.qwerty\.ru" spammer_recon
SetEnvIfNoCase Remote_Host "\.sputnikmedia\.net" spammer_recon
SetEnvIfNoCase Remote_Host "\.static\.theplanet\.com" spammer_recon
SetEnvIfNoCase Remote_Host "\.starnet\.md" spammer_recon
SetEnvIfNoCase Remote_Host "\.starnet\.ru" spammer_recon
SetEnvIfNoCase Remote_Host "\.svservers\.com" spammer_recon
SetEnvIfNoCase Remote_Host "\.dip\.t\-dialin\.net" spammer_recon
SetEnvIfNoCase Remote_Host "\.dip0\.t\-ipconnect\.de" spammer_recon
SetEnvIfNoCase Remote_Host "\-xbox\.dedi\.inhoster\.com" spammer_recon
SetEnvIfNoCase Remote_Addr "^\210\.240\." spammer_recon
<Directory "C:/Documents and Settings/macraig/Desktop/local Web/blog">
Options None
AllowOverride None
#Order allow,deny
#Allow from all
Order Deny,Allow
Deny from env=spammer_recon
</Directory>
The conditionals don't suffer from the same quirk as Deny directives, so my
partial hostnames *always* work as intended, regardless of the result of the
rDNS lookup. So far this is working like a charm. Since I also found that
spammers doing recon almost always use GETs and usually send neither a
User-Agent nor a Referer header, I was able to add a generic catch-all at the
front, with the confirmed evil-doers bringing up the rear. I don't know what
effect this has on performance, but I don't expect the list to grow enormously.
If anyone sees a way to improve the parsing efficiency of my regular
expressions (I suspect I suck at constructing them), I'd love the help.
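One idea I've been toying with (untested, just a sketch, and the grouping below
is only illustrative; it also anchors at the end of the hostname, which I think
is what I want anyway) is collapsing related host patterns into a single
alternation so there are fewer expressions to evaluate:

# sketch only: fold several of the host patterns above into one expression
SetEnvIfNoCase Remote_Host "\.(starnet\.(md|ru)|qwerty\.ru|svservers\.com)$" spammer_recon

I have no idea yet whether one longer pattern is actually cheaper than several
short ones, so take it with a grain of salt.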
Mark
-------- Original Message --------
Subject: Re: [EMAIL PROTECTED] <directory> and deny directives
From: Mark A. Craig <[EMAIL PROTECTED]>
To: users@httpd.apache.org
Date: Friday, September 14, 2007 08:57:49 AM
Joshua:
I see what you mean about the rDNS, though perversely it was the
svservers.com case that drove me to use partial hostnames in the first
place: they lease multiple IP blocks from multiple sources, and I've been
getting spam recon from all of them, so I thought I could kill all the
birds with just the one hostname stone. It seemed intuitive at the
time.... As of last night I have another instance just like it: a
different partial hostname from my list that passes through, apparently
because the actual IP address doesn't fall in the IP range returned when
an rDNS *on just the primary domain* is performed. I think you're right.
One way to test it: substitute partial IP addresses, to represent each
of the leased IP blocks. It's more work, but I'll see what happens. It
would sure be nice if the code didn't pull a non-intuitive stunt like
this, though! If the DNS lookup resolves to the specified *partial*
hostname, it should act on it, not second-guess it with an rDNS like this.
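Roughly, the substitution I have in mind would look like this (the prefixes
below are purely illustrative placeholders; the real blocks would have to come
from whois lookups on svservers.com's actual allocations):

# sketch: match each leased IP block by partial address instead of by hostname
SetEnvIfNoCase Remote_Addr "^210\.240\." spammer_recon
#SetEnvIfNoCase Remote_Addr "^203\.0\.113\." spammer_recon

Since Remote_Addr never involves a DNS lookup, that should sidestep the rDNS
issue entirely.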
Mark
-------- Original Message --------
Subject: Re: [EMAIL PROTECTED] <directory> and deny directives
From: Joshua Slive <[EMAIL PROTECTED]>
To: users@httpd.apache.org
Date: Friday, September 14, 2007 06:06:13 AM
On 9/14/07, Mark A. Craig <[EMAIL PROTECTED]> wrote:
Joshua:
Thanks for the quick and comprehensive reply. Lemme address
everything in order:
1. Whatcha mean by "the config is inherited"? Did you mean to address my
question about sub-directories? I suspect so, but if not please clarify.

2. The status codes are in fact mostly 403s, but not ALL... some that match my
deny directives, notably ".svservers.com", are still being allowed with 200s.
The 403s that are occurring could also be the result of the http:BL module in
the blog software itself, which checks the IPs of attempted commenters against
the Project Honeypot DNS blacklist and bounces them with a 403 if the IP is a
match (there are a lot of 403s for hostnames not in my little Deny list). At
least that's the only explanation I can imagine for the inconsistency.

My goal here is to nail the spammy GETs; at first I'd considered a <Limit GET>
directive, but I couldn't figure out where/how to apply it and so resorted to
this current technique.
Don't use <Limit GET>. See the docs on <Limit> for why that would be a
mistake.
Your config looks basically correct. But of course, other things in
your config file could be overriding it. If you replace all those Deny
directives with a "Deny from all", do you block all access? If not,
then either you aren't editing the correct place in the config file,
or you are overriding this config someplace else (such as in a
<Location> section).
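For example (just a sketch, assuming the same <Directory> path as in your
config), temporarily swapping in a blanket deny should turn every request into
a 403 if this part of the config is actually being applied:

<Directory "C:/Documents and Settings/macraig/Desktop/local Web/blog">
Order Deny,Allow
Deny from all
</Directory>

If requests still come back 200 with that in place, the problem isn't the
hostname matching at all.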
Another likely issue is your use of hostnames. The hostnames that are
getting the 200 response above have messed-up reverse lookups. (The
domain you get when looking up the IP address does not map back to
that IP address.) Although I haven't checked the code, it is possible
that Apache is ignoring those entries because it can't confirm whether or
not the client is really in that domain.
In general, it is better to use IP addresses for blocking instead of
domains.
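It's only a sketch, and the address below is purely illustrative, but the
partial-IP and CIDR forms that Deny accepts would look like this inside the
same <Directory> section:

Order Deny,Allow
# partial-IP form: matches any client address beginning with 210.240.
Deny from 210.240
# the equivalent CIDR form also works:
#Deny from 210.240.0.0/16

Neither form depends on a reverse lookup, so the double-DNS confirmation issue
goes away.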
Joshua.