On Sat, Oct 22, 2011 at 12:52:05PM +0200, MegaBrutal wrote:
> What I tried:
>
> # Include the virtual host configurations:
> Include sites-enabled/
>
> # Deny all unknown virtual host names
> <VirtualHost *:80>
> ServerName *
> DocumentRoot /var/www
> <Location />
> Order allow,deny
> # Allow from googlebot.com
> Allow from 127.0.0.1
> </Location>
> SetEnvIf Remote_Addr "127\.0\.0\.1" localhostlog
> CustomLog "/var/log/apache2/access.log" combined env=localhostlog
> CustomLog "/var/log/apache2/reject.log" vhost_combined env=!localhostlog
> ErrorLog "/var/log/apache2/reject_error.log"
> </VirtualHost>
>
[snip]
> > If no matching virtual host is found, then *the first listed virtual
> > host* that matches the IP address will be used.
> >
> > (Apache website on Name-based Virtual Host Support
> > <http://httpd.apache.org/docs/2.2/vhosts/name-based.html>)
>
> OK, make it the first virtual host config! Naive! If I put my all-rejecting
> virtual host before the include for my specific virtual hosts, then all
> request will be served by the rejecting virtual host - even request for my
> legit virtual host names. But it is also the expected behaviour:
>
> > Now when a request arrives, the server will first check if it is using an IP
> > address that matches the NameVirtualHost
> > <http://httpd.apache.org/docs/2.2/mod/core.html#namevirtualhost>.
> > If it is, then it will look at each <VirtualHost>
> > <http://httpd.apache.org/docs/2.2/mod/core.html#virtualhost> section
> > with a matching IP address and try to find one where the
> > ServerName <http://httpd.apache.org/docs/2.2/mod/core.html#servername> or
> > ServerAlias matches the requested hostname. If it finds one, then it uses
> > the configuration for that server.
>
> Apache tries to find a suitable virtual host config by looking from up to
> down. Of course, "*" matches everything, so the all-rejecting virtual host
> config will catch all requests, the other virtual hosts won't be checked
> ever.

All of this is correct as stated. So, if I've understood your problem
correctly, the solution is to put your bot-catcher first in the config,
but replace the wildcard ServerName with an ordinary name that none of
your real hosts use, e.g.:

    # Deny all unknown virtual host names
    <VirtualHost *:80>
        ServerName my-fantastic-bot-catcher
        ... Directives to handle bots here ...
    </VirtualHost>

    # Include the virtual host configurations:
    Include sites-enabled/

So, if a request comes in that does match one of your real ServerNames,
it is served by that virtual host and the bot-catcher is bypassed. But,
if a request comes in that doesn't match any real ServerName, Apache
falls back to the first-listed virtual host, so the bot-catcher serves it.
This is the approach which I generally use, although I tend not to use
an asterisk in the VirtualHost directive - so that part is untested.
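
For illustration only, here is one way the placeholder above might be
filled in, reusing the access and logging directives from your own
attempt. It is a sketch, not something I have run: it assumes Apache 2.2
syntax, that NameVirtualHost *:80 is already declared elsewhere (e.g. in
ports.conf), and that the vhost_combined log format is defined in your
main config. The ServerName is a made-up label that no real request
should carry.

```apache
# Deny all unknown virtual host names.
# This block must be read before the Include below, so that it is the
# first-listed virtual host for *:80 and therefore the fallback.
<VirtualHost *:80>
    # Hypothetical placeholder name; no wildcard here.
    ServerName my-fantastic-bot-catcher
    DocumentRoot /var/www

    <Location />
        # Apache 2.2 access control: deny by default, allow localhost only.
        Order allow,deny
        Allow from 127.0.0.1
    </Location>

    # Log rejected traffic separately (paths taken from the original post).
    CustomLog "/var/log/apache2/reject.log" vhost_combined
    ErrorLog "/var/log/apache2/reject_error.log"
</VirtualHost>

# Include the virtual host configurations:
Include sites-enabled/
```

If you try something like this, "apache2ctl -S" should list the
bot-catcher as the default server for *:80, which is a quick way to
confirm the ordering before sending any test requests.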
HTH,
Pete
--
Openstrike - improving business through open source
http://www.openstrike.co.uk/ or call 01722 770036 / 07092 020107
