> Are there any lists of common robots on the net? Are there
> some regular expressions or searches that would help? Are
> there known IP addresses that are safe to discard?
I believe your question is off topic for this forum; however, I'll share our
joy with you.
Some are known by hostname:
htt
It's a shared Apache 2 server that's set up to put daily log files in my home
directory. I can't muck with config files. What I'm trying to do is to remove
the entries due to spiders, robots and other requests that don't matter to me.
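
A minimal sketch of this kind of filtering, in Python rather than Perl, and not the script described in this post. It assumes the logs are in Apache's common or combined log format, and it treats any client address that requested /robots.txt as a robot, which is a heuristic: badly behaved spiders never fetch that file, and an occasional human might. The names `LOG_LINE`, `robot_ips` and `drop_robot_entries` are made up for illustration.

```python
import re

# Assumes Apache common/combined log format: the client address is the
# first field and the request line is the first double-quoted field.
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(?:\S+) (\S+)')


def robot_ips(lines):
    """Return the set of client addresses that requested /robots.txt."""
    ips = set()
    for line in lines:
        m = LOG_LINE.match(line)
        # Ignore any query string when comparing the requested path.
        if m and m.group(2).split("?", 1)[0] == "/robots.txt":
            ips.add(m.group(1))
    return ips


def drop_robot_entries(lines):
    """Return the log lines whose client address never fetched /robots.txt."""
    lines = list(lines)          # two passes are needed, so keep a copy
    robots = robot_ips(lines)
    kept = []
    for line in lines:
        m = LOG_LINE.match(line)
        if m and m.group(1) in robots:
            continue             # every request from a robot address is dropped
        kept.append(line)        # lines that do not parse are kept untouched
    return kept
```

To try it on one daily file, read the lines with `open(path)`, pass them to `drop_robot_entries`, and write the result to a new file. Going through the file twice is deliberate: a spider may fetch pages before it gets around to asking for /robots.txt.
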
My Perl script now looks for IP addresses used for /robots.t