> I just discovered that the "robotparser" module interprets
> a 403 ("Forbidden") status on a "robots.txt" file as meaning
> "all access disallowed". That's unexpected behavior.
That's specified in the "norobots RFC":
http://www.robotstxt.org/norobots-rfc.txt
- On server response indicating a
I just discovered that the "robotparser" module interprets
a 403 ("Forbidden") status on a "robots.txt" file as meaning
"all access disallowed". That's unexpected behavior.
A major site ("http://www.aplus.net/robot.txt") has their
"robots.txt" file set up that way.
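For anyone who wants to reproduce this without hitting a live server, here is a minimal sketch. It assumes Python 3, where the module lives at "urllib.robotparser" (the thread refers to the older "robotparser" name), and it simulates the HTTP status by patching "urllib.request.urlopen" rather than making a real request. The helper name "can_fetch_with_status", the "ExampleBot" user agent, and the "example.invalid" host are all made up for illustration.

```python
import urllib.error
import urllib.request
import urllib.robotparser
from unittest import mock


def can_fetch_with_status(status):
    """Return what can_fetch() reports when robots.txt answers with `status`.

    Hypothetical helper: the HTTP error is simulated, no network is used.
    """
    robots_url = "http://example.invalid/robots.txt"
    page_url = "http://example.invalid/index.html"
    # Build the error that urlopen() would raise for a 4xx response.
    error = urllib.error.HTTPError(robots_url, status, "simulated", None, None)
    with mock.patch("urllib.request.urlopen", side_effect=error):
        parser = urllib.robotparser.RobotFileParser(robots_url)
        parser.read()  # the status code is interpreted here
    return parser.can_fetch("ExampleBot", page_url)


print(can_fetch_with_status(403))  # False: treated as "all access disallowed"
print(can_fetch_with_status(404))  # True: a missing robots.txt allows everything
```

The contrast with 404 is the useful part: a missing file is read as "no restrictions", while a 401 or 403 is read as "keep out".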
There's no real "robots