Re: robotparser behavior on 403 (Forbidden) robot.txt files

2008-06-02 Thread Martin v. Löwis
> I just discovered that the "robotparser" module interprets
> a 403 ("Forbidden") status on a "robots.txt" file as meaning
> "all access disallowed". That's unexpected behavior.

That's specified in the "norobots RFC":

http://www.robotstxt.org/norobots-rfc.txt

- On server response indicating a

robotparser behavior on 403 (Forbidden) robot.txt files

2008-06-02 Thread John Nagle
I just discovered that the "robotparser" module interprets a 403 ("Forbidden") status on a "robots.txt" file as meaning "all access disallowed". That's unexpected behavior. A major site ("http://www.aplus.net/robot.txt") has their "robots.txt" file set up that way. There's no real "robots
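
The behavior described here can be reproduced without touching the network. The sketch below is an illustration only, with several assumptions: it uses the Python 3 module `urllib.robotparser` (the successor to the `robotparser` module discussed in this thread), it simulates the server response by patching `urllib.request.urlopen`, and the helper name `can_fetch_after_status` and the `example.com` URLs are made up for the example.

```python
import io
import urllib.error
import urllib.robotparser
from unittest import mock


def can_fetch_after_status(status, page="http://example.com/index.html"):
    """Return the parser's verdict for `page` when fetching robots.txt
    fails with the given HTTP status (hypothetical helper)."""
    robots_url = "http://example.com/robots.txt"  # placeholder URL, never contacted
    # Simulated server error; no real request is made.
    error = urllib.error.HTTPError(
        robots_url, status, "simulated response", None, io.BytesIO()
    )
    parser = urllib.robotparser.RobotFileParser(robots_url)
    # read() calls urllib.request.urlopen, which is patched to raise the error.
    with mock.patch("urllib.request.urlopen", side_effect=error):
        parser.read()
    return parser.can_fetch("*", page)


if __name__ == "__main__":
    for status in (401, 403, 404):
        print(status, can_fetch_after_status(status))
```

With current CPython versions, a 401 or 403 on "robots.txt" should make `can_fetch` return False for every URL, while a 404 should make it return True, which matches the "all access disallowed" interpretation reported above.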