I just discovered that the "robotparser" module interprets
a 403 ("Forbidden") status on a "robots.txt" file as meaning
"all access disallowed". That's unexpected behavior.

  A major site (http://www.aplus.net/robots.txt) has its
"robots.txt" file set up that way.

There's no real "robots.txt" standard, unfortunately, so this
isn't definitively a bug. (The 1996 robots.txt Internet-Draft
does say that a 401 or 403 on "robots.txt" means access to the
site is completely restricted, which is presumably where this
behavior comes from.)

                                John Nagle
                                SiteTruth