Re: Python "robots.txt" parser broken since 2003

2007-04-22 Thread John Nagle
Steven Bethard wrote:
> John Nagle wrote:
>
>> Terry Reedy wrote:
>>
>>> "John Nagle" <[EMAIL PROTECTED]> wrote in message
>>> news:[EMAIL PROTECTED]
>>> | This was reported in 2003, and a patch was uploaded in 2005, but the patch
>>> | never made it into Python 2.4 or 2.5.
>>>
>>> If the pa

Re: Python "robots.txt" parser broken since 2003

2007-04-22 Thread Steven Bethard
John Nagle wrote:
> Terry Reedy wrote:
>> "John Nagle" <[EMAIL PROTECTED]> wrote in message
>> news:[EMAIL PROTECTED]
>> | This was reported in 2003, and a patch was uploaded in 2005, but the patch
>> | never made it into Python 2.4 or 2.5.
>>
>> If the patch is still open, perhaps you could r

Re: Python "robots.txt" parser broken since 2003

2007-04-22 Thread Nikita the Spider
In article <[EMAIL PROTECTED]>, John Nagle <[EMAIL PROTECTED]> wrote:
> This bug, "[ 813986 ] robotparser interactively prompts for username and
> password", has been open since 2003. It killed a big batch job of ours
> last night.
>
> Module "robotparser" naively uses "urlopen" to read "robot

Re: Python "robots.txt" parser broken since 2003

2007-04-22 Thread John Nagle
Terry Reedy wrote:
> "John Nagle" <[EMAIL PROTECTED]> wrote in message
> news:[EMAIL PROTECTED]
> | This was reported in 2003, and a patch was uploaded in 2005, but the patch
> | never made it into Python 2.4 or 2.5.
>
> If the patch is still open, perhaps you could review it.
>
I tried

Re: Python "robots.txt" parser broken since 2003

2007-04-21 Thread Terry Reedy
"John Nagle" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
| This was reported in 2003, and a patch was uploaded in 2005, but the patch
| never made it into Python 2.4 or 2.5.

If the patch is still open, perhaps you could review it.

tjr

Python "robots.txt" parser broken since 2003

2007-04-21 Thread John Nagle
This bug, "[ 813986 ] robotparser interactively prompts for username and password", has been open since 2003. It killed a big batch job of ours last night.

Module "robotparser" naively uses "urlopen" to read "robots.txt" URLs. If the server asks for basic authentication on that file, "robotparse
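
For readers following the thread, here is a minimal sketch of one way a batch job might read "robots.txt" so that an authentication challenge becomes an ordinary HTTP error rather than something a person has to answer. It is not the patch discussed in this thread. It uses the Python 3 module names (urllib.request, urllib.robotparser) rather than the Python 2.4/2.5 modules mentioned above. The helper name load_robots, its opener parameter, and the status-code policy (401/403 means disallow everything, other 4xx means allow everything) are assumptions made for illustration.

```python
import urllib.error
import urllib.request
import urllib.robotparser


def load_robots(url, opener=urllib.request.urlopen, timeout=10):
    """Hypothetical helper: fetch robots.txt and never wait for user input.

    `opener` is any callable with the same shape as urllib.request.urlopen;
    it is a parameter only so the function can be exercised without a network.
    """
    parser = urllib.robotparser.RobotFileParser(url)
    try:
        with opener(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        if err.code in (401, 403):
            # Assumed policy: a protected robots.txt means "keep out".
            parser.disallow_all = True
        elif 400 <= err.code < 500:
            # Assumed policy: no robots.txt means no restrictions.
            parser.allow_all = True
        # Any other status leaves the parser unread, so can_fetch() is False.
        return parser
    # Feed the downloaded text to the standard parser, one line at a time.
    parser.parse(body.splitlines())
    return parser
```

A caller would then ask load_robots("http://example.invalid/robots.txt").can_fetch("*", page_url) before crawling page_url, where example.invalid is a placeholder host. Because the fetch either returns a body or raises, a server that demands basic authentication cannot stall an unattended run.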