New submission from Ambarish Malpani <[EMAIL PROTECTED]>:

Try the following code:
import urllib
import urllib2

url =
'http://features.us.reuters.com//autos/news/95ED98EE-A837-11DC-BCB3-4F218271.html'

data = urllib.urlopen(url).read()
data2 = urllib2.urlopen(url).read()

The attempt to get it with urllib works fine. With urllib2, the request
is malformed and I get back a HTTP 404

Request in the 2nd case is:
GET //autos/news/95ED98EE-A837-11DC-BCB3-4F218271.html HTTP/1.1\r\n
Accept-Encoding: identity\r\n
Host: autos\r\n
Connection: close\r\n
....

The host line seems to be looking for the last // rather than the first.

----------
components: Extension Modules
messages: 66334
nosy: ambarish
severity: normal
status: open
title: urllib2.urlopen() gets confused with path with // in it
type: behavior
versions: Python 2.5

__________________________________
Tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue2776>
__________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to