Hi!
First of all, sorry for my English; it's not my native language.
When urllib2 requests a URL that answers with a 302 redirect, it follows
the Location header automatically. I need to know that location so that
I can turn the relative URLs I parse out of the HTML into full URLs.
You can see an example here:
http://friki.org/302.php answers with a 302 redirect and
=> Location: "http://friki.cat/test.html"
urllib2 follows the location.
In http://friki.cat/test.html I can read a relative link,
"/noticies/1.html", but it does not mean http://friki.org/noticies/1.html;
it means http://friki.cat/noticies/1.html.
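To make the goal concrete, here is a minimal sketch of the resolution I am after. It assumes the final URL is already known (as far as I can tell, the response object's geturl() reports the URL actually fetched after redirects); resolve_link is a hypothetical helper name, not part of any library, and the try/except import is only there so the snippet also runs on Python 3.

```python
try:
    from urlparse import urljoin        # Python 2, alongside urllib2
except ImportError:
    from urllib.parse import urljoin    # Python 3

def resolve_link(final_url, href):
    """Resolve href against the URL the page was really served from.

    resolve_link is a hypothetical helper name used for illustration.
    """
    return urljoin(final_url, href)

# Assumed value: what r.geturl() would report once the 302 was followed.
final_url = "http://friki.cat/test.html"

print(resolve_link(final_url, "/noticies/1.html"))
# prints http://friki.cat/noticies/1.html
```

Absolute links pass through urljoin unchanged, so the same call could be used for every href without checking for "http" first.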
I don't know the right way to handle this with these libraries.
Is there any way to handle a 302 redirect? Can I read the HTTP headers of
the response to the first request (302 status + Location URL)?
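One possible approach, sketched under assumptions and not tested against the real site: subclass the redirect handler and record each hop before letting the default behaviour continue. RecordingRedirectHandler and open_and_record are hypothetical names chosen for illustration; the redirect_request() hook and its argument list are the ones documented for urllib2.HTTPRedirectHandler. The import fallback only lets the snippet run on Python 3 as well.

```python
try:
    import urllib2 as urlreq            # Python 2
except ImportError:
    import urllib.request as urlreq     # Python 3

class RecordingRedirectHandler(urlreq.HTTPRedirectHandler):
    """Hypothetical handler: remembers every redirect it is asked to follow."""

    def __init__(self):
        # List of (status code, requested URL, Location target) tuples.
        self.hops = []

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # 'headers' holds the headers of the 3xx response itself;
        # 'newurl' is the Location value already made absolute.
        self.hops.append((code, req.get_full_url(), newurl))
        # Defer to the stock behaviour so the redirect is still followed.
        return urlreq.HTTPRedirectHandler.redirect_request(
            self, req, fp, code, msg, headers, newurl)

def open_and_record(url):
    """Open url and return (final URL, list of recorded redirect hops)."""
    recorder = RecordingRedirectHandler()
    opener = urlreq.build_opener(recorder)
    response = opener.open(url)
    return response.geturl(), recorder.hops
```

With the example above, open_and_record("http://friki.org/302.php") would be expected to return the friki.cat URL together with one 302 hop, assuming the server still behaves as described.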
Sorry again for my English. And sorry if I'm too newbie for this list -_-
Note: Be careful: "friki.cat" != "friki.org" ;-)
I just wrote some example code. It gets a 404 error because
"http://friki.org/noticia/1.html" doesn't exist.
"http://friki.cat/noticia/1.html" is the correct link.
302_test.py:
#!/usr/bin/python
import re, cookielib, urllib2
HOST = "http://friki.org"
# Get the homepage (urllib2 silently follows the 302 to friki.cat)
cj = cookielib.MozillaCookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
r = opener.open(HOST + "/302.php")
data = r.read()
# Follow the first link found in the HTML source
p = re.compile(r'href="(.*?)"')
m = p.search(data)
LINK = m.group(1)  # the URL between the quotes
if not LINK.startswith("http"):
    # It just sucks: HOST is still friki.org, so the relative link
    # is resolved against the wrong site
    LINK = HOST + LINK
print "Visit: " + LINK
r = opener.open(LINK)  # raises urllib2.HTTPError (404) here
data = r.read()
#<--- End of code --->
--
"Boring two-person multiplayer may turn friends into enemies."
Antoni Villalonga i Noceras
#Bloc# ~> http://friki.CAT
#Jabber# ~> [EMAIL PROTECTED]