At Wednesday 4/10/2006 21:03, goyatlah wrote:
I'm trying to figure out how to get the exact opened url after a
urlopen in urllib2.
Say you have a link : http://myhost/mypath : what do I get back,
- the file mypath on myhost
- the file index.html on myhost/mypath,
- or maybe something else.
You get whatever the webserver chooses to serve at that URI.
Usually:
- if mypath is a directory (or something the server treats as one), you
get a redirect to mypath/ (otherwise relative references won't resolve)
- for mypath/ you get the default document for that directory, maybe
index.html or index.php or default.html or ...
- for mypath/myname you should get the best choice of documents
according to the Accept, Accept-Language and Accept-Encoding request
headers (but few clients or servers honour them completely).
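As a rough illustration of the last point, here is a minimal sketch of a
request that states its preferences. It uses urllib.request, the Python 3
successor of urllib2; the helper name negotiated_request and its default
values are made up for this example. Accept-Encoding is left out on
purpose, because urllib does not decompress the response body for you.

```python
import urllib.request


def negotiated_request(url, accept="text/html", language="en"):
    """Build a GET request that states the client's preferences.

    Hypothetical helper: the server is free to ignore these headers.
    """
    return urllib.request.Request(
        url,
        headers={
            "Accept": accept,
            "Accept-Language": language,
        },
    )


# Possible usage (myhost is the placeholder host from the question):
# response = urllib.request.urlopen(
#     negotiated_request("http://myhost/mypath/myname",
#                        language="es, en;q=0.5"))
```

Whether you actually get a different document back depends entirely on
how the server is configured.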
And what about the following: http://myhost/index.htm where index.htm
is actually a directory?
Probably you would get a redirect to http://myhost/index.htm/
With geturl() on the object returned by urllib2.urlopen() I can find
out if the name was changed to mypath/ or index.htm/, but it seems that
is the only thing I can find out.
This is the HTTPRedirectHandler doing its work. You could look at the
Content-Location header, but I doubt you could get much more info
about the actual object retrieved - there are proxies, rewrite rules,
virtual hosts...
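
If you want to see what the redirect handler did, something along these
lines might help. It is only a sketch, written against urllib.request
(the Python 3 successor of urllib2); RedirectLogger and open_and_report
are names invented for this example. It records each redirect before
following it, then reports the final URL and the Content-Location
header, if the server sent one.

```python
import urllib.request


class RedirectLogger(urllib.request.HTTPRedirectHandler):
    """Record every redirect the server sends, then follow it as usual."""

    def __init__(self):
        self.hops = []  # list of (status_code, from_url, to_url)

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        self.hops.append((code, req.full_url, newurl))
        return super().redirect_request(req, fp, code, msg, headers, newurl)


def open_and_report(url):
    """Open url and return what can be learned about the final resource."""
    logger = RedirectLogger()
    # Passing an instance of a subclass replaces the default redirect handler.
    opener = urllib.request.build_opener(logger)
    with opener.open(url) as response:
        return {
            "final_url": response.geturl(),
            "hops": logger.hops,
            # None when the server does not send the header.
            "content_location": response.headers.get("Content-Location"),
        }


# Possible usage (myhost is the placeholder host from the question):
# info = open_and_report("http://myhost/mypath")
# print(info["final_url"], info["hops"], info["content_location"])
```

Even so, this only shows what the server chose to tell you. Internal
rewrites and proxies leave no trace in the response.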
Gabriel Genellina
Softlab SRL