Evens Fortuné added the comment:

Hi,

From my limited experience reporting documentation issues, I see that it's 
better to submit a patch than to only report an issue. So, I've tried to look 
into the source code to figure out what was going on. I have attached a patch 
that I'm submitting to you for review, hoping I'm doing everything right.

What was reported as ambiguous in this issue is the description of the return 
value of the function urllib.request.urlopen() for http and https URLs. It was 
mentioned that it should be an http.client.HTTPResponse object, but the wording 
implied that something might be different about this object.

To understand why I may now be able to assert what's being said in that 
patch, follow me through the source code. It's based on revision c499cc2c4a06. 
If you don't care about the whole walkthrough, you can skip to step 9.

   1. We want to describe the return value of urllib.request.urlopen() for 
http and https URLs
   2. The urlopen() function is defined in Lib/urllib/request.py starting on 
line 138. Its return value is the return value of the opener.open() method 
(line 153)
      * This opener object is defined in one of these locations:
         * On line 150 as the return value of the module function 
build_opener() (this return value is cached in the _opener module variable)
         * On line 152 as the value cached in the _opener module variable
         * On line 148, again as the return value of the module function 
build_opener(), in the case where you want to access an HTTPS URL and you 
provide a cafile, capath or cadefault argument
      * So either way, the opener object comes from build_opener(), directly or 
indirectly.
   3. The build_opener() function is defined starting on line 505. Its return 
value (line 539) is an instance of the OpenerDirector class (line 514). The 
OpenerDirector class is defined starting on line 363.
      a. Before returning, and after some checks (lines 522-530, 535-536), 
build_opener() calls OpenerDirector().add_handler() with an instance of some 
of the classes defined in the default_classes list (lines 515-518). What 
matters to us is the HTTPHandler class and the HTTPSHandler class (line 520).
      b. The OpenerDirector().add_handler() method (line 375) takes the 
HTTPHandler class (line 1196) and:
         * Inserts the HTTPHandler.http_open() method in the list stored as 
the value of OpenerDirector().handle_open['http'].
         * Inserts the HTTPHandler.http_request() method in the list stored as 
the value of OpenerDirector().process_request['http'].
      c. For HTTPSHandler (line 1203) it is the same thing, but with:
         * HTTPSHandler.https_open() for OpenerDirector().handle_open['https']
         * HTTPSHandler.https_request() for 
OpenerDirector().process_request['https']
   4. I remind you that we are looking for the return value of the method 
open() of an instance of the OpenerDirector class (see point number 2). This 
method is defined starting on line 437.
   5. The OpenerDirector.open() method's return value is the response variable 
(line 463)
   6. This variable is defined on lines 461 and 455.
      a. The loop on lines 458-461 tries to find in its handlers (the 
OpenerDirector().process_response dictionary) a response processor (an 
XXX.http_response() method), which isn't defined in HTTPHandler or 
HTTPSHandler. (An http_response() method is defined in HTTPErrorProcessor 
[line 564] and in HTTPCookieProcessor [line 1231], but in each of these cases, 
these classes don't modify the response value.)
      b. So the response variable's value is the return value of 
OpenerDirector()._open(req, data) on line 455. 
         * The req argument is a Request instance (line 440) or something that 
has the same interface, I guess (line 442). The Request class is defined on 
line 253.
         * The data argument is included in the constructor of the Request 
instance (line 440 and then on line 262) or added to the object provided (line 
444). Afterwards, it won't be used directly (OpenerDirector()._open() receives 
it as an argument but won't use it in its body)
   7. OpenerDirector()._open() is defined on line 465. It will call 
OpenerDirector()._call_chain() up to three times depending on whether a result 
has been found (lines 468-469, 474-475). 
      * OpenerDirector()._call_chain() is defined on line 426. All it does is 
call the handlers registered in the dictionary provided (the chain argument) 
until one returns something other than None, and then return that value.
      * In our case (retrieving http and https resources):
         a. The first call (line 466) will return None since HTTPHandler and 
HTTPSHandler don't have a default_open() method (in fact, no handler defined in 
this file has a default_open() method)
         b. The second call will work since HTTPHandler.http_open() (line 1198) 
and HTTPSHandler.https_open() (line 1212) exist. Their return values will 
eventually be what we are looking for.
   8. HTTPHandler.http_open() and HTTPSHandler.https_open() return the return 
value of the do_open() method defined (on line 1134) in their common superclass 
AbstractHTTPHandler (line 1086). They call it with 
http.client.HTTPConnection and req in the case of HTTPHandler, and with 
http.client.HTTPSConnection and req in the case of HTTPSHandler, along with a 
few other arguments.
   9. AbstractHTTPHandler.do_open() creates a connection object of the class 
it was given, http.client.HTTPConnection or http.client.HTTPSConnection 
(line 1144), calls its request() method (line 1173) and, if that works, calls 
its getresponse() method (line 1178). This return value is the HTTPResponse 
object we are looking for.
   10. Finally we get our answer: 
      * On line 1186, an url attribute is added to this HTTPResponse object
      * On line 1192, the msg attribute is overwritten with the value of the 
reason attribute
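
The walkthrough above can be checked with a small sketch. It is only an 
illustration under a few assumptions: the local test server, the HelloHandler 
class and the inspect_response() function are hypothetical names made up for 
this example, and handle_open is an undocumented OpenerDirector attribute, so 
treat it as an implementation detail that may change. In the CPython versions 
I have looked at, the lists in handle_open hold handler instances rather than 
bound methods. An empty ProxyHandler is passed so that proxy environment 
variables don't interfere with the local request.

```python
import http.client
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical request handler for a throwaway local server."""

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def inspect_response():
    """Open a local http URL and report what the opener returned."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        url = "http://127.0.0.1:%d/" % server.server_port
        # Step 3: build_opener() returns an OpenerDirector. The empty
        # ProxyHandler disables proxy autodetection for this local request.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        # Undocumented attribute (implementation detail): handlers for 'http'.
        http_handlers = opener.handle_open["http"]
        # Steps 4-9: open() ends up returning what do_open() produced.
        with opener.open(url) as response:
            return {
                "opener_type": type(opener).__name__,
                "has_http_handler": any(
                    isinstance(h, urllib.request.HTTPHandler)
                    for h in http_handlers
                ),
                "is_http_response": isinstance(
                    response, http.client.HTTPResponse
                ),
                # Step 10: attributes set on the response by do_open().
                "url": response.url,
                "expected_url": url,
                "msg_equals_reason": response.msg == response.reason,
                "body": response.read(),
            }
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    for key, value in inspect_response().items():
        print(key, "=", value)
```

Running it prints one line per check. If the walkthrough is right, 
is_http_response and msg_equals_reason are both true and url matches the URL 
that was requested.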

I hope this is what was needed to close this issue. Otherwise, just tell me 
what is missing.

Oh, and there seem to be many things that could be refactored. Can I do it 
and open issues about them?

----------
keywords: +patch
Added file: http://bugs.python.org/file36539/issue21228.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue21228>
_______________________________________