[issue12319] [http.client] HTTPConnection.request not support "chunked" Transfer-Encoding to send data

Rolf Krahl Mon, 08 Aug 2016 05:19:47 -0700

Rolf Krahl added the comment:

Martin,


> I wonder if it would be safer to test for TextIOBase. With the
> read(0) hack, the zero size may be unexpected. A file object may
> need special handling to generate an empty result, or to not
> interpret it as an EOF indicator. See e.g. Issue 23804, about
> zero-sized SSL reads, where both of these problems happened. As long
> as we document that the text file logic has changed, I think it
> would be okay. IMO it is better to keep things simple and robust for
> byte files than add more hacks for text files.

The problem is that there is no accepted definition of what a file-like object 
is.  As a consequence, each test method will fail in some cases.  Not all 
file-like objects are derived from the base classes in the io module.  It is 
common praxis to simply define an object with a read() method and to consider 
this a file-like.  This pattern can even be found in the official Python test 
suite.  Testing for TextIOBase will fail in these cases.

There seem to be at least a minimal consensus that each file-like (that 
supports reading) must have the read() method implementing the interface as 
defined by the io base classes.  The read(0) test (I wouldn't call it a hack, 
as it only uses this documented API) has the advantage of making no other 
assumptions then this.  And I even would not accept Issue 23804 as a counter 
example as this was clearly a bug in the ssl module.

The current test using the mode attribute is certainly the worst one as it even 
fails for standard classes from the io module.

Anyway.  Even though I would prefer the read(0) test, I am happy to accept any 
other method.  Please tell me what I should use.


> However I still don’t see why the checks are worthwhile. Sure having
> both is a technical violation of the current HTTP protocols. But if
> a caller is likely to end up in one of these situations, it may
> indicate a problem with the API that we should fix instead. If the
> checks are unlikely to ever fail, then drop them.

The question is whether we should apply sanity checks on caller's input or 
whether we should rather send bullshit out and wait for the request to fail.  
I'd rather give the caller a somewhat helpful error message rather then a 
lesser helpful "400 Bad Request" from the server.  But if you really insist, I 
can also remove these checks.

I don't understand your remark on the API and what the likeliness of rubbish 
header values set by the caller has to do with it, sorry.

> Otherwise, I don’t think any of the bugs I pointed out have been
> addressed.

Could you elaborate on what bugs you are talking about?  I really don't know, 
sorry.

> I don’t like converting bytes-like to bytes. I understand one of the
> main purposes of accepting bytes-like objects is to avoid copying
> them, but converting to bytes defeats that. Also, the copying is
> inconsistent with the lack of copying in _read_iterable().

I added this conversion because you mentioned Issue 27340 and the problem to 
send byte-like objects with the ssl module.  I agree to leave this to Issue 
27340 to find a better solution and will remove the conversion.

> I am wondering if it would be best to keep new logic out of send();
> just put it into _send_output() instead. Once upon a time, send()
> was a simple wrapper around socket.sendall() without much bloat. But
> with the current patch, in order to send a single buffer, we have to
> jump through a bunch of checks, create a dummy lambda and singleton
> tuple, loop over the singleton, call the lambda, realize there is no
> chunked encoding, etc. IMO the existing support for iterables should
> have been put in _send_output() too (when that was an option). There
> is no need to use send() to send the whole body when endheaders()
> already supports that.

No problem, I can move all this to _send_output().  The only problem that I see 
then, is that the current library version of send() already accepts file-like 
(both text and byte streams), bytes, and iterables.  So we will end up with two 
different methods accepting this variety of data types and behaving differently 
in the respective cases.

Maybe, as a consequence, one should restrict send() to accept only bytes in 
order to tighten consistency.  But this is not in the scope of this Issue and 
will not be covered by my patch.


Please note that I will be in holidays starting with next Monday.  We should 
try to get this done during this week.

----------

_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue12319>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

[issue12319] [http.client] HTTPConnection.request not support "chunked" Transfer-Encoding to send data

Reply via email to