> On 5 May 2017, at 12:38, Phil Mayers <p.may...@imperial.ac.uk> wrote:
> 
>> As a third note, your code does not handle the possibility that
>> original._transport may not implement IPushProducer. While *in
> 
> IIUC IResponse.deliverBody *does* guarantee (currently) to provide a 
> transport implementing IPushProducer to the supplied IProtocol, at least as 
> far as the docstring says?
> 
> Obviously that doesn't make accessing the private _transport safe - an 
> implementation could wrap the transport to implement the IPushProducer 
> semantics.

Yeah, it does. Sadly, there’s some complexity hidden there.

In this case, the “transport” object being provided to the IProtocol is a proxy 
to the TCP transport. However, that object does not write directly to the 
provided IProtocol, it writes to the t.w._newclient.HTTPClientParser and that 
chucks all kinds of data through the system. One of these layers is buffering, 
and that layer is *not* being told about calls to the IPushProducer methods. 
Thus, if you call pauseProducing from the IProtocol you provide, you may still 
get calls to your dataReceived method even though the producer is paused.

Mostly this seems to happen when the first body bytes are delivered, which 
seems to call makeConnection and then immediately dispatch any body bytes that 
have been received, even if the makeConnection method called pauseProducing on 
its transport. Not so helpful!

This also makes it impossible to avoid buffering. There seems to be no way to 
prevent the Response object from delivering the first bytes of the body before 
startProducing is called by treq for the upload. That’s pretty frustrating. 
Worse, the Response will *also* fire the “body complete” callback. All of this 
can and does happen before treq.put ever gets a chance to call startProducing, 
which means that you just have to buffer all of these events. This is really 
quite painful.

This is a design pattern that is in a few different places in twisted.web: 
situations where the “transport” provided to a higher-level abstraction is a 
proxy to the TCP connection, but where intermediate layers are actually 
responsible for calling dataReceived on those higher level abstractions. This 
causes a real problem if there is any chance that intermediary layers will do 
any buffering at all, because those layers will not see calls to pauseProducing 
or anything similar, and so will not respect those calls. The Twisted HTTP/2 
server tries very hard to avoid this by ensuring that its classes actually 
masquerade as transports, rather than delegating to the lower-level TCP 
transport (though of course, they have to do this because HTTP/2 is framed!).

Fixing this is potentially a bit non-trivial. It involves changing 
t.w._newclient.Response to be able to act as a transport proxy (probably via 
some kind of mixin for the sake of code composition) and then to respect the 
producing state of the protocol receiving the body inside the _bodyDataReceived 
method, instead of just YOLOing data into that protocol assuming that it hasn’t 
immediately paused the production.

I consider this a bug in Twisted: I’ll open a ticket.

Cory

_______________________________________________
Twisted-Python mailing list
Twisted-Python@twistedmatrix.com
http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python

Reply via email to