Okay, figured this out. Abstract of the issue: I am a dumbass.
First, Itamar gets to thrash me soundly, as there was a bug in some code
(not shown in my example) that was not properly tested. That code was
responsible for "turning off" the protocol instance when its
connectionLost method was called, by doing some cleanup and then
redefining check_for_send on the instance as a no-op, to stop it from
pushing itself back onto the reactor loop.

Since the factory here is a reconnecting one, if the network or the
downstream server glitches, we get a *new* connected protocol instance
and an old, unconnected one. Due to the bug, both were consuming
messages from the factory queue and calling their own transport.write
with the message data. This location has some network issues, so the
connection occasionally drops on the downstream end, which we don't see
elsewhere.

transport.write on a TCP connection looks like it just returns if the
underlying file descriptor is no longer connected. So messages picked up
by the old object get bit-bucketed.

http://twistedmatrix.com/documents/15.0.0/api/twisted.internet.tcp.Connection.html
http://twistedmatrix.com/trac/browser/tags/releases/twisted-15.0.0/twisted/internet/abstract.py#L339

To dos: fix the bug, write a proper unit test, and fix the integration
test so we test dropping connections under load...

Question: I'm assuming there's a good reason transport.write is written
so that it fails silently, rather than raising an error, when its
underlying connection is no longer connected. As part of grokking the
guts of this thing I've been using for a decade... I'm curious to know
why.

On Tue, Apr 28, 2015 at 7:43 AM, Brian Costlow <brian.cost...@gmail.com> wrote:

> On Mon, Apr 27, 2015 at 4:55 PM, Glyph Lefkowitz <gl...@twistedmatrix.com>
> wrote:
>
>> Nothing strikes me as obviously wrong about this code (except the
>> "deferToThread" which seems *slightly* suspicious, since nothing in
>> the example appears to have anything to do with threads, and whenever
>> you get threads involved things get complicated).
>
> The deferToThread just shoves the write of the message string to file
> into the thread pool. It was added after this issue was observed. Using
> deferToThread is a hangover from attaching a logging callback when the
> file wrote, since removed, so callInThread would work also. I didn't
> want to add the file write onto the reactor thread here.
>
> I wish I could find a simpler case, but frankly, I have an integration
> test system that uses a test app to generate upstream messages, and an
> open source java app to simulate the downstream server. I can't
> reproduce this there even with all the moving parts.
>
> Happening "in the wild" at one location, was hoping for some advice on
> troubleshooting what happens between calling transport.write and seeing
> bytes on the wire. I guess it's time to go digging into parts of twisted
> I've always taken for granted, and learn something new. ;-)
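
For anyone who hits the same thing, here is a minimal sketch of the failure mode described at the top of this message. It is not the real code and does not import Twisted: FakeTransport, Sender, run and the buggy flag are all hypothetical names made up for illustration. FakeTransport only mimics the one behaviour discussed here, a write that returns silently once the connection is gone.

```python
from collections import deque


class FakeTransport:
    """Hypothetical stand-in for a TCP transport; NOT Twisted's class.

    It copies only the behaviour discussed in this thread: write()
    returns silently once the connection is no longer connected.
    """

    def __init__(self):
        self.connected = True
        self.sent = []

    def write(self, data):
        if not self.connected:
            return  # message is dropped, no error is raised
        self.sent.append(data)


class Sender:
    """Hypothetical protocol draining a queue shared through the factory."""

    def __init__(self, queue, transport):
        self.queue = queue
        self.transport = transport

    def check_for_send(self):
        if self.queue:
            self.transport.write(self.queue.popleft())

    def connectionLost(self, buggy=False):
        self.transport.connected = False
        if not buggy:
            # Intended cleanup: make this instance a no-op so it stops
            # consuming messages from the shared queue.
            self.check_for_send = lambda: None


def run(buggy):
    """Return the messages that actually reach the new connection."""
    queue = deque(b"msg%d" % i for i in range(6))
    old = Sender(queue, FakeTransport())
    old.connectionLost(buggy=buggy)       # downstream end drops
    new = Sender(queue, FakeTransport())  # reconnect gives a new instance
    while queue:
        old.check_for_send()
        new.check_for_send()
    return new.transport.sent
```

With buggy=True the old instance keeps taking its turn at the queue, so every other message is handed to a dead transport and disappears without any error. With buggy=False the new instance delivers all six.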
_______________________________________________
Twisted-Python mailing list
Twisted-Python@twistedmatrix.com
http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python