Darren Govoni <dar...@ontrenet.com> writes:

> I spoke too fast. But pardon my noobiness.
>
> Ok, so I am using a simple protocol that is listening on a TCP port.
>
> On the client side, I write 4096 bytes using
> self.transport.write(bytes)
>
> on dataReceived side, I get only 1448.

Quite possible, and even likely with a chunk of 4096 bytes, given likely network latencies and the physical packet sizes at each network hop along the way (1448 bytes is, for instance, the usual payload of a single TCP segment on a 1500-byte Ethernet MTU with TCP timestamps enabled). However, dataReceived will eventually be called additional times until all of the 4096 bytes that were transmitted and received over the socket connection have been handed off to your protocol.

That's just the nature of a stream protocol - it's a constant stream of data being fed by one end and drained on the other, without any natural boundaries or structures within (other than, I suppose, the boundary of an octet since you can't receive a partial octet).

The alternative is to use a datagram protocol like UDP, but then you have all the negatives of no guaranteed delivery, out of order delivery, completely impossible delivery (when trying a datagram larger than the UDP limit), etc. Far easier to just handle the TCP stream properly.

> Now, what I "want" to happen is when I issue a write of a known
> number of bytes. I "want" those bytes to arrive in total because
> they represent a pickled object. The server has no idea if the
> bytes are split and scattered (again, I want the control protocol to
> take effect).

I suspect it may just be a difference in phrasing, but note that I consider "arrive in total" to be different from "arrive in the same number of I/O operations". TCP guarantees the former (sans dropped connections) but not the latter. It's a trade-off that you make in order to get the other benefits of guaranteed delivery with TCP, regardless of network disruptions, latency, etc.

You're fine as long as you just accept up front that you can't make any assumptions as to how the data will arrive at the receiving end. So combine the data in whatever sizes it is received (and in any number of received chunks) until you have it all. You can then de-pickle it or do anything else with it. As a comparison, that's really all PB (Perspective Broker) is doing, although it's banana-encoding the object on the wire rather than pickling.

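To make the "combine until you have it all" step concrete, here is a minimal sketch in plain Python (no Twisted involved). It assumes the receiver already knows how many bytes to expect; the ChunkCollector class and its feed method are hypothetical names used only for illustration. In a real protocol, the body of feed is roughly what dataReceived would do.

```python
import pickle


class ChunkCollector:
    """Accumulate stream chunks until a known number of bytes has arrived.

    Hypothetical helper for illustration only; `expected` is assumed to be
    known to the receiver in advance.
    """

    def __init__(self, expected):
        self.expected = expected
        self.buffer = bytearray()

    def feed(self, chunk):
        """Add one received chunk; return the payload once complete, else None."""
        self.buffer.extend(chunk)
        if len(self.buffer) < self.expected:
            return None  # still waiting for more data
        payload = bytes(self.buffer[:self.expected])
        del self.buffer[:self.expected]  # keep any bytes that belong to what follows
        return payload


# Simulate one write() arriving at the other end in 1448-byte pieces.
obj = list(range(2000))
wire = pickle.dumps(obj)
collector = ChunkCollector(expected=len(wire))

results = [collector.feed(wire[i:i + 1448]) for i in range(0, len(wire), 1448)]

# Only the final call returns the payload. Only unpickle data from peers you trust.
restored = pickle.loads(results[-1])
```

Every call but the last returns None, and the object is only de-pickled once the full byte count has been buffered, no matter how the chunks were sized.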
Depending on the client/server interaction, you may also have the opposite problem - the final chunk of data received may cover more than one client transmission, and you'll have to split it up appropriately. That's why, if you will be transmitting multiple sets of data over a single connection, you'll want some structure (unique boundary codes, encoded length information, parseable data like XML, etc.) in the wire protocol so your server knows when each one is done.

> 1) Am I doing something wrong here?

Not so much wrong, as perhaps a little misguided in terms of trying to have a stream protocol work less as a stream than it does. I suspect you may also be over-estimating a little the complexity of handling this aspect of TCP in your own code.

> 2) Can I force twisted to send ALL the bytes I issue in the write
> without re-thinking TCP or forcing me to re-implement TCP?

Again, distinguish between "send ALL the bytes", which *does* in fact happen, and "receive bytes in identically sized chunks", which will not happen. Though I seriously doubt that your demands are such that they require "re-thinking" or "re-implement[ing]" TCP.

Much easier to stick with the TCP base (loads of benefits), and just encode enough structure into your stream to permit the server to identify the boundaries of the requests. Then, code the server to look for such boundaries while accepting data in any size chunks, and you're done. It's pretty much what every other TCP protocol that has structure to its data does, whether that's length counted, flag bytes, specific textual content (such as the final empty line in an HTTP request), etc.

As has been posted in another response, you may find some of the existing protocols in twisted.protocols.basic to be helpful for this. The older posting of mine that you referenced used a subclass of LineReceiver to encode the length in ASCII as part of an initial header, for example, though it closed the connection when done.

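As one possible way of encoding "enough structure" into the stream, the sketch below uses a length-counted format: each message is preceded by a 4-byte big-endian length. The wire format, the encode_frame function and the FrameDecoder class are all assumptions made for illustration, not code from Twisted or from the earlier posting. It handles both cases discussed above: a message spread over several chunks, and several messages arriving in one chunk.

```python
import struct

HEADER = struct.Struct("!I")  # 4-byte unsigned length, network byte order


def encode_frame(payload):
    """Prefix payload with its length so the receiver can find the boundary."""
    return HEADER.pack(len(payload)) + payload


class FrameDecoder:
    """Turn arbitrarily sized stream chunks back into complete messages."""

    def __init__(self):
        self.buffer = bytearray()

    def feed(self, chunk):
        """Add a chunk; return a list of the messages it completed (maybe empty)."""
        self.buffer.extend(chunk)
        messages = []
        while len(self.buffer) >= HEADER.size:
            (length,) = HEADER.unpack_from(self.buffer)
            end = HEADER.size + length
            if len(self.buffer) < end:
                break  # body still incomplete; wait for more data
            messages.append(bytes(self.buffer[HEADER.size:end]))
            del self.buffer[:end]  # leftover bytes start the next message
        return messages


# Two requests sent back to back, then delivered in 7-byte chunks that
# ignore the message boundaries entirely.
stream = encode_frame(b"first request") + encode_frame(b"second, longer request")
decoder = FrameDecoder()
decoded = []
for i in range(0, len(stream), 7):
    decoded.extend(decoder.feed(stream[i:i + 7]))
```

The same decoder gives the same two messages whether the stream is fed one byte at a time or all at once, which is the property the server needs.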
Beyond that, NetstringReceiver or the Int##StringReceiver classes (Int8StringReceiver, Int16StringReceiver, Int32StringReceiver) take care of the counting on your behalf, and even give subclasses a nice single entry point (stringReceived) to use instead of dataReceived, so your server need not think about the aggregation or splitting of chunks.

If nothing else, reading the source to one of those receiver classes might help provide a concrete example of the aggregation (or splitting) of the stream data that I mention above.

-- David

_______________________________________________
Twisted-Python mailing list
Twisted-Python@twistedmatrix.com
http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python