get_slice() has two lines in it: a call to send_get_slice() and one to recv_get_slice(). send_get_slice() sends the request down the socket to the server and returns. recv_get_slice() does a blocking read (with timeout) against the socket, pulls the entire message, decodes it and returns it.
Without looking at the generated thrift code, this sounds dangerous.
What happens if send_get_slice() blocks? What happens if
recv_get_slice() has to block because you didn't happen to receive the
response in one packet?
Normally you're either doing blocking code or callback oriented
reactive code. It sounds like you're trying to use blocking calls in a
non-blocking context under the assumption that readable data on the
socket means the entire response is readable, and that the socket
being writable means that the entire request can be written without
blocking. This might seem to work and you may not block, or block
only briefly. Until, for example, a TCP connection stalls and your
entire event loop hangs due to a blocking read.
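
To make the concern concrete, here is a minimal sketch using only the Python standard library. It does not use thrift or Cassandra: recv_exact() and demo_partial_response() are hypothetical names, the 4-byte length prefix is just an assumed framing for illustration, and a socketpair stands in for the real connection. It shows a socket that reports readable while a blocking read of the whole message still stalls until the timeout.

```python
import select
import socket
import struct


def recv_exact(sock, n):
    """Blocking read of exactly n bytes (hypothetical helper, not thrift code)."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed before the full message arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def demo_partial_response():
    """Return (socket_was_readable, full_read_blocked_until_timeout)."""
    client, server = socket.socketpair()
    try:
        client.settimeout(0.2)  # stands in for the client's read timeout
        payload = b"x" * 100
        frame = struct.pack("!I", len(payload)) + payload

        # The "server" delivers only the first 10 bytes of the response,
        # as if the rest were stuck behind a stalled TCP connection.
        server.sendall(frame[:10])

        # The event loop's view: the socket is readable.
        readable, _, _ = select.select([client], [], [], 1.0)
        is_readable = client in readable

        # The blocking client's view: it wants the whole frame.
        try:
            (length,) = struct.unpack("!I", recv_exact(client, 4))
            recv_exact(client, length)
            blocked = False
        except socket.timeout:
            blocked = True
        return is_readable, blocked
    finally:
        client.close()
        server.close()
```

In a single-threaded event loop, nothing else runs for the 0.2 seconds that the second recv_exact() call spends waiting, and with a longer timeout the stall is correspondingly longer.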
I'm not interrupting any of the work thrift is doing when reading or writing to the socket. Those functions still get to complete as normal. The goal is to let the tornado server work on another request while the first one is waiting for Cassandra to do its work. It's wasted time on the web heads that could otherwise be employed servicing other requests.
Once it detects the socket state has changed it will add the callback into the event loop. I then ask the Cassandra client to read all the data from the socket. It's still a blocking call; we just don't bother to call it unless we know there is data sitting there for it.
The recv could still block, hang, etc., but it will do that in the normal blocking model. I'll need to test the timeouts and error propagation in these cases.
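
A rough sketch of that pattern, with error propagation, might look like the following. This is not the actual tornado or thrift code: the standard-library selectors module stands in for the tornado event loop, and wait_then_recv(), run_once() and demo() are hypothetical names. The blocking read runs only once the socket is reported readable, and a timeout or socket error is handed to the callback instead of escaping into the loop.

```python
import selectors
import socket


def wait_then_recv(sel, sock, blocking_recv, on_done):
    """Defer blocking_recv(sock) until sock is readable.

    on_done(result, error) receives exactly one non-None argument.
    """
    def on_readable():
        sel.unregister(sock)
        try:
            on_done(blocking_recv(sock), None)
        except OSError as exc:  # socket.timeout is a subclass of OSError
            on_done(None, exc)

    sel.register(sock, selectors.EVENT_READ, on_readable)


def run_once(sel, timeout=1.0):
    """One pass of a toy event loop: run the callback of each ready socket."""
    for key, _events in sel.select(timeout):
        key.data()


def demo():
    sel = selectors.DefaultSelector()
    client, server = socket.socketpair()
    results = []
    try:
        client.settimeout(0.2)  # the read can still block, but not forever
        wait_then_recv(
            sel,
            client,
            lambda s: s.recv(5),  # stand-in for the client's blocking recv call
            lambda result, error: results.append((result, error)),
        )
        server.sendall(b"hello")  # the "server" replies in full
        run_once(sel)
        return results
    finally:
        sel.close()
        client.close()
        server.close()
```

The loop is free to service other sockets until the reply starts to arrive; the remaining risk is the one raised above, where only part of the reply has arrived and the blocking read holds the loop until its timeout fires.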
Thanks for the feedback
Aaron