Dear experts, :)

Our application triggered an OOM error in Cassandra 0.6.5 by reading the same 1.7 MB column repeatedly (~80k reads). I analyzed the heap dump, and it looks like the column value was queued 5400 times in an OutboundTcpConnection destined for the Cassandra instance that received the client request. Unfortunately, this inter-node connection goes across a 100 Mb/s data center interconnect, so the responses queued up faster than they could be sent, and it was only a matter of time before the heap was exhausted.
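
A rough back-of-envelope sketch of why the queue grows, using only the figures above. It assumes that 1.7 MB means 1.7 × 10^6 bytes, that the link sustains its full 100 Mb/s with no protocol overhead or competing traffic, and that every queued entry is sent in full. The function names are made up for illustration and are not Cassandra APIs.

```python
# Figures taken from the description above; the interpretation of the
# units (decimal megabytes, full line rate) is an assumption.
COLUMN_BYTES = 1.7e6        # size of the column value
LINK_BITS_PER_SEC = 100e6   # 100 Mb/s interconnect
QUEUED_COPIES = 5400        # copies found in the heap dump


def seconds_per_response(column_bytes, link_bps):
    """Time to push one copy of the column over the link."""
    return column_bytes * 8 / link_bps


def max_reads_per_second(column_bytes, link_bps):
    """Highest read rate the link can carry without the queue growing."""
    return link_bps / (column_bytes * 8)


def drain_seconds(copies, column_bytes, link_bps):
    """Time to send everything already queued, with no new reads arriving."""
    return copies * seconds_per_response(column_bytes, link_bps)


if __name__ == "__main__":
    print(f"per response:     {seconds_per_response(COLUMN_BYTES, LINK_BITS_PER_SEC):.3f} s")
    print(f"sustainable rate: {max_reads_per_second(COLUMN_BYTES, LINK_BITS_PER_SEC):.2f} reads/s")
    print(f"drain time:       {drain_seconds(QUEUED_COPIES, COLUMN_BYTES, LINK_BITS_PER_SEC):.0f} s")
```

Under these assumptions the link carries only about 7 such reads per second, so any client issuing reads faster than that makes the outbound queue grow without bound.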

Is there something I can do (other than changing the application's behavior) to avoid this failure mode? I'm not the first to run into this, am I?

Thanks,
Dan