On Thu, Jan 2, 2014 at 2:05 PM, Thunder Stumpges <thunder.stump...@gmail.com > wrote:
> I am seeing a read operation delay in our small (3 node) cluster where I
> am testing. The "normal" latency for these operations is < 2ms as recorded
> by our load client. This holds easily beyond several hundred qps. However
> there are times when all incoming queries (on a node-by-node basis) are
> stalled anywhere from ~100-500ms, and then all "clear" and return at the
> same time. This behavior is independent of the amount of load applied; Just
> more queries get stalled at higher loads :). It seems like a "stall"
> condition happens maybe every 30 seconds or so.
>

What version of cassandra, what configuration for thrift server if relevant, what protocol being used?

The JVM can be expected to pause for 100-500ms while doing GC, cassandra logs the various GC types, what do you see in those logs?

What does "nodetool tpstats" say when it happens?

=Rob
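
For checking the GC logs mentioned above, a rough sketch along these lines might help pull out the long pauses. It assumes the log lines look like "GC for <collector>: <N> ms for <M> collections", which is not confirmed anywhere in this thread and may differ by version, so adjust the pattern to whatever your system.log actually contains. The function name, the threshold, and the sample lines are all made up for illustration.

```python
import re

# Assumed log shape (not confirmed by this thread); adjust to match your system.log.
GC_LINE = re.compile(
    r"GC for (?P<collector>\w+): (?P<ms>\d+) ms for (?P<count>\d+) collections"
)


def long_gc_pauses(lines, threshold_ms=100):
    """Return (collector, pause_ms, raw_line) for GC lines at or above threshold_ms."""
    hits = []
    for line in lines:
        match = GC_LINE.search(line)
        if match and int(match.group("ms")) >= threshold_ms:
            hits.append(
                (match.group("collector"), int(match.group("ms")), line.rstrip())
            )
    return hits


# Made-up sample lines for illustration only, not taken from a real cluster.
SAMPLE = [
    "INFO [ScheduledTasks:1] 2014-01-02 14:05:01,000 GCInspector.java (line 116) "
    "GC for ParNew: 12 ms for 1 collections, 100 used; max is 1000",
    "INFO [ScheduledTasks:1] 2014-01-02 14:05:31,000 GCInspector.java (line 116) "
    "GC for ParNew: 340 ms for 1 collections, 200 used; max is 1000",
    "INFO [ScheduledTasks:1] 2014-01-02 14:06:02,000 GCInspector.java (line 116) "
    "GC for ConcurrentMarkSweep: 480 ms for 1 collections, 300 used; max is 1000",
]

if __name__ == "__main__":
    for collector, pause_ms, _ in long_gc_pauses(SAMPLE):
        print(collector, pause_ms)
```

If the timestamps of the pauses it reports line up with the stalls seen by the load client, GC is the likely cause; if not, the tpstats output is the next place to look.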