The write path for counters is different from the one for non-counter fields; for background see http://www.datastax.com/wp-content/uploads/2011/07/cassandra_sf_counters.pdf

The write is applied on the leader *and then* replicated to the other replicas. This was controlled by a config setting called replicate_on_write, which IIRC has been removed because you always want to do this. You can see this traffic in the REPLICATE_ON_WRITE thread pool.
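As an aside, the leader-then-replicate flow above can be sketched in a few lines. This is an illustrative model only, not Cassandra code: the Replica class and increment function are made-up names, and the real replication is asynchronous background traffic (what shows up in the REPLICATE_ON_WRITE stage), not a sequential loop.

```python
# Hypothetical sketch of the counter write path: the increment is applied
# on the leader replica first, and only then is the resulting delta
# replicated to the other replicas. Names are illustrative, not APIs.

class Replica:
    def __init__(self, name):
        self.name = name
        self.counts = {}                     # counter name -> local count

    def apply_delta(self, counter, delta):
        self.counts[counter] = self.counts.get(counter, 0) + delta

def increment(leader, other_replicas, counter, delta):
    # 1. The leader applies the write locally first...
    leader.apply_delta(counter, delta)
    # 2. ...then the delta is replicated to the remaining replicas
    #    (asynchronous in the real system; sequential here for clarity).
    for replica in other_replicas:
        replica.apply_delta(counter, delta)

a, b = Replica("a"), Replica("b")
increment(a, [b], "page_views", 1)
increment(a, [b], "page_views", 1)
print(a.counts["page_views"], b.counts["page_views"])   # 2 2
```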
Have a look at the ROW stage and see if it is backing up.

> 1) Is the whole of 7-8ms being spent in thrift overheads and
> Scheduling delays ? (there is insignificant .1ms ping time between
> machines)

The storage proxy / JMX latency is the total latency for the coordinator after the thrift deserialisation (and before serialising the response). 7 to 8 ms sounds a little high considering the low local node latency, but it would make sense if the nodes were at peak throughput. At max throughput, request latency is wait time + processing time. What happens to node-local latency and cluster latency when the throughput goes down?

Also, this will be responsible for some of that latency…

> (GC
> stops threads for 100ms every 1-2 seconds, effectively pausing
> cassandra 5-10% of its time, but this doesn't seem to be the reason)

> 2) Does keeping a large number of CFs (17 in our case) adversely affect
> write performance? (except from the extreme flushing scenario)

Should be fine with 17.

> 3) I see a lot of threads (4,000-10,000) with names like
> "pool-2-thread-*"

These are connection threads. Use connection pooling on the client, or try the thread-pooled connection manager; see the yaml for details.

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 13/07/2012, at 3:48 PM, rohit bhatia wrote:

> Hi
>
> As I understand it, writes in cassandra are pushed directly to memory,
> and using counters with CL.ONE shouldn't take the read latency for
> counters into account. So writes incrementing counters with CL.ONE
> should basically be really fast.
>
> But in my 8 node cluster (16 core/32G ram/cassandra 1.0.5/java7 each)
> with RF=2, at a traffic of 55k qps = 14k increments per node/7k write
> requests per node, the write latency (from JMX) increases to around
> 7-8 ms from the low-traffic value of 0.5 ms. The nodes aren't even
> pushed, with no I/O pressure, lots of free RAM and 30% CPU idle
> time/OS load 20.
> The write latency by cfstats (supposedly the latency for 1 node to
> increment its counter) is a small amount (< 0.05 ms).
>
> 1) Is the whole of 7-8ms being spent in thrift overheads and
> scheduling delays? (there is insignificant .1ms ping time between
> machines)
>
> 2) Does keeping a large number of CFs (17 in our case) adversely affect
> write performance? (except from the extreme flushing scenario)
>
> 3) I see a lot of threads (4,000-10,000) with names like
> "pool-2-thread-*" (pointed out as client-connection threads on the
> mailing list before) periodically forming up. But with idle CPU time
> and zero pending tasks in tpstats, why do requests keep piling up? (GC
> stops threads for 100ms every 1-2 seconds, effectively pausing
> cassandra 5-10% of its time, but this doesn't seem to be the reason)
>
> Thanks
> Rohit
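A back-of-envelope sketch of the "at max throughput, request latency is wait time + processing time" point, using a simple M/M/1 queue model. The numbers are illustrative assumptions, not measurements from this cluster, but they show how a 0.5 ms service time can stretch to 7-8 ms purely from queueing near saturation, and that the quoted GC figures are self-consistent.

```python
# M/M/1 mean time in system = service_time / (1 - utilisation),
# i.e. waiting time plus processing time. Illustrative model only.

def mm1_latency_ms(service_ms, utilisation):
    """Mean request latency (wait + service) for an M/M/1 queue."""
    assert 0 <= utilisation < 1
    return service_ms / (1.0 - utilisation)

service_ms = 0.5                         # the low-traffic (no-wait) latency
for rho in (0.1, 0.5, 0.9, 0.93):
    print(rho, round(mm1_latency_ms(service_ms, rho), 2))
# At ~93% utilisation the same 0.5 ms of work already takes ~7 ms,
# roughly the jump reported under load.

# The GC numbers in the question are also self-consistent:
# a 100 ms stop-the-world pause every 1-2 s pauses the JVM 5-10%.
for period_s in (1.0, 2.0):
    print(f"{0.1 / period_s:.0%}")       # 10%, 5%
```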
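On the connection-pooling suggestion: the point is that a fixed-size shared pool caps the number of open connections (and hence server-side "pool-2-thread-*" threads), instead of opening one per request. The sketch below is a generic, hypothetical pool; FakeConnection is a stand-in, and a real Thrift client of that era (e.g. pycassa or Hector) ships its own pooled connection manager.

```python
# Minimal connection-pool sketch: N requests share a fixed set of
# connections instead of each opening their own. Illustrative only.
import queue

class FakeConnection:
    opened = 0                            # counts connections ever created
    def __init__(self):
        FakeConnection.opened += 1
    def execute(self, op):
        return f"done:{op}"

class ConnectionPool:
    def __init__(self, size):
        self._pool = queue.Queue()
        for _ in range(size):
            self._pool.put(FakeConnection())
    def execute(self, op):
        conn = self._pool.get()           # blocks if all connections busy
        try:
            return conn.execute(op)
        finally:
            self._pool.put(conn)          # return the connection for reuse

pool = ConnectionPool(size=4)
results = [pool.execute(i) for i in range(100)]
print(FakeConnection.opened)              # 4: 100 requests, 4 connections
```

With per-request connections the server would see 100 connection threads here; with the pool it sees 4, which is the behaviour the pooling advice is after.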
