You want to balance the cluster first, before you do any other performance tuning.
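
As a rough sketch of what balancing means here: assuming the cluster uses RandomPartitioner (the original post does not say), evenly spaced tokens for an N-node ring are usually computed as i * 2**127 / N. The helper name `balanced_tokens` below is made up for illustration, and the ring size is an assumption tied to that partitioner.

```python
# Sketch only: evenly spaced tokens for a ring, assuming RandomPartitioner,
# whose token space is taken here to be 0 .. 2**127.
# `balanced_tokens` is a hypothetical helper, not part of Cassandra.

def balanced_tokens(node_count, ring_size=2**127):
    """Return node_count tokens spaced evenly around the ring."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    # Integer arithmetic keeps the large tokens exact.
    return [i * ring_size // node_count for i in range(node_count)]


if __name__ == "__main__":
    # 8 nodes, as in the cluster described in the quoted message.
    for index, token in enumerate(balanced_tokens(8)):
        print("node %d: %d" % (index, token))
```

Each node would then be moved to its target token one at a time (typically with `nodetool move <token>`, followed by a cleanup), and the ring checked again before looking at queues or thread pools.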

On Tuesday, January 17, 2012, Marcel Steinbach <marcel.steinb...@chors.de> wrote:
> Hi,
> we're running a 8 node cassandra-0.7.6 cluster, with avg. throughput of 5k reads/s and almost as much writes/s. The client API is pelops 1.1-0.7.x.
>
> Latencies in the CFs (RecentReadLatencyHistogramMicros) look fine with 99th percentile at 61ms. However, on the client side, p99 latency is at 1.1s (seconds!) and we only have 91% below 60ms! So there is a big difference between the numbers shown in the CF latencies and what the client actually experiences.
>
> The cluster is not very balanced currently, but nothing indicates a latency of 1.1 seconds. I also see high write latency in the clients, with about 4% taking 50+ ms. Whereas in the RecentWriteLatencyHistogramMicros, 99,9% of the latencies are below 1ms.
>
> I'm not sure where the additional latency is gained. Is it possible, the request spends some time in a queue before being processed? If so, is there a way to optimize that? I already increased core pool size for the ReadStage, which didn't improve things.
> Any other ideas?
>
> Thanks!
> Marcel