Looking at the histograms, it looks like the latency is in the hot write path not the request wait time. Because at first glance the request latency and the cf latency match up. Consider graphing the request and write latencies to see how much time is spent waiting.
So it could be waiting for the switch lock on the write path, which can get contended when there is a lot of flush activity. Look for blocked flush writers or a pattern in the log where an SSTable is queued for flush as soon as one starts writing. This is normally associated with lots of SSTable and/or lots of secondary indexes. I think it can also happen when a nodetool flush is run. Or it could be contention around the commit log. Request tracing will be your friend http://www.datastax.com/dev/blog/tracing-in-cassandra-1-2 Cheers ----------------- Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton http://www.thelastpickle.com On 7/03/2013, at 11:13 AM, Wei Zhu <wz1...@yahoo.com> wrote: > If you are tight about your SLA, try set socketTimeout from Hector with small > number so that it can retry faster given the assumption that your write is > idempotent. > > Regarding your write latency, don't have much insight. We see spike on the > reads due to GC/compaction etc. But not write latency. > > Thanks. > -Wei > > ----- Original Message ----- > From: "Jouni Hartikainen" <jouni.hartikai...@reaktor.fi> > To: user@cassandra.apache.org > Sent: Wednesday, March 6, 2013 10:44:49 PM > Subject: Write latency spikes > > Hi all, > > I'm experiencing strange latency spikes when writing and trying to figure out > what could cause them. > > My setup: > - 3 nodes, writing at CL.ONE using Hector client, no reads > - Writing simultaneously to 3 CFs, inserts with 25h TTL, no deletes, no > updates, RF 3 > - 2 CFs have small data (row count < 2000, row size < 500kB, column > count/row < 15 000) > - 1 CF has lots of binary data split into ~60kB columns (row count < 550 > 000, row sizes < 2MB, column count/row < 40) > - Write rate ~300 inserts / s for each CF, total write throughput ~25 MB > (bytes) / second > - data is time series using timestamp as column key > - Cassandra 1.2.2 with 256 vnodes on each machine > - Key cache at default 100MB, no row cache > - 1 x Xeon L5430 CPU, 16GB RAM, 2.3T disc on RAID10 (10k SAS), Sun/Oracle JDK > 1.6 (tried also 1.7), 4GB JVM heap, JNA enabled > - all nodes in the same DC, 1Gb network, sub ms latencies between nodes > > cassandra.yaml: http://pastebin.com/MSr2prpb > cfstats: http://pastebin.com/Ax5vPUcY > example cfhistograms: http://pastebin.com/qYSL1MX3 > example proxy histograms: http://pastebin.com/X3AGGEjh > > With this setup I usually get quite nice write latencies of less than 20ms, > but sometimes (~once in a every few minutes) latencies momentarily spike to > more than 300ms maxing out at ~2.5 seconds. Spikes are short (< 1 s) and > happen on all nodes (but not at the same time). Even if avg latencies are > very good, these spikes cause us headaches due to our SLA. > > While investigating I have learned the following: > - No evident GC pressure (nothing in C* logs, GC logging showing constantly < > 30ms collection pauses) > - No I/O bounds (disks provide ~1GB/s linear write and are mostly idle apart > from memtable flushes for every ~11s) > - No relation between spikes & compaction > - No queuing in memtable FlushWriter, no blocked memtable flushes > - Nothing alarming in logs > - No timeouts, no errors on the client side > - Each client (3 separate machines) experience latencies simultaneously which > points to cause being in C*, not in the client > - CPU load < 10% (< 20% while compacting) > - Latencies measured both from the client and observed using nodetool > cfhistograms > > Now I'm running out of ideas about what might cause the spikes as I have > understood that there is really not that many places on the write path that > could block. > > Any ideas? > > -Jouni