If you are tight about your SLA, try set socketTimeout from Hector with small number so that it can retry faster given the assumption that your write is idempotent.
Regarding your write latency, don't have much insight. We see spike on the reads due to GC/compaction etc. But not write latency. Thanks. -Wei ----- Original Message ----- From: "Jouni Hartikainen" <jouni.hartikai...@reaktor.fi> To: user@cassandra.apache.org Sent: Wednesday, March 6, 2013 10:44:49 PM Subject: Write latency spikes Hi all, I'm experiencing strange latency spikes when writing and trying to figure out what could cause them. My setup: - 3 nodes, writing at CL.ONE using Hector client, no reads - Writing simultaneously to 3 CFs, inserts with 25h TTL, no deletes, no updates, RF 3 - 2 CFs have small data (row count < 2000, row size < 500kB, column count/row < 15 000) - 1 CF has lots of binary data split into ~60kB columns (row count < 550 000, row sizes < 2MB, column count/row < 40) - Write rate ~300 inserts / s for each CF, total write throughput ~25 MB (bytes) / second - data is time series using timestamp as column key - Cassandra 1.2.2 with 256 vnodes on each machine - Key cache at default 100MB, no row cache - 1 x Xeon L5430 CPU, 16GB RAM, 2.3T disc on RAID10 (10k SAS), Sun/Oracle JDK 1.6 (tried also 1.7), 4GB JVM heap, JNA enabled - all nodes in the same DC, 1Gb network, sub ms latencies between nodes cassandra.yaml: http://pastebin.com/MSr2prpb cfstats: http://pastebin.com/Ax5vPUcY example cfhistograms: http://pastebin.com/qYSL1MX3 example proxy histograms: http://pastebin.com/X3AGGEjh With this setup I usually get quite nice write latencies of less than 20ms, but sometimes (~once in a every few minutes) latencies momentarily spike to more than 300ms maxing out at ~2.5 seconds. Spikes are short (< 1 s) and happen on all nodes (but not at the same time). Even if avg latencies are very good, these spikes cause us headaches due to our SLA. While investigating I have learned the following: - No evident GC pressure (nothing in C* logs, GC logging showing constantly < 30ms collection pauses) - No I/O bounds (disks provide ~1GB/s linear write and are mostly idle apart from memtable flushes for every ~11s) - No relation between spikes & compaction - No queuing in memtable FlushWriter, no blocked memtable flushes - Nothing alarming in logs - No timeouts, no errors on the client side - Each client (3 separate machines) experience latencies simultaneously which points to cause being in C*, not in the client - CPU load < 10% (< 20% while compacting) - Latencies measured both from the client and observed using nodetool cfhistograms Now I'm running out of ideas about what might cause the spikes as I have understood that there is really not that many places on the write path that could block. Any ideas? -Jouni