What version of Cassandra are you using? Have you checked whether Cassandra is
undergoing GC during those pauses?
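Long GC pauses are usually reported by GCInspector in Cassandra's system.log.
A quick sketch for pulling those lines out (the log path is an assumption and
varies by install):

```python
def gc_pause_lines(lines):
    """Return log lines emitted by GCInspector, which reports long GC pauses."""
    return [line for line in lines if "GCInspector" in line]

# Usage (path is an assumption; adjust to your installation):
# with open("/var/log/cassandra/system.log") as f:
#     for line in gc_pause_lines(f):
#         print(line.rstrip())
```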
-SC
From: [email protected]
To: [email protected]
Subject: Cassandra periodically stops responding to write requests under load
Date: Fri, 14 Jun 2013 14:19:57 +0000
Hello,
I have been doing some scripted load testing of Cassandra as part of
determining what hardware to deploy Cassandra on for the particular load
profile our application will generate. I’m seeing generally good performance,
but there are periods where the Cassandra node stops responding to write
requests entirely for several seconds at a time. I don't have much experience
with Cassandra performance tuning, and would very much appreciate some pointers
on what I can do to improve matters.
__Load profile__
The load profile I’ve tested is the following:
-- A single Cassandra node
-- 40 keyspaces
-- Each keyspace has 2 small column families with a handful of rows, and 1
large column family with approx 25,000 rows. The rows are <1k in size.
The perf test then has 20 instances connect to the server (they use pycassa),
doing the following operation:
-- Read a random row from the large column family on a random one of the
keyspaces
-- Write that row into a random one of the other keyspaces (the test is
arranged so that this is likely to be a non-existing row in the new keyspace)
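For clarity, the per-instance cycle can be sketched as below. The pycassa
objects are replaced by dict-backed stand-ins so the sketch is self-contained;
in the real test each stand-in would be a pycassa ColumnFamily, whose
get/insert calls have the same shape. Which target column family receives the
write is as described above; other details are assumptions.

```python
import random

class FakeColumnFamily:
    """Dict-backed stand-in for a pycassa ColumnFamily (get/insert only)."""
    def __init__(self):
        self.rows = {}
    def get(self, key):
        return self.rows[key]
    def insert(self, key, columns):
        self.rows[key] = columns

def read_write_cycle(large_cfs, rng=random):
    """One test cycle: read a random row from the large column family of a
    random keyspace, then write it under the same key into a different,
    randomly chosen keyspace (likely a non-existing row there)."""
    src = rng.choice(list(large_cfs))
    dst = rng.choice([ks for ks in large_cfs if ks != src])
    key = rng.choice(list(large_cfs[src].rows))
    row = large_cfs[src].get(key)
    large_cfs[dst].insert(key, row)
    return src, dst, key
```

In the real test, `large_cfs` would map each keyspace name to a
`pycassa.ColumnFamily` backed by a `pycassa.ConnectionPool` for that keyspace.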
I run the test so that 500 read/write cycles are generated per second (total
across all instances). The Cassandra server keeps up fine with this rate, and
is using only a small fraction of the CPU/memory available. Every few minutes,
though, there is a several-second period during which no writes are serviced.
This seems to coincide with memtables being flushed to disk.
Note that this read/write rate is an order of magnitude lower than the maximum
load this server is able to cope with if pushed as hard as possible by the
clients.
__Tuning attempted__
I've tried making several changes to see if any of them improved matters:
-- I've tried putting the commitlog directory on a separate drive. That didn't
make any appreciable difference.
-- I've used a RAID array for the data directory to improve write performance.
This significantly reduces the length of the slow period (from ~10s to ~2s),
but doesn't eliminate it. I've tried RAID10 and RAID0 with varying numbers of
drives, but there doesn't seem to be a significant difference between the
two.
-- I've used multiple drives for the data directory, symlinking the directories
for different keyspaces to different drives. That didn't improve things
significantly compared to using a single drive.
-- Reducing the commitlog segment interval to force more frequent smaller
flushes doesn't make any difference.
-- Increasing the memtable flush queue size doesn't make any difference.
-- Disabling compaction doesn't help either.
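For reference, the knobs touched above correspond to cassandra.yaml settings
(names as in the 1.x line; the values below are purely illustrative, not
recommendations):

```yaml
# cassandra.yaml (illustrative values only)
commitlog_directory: /mnt/commitlog    # separate spindle from the data dirs
data_file_directories:
    - /mnt/data1
commitlog_segment_size_in_mb: 32       # smaller segments rotate more often
memtable_flush_queue_size: 4           # full memtables allowed to queue for flush
memtable_flush_writers: 1              # usual guidance: one per data directory
```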
Any suggestions would be much appreciated.
Thanks,
James Lee