Hi

I am investigating Cassandra write performance and am seeing very heavy CPU
usage from Cassandra. I have a single-node Cassandra instance running on a
dual-core (2.66 GHz Intel) Ubuntu 9.10 server. The writes are generated on
the same server using BatchMutate(). The client makes exactly one RPC call at
a time: each BatchMutate() RPC carries 2 MB of data, and the next RPC is sent
only after Cassandra acknowledges the previous one. Cassandra has two separate
disks: one for the commitlog, with a sequential bandwidth of 130 MB/s, and a
solid-state disk for data, with a bandwidth of 90 MB/s.

After tuning various parameters, the maximum write throughput I can attain is
about 45 to 50 MB/s. The Cassandra Java process consistently uses 100% to 150%
CPU (as shown by top) for the entire write operation. iostat also clearly
shows that neither disk reaches its maximum bandwidth at any point during the
writes: every now and then the I/O activity on the commitlog disk and the data
disk spikes, but it is never sustained close to their peak.

So I would guess that the CPU is the bottleneck here. Does anyone have any
idea why Cassandra is so hard on the CPU in this setup? Any suggestions on how
to go about finding the exact bottleneck?
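
For reference, here is a rough back-of-the-envelope sketch of what the figures
above imply. It is only arithmetic on the numbers in this post, not a
measurement. The function names are made up for illustration, and it assumes
that every client byte is written once to the commitlog disk and once to the
data disk, which ignores compaction and any serialization overhead.

```python
# Rough arithmetic from the figures quoted in this post.
# Assumption: throughput is in megabytes per second, and each client byte is
# written once to the commitlog disk and once to the data disk (this ignores
# compaction, serialization overhead, and anything else Cassandra writes).

BATCH_MB = 2.0                  # payload per BatchMutate() RPC
OBSERVED_MBPS = (45.0, 50.0)    # observed write throughput range
COMMITLOG_DISK_MBPS = 130.0     # sequential bandwidth of the commitlog disk
DATA_DISK_MBPS = 90.0           # bandwidth of the solid-state data disk


def rpc_stats(throughput_mbps, batch_mb=BATCH_MB):
    """Return (RPCs per second, milliseconds per RPC) for a serial client."""
    rpcs_per_sec = throughput_mbps / batch_mb
    return rpcs_per_sec, 1000.0 / rpcs_per_sec


def disk_utilisation(throughput_mbps, disk_mbps):
    """Fraction of a disk's bandwidth used at the given write rate."""
    return throughput_mbps / disk_mbps


if __name__ == "__main__":
    for mbps in OBSERVED_MBPS:
        rate, ms = rpc_stats(mbps)
        print(f"{mbps:.0f} MB/s: {rate:.1f} RPC/s, {ms:.1f} ms per RPC, "
              f"commitlog {disk_utilisation(mbps, COMMITLOG_DISK_MBPS):.0%}, "
              f"data {disk_utilisation(mbps, DATA_DISK_MBPS):.0%}")
```

Under these assumptions the client completes roughly 22 to 25 RPCs per second
(about 40 to 44 ms each), using under 40% of the commitlog disk's bandwidth
and a little over half of the data disk's.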

Some more information about the writes: I have 2 column families, but the data
is mostly written to one of them, with column sizes of around 32 KB and each
row having around 256 or 512 columns. I would really appreciate any help here.
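
To put those sizes in perspective, here is a small sketch of the row and batch
arithmetic. It assumes "32k" means 32 KiB and "2 MB" means 2 MiB, counts
column values only (no column names, timestamps, or Thrift overhead), and
uses illustrative function names. Whether a given batch touches one row or
several is not captured here.

```python
# Row and batch sizing from the figures in this post.
# Assumptions: 32 KiB per column value, 2 MiB per BatchMutate() payload,
# and no per-column overhead (names, timestamps, Thrift framing).

KIB = 1024
COLUMN_BYTES = 32 * KIB          # approximate size of one column value
BATCH_BYTES = 2 * KIB * KIB      # payload of one BatchMutate() RPC


def row_bytes(columns, column_bytes=COLUMN_BYTES):
    """Approximate size in bytes of a row with the given number of columns."""
    return columns * column_bytes


def columns_per_batch(batch_bytes=BATCH_BYTES, column_bytes=COLUMN_BYTES):
    """How many whole columns fit in one batch payload."""
    return batch_bytes // column_bytes


def batches_per_row(columns, batch_bytes=BATCH_BYTES):
    """Number of batches needed to write one full row (rounded up)."""
    total = row_bytes(columns)
    return (total + batch_bytes - 1) // batch_bytes


if __name__ == "__main__":
    for columns in (256, 512):
        mib = row_bytes(columns) / (KIB * KIB)
        print(f"{columns} columns: about {mib:.0f} MiB per row, "
              f"{batches_per_row(columns)} batches of "
              f"{columns_per_batch()} columns each")
```

So each row is on the order of 8 to 16 MiB, and a single 2 MiB batch holds
roughly 64 columns.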

Thanks,
Rishi



      
