Hey, we are considering Cassandra for a fairly large project, so I ran some tests with it. I was testing mainly performance and stability.
My main tool for benchmarks was stress.py (or an equivalent written in C++, to work around Python 2.5's lack of multiprocessing). I will focus only on reads (random with normal distribution, which is the default in stress.py) because writes were /quite/ good.

I have 8 machines (Xen guests, each with a dedicated pair of 2TB SATA disks combined in RAID-0). Every machine has 4 dedicated cores at 2.4 GHz and 4GB RAM. The Cassandra commitlog and data dirs were on the same disk, I gave Cassandra a 2.5GB heap, and the key and row caches were disabled (standard Keyspace1 schema; all tests use the Standard1 CF). All other options were left at their defaults. I disabled the caches because I was testing random (or semi-random, normally distributed) reads, so they wouldn't help much (and also because 4GB of RAM is not a lot).

For the first test I installed Cassandra on only one machine, to get a baseline for later comparison with the large cluster and with other DBs.

1) RF was set to 1. I inserted ~20GB of data (the number reported in the Load column of nodetool ring output) using stress.py (100 columns per row). Then I tested reads and got 200 rows/second (reading 100 columns per row, CL=ONE; the disk was the bottleneck at 100% util). No other operation (compaction, insertion, etc.) was pending during the reads.

2) So I moved to a bigger cluster: 8 machines with RF set to 2. I inserted about ~20GB of data per node (so 20GB * 8 / 2 = 80GB of "real data"). Then I tested reads, exactly the same way as before, and got about 450 rows/second (reading 100 columns, though reading only 1 in fact makes no difference; CL=ONE; the disk on every machine was at 100% util because of the random reads).

3) Then I changed RF from 2 to 3 on the cluster described in 2), so every node ended up loaded with about 30GB of data. Then, as usual, I tested reads, and got only 300 rows/second from the whole cluster (100% util on every disk).
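To make the read pattern concrete, here is a minimal Python sketch of the kind of key distribution I mean by "random with normal distribution". The sigma_fraction value is my own assumption for illustration, not stress.py's exact default:

```python
import random

# Read keys are drawn from a gaussian centered on the middle of the
# keyspace, so a "hot set" in the middle is read far more often than
# keys in the tails (unlike a uniform random pattern).
def gauss_key(num_keys, sigma_fraction=0.15):
    mu = num_keys / 2.0
    sigma = num_keys * sigma_fraction  # assumed width, not stress.py's default
    while True:  # redraw samples that fall outside the keyspace
        k = int(random.gauss(mu, sigma))
        if 0 <= k < num_keys:
            return k

# e.g. picking keys for a read run against 1M inserted rows:
keys = [gauss_key(1000000) for _ in range(1000)]
```

Even with this clustering, the hot set is far larger than 4GB of RAM once tens of GB are inserted, which is why I treated the workload as effectively random for the disks.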
4) The last test was with RF=3 as before, but I inserted even more data, so every node of the 8-machine cluster had ~100GB of data (8 * 100GB / 3 = 266GB of real data). In this case I got only 125 rows/second. I was using multiple processes and machines to drive the reads.

*So my question is: why are these numbers so low? What is especially surprising to me is that changing RF from 2 to 3 drops performance from 450 to 300 reads per second. Is this because of read repair?*

PS. To compare Cassandra's performance with other DBs, I also tested MySQL with almost the same data (one table with two columns, key (INT PK) and value (VARCHAR(500)), simulating the 100 Cassandra columns of a single row). MySQL was installed on the same machine as the Cassandra from test 1) (which is one of the 8 machines described above). I inserted some data and then tested random reads (which was even worse for caching, because I used the standard rand() from C++ to generate keys, not a normal distribution). Here are the results:

  size of data in db -> reads per second
  21 GB              -> 340
  400 GB             -> 200

So I got more reads from a single MySQL instance with 400GB of data than from 8 machines storing about 266GB. This doesn't look good. What am I doing wrong? :)

Cheers,
Kamil
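For anyone checking the arithmetic: all of the "real data" figures in the tests above come from the same formula, nodes * load per node / RF (each logical row is stored RF times). A one-function sketch:

```python
# Effective (deduplicated) dataset size for a cluster: with `nodes`
# machines each holding `load_gb` of on-disk data and replication
# factor `rf`, the logical dataset is nodes * load_gb / rf.
def real_data_gb(nodes, load_gb, rf):
    return nodes * load_gb / rf

print(real_data_gb(8, 20, 2))   # test 2: 80.0 GB
print(real_data_gb(8, 30, 3))   # test 3: 80.0 GB again, just one more replica
print(real_data_gb(8, 100, 3))  # test 4: ~266.7 GB
```

Note that tests 2) and 3) hold the same 80GB of logical data; only the replica count changed.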