Hello,
I am doing some academic work with Cassandra 1.1.6 (I am on an older
version because of a set of custom modifications that have been in the
works for a while), and I am wondering if the list can help me resolve
some questions I have.
I am running a cluster of 27 nodes with the following configuration:
Intel Atom (2 core) @ 1.8 GHz
4 GB RAM
250 GB HDD
64 GB SSD
Gigabit Ethernet
With this cluster size, I currently have loaded 135 GB of data
(replicated * 3), giving ~15 GB of data per node. I am using Leveled
Compaction with a 5 MB SSTable size. The commitlog is on the HDD; data
is on the SSD.
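For reference, this is roughly how the column family is configured via
cassandra-cli (column family name is the standard YCSB "usertable";
adjust to taste):

```
update column family usertable
  with compaction_strategy = 'LeveledCompactionStrategy'
  and compaction_strategy_options = {sstable_size_in_mb: 5};
```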
My workload is YCSB/uniform distribution/75% read-25% write.
My questions are:
- Is this a reasonable data size for this hardware?
- What should the compaction throughput be set to? I am targeting a
99th-percentile latency SLA, and compaction throughput seems to greatly
affect the 99th-percentile latency. The guideline seems to be 16-32x the
insertion rate, but that inflates the 99th-percentile latency
dramatically. In addition, there seems to be a feedback loop: if you
insert faster, you need more compaction, but if you allow more
compaction, you can't insert as fast. What is best practice here?
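In case it helps frame the question, this is how I have been sweeping
the cap on a live node while watching latency (host name is a
placeholder; 0 means unthrottled):

```
# Cap compaction I/O at 16 MB/s on one node
nodetool -h node01 setcompactionthroughput 16
# Watch pending compactions while the workload runs
nodetool -h node01 compactionstats
```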
- What is a reasonable operation throughput to expect from this
configuration?
Sorry for the info dump, but I have been fighting with this for a while
now. I have tried to read everything I can find about tuning and
provisioning, but I keep running into the same issue: I can find a load
rate that hits my 99th-percentile SLA on average, but I still see large
latency spikes that don't seem to follow any pattern.
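For what it's worth, this is the kind of post-processing I do to hunt
for the spikes: a minimal sketch (names and log format are my own; input
is one latency sample per entry) that computes a nearest-rank p99 and
flags fixed-size windows whose p99 exceeds a threshold.

```python
def percentile(samples, p):
    """Nearest-rank percentile of a list of latency samples."""
    ordered = sorted(samples)
    # nearest-rank definition: k = ceil(p/100 * n), 1-indexed
    k = max(1, -(-len(ordered) * p // 100))  # integer ceiling
    return ordered[k - 1]

def spike_windows(latencies, window, threshold):
    """Return start indices of fixed-size windows whose p99 exceeds threshold."""
    spikes = []
    for i in range(0, len(latencies) - window + 1, window):
        if percentile(latencies[i:i + window], 99) > threshold:
            spikes.append(i)
    return spikes
```

Running this over the YCSB latency output is how I noticed that the
spikes don't line up with any obvious periodic event.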
Thanks in advance for any advice you can give, even if it is just "go
read this document".
Sincerely,
Bill Katsak
Ph.D. Student
Rutgers University