Hello,
I am doing some academic work with Cassandra 1.1.6 (I am on an older
version because of a set of custom modifications that have been in the
works for a while), and I am wondering if the list can help me resolve
some questions I have.
I am running a cluster of 27 nodes with the following configuration:
Intel Atom (2 core) @ 1.8 GHz
4 GB RAM
250 GB HDD
64 GB SSD
Gigabit Ethernet
With this cluster size, I currently have loaded 135 GB of data
(replicated * 3), giving ~15 GB of data per node. I am using Leveled
Compaction with a 5 MB SSTable size. The commitlog is on the HDD; data
is on the SSD.
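For reference, this is roughly how the column family is configured via
cassandra-cli (column family name is the standard YCSB "usertable";
adjust to taste):

```
update column family usertable
  with compaction_strategy = 'LeveledCompactionStrategy'
  and compaction_strategy_options = {sstable_size_in_mb: 5};
```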
My workload is YCSB/uniform distribution/75% read-25% write.
My questions are:
- Is this a reasonable data size for this hardware?
- What should the compaction throughput be set to? I am targeting a
99th-percentile latency SLA, and compaction throughput seems to greatly
affect the 99th-percentile latency. The guideline seems to be 16-32x the
insertion rate, but that inflates the 99th-percentile latency
dramatically. In addition, there seems to be a feedback loop: if you
insert faster, you need more compaction, but if you allow more
compaction, you can't insert as fast. What is best practice here?
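In case it helps frame the question, this is how I have been sweeping
the cap on a live node while watching latency (host name is a
placeholder; 0 means unthrottled):

```
# Cap compaction I/O at 16 MB/s on one node
nodetool -h node01 setcompactionthroughput 16
# Watch pending compactions while the workload runs
nodetool -h node01 compactionstats
```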
- What is a reasonable operation throughput to expect from this
configuration?
Sorry for the info dump, but I have been fighting with this for a while
now. I have tried to read everything I can find about tuning and
provisioning, but I keep running into the same issue: I can find a load
rate that hits my 99th-percentile SLA on average, but I still see large
latency spikes that don't seem to follow any pattern.
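For what it's worth, this is the kind of post-processing I do to hunt
for the spikes: a minimal sketch (names and log format are my own; input
is one latency sample per entry) that computes a nearest-rank p99 and
flags fixed-size windows whose p99 exceeds a threshold.

```python
def percentile(samples, p):
    """Nearest-rank percentile of a list of latency samples."""
    ordered = sorted(samples)
    # nearest-rank definition: k = ceil(p/100 * n), 1-indexed
    k = max(1, -(-len(ordered) * p // 100))  # integer ceiling
    return ordered[k - 1]

def spike_windows(latencies, window, threshold):
    """Return start indices of fixed-size windows whose p99 exceeds threshold."""
    spikes = []
    for i in range(0, len(latencies) - window + 1, window):
        if percentile(latencies[i:i + window], 99) > threshold:
            spikes.append(i)
    return spikes
```

Running this over the YCSB latency output is how I noticed that the
spikes don't line up with any obvious periodic event.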
Thanks in advance for any advice you can give, even if it is just "go
read this document".
Sincerely,
Bill Katsak
Ph.D. Student
Rutgers University