Hi all,

We have a 3-node cluster with a single keyspace and about 500 tables. The
hardware is 2 cores + 16 GB RAM per node (Cassandra auto-sized its heap to
4 GB). The Cassandra version is 2.0.3. Our replication factor is 3, and
read/write consistency is QUORUM. We've plugged it into our production
environment as a cache in front of Postgres. Everything worked fine; we
even stress-tested it by explicitly pushing about 30 GB of data (10 GB per
node) from Postgres to Cassandra.
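For completeness: we haven't overridden the heap ourselves. If we were to
pin it explicitly, the relevant knobs in conf/cassandra-env.sh would look
roughly like this (the values shown are just what the auto-calculation
picked on our 16 GB / 2-core boxes, not tuned settings):

    # conf/cassandra-env.sh -- pin heap instead of auto-sizing
    MAX_HEAP_SIZE="4G"     # what the auto-calculation chose on 16 GB RAM
    HEAP_NEWSIZE="200M"    # script default is ~100 MB per core; we have 2 cores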

Then the problems came. Our nodes began showing high CPU load (around 20).
The odd thing is that they did it one after another, and there was always
exactly one node with high CPU at a time. Using OpsCenter we saw that when
the CPU began to climb, the node in question was performing a compaction.
But even after the compaction finished, the CPU remained high, in some
cases for hours. Our JMX monitoring showed that the node was presumably in
constant garbage collection. During that time, cluster read latency rose
from 2 ms to 200 ms.
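We can gather more detail with the stock tooling if that would help
diagnose this; e.g. on the affected node (<node> is a placeholder for the
node's address):

    nodetool -h <node> compactionstats   # is a compaction actually still running?
    nodetool -h <node> tpstats           # pending/blocked stages, dropped messages
    nodetool -h <node> cfstats           # per-table memtable/SSTable stats (long output with 500 tables)

If it helps, we can also confirm the constant-GC theory by enabling GC
logging in cassandra-env.sh, along these lines:

    JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails"
    JVM_OPTS="$JVM_OPTS -XX:+PrintGCDateStamps"
    JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc.log"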

What could be the reason? Could it be the high number of tables? Do we
need to adjust some settings for this setup? Is it OK to have so many
tables? In principle, we could consolidate them all into 3-4 tables.

Thanks in advance,
Alexander
