Related to our overstreaming issue, we have a cluster of about 25 nodes,
most of which sit at around 1,000 sstable files (Data + others).

About four nodes, though, are at 20,000 - 30,000 sstable files (Data + Index + etc.).

We have vertically scaled the outlier machines and turned off compaction
throttling, thinking compaction simply couldn't keep up. That stabilized
the growth, but the sstable count is not going down.

The TWCS code seems heavily biased towards compacting the most "recent"
sstables. We figured boosting the compaction throughput and concurrent
compactors would take care of the recent windows and the older sstables
would fall off over time, but day after day the sstable count on the
couple of "bad" nodes has stayed high.
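For reference, here is a rough way we can see how the backlog spreads
across time windows, by bucketing Data.db files by file age. It's only a
sketch: the data directory path is a placeholder, and mtime is just a
crude proxy for the real TWCS window (sstablemetadata's max timestamp
would be the authoritative source).

#!/usr/bin/env python3
"""Rough per-window sstable census for a TWCS table (sketch)."""
import collections
import datetime
import pathlib

# Placeholder path; substitute the real keyspace/table data directory.
DATA_DIR = pathlib.Path("/var/lib/cassandra/data/my_keyspace/my_table-<uuid>")

counts = collections.Counter()
for sstable in DATA_DIR.glob("*-Data.db"):
    # Bucket each Data.db file by the calendar day of its mtime.
    mtime = datetime.datetime.fromtimestamp(sstable.stat().st_mtime)
    counts[mtime.strftime("%Y-%m-%d")] += 1

for day in sorted(counts):
    print(f"{day}  {counts[day]:6d} Data.db files")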

Is this simply a lack of sufficient compaction throughput? Or is there
something in TWCS that would force more frequent flushing than normal?
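For what it's worth, a rough way to check the flush rate on the bad
nodes would be to count flush events per hour in system.log. This is
only a sketch: the log path is a placeholder, and it assumes our
Cassandra version writes "Enqueuing flush of" lines when a memtable is
flushed; adjust the pattern if your logs differ.

#!/usr/bin/env python3
"""Count memtable flush events per hour from system.log (sketch)."""
import collections
import re
import sys

LOG_PATH = sys.argv[1] if len(sys.argv) > 1 else "/var/log/cassandra/system.log"

# Grab the date and hour from the timestamp that appears in each log line.
timestamp = re.compile(r"(\d{4}-\d{2}-\d{2} \d{2}):\d{2}:\d{2}")

flushes_per_hour = collections.Counter()
with open(LOG_PATH, errors="replace") as log:
    for line in log:
        if "Enqueuing flush of" in line:
            match = timestamp.search(line)
            if match:
                flushes_per_hour[match.group(1)] += 1

for hour in sorted(flushes_per_hour):
    print(f"{hour}:00  {flushes_per_hour[hour]:5d} flushes")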
