Related to our overstreaming issue: we have a cluster of about 25 nodes. Most have about 1,000 sstable files each (Data plus the other component files), but about four have 20,000 to 30,000 sstable files (Data, Index, etc.). We vertically scaled the outlier machines and turned off compaction throttling, thinking that compaction simply couldn't keep up. That stabilized the growth, but the sstable count is not going down.

The TWCS code seems to be heavily biased towards "recent" sstables when choosing what to compact. We figured that boosting the throughput and the number of compactors would take care of the more recent ones, and that the older ones would eventually fall off. But the number of sstables has stayed high day after day on the couple of "bad nodes".

Is this simply a lack of sufficient compaction throughput? Is there something in TWCS that would force more frequent flushing than normal?
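
In case it helps anyone compare numbers, here is a rough sketch of how the counts above could be gathered on a node. It is not our exact tooling. It assumes the usual on-disk layout where each sstable has one component file ending in `-Data.db`, it skips `snapshots` and `backups` directories, and it buckets sstables by file modification day, which is only a rough proxy for the TWCS time window (TWCS buckets on the timestamps of the data itself). The function name `summarize_sstables` and the data directory path are placeholders.

```python
from collections import Counter
from datetime import datetime, timezone
from pathlib import Path

# Directories that hold hard links to sstables and would inflate the counts.
SKIP_DIRS = {"snapshots", "backups"}


def summarize_sstables(data_dir):
    """Return (sstable_count, component_file_count, sstables_per_mtime_day).

    sstable_count counts only *-Data.db files (one per sstable);
    component_file_count counts every live file (Data, Index, etc.).
    """
    root = Path(data_dir)
    live_files = [
        p for p in root.rglob("*")
        if p.is_file() and not SKIP_DIRS & set(p.relative_to(root).parts)
    ]
    data_files = [p for p in live_files if p.name.endswith("-Data.db")]

    # Modification day (UTC) is an approximation of the time window only.
    per_day = Counter(
        datetime.fromtimestamp(p.stat().st_mtime, tz=timezone.utc).date().isoformat()
        for p in data_files
    )
    return len(data_files), len(live_files), dict(per_day)


if __name__ == "__main__":
    # Placeholder path; point this at the node's actual data directory.
    sstables, files, per_day = summarize_sstables("/var/lib/cassandra/data")
    print(f"sstables: {sstables}, component files: {files}")
    for day in sorted(per_day):
        print(day, per_day[day])
```

Running this on one "good" node and one "bad" node should show whether the extra sstables sit in the most recent day or are spread across older days, which is the distinction the questions above hinge on.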