And I forgot:

(6) It is fully expected that sstable counts spike during large
compactions that take a lot of time simply because smaller compactions
never get a chance to run. (There was just recently JIRA traffic that
added support for parallel compaction, but I'm not sure whether it
fully addresses this particular issue or not.) If you have a lot of rows
that are written incrementally and thus span multiple sstables, and
your data size is truly large and written to fairly quickly, that
means a lot of your data will be spread out over smaller sstables that
won't get compacted for extended periods while larger
multi-hundreds-of-gig sstables are being compacted. That said, if your
sstable count is continually increasing (rather than just spiking),
that indicates compaction is not keeping up with write traffic.
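
To make the spike-versus-growth distinction concrete, here is a rough
sketch of one way to tell them apart from a series of sstable count
samples. It is only an illustration: the function name, the window
size and the sample numbers are all made up, and how you collect the
counts is up to you. The idea is that spikes fall back to roughly the
same floor once the big compaction finishes, whereas a floor that
keeps rising suggests compaction is falling behind.

```python
def floor_is_rising(counts, window):
    """Return True if the lowest count in each successive window keeps increasing.

    counts: sstable counts sampled at regular intervals (hypothetical data).
    window: number of samples per window (an arbitrary, assumed choice).
    """
    # Lowest count in each consecutive, non-overlapping full window.
    floors = [
        min(counts[i:i + window])
        for i in range(0, len(counts) - window + 1, window)
    ]
    # At least two windows are needed to say anything about a trend.
    return len(floors) >= 2 and all(b > a for a, b in zip(floors, floors[1:]))


# Made-up samples: counts spike but return to about the same floor.
spiky = [10, 40, 80, 12, 11, 45, 90, 10, 9, 50, 85, 11]
# Made-up samples: counts only ever go up.
growing = [10, 14, 19, 25, 30, 36, 41, 47, 52, 58, 63, 70]

print(floor_is_rising(spiky, 4))
print(floor_is_rising(growing, 4))
```

The window should be long enough to cover at least one full large
compaction, otherwise a single long spike can look like growth.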

-- 
/ Peter Schuller
