Exactly, but why would reads become significantly slower over time when
each read includes just one more SSTable, even if it is sometimes a large one?
Ji Cheng wrote on 2012-04-26 11:11:
I'm also quite interested in this question. Here's my understanding of
the problem.
1. If your workload is append-only, doing a major compaction shouldn't
affect the read performance too much, because each row appears in one
sstable anyway.
2. If your workload is mostly updating existing rows, then more and
more columns in the big sstable created by the major compaction will
become obsolete. That very large sstable won't be compacted again until
you either have another 3 similar-sized sstables or start another major
compaction. But I am not sure whether this is a major problem, because
you only end up reading one more sstable. Using size-tiered compaction
with a mostly-update workload may by itself result in reading multiple
sstables for a single row key.
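To put rough numbers on point 2, here is a minimal sketch of the tiering arithmetic. It is a toy model, not Cassandra's actual compaction code: it assumes a minimum threshold of 4 sstables per merge, treats only exactly equal sizes as "similar-sized", and ignores that overwritten columns shrink the merged output. The function names and the unit flush size are made up for illustration.

```python
from collections import Counter

MIN_THRESHOLD = 4  # assumption: four similar-sized sstables trigger a merge


def flush_and_compact(sstables, flush_size=1):
    """Add one flushed sstable, then merge any group of
    MIN_THRESHOLD equal-sized sstables until no such group is left."""
    sstables = sstables + [flush_size]
    while True:
        counts = Counter(sstables)
        # Pick the smallest size that has enough sstables to merge.
        size = next(
            (s for s, n in sorted(counts.items()) if n >= MIN_THRESHOLD),
            None,
        )
        if size is None:
            return sstables
        for _ in range(MIN_THRESHOLD):
            sstables.remove(size)
        # Toy model: the merged sstable is the sum of its inputs.
        sstables.append(size * MIN_THRESHOLD)


def flushes_until_big_is_merged(big_size):
    """Count flushes until the sstable left behind by a major
    compaction takes part in a size-tiered merge again."""
    sstables = [big_size]
    flushes = 0
    while big_size in sstables:
        sstables = flush_and_compact(sstables)
        flushes += 1
    return flushes


if __name__ == "__main__":
    for big in (4, 64, 1024):
        print(big, flushes_until_big_is_merged(big))
```

In this model the big sstable is only merged again after roughly three times its own size has been flushed, so the obsolete columns in it stay on disk for that long, and the larger it is, the longer the wait.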
Please correct me if I am wrong.
Cheng
On Thu, Apr 26, 2012 at 3:50 PM, Fredrik
<fredrik.l.stigb...@sitevision.se> wrote:
In the Cassandra tuning documentation, it's recommended
not to run major compactions.
I understand what a major compaction is all about, but I'd like an
in-depth explanation as to why reads "will continually degrade
until the next major compaction is manually invoked".
From the doc:
"So while read performance will be good immediately following a
major compaction, it will continually degrade until the next major
compaction is manually invoked. For this reason, major compaction
is NOT recommended by DataStax."
Regards
/Fredrik