Exactly, but why would reads become significantly slower over time when they include just one more SSTable, even if it is sometimes a large one?

Ji Cheng wrote on 2012-04-26 11:11:
I'm also quite interested in this question. Here's my understanding of the problem.

1. If your workload is append-only, doing a major compaction shouldn't affect the read performance too much, because each row appears in one sstable anyway.

2. If your workload is mostly updating existing rows, then more and more columns in the big sstable created by the major compaction will become obsolete. That very large sstable won't be compacted again until you either have another 3 similar-sized sstables or start another major compaction. I am not sure this is a major problem, though, because you only end up reading one more sstable. Using size-tiered compaction with a mostly-update workload may by itself result in reading multiple sstables for a single row key.
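
To illustrate point 2, here is a rough sketch of size-tiered bucketing in Python. It is not Cassandra's actual code: the function names, the 0.5x to 1.5x similarity window and the threshold of 4 are assumptions chosen to model the "another 3 similar-sized sstables" rule, and sizes are just numbers in GB.

```python
def bucket_by_size(sizes, low=0.5, high=1.5):
    """Group SSTable sizes into buckets of roughly similar size.

    A size joins a bucket when it lies within [low, high] times the
    bucket's current average (assumed similarity window).
    """
    buckets = []  # each bucket is a list of sizes
    for size in sorted(sizes):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            if low * avg <= size <= high * avg:
                bucket.append(size)
                break
        else:
            # No existing bucket is similar enough, so start a new one.
            buckets.append([size])
    return buckets


def compactable(sizes, min_threshold=4):
    """Return the buckets holding enough SSTables to be compacted (assumed threshold)."""
    return [b for b in bucket_by_size(sizes) if len(b) >= min_threshold]


# One 100 GB sstable left by a major compaction, plus four fresh 1 GB flushes.
after_major = [100, 1, 1, 1, 1]
print(compactable(after_major))
```

In this model only the four small sstables qualify for compaction. The 100 GB sstable sits alone in its bucket, so any obsolete columns in it stay on disk until three more sstables of similar size appear or another major compaction is run.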

Please correct me if I am wrong.

Cheng


On Thu, Apr 26, 2012 at 3:50 PM, Fredrik <fredrik.l.stigb...@sitevision.se> wrote:

    In the tuning documentation regarding Cassandra, it's recommended
    not to run major compactions.
    I understand what a major compaction is all about, but I'd like an
    in-depth explanation as to why reads "will continually degrade
    until the next major compaction is manually invoked".

    From the doc:
    "So while read performance will be good immediately following a
    major compaction, it will continually degrade until the next major
    compaction is manually invoked. For this reason, major compaction
    is NOT recommended by DataStax."

    Regards
    /Fredrik


