Re: Question regarding major compaction.

Fredrik Thu, 26 Apr 2012 02:37:49 -0700

Exactly, but why would reads be significantly slower over time when including just one more, although sometimes large, SSTable in the read?


Ji Cheng wrote 2012-04-26 11:11:

I'm also quite interested in this question. Here's my understanding of this problem.
1. If your workload is append-only, doing a major compaction shouldn't affect the read performance too much, because each row appears in one sstable anyway.
2. If your workload is mostly updating existing rows, then more and more columns will be obsoleted in that big sstable created by major compaction. And that super big sstable won't be compacted until you either have another 3 similar-sized sstables or start another major compaction. But I am not very sure whether this will be a major problem, because you only end up reading one more sstable. Using size-tiered compaction against a mostly-update workload may itself result in reading multiple sstables for a single row key.
Please correct me if I am wrong.
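
To make point 2 concrete, here is a rough sketch of how size-tiered bucketing leaves the big sstable alone. It is a simplified model, not Cassandra's actual code: the function names are made up for illustration, and the 0.5/1.5 bucket bounds and the threshold of 4 are assumed to match the usual defaults.

```python
def bucket_by_size(sizes, low=0.5, high=1.5):
    """Group sstable sizes (in MB) into buckets of similar size.

    Simplified model: a size joins the first bucket whose current
    average it is within [low * avg, high * avg] of; otherwise it
    starts a new bucket. The bounds are assumed defaults.
    """
    buckets = []
    for size in sorted(sizes):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            if low * avg <= size <= high * avg:
                bucket.append(size)
                break
        else:
            buckets.append([size])
    return buckets


def compaction_candidates(sizes, min_threshold=4):
    """Return only the buckets that hold enough sstables to be compacted."""
    return [b for b in bucket_by_size(sizes) if len(b) >= min_threshold]


# One 1000 MB sstable left by a major compaction, plus four fresh ~10 MB flushes.
after_major = [1000, 10, 11, 9, 10]
candidates = compaction_candidates(after_major)

# Only once three more sstables of similar size exist does the big one qualify.
four_big = [1000, 900, 1100, 1000]
candidates_big = compaction_candidates(four_big)
```

In the first case only the small sstables form a bucket that reaches the threshold, so the 1000 MB sstable and the obsolete columns in it stay untouched. In the second case the four large sstables land in one bucket and become eligible together.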

Cheng
On Thu, Apr 26, 2012 at 3:50 PM, Fredrik <fredrik.l.stigb...@sitevision.se> wrote:
    In the tuning documentation regarding Cassandra, it's recommended
    not to run major compactions.
    I understand what a major compaction is all about but I'd like an
    in-depth explanation as to why reads "will continually degrade
    until the next major compaction is manually invoked".

    From the doc:
    "So while read performance will be good immediately following a
    major compaction, it will continually degrade until the next major
    compaction is manually invoked. For this reason, major compaction
    is NOT recommended by DataStax."

    Regards
    /Fredrik
