I think this JIRA answers your question: https://issues.apache.org/jira/browse/CASSANDRA-2610
which says that, in order not to duplicate work (creation of Merkle trees), repair is done on all replicas for a range.

Cheers,
Omid

On Tue, Sep 25, 2012 at 8:27 AM, Sergey Tryuber <stryu...@gmail.com> wrote:
> Hi Radim
>
> Unfortunately, the number of compaction tasks is not overestimated. The number is
> decremented one by one, and this process takes several hours for our 40GB
> node(( Also, when a lot of compaction tasks appear, we see that the total disk
> space used (via JMX) is doubled and Cassandra really tries to compact
> something. When compactions are done, "total disk space used" is back to
> normal.
>
>
> On 24 September 2012 19:04, Radim Kolar <h...@filez.com> wrote:
>>
>>
>>> The repair process by itself is going well in the background, but the issue I'm
>>> concerned about is a lot of unnecessary compaction tasks
>>
>> The number in the compaction tasks counter is overestimated. For example, I have
>> 1100 tasks left, and if I stop inserting data, all tasks will finish
>> within 30 minutes.
>>
>> I suppose that this counter is incremented for every sstable which needs
>> compaction, but it's not decremented properly because you can compact about
>> 20 sstables at once, and this reduces the counter only by 1.