On 23 Oct 2014, at 21:29, Robert Coli <rc...@eventbrite.com> wrote:

> On Thu, Oct 23, 2014 at 9:33 AM, Sean Bridges <sean.brid...@gmail.com> wrote:
> The change from parallel to sequential is very dramatic. For a small cluster
> with 3 nodes, using cassandra 2.0.10, a parallel repair takes 2 hours, and
> io throughput peaks at 6 mb/s. Sequential repair takes 40 hours, with
> average io around 27 mb/s. Should I file a jira?
>
> As you are an actual user actually encountering the problem I had only
> conjectured about, you are the person best suited to file such a ticket on
> the reasonableness of the -par default. :D

Hm?

I’ve been banging my head against the exact same problem (cluster size five nodes, RF=3, ~40GB/node) - parallel repair takes about 6 hrs whereas serial takes some 48 hours or so.

In addition, the compaction impact is roughly the same - that is, the same number of compactions is triggered per minute, but since serial runs eight times longer, it triggers about eight times as many in total. There does not seem to be a difference in node response latency between parallel and serial repair.

NB: We do increase our compaction throughput during calmer times, and lower it during busy times, and the serial repair takes enough time to hit the busy period - that might also have an impact on the overall performance.

If I had known that this had so far been a theoretical problem, I would’ve spoken up earlier. Perhaps serial repair is not the best default.

/Janne