Re: Question about node tool repair

Artur Kronenberg Wed, 22 Jan 2014 01:32:54 -0800

About repairs,

we encountered a similar problem with our setup, where repairs would take ages to complete. Based on your setup, you can try loading data into the page cache before running repairs. Depending on how much data you can hold in cache, this will speed up your repairs massively.
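One way to load data into the page cache is simply to read the data files once before starting the repair. The following is a minimal sketch, not necessarily how it was done in the setup described above: the function name warm_page_cache is made up for illustration, and the data directory you pass in (often /var/lib/cassandra/data, but that is an assumption about your install) has to match your own configuration.

```python
import os

CHUNK_SIZE = 1024 * 1024  # read 1 MiB at a time to keep memory use flat


def warm_page_cache(data_dir):
    """Read every file under data_dir once and return the bytes read.

    Reading a file lets the kernel keep its pages in the page cache,
    provided there is enough free memory to hold them.
    """
    total = 0
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                with open(path, "rb") as handle:
                    while True:
                        chunk = handle.read(CHUNK_SIZE)
                        if not chunk:
                            break
                        total += len(chunk)
            except OSError:
                # Files can vanish mid-walk (e.g. after compaction); skip them.
                continue
    return total


# Example (path is an assumption, adjust to your data_file_directories):
# warm_page_cache("/var/lib/cassandra/data")
```

If the data set is larger than free memory, later reads evict earlier ones, so the benefit depends on how much of the data actually fits in cache.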


-- artur

On 21/01/14 20:33, Logendran, Dharsan (Dharsan) wrote:

Thanks Rob,

Dharsan

*From:* Robert Coli [mailto:rc...@eventbrite.com]
*Sent:* January-21-14 2:26 PM
*To:* user@cassandra.apache.org
*Subject:* Re: Question about node tool repair
On Mon, Jan 20, 2014 at 2:47 PM, Logendran, Dharsan (Dharsan)<dharsan.logend...@alcatel-lucent.com<mailto:dharsan.logend...@alcatel-lucent.com>> wrote:
We have a two-node cluster with a replication factor of 2. The db has more than 2500 column families (tables). The nodetool -pr repair on an empty database (one or two tables have a little data) takes about 30 hours to complete. We are using Cassandra version 2.0.4. Is there any way for us to speed this up?
Cassandra 2.0.2 made aspects of repair serial and therefore logically much slower as a function of replication factor. Yours is not the first report I have heard of >= 2.0.2 era repair being unreasonably slow.
https://issues.apache.org/jira/browse/CASSANDRA-5950
You can use -par (not at all confusingly named with -pr!) to get the old parallel behavior.
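To make the difference between the two flags concrete, here is a small sketch that assembles a nodetool repair invocation with both -par and -pr. The helper names build_repair_command and run_repair and the optional keyspace argument are illustrative assumptions; only the two flags come from this thread, so check nodetool help on your version before relying on it.

```python
import subprocess


def build_repair_command(keyspace=None, parallel=True, primary_range=True):
    """Return the nodetool argument list for a repair run."""
    cmd = ["nodetool", "repair"]
    if parallel:
        cmd.append("-par")  # parallel repair, the pre-2.0.2 behavior
    if primary_range:
        cmd.append("-pr")   # repair only this node's primary range
    if keyspace:
        cmd.append(keyspace)  # hypothetical keyspace name supplied by caller
    return cmd


def run_repair(keyspace=None):
    """Run the repair on the local node and return nodetool's exit code."""
    return subprocess.call(build_repair_command(keyspace))
```

With -pr each node repairs only its own primary range, so the command has to be run on every node in turn to cover the whole ring.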
Cassandra 2.1 has this ticket to improve repair with vnodes.

https://issues.apache.org/jira/browse/CASSANDRA-5220
But really you should strongly consider how much you need to run repair, and at the very least probably increase gc_grace_seconds from the unreasonably low default of 10 days to 32 days, and then run your repair on the first of each month.
https://issues.apache.org/jira/browse/CASSANDRA-5850
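The arithmetic behind the 32-day suggestion: gc_grace_seconds is set per table in seconds, and 32 days is longer than any calendar month, so a repair on the first of each month always lands inside the window. The sketch below does the conversion and emits the CQL statement for one table; the helper names and the keyspace/table names in the usage line are hypothetical placeholders.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def gc_grace_seconds(days):
    """Convert a number of days into a gc_grace_seconds value."""
    return days * SECONDS_PER_DAY


def alter_gc_grace(keyspace, table, days):
    """Build the CQL statement that sets gc_grace_seconds on one table."""
    return "ALTER TABLE %s.%s WITH gc_grace_seconds = %d;" % (
        keyspace, table, gc_grace_seconds(days))


# The default of 10 days is 864000 seconds; 32 days is 2764800 seconds.
# "my_ks" and "my_table" are placeholder names.
print(alter_gc_grace("my_ks", "my_table", 32))
```

With more than 2500 tables, the same function can be looped over a list of table names to generate one statement per table.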
IMO it is just a complete and total error if repair of an actually empty database is anything but a NO-OP. I would file a JIRA ticket, were I you.
=Rob
