On Mon, Jan 20, 2014 at 2:47 PM, Logendran, Dharsan (Dharsan) < dharsan.logend...@alcatel-lucent.com> wrote:
> We have a two-node cluster with a replication factor of 2. The db
> has more than 2500 column families (tables). Running nodetool repair -pr on an
> empty database (one or two tables have a little data) takes about 30 hours to
> complete. We are using Cassandra version 2.0.4. Is there any way for us
> to speed this up?
>

Cassandra 2.0.2 made aspects of repair serial and therefore logically much slower as a function of replication factor. Yours is not the first report I have heard of >= 2.0.2 era repair being unreasonably slow.

https://issues.apache.org/jira/browse/CASSANDRA-5950

You can use -par (not at all confusingly named with -pr!) to get the old parallel behavior.

Cassandra 2.1 has this ticket to improve repair with vnodes.

https://issues.apache.org/jira/browse/CASSANDRA-5220

But really you should strongly consider how often you need to run repair, and at the very least probably increase gc_grace_seconds from the unreasonably low default of 10 days to 32 days, and then run your repair on the first of each month. Repair only has to complete once per gc_grace_seconds window, so 32 days leaves a monthly schedule a day of slack even after a 31-day month.

https://issues.apache.org/jira/browse/CASSANDRA-5850

IMO it is just a complete and total error if repair of an actually empty database is anything but a NO-OP. I would file a JIRA ticket, were I you.

=Rob
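For reference, a minimal sketch of the two suggestions above: converting 32 days into the seconds value that gc_grace_seconds expects, and assembling a parallel, primary-range repair command. The keyspace and table names are hypothetical placeholders, and the helper functions are illustrative only, not part of Cassandra or any driver. The script just prints the CQL and the command line; it does not execute anything against a cluster.

```python
from datetime import timedelta

# Cassandra's default gc_grace_seconds is 10 days, expressed in seconds.
DEFAULT_GC_GRACE = int(timedelta(days=10).total_seconds())


def gc_grace_seconds(days):
    """Convert a grace period in days to the seconds value CQL expects."""
    return int(timedelta(days=days).total_seconds())


def alter_statements(tables, days=32):
    """Build one ALTER TABLE statement per (keyspace, table) pair."""
    seconds = gc_grace_seconds(days)
    return [
        f"ALTER TABLE {ks}.{tbl} WITH gc_grace_seconds = {seconds};"
        for ks, tbl in tables
    ]


def repair_command(keyspace):
    """Argument list for a parallel (-par), primary-range (-pr) repair."""
    return ["nodetool", "repair", "-par", "-pr", keyspace]


if __name__ == "__main__":
    # Hypothetical keyspace and table names, for illustration only.
    tables = [("my_keyspace", "users"), ("my_keyspace", "events")]
    for stmt in alter_statements(tables):
        print(stmt)
    print(" ".join(repair_command("my_keyspace")))
```

With 2500+ tables you would presumably generate the table list from the schema rather than typing it by hand; the statements can then be fed to cqlsh in one batch.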