The more often you repair, the quicker each repair will be. The more often your nodes go down, the longer it will take.
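As a rough illustration of why repair time tracks how far the replicas have drifted apart, here is a toy sketch in Python. It is not Cassandra's implementation (Cassandra compares Merkle trees built over token ranges). The function names, the range ids and the sample data are all hypothetical. The point is only that every range gets scanned and hashed on each repair, while only the ranges that differ need to be streamed.

```python
import hashlib


def range_digests(replica):
    """Hash every range held by a replica.

    This full scan is the part of the cost that is paid on every repair,
    whether or not anything turns out to be different.
    """
    return {
        rng: hashlib.sha256(repr(sorted(rows.items())).encode()).hexdigest()
        for rng, rows in replica.items()
    }


def ranges_to_stream(local, remote):
    """Return the ranges whose digests differ and so would need streaming."""
    local_d = range_digests(local)
    remote_d = range_digests(remote)
    return sorted(
        rng
        for rng in local_d.keys() | remote_d.keys()
        if local_d.get(rng) != remote_d.get(rng)
    )


# Hypothetical replicas: range id -> {row key: value}
node_a = {0: {"k1": "v1"}, 1: {"k2": "v2"}, 2: {"k3": "v3"}}
node_b = {0: {"k1": "v1"}, 1: {"k2": "stale"}, 2: {}}

print(ranges_to_stream(node_a, node_b))  # -> [1, 2]
```

In this model, a node that was down for longer would simply have more ranges in the returned list, and so more data to send and receive.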
Repair streams the data that is missing between nodes, so the more data that differs, the longer it will take. Your workload is impacted because the node has to scan its own data in order to compare it with other nodes and, if there are differences, it has to send and receive data from other nodes.

-----Original Message-----
From: A J [mailto:s5a...@gmail.com]
Sent: Monday, July 11, 2011 2:43 PM
To: user@cassandra.apache.org
Subject: Node repair questions

Hello,
Have the following questions related to nodetool repair:

1. I know that Nodetool Repair Interval has to be less than GCGraceSeconds. How do I come up with an exact value of GCGraceSeconds and 'Nodetool Repair Interval'? What factors would make me want to change the default of 10 days for GCGraceSeconds? Similarly, what factors would make me want to keep Nodetool Repair Interval just slightly less than GCGraceSeconds (say, a day less)?

2. Does a Nodetool Repair block any reads and writes on the node while the repair is going on? During repair, if I try to do an insert, will the insert wait for repair to complete first?

3. I read that repair can impact your workload as it causes additional disk and CPU activity. But are there any details of the impact mechanism, and any ballpark on how much the read/write performance deteriorates?

Thanks.