My cluster is very small (300 MB), and compaction was taking more than 2 hours.
I ended up bouncing all the nodes. After that, I was able to run repair on all nodes, and each one takes less than a minute. If this happens again I will be sure to run compactionstats and netstats. Thanks for that tip.

Bill

On Wed, Apr 25, 2012 at 11:49 AM, Gregg Ulrich <gulr...@netflix.com> wrote:

> How much data do you have and how long is "a while"? In my experience
> repairs can take a very long time. Check to see if validation compactions
> are running (nodetool compactionstats) or if files are streaming (nodetool
> netstats). If either of those is in progress then your repair should be
> running. I've seen 12 node, 50G clusters take days to repair to a new data
> center.
>
> Not sure if 1.0 is different but in 0.X I don't believe killing the
> nodetool process stops the repair. When we need to stop a repair we have
> bounced all of the participating nodes. I've been told that there is no
> harm in stopping repairs.
>
> On Apr 24, 2012, at 2:55 PM, Bill Au wrote:
>
> > I am running 1.0.8. I am adding a new data center to an existing
> > cluster. Following steps outlined in another thread on the mailing list,
> > things went fine except for the last step, which is to run repair on all
> > the nodes in the new data center. Repair seems to be hanging indefinitely.
> > There is no activity in system.log. I did notice that the node being
> > repaired is requesting ranges from nodes in both the existing and new data
> > centers. Since there is no data in the new data center initially, I thought
> > that may be why repair is hanging. So I broke out of the repair with a
> > control-C after waiting for a while. I do see data being added to the new
> > nodes. When I ran repair for the second time it was still hanging.
> >
> > Why is repair hanging? Is it safe to use control-C to break out of it?
> > How do I recover from this?
> >
> > Bill
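
For reference, the two checks mentioned in this thread (validation compactions via nodetool compactionstats, streaming via nodetool netstats) could be wrapped in a small shell helper so they are easy to run against every node taking part in a repair. This is only a sketch: the function name check_repair_activity, the NODETOOL variable, and the host names are hypothetical, and it assumes nodetool is on the PATH and accepts -h to select the host.

```shell
#!/bin/sh
# Sketch only: NODETOOL and check_repair_activity are illustrative names,
# not part of Cassandra. Override NODETOOL if nodetool is not on the PATH.
NODETOOL="${NODETOOL:-nodetool}"

# Print compaction and streaming status for each host given as an argument.
check_repair_activity() {
    for host in "$@"; do
        echo "== $host =="
        # Validation compactions, if any, are listed here.
        "$NODETOOL" -h "$host" compactionstats
        # Active streaming sessions, if any, are listed here.
        "$NODETOOL" -h "$host" netstats
    done
}
```

Calling it as "check_repair_activity node1 node2" (placeholder host names) prints both views per node. If neither shows any activity on any participating node, the repair is probably not making progress.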