Wouldn't it be better to run nodetool removenode for the crashed node, wipe its whole data directory (including the system keyspace), and let it bootstrap cleanly, so that it's not part of the cluster while it gets back up to speed?
On Tue, Nov 11, 2014, 12:32 PM Robert Coli <rc...@eventbrite.com> wrote:
> On Tue, Nov 11, 2014 at 10:48 AM, venkat sam <samvenkat...@outlook.com>
> wrote:
>
>> I have a 5 node cluster. In one node one of the data directory partition
>> got crashed. After disk replacement I restarted the Cassandra daemon and
>> gave nodetool repair to restore the missing replica’s. But nodetool repair
>> is getting stuck after syncing one of the columnfamily
>
> Yes, nodetool repair often hangs. Search through the archives, but the
> summary is.
>
> 1) try to repair CFs one at a time
> 2) it's worse with vnodes
> 3) try tuning the phi detector or network stream timeouts
>
> =Rob
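
For what it's worth, Rob's point 1 (repairing CFs one at a time) could be scripted roughly as follows. This is only a minimal sketch: it assumes nodetool is on the PATH of the node being repaired, and the keyspace and table names are hypothetical placeholders, not anything from this cluster. The helper names repair_commands and run_repairs are made up for illustration.

```python
import subprocess


def repair_commands(keyspace, tables):
    """Build one `nodetool repair <keyspace> <table>` command per table."""
    return [["nodetool", "repair", keyspace, table] for table in tables]


def run_repairs(keyspace, tables, dry_run=True):
    """Print each repair command and, unless dry_run is set, run them in order.

    Running sequentially means a hang or failure points at a single
    column family instead of the whole keyspace.
    """
    for cmd in repair_commands(keyspace, tables):
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError if nodetool exits non-zero.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical keyspace and table names; substitute your own.
    run_repairs("my_keyspace", ["users", "events"], dry_run=True)
```

With dry_run=True the script only prints the commands, so the list can be checked before anything touches the cluster.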