> Is it possible to use nodetool repair to fix this with the current data set?
>
> I issued a repair command and the other nodes seem to be doing the
> correct things, but I am concerned by this: "Uncaught exception in thread
> Thread[ROW-READ-STAGE:4327,5,main]"
>
> Will the affected node ever be able to do anything?
Since it seems you're willing to keep the node up with the missing data, I would remove (MOVE, just to be safe) the bf+index files corresponding to the overwritten data. You definitely don't want bf/index files that do not match the data. After that, a repair will propagate the missing data from other nodes. (Implicit is that you do this with the node turned off, not "live" while the node is running.)

As to whether the exception you're seeing is expected when a bf/index is out of sync with the data file: I don't know. One would have to either know or look at the 0.6.6 codebase. It seems like a plausible error to trigger under such conditions, but that is based solely on the context and the stack trace, not on looking at the code.

But note: removing data from a node "under its feet" *will* violate consistency, since the node will be missing data without "knowing" it's missing data. So for example (but not limited to), a read at CL.ONE that goes to that node will fail to return data, or may return old data if the missing data files contained newer versions of data that exists elsewhere in sstables on the node.

--
/ Peter Schuller (@scode on twitter)
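
For reference, the "move rather than delete" step for the bf+index files could be sketched roughly as below. This is only an illustration, not a tested procedure: the `-Filter.db` / `-Index.db` suffixes are an assumption about how the sstable components are named on disk, and `quarantine_companions`, the sstable prefix and the directories are hypothetical names. Check the actual file names in your data directory first, and run something like this only while the node is stopped.

```python
import shutil
from pathlib import Path

# ASSUMPTION: bloom filter and index components share the data file's prefix
# and end in these suffixes. Verify against the files actually on disk.
COMPANION_SUFFIXES = ("-Filter.db", "-Index.db")


def quarantine_companions(data_dir, sstable_prefix, quarantine_dir):
    """Move the bf/index files for one sstable out of data_dir.

    Files are moved, not deleted, so they can be restored if needed.
    Only run this while the Cassandra process on the node is stopped.
    Returns the destination paths of the files that were moved.
    """
    data_dir = Path(data_dir)
    quarantine_dir = Path(quarantine_dir)
    quarantine_dir.mkdir(parents=True, exist_ok=True)

    moved = []
    for suffix in COMPANION_SUFFIXES:
        src = data_dir / (sstable_prefix + suffix)
        if src.exists():
            dst = quarantine_dir / src.name
            shutil.move(str(src), str(dst))
            moved.append(dst)
    return moved
```

After moving the files and restarting the node, the repair itself would still be issued through nodetool as in the original question; the sketch covers only the file move.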