I'm having trouble completing a repair on several of my nodes due to errors during compaction. This is a 6 node cluster using the simple replication strategy, rf=3, with each node assigned a single token. I'm running "nodetool repair -pr" on node1, which progresses until it reaches a specific keyspace and then appears to hang. On the replicas, nodes 2 and 3, I find errors that seem related to compaction. I'm considering running a scrub if I can determine the column family where the errors occur, but I'm not confident I understand the problem, and I'm wary of making it worse. What's the best way to recover from these errors? Note that this cluster was recently upgraded from 1.1.6 to 1.2.1, then to 1.2.2.
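One possible way to narrow down the column family, offered as a rough sketch rather than a confirmed diagnosis: the second field of each complete DecoratedKey in the errors below looks like the hex-encoded row key, so decoding it may show which kind of row is involved. The function name `decode_row_keys` and the regular expression are hypothetical helpers written for this post, and the script assumes the keys are plain ASCII/UTF-8 text.

```python
import re

# Matches a complete DecoratedKey(<token>, <hex row key>) as printed in the logs.
# Truncated entries such as a trailing "DecoratedKey(" are skipped on purpose.
DECORATED_KEY = re.compile(r"DecoratedKey\((\d+),\s*((?:[0-9a-fA-F]{2})+)\)")


def decode_row_keys(log_text):
    """Return (token, decoded_key) pairs for each complete DecoratedKey found."""
    results = []
    for token, hex_key in DECORATED_KEY.findall(log_text):
        raw = bytes.fromhex(hex_key)
        # Assumption: row keys are text. Binary keys fall back to their repr.
        try:
            decoded = raw.decode("utf-8")
        except UnicodeDecodeError:
            decoded = repr(raw)
        results.append((int(token), decoded))
    return results


# One of the node2 error lines, copied from the logs in this post.
sample = (
    "java.lang.RuntimeException: Last written key "
    "DecoratedKey(161894077670705622023702574770140080251, "
    "757365723a3a313a3a373537363636393130) >= current key DecoratedKey("
)

for token, key in decode_row_keys(sample):
    print(token, key)
```

Run against the log lines in this post, the two distinct keys decode to user::1::757666910 and user::1::344108229. Whether that naming pattern maps to a single column family is something only the schema can confirm.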
node2

ERROR [Thread-97275] 2013-03-13 23:51:30,359 CassandraDaemon.java (line 133) Exception in thread Thread[Thread-97275,5,main]
java.lang.RuntimeException: Last written key DecoratedKey(161894077670705622023702574770140080251, 757365723a3a313a3a373537363636393130) >= current key DecoratedKey(

ERROR [CompactionExecutor:7697] 2013-03-15 21:45:59,584 CassandraDaemon.java (line 133) Exception in thread Thread[CompactionExecutor:7697,1,main]
java.lang.AssertionError: originally calculated column size of 321455446 but now it is 321455483

node3

ERROR [Thread-97525] 2013-03-13 23:51:44,788 CassandraDaemon.java (line 133) Exception in thread Thread[Thread-97525,5,main]
java.lang.RuntimeException: Last written key DecoratedKey(161894077670705622023702574770140080251, 757365723a3a313a3a373537363636393130) >= current key DecoratedKey

ERROR [Thread-97564] 2013-03-13 23:54:03,403 CassandraDaemon.java (line 133) Exception in thread Thread[Thread-97564,5,main]
java.lang.RuntimeException: Last written key DecoratedKey(152706250731373455824787766459206671594, 757365723a3a313a3a333434313038323239) >= current key DecoratedKey(

ERROR [Thread-661] 2013-03-15 21:02:05,981 CassandraDaemon.java (line 133) Exception in thread Thread[Thread-661,5,main]
java.lang.NegativeArraySizeException

Details:
6 node cluster
cassandra 1.2.2 - single token per node
RandomPartitioner, EC2Snitch
Replication: SimpleStrategy, rf=3
Ubuntu 10.10 x86_64
EC2 m1.large

Dane