Hi list,

We added a new node to an existing 8-node cluster running C* 1.2.9 without vnodes. Because we are almost completely out of space, we are moving the tokens of one node after another (not in parallel). During one of these move operations, the receiving node died and the streaming failed:
WARN [Streaming to /X.Y.Z.18:2] 2014-12-19 19:25:56,227 StorageService.java (line 3703) Streaming to /X.Y.Z.18 failed
INFO [RMI TCP Connection(12940)-X.Y.Z.17] 2014-12-19 19:25:56,233 ColumnFamilyStore.java (line 629) Enqueuing flush of Memtable-local@433096244(70/70 serialized/live bytes, 2 ops)
INFO [FlushWriter:3772] 2014-12-19 19:25:56,238 Memtable.java (line 461) Writing Memtable-local@433096244(70/70 serialized/live bytes, 2 ops)
ERROR [Streaming to /X.Y.Z.18:2] 2014-12-19 19:25:56,246 CassandraDaemon.java (line 192) Exception in thread Thread[Streaming to /X.Y.Z.18:2,5,RMI Runtime]
java.lang.RuntimeException: java.io.IOException: Broken pipe
    at com.google.common.base.Throwables.propagate(Throwables.java:160)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Broken pipe
    at sun.nio.ch.FileDispatcherImpl.write0(Native Method)

After restarting the receiving node, we tried to perform the move again, but it failed with:

Exception in thread "main" java.io.IOException: target token 113427455640312821154458202477256070486 is already owned by another node.
    at org.apache.cassandra.service.StorageService.move(StorageService.java:2930)

So we tried to move it to a token just 1 higher, to trigger the movement again. This didn't move anything, but it finished successfully:

INFO [Thread-5520] 2014-12-19 20:00:24,689 StreamInSession.java (line 199) Finished streaming session 4974f3c0-87b1-11e4-bf1b-97d9ac6bd256 from /X.Y.Z.18

Now, it seems quite improbable that the first streaming actually completed and the node died just after copying everything, since the ERROR above was the last streaming-related message in the logs. Is there any way to make sure the data have really been moved, so that running nodetool cleanup is safe?

Thank you.

Jiri Hoky
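P.S. For completeness, this is how we computed the "token just 1 higher" that we passed to nodetool move. It is only a sketch and assumes the RandomPartitioner token space (integers roughly in the range 0 .. 2**127, so we wrap modulo 2**127 just in case the token is at the top of the ring):

```python
# Sketch: compute the next token on a RandomPartitioner ring.
# Assumption: the token space wraps at 2**127 (RandomPartitioner).
RING_SIZE = 2 ** 127

def next_token(token: int) -> int:
    """Return the token immediately after `token` on the ring."""
    return (token + 1) % RING_SIZE

# The token that the failed move targeted:
old = 113427455640312821154458202477256070486
print(next_token(old))  # value passed to: nodetool move <token>
```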