I am trying to repair a single CF using nodetool, but the request to limit the repair to one CF does not seem to be respected. Here is my current situation:
- Run "nodetool repair KEYSPACE CF_A" on node 3.
- Validation compaction runs on nodes 2, 3, 4 for CF_A only (expected).
- Node 3 streams SSTables from CF_A only to nodes 2 and 4 (expected).
- Nodes 2 and 4 stream SSTables from ALL column families in the keyspace to node 3 (VERY unexpected).
- Node 3 runs out of disk space before the SSTable rebuild for all CFs can complete.

Presumably this is a bug (?). This is the first time in quite a while that I have run a repair (I don't perform deletes, just use expiring columns). I have included pertinent log entries below.

Log entry from node 3 after the validation compaction (I would say expected):

INFO [AntiEntropyStage:1] 2011-01-24 13:36:06,235 AntiEntropyService.java (line 220) Queueing comparison #<Differencer #<TreeRequest manual-repair-ef684731-cb9b-408c-acaa-094234e58979, /192.168.4.16, (kikmetrics,UserEventsByUser)>>
INFO [AntiEntropyStage:1] 2011-01-24 13:36:06,624 AntiEntropyService.java (line 481) Endpoints /192.168.4.17 and /192.168.4.16 have 1 range(s) out of sync for (kikmetrics,UserEventsByUser)
INFO [AntiEntropyStage:1] 2011-01-24 13:36:06,625 AntiEntropyService.java (line 498) Performing streaming repair of 1 ranges for #<TreeRequest manual-repair-ef684731-cb9b-408c-acaa-094234e58979, /192.168.4.16, (kikmetrics,UserEventsByUser)>
INFO [AntiEntropyStage:1] 2011-01-24 13:36:06,655 StreamOut.java (line 173) Stream context metadata [/var/lib/cassandra/data/kikmetrics/UserEventsByUser-e-3-Data.db/(0,128585235) progress=0/128585235 - 0%, /var/lib/cassandra/data/kikmetrics/UserEventsByUser-e-2-Data.db/(0,34069035) progress=0/34069035 - 0%, /var/lib/cassandra/data/kikmetrics/UserEventsByUser-e-1-Data.db/(0,31673166) progress=0/31673166 - 0%, /var/lib/cassandra/data/kikmetrics/UserEventsByUser-e-4-Data.db/(0,25515348) progress=0/25515348 - 0%], 4 sstables.
Log entry from nodes 2/4 after validation (unexpected: they are sending SSTables for all CFs, not just the one being repaired):

INFO [StreamStage:1] 2011-01-24 13:36:42,815 StreamOut.java (line 173) Stream context metadata [/var/lib/cassandra/data/kikmetrics/PacketEventsByEvent-e-1573-Data.db/(85013299,170188890) progress=0/85175591 - 0%, /var/lib/cassandra/data/kikmetrics/PacketEventsByEvent-e-1559-Data.db/(866088590,1569557337) progress=0/703468747 - 0%, /var/lib/cassandra/data/kikmetrics/PacketEventsByEvent-e-1519-Data.db/(574332782,1027659278) .... progress=0/5158259489 - 0%], 77 sstables.

Log entry from node 3 after running out of disk space:

INFO [Thread-3462] 2011-01-24 16:00:40,658 StreamInSession.java (line 124) Streaming of file /var/lib/cassandra/data/kikmetrics/UserEventsByEvent-e-1313-Data.db/(0,28295156514) progress=15267921920/28295156514 - 53% from org.apache.cassandra.streaming.StreamInSession@8df7e0c failed: requesting a retry.
INFO [Thread-3423] 2011-01-24 16:00:40,658 StreamInSession.java (line 124) Streaming of file /var/lib/cassandra/data/kikmetrics/UserEventsByEvent-e-1348-Data.db/(29530620905,58066315307) progress=18898300928/28535694402 - 66% from org.apache.cassandra.streaming.StreamInSession@8df7e0c failed: requesting a retry.

Dan Hendry
(403) 660-2297
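P.S. For anyone trying to reproduce this, here is a sketch of the invocation plus the usual way to watch stream progress and disk headroom while the repair runs. The host and keyspace/CF names are taken from the logs above; "streams" vs "netstats" depends on your nodetool version, so adjust for whatever your build ships.

```shell
# Repair a single column family only (names per the logs: kikmetrics / UserEventsByUser).
# This is the invocation that triggered the behaviour described above.
nodetool -h 192.168.4.16 repair kikmetrics UserEventsByUser

# Watch which files are actually streaming to/from the coordinator while it runs;
# older nodetool exposes this as "streams", newer versions as "netstats".
nodetool -h 192.168.4.16 streams

# Keep an eye on disk headroom on the receiving node during the SSTable rebuild.
df -h /var/lib/cassandra/data
```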