P.S. In cassandra.yaml, I did set: stream_throughput_outbound_megabits_per_sec: 700
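For scale, here is a rough back-of-the-envelope sketch of what a 700 megabit/s outbound throttle allows. The function names are made up for illustration. It assumes decimal units (1 megabit = 1,000,000 bits), which may differ slightly from how Cassandra interprets the setting, and it assumes the stream actually runs at the cap, which the netstats output in the original message suggests is not happening.

```python
def megabits_to_megabytes_per_sec(megabits_per_sec):
    """Convert a throttle given in megabits/s to megabytes/s (8 bits per byte)."""
    return megabits_per_sec / 8.0


def hours_to_stream(data_bytes, megabits_per_sec):
    """Best-case hours to stream data_bytes if the throttle is the only limit.

    Assumption: decimal units (1 megabit = 1,000,000 bits).
    """
    bytes_per_sec = megabits_per_sec * 1000000 / 8.0
    return data_bytes / bytes_per_sec / 3600.0


if __name__ == "__main__":
    throttle = 700  # value of stream_throughput_outbound_megabits_per_sec
    one_tb = 10 ** 12  # hypothetical amount of data to stream, in bytes
    print("cap: %.1f MB/s" % megabits_to_megabytes_per_sec(throttle))
    print("1 TB at cap: %.2f hours" % hours_to_stream(one_tb, throttle))
```

Under these assumptions the cap works out to 87.5 MB/s, so even a full terabyte would take only a few hours if the throttle were the bottleneck.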
From: Jason Tyler <jaty...@yahoo-inc.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Wednesday, June 4, 2014 at 2:34 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Cc: Francois Richard <frich...@yahoo-inc.com>
Subject: nodetool move seems slow

Hello,

We have a 5-node cluster running Cassandra 1.2.16, with a significant amount of data:

Address         Rack    Status  State   Load      Owns     Token
                                                           6783174585269344219
10.198.xx.xx1   rack1   Up      Normal  2.59 TB   60.00%   -9223372036854775808
10.198.xx.xx2   rack1   Up      Normal  1.49 TB   40.00%   -5534023222112865485
10.198.xx.xx3   rack1   Up      Normal  2.18 TB   53.23%   -1844674407370955162
10.198.xx.xx4   rack1   Up      Normal  2.86 TB   80.00%   5534023222112865484
10.198.xx.xx5   rack1   Up      Moving  2.32 TB   66.77%   6783174585269344219

The first three nodes (.xx1 - .xx3 above) were at the desired tokens, so I issued a move on .xx4:

nodetool move 1844674407370955161

That was about 40 hours ago!

When I do nodetool netstats, I do see apparent progress:

jatyler@xx4:~$ nodetool netstats
Mode: MOVING
Not sending any streams.
Streaming from: /10.198.xx.xx2
   SyncCore: /var/cassandra/data/SyncCore/file-ic-31475-Data.db sections=1 progress=0/77699597 - 0%
   …
   SyncCore: /var/cassandra/data/SyncCore/anotherFile-ic-32252-Data.db sections=1 progress=0/1254063427 - 0%
Read Repair Statistics:
Attempted: 8047367
Mismatch (Blocking): 97327
Mismatch (Background): 74369
Pool Name     Active   Pending   Completed
Commands      n/a      0         472255111
Responses     n/a      1         749751322

I wrote 'apparent progress' because it reports “MOVING” and the Pending Commands/Responses are changing over time. However, I haven’t seen any individual .db file's progress go above 0%.
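To check whether any file is advancing between runs, the per-file progress lines can be totalled. The sketch below is only that, a sketch: it assumes the `progress=<received>/<total>` format shown in the netstats output above, and the function name and sample text are illustrative rather than part of any Cassandra tooling.

```python
import re

# Matches the "progress=<received>/<total>" field seen in the netstats lines.
PROGRESS_RE = re.compile(r"progress=(\d+)/(\d+)")


def summarize_progress(netstats_text):
    """Return (files, bytes_received, bytes_total, percent) across all streams."""
    files = received = total = 0
    for match in PROGRESS_RE.finditer(netstats_text):
        received += int(match.group(1))
        total += int(match.group(2))
        files += 1
    percent = 100.0 * received / total if total else 0.0
    return files, received, total, percent


# Sample lines copied from the netstats output quoted in this message.
SAMPLE = """\
SyncCore: /var/cassandra/data/SyncCore/file-ic-31475-Data.db sections=1 progress=0/77699597 - 0%
SyncCore: /var/cassandra/data/SyncCore/anotherFile-ic-32252-Data.db sections=1 progress=0/1254063427 - 0%
"""

if __name__ == "__main__":
    print(summarize_progress(SAMPLE))
```

Running this against saved netstats output every few minutes and comparing the received byte count would show whether the streams are truly stalled or just slow.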
Meanwhile, the system appears to have plenty of unused bandwidth, from 'iostat -x -m 1':

Device:  rrqm/s  wrqm/s      r/s     w/s  rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sda        0.00   56.00  1338.00  171.00  57.59   0.89     79.36      0.57   0.38   0.17  25.30

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
          22.77   1.82     2.35     0.20    0.00  72.86

Device:  rrqm/s  wrqm/s      r/s     w/s  rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sda        0.00    0.00   785.00    0.00  33.80   0.00     88.17      0.27   0.35   0.18  14.10

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
          20.16   2.05     2.22     0.20    0.00  75.37

Is 40 hours too long for this move?
Should I be seeing individual .db files report more progress?
Should I start with the first box (even though the token appears correct)?

Any thoughts would be greatly appreciated.

THX

Cheers,
~Jason