> You would have to iterate through all sstables on the system to repair one
> vnode, yes: but building the tree for just one range of the data means that
> huge portions of the sstables files can be skipped. It should scale down
> linearly as the number of vnodes increases (ie, with 100 vnodes, it will
> take 1/100th the time to repair one vnode).
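
To make the scaling claim in the quote concrete, here is a rough toy model in Python. It is only a sketch, not Cassandra code: it assumes each sstable can be modelled as a sorted list of tokens, and that a range lookup can seek straight to the first token in the range. The names vnode_ranges, rows_in_range and repair_fraction, and the ring size, are made up for illustration.

```python
import bisect

RING_SIZE = 1_000_000  # hypothetical token space for the toy model


def vnode_ranges(num_vnodes, ring_size=RING_SIZE):
    """Split the ring into num_vnodes contiguous [start, end) ranges."""
    step = ring_size // num_vnodes
    return [
        (i * step, ring_size if i == num_vnodes - 1 else (i + 1) * step)
        for i in range(num_vnodes)
    ]


def rows_in_range(sorted_tokens, start, end):
    """Count rows with start <= token < end.

    Binary search finds both boundaries, so rows outside the range
    are skipped rather than read.
    """
    lo = bisect.bisect_left(sorted_tokens, start)
    hi = bisect.bisect_left(sorted_tokens, end)
    return hi - lo


def repair_fraction(sstables, start, end):
    """Fraction of all rows that must be read to repair one range.

    Every sstable is still visited, but only the slice that falls
    inside [start, end) is read from each one.
    """
    total = sum(len(tokens) for tokens in sstables)
    read = sum(rows_in_range(tokens, start, end) for tokens in sstables)
    return read / total if total else 0.0


if __name__ == "__main__":
    # Two toy sstables with evenly spread tokens.
    sstables = [
        list(range(0, RING_SIZE, 100)),
        list(range(50, RING_SIZE, 100)),
    ]
    for n in (1, 10, 100):
        start, end = vnode_ranges(n)[0]
        print(n, repair_fraction(sstables, start, end))
```

With evenly spread tokens, the fraction read for a single vnode is about 1/N for N vnodes, which is the linear scale-down described above. A cleanup pass gets no such shortcut in this model, since it has to look at every row to decide whether it still belongs on the node.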

The story is less good for "nodetool cleanup", however, which still has to truck over the entire dataset. (The partitions/buckets in my crush-inspired scheme address this by allowing each ring segment, in vnode terminology, to be stored separately in the file system.)

--
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)