On Fri, Nov 2, 2012 at 12:38 AM, Manu Zhang <owenzhang1...@gmail.com> wrote:
>> It splits into a contiguous range, because truly upgrading to vnode
>> functionality is another step.
>
> That confuses me. As I understand it, there is no point in having 256 tokens
> on same node if I don't commit the shuffle

This isn't exactly true.  By-partition operations (think repair,
streaming, etc) will be more reliable in the sense that if they fail
and need to be restarted, there is less that is lost/needs redoing.
Also, if all you did was migrate from 1-token-per-node to 256
contiguous tokens per node, normal topology changes (bootstrapping new
nodes, decommissioning old ones), would gradually work to redistribute
the partitions.  And, from a topology perspective, splitting the one
partition into many contiguous partition is a no-op; it's safe to do
and there is no cost to speak of from a computational or IO
perspective.

On the other hand, shuffling requires moving tokens around the
cluster.  If you completely randomize placement, it follows that you
will need to relocate all of the clusters data, so it's quite costly.
It's also precedent setting, and not thoroughly tested yet.

--
Eric Evans
Acunu | http://www.acunu.com | @acunu

Reply via email to