sorry, I missed it since it's not executable by default.

On Thu, Jan 10, 2013 at 10:05 AM, Jason Wee <peich...@gmail.com> wrote:

> It should be in the trunk, check it
> https://github.com/apache/cassandra/blob/trunk/bin/cassandra-shuffle
>
>
> On Thu, Jan 10, 2013 at 1:18 AM, Manu Zhang <owenzhang1...@gmail.com>wrote:
>
>> Is cassandra-shuffle command in the trunk? Or it is only included in the
>> Debian package? I don't find it in the trunk.
>>
>>
>> On Sat, Nov 3, 2012 at 2:18 AM, Eric Evans <eev...@acunu.com> wrote:
>>
>>> On Fri, Nov 2, 2012 at 12:38 AM, Manu Zhang <owenzhang1...@gmail.com>
>>> wrote:
>>> >> It splits into a contiguous range, because truly upgrading to vnode
>>> >> functionality is another step.
>>> >
>>> > That confuses me. As I understand it, there is no point in having 256
>>> tokens
>>> > on same node if I don't commit the shuffle
>>>
>>> This isn't exactly true.  By-partition operations (think repair,
>>> streaming, etc) will be more reliable in the sense that if they fail
>>> and need to be restarted, there is less that is lost/needs redoing.
>>> Also, if all you did was migrate from 1-token-per-node to 256
>>> contiguous tokens per node, normal topology changes (bootstrapping new
>>> nodes, decommissioning old ones), would gradually work to redistribute
>>> the partitions.  And, from a topology perspective, splitting the one
>>> partition into many contiguous partition is a no-op; it's safe to do
>>> and there is no cost to speak of from a computational or IO
>>> perspective.
>>>
>>> On the other hand, shuffling requires moving tokens around the
>>> cluster.  If you completely randomize placement, it follows that you
>>> will need to relocate all of the clusters data, so it's quite costly.
>>> It's also precedent setting, and not thoroughly tested yet.
>>>
>>> --
>>> Eric Evans
>>> Acunu | http://www.acunu.com | @acunu
>>>
>>
>>
>

Reply via email to