Re: Clarification on Ring operations in Cassandra 0.5.1

Jonathan Ellis Wed, 21 Apr 2010 09:09:06 -0700

Yes, that looks right, where "token really close" means "slightly less
than" (more than would move it into a different node's range).


You can't really migrate via scp since only one node with a given
token can exist in the cluster at a time.

-Jonathan

On Wed, Apr 21, 2010 at 11:02 AM, Anthony Molinaro
<antho...@alumni.caltech.edu> wrote:
> Hi,
>
>  I'm still curious if I got the data movement right in this email from
> before?  Anyone?  Also, anyone know if I can scp the data directory from
> a node I want to replace to a new machine?  The cassandra streaming seems
> much slower than scp.
>
> -Anthony
>
> On Mon, Apr 19, 2010 at 04:48:23PM -0700, Anthony Molinaro wrote:
>>
>> On Mon, Apr 19, 2010 at 03:28:26PM -0500, Jonathan Ellis wrote:
>> > > Can I then 'nodeprobe move <token for range I want to take over>', and
>> > > achieve the same as step 2 above?
>> >
>> > You can't have two nodes with the same token in the ring at once.  So,
>> > you can removetoken the old node first, then bootstrap the new one
>> > (just specify InitialToken in the config to avoid having it guess
>> > one), or you can make it a 3 step process (bootstrap, remove, move) to
>> > avoid transferring so much data around.
>>
>> So I'm still a little fuzzy for your 3 step case on why less data moves,
>> but let me run through the two scenarios and see where we get.  Please
>> correct me if I'm wrong on some point.
>>
>> Let's say I have 3 nodes with random partitioner and rack unaware strategy.
>> Which means I have something like
>>
>> Node  Size   Token  KeyRange (self + next in ring)
>> ----  ----   -----  ------------------------------
>> A     5 G      33    1 -> 66
>> B     6 G      66       34 -> 0
>> C     2 G       0          67 -> 33
>>
>> Now let's say Node B is giving us some problems, so we want to replace it
>> with another node D.
>>
>> We've outlined 2 processes.
>>
>> In the first process you recommend
>>
>> 1. removetoken on node B
>> 2. wait for data to move
>> 3. add InitialToken of 66 and AutoBootstrap = true to node D storage-conf.xml
>>    then start it
>> 4. wait for data to move
>>
>> So when you do the removetoken, this will cause the following transfers
>> at stage 2
>>   Node A sends 34->66 to Node C
>>   Node C sends 67->0  to Node A
>> at stage 4
>>   Node A sends 34->66 to Node D
>>   Node C sends 67->0  to Node D
>>
>> In the second process I assume you pick a token really close to another 
>> token?
>>
>> 1. add InitialToken of 34 and AutoBootstrap = true to node D storage-conf.xml
>>    then start it
>> 2. wait for data to move
>> 3. removetoken on node B
>> 4. wait for data to move
>> 5. movetoken on node D to 66
>> 6. wait for data to move
>>
>> This results in the following moves
>> at stage 2
>>   Node A/B sends 33->34 to Node D (primary token range)
>>   Node B sends 34->66 to Node D   (replica range)
>> at stage 4
>>   Node C sends 66->0 to Node D (replica range)
>> at stage 6
>>   No data movement as D already had 33->0
>>
>> So it seems like you move all the data twice for process 1 and only a small
>> portion twice for process 2 (which is what you said, so hopefully I've
>> outlined correctly what is happening).  Does all that sound right?
>>
>> Once I've run bootstrap with the InitialToken value set in the config, is
>> it then ignored in subsequent restarts, and if so can I just remove it
>> after that first time?
>>
>> Thanks,
>>
>> -Anthony
>>
>> --
>> ------------------------------------------------------------------------
>> Anthony Molinaro                           <antho...@alumni.caltech.edu>
>
> --
> ------------------------------------------------------------------------
> Anthony Molinaro                           <antho...@alumni.caltech.edu>
>
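
For anyone following the token arithmetic in this thread, here is a minimal Python sketch of it. It assumes the small integer tokens from the example above and models primary ranges only (no replicas, no data sizes). `primary_ranges` is a hypothetical helper written for illustration; it is not part of Cassandra or nodeprobe.

```python
def primary_ranges(tokens):
    """Map each node to the (start, end] token range it owns.

    tokens: dict of node name -> token. A node owns everything after the
    previous token on the ring, up to and including its own token.
    """
    if len(set(tokens.values())) != len(tokens):
        raise ValueError("two nodes cannot share a token")
    ring = sorted(tokens.items(), key=lambda item: item[1])
    ranges = {}
    for i, (node, token) in enumerate(ring):
        prev_token = ring[i - 1][1]  # i == 0 wraps around to the last node
        ranges[node] = (prev_token, token)
    return ranges


# Original three-node ring from the example.
before = primary_ranges({"A": 33, "B": 66, "C": 0})

# Bootstrap D slightly below B's token: D takes almost all of B's range.
with_d = primary_ranges({"A": 33, "B": 66, "C": 0, "D": 65})

# After removing B and moving D to B's old token.
after = primary_ranges({"A": 33, "C": 0, "D": 66})

for label, ring in (("before", before), ("with D", with_d), ("after", after)):
    print(label, ring)
```

In this model a token slightly less than 66 splits B's own range, so D ends up holding nearly all of it before B is removed. A token slightly more than 66 would instead carve a sliver out of C's range, which is the "different node's range" case mentioned at the top of this message.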
