Hi,

  I'm still curious whether I got the data movement right in my earlier
email (quoted below).  Anyone?  Also, does anyone know if I can scp the data
directory from a node I want to replace to a new machine?  Cassandra
streaming seems much slower than scp.

-Anthony

On Mon, Apr 19, 2010 at 04:48:23PM -0700, Anthony Molinaro wrote:
> 
> On Mon, Apr 19, 2010 at 03:28:26PM -0500, Jonathan Ellis wrote:
> > > Can I then 'nodeprobe move <token for range I want to take over>', and
> > > achieve the same as step 2 above?
> > 
> > You can't have two nodes with the same token in the ring at once.  So,
> > you can removetoken the old node first, then bootstrap the new one
> > (just specify InitialToken in the config to avoid having it guess
> > one), or you can make it a 3 step process (bootstrap, remove, move) to
> > avoid transferring so much data around.
> 
> So I'm still a little fuzzy on why less data moves in your 3 step case,
> but let me run through the two scenarios and see where we get.  Please
> correct me if I'm wrong on some point.
> 
> Let's say I have 3 nodes with the random partitioner and the rack unaware
> strategy, which means I have something like
> 
> Node  Size   Token  KeyRange (self + next in ring)
> ----  ----   -----  ------------------------------
> A     5 G      33    1 -> 66
> B     6 G      66       34 -> 0
> C     2 G       0          67 -> 33
> 
> Now let's say Node B is giving us some problems, so we want to replace it
> with another node D.
> 
> We've outlined 2 processes.
> 
> In the first process you recommend
> 
> 1. removetoken on node B
> 2. wait for data to move
> 3. set InitialToken to 66 and AutoBootstrap to true in node D's
>    storage-conf.xml, then start it
> 4. wait for data to move
> 
> So when you do the removetoken, this will cause the following transfers
> at stage 2
>   Node A sends 34->66 to Node C
>   Node C sends 67->0  to Node A
> at stage 4
>   Node A sends 34->66 to Node D
>   Node C sends 67->0  to Node D
> 
> In the second process I assume you pick a token really close to another token?
> 
> 1. set InitialToken to 34 and AutoBootstrap to true in node D's
>    storage-conf.xml, then start it
> 2. wait for data to move
> 3. removetoken on node B
> 4. wait for data to move
> 5. 'nodeprobe move' node D to token 66
> 6. wait for data to move
> 
> This results in the following moves
> at stage 2
>   Node A/B sends 33->34 to Node D (primary token range)
>   Node B sends 34->66 to Node D   (replica range)
> at stage 4
>   Node C sends 66->0 to Node D (replica range)
> at stage 6
>   No data movement as D already had 33->0
> 
> So it seems like you move all the data twice for process 1 and only a
> small portion twice for process 2 (which is what you said, so hopefully
> I've outlined correctly what is happening).  Does all that sound right?
> 
> Once I've run bootstrap with the InitialToken value set in the config, is
> it then ignored on subsequent restarts, and if so, can I just remove it
> after that first time?
> 
> Thanks,
> 
> -Anthony
> 
> -- 
> ------------------------------------------------------------------------
> Anthony Molinaro                           <antho...@alumni.caltech.edu>
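
For anyone checking my arithmetic, the ring table in the quoted message can
be reproduced with a small Python sketch.  It only encodes my own assumption
from that table (each node holds its primary range plus the next node's
range, i.e. 2 copies of the data); it is not taken from the Cassandra source,
and the function names are made up for illustration.  Ranges are inclusive
(start, end) pairs and may wrap past 0.

```python
def primary_ranges(ring):
    """Map node name -> (start, end) of its primary range, inclusive.

    Assumes a node's primary range ends at its own token and starts one
    past the previous node's token (toy integer tokens as in the email).
    """
    ordered = sorted(ring.items(), key=lambda kv: kv[1])
    result = {}
    for i, (name, token) in enumerate(ordered):
        prev_token = ordered[i - 1][1]  # index -1 wraps to the last node
        result[name] = (prev_token + 1, token)
    return result


def held_ranges(ring):
    """Map node name -> (start, end) of everything the node stores.

    Assumption from the email's table: "self + next in ring".
    """
    ordered = sorted(ring, key=ring.get)
    primary = primary_ranges(ring)
    held = {}
    for i, name in enumerate(ordered):
        next_name = ordered[(i + 1) % len(ordered)]
        held[name] = (primary[name][0], primary[next_name][1])
    return held


ring = {"A": 33, "B": 66, "C": 0}
for name, (start, end) in sorted(held_ranges(ring).items()):
    print(f"{name}  token {ring[name]:>2}  holds {start} -> {end}")
```

If the real placement puts the extra copy on a different neighbour, only
held_ranges() needs to change; the transfer lists above would shift
accordingly.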

-- 
------------------------------------------------------------------------
Anthony Molinaro                           <antho...@alumni.caltech.edu>
