Re: how to decommission two slow nodes?

Jonathan Ellis Thu, 20 May 2010 17:03:15 -0700

One possibility:

rsync the data to the next node in the ring that is in the same DC.
(specifically, rsync once, then flush on the source node and rsync
again.)  Then stop the entire cluster, and restart everyone but those
two nodes.  Then run nodetool repair on each machine.


If your client is not reading at CL.ALL during repair, it could miss
data that was written after the rsync.  Your call if that's
acceptable.
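
The steps above can be sketched as a dry-run plan. This is only an illustration, not a tested procedure: the host names, keyspace name, and data directory are hypothetical placeholders, and the exact nodetool flag syntax depends on your Cassandra version. It only builds the command strings and runs nothing. Before copying for real, check that the source's SSTable file names do not clash with files already on the target.

```python
# Dry-run sketch: builds the command sequence for the procedure above without
# executing anything. Host names, the keyspace, and the data path are
# hypothetical placeholders; nodetool flag syntax differs between versions.

DATA_DIR = "/var/lib/cassandra/data/"  # assumed data directory


def migration_plan(moves, survivors, keyspace):
    """Return the ordered shell commands as strings.

    moves: list of (slow_node, target_node) pairs, where target_node is the
           next node in the ring in the same DC.
    survivors: nodes that remain in the cluster afterwards.
    """
    plan = []
    for source, target in moves:
        copy = f"ssh {source} rsync -a {DATA_DIR} {target}:{DATA_DIR}"
        plan.append(copy)  # first pass: bulk copy while the node is live
        plan.append(f"nodetool -h {source} flush {keyspace}")  # memtables to disk
        plan.append(copy)  # second pass: picks up the newly flushed files
    plan.append("# stop the entire cluster, then start only: " + ", ".join(survivors))
    for node in survivors:
        plan.append(f"nodetool -h {node} repair {keyspace}")
    return plan


if __name__ == "__main__":
    for step in migration_plan(
        moves=[("slow1", "fast1"), ("slow2", "fast2")],
        survivors=["fast1", "fast2", "fast3", "remote1", "remote2", "remote3"],
        keyspace="Keyspace1",
    ):
        print(step)
```

Printing the plan first makes it easy to review the order (copy, flush, copy again, restart without the two nodes, repair everywhere) before running anything by hand.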

On Wed, May 19, 2010 at 11:57 PM, Ran Tavory <ran...@gmail.com> wrote:
> In my cluster setup I have two datacenters with 5 hosts in one DC and 3 in
> the other.
> In the 5-host DC I'd like to remove two hosts so I'd have 3 in each.
> The two nodes I'd like to decommission have less RAM than the other 3 so
> they operate slower.
> What's the most effective way to decommission them?
> At first I thought I'd decommission the first and then, when it was done,
> decommission the second. The problem was that when I decommissioned the
> first, it started streaming its data to the second node (as well as others, I
> think), and since the second node was under heavy load and short on RAM,
> it was busy GCing and ran horribly slowly. Eventually, after almost 24h of
> horribly slow streaming, I gave up. This also caused the entire cluster to
> operate horribly slowly.
> So, is there a better way to decommission the two underprovisioned nodes
> without slowing down the cluster, or at least with minimal effect?
> My replication is 2 and I'm using a RackAwareStrategy so (if everything is
> configured correctly with the EndPointSnitch) then at any given time, two
> copies of the data exist, one in each DC.
> Thanks
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
