Re: Question about node failure...

Ned Wolpert Mon, 29 Mar 2010 10:41:15 -0700

So, what does "anti-entropy repair" do then?

Sounds like you have to 'decommission' the dead node first, and then (I
thought) run 'nodeprobe repair' to bring the data back up to a replication
factor of 3, right?


Also, what is the method to decommission a dead node? Do you pass the IP
address of the dead node to nodeprobe on a live member of the cluster? I've
only used 'decommission' to remove the node I ran it on from the cluster, not
a different node.

It seems like if you decommission a node it should fix the replication
factor for data that was on that node in this case...
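
To make the scenario concrete, here is a rough sketch of the replica counts
involved. It is a simplified model and not Cassandra's actual code: it assumes
each range is stored on its owning node plus the next RF-1 nodes around the
ring, and the node names A-D and the helper functions are made up for
illustration.

```python
# Simplified model of ring-based replica placement (NOT Cassandra's real code).
# Assumption: a range lives on its owner plus the next RF-1 nodes on the ring.

def replicas(ring, owner_index, rf):
    """Return the nodes holding the range owned by ring[owner_index]."""
    count = min(rf, len(ring))
    return [ring[(owner_index + i) % len(ring)] for i in range(count)]

def live_copies(ring, rf, dead):
    """Map each range owner to the number of its replicas on live nodes."""
    return {
        ring[i]: sum(1 for node in replicas(ring, i, rf) if node not in dead)
        for i in range(len(ring))
    }

RF = 3
ring = ["A", "B", "C", "D"]  # hypothetical node names

# Node D has failed but is still a member of the ring.
before = live_copies(ring, RF, dead={"D"})

# D is removed from the ring; placement is recomputed over the survivors.
ring_after = [node for node in ring if node != "D"]
after = live_copies(ring_after, RF, dead=set())

print(before)
print(after)
```

In this model, every range that had a replica on D is down to 2 live copies
while D is still in the ring, and gets back to 3 only once the ring is
recomputed without D, which matches the "nothing happens until you
decommission it" answer quoted below.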

On Mon, Mar 29, 2010 at 10:32 AM, Jonathan Ellis <jbel...@gmail.com> wrote:

> On Mon, Mar 29, 2010 at 12:27 PM, Ned Wolpert <ned.wolp...@imemories.com> wrote:
> > Folks-
> >
> > Can someone point out what happens during a node failure? Here is the
> > specific use case:
> >
> >   - Cassandra cluster with 4 nodes, replication factor of 3
> >   - One node fails.
> >   - At this point, data that existed on the one failed node has copies on
> >     2 live nodes.
> >   - The failed node never comes back
> >
> > First question: At what point does Cassandra re-migrate that data that
> > only exists on 2 nodes to another node to retain the replication factor
> > of 3?
>
> When you tell it to decommission the dead one.
>
> > Second question: Given the above case, if a brand new node is added to
> > the cluster, does anything happen to the data that now only exists on 2
> > nodes?
>
> No, Cassandra doesn't automatically assume that "this node is never
> coming back" w/o intervention, by design.  (Temporary failures are
> much more common than permanent ones.)
>
> -Jonathan
>



-- 
Virtually, Ned Wolpert

"Settle thy studies, Faustus, and begin..."   --Marlowe
