Re: replace dead node? " token -1 "

Jim Cistaro Wed, 15 Aug 2012 10:31:56 -0700

I have not viewed the code, but it would seem that replace_token does not 
"remove token", because that would spread the data and then "unspread" it when 
the new node joins.  But like I said, I have not read the code.

From our standpoint, we want the tokens to stay the same when possible due to 
the way our backups are tagged.

As for "old nodes staying around", you are correct: we never remove the token 
(because we replace the node for that same token), so gossip keeps 
knowledge of the old node.

Sorry if this explanation is not that clear.  This issue is a little unclear 
and we are dealing with it from an ops POV rather than a dev understanding of 
the code.

As for the attractiveness of the T-1 approach: if you don't need 
token consistency, then it might be more attractive for you.  We don't use it, 
so I cannot say whether that approach has any issues.

Jim

From: Yang <[email protected]<mailto:[email protected]>>
Reply-To: <[email protected]<mailto:[email protected]>>
Date: Wed, 15 Aug 2012 02:00:55 -0700
To: <[email protected]<mailto:[email protected]>>
Subject: Re: replace dead node? " token -1 "

Considering there is this minor "old node hanging around" issue, would the old 
T-1 approach sound more attractive?
That way you don't necessarily have to remove the dead token immediately, but 
could come back the next day, or even a week later. T-1 would behave 
essentially the same in terms of partitioning the data range.
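
The claim that T-1 partitions the data range in essentially the same way can be checked with a small sketch. It assumes a RandomPartitioner-style ring of integer tokens in [0, 2**127) where each node owns the range from the previous token (exclusive) up to its own token (inclusive). The names replacement_token, owned_ranges and range_width are hypothetical helpers for illustration, not part of Cassandra or Priam.

```python
# Assumption: RandomPartitioner-style ring with integer tokens in [0, 2**127).
RING_SIZE = 2 ** 127


def replacement_token(dead_token: int) -> int:
    """Return T-1 for a dead node's token T, wrapping around the ring."""
    return (dead_token - 1) % RING_SIZE


def owned_ranges(tokens):
    """Map each token to the (start, end] range it owns.

    start is the previous token on the ring (exclusive) and end is the
    token itself (inclusive); the lowest token wraps to the highest.
    """
    ordered = sorted(tokens)
    return {tok: (ordered[i - 1], tok) for i, tok in enumerate(ordered)}


def range_width(start: int, end: int) -> int:
    """Count the tokens in (start, end], accounting for wraparound."""
    return (end - start) % RING_SIZE or RING_SIZE
```

For a dead node at token T, adding a node at replacement_token(T) leaves the dead entry owning a range of width 1 (only T itself), while the new node owns everything else the dead node used to own. That is the sense in which the two approaches split the ring almost identically while the dead token is still present.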

Thanks
Yang

On Wed, Aug 15, 2012 at 1:39 AM, Yang 
<[email protected]<mailto:[email protected]>> wrote:
OK, I see: the cassandra.replace_token setting essentially executes the 
manual removeToken step, so the dead node should be removed.

is this the "old node hanging around" issue that you described?
https://issues.apache.org/jira/browse/CASSANDRA-3259
It looks like this JIRA is fixed in 1.0x already, so is it another issue?

Thanks
Yang

On Tue, Aug 14, 2012 at 11:03 PM, Yang 
<[email protected]<mailto:[email protected]>> wrote:
Jim:

thanks a  lot for the info.

when you say "old nodes sometimes hanging around as "unreachable nodes" when 
describing cluster", you mean after the new node boots up and assumes ownership 
of the same token, you have not manually run nodetool removeToken, right? this 
kind of makes sense --- since it seems that the membership being gossiped 
around still contains the dead node (which is represented by a different AWS 
internal ip), though the same token is being associated to both dead and new 
nodes ??? I'm getting a bit confused here....

I think previously, when I booted up a new node with the same token while the 
old host was dead, the other nodes on the ring said something like "this token 
xxxxxx is already owned by old_node_ip_here,...... ".  I don't remember the 
exact behavior now, which is why I'm cautious about using T instead of T-1.

I'm doing more tests to confirm this behavior

Thanks
Yang

On Tue, Aug 14, 2012 at 10:17 PM, Jim Cistaro 
<[email protected]<mailto:[email protected]>> wrote:
We use priam to replace nodes using replace_token.  We do see some issues 
(currently on 1.0.9, as well as earlier versions) with replace_token.

Apparently there are some known issues with replace_token.  We have experienced 
the old nodes sometimes hanging around as "unreachable nodes" when describing 
cluster.  Also, we have experienced problems where moving the new node causes 
the old "replaced" node to resurrect for the token that was outgoing during the 
move.

You can notice these old nodes hanging around in logs.  You will see messages 
like:
StorageService.java (line 1020) Nodes /<old_ip> and /<new_ip> have the same 
token NNNNNNNNNN.  Ignoring /<old_ip>.

We have then had to run "nt removetoken" (nodetool) to clean things up after 
moves.  We are also investigating using the unsafeAssassinateEndpoint method 
(via JMX) to clean up some of the unreachables.
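
One way to spot these old nodes is to scan the Cassandra log for the "same token" message quoted above. The following is a minimal sketch, assuming each message sits on a single log line and matches the wording in this thread; the wording may differ between Cassandra versions, and stale_endpoints is a hypothetical helper name.

```python
import re

# Pattern modelled on the message quoted in this thread; treat the exact
# wording as an assumption that may not hold for every Cassandra version.
SAME_TOKEN = re.compile(
    r"Nodes /(?P<first>\S+) and /(?P<second>\S+) have the same token "
    r"(?P<token>-?\d+)\.\s+Ignoring /(?P<ignored>\S+?)\.?\s*$"
)


def stale_endpoints(lines):
    """Return {ignored_ip: token} for every 'same token' message found."""
    found = {}
    for line in lines:
        match = SAME_TOKEN.search(line)
        if match:
            found[match.group("ignored")] = int(match.group("token"))
    return found
```

Passing an open log file (or any iterable of lines) to stale_endpoints gives the addresses that the other nodes are ignoring, which are the candidates for manual cleanup.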

Like I said, we still use replace_token, but be aware of these possible 
inconveniences.

Jim Cistaro
Netflix Cassandra Operations

From: Yang <[email protected]<mailto:[email protected]>>
Reply-To: <[email protected]<mailto:[email protected]>>
Date: Tue, 14 Aug 2012 21:58:30 -0700
To: <[email protected]<mailto:[email protected]>>
Subject: Re: replace dead node? " token -1 "

Thanks Aaron, it has been a while since I last checked the code. I'll read it 
to understand it more.

On Aug 14, 2012 8:48 PM, "aaron morton" 
<[email protected]<mailto:[email protected]>> wrote:
> Using this method, when choosing the new <Token>, should we still use the T-1 ?
(AFAIK) No.
replace_token is used when you want to replace a node that is dead. In this 
case the dead node will be identified by its token.

> if so, would the duplicate token (same token but different ip) cause problems?
If the nodes are bootstrapping an error is raised.
Otherwise the token ownership is passed to the new node.

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 15/08/2012, at 11:07 AM, Yang 
<[email protected]<mailto:[email protected]>> wrote:

Previously, when a node died, I remember the documentation said it was 
better to assign T-1 to the new node,
where T was the token of the dead node.

the new doc for 1.x here

http://wiki.apache.org/cassandra/Operations#Replacing_a_Dead_Node

shows a new way to pass in cassandra.replace_token=<Token>
for the new node.
Using this method, when choosing the new <Token>, should we still use the T-1 ?

Also in Priam code:
https://github.com/Netflix/Priam/blob/master/priam/src/main/java/com/netflix/priam/identity/InstanceIdentity.java

line 148, it does not seem that Priam does the "-1" thing, but assigns the 
original token T to the new node.
if so, would the duplicate token (same token but different ip) cause problems?

Thanks
Yang
