Re: Logs appear to contradict themselves during bootstrap steps

Sotirios Delimanolis Fri, 06 Jan 2017 15:46:46 -0800

I forgot to check nodetool gossipinfo. Still, why does the first check think 
that the address exists, but the second doesn't?


On Friday, January 6, 2017 1:11 PM, David Berry <dbe...@blackberry.com> wrote:
 

I’ve encountered this previously where after removing a node, gossip info is
retained for 72 hours which doesn’t allow the IP to be reused during that
period.

You can check how long gossip will retain this information using “nodetool
gossipinfo” where the epoch time will be shown with status.

For example….

nodetool gossipinfo

/10.236.70.199
  generation:1482436691
  heartbeat:3942407
  STATUS:3942404:LEFT,3074457345618261000,1483995662276
  LOAD:3942267:3.60685807E8
  SCHEMA:223625:acbf0adb-1bbe-384a-acd7-6a46609497f1
  DC:20:orion
  RACK:22:r1
  RELEASE_VERSION:4:2.1.16
  RPC_ADDRESS:3:10.236.70.199
  SEVERITY:3942406:0.25094103813171387
  NET_VERSION:1:8
  HOST_ID:2:cd2a767f-3716-4717-9106-52f0380e6184
  TOKENS:15:<hidden>

Converting it from epoch…..

local@img2116saturn101:~$ date -d @$((1483995662276/1000))
Mon Jan  9 21:01:02 UTC 2017

At the time we waited the 72 hour period before reusing the IP, I’ve not used
replace_address previously.

From: Sotirios Delimanolis [mailto:sotodel...@yahoo.com]
Sent: Friday, January 6, 2017 2:38 PM
To: User <user@cassandra.apache.org>
Subject: Logs appear to contradict themselves during bootstrap steps

We had a node go down in our cluster and its disk had to be wiped. During that
time, all nodes in the cluster have restarted at least once.

We want to add the bad node back to the ring. It has the same IP/hostname. I
follow the steps here for "Adding nodes to an existing cluster."

When the process is started up, it reports

    A node with address <hostname>/<address> already exists, cancelling join.
    Use cassandra.replace_address if you want to replace this node.

I found this error message in the StorageService using the Gossiper instance
to look up the node's state. Apparently, the node knows about it. So I
followed the instructions and added the cassandra.replace_address system
property and restarted the process.

But it reports

    Cannot replace_address /<address> because it doesn't exist in gossip

So which one is it? Does the ring know about it or not? Running "nodetool
ring" does show it on all other nodes.

I've seen CASSANDRA-8138 and the conditions are the same, but I can't
understand why it thinks it's not part of gossip. What's the difference
between the gossip check used to make this determination and the gossip check
used for the first error message? Can someone explain?

I've since retrieved the node's id and used it to "nodetool removenode". After
rebalancing, I added the node back and "nodetool cleaned" up. Everything's up
and running, but I'd like to understand what Cassandra was doing.
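
For anyone repeating the epoch conversion from David's reply, here is a minimal Python sketch of the same arithmetic. It assumes the STATUS value for a removed node has the shape LEFT,<token>,<expiry in epoch milliseconds>, which is inferred only from the sample gossipinfo output in this thread; the function name left_expiry_utc is hypothetical and not part of any Cassandra tooling.

```python
from datetime import datetime, timezone


def left_expiry_utc(status_line: str) -> datetime:
    """Return the gossip expiry time (UTC) from a 'STATUS:...:LEFT,...' line.

    Assumes the layout seen in this thread:
    STATUS:<version>:LEFT,<token>,<expiry_epoch_millis>
    """
    # Split off the "STATUS" key and the version number; keep the value.
    value = status_line.strip().split(":", 2)[2]
    state, _token, expiry_ms = value.split(",")
    if state != "LEFT":
        raise ValueError(f"expected a LEFT status, got {state!r}")
    # Integer division mirrors the shell arithmetic $((millis/1000)).
    return datetime.fromtimestamp(int(expiry_ms) // 1000, tz=timezone.utc)


line = "STATUS:3942404:LEFT,3074457345618261000,1483995662276"
print(left_expiry_utc(line).strftime("%a %b %d %H:%M:%S UTC %Y"))
```

With the STATUS line quoted above this gives the same instant as the date command in the reply, 21:01:02 UTC on Jan 9, 2017.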
