[ 
https://issues.apache.org/jira/browse/CASSANDRA-6554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864672#comment-13864672
 ] 

Brandon Williams edited comment on CASSANDRA-6554 at 1/7/14 8:49 PM:
---------------------------------------------------------------------

The strange thing is, it does mark the other non-upgraded nodes as up:

{noformat}
TRACE [GossipStage:1] 2014-01-07 20:07:53,456 GossipDigestAck2VerbHandler.java (line 38) Received a GossipDigestAck2Message from /10.180.236.244
DEBUG [GossipStage:1] 2014-01-07 20:07:53,456 Gossiper.java (line 790) Clearing interval times for /10.180.236.244 due to generation change
TRACE [GossipStage:1] 2014-01-07 20:07:53,457 FailureDetector.java (line 189) reporting /10.180.236.244
DEBUG [GossipStage:1] 2014-01-07 20:07:53,467 Gossiper.java (line 790) Clearing interval times for /10.182.208.161 due to generation change
TRACE [GossipStage:1] 2014-01-07 20:07:53,468 FailureDetector.java (line 189) reporting /10.182.208.161
TRACE [GossipStage:1] 2014-01-07 20:07:53,468 Gossiper.java (line 932) /10.180.236.244local generation 0, remote generation 1389124753
TRACE [GossipStage:1] 2014-01-07 20:07:53,468 Gossiper.java (line 937) Updating heartbeat state generation to 1389124753 from 0 for /10.180.236.244
 INFO [GossipStage:1] 2014-01-07 20:07:53,468 Gossiper.java (line 868) Node /10.180.236.244 has restarted, now UP
TRACE [GossipStage:1] 2014-01-07 20:07:53,468 Gossiper.java (line 873) Adding endpoint state for /10.180.236.244
TRACE [GossipStage:1] 2014-01-07 20:07:53,506 Gossiper.java (line 815) Sending a EchoMessage to /10.180.236.244
 INFO [HANDSHAKE-/10.180.236.244] 2014-01-07 20:07:53,534 OutboundTcpConnection.java (line 386) Handshaking version with /10.180.236.244
TRACE [GossipStage:1] 2014-01-07 20:07:53,559 TokenSerializer.java (line 56) Reading token of 8 bytes
 INFO [GossipStage:1] 2014-01-07 20:07:53,562 StorageService.java (line 1445) Node /10.180.236.244 state jump to normal
TRACE [GossipStage:1] 2014-01-07 20:07:53,570 Gossiper.java (line 932) /10.182.208.161local generation 0, remote generation 1389124753
TRACE [GossipStage:1] 2014-01-07 20:07:53,571 Gossiper.java (line 937) Updating heartbeat state generation to 1389124753 from 0 for /10.182.208.161
 INFO [GossipStage:1] 2014-01-07 20:07:53,571 Gossiper.java (line 868) Node /10.182.208.161 has restarted, now UP
{noformat}

And it never marks them down after that.  Can you get a gms trace from one of
the other nodes too?
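
When comparing traces from several nodes, it may help to pull out just the up/down transitions per endpoint. The following is only a rough sketch, not part of Cassandra: the "has restarted, now UP" wording is copied from the trace above, while the "is now DOWN" wording is an assumption about what the marked-down message looks like and should be checked against the actual log. The function names are made up for illustration.

```python
import re
from collections import defaultdict

# Wording taken from the trace quoted above. \s+ tolerates mail-wrapped lines.
UP_RE = re.compile(r"Node\s+(/[\d.]+)\s+has\s+restarted,\s+now\s+UP")
# ASSUMPTION: wording of the "marked down" message; it does not appear in the
# trace above, so adjust this pattern to whatever the real log contains.
DOWN_RE = re.compile(r"InetAddress\s+(/[\d.]+)\s+is\s+now\s+DOWN")


def gossip_transitions(log_text):
    """Return {endpoint: [state, ...]} in the order the events appear."""
    events = []
    for state, pattern in (("UP", UP_RE), ("DOWN", DOWN_RE)):
        for match in pattern.finditer(log_text):
            # Keep the match offset so events can be sorted into log order.
            events.append((match.start(), match.group(1), state))
    transitions = defaultdict(list)
    for _, endpoint, state in sorted(events):
        transitions[endpoint].append(state)
    return dict(transitions)


def never_marked_down(log_text):
    """Endpoints that were seen coming up but never seen going down."""
    return sorted(
        endpoint
        for endpoint, states in gossip_transitions(log_text).items()
        if "DOWN" not in states
    )
```

Running never_marked_down() over the text of each node's system.log would list the endpoints that node marked up and, as far as the log shows, never marked down again.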


> During upgrade from 1.2 -> 2.0, upgraded node sees other nodes as Down
> ----------------------------------------------------------------------
>
>                 Key: CASSANDRA-6554
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6554
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: EC2 Ubuntu Precise 12.04
> Oracle JRE 1.7_25
> C* 1.2.13 upgrade to 2.0.4
>            Reporter: Michael Shuler
>         Attachments: 6554_trace_system.log
>
>
> During an upgrade from 1.2.13 to 2.0.3/2.0.4, the upgraded node sees the 
> remaining nodes of the cluster as Down.
> {code}
> automaton@ip-10-139-1-113:~$ nodetool status
> Datacenter: datacenter1
> =======================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address         Load       Owns   Host ID                               Token                  Rack
> UN  10.139.1.113    98.94 MB   33.3%  33b1cd06-e17b-4332-8066-0c6c401e0cf3  -9223372036854775808   rack1
> DN  10.139.11.168   97.51 MB   33.3%  ec97c163-8f2d-4019-a3d1-55df5e4037d4  -3074457345618258603   rack1
> DN  10.238.221.115  97.34 MB   33.3%  73a76d3f-73ef-481d-b603-0833c0ff80c2  3074457345618258602    rack1
> automaton@ip-10-139-1-113:~$ nodetool gossipinfo
> /10.238.221.115
>   SEVERITY:0.0
>   RPC_ADDRESS:0.0.0.0
>   DC:datacenter1
>   RELEASE_VERSION:1.2.13
>   LOAD:1.02066255E8
>   STATUS:NORMAL,3074457345618258602
>   SCHEMA:8b351435-81ef-3a14-adf7-8555e2f19ecd
>   NET_VERSION:6
>   RACK:rack1
>   HOST_ID:73a76d3f-73ef-481d-b603-0833c0ff80c2
> /10.139.1.113
>   RPC_ADDRESS:0.0.0.0
>   SEVERITY:0.0
>   DC:datacenter1
>   RELEASE_VERSION:2.0.4
>   LOAD:1.03750451E8
>   STATUS:NORMAL,-9223372036854775808
>   SCHEMA:dfafb212-5b8f-31cb-a80b-2ba58fcef73d
>   NET_VERSION:7
>   RACK:rack1
>   HOST_ID:33b1cd06-e17b-4332-8066-0c6c401e0cf3
> /10.139.11.168
>   SEVERITY:0.0
>   RPC_ADDRESS:0.0.0.0
>   DC:datacenter1
>   RELEASE_VERSION:1.2.13
>   LOAD:1.02245066E8
>   STATUS:NORMAL,-3074457345618258603
>   SCHEMA:8b351435-81ef-3a14-adf7-8555e2f19ecd
>   NET_VERSION:6
>   RACK:rack1
>   HOST_ID:ec97c163-8f2d-4019-a3d1-55df5e4037d4
> {code}
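
The gossipinfo output above shows the upgraded node advertising NET_VERSION 7 while the two 1.2.13 nodes advertise 6. A small helper along these lines could make that kind of mixed-version state easier to spot across a larger cluster. This is a sketch under stated assumptions: it expects text shaped like the output quoted above (an endpoint line starting with "/", followed by indented KEY:value lines, with no shell prompt lines mixed in), and the function names are hypothetical.

```python
def parse_gossipinfo(text):
    """Parse `nodetool gossipinfo`-style text into {endpoint: {key: value}}."""
    nodes = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("/"):
            # A line such as "/10.139.1.113" starts a new endpoint section.
            current = nodes.setdefault(line, {})
        elif current is not None and ":" in line:
            # Split on the first colon only, so values keep any later colons.
            key, value = line.split(":", 1)
            current[key] = value
    return nodes


def versions_by_endpoint(text):
    """Map each endpoint to its (RELEASE_VERSION, NET_VERSION) pair."""
    return {
        endpoint: (state.get("RELEASE_VERSION"), state.get("NET_VERSION"))
        for endpoint, state in parse_gossipinfo(text).items()
    }
```

For the output quoted in this issue, versions_by_endpoint() would pair 10.139.1.113 with ("2.0.4", "7") and the other two nodes with ("1.2.13", "6").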



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
