[ https://issues.apache.org/jira/browse/CASSANDRA-6619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884166#comment-13884166 ]

Jonathan Ellis commented on CASSANDRA-6619:
-------------------------------------------

Before we add a workaround, do we understand why the CASSANDRA-5692 approach 
("reconnect and use the version that the other side sent us in the meantime") 
doesn't work?  1.1 and 1.2 are compatible enough to see each other's version, 
by design.  Specifically, in both versions the first two ints sent are MAGIC (a 
sanity check) and "header" (compression + version flags).
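
To make the two-int preamble concrete, here is a minimal sketch of how a peer 
could write it and how the other side could read the version back out.  The 
class name, the magic value, the bit positions inside "header", and the version 
number 4 are assumptions chosen for illustration; they are not taken from this 
thread and should be checked against the actual MessagingService code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class HandshakeSketch {
    // Assumed values, for illustration only.
    static final int MAGIC = 0xCA552DFA;        // sanity-check constant
    static final int COMPRESSION_BIT = 1 << 2;  // assumed position of the compression flag
    static final int VERSION_SHIFT = 8;         // assumed position of the version field
    static final int VERSION_MASK = 0xFF;

    // Pack the version and the compression flag into the single "header" int.
    static int packHeader(int version, boolean compressed) {
        int header = (version & VERSION_MASK) << VERSION_SHIFT;
        if (compressed) {
            header |= COMPRESSION_BIT;
        }
        return header;
    }

    static int versionOf(int header) {
        return (header >>> VERSION_SHIFT) & VERSION_MASK;
    }

    static boolean isCompressed(int header) {
        return (header & COMPRESSION_BIT) != 0;
    }

    // The first two ints on the wire: MAGIC, then the header.
    static byte[] writePreamble(int version, boolean compressed) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(MAGIC);
        out.writeInt(packHeader(version, compressed));
        out.flush();
        return bytes.toByteArray();
    }

    // Validate MAGIC, then extract the sender's version from the header.
    static int readPeerVersion(byte[] preamble) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(preamble));
        if (in.readInt() != MAGIC) {
            throw new IOException("bad magic: not a messaging connection");
        }
        return versionOf(in.readInt());
    }

    public static void main(String[] args) throws IOException {
        byte[] fromOlderPeer = writePreamble(4, false); // 4 is a hypothetical version number
        System.out.println("peer version = " + readPeerVersion(fromOlderPeer));
    }
}
```

Because both ints have the same meaning on either side, a receiver can learn 
the sender's version even when the rest of the message format differs, which is 
what the "reconnect with the version the other side sent" approach relies on.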

> Race condition issue during upgrading 1.1 to 1.2
> ------------------------------------------------
>
>                 Key: CASSANDRA-6619
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6619
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Minh Do
>            Assignee: Minh Do
>            Priority: Minor
>             Fix For: 1.2.14
>
>         Attachments: patch.txt
>
>
> There is a race condition when upgrading a C* 1.1.x cluster to C* 1.2.
> One issue is that an OutboundTCPConnection can't be established from a 1.2 
> node to some 1.1.x nodes.  Because of this, a live cluster will suffer high 
> read latency during the upgrade and be unable to fulfill some write requests.  
> This is not a problem for a small cluster, but it is for a large cluster 
> (100+ nodes), because the upgrade takes from 10+ hours to 1+ day(s) to 
> complete.
> We are aware of CASSANDRA-5692; however, the issue is not fully fixed.  We 
> already have a patch for this and will attach it shortly for feedback.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
