[ 
https://issues.apache.org/jira/browse/CASSANDRA-20984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18032913#comment-18032913
 ] 

Vivekanand Koya commented on CASSANDRA-20984:
---------------------------------------------

The streaming code in Cassandra appears to be less robust than the 
message-handling code. The message-handling side of the server, i.e. 
org.apache.cassandra.net.OutboundConnection, has all the information it needs 
to handle every connection outcome in code. In particular, the enum 
org.apache.cassandra.net.OutboundConnectionInitiator$Result.Outcome is 
package-private to org.apache.cassandra.net. 

I've taken a closer look at the code. I've been able to reproduce the issue 
with the unit test and have produced a fix. Here are my observations.

The streaming code, located mostly in the class 
org.apache.cassandra.streaming.async.NettyStreamingConnectionFactory, invokes 
the method initiateStreaming in the class 
org.apache.cassandra.net.OutboundConnectionInitiator. Because the two classes 
live in different packages, the streaming code cannot perform any checks based 
on org.apache.cassandra.net.OutboundConnectionInitiator$Result.Outcome. Put 
simply, the outcome field and the Outcome enum of Result are inaccessible from 
the NettyStreamingConnectionFactory class.

 

I patched the code to perform checks based on Result.Outcome, overcoming this 
limitation. While working on the unit test, I also noticed that casts are 
performed inconsistently across the retry, success and incompatible outcomes.

 

In NettyStreamingConnectionFactory, there appears to be some confusion around 
the isSuccess() call. It is invoked on the Netty Future, when it should have 
been invoked on the Result object. Once the Future reports success, 
NettyStreamingConnectionFactory calls success() on the value returned by the 
Future's getNow() without checking its type before the cast.

There are no tests for the initiateStreaming() method of 
OutboundConnectionInitiator, as there are for its initiateMessaging() method.
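
To make the failure mode concrete, here is a minimal sketch. It assumes a 
simplified Result hierarchy and uses java.util.concurrent.CompletableFuture as 
a stand-in for the Netty Future; every class and method name in it is 
hypothetical and none of it is the actual Cassandra code. It shows that a 
future can complete normally while carrying a non-success Result, so checking 
only the future and then casting can throw ClassCastException.

```java
import java.util.concurrent.CompletableFuture;

// Simplified, hypothetical stand-ins: not the real Cassandra or Netty classes.
class FutureVersusResultSketch {
    enum Outcome { SUCCESS, RETRY, INCOMPATIBLE }

    static class Result {
        final Outcome outcome;
        Result(Outcome outcome) { this.outcome = outcome; }

        boolean isSuccess() { return outcome == Outcome.SUCCESS; }

        // Unchecked downcast: throws ClassCastException unless this is a Success.
        Success success() { return (Success) this; }
    }

    static class Success extends Result {
        final String channel;
        Success(String channel) { super(Outcome.SUCCESS); this.channel = channel; }
    }

    static class Incompatible extends Result {
        final int maxMessagingVersion;
        Incompatible(int maxMessagingVersion) {
            super(Outcome.INCOMPATIBLE);
            this.maxMessagingVersion = maxMessagingVersion;
        }
    }

    static boolean completedNormally(CompletableFuture<Result> future) {
        return future.isDone() && !future.isCompletedExceptionally();
    }

    // Checks only the future, then casts blindly.
    static String channelUnchecked(CompletableFuture<Result> future) {
        if (!completedNormally(future)) return null;
        return future.getNow(null).success().channel;
    }

    // Checks the future, then the Result's own outcome, before casting.
    static String channelChecked(CompletableFuture<Result> future) {
        if (!completedNormally(future)) return null;
        Result result = future.getNow(null);
        return result.isSuccess() ? result.success().channel : null;
    }

    public static void main(String[] args) {
        // The future itself completed normally, but the handshake did not succeed.
        CompletableFuture<Result> incompatible =
            CompletableFuture.completedFuture(new Incompatible(12));
        System.out.println(channelChecked(incompatible)); // prints "null"
        try {
            channelUnchecked(incompatible);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException from unchecked cast");
        }
    }
}
```

In this sketch the checked variant treats "the future completed" and "the 
handshake succeeded" as two separate questions, which is the distinction the 
observation above is about.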

Reproduction

I wrote a test (StreamingTest) that reproduces the issue reported in 
https://lists.apache.org/thread/ykkwhjdpgyqzw5xtol4v5ysz664bxxl3.

Code Change

I used pattern matching for instanceof (JEP 394, 
https://openjdk.org/jeps/394) to make incorrect comparisons a compile-time 
error. This is done in OutboundConnection, where I check whether 
result.success() is an instance of MessagingSuccess, and in 
OutboundConnectionInitiator, where I return Success safely instead.
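
For readers unfamiliar with JEP 394, the following is a minimal sketch of the 
technique, not the patch itself. The class names loosely echo the ones 
mentioned above, but the hierarchy, fields and the handle() method are 
assumptions made for illustration only. It requires JDK 16 or later.

```java
// Simplified, hypothetical hierarchy: not the real Cassandra classes.
class PatternMatchSketch {
    static abstract class Result {}

    static class Success extends Result {}

    static final class MessagingSuccess extends Success {
        final int messagingVersion;
        MessagingSuccess(int messagingVersion) { this.messagingVersion = messagingVersion; }
    }

    static final class StreamingSuccess extends Success {}

    static final class Retry extends Result {
        final int withMessagingVersion;
        Retry(int withMessagingVersion) { this.withMessagingVersion = withMessagingVersion; }
    }

    static final class Incompatible extends Result {
        final int maxMessagingVersion;
        Incompatible(int maxMessagingVersion) { this.maxMessagingVersion = maxMessagingVersion; }
    }

    // Each branch tests the type and binds a typed variable in one step,
    // so no separate (and possibly wrong) cast is ever written.
    static String handle(Result result) {
        if (result instanceof MessagingSuccess ms)
            return "messaging connection established, version " + ms.messagingVersion;
        if (result instanceof StreamingSuccess)
            return "streaming connection established";
        if (result instanceof Retry retry)
            return "retry with version " + retry.withMessagingVersion;
        if (result instanceof Incompatible inc)
            return "incompatible, peer supports up to version " + inc.maxMessagingVersion;
        return "unknown result";
    }

    // Testing a value against an unrelated class is rejected by the compiler.
    // For example, with a variable declared as MessagingSuccess ms, the
    // expression (ms instanceof Retry r) does not compile, because a
    // MessagingSuccess can never be a Retry.

    public static void main(String[] args) {
        System.out.println(handle(new MessagingSuccess(12)));
        System.out.println(handle(new Incompatible(11)));
    }
}
```

Because the binding variable only exists in the branch where the test 
succeeded, a non-success result falls through to its own branch instead of 
failing at a cast.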

 

GitHub Pull request: https://github.com/apache/cassandra/pull/4438


Please note: this change uses a language feature finalized in JDK 16 (pattern 
matching for instanceof) and therefore requires a higher minimum JDK.

> Fix java.lang.ClassCastException: Streaming Incompatible versions
> -----------------------------------------------------------------
>
>                 Key: CASSANDRA-20984
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20984
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Consistency/Bootstrap and Decommission, 
> Consistency/Streaming
>            Reporter: Vivekanand Koya
>            Assignee: Vivekanand Koya
>            Priority: Normal
>             Fix For: 5.0.3, 5.0.4, 5.0.5
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> On version mismatch or protocol incompatibility between the two communicating 
> nodes, cassandra server does not handle the errors correctly. There is no 
> proper error handling when ClassCastException occurs, since it is not 
> consistently reproducible. 
> In https://lists.apache.org/thread/ykkwhjdpgyqzw5xtol4v5ysz664bxxl3,
> the "Stream failed" message points to a streaming operation, which is used 
> for processes like node repairs or adding new nodes (bootstrapping). I found 
> a similar Jira that was already raised - 
> https://issues.apache.org/jira/browse/CASSANDRA-19218.  Not sure what to do 
> with this JIRA. Looks like a duplicate.



