Hello, I am trying to upgrade Apache Cassandra from 2.1.16 to 3.11.3. The regular rolling upgrade process works fine without any issues.
However, I am running into an issue: if a node on the older version dies (hardware failure) and a new node comes up and tries to bootstrap, the bootstrap fails. I tried two combinations:

1. Joining the replacement node with Cassandra 2.1.16

In this case the nodes on 2.1.16 are able to stream data to the new node, but the nodes on 3.11.3 fail with the error below.

> ERROR [STREAM-INIT-/10.x.x.x:40296] 2019-07-26 17:45:17,775 IncomingStreamingConnection.java:80 - Error while reading from socket from /10.y.y.y:40296.
> java.io.IOException: Received stream using protocol version 2 (my version 4). Terminating connection

2. Joining the replacement node with Cassandra 3.11.3

In this case the nodes on 3.11.3 are able to stream data to the new node, but the new node is not able to stream data from the 2.1.16 nodes and fails with the error below.

> ERROR [STREAM-IN-/10.z.z.z:7000] 2019-07-26 18:08:10,380 StreamSession.java:593 - [Stream #538c6900-afd0-11e9-a649-ab2e045ee53b] Streaming error occurred on session with peer 10.z.z.z
> java.io.IOException: Connection reset by peer
>     at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[na:1.8.0_151]
>     at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) ~[na:1.8.0_151]
>     at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) ~[na:1.8.0_151]
>     at sun.nio.ch.IOUtil.read(IOUtil.java:197) ~[na:1.8.0_151]
>     at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380) ~[na:1.8.0_151]
>     at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:206) ~[na:1.8.0_151]
>     at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103) ~[na:1.8.0_151]
>     at java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385) ~[na:1.8.0_151]
>     at org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:56) ~[apache-cassandra-3.11.3.jar:3.11.3]
>     at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:311) ~[apache-cassandra-3.11.3.jar:3.11.3]
>     at java.lang.Thread.run(Thread.java:748) [na:1.8.0_151]

Note: In both cases I am using replace_address to replace the dead node, as I am running into some issues with "nodetool removenode". I use ephemeral disks, so a replacement node always comes up with an empty data directory and has to bootstrap.

Is there any workaround to mitigate this problem? I am worried about nodes going down while we are in the middle of the upgrade, as it could take several hours to upgrade depending on the cluster size.