Just a quick update: I was able to fix the problem by reverting the patch
for CASSANDRA-8336 in our custom Cassandra build. I don't know the root
cause yet, though. I will open a JIRA ticket and post it here for reference
later.
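
For anyone who wants to check for the symptom described in the quoted
message below, here is a rough sketch that lists connections stuck in
CLOSE_WAIT on port 7000. It is only an illustration: it assumes a Linux
node, IPv4 only, a little-endian host (e.g. x86), and the usual
/proc/net/tcp text layout. The function names are made up for this example
and are not part of Cassandra.

```python
import socket
import struct

CLOSE_WAIT = 0x08    # state code the Linux kernel prints for CLOSE_WAIT
STORAGE_PORT = 7000  # inter-node port mentioned in this thread


def parse_endpoint(hex_endpoint):
    """Turn an entry such as '0100007F:1B58' into ('127.0.0.1', 7000)."""
    hex_ip, hex_port = hex_endpoint.split(":")
    # Assumption: little-endian host, so the printed hex value is the
    # byte-reversed form of the IPv4 address.
    ip = socket.inet_ntoa(struct.pack("<I", int(hex_ip, 16)))
    return ip, int(hex_port, 16)


def close_wait_peers(proc_net_tcp_text, port=STORAGE_PORT):
    """Return (ip, port) of remote peers whose connection is in CLOSE_WAIT
    and has the given port on either the local or the remote side."""
    peers = []
    for line in proc_net_tcp_text.splitlines()[1:]:  # first line is a header
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or truncated lines
        local = parse_endpoint(fields[1])
        remote = parse_endpoint(fields[2])
        state = int(fields[3], 16)
        if state == CLOSE_WAIT and port in (local[1], remote[1]):
            peers.append(remote)
    return peers
```

On a node it could be called as
close_wait_peers(open("/proc/net/tcp").read()); a non-empty result while a
peer is known to be down would match the behaviour reported below.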

On Fri, Jun 12, 2015 at 11:31 AM, Paulo Ricardo Motta Gomes <
paulo.mo...@chaordicsystems.com> wrote:

> Hello,
>
> We recently upgraded a cluster from 2.0.12 to 2.0.15, and now whenever we
> stop or kill a Cassandra process, some other nodes keep a connection to the
> dead node in the CLOSE_WAIT state on port 7000 for about 5-20 minutes.
>
> So, if I start the killed node again, it cannot handshake with the nodes
> that still hold a connection in the CLOSE_WAIT state until that connection
> is closed. Both sides therefore see each other as down for 5-20 minutes,
> until they can handshake again.
>
> I believe this is somehow related to the fixes for CASSANDRA-8336 and
> CASSANDRA-9238, and it could also be a duplicate of CASSANDRA-8072. I will
> continue to investigate to see if I can find more evidence, but any help at
> this point would be appreciated, or at least a confirmation that it could
> be related to any of these tickets.
>
> Cheers,
>
> --
> *Paulo Motta*
>
> Chaordic | *Platform*
> *www.chaordic.com.br <http://www.chaordic.com.br/>*
> +55 48 3232.3200
>



-- 
*Paulo Motta*

Chaordic | *Platform*
*www.chaordic.com.br <http://www.chaordic.com.br/>*
+55 48 3232.3200
