Cassandra timeout on node failure

Ankit Patel Thu, 23 Jan 2014 08:55:35 -0800

                    
We are seeing a weird issue with our Cassandra cluster (version 
1.0.10). We have 6 nodes (DC1:3, DC2:3) in our cluster, so all 6 nodes 
are replicas of each other. All reads and writes use LOCAL_QUORUM. 
When one of the nodes in DC1 fails, we see timeout errors for reads on 
the second node. With DEBUG-level logging turned on, we see the 
following error in the Cassandra logs:



DEBUG [Thrift:322] 2013-12-20 14:30:20,123 StorageProxy.java (line 676) Read timeout: 
java.util.concurrent.TimeoutException: Operation timed out - received only 2 responses from / xxx.xxx.xxx.IP1, xxx.xxx.xxx.IP2, .


Considering that LOCAL_QUORUM only needs 2 of the 3 nodes in the DC, 
I am surprised we are seeing this issue. The log clearly says it 
received 2 responses. Interestingly, when we connect to the third node 
after the second node returns the timeout error, it works as expected. 
Has anyone else faced this issue?
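
For reference, this is the quorum arithmetic I am assuming, as a minimal Python sketch. The function name is only illustrative (it is not a Cassandra API), and it assumes a replication factor of 3 in each DC, which is what "all 6 nodes are replicas of each other" implies here:

```python
def local_quorum(replication_factor: int) -> int:
    """Replicas in the local DC that must respond: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


rf_per_dc = 3                      # assumed: 3 replicas in each DC
required = local_quorum(rf_per_dc)
live_replicas = rf_per_dc - 1      # one node in DC1 is down

# Compare what LOCAL_QUORUM needs with what is still available.
print("required:", required, "live:", live_replicas,
      "satisfiable:", live_replicas >= required)
```

Under these assumptions the two surviving replicas in DC1 should be enough, which is why the timeout is unexpected.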
