Hello,
I set up a two-node cluster on Azure. Each node is in its own DC (ping about 
10 ms), and the inter-node connection (SSL port 7001) goes through external 
IPs, i.e.

 listen_interface: eth0
 broadcast_address: 1.1.1.1

The cluster starts, cqlsh can connect, and the stress tool survives a night of 
writes with replication factor two; all seems to be fine. But when the cluster 
is left without load, it becomes nonfunctional after several minutes of idle 
time. An attempt to connect fails with the error
Connection error: ('Unable to connect to any servers', {'1.1.1.1': 
OperationTimedOut('errors=Timed out creating connection (10 seconds), 
last_host=None',)})

There is a message

WARN  10:06:32 RequestExecutionException READ_TIMEOUT: 
Operation timed out - received only 1 responses.

on one node six minutes after start (no load or connections during this time).

nodetool status shows both nodes as UN (Up and Normal, I guess).

I suspected a connectivity problem, but tcpdump shows constant traffic on port 
7001 between the nodes. Restarting the OTHER node (not the one I'm connecting 
to) solves the problem for another several minutes. I increased the TCP idle 
timeout in the Azure IP address settings to 30 minutes, but it had no effect.

Thanks, Vlad

