Hello, I set up a two-node cluster on Azure. Each node is in its own DC (ping is about 10 ms), and the inter-node connection (SSL port 7001) goes through external IPs, i.e.
listen_interface: eth0
broadcast_address: 1.1.1.1

The cluster starts, cqlsh can connect, and the stress tool survives a night of writes with a replication factor of two, so all seems to be fine. But when the cluster is left without load, it becomes nonfunctional after several minutes of idle time. An attempt to connect fails with the error:

Connection error: ('Unable to connect to any servers', {'1.1.1.1': OperationTimedOut('errors=Timed out creating connection (10 seconds), last_host=None',)})

One node logs this message six minutes after start (with no load or connections during that time):

WARN 10:06:32 RequestExecutionException READ_TIMEOUT: Operation timed out - received only 1 responses.

nodetool status shows both nodes as UN (Up and Normal, I guess).

I suspected a connectivity problem, but tcpdump shows constant traffic on port 7001 between the nodes. Restarting the node OTHER than the one I'm connecting to solves the problem for another several minutes. I increased the TCP idle timeout in the Azure IP address settings to 30 minutes, but it had no effect.

Thanks,
Vlad