We had a similar problem when our nodes could not sync their clocks over NTP because of VPC network ACL settings.

-ml
On Mon, Nov 18, 2013 at 8:49 PM, Steven A Robenalt <srobe...@stanford.edu> wrote:

> Hi all,
>
> I am attempting to bring up our new app on a 3-node cluster and am having
> problems with frequent read timeouts and slow inter-node replication.
> Initially, these errors were mostly occurring in our app server, affecting
> 0.02%-1.0% of our queries in an otherwise unloaded cluster. No exceptions
> were logged on the servers in this case, and reads in a single node
> environment with the same code and client driver virtually never see
> exceptions like this, so I suspect problems with the inter-cluster
> communication between nodes.
>
> The 3 nodes are deployed in a single AWS VPC, and are all in a common
> subnet. The Cassandra version is 2.0.2 following an upgrade this past
> weekend due to NPEs in a secondary index that were affecting certain
> queries under 2.0.1. The servers are m1.large instances running AWS Linux
> and Oracle JDK7u40. The first 2 nodes in the cluster are the seed nodes.
> All database contents are CQL tables with replication factor of 3, and the
> application is Java-based, using the latest Datastax 2.0.0-rc1 Java Driver.
>
> In testing with the application, I noticed this afternoon that the
> contents of the 3 nodes differed in their respective copies of the same
> table for newly written data, for time periods exceeding several minutes,
> as reported by cqlsh on each node. Specifying different hosts from the same
> server using cqlsh also exhibited timeouts on multiple attempts to connect,
> and on executing some queries, though they eventually succeeded in all
> cases, and eventually the data in all nodes was fully replicated.
>
> The AWS servers have a security group with only ports 22, 7000, 9042, and
> 9160 open.
>
> At this time, it seems that either I am still missing something in my
> cluster configuration, or maybe there are other ports that are needed for
> inter-node communication.
>
> Any advice/suggestions would be appreciated.
>
> --
> Steve Robenalt
> Software Architect
> HighWire | Stanford University
> 425 Broadway St, Redwood City, CA 94063
>
> srobe...@stanford.edu
> http://highwire.stanford.edu
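
One way to narrow this down might be to check, from each node, whether the ports listed above actually accept TCP connections on the other nodes. Below is a minimal Python sketch under a few assumptions: the function name `check_tcp_ports` and the default port tuple are made up for illustration (the ports are simply the ones named in the message), and it only tests TCP connects. It says nothing about NTP, which runs over UDP port 123 and would need its own security group and network ACL rules.

```python
import socket

# Ports named in the original message: 7000 (Cassandra inter-node),
# 9042 (native CQL protocol), 9160 (Thrift). Adjust to your own config.
CASSANDRA_PORTS = (7000, 9042, 9160)


def check_tcp_ports(host, ports=CASSANDRA_PORTS, timeout=2.0):
    """Return {port: bool}, True when a TCP connect to host:port succeeds.

    Hypothetical helper for illustration; not part of Cassandra or any driver.
    """
    results = {}
    for port in ports:
        try:
            # create_connection raises OSError (including timeouts and
            # refused connections) when the port cannot be reached.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            results[port] = False
    return results
```

Running something like `check_tcp_ports("10.0.0.12")` from each node against the other two (the address here is a placeholder) should show quickly whether a port is blocked in one direction. A `False` only says the connect failed, not why, so the security group and the VPC network ACLs would both still need a look. Note that network ACLs are stateless, so return traffic on ephemeral ports has to be allowed explicitly.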