Yes, the line is "Datacenter: datacenter1", which matches my CREATE KEYSPACE command. As for the NodeDiscoveryType, we will follow that suggestion, but I don't believe it is the root of my issue: the nodes start up at least 6 hours before the UnavailableException occurs, and we would only add nodes after hours.
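For reference, here is the arithmetic behind a QUORUM write at replication factor 3. This is only a sketch: QuorumCheck and its methods are hypothetical names for illustration, not part of Astyanax or Cassandra. It assumes a single datacenter, so the quorum is computed over that one replication factor. A coordinator raises UnavailableException when it believes fewer replicas are alive than the consistency level requires.

```java
// Hypothetical helper for illustration only; not an Astyanax or Cassandra API.
class QuorumCheck {
    // QUORUM needs a majority of the replicas: floor(RF / 2) + 1.
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // True when the coordinator would reject the request up front because
    // it sees fewer live replicas than the quorum size.
    static boolean wouldBeUnavailable(int replicationFactor, int liveReplicas) {
        return liveReplicas < quorum(replicationFactor);
    }

    public static void main(String[] args) {
        int rf = 3; // matches 'datacenter1': '3' in the keyspace definition
        System.out.println("quorum for RF=" + rf + ": " + quorum(rf));
        for (int live = rf; live >= 0; live--) {
            System.out.println(live + " live replicas -> unavailable="
                    + wouldBeUnavailable(rf, live));
        }
    }
}
```

So with RF=3 the coordinator would have to consider at least two of the three replicas for a token range down before rejecting a QUORUM write, even though nodetool status shows all 12 nodes as UN.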

On Mon, Jul 14, 2014 at 2:34 PM, Chris Lohfink <clohf...@blackbirdit.com> wrote:

> If you list all 12 nodes in seeds list, you can try using
> NodeDiscoveryType.NONE instead of RING_DESCRIBE.
>
> It's been recommended that way by some anyway, so if you add nodes to
> the cluster your app won't start using them until all bootstrapping and
> everything's settled down.
>
> Chris
>
> On Jul 14, 2014, at 12:04 PM, Ruchir Jha <ruchir....@gmail.com> wrote:
>
> Mark,
>
> Here you go:
>
> *NodeTool status:*
>
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address      Load     Tokens  Owns  Host ID                               Rack
> UN  10.10.20.15  1.62 TB  256     8.1%  01a01f07-4df2-4c87-98e9-8dd38b3e4aee  rack1
> UN  10.10.20.19  1.66 TB  256     8.3%  30ddf003-4d59-4a3e-85fa-e94e4adba1cb  rack1
> UN  10.10.20.35  1.62 TB  256     9.0%  17cb8772-2444-46ff-8525-33746514727d  rack1
> UN  10.10.20.31  1.64 TB  256     8.3%  1435acf9-c64d-4bcd-b6a4-abcec209815e  rack1
> UN  10.10.20.52  1.59 TB  256     9.1%  6b5aca07-1b14-4bc2-a7ba-96f026fa0e4e  rack1
> UN  10.10.20.27  1.66 TB  256     7.7%  76023cdd-c42d-4068-8b53-ae94584b8b04  rack1
> UN  10.10.20.22  1.66 TB  256     8.9%  46af9664-8975-4c91-847f-3f7b8f8d5ce2  rack1
> UN  10.10.20.39  1.68 TB  256     8.0%  b7d44c26-4d75-4d36-a779-b7e7bdaecbc9  rack1
> UN  10.10.20.45  1.49 TB  256     7.7%  8d6bce33-8179-4660-8443-2cf822074ca4  rack1
> UN  10.10.20.47  1.64 TB  256     7.9%  bcd51a92-3150-41ae-9c51-104ea154f6fa  rack1
> UN  10.10.20.62  1.59 TB  256     8.2%  84b47313-da75-4519-94f3-3951d554a3e5  rack1
> UN  10.10.20.51  1.66 TB  256     8.9%  0343cd58-3686-465f-8280-56fb72d161e2  rack1
>
> *Astyanax Connection Settings:*
>
> seeds :12
> maxConns :16
> maxConnsPerHost :16
> connectTimeout :2000
> socketTimeout :60000
> maxTimeoutCount :16
> maxBlockedThreadsPerHost:16
> maxOperationsPerConnection:16
> DiscoveryType: RING_DESCRIBE
> ConnectionPoolType: TOKEN_AWARE
> DefaultReadConsistencyLevel: CL_QUORUM
> DefaultWriteConsistencyLevel: CL_QUORUM
>
> On Fri, Jul 11, 2014 at 5:04 PM, Mark Reddy <mark.re...@boxever.com> wrote:
>
>> Can you post the output of nodetool status and your Astyanax connection
>> settings?
>>
>> On Fri, Jul 11, 2014 at 9:06 PM, Ruchir Jha <ruchir....@gmail.com> wrote:
>>
>>> This is how we create our keyspace. We just ran this command once
>>> through a cqlsh session on one of the nodes, so I don't quite understand
>>> what you mean by "check that your DC names match up".
>>>
>>> CREATE KEYSPACE prod WITH replication = {
>>>   'class': 'NetworkTopologyStrategy',
>>>   'datacenter1': '3'
>>> };
>>>
>>> On Fri, Jul 11, 2014 at 3:48 PM, Chris Lohfink <clohf...@blackbirdit.com> wrote:
>>>
>>>> What replication strategy are you using? If using
>>>> NetworkTopologyStrategy, double check that your DC names match up
>>>> (case sensitive).
>>>>
>>>> Chris
>>>>
>>>> On Jul 11, 2014, at 9:38 AM, Ruchir Jha <ruchir....@gmail.com> wrote:
>>>>
>>>> Here's the complete stack trace:
>>>>
>>>> com.netflix.astyanax.connectionpool.exceptions.TokenRangeOfflineException: TokenRangeOfflineException: [host=ny4lpcas5.fusionts.corp(10.10.20.47):9160, latency=22784(42874), attempts=3]UnavailableException()
>>>>     at com.netflix.astyanax.thrift.ThriftConverter.ToConnectionPoolException(ThriftConverter.java:165)
>>>>     at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:65)
>>>>     at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:28)
>>>>     at com.netflix.astyanax.thrift.ThriftSyncConnectionFactoryImpl$ThriftConnection.execute(ThriftSyncConnectionFactoryImpl.java:151)
>>>>     at com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:69)
>>>>     at com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:256)
>>>>     at com.netflix.astyanax.thrift.ThriftKeyspaceImpl.executeOperation(ThriftKeyspaceImpl.java:485)
>>>>     at com.netflix.astyanax.thrift.ThriftKeyspaceImpl.access$000(ThriftKeyspaceImpl.java:79)
>>>>     at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1.execute(ThriftKeyspaceImpl.java:123)
>>>> Caused by: UnavailableException()
>>>>     at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:20841)
>>>>     at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
>>>>     at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:964)
>>>>     at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:950)
>>>>     at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1$1.internalExecute(ThriftKeyspaceImpl.java:129)
>>>>     at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1$1.internalExecute(ThriftKeyspaceImpl.java:126)
>>>>     at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:60)
>>>>     ... 12 more
>>>>
>>>> On Fri, Jul 11, 2014 at 9:11 AM, Prem Yadav <ipremya...@gmail.com> wrote:
>>>>
>>>>> Please post the full exception.
>>>>>
>>>>> On Fri, Jul 11, 2014 at 1:50 PM, Ruchir Jha <ruchir....@gmail.com> wrote:
>>>>>
>>>>>> We have a 12 node cluster and we are consistently seeing this
>>>>>> exception being thrown during peak write traffic. We have a replication
>>>>>> factor of 3 and a write consistency level of QUORUM. Also note there is
>>>>>> no unusual or full GC activity during this time. Appreciate any help.
>>>>>>
>>>>>> Sent from my iPhone