Further information: in AZ1, when 143, 145, and 146 are all up, everything works. But when one node fails, say 143, the client receives a TIMEOUT failure, even though 145 and 146 are still up.
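
For reference, the "Blockfor is 2" value in the quoted log matches the usual quorum arithmetic, floor(RF / 2) + 1. Below is a minimal Python sketch of that arithmetic. It assumes the replication factor of 2 applies per data center (the keyspace definition is not shown in this thread, so that is an assumption), and the function names are illustrative only, not Cassandra APIs.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a quorum: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """Replicas that can be down while a quorum is still reachable."""
    return replication_factor - quorum(replication_factor)


# Compare the assumed per-DC replication factor of 2 with a factor of 3.
for rf in (2, 3):
    print(f"RF={rf}: blockfor={quorum(rf)}, tolerated failures={tolerated_failures(rf)}")
```

Under that assumption, a replication factor of 2 leaves no room for a replica of a given key to be down at LOCAL_QUORUM, which may be related to the failures seen when 143 is down. It does not by itself explain why some reads are sent to three nodes.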

From: Derek Williams [mailto:de...@fyrie.net]
Sent: Wednesday, March 20, 2013 11:50 AM
To: user@cassandra.apache.org
Subject: Re: Question regarding multi datacenter and LOCAL_QUORUM

I think I need help with pointing out what the problem is. The log you posted only contains references to 143, 145, and 146, which all appear to be in the same datacenter as 146?

On Wed, Mar 20, 2013 at 11:29 AM, Dwight Smith <dwight.sm...@genesyslab.com> wrote:

Hi

I have 2 data centers, with 3 nodes in each DC. Version 1.1.6, replication factor 2, topology properties:

# Cassandra Node IP=Data Center:Rack
xx.yy.zz.143=AZ1:RAC1
xx.yy.zz.145=AZ1:RAC1
xx.yy.zz.146=AZ1:RAC1
xx.yy.zz.147=AZ2:RAC2
xx.yy.zz.148=AZ2:RAC2
xx.yy.zz.149=AZ2:RAC2

Using LOCAL_QUORUM, my understanding was that reads/writes would be processed locally (for the coordinator), with requests sent to the remaining nodes in the DC, but in the system log for 146 I observe that this is not the case. Extract from the log:

DEBUG [Thrift:1] 2013-03-19 00:00:53,312 CassandraServer.java (line 306) get_slice
DEBUG [Thrift:1] 2013-03-19 00:00:53,313 ReadCallback.java (line 79) Blockfor is 2; setting up requests to /xx.yy.zz.146,/xx.yy.zz.143,/xx.yy.zz.145
DEBUG [Thrift:1] 2013-03-19 00:00:53,334 CassandraServer.java (line 306) get_slice
DEBUG [Thrift:1] 2013-03-19 00:00:53,334 ReadCallback.java (line 79) Blockfor is 2; setting up requests to /xx.yy.zz.146,/xx.yy.zz.143
DEBUG [Thrift:1] 2013-03-19 00:00:53,366 CassandraServer.java (line 306) get_slice
DEBUG [Thrift:1] 2013-03-19 00:00:53,367 ReadCallback.java (line 79) Blockfor is 2; setting up requests to /xx.yy.zz.146,/xx.yy.zz.143,/xx.yy.zz.145
DEBUG [Thrift:1] 2013-03-19 00:00:53,391 CassandraServer.java (line 589) batch_mutate
DEBUG [Thrift:1] 2013-03-19 00:00:53,418 CassandraServer.java (line 589) batch_mutate
DEBUG [Thrift:1] 2013-03-19 00:00:53,429 CassandraServer.java (line 306) get_slice
DEBUG [Thrift:1] 2013-03-19 00:00:53,429 ReadCallback.java (line 79) Blockfor is 2; setting up requests to /xx.yy.zz.146,/xx.yy.zz.145
DEBUG [Thrift:1] 2013-03-19 00:00:53,441 CassandraServer.java (line 306) get_slice
DEBUG [Thrift:1] 2013-03-19 00:00:53,441 ReadCallback.java (line 79) Blockfor is 2; setting up requests to /xx.yy.zz.146,/xx.yy.zz.143

The batch mutates are as expected: locally, two replicas, and hints to DC AZ2. But why the unexpected behavior for the get_slice requests? This is observed throughout the log.

Thanks much

--
Derek Williams