I've been bugging you about this a few times already, but now I've got trace data for a query with LOCAL_QUORUM that is being sent to a remote data center.
The setup is as follows:

NetworkTopologyStrategy: {"DC1":"1","DC2":"2"}

Both DC1 and DC2 have 2 nodes. In DC2, one node is currently being rebuilt and therefore does not contain all data (yet).

The client app connects to a node in DC1 and sends a SELECT query with CL LOCAL_QUORUM. With RF 1 in the local DC, the quorum is (1/2)+1 = 1 (integer division), i.e. one replica in DC1 must respond. If all is well, the query should always return a result, because the requested rows are guaranteed to be present in DC1.

However, the query sometimes returns no result. I've been able to record the traces of these queries, and it turns out that the coordinator node in DC1 sometimes sends the query to DC2, to the node that is being rebuilt and does not yet have the requested rows.

I've included an example trace below. The coordinator node is 10.55.156.67, which is in DC1; 10.88.4.194 is the node in DC2. I've verified that CL=LOCAL_QUORUM by printing it when the query is sent (I'm using the DataStax Java driver; see the P.S. below for roughly what that code looks like).

activity                                                                   | source       | source_elapsed | thread
---------------------------------------------------------------------------+--------------+----------------+-----------------------------------------
Message received from /10.55.156.67                                        | 10.88.4.194  | 48             | MessagingService-Incoming-/10.55.156.67
Executing single-partition query on aggregate                              | 10.88.4.194  | 286            | SharedPool-Worker-2
Acquiring sstable references                                               | 10.88.4.194  | 306            | SharedPool-Worker-2
Merging memtable tombstones                                                | 10.88.4.194  | 321            | SharedPool-Worker-2
Partition index lookup allows skipping sstable 107                         | 10.88.4.194  | 458            | SharedPool-Worker-2
Bloom filter allows skipping sstable 1                                     | 10.88.4.194  | 489            | SharedPool-Worker-2
Skipped 0/2 non-slice-intersecting sstables, included 0 due to tombstones  | 10.88.4.194  | 496            | SharedPool-Worker-2
Merging data from memtables and 0 sstables                                 | 10.88.4.194  | 500            | SharedPool-Worker-2
Read 0 live and 0 tombstone cells                                          | 10.88.4.194  | 513            | SharedPool-Worker-2
Enqueuing response to /10.55.156.67                                        | 10.88.4.194  | 613            | SharedPool-Worker-2
Sending message to /10.55.156.67                                           | 10.88.4.194  | 672            | MessagingService-Outgoing-/10.55.156.67
Parsing SELECT * FROM Aggregate WHERE type=? AND typeId=?;                 | 10.55.156.67 | 10             | SharedPool-Worker-4
Sending message to /10.88.4.194                                            | 10.55.156.67 | 4335           | MessagingService-Outgoing-/10.88.4.194
Message received from /10.88.4.194                                         | 10.55.156.67 | 6328           | MessagingService-Incoming-/10.88.4.194
Seeking to partition beginning in data file                                | 10.55.156.67 | 10417          | SharedPool-Worker-3
Key cache hit for sstable 389                                              | 10.55.156.67 | 10586          | SharedPool-Worker-3

My question is: how is it possible that the query is sent to a node in DC2? Since DC1 has 2 nodes and an RF of 1 there, the query should always be sent to the other node in DC1 if the coordinator itself does not hold the replica, right?

Thanks,
Tom
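P.S. For reference, this is roughly how the statement is built and how I print the CL. It's a minimal sketch against the 2.x DataStax Java driver API; the class name, keyspace name ("myks"), and bind values are placeholders, the contact point is the DC1 coordinator from the trace above:

import com.datastax.driver.core.BoundStatement;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

public class CLCheck {
    public static void main(String[] args) {
        // Connect via a node in DC1 (the coordinator seen in the trace).
        Cluster cluster = Cluster.builder()
                .addContactPoint("10.55.156.67")
                .build();
        Session session = cluster.connect("myks"); // keyspace name is a placeholder

        // Prepare and bind the query from the trace; bind values are placeholders.
        PreparedStatement ps = session.prepare(
                "SELECT * FROM Aggregate WHERE type=? AND typeId=?;");
        BoundStatement stmt = ps.bind("someType", "someTypeId");
        stmt.setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);

        // The check mentioned above: print the CL right before executing.
        System.out.println("CL = " + stmt.getConsistencyLevel()); // prints LOCAL_QUORUM

        session.execute(stmt);
        cluster.close();
    }
}

This prints "CL = LOCAL_QUORUM" for every query, so as far as I can tell the consistency level really is set on the statement that goes out.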