Range queries do not currently perform read repair, although there is a ticket open for this. If you want them to be consistent, run them at QUORUM or ALL. As a strange quirk, because get_range_slices does not trigger read repair, those operations are not "eventually consistent".
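To make the point above concrete, here is a rough toy model in Python. It is not Cassandra code and it is not how the server is implemented. It assumes plain timestamped cells (not counters), four replicas holding the full data set as in the question below, and that the writes themselves reached a quorum of replicas. The names `Cell`, `quorum` and `range_read` are made up for illustration. It only shows that a read touching a quorum of replicas always overlaps an up-to-date one, while a CL ONE read served by the lagging node can be stale, and that without read repair the lagging node is not fixed by being read.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Cell:
    """Hypothetical cell: a value plus the timestamp of the write."""
    value: int
    timestamp: int


def quorum(replication_factor: int) -> int:
    """Majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def range_read(replicas, contacted):
    """Merge the rows held by the contacted replicas; newest timestamp wins.

    Nothing is written back to the replicas, which mimics the lack of
    read repair on range queries.
    """
    merged = {}
    for name in contacted:
        for key, cell in replicas[name].items():
            if key not in merged or cell.timestamp > merged[key].timestamp:
                merged[key] = cell
    return {key: merged[key].value for key in sorted(merged)}


# a2 was down and missed the later writes; the other three nodes have them.
fresh = {"row1": Cell(10, 2), "row2": Cell(20, 2)}
stale = {"row1": Cell(1, 1)}
replicas = {"a1": dict(fresh), "a2": dict(stale), "b1": dict(fresh), "b2": dict(fresh)}

expected = {"row1": 10, "row2": 20}
needed = quorum(len(replicas))

# Every possible set of `needed` replicas includes at least one fresh node.
quorum_results = [range_read(replicas, group) for group in combinations(replicas, needed)]
# A CL ONE read that happens to be served by a2 alone.
one_from_a2 = range_read(replicas, ["a2"])

print("replicas needed for QUORUM:", needed)
print("all QUORUM reads up to date:", all(r == expected for r in quorum_results))
print("CL ONE read served by a2:", one_from_a2)
print("a2 still stale after the reads:", replicas["a2"] == stale)
```

In this model a2 stays stale no matter how many range reads run against it; only hints or a repair would bring it back in line.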
On Thu, Feb 7, 2013 at 10:20 AM, Sergey Olefir <solf.li...@gmail.com> wrote:

> Hi,
>
> I'm somewhat lost in regards to the results I can expect from running range
> queries in a (temporarily) 'inconsistent' cluster (e.g. if node has been
> down for some time and hasn't caught up yet).
>
> Suppose I have 4 nodes in 2 DCs (cassandra 1.1.7):
> DCa: a1 and a2
> DCb: b1 and b2
> I'm using ByteOrdered partitioner and nodes are balanced (tokens are set
> properly to split data evenly in each DC, tokens in DCb are [DCa + 1]).
>
> I'm running with replication DCa:2, DCb:2 (each node contains full data).
> I'm using counters only and I'm putting heavy load (say 10k increments per
> second). The writes are directed to a1 and a2 only, b1 and b2 are for backup
> and possibly for running queries against (haven't decided yet). I monitor
> cluster via nodetool and see that data load is even on all nodes (as is
> expected).
>
> Now a2 goes down. I can immediately see that a1 data load grows very-very
> rapidly (because of hints for a2). After half an hour a2 comes back up. I
> know from experience that it'll take hours before all hints from a1 will be
> sent to a2.
>
> What is going to happen with range queries directed to a1 & a2 while a2
> catches up?
>
> As far as I understand, there's no read-repair when doing range queries, so
> there's no usual assurance of "wrong once, correct next time around".
>
> - Does consistency level setting apply to range queries?
> - If I direct query to a1 (which is up-to-date), will it go to a2 for the
> slice that 'belongs' to a2? (even though a1 has full replica of data)
> - If I direct query to a2 (which is NOT up-to-date), is it smart enough to
> go to a1 for data?
> - In general, considering I have a cluster with 3 nodes up-to-date and one
> that is not -- is there a way to run a query that'll return up-to-date data
> (i.e. will not use data from a2)?
>
>
> Also, what if a2 has been down for longer than hints window (1 hour by
> default)? Is Cassandra smart enough to avoid using a2 for range queries
> while it is inconsistent?
>
> Thanks in advance,
> Sergey
>
>
>
> --
> View this message in context:
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Range-Queries-consistency-in-an-inconsistent-cluster-tp7585400.html
> Sent from the cassandra-u...@incubator.apache.org mailing list archive at
> Nabble.com.