Re: Range Queries consistency in an inconsistent cluster.

Edward Capriolo Thu, 07 Feb 2013 07:52:27 -0800

Range queries do not currently perform read repair, although there is a
ticket open for this. If you want them to be consistent, run them at
QUORUM or ALL. As a strange quirk, because get_range_slices does not
repair, those operations are not "eventually consistent".
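
To make the QUORUM/ALL suggestion concrete for the topology in the quoted
message (replication DCa:2, DCb:2, so four replicas per key), here is a
minimal Python sketch of the replica-count arithmetic. It is plain
arithmetic, not a call into Cassandra. The helper names (quorum,
satisfiable, read_sees_latest_write) are made up for illustration, and the
thread does not say which consistency level the writes use, so the write
levels in the last lines are assumptions.

```python
# Toy arithmetic only; nothing here talks to a real cluster.

def quorum(replication_factor: int) -> int:
    """Replicas needed for a majority of `replication_factor` copies."""
    return replication_factor // 2 + 1

# Replication settings taken from the quoted message.
rf_per_dc = {"DCa": 2, "DCb": 2}
total_rf = sum(rf_per_dc.values())  # 4 replicas of every key

# Replicas that must answer a read at each consistency level.
required = {"ONE": 1, "QUORUM": quorum(total_rf), "ALL": total_rf}

def satisfiable(level: str, live_replicas: int) -> bool:
    """Can a read at `level` complete with this many live replicas?"""
    return live_replicas >= required[level]

def read_sees_latest_write(r: int, w: int, n: int) -> bool:
    """Read and write replica sets must overlap: R + W > N."""
    return r + w > n

live_with_a2_down = total_rf - 1
for level in required:
    print(level, required[level], satisfiable(level, live_with_a2_down))

# Assumed write levels (not stated in the thread): QUORUM and ONE.
print(read_sees_latest_write(required["QUORUM"], quorum(total_rf), total_rf))
print(read_sees_latest_write(required["QUORUM"], required["ONE"], total_rf))
```

With these numbers a QUORUM read needs 3 of the 4 replicas, so it can still
complete while one node is down, whereas ALL cannot. Whether the QUORUM
read is guaranteed to include an up-to-date replica depends on the write
level as well, which is what the last two lines check.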


On Thu, Feb 7, 2013 at 10:20 AM, Sergey Olefir <solf.li...@gmail.com> wrote:
> Hi,
>
> I'm somewhat lost with regard to the results I can expect from running range
> queries in a (temporarily) 'inconsistent' cluster (e.g. if a node has been
> down for some time and hasn't caught up yet).
>
> Suppose I have 4 nodes in 2 DCs (Cassandra 1.1.7):
> DCa: a1 and a2
> DCb: b1 and b2
> I'm using ByteOrderedPartitioner and the nodes are balanced (tokens are set
> properly to split data evenly in each DC, tokens in DCb are [DCa + 1]).
>
> I'm running with replication DCa:2, DCb:2 (each node contains the full data).
> I'm using counters only and I'm applying a heavy load (say 10k increments per
> second). The writes are directed to a1 and a2 only; b1 and b2 are for backup
> and possibly for running queries against (haven't decided yet). I monitor
> the cluster via nodetool and see that the data load is even on all nodes (as
> is expected).
>
> Now a2 goes down. I can immediately see that a1's data load grows very
> rapidly (because of hints for a2). After half an hour a2 comes back up. I
> know from experience that it'll take hours before all hints from a1 are
> sent to a2.
>
> What is going to happen with range queries directed to a1 & a2 while a2
> catches up?
>
> As far as I understand, there's no read-repair when doing range queries, so
> there's no usual assurance of "wrong once, correct next time around".
>
> - Does the consistency level setting apply to range queries?
> - If I direct a query to a1 (which is up-to-date), will it go to a2 for the
> slice that 'belongs' to a2? (even though a1 has a full replica of the data)
> - If I direct a query to a2 (which is NOT up-to-date), is it smart enough to
> go to a1 for data?
> - In general, considering I have a cluster with 3 nodes up-to-date and one
> that is not -- is there a way to run a query that'll return up-to-date data
> (i.e. will not use data from a2)?
>
>
> Also, what if a2 has been down for longer than the hint window (1 hour by
> default)? Is Cassandra smart enough to avoid using a2 for range queries
> while it is inconsistent?
>
> Thanks in advance,
> Sergey
>
>
>
> --
> View this message in context: 
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Range-Queries-consistency-in-an-inconsistent-cluster-tp7585400.html
> Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
> Nabble.com.
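
The "wrong once, correct next time around" point in the question above can
be illustrated with a toy model. This is only a sketch under loud
assumptions: replicas are plain Python dicts, values are reconciled by
newest timestamp (real counters are reconciled differently), and range_read
is a hypothetical helper, not Cassandra's get_range_slices. It shows that
reading a range from several replicas returns the newest data, but without
read repair the stale replica stays stale for later reads that touch only
that replica.

```python
from typing import Dict, List, Tuple

# Toy replica: row key -> (write timestamp, value). Assumed model only.
Replica = Dict[str, Tuple[int, int]]

def range_read(replicas: List[Replica], lo: str, hi: str) -> Replica:
    """Merge rows in [lo, hi] from the given replicas, newest timestamp wins.

    Nothing is written back to any replica, mimicking a read without
    read repair.
    """
    merged: Replica = {}
    for replica in replicas:
        for key, (ts, value) in replica.items():
            if lo <= key <= hi and (key not in merged or ts > merged[key][0]):
                merged[key] = (ts, value)
    return dict(sorted(merged.items()))

# a1 is up to date; a2 missed writes while it was down.
a1: Replica = {"k1": (2, 20), "k2": (2, 21)}
a2: Replica = {"k1": (1, 10)}

both = range_read([a1, a2], "k1", "k9")   # read that includes a1
stale_only = range_read([a2], "k1", "k9") # later read served by a2 alone

print(both)
print(stale_only)
```

In this model the first read is correct because it includes a1, yet the
second read is still wrong afterwards, since the first one repaired nothing
on a2.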
