Re: Consistency level ONE

Sylvain Lebresne Tue, 29 Jun 2010 04:05:45 -0700

> Hi all,
>
> I'm having some issues with read consistency level ONE. The Wiki (and other 
> sources) say the following:
>
> Will return the record returned by the first node to respond. A consistency 
> check is always done in a background thread to fix any consistency issues 
> when ConsistencyLevel.ONE is used. This means subsequent calls will have 
> correct data even if the initial read gets an older value. (This is called 
> read repair.)
>
> However, when looking at the code, it seems that the read is only directed 
> towards the first node that is suitable (and alive). This means that a slow 
> node will cause slow responses even though my replication factor is > 1. I 
> would expect the read to go to all the suitable nodes and as soon as one of 
> those nodes responds, the reply is used (just as the documentation says).
>
> Moving to Quorum reads would solve part of this problem, but with one server 
> down and 1 slow one, I'm back to square one.


This would not solve part of the problem.

When you do a QUORUM read, the values of the requested columns are not
fetched from every replica. Instead, the value is requested from one node and
only a digest of the value is requested from the other nodes. This avoids
excessive traffic within the cluster (saving bandwidth and making reads more
efficient): under normal conditions you expect all the values to be identical,
so transferring all that data would be wasteful. Only if the value and a
digest do not match are the actual values requested.
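
The digest read described above can be sketched roughly as follows. This is
an illustration in Python, not Cassandra's actual (Java) code: the Replica
class, the quorum_read function, the use of MD5 and the "newest timestamp
wins" rule on mismatch are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass


def digest_of(value, timestamp):
    """Short fingerprint of a value (MD5 is an illustrative choice)."""
    return hashlib.md5(f"{value}:{timestamp}".encode()).hexdigest()


@dataclass
class Replica:
    """Hypothetical in-memory replica: key -> (value, timestamp)."""
    name: str
    store: dict

    def read_data(self, key):
        # Full read: ships the whole value over the wire.
        return self.store[key]

    def read_digest(self, key):
        # Digest read: ships only a fixed-size hash.
        value, timestamp = self.store[key]
        return digest_of(value, timestamp)


def quorum_read(key, replicas):
    """Ask one replica for the data and the others for digests only."""
    data_node, digest_nodes = replicas[0], replicas[1:]
    value, timestamp = data_node.read_data(key)
    expected = digest_of(value, timestamp)

    # Normal case: every digest matches, so no extra data is transferred.
    if all(r.read_digest(key) == expected for r in digest_nodes):
        return value

    # Mismatch: only now request the actual values from every replica.
    # Assumption for this sketch: the newest timestamp wins.
    candidates = [r.read_data(key) for r in replicas]
    return max(candidates, key=lambda vt: vt[1])[0]
```

In the common case only one full value crosses the network; the full values
are fetched from the other replicas only when a digest disagrees.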

The same goes for CL.ONE. The background consistency check only asks for
digests, which saves a lot of internal bandwidth.

Now, back to the slow node problem. The code already does its best to ask the
best-suited node: first by retrieving the data locally if possible, then by
using the EndpointSnitch, which you can configure to tell Cassandra which node
is best suited.
That leaves the case of a node that is slow because of a temporary problem,
either a network problem or because the node is overloaded and cannot keep up.
But Cassandra chooses to optimize for the normal case rather than the error
case, which I believe is the right choice.
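
As a rough illustration of the node selection described above (local node
first, then whatever the snitch considers closest), here is a small Python
sketch. It does not reproduce the real EndpointSnitch interface: the
topology map, the distance function and pick_read_target are hypothetical
names, and the rack/datacenter scoring is an assumption for the example.

```python
def distance(topology, local, node):
    """Hypothetical snitch metric: 0 same rack, 1 same datacenter, 2 other."""
    local_dc, local_rack = topology[local]
    node_dc, node_rack = topology[node]
    if node_dc != local_dc:
        return 2
    return 0 if node_rack == local_rack else 1


def sort_by_proximity(topology, local, replicas):
    """Put the local node first, then order the rest by snitch distance."""
    return sorted(
        replicas,
        key=lambda node: (node != local, distance(topology, local, node)),
    )


def pick_read_target(topology, local, replicas, is_alive):
    """Return the closest live replica; the read is sent to that node only."""
    for node in sort_by_proximity(topology, local, replicas):
        if is_alive(node):
            return node
    raise RuntimeError("no live replica for this key")
```

Note that a node that is alive but temporarily slow is still selected in this
sketch, which is exactly the trade-off discussed above: the choice is
optimized for the normal case.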

--
Sylvain

>
> Greetings,
>
> Wouter
>
