Tyler,
thanks for the detailed explanation.
I still have a few questions in mind.

#
When you said the coordinator sends a "read digest request" to the rest of
the replicas, do you mean all replicas in the current DC and the other DC?
Or just the one remaining replica in my current DC plus one coordinator
node in the other DC?

(Our reads and writes all use "local_quorum" with a replication factor of
3, and local_dc_repair_chance=0.)
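
To make the numbers concrete, here is a rough Python sketch of the
arithmetic as I understand it from your explanation. It assumes two DCs
with a replication factor of 3 in each (the remote RF is my assumption),
that a normal LOCAL_QUORUM read contacts only a quorum of local replicas,
and that a read-repair read contacts every replica in both DCs. The 10%
chance is the figure from your reply. The function names are made up for
illustration; they are not Cassandra APIs.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a quorum: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def expected_replica_requests(local_rf: int, remote_rf: int,
                              read_repair_chance: float) -> float:
    """Average replica requests per read (simplified model).

    Assumption: without read repair the coordinator contacts a local
    quorum; with read repair it contacts every replica in both DCs.
    """
    normal = quorum(local_rf)
    with_repair = local_rf + remote_rf
    return ((1 - read_repair_chance) * normal
            + read_repair_chance * with_repair)


if __name__ == "__main__":
    print("local quorum for RF=3:", quorum(3))
    print("avg replica requests per read:",
          expected_replica_requests(3, 3, 0.10))
```

Under these assumptions a local quorum is 2 replicas, and the average
fan-out rises from 2 to roughly 2.4 replica requests per read.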

#
The "read digest requests" to the other DC are sent sequentially, correct?
If network latency between the DCs is bad at that time, will that affect
overall read latency?

#
We observe that one of our CQL queries performs fine under normal load, but
degrades greatly when a batch of the same query (asking for exactly the
same columns and key) is sent to the server in a short period of time (say
100 of them within a second).
Our other tables and keyspaces don't see any latency degradation during
that time, so I am not sure we are hitting capacity yet. We therefore
suspect that read_repair_chance may have something to do with it.
Is there anything we can look into to see what may cause the latency spike
when a large number of identical CQL queries hit the server?

Thanks






On Wed, Nov 19, 2014 at 7:49 AM, Tyler Hobbs <ty...@datastax.com> wrote:

>
> On Sun, Nov 16, 2014 at 5:13 PM, Jimmy Lin <y2klyf+w...@gmail.com> wrote:
>
>> I have read that read repair is supposed to run in the background, but
>> does the coordinator node need to wait for the response (along with other
>> normal read tasks) before returning the entire result to the caller?
>>
>
> For the 10% of requests where read repair is triggered, the coordinator
> will send a request to every replica.  (A data request to two replicas,
> digest requests to the rest.)  Once enough replicas have replied to satisfy
> the consistency level, the result will be returned to the client; if
> there's a mismatch in the responses from the replicas, a blocking repair
> will be performed before responding to the client.  Later, in the
> background, the coordinator will check the remaining responses from
> replicas to see if they match up.  If any of them do not, they will be
> repaired in the background.
>
>
>>
>> #
>> How does a high rate of read repair impact performance? I read that it
>> will impact throughput but not latency. How so?
>>
>
> That's correct, it should impact throughput but not necessarily latency.
> Throughput is lower because more replicas have to do work, but latency is
> unaffected (unless you're hitting capacity) because blocking repair only
> happens under the same conditions that it normally does.
>
>
>>
>> #
>> Is it safe to just set read_repair_chance = 0?
>> (We mostly talk to one DC; the other DC serves as a backup/emergency DC
>> most of the time.)
>>
>
> Sure, it's safe enough.  People use read repair for different reasons.
> Some would say that RR keeps their other datacenter's caches warm. Others
> rely on it in place of normal repairs (which is not particularly safe, but
> if your consistency requirements allow for it, it's fine).  If you're
> running regular repairs anyway, it's safe to turn off read repair.
>
>
> --
> Tyler Hobbs
> DataStax <http://datastax.com/>
>
