Start with http://wiki.apache.org/cassandra/ReadRepair.  A rising read
repair count just means you are doing reads at a consistency level below
CL.ALL and have the CF configured to perform read repair (RR).
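
To make that concrete, here is a toy model of a single read. It is only a sketch under stated assumptions, not Cassandra's actual code path: the `read` function, the `(timestamp, value)` cell layout and the `read_repair_chance` argument are invented for illustration. In this model the background check is scheduled whenever a read is done below CL.ALL with RR enabled, whether or not any replica turns out to be stale.

```python
import random

def read(replicas, key, consistency_level, read_repair_chance, rng):
    """Toy model of one read (hypothetical helper, not a Cassandra API).

    Each replica is a dict mapping key -> (timestamp, value).
    Returns (value, repair_task_ran, replicas_repaired).
    """
    # The client is answered from the first `consistency_level` replicas;
    # the cell with the highest timestamp wins.
    blocking = replicas[:consistency_level]
    newest = max((r[key] for r in blocking), key=lambda cell: cell[0])

    # Below CL.ALL, with RR enabled, the remaining replicas are checked in
    # the background. This check is what the counter tracks in this model.
    run_repair = (consistency_level < len(replicas)
                  and rng.random() < read_repair_chance)
    repaired = 0
    if run_repair:
        newest_all = max((r[key] for r in replicas), key=lambda cell: cell[0])
        for r in replicas:
            if r[key][0] < newest_all[0]:
                r[key] = newest_all  # push the newest cell to the stale replica
                repaired += 1
    return newest[1], run_repair, repaired

# RF=3: A and B already have <ts2>, C is still at <ts1>.
replicas = [{"k": (2, "v2")}, {"k": (2, "v2")}, {"k": (1, "v1")}]
rng = random.Random(0)
first = read(replicas, "k", 2, 1.0, rng)   # repairs C
second = read(replicas, "k", 2, 1.0, rng)  # check still runs, nothing to fix
```

In this sketch the second read still runs the background check even though all three replicas agree, so a task counter tied to that check keeps climbing on a fully consistent cluster.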

On Wed, Oct 5, 2011 at 12:37 PM, Ramesh Natarajan <rames...@gmail.com> wrote:
> I have a 12-node Cassandra cluster running with RF=3.  I have several
> clients (all running on a single node) connecting to the cluster with a
> fixed client-to-node mapping, performing inserts, updates, selects,
> and deletes. Each client has a fixed mapping of row keys and always
> connects to the same node. The timestamp on the client node is used for
> all operations.  All operations are done at CL QUORUM.
>
> When I run tpstats I see the ReadRepair count consistently
> increasing. I need to figure out why ReadRepair is happening.
>
> One scenario I can think of is that it happens when there is a delay
> in updating the nodes to reach eventual consistency.
>
> Let's say I have 3 nodes (RF=3): A, B, C. I insert <key> with timestamp
> <ts1> to A, and the call returns as soon as the record is written
> to A and B. At some later point this information is sent to C.
>
> A while later A,B,C have the same data with the same timestamp.
>
> A <key,ts1>
> B <key, ts1> and
> C <key, ts1>
>
> When I update <key> on A with timestamp <ts2>, the call
> returns as soon as the record is written to A and B.
> Now the data is
>
> A <key,ts2>
> B <key,ts2>
> C <key,ts1>
>
> Assuming I query for <key> and A and C respond: since they disagree,
> there is no QUORUM, so it waits for B to respond. When A and B match,
> the response is returned to the client and a ReadRepair is sent to C.
>
> This could happen only when C is running behind in catching up on the
> updates to A and B.  Are there any stats that would let me know if the
> system is in a consistent state?
>
> thanks
> Ramesh
>
>
> tpstats_2011-10-05_12:50:01:ReadRepairStage  0  0  43569781  0  0
> tpstats_2011-10-05_12:55:01:ReadRepairStage  0  0  43646420  0  0
> tpstats_2011-10-05_13:00:02:ReadRepairStage  0  0  43725850  0  0
> tpstats_2011-10-05_13:05:01:ReadRepairStage  0  0  43790047  0  0
> tpstats_2011-10-05_13:10:02:ReadRepairStage  0  0  43869704  0  0
> tpstats_2011-10-05_13:15:01:ReadRepairStage  0  0  43945635  0  0
> tpstats_2011-10-05_13:20:01:ReadRepairStage  0  0  44020406  0  0
> tpstats_2011-10-05_13:25:02:ReadRepairStage  0  0  44093227  0  0
> tpstats_2011-10-05_13:30:01:ReadRepairStage  0  0  44167455  0  0
> tpstats_2011-10-05_13:35:02:ReadRepairStage  0  0  44247519  0  0
> tpstats_2011-10-05_13:40:01:ReadRepairStage  0  0  44312726  0  0
> tpstats_2011-10-05_13:45:01:ReadRepairStage  0  0  44387633  0  0
> tpstats_2011-10-05_13:50:01:ReadRepairStage  0  0  44443683  0  0
> tpstats_2011-10-05_13:55:02:ReadRepairStage  0  0  44499487  0  0
> tpstats_2011-10-05_14:00:01:ReadRepairStage  0  0  44578656  0  0
> tpstats_2011-10-05_14:05:01:ReadRepairStage  0  0  44647555  0  0
> tpstats_2011-10-05_14:10:02:ReadRepairStage  0  0  44716730  0  0
> tpstats_2011-10-05_14:15:01:ReadRepairStage  0  0  44776644  0  0
> tpstats_2011-10-05_14:20:01:ReadRepairStage  0  0  44840237  0  0
> tpstats_2011-10-05_14:25:01:ReadRepairStage  0  0  44891444  0  0
> tpstats_2011-10-05_14:30:01:ReadRepairStage  0  0  44931105  0  0
> tpstats_2011-10-05_14:35:02:ReadRepairStage  0  0  44976801  0  0
> tpstats_2011-10-05_14:40:01:ReadRepairStage  0  0  45042220  0  0
> tpstats_2011-10-05_14:45:01:ReadRepairStage  0  0  45112141  0  0
> tpstats_2011-10-05_14:50:02:ReadRepairStage  0  0  45177816  0  0
> tpstats_2011-10-05_14:55:02:ReadRepairStage  0  0  45246675  0  0
> tpstats_2011-10-05_15:00:01:ReadRepairStage  0  0  45309533  0  0
> tpstats_2011-10-05_15:05:01:ReadRepairStage  0  0  45357575  0  0
> tpstats_2011-10-05_15:10:01:ReadRepairStage  0  0  45405943  0  0
> tpstats_2011-10-05_15:15:01:ReadRepairStage  0  0  45458435  0  0
> tpstats_2011-10-05_15:20:01:ReadRepairStage  0  2  45508253  0  0
> tpstats_2011-10-05_15:25:01:ReadRepairStage  0  0  45570375  0  0
> tpstats_2011-10-05_15:30:01:ReadRepairStage  0  0  45628426  0  0
> tpstats_2011-10-05_15:35:01:ReadRepairStage  0  0  45688694  0  0
> tpstats_2011-10-05_15:40:01:ReadRepairStage  0  3  45743029  0  0
> tpstats_2011-10-05_15:45:02:ReadRepairStage  0  0  45801167  0  0
> tpstats_2011-10-05_15:50:02:ReadRepairStage  0  0  45837329  0  0
> tpstats_2011-10-05_15:55:01:ReadRepairStage  0  0  45890326  0  0
> tpstats_2011-10-05_16:00:01:ReadRepairStage  0  0  45951703  0  0
> tpstats_2011-10-05_16:05:02:ReadRepairStage  0  0  46010736  0  0
> tpstats_2011-10-05_16:10:01:ReadRepairStage  0  0  46063294  0  0
> tpstats_2011-10-05_16:15:01:ReadRepairStage  0  0  46108327  0  0
> tpstats_2011-10-05_16:20:01:ReadRepairStage  0  0  46142291  0  0
>
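
The quoted samples are easier to read as a rate than as a raw counter. The sketch below turns them into ReadRepairStage tasks per second for each interval. It assumes the third numeric column is the completed-task count (the usual tpstats column order, which the paste does not show) and that each line is prefixed with the `tpstats_<date>_<time>:` file name as above. `completed_rates` is a hypothetical helper name.

```python
import re
from datetime import datetime

# Matches "tpstats_<date>_<time>:ReadRepairStage <active> <pending> <completed>".
# \s+ also spans line breaks, so records wrapped by the mail client still match.
LINE = re.compile(
    r"tpstats_(\d{4}-\d{2}-\d{2}_\d{2}:\d{2}:\d{2}):ReadRepairStage"
    r"\s+(\d+)\s+(\d+)\s+(\d+)"
)

def completed_rates(text):
    """Return completed ReadRepairStage tasks per second between samples."""
    # Drop email quote markers so wrapped records become plain whitespace.
    text = re.sub(r"^>\s?", "", text, flags=re.M)
    samples = []
    for m in LINE.finditer(text):
        ts = datetime.strptime(m.group(1), "%Y-%m-%d_%H:%M:%S")
        samples.append((ts, int(m.group(4))))  # assumed "Completed" column
    rates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        rates.append((c1 - c0) / (t1 - t0).total_seconds())
    return rates

sample = """\
> tpstats_2011-10-05_12:50:01:ReadRepairStage                   0
>  0       43569781         0                 0
> tpstats_2011-10-05_12:55:01:ReadRepairStage                   0
>  0       43646420         0                 0
> tpstats_2011-10-05_13:00:02:ReadRepairStage                   0
>  0       43725850         0                 0
"""
rates = completed_rates(sample)
```

A rate that tracks the read rate would be consistent with the counter simply following reads done below CL.ALL, rather than signalling replicas that have fallen behind.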



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
