Start with http://wiki.apache.org/cassandra/ReadRepair. Read repair count increasing just means you were doing reads at < CL.ALL, and had the CF configured to perform RR.
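
To make that concrete, here is a toy model of a coordinator doing QUORUM reads against RF=3. It is only a sketch under stated assumptions, not Cassandra's code: the class names, the `rr_tasks` and `rr_writes` counters, and the idea that each background consistency check counts as one completed task are all assumptions made for illustration. The `read_repair_chance` parameter is modeled loosely on the per-CF setting of the same name.

```python
import random


class Replica:
    """One replica holding key -> (timestamp, value); newest timestamp wins."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def write(self, key, ts, value):
        current = self.data.get(key)
        if current is None or ts > current[0]:
            self.data[key] = (ts, value)

    def read(self, key):
        return self.data.get(key)


def newest(answers):
    """Return the (timestamp, value) pair with the highest timestamp, or None."""
    present = [a for a in answers if a is not None]
    return max(present, key=lambda a: a[0]) if present else None


class Coordinator:
    """Hypothetical coordinator: answers from a quorum, then checks every replica."""

    def __init__(self, replicas, read_repair_chance=1.0, seed=0):
        self.replicas = replicas
        self.read_repair_chance = read_repair_chance
        self.rng = random.Random(seed)
        self.rr_tasks = 0   # background checks run (assumed analogue of "Completed")
        self.rr_writes = 0  # replicas that actually needed an update

    def read_quorum(self, key):
        quorum = len(self.replicas) // 2 + 1
        # The client's answer comes from the first `quorum` replicas only.
        result = newest(r.read(key) for r in self.replicas[:quorum])
        # The background check runs whether or not the replicas disagree.
        if self.rng.random() < self.read_repair_chance:
            self.rr_tasks += 1
            latest = newest(r.read(key) for r in self.replicas)
            for r in self.replicas:
                if latest is not None and r.read(key) != latest:
                    r.write(key, *latest)
                    self.rr_writes += 1
        return result


if __name__ == "__main__":
    a, b, c = Replica("A"), Replica("B"), Replica("C")
    coord = Coordinator([a, b, c])
    for r in (a, b, c):
        r.write("key", 1, "v1")
    for _ in range(100):
        coord.read_quorum("key")
    print(coord.rr_tasks, coord.rr_writes)
```

In this model the task counter climbs with every read even though all three replicas already agree, while the number of replicas actually rewritten stays at zero. If that assumption matches your version, a growing ReadRepairStage count on its own says nothing about how often the replicas were out of sync.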

On Wed, Oct 5, 2011 at 12:37 PM, Ramesh Natarajan <rames...@gmail.com> wrote:
> I have a 12 node cassandra cluster running with RF=3. I have several
> clients (all running on a single node) connecting to the cluster
> (fixed client - node mapping) that do inserts, updates, selects
> and deletes. Each client has a fixed mapping of the row-keys and always
> connects to the same node. The timestamp on the client node is used for
> all operations. All operations are done using CL QUORUM.
>
> When I run tpstats I see the ReadRepair count consistently
> increasing. I need to figure out why ReadRepair is happening.
>
> One scenario I can think of is that it could happen when there is a
> delay in updating the nodes to reach eventual consistency.
>
> Let's say I have 3 nodes (RF=3) A,B,C. I insert <key> with timestamp
> <ts1> to A and the call will return as soon as it inserts the record
> to A and B. At some later point this information is sent to C.
>
> A while later A,B,C have the same data with the same timestamp.
>
> A <key,ts1>
> B <key,ts1> and
> C <key,ts1>
>
> When I update <key> with timestamp <ts2> on A, the call will
> return as soon as it inserts the record to A and B.
> Now the data is
>
> A <key,ts2>
> B <key,ts2>
> C <key,ts1>
>
> Assuming I query for <key> and A,C respond, since their answers do not
> match there is no QUORUM, so it waits for B to respond. When A,B match,
> the response is returned to the client and ReadRepair is sent to C.
>
> This could happen only when C is running behind in catching up with the
> updates to A,B. Are there any stats that would let me know if the
> system is in a consistent state?
>
> thanks
> Ramesh
>
>
> tpstats_2011-10-05_12:50:01:ReadRepairStage 0 0 43569781 0 0
> tpstats_2011-10-05_12:55:01:ReadRepairStage 0 0 43646420 0 0
> tpstats_2011-10-05_13:00:02:ReadRepairStage 0 0 43725850 0 0
> tpstats_2011-10-05_13:05:01:ReadRepairStage 0 0 43790047 0 0
> tpstats_2011-10-05_13:10:02:ReadRepairStage 0 0 43869704 0 0
> tpstats_2011-10-05_13:15:01:ReadRepairStage 0 0 43945635 0 0
> tpstats_2011-10-05_13:20:01:ReadRepairStage 0 0 44020406 0 0
> tpstats_2011-10-05_13:25:02:ReadRepairStage 0 0 44093227 0 0
> tpstats_2011-10-05_13:30:01:ReadRepairStage 0 0 44167455 0 0
> tpstats_2011-10-05_13:35:02:ReadRepairStage 0 0 44247519 0 0
> tpstats_2011-10-05_13:40:01:ReadRepairStage 0 0 44312726 0 0
> tpstats_2011-10-05_13:45:01:ReadRepairStage 0 0 44387633 0 0
> tpstats_2011-10-05_13:50:01:ReadRepairStage 0 0 44443683 0 0
> tpstats_2011-10-05_13:55:02:ReadRepairStage 0 0 44499487 0 0
> tpstats_2011-10-05_14:00:01:ReadRepairStage 0 0 44578656 0 0
> tpstats_2011-10-05_14:05:01:ReadRepairStage 0 0 44647555 0 0
> tpstats_2011-10-05_14:10:02:ReadRepairStage 0 0 44716730 0 0
> tpstats_2011-10-05_14:15:01:ReadRepairStage 0 0 44776644 0 0
> tpstats_2011-10-05_14:20:01:ReadRepairStage 0 0 44840237 0 0
> tpstats_2011-10-05_14:25:01:ReadRepairStage 0 0 44891444 0 0
> tpstats_2011-10-05_14:30:01:ReadRepairStage 0 0 44931105 0 0
> tpstats_2011-10-05_14:35:02:ReadRepairStage 0 0 44976801 0 0
> tpstats_2011-10-05_14:40:01:ReadRepairStage 0 0 45042220 0 0
> tpstats_2011-10-05_14:45:01:ReadRepairStage 0 0 45112141 0 0
> tpstats_2011-10-05_14:50:02:ReadRepairStage 0 0 45177816 0 0
> tpstats_2011-10-05_14:55:02:ReadRepairStage 0 0 45246675 0 0
> tpstats_2011-10-05_15:00:01:ReadRepairStage 0 0 45309533 0 0
> tpstats_2011-10-05_15:05:01:ReadRepairStage 0 0 45357575 0 0
> tpstats_2011-10-05_15:10:01:ReadRepairStage 0 0 45405943 0 0
> tpstats_2011-10-05_15:15:01:ReadRepairStage 0 0 45458435 0 0
> tpstats_2011-10-05_15:20:01:ReadRepairStage 0 2 45508253 0 0
> tpstats_2011-10-05_15:25:01:ReadRepairStage 0 0 45570375 0 0
> tpstats_2011-10-05_15:30:01:ReadRepairStage 0 0 45628426 0 0
> tpstats_2011-10-05_15:35:01:ReadRepairStage 0 0 45688694 0 0
> tpstats_2011-10-05_15:40:01:ReadRepairStage 0 3 45743029 0 0
> tpstats_2011-10-05_15:45:02:ReadRepairStage 0 0 45801167 0 0
> tpstats_2011-10-05_15:50:02:ReadRepairStage 0 0 45837329 0 0
> tpstats_2011-10-05_15:55:01:ReadRepairStage 0 0 45890326 0 0
> tpstats_2011-10-05_16:00:01:ReadRepairStage 0 0 45951703 0 0
> tpstats_2011-10-05_16:05:02:ReadRepairStage 0 0 46010736 0 0
> tpstats_2011-10-05_16:10:01:ReadRepairStage 0 0 46063294 0 0
> tpstats_2011-10-05_16:15:01:ReadRepairStage 0 0 46108327 0 0
> tpstats_2011-10-05_16:20:01:ReadRepairStage 0 0 46142291 0 0
>

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com