There are two processes in Cassandra that trigger Read Repair-like behaviour.
During a read, a DigestMismatchException is raised if the responses from the replicas do not match. In this case another read is run that involves reading all the data. This is the CL level agreement kicking in.

The other "Read Repair" is the one controlled by "read_repair_chance". When RR is active on a request, ALL up replicas are involved in the read. When RR is not active, only CL replicas are involved. The test for CL agreement occurs synchronously to the request; the RR check waits asynchronously to the request for all nodes in the request to return. It then checks for consistency and repairs differences.

> From looking at the source code, I do not understand how this set is built
> and I do not understand how the reconciliation is executed.

When a DigestMismatch is detected, a read is run using RepairCallback. The callback will call RowRepairResolver.resolve() when enough responses have been collected.

resolveSuperset() picks one response to be the baseline, and then calls delete() to apply row level deletes from the other responses (ColumnFamily instances). It collects the other CFs into an iterator with a filter that returns all columns. The columns are then applied to the baseline CF, which may result in reconcile() being called. reconcile() is used when an AbstractColumnContainer has two versions of a column and it wants to only have one.

RowRepairResolver.scheduleRepairs() works out the delta for each node by calling ColumnFamily.diff(). The delta is then sent to the appropriate node.

Hope that helps.

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 19/10/2012, at 6:33 AM, Markus Klems <markuskl...@gmail.com> wrote:

> Hi guys,
>
> I am looking through the Cassandra source code in the github trunk to better
> understand how Cassandra's fault-tolerance mechanisms work. Most things make
> sense. I am also aware of the wiki and DataStax documentation. However, I do
> not understand what read repair does in detail.
> The method RowRepairResolver.resolveSuperset(Iterable<ColumnFamily> versions) seems to
> do the trick of merging conflicting versions of column family replicas and
> builds the set of columns that need to be "repaired". From looking at the
> source code, I do not understand how this set is built and I do not
> understand how the reconciliation is executed. ReadRepair does not seem to
> trigger a Column.reconcile() to reconcile conflicting column versions on
> different servers. Does it?
>
> If this is not what read repair does, then: What kind of inconsistencies are
> resolved by read repair? And: How are the inconsistencies resolved?
>
> Could someone give me a hint?
>
> Thanks so much,
>
> -Markus
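
For anyone following the thread, the merge-then-diff flow described in the reply can be sketched roughly as below. This is a simplified model and not Cassandra's actual code: the class ReadRepairSketch, its Column type, and the map-based representation of a replica's response are all hypothetical, row level deletes and tombstones are left out, and reconcile() here assumes the simple rule that the column with the higher timestamp wins.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified model of read-repair reconciliation (not Cassandra's classes). */
final class ReadRepairSketch {

    /** Stand-in for a column: a value plus the timestamp of the write. */
    static final class Column {
        final String value;
        final long timestamp;

        Column(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    /** Two versions of one column: keep the one with the higher timestamp (assumed rule). */
    static Column reconcile(Column current, Column incoming) {
        return incoming.timestamp > current.timestamp ? incoming : current;
    }

    /** Merge all replica responses into one superset, reconciling duplicate columns. */
    static Map<String, Column> resolveSuperset(List<Map<String, Column>> versions) {
        Map<String, Column> resolved = new HashMap<>();
        for (Map<String, Column> version : versions) {
            for (Map.Entry<String, Column> e : version.entrySet()) {
                // merge() only calls reconcile() when the column is already present.
                resolved.merge(e.getKey(), e.getValue(), ReadRepairSketch::reconcile);
            }
        }
        return resolved;
    }

    /** Delta for one replica: columns it is missing or holds an older version of. */
    static Map<String, Column> diff(Map<String, Column> replica, Map<String, Column> resolved) {
        Map<String, Column> delta = new HashMap<>();
        for (Map.Entry<String, Column> e : resolved.entrySet()) {
            Column have = replica.get(e.getKey());
            if (have == null || have.timestamp < e.getValue().timestamp) {
                delta.put(e.getKey(), e.getValue());
            }
        }
        return delta;
    }

    public static void main(String[] args) {
        Map<String, Column> replicaA = new HashMap<>();
        replicaA.put("name", new Column("alice", 1L));
        replicaA.put("email", new Column("a@example.com", 1L));

        Map<String, Column> replicaB = new HashMap<>();
        replicaB.put("name", new Column("alicia", 2L)); // newer write, no email column

        Map<String, Column> resolved = resolveSuperset(Arrays.asList(replicaA, replicaB));
        System.out.println(diff(replicaA, resolved).keySet()); // [name]
        System.out.println(diff(replicaB, resolved).keySet()); // [email]
    }
}
```

In this model each replica receives only the columns it lacks or holds in an older version, which corresponds to the per-node delta that scheduleRepairs() is described as sending.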