There are two processes in Cassandra that trigger Read Repair-like behaviour.

During a read, a DigestMismatchException is raised if the responses from the 
replicas do not match. In this case another read is run that involves reading 
all the data. This is the CL level agreement kicking in. 
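
As a rough illustration of the digest check, here is a minimal Python sketch. 
It is not Cassandra's code: the row layout (a dict of name -> (value, 
timestamp)), the digest() and check_responses() helpers and the use of MD5 are 
all assumptions made for the example. 

```python
import hashlib


class DigestMismatch(Exception):
    """Stand-in for Cassandra's DigestMismatchException (sketch only)."""


def digest(row):
    """Hash a row's columns in a stable order.

    `row` is a hypothetical dict of column name -> (value, timestamp).
    """
    h = hashlib.md5()
    for name in sorted(row):
        value, timestamp = row[name]
        h.update(f"{name}:{value}:{timestamp};".encode("utf-8"))
    return h.hexdigest()


def check_responses(data_row, replica_digests):
    """Compare the one full data response with the digests from other replicas.

    Raises DigestMismatch if any digest differs; the caller would then run
    a second read that fetches the full data from the replicas.
    """
    expected = digest(data_row)
    if any(d != expected for d in replica_digests):
        raise DigestMismatch("replica digests differ from the data response")
    return data_row
```

If check_responses() returns, the replicas agree and the row can be handed 
back to the client; if it raises, the caller falls back to the full data read. 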

The other "Read Repair" is the one controlled by "read_repair_chance". When 
RR is active on a request ALL up replicas are involved in the read. When RR is 
not active only CL replicas are involved. The test for CL agreement occurs 
synchronously to the request; the RR check waits asynchronously to the request 
for all nodes in the request to return. It then checks for consistency and 
repairs differences. 
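
A small Python sketch of how the replica set could be chosen per request. The 
function name, its parameters and the per-request random draw are assumptions 
for illustration, not the actual Cassandra implementation. 

```python
import random


def replicas_for_read(live_replicas, cl_count, read_repair_chance, rng=random):
    """Return (replicas_to_contact, rr_active) for one read request.

    Hypothetical helper: with probability `read_repair_chance` every live
    replica is included, otherwise only enough replicas to satisfy the CL.
    Either way the request itself only blocks for `cl_count` responses; the
    extra responses are compared in the background.
    """
    rr_active = rng.random() < read_repair_chance
    if rr_active:
        return list(live_replicas), True
    return list(live_replicas[:cl_count]), False
```

For example, with three live replicas and a CL that needs two responses, a 
chance of 0.0 always contacts two replicas and a chance of 1.0 always contacts 
all three. 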

> From looking at the source code, I do not understand how this set is built 
> and I do not understand how the reconciliation is executed.
When a DigestMismatch is detected a read is run using RepairCallback. The 
callback will call RowRepairResolver.resolve() when enough responses have 
been collected. 

resolveSuperset() picks one response as the baseline, and then calls delete() 
to apply row level deletes from the other responses (ColumnFamily instances). 
It collects the other CFs into an iterator with a filter that returns all 
columns. The columns are then applied to the baseline CF which may result in 
reconcile() being called. 

reconcile() is used when an AbstractColumnContainer has two versions of a 
column and it wants to only have one. 
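
The merge and reconcile steps can be modelled roughly as below. This is a 
simplified Python sketch, not the Cassandra source: the Column tuple, the dict 
layout for a row version, NOT_DELETED and the tie-break on value are all 
assumptions made for the example. 

```python
from collections import namedtuple

# Hypothetical, simplified column: real columns also carry TTLs, tombstones, etc.
Column = namedtuple("Column", ["value", "timestamp"])

NOT_DELETED = float("-inf")  # sentinel for "no row level delete" (assumption)


def reconcile(a, b):
    """Keep one of two versions of the same column: the newer timestamp wins.

    On equal timestamps this sketch keeps the larger value so that the
    result is deterministic (an illustrative tie-break).
    """
    if a.timestamp != b.timestamp:
        return a if a.timestamp > b.timestamp else b
    return a if a.value >= b.value else b


def resolve_superset(versions):
    """Merge row versions of the form {"deleted_at": ts, "columns": {name: Column}}."""
    versions = [v for v in versions if v is not None]
    if not versions:
        return None
    # Apply row level deletes: the most recent delete seen in any response wins.
    deleted_at = max(v["deleted_at"] for v in versions)
    merged = {}
    for version in versions:
        for name, col in version["columns"].items():
            merged[name] = reconcile(merged[name], col) if name in merged else col
    # Columns written at or before the row level delete are shadowed by it.
    live = {n: c for n, c in merged.items() if c.timestamp > deleted_at}
    return {"deleted_at": deleted_at, "columns": live}
```

The result is a single row holding, for every column seen in any response, 
the version that wins reconciliation. 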

RowRepairResolver.scheduleRepairs() works out the delta for each node by 
calling ColumnFamily.diff(). The delta is then sent to the appropriate node.
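
A rough Python sketch of the delta step, under the same caveat: the dict 
layout (name -> (value, timestamp)), the function signatures and the send 
callback are hypothetical stand-ins, not Cassandra's API. 

```python
def diff(replica_columns, resolved_columns):
    """Return the columns a replica is missing or holds with an older timestamp.

    Both arguments are hypothetical dicts of name -> (value, timestamp).
    """
    delta = {}
    for name, (value, timestamp) in resolved_columns.items():
        current = replica_columns.get(name)
        if current is None or current[1] < timestamp:
            delta[name] = (value, timestamp)
    return delta


def schedule_repairs(resolved, versions, endpoints, send):
    """Send each replica only what it lacks compared with the resolved row.

    `send(endpoint, delta)` is a caller-supplied callback standing in for
    the write that is sent to the node.
    """
    for columns, endpoint in zip(versions, endpoints):
        delta = diff(columns, resolved)
        if delta:  # replicas that are already up to date get nothing
            send(endpoint, delta)
```

A replica whose response already matches the resolved row produces an empty 
delta, so nothing is sent to it. 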


Hope that helps. 


-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 19/10/2012, at 6:33 AM, Markus Klems <markuskl...@gmail.com> wrote:

> Hi guys,
> 
> I am looking through the Cassandra source code in the github trunk to better 
> understand how Cassandra's fault-tolerance mechanisms work. Most things make 
> sense. I am also aware of the wiki and DataStax documentation. However, I do 
> not understand what read repair does in detail. The method 
> RowRepairResolver.resolveSuperset(Iterable<ColumnFamily> versions) seems to 
> do the trick of merging conflicting versions of column family replicas and 
> builds the set of columns that need to be "repaired". From looking at the 
> source code, I do not understand how this set is built and I do not 
> understand how the reconciliation is executed. ReadRepair does not seem to 
> trigger a Column.reconcile() to reconcile conflicting column versions on 
> different servers. Does it?
> 
> If this is not what read repair does, then: What kind of inconsistencies are 
> resolved by read repair? And: How are the inconsistencies resolved?
> 
> Could someone give me a hint?
> 
> Thanks so much,
> 
> -Markus
