If you are working at CL ONE you are accepting that *any* value stored on a replica for a key+col combination in a row is a valid response, and that includes no value.
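To make that concrete, here is a toy sketch (plain Python, not Cassandra code; the single-replica write and random replica selection are simplifying assumptions) of how a CL ONE read can legitimately return a stale value, or nothing at all, when only one of the RF replicas has seen a write:

    import random

    # Toy model of one row replicated to RF=3 replicas. Illustrates the
    # CL ONE contract only; this is not how Cassandra routes requests.
    RF = 3
    replicas = [dict() for _ in range(RF)]  # col -> (timestamp, value)

    def write_at_one(col, value, ts):
        # A CL ONE write is acknowledged once a single replica applies it.
        # Here the other replicas miss the update entirely, as they would
        # during a partition, before HH or RR catches them up.
        replicas[0][col] = (ts, value)

    def read_at_one(col):
        # A CL ONE read is answered by whichever single replica is asked,
        # so the response may be the new value, an old one, or no value.
        return random.choice(replicas).get(col)

    write_at_one("name", "bob", ts=1)
    for _ in range(5):
        print(read_at_one("name"))  # sometimes (1, 'bob'), sometimes None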
After the nodes have detected the others are UP they will start their HH in a staggered fashion, and will rate limit themselves to avoid overwhelming the node. It may take some time to complete.

> Otherwise, clients of A may see a
> discontinuity where data that was available during the partition see it
> go away and then come back.

If you are concerned about reads being consistent, then use CL QUORUM.

If you are reading at CL ONE (in 1.0.x) the read will go to one replica 90% of the time, and you will only get the result from that one replica, which may be any value the key+col has been set to, including no value.

The other 10% of the time Read Repair will kick in (10% is the default value for read_repair_chance in 1.0; you can change this per column family). The purpose of RR is to make it so that the next time a read happens the data is consistent. So for a CL ONE read with RR, the read will go to all replicas and you will get a response from one and only one of them. In the background the responses from the others will be checked and consistency repaired.

If you were working at a higher CL, the responses from CL nodes are checked as part of the read request, synchronous to the read, and you get a consistent result from those nodes. RR may still run in the background, since CL nodes may be fewer than RF nodes.

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 31/01/2012, at 6:51 AM, Thorsten von Eicken wrote:

> I'm trying to work through various failure modes to figure out the
> proper operating procedure and proper client coding practices. I'm a
> little unclear about what happens when a network partition gets
> repaired. Take the following scenario:
> - cluster with 5 nodes: A thru E; RF = 3; read_cf = 1; write_cf = 1
> - network partition divides A-C off from D-E
> - operation continues on both sides, obviously some data is unavailable
> from D-E
> - hinted handoffs accumulate
>
> Now the network partition is repaired. The question I have is what is
> the sequencing of events, in particular between processing HH and
> forwarding read requests across the former partition. I'm hoping that
> there is a time period to process HH *before* nodes forward requests.
> E.g. it would be really good for A not to forward read requests to D
> until D is done with HH processing. Otherwise, clients of A may see a
> discontinuity where data that was available during the partition see it
> go away and then come back.
>
> Is there a manual or wiki section that discusses some of this and I just
> missed it?
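PS For reference, the knobs discussed above. The setting names below exist in the 1.0.x cassandra.yaml; the values shown are illustrative, check your own config:

    # cassandra.yaml: hints are delivered with a sleep between them,
    # which is the rate limiting mentioned above
    hinted_handoff_enabled: true
    max_hint_window_in_ms: 3600000    # stop generating hints for a node dead this long
    hinted_handoff_throttle_delay_in_ms: 50

read_repair_chance is a per column family setting, e.g. from cassandra-cli (MyCF is a placeholder name):

    update column family MyCF with read_repair_chance = 0.1;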