Hi Peter - 

Thanks for the response and for adding the FAQ entry.  Really great answers.

So if I understand correctly, the nodes getting the replica copies are 
predetermined, based on the replication strategy.

One thing is still a bit unclear.  Once a node establishes which nodes hold 
its replicas, are those "replica nodes" always used?  In other words, with 
RF=3, if the other two replica nodes go down, will the "original" node not 
automatically pick two new nodes for its replica copies?
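
To make the question concrete, here is a minimal sketch of how I currently picture "predetermined" placement. It assumes a SimpleStrategy-like rule (the node owning the key's token, then the next RF-1 nodes clockwise on the ring). The function name, the toy ring, and the tokens are all made up for illustration and are not Cassandra's actual code:

```python
from bisect import bisect_left


def replica_set(ring, key_token, rf):
    """Return the nodes that would hold replicas for key_token.

    ring: list of (token, node_name) pairs (hypothetical toy data).
    Assumed rule: the first node whose token is >= key_token owns the key,
    and the next rf-1 nodes clockwise on the ring hold the other copies.
    """
    ordered = sorted(ring)
    tokens = [token for token, _ in ordered]
    # Wrap around to the first node when the key is past the last token.
    start = bisect_left(tokens, key_token) % len(ordered)
    count = min(rf, len(ordered))
    return [ordered[(start + i) % len(ordered)][1] for i in range(count)]


# Toy four-node ring; the tokens are illustrative only.
ring = [(0, "A"), (25, "B"), (50, "C"), (75, "D")]
replicas = replica_set(ring, key_token=30, rf=3)
```

Note that node liveness is not an input here: under this assumed model the replica set depends only on the ring and the key's token, so a down replica would not be swapped for another node. Whether that matches what Cassandra really does is exactly what I'm asking.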


Jon



On Mar 26, 2011, at 12:05 AM, Peter Schuller wrote:

>> Does anyone know how cassandra chooses the nodes for its other replicant 
>> copies?
> 
> This keeps coming up so I added a FAQ entry:
> 
>   http://wiki.apache.org/cassandra/FAQ#replicaplacement
> 
> I don't quite like the phrasing but I couldn't come up with anything that
> was sufficiently clear and complete right now.
> 
>> The first node gets the first copy because its token is assigned for that 
>> key.   But what about the other copies of the data?
>> Do the replicant nodes stay the same based on the token range?  Or are the 
>> other copies sent to any random node based on its load and availability?
>> I think this is important in order to understand because it affects how to 
>> plan for situations where a significant number of nodes are suddenly 
>> unavailable, such as the loss of a data center.
> 
> I hope the above is answered by the FAQ. If it's unclear please say so
> and we can clarify.
> 
>> If the replicants are copied just based on random availability, then quorum 
>> writes could survive on the remaining nodes.  But if the replicant nodes are 
>> somehow pre-determined, those replicants may not be available and writes 
>> will fail.
> 
> I'm not really following this though. Why would you ever want data to
> be placed based on "random availability"?
> 
> If you are writing at QUORUM, a quorum of nodes in the replica set
> must have ack:ed the write in order for the write to be considered
> successful (similarly for reads). If a sufficient number of nodes are
> up, you're fine. If not, then no - fundamentally that would violate
> the requirement of quorum.
> 
> For example, if you're at RF=3, at least two nodes (in the replica set
> for a given key) must be responding to your request in order for it
> to succeed at QUORUM.
> 
> -- 
> / Peter Schuller
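
For reference, the quorum arithmetic in the quoted reply can be sketched as follows. This is a rough illustration only: it assumes quorum is floor(RF/2) + 1, and the helper names and node names are hypothetical, not part of any Cassandra API.

```python
def quorum(rf):
    """Smallest majority of rf replicas, assuming floor(rf / 2) + 1."""
    return rf // 2 + 1


def can_satisfy_quorum(replicas, live_nodes):
    """True if enough nodes in the key's replica set are currently up.

    replicas: nodes in the replica set for one key (hypothetical names).
    live_nodes: set of nodes that are responding right now.
    Only nodes in the replica set count; other live nodes do not help.
    """
    alive = sum(1 for node in replicas if node in live_nodes)
    return alive >= quorum(len(replicas))


# RF=3: two of the three replicas must respond.
ok = can_satisfy_quorum(["C", "D", "A"], live_nodes={"C", "D", "B"})
# Only one replica is up, so QUORUM cannot be met even though B is alive.
not_ok = can_satisfy_quorum(["C", "D", "A"], live_nodes={"C", "B"})
```

In this sketch a live node outside the replica set never counts toward the quorum, which is the point made in the reply above.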
