My understanding:

1) Read repair involves the coordinator sending a full data read to the CL 
nodes, resolving the differences and sending writes back. For CL ONE this 
happens after returning to the client; for higher CLs it happens before. (My 
understanding of the internals of RR is a little rough, though.)
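
A rough sketch of the resolve-and-write-back step described above. It is not 
Cassandra's actual code: the Column type, the node names and both helper 
functions are made up for illustration, and it assumes columns are reconciled 
one at a time with the highest writer-supplied timestamp winning (timestamp 
ties are ignored here).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    value: str
    timestamp: int  # supplied by the writer


def resolve(replica_rows):
    """Merge the rows returned by each replica, keeping the newest column."""
    merged = {}
    for row in replica_rows.values():
        for col in row.values():
            current = merged.get(col.name)
            if current is None or col.timestamp > current.timestamp:
                merged[col.name] = col
    return merged


def repair_writes(replica_rows, merged):
    """Per replica, list the columns it is missing or holds a stale copy of."""
    repairs = {}
    for node, row in replica_rows.items():
        stale = [col for name, col in merged.items() if row.get(name) != col]
        if stale:
            repairs[node] = stale
    return repairs


# Hypothetical responses from three replicas for one row.
replica_rows = {
    "node_a": {"email": Column("email", "old@example.com", 100)},
    "node_b": {"email": Column("email", "new@example.com", 200)},
    "node_c": {},
}
merged = resolve(replica_rows)
repairs = repair_writes(replica_rows, merged)
```

Here node_a and node_c would each be sent the newer "email" column, while 
node_b already has it and gets nothing.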

2) Not sure.

3) RR is not used on the write path; hinted handoff is.

4) The node responsible for the key is often the node asked for the full data 
of the request; the other nodes are asked for a digest of their response. 
However, the dynamic snitch can re-order the nodes based on load. That node is 
also the starting point when the partitioner is working out which nodes the 
replicas should be stored on. It's not a point of failure. 
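
To make the data-versus-digest read concrete, here is a small sketch. 
Everything in it is an assumption for illustration (the row layout, the node 
names, the use of MD5 as the hash); it only shows the idea that one replica 
returns the full row and the others return a hash to compare against.

```python
import hashlib


def digest(row):
    """Hash a row's columns in a stable order so replicas can be compared."""
    h = hashlib.md5()
    for name in sorted(row):
        value, timestamp = row[name]
        h.update(f"{name}={value}@{timestamp};".encode("utf-8"))
    return h.hexdigest()


def read(replicas, data_node):
    """Full read from data_node, digest-only reads from the other replicas."""
    data = replicas[data_node]
    expected = digest(data)
    mismatched = [
        node
        for node, row in replicas.items()
        if node != data_node and digest(row) != expected
    ]
    return data, mismatched


# Hypothetical replica contents: column name -> (value, timestamp).
replicas = {
    "node_a": {"email": ("new@example.com", 200)},
    "node_b": {"email": ("new@example.com", 200)},
    "node_c": {"email": ("old@example.com", 100)},
}
data, mismatched = read(replicas, "node_a")
```

A non-empty mismatched list is the signal that the replicas disagree and the 
differences need to be resolved.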

5) The partitioner knows where the data was written to. 
http://thelastpickle.com/2011/02/07/Introduction-to-Cassandra/

Aaron

On 19/02/2011, at 6:28 AM, Anthony John <chirayit...@gmail.com> wrote:

> K - let me state the facts first (as I know them)
> - I do not know the inner workings, so interpret my response with that 
> caveat. Although, at an architectural level, one should be able to keep 
> detailed implementation at bay
> - Quorum is (N/2)+1 using integer division, where N is the Replication Factor (RF)
> - And consistency is a guarantee if R(ead) + W(rite) > RF (Which Quorum gives 
> you, but can be achieved via other permutations, depending on whether Read or 
> Write performance is desired)
> 
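
The quorum and R + W > RF arithmetic above can be checked with a few lines. 
This is only a sketch: it assumes the usual floor(RF/2) + 1 definition of 
quorum, and the function names are made up.

```python
def quorum(rf):
    """Smallest number of replicas that is a majority of rf."""
    return rf // 2 + 1


def overlaps(r, w, rf):
    """True when every read set must share at least one node with every write set."""
    return r + w > rf


rf = 3
q = quorum(rf)                       # 2 replicas for RF = 3
quorum_both = overlaps(q, q, rf)     # QUORUM reads + QUORUM writes
one_and_all = overlaps(1, rf, rf)    # read ONE + write ALL
one_and_one = overlaps(1, 1, rf)     # read ONE + write ONE
```

With RF = 3, QUORUM/QUORUM and ONE/ALL both satisfy the inequality, while 
ONE/ONE does not.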
> Now getting to your questions:
> 1. If Read at Q is nondeterministic, it would likely have to read the other 
> (RF-Q) nodes to achieve Quorum on a deterministic value. At which point - 
> sync'ing all with writes should not be that expensive. But at what point 
> precisely the read is returned - do not know - you will have to look at the 
> code. IMO - at this level it should not matter.
> 2. Should be at the granularity of data divergence
> 3. Read Repair or Nodetool (whichever comes first)
> 4. All peer - there is no primary. There might be a connected node - but no 
> special role/privileges
> 5. Tries to Q - returns on deterministic read. If not - see (1)
> 6. Writer supplies timestamp value - can be any value that makes sense within 
> the scope of data/application.
> 
> HTH,
> 
> -JA
> 
> On Fri, Feb 18, 2011 at 10:28 AM, A J <s5a...@gmail.com> wrote:
> A couple more related questions:
> 
> 5. For reads, does Cassandra first read N nodes or just the R nodes it
> selects ? I am thinking unless it reads all the N nodes, how will it
> know which node has the latest write.
> 
> 6. Who decides the timestamp that gets inserted into the timestamp
> field of every column. I would guess the coordinator node picks up its
> system's timestamp.  If that is true, the clocks on all the nodes
> should be synchronized, right ? Otherwise conflict resolution cannot
> be done correctly.
> For a distributed system, this is not always possible. How do folks
> get around this issue ?
> 
> Thanks.
> 
> 
> 
> On Fri, Feb 18, 2011 at 10:23 AM, A J <s5a...@gmail.com> wrote:
> > Questions about R and N (and W):
> > 1. If I set R to Quorum and cassandra identifies a need for read
> > repair before returning, would the read repair happen on R nodes (I
> > mean subset of R that needs repair) or N nodes before the data is
> > delivered to the client ?
> > 2. Also does the repair happen at level of row (key) or at level of column ?
> >
> > 3. During write, if W is met but N-W is not met for some reason; would
> > cassandra try to repair N-W nodes in the background as and when it
> > can. Or the N-W are only repaired when a read is issued ?
> >
> > 4. What is the significance of the 'primary' replica for writes from
> > usage point ? Writes to primary and non-primary replicas all happen
> > simultaneously. Ensuring W is decided irrespective of it being primary
> > or not. Ensuring R is decided by any of the R nodes out of N.
> > I know the tokens are divided per the primary replica. But other than
> > that, for read and write operations, do the primary replica play any
> > special role ?
> >
> > Thanks.
> >
> 
