> That's finally a precise statement! :) I was wondering what " to at least 1 
> replica's commit log" is supposed to actually mean: 
> http://wiki.apache.org/cassandra/API

The main idea is that the write has been "officially delivered" to one
replica. If Cassandra only did batch-wise commit, such that a write
was never ACKed until it was durable, it would mean that the write had
been durably written to one replica.

I suspect the phrasing is to get around the fact that it is not
actually durably written if nodes are configured to use periodic sync
mode.

> Does quorum mean that data is replicated to q nodes or to at least q nodes?

That it is replicated to at least a quorum of nodes before the write
is considered successful. This does not prevent further propagation to
all nodes; data always gets replicated according to replication
factor. Consistency levels only affect the consistency requirements of
the particular request.
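
To put numbers on it, here is a minimal sketch in Python. The function
names are mine, not anything from the Cassandra code base, and it
assumes the usual definition of a quorum as a majority of the
replication factor (RF):

```python
def quorum(replication_factor: int) -> int:
    """Smallest number of replicas that forms a majority of RF."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


def write_succeeds(acks: int, replication_factor: int) -> bool:
    """A QUORUM write is reported successful once a majority has ACKed.

    The write is still sent to all RF replicas; this only decides when
    the coordinator may report success to the client.
    """
    return acks >= quorum(replication_factor)
```

So with RF=3 the request needs 2 ACKs, but the data is still meant to
end up on all 3 replicas.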

>  I just added another blank machine to my cluster. Nothing happened as 
> expected (stopped writing to the cluster) but after I ran nodetool repair it 
> held more data than all other nodes. So it copied data from the other nodes 
> to this one? I assumed that data is replicated to q nodes not to all, is 
> quorum 'only' about consistency and not about saving storage space?

The new node should have gotten its appropriate amount according to
the ring responsibility (i.e., tokens). I'm not sure why a new node
would get more than its fair share (according to tokens) of data
though.
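
As a rough illustration of "fair share according to tokens", the
sketch below computes the fraction of the ring each node is primarily
responsible for. It is only a sketch under stated assumptions: a token
space of 2**127 (as with RandomPartitioner), distinct tokens, and
hypothetical node names. It looks at primary ranges only; with RF > 1
each node additionally stores replicas of other nodes' ranges.

```python
from fractions import Fraction

RING_SIZE = 2 ** 127  # assumption: RandomPartitioner-style token space


def primary_ownership(tokens):
    """Map node -> fraction of the ring in (previous token, own token].

    `tokens` maps a (hypothetical) node name to its distinct token.
    """
    ordered = sorted(tokens.items(), key=lambda kv: kv[1])
    shares = {}
    for i, (node, token) in enumerate(ordered):
        # For i == 0 this wraps around to the node with the highest token.
        prev_token = ordered[i - 1][1]
        width = (token - prev_token) % RING_SIZE
        if width == 0:
            # Only one node in the ring: it owns everything.
            width = RING_SIZE
        shares[node] = Fraction(width, RING_SIZE)
    return shares
```

A node bootstrapped in the middle of an existing range takes over part
of that range, so its share depends entirely on where its token lands.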

There is one extreme case: if the cluster has seen lots of writes in
degraded states, there may be a lot of data around the cluster that
has not yet reached its full replica set. A repair on a new node might
then make the new node the only one that has all the data it should
have... but you'd have to have written data at a low consistency level
during pretty shaky periods for this to have a significant effect
(especially if hinted handoff is turned on).

-- 
/ Peter Schuller
