> Thanks a lot for elaborating on repairs.    Still, it's a bit fuzzy to me why 
> it is so important to run a repair before the GCGraceSeconds kicks in.   Does 
> this mean a delete does not get "replicated" ?   In other words when I delete 
> something on a node, doesn't cassandra set tombstones on its replica copies?

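Deletes are replicated. But a delete is special: unlike ordinary data,
you want to *remove* something, yet the record that says "this is gone"
(the tombstone) is itself data that has to be stored and propagated.
Clearly you don't want to keep track forever of everything ever removed
from the cluster, so tombstones have to expire at some point, and
GCGraceSeconds is that expiry. For that reason, a tombstone must reach
every replica before it expires. Otherwise a replica that missed the
delete still holds the old data, and once the tombstone is gone nothing
marks that data as deleted, so it can reappear. Running repair within
GCGraceSeconds is what ensures this. See: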

      http://wiki.apache.org/cassandra/DistributedDeletes
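
To make the failure mode concrete, here is a toy Python sketch. It is
not Cassandra code: replicas are plain dicts, and the names GC_GRACE,
compact, repair and scenario are made up for illustration. It only
assumes that the newest timestamp wins and that compaction drops
tombstones older than the grace period.

```python
# Toy model only: not Cassandra's storage engine or API.
GC_GRACE = 10  # stand-in for GCGraceSeconds, in arbitrary time units


def write(replica, key, value, ts):
    """Store a cell, keeping whichever version has the newest timestamp."""
    if key not in replica or replica[key][1] < ts:
        replica[key] = (value, ts)


def delete(replica, key, ts):
    """A delete is just a write of a tombstone (value None)."""
    write(replica, key, None, ts)


def compact(replica, now):
    """Purge tombstones older than GC_GRACE."""
    expired = [k for k, (v, ts) in replica.items()
               if v is None and now - ts > GC_GRACE]
    for key in expired:
        del replica[key]


def repair(a, b):
    """Make both replicas hold the newest known version of every key."""
    for key in set(a) | set(b):
        cells = [r[key] for r in (a, b) if key in r]
        newest = max(cells, key=lambda cell: cell[1])
        a[key] = b[key] = newest


def read(replica, key):
    """Return the stored value, or None if absent or deleted."""
    cell = replica.get(key)
    return None if cell is None else cell[0]


def scenario(repair_before_expiry):
    a, b = {}, {}
    write(a, "k", "v", ts=1)
    write(b, "k", "v", ts=1)
    delete(a, "k", ts=2)       # replica b misses the delete
    if repair_before_expiry:
        repair(a, b)           # tombstone reaches b in time
    now = 2 + GC_GRACE + 1     # tombstone is now past its grace period
    compact(a, now)
    compact(b, now)
    repair(a, b)               # a repair that runs after expiry
    return read(a, "k"), read(b, "k")
```

With scenario(True) the key stays deleted on both replicas. With
scenario(False) replica a has already purged its tombstone, so the late
repair copies the old value from b back to a and the deleted data
comes back.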

> And technically, isn't repair only needed for cases where things weren't 
> properly propagated in the cluster?  If all writes are written to the right 
> replicas, and all deletes are written to all the replicas, and all nodes were 
> available at all times, then everything should work as designed -  without 
> manual intervention, right?

Yes, but you can assume that doesn't happen in real life for extended
periods of time. It doesn't take much at all for a *few* writes to fail
to replicate (for example, just restarting a Cassandra node will cause
some writes to be dropped; hinted handoff is not a guarantee, only an
optimization).

-- 
/ Peter Schuller
