> We do always provide atomicity of updates in the same batch_mutate call
> under a given key. Which means that for a given key, all updates of the
> batch will be applied, or none of them. This is *always* true and does not
> depend on the commit log (and granted, if the write times out, you won't
> know which one it is, but you are still guaranteed that it is either all
> or none).
>
> That being said, we do not provide isolation, which means in particular
> that reads *can* return a state where only part of a batch update seems
> applied (and it would clearly be cool to have isolation and I'm not even
> saying this will never happen). But atomicity guarantees that even though
> you may observe such a state (and honestly the window during which you can
> is uber small), eventually you will observe that all have been applied (or
> none, if you're in the business of questioning durability (see below), but
> never "part of").
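To make the per-key guarantee concrete, here is a minimal sketch against the
Thrift API (the column family "orders", the row key, and the column names are
made up for illustration; the client is assumed to already have its keyspace
set). All mutations land under one row key, so they are applied all-or-nothing,
but a concurrent reader is not isolated from seeing part of the batch in flight:

    import java.nio.ByteBuffer;
    import java.util.*;
    import org.apache.cassandra.thrift.*;

    public class BatchAtomicitySketch {
        static ByteBuffer bytes(String s) throws Exception {
            return ByteBuffer.wrap(s.getBytes("UTF-8"));
        }

        // Wrap a single column insert as a Mutation.
        static Mutation put(String name, String value, long ts) throws Exception {
            Column c = new Column().setName(bytes(name))
                                   .setValue(bytes(value))
                                   .setTimestamp(ts);
            return new Mutation().setColumn_or_supercolumn(
                new ColumnOrSuperColumn().setColumn(c));
        }

        // Both mutations sit under the same (hypothetical) row key
        // "order-42", so they are applied atomically on that row, even
        // though a reader may briefly observe only one of the columns.
        static void writeBatch(Cassandra.Client client) throws Exception {
            long ts = System.currentTimeMillis() * 1000;
            Map<String, List<Mutation>> byCf = new HashMap<String, List<Mutation>>();
            byCf.put("orders", Arrays.asList(
                put("status", "shipped", ts),
                put("shipped_at", "2011-10-14", ts)));

            Map<ByteBuffer, Map<String, List<Mutation>>> mutations =
                new HashMap<ByteBuffer, Map<String, List<Mutation>>>();
            mutations.put(bytes("order-42"), byCf);

            client.batch_mutate(mutations, ConsistencyLevel.QUORUM);
        }
    }

Had the two mutations used different row keys, they would go into separate
entries of the outer map and the all-or-nothing guarantee would no longer
cover them as a unit.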
You're right, of course. I was playing loose with terms. So in terms of what
is durable (for whatever definition of durable you have decided to adopt for
the cluster), atomicity is preserved with the periodic commit log. I stand
corrected.

> As for durability, it is true that in periodic commit log mode, durability
> on a single node is subject to a small window of time. But true, serious
> durability in the real world really only comes from replication, and that's
> why we use periodic mode for the commit log by default (and you can always
> switch to batch if you so wish). Which is not to say that Peter's statement
> is technically wrong, but if what we're doing is assessing Cassandra's
> durability, I'll argue that because it does replication well (including
> across data centers) while still having a strong single-node durability
> guarantee, it has among the best durability stories out there (even with
> the periodic commit log).

I agree, but with one caveat: the operator has to be aware that if the
application is also using CL.ONE, killing/restarting a node (unless done
softly by 'nodetool drain' or by disabling thrift/rpc first) may result in
lost writes even though no node actually had a "real problem".

What I mean is, you may decide that hardware failures are sufficiently
uncommon that you're fine doing CL.ONE on writes for some particular
application. However, you may not expect regular cluster operations like node
restarts to affect durability. In that sense, replication is subtly less
effective than one might think as an alternative to single-node durability,
for applications writing at CL.ONE (or CL.ANY).

(The probability of actually losing writes in practice may be low, and I have
never made measurements (and even if I did, they would be subject to random
details that could change at any time, such as timing).)

Do you agree?

--
/ Peter Schuller
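For anyone following along, the two commit log modes discussed above are
selected in cassandra.yaml. A minimal sketch of the relevant settings (the
exact default values may differ between versions):

    # Default: fsync the commit log every commitlog_sync_period_in_ms, so a
    # single-node crash can lose up to that window of acknowledged writes.
    commitlog_sync: periodic
    commitlog_sync_period_in_ms: 10000

    # Alternative: group incoming writes and fsync before acknowledging,
    # trading write latency for single-node durability.
    # commitlog_sync: batch
    # commitlog_sync_batch_window_in_ms: 50

The batch mode closes the window Peter describes for a single node, but it
does not change the CL.ONE caveat: a write acknowledged by only one replica
is still exposed if that replica is restarted without a drain.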