> We do always provide atomicity of updates in the same batch_mutate call
> under a given key. Which means that for a given key, all updates of the
> batch will be applied, or none of them. This is *always* true and does not
> depend on the commit log (and granted, if the write times out, you won't
> know which one it is, but you are still guaranteed that it is either all
> or none).
>
> That being said, we do not provide isolation, which means in particular
> that reads *can* return a state where only part of a batch update seems
> applied (and it would clearly be cool to have isolation and I'm not even
> saying this will never happen). But atomicity guarantees that even though
> you may observe such a state (and honestly the window during which you can
> is uber small), eventually you will observe that all have been applied (or
> none, if you're in the business of questioning durability (see below), but
> never "part of").
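To make the per-key guarantee concrete, here is a minimal sketch against the
Thrift API (the column family "orders", the row key, and the column names are
made up for illustration; the client is assumed to already have its keyspace
set). All mutations land under one row key, so they are applied all-or-nothing,
but a concurrent reader is not isolated from seeing part of the batch in flight:

    import java.nio.ByteBuffer;
    import java.util.*;
    import org.apache.cassandra.thrift.*;

    public class BatchAtomicitySketch {
        static ByteBuffer bytes(String s) throws Exception {
            return ByteBuffer.wrap(s.getBytes("UTF-8"));
        }

        // Wrap a single column insert as a Mutation.
        static Mutation put(String name, String value, long ts) throws Exception {
            Column c = new Column().setName(bytes(name))
                                   .setValue(bytes(value))
                                   .setTimestamp(ts);
            return new Mutation().setColumn_or_supercolumn(
                new ColumnOrSuperColumn().setColumn(c));
        }

        // Both mutations sit under the same (hypothetical) row key
        // "order-42", so they are applied atomically on that row, even
        // though a reader may briefly observe only one of the columns.
        static void writeBatch(Cassandra.Client client) throws Exception {
            long ts = System.currentTimeMillis() * 1000;
            Map<String, List<Mutation>> byCf = new HashMap<String, List<Mutation>>();
            byCf.put("orders", Arrays.asList(
                put("status", "shipped", ts),
                put("shipped_at", "2011-10-14", ts)));

            Map<ByteBuffer, Map<String, List<Mutation>>> mutations =
                new HashMap<ByteBuffer, Map<String, List<Mutation>>>();
            mutations.put(bytes("order-42"), byCf);

            client.batch_mutate(mutations, ConsistencyLevel.QUORUM);
        }
    }

Had the two mutations used different row keys, they would go into separate
entries of the outer map and the all-or-nothing guarantee would no longer
cover them as a unit.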
You're right, of course. I was playing loose with terms. So in terms of what
is durable (for whatever definition of durable you have decided to adopt for
the cluster), atomicity is preserved with the periodic commit log. I stand
corrected.

> As for durability, it is true that in periodic commit log mode, durability
> on a single node is subject to a small window of time. But true, serious
> durability in the real world really only comes from replication, and that's
> why we use periodic mode for the commit log by default (and you can always
> switch to batch if you so wish). Which is not to say that Peter's statement
> is technically wrong, but if what we're doing is assessing Cassandra's
> durability, I'll argue that because it does replication well (including
> across data centers) while still having a strong single-node durability
> guarantee, it has among the best durability stories out there (even with
> the periodic commit log).

I agree, but with one caveat: the operator has to be aware that if the
application is also using CL.ONE, killing/restarting a node (unless done
softly by 'nodetool drain' or by disabling thrift/rpc first) may result in
lost writes even though no node actually had a "real problem".

What I mean is, you may decide that hardware failures are sufficiently
uncommon that you're fine doing CL.ONE on writes for some particular
application. However, you may not expect regular cluster operations like node
restarts to affect durability. In that sense, replication is subtly less
effective than one might think as an alternative to single-node durability,
for applications writing at CL.ONE (or CL.ANY).

(The probability of actually losing writes in practice may be low, and I have
never made measurements (and even if I did, they would be subject to random
details that could change at any time, such as timing).)

Do you agree?

--
/ Peter Schuller
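For anyone following along, the two commit log modes discussed above are
selected in cassandra.yaml. A minimal sketch of the relevant settings (the
exact default values may differ between versions):

    # Default: fsync the commit log every commitlog_sync_period_in_ms, so a
    # single-node crash can lose up to that window of acknowledged writes.
    commitlog_sync: periodic
    commitlog_sync_period_in_ms: 10000

    # Alternative: group incoming writes and fsync before acknowledging,
    # trading write latency for single-node durability.
    # commitlog_sync: batch
    # commitlog_sync_batch_window_in_ms: 50

The batch mode closes the window Peter describes for a single node, but it
does not change the CL.ONE caveat: a write acknowledged by only one replica
is still exposed if that replica is restarted without a drain.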