I don't think it's just clock drift. There is also the delay between when the client selects a timestamp and when the data actually ends up committed to Cassandra. That delay seems harder to control when the nodes and/or clients are under load.
I agree that it would be nice to have something like this in Cassandra core, but from the JIRA tickets it looks like this has been tried before, and for various reasons was not added. It's definitely non-trivial to get right.

On Fri, 6 Jan 2012 13:33:02 -0800
Mohit Anchlia <mohitanch...@gmail.com> wrote:
> This looks like the right way to do it. But remember this still isn't
> guaranteed to work if your clocks drift too much. It's a trade-off
> between having to manage one additional component and using something
> internal to C*. It would be good to see similar functionality
> implemented in C* so that clients don't have to deal with it explicitly.
>
> On Fri, Jan 6, 2012 at 1:16 PM, Bryce Allen <bal...@ci.uchicago.edu>
> wrote:
> > This looks like it:
> > http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Implementing-locks-using-cassandra-only-tp5527076p5527076.html
> >
> > There are also some interesting JIRA tickets related to locking/CAS:
> > https://issues.apache.org/jira/browse/CASSANDRA-2686
> > https://issues.apache.org/jira/browse/CASSANDRA-48
> >
> > -Bryce
> >
> > On Fri, 06 Jan 2012 14:53:21 -0600
> > Jeremiah Jordan <jeremiah.jor...@morningstar.com> wrote:
> >> Correct, any kind of locking in Cassandra requires clocks that are
> >> in sync, and requires you to wait "possible clock out of sync time"
> >> before reading to check if you got the lock, to prevent the issue
> >> you describe below.
> >>
> >> There was a pretty detailed discussion of locking with only
> >> Cassandra a month or so back on this list.
> >>
> >> -Jeremiah
> >>
> >> On 01/06/2012 02:42 PM, Bryce Allen wrote:
> >> > On Fri, 6 Jan 2012 10:38:17 -0800
> >> > Mohit Anchlia <mohitanch...@gmail.com> wrote:
> >> >> It could be as simple as reading before writing to make sure
> >> >> that email doesn't exist. But I think you are looking at how to
> >> >> handle 2 concurrent requests for the same email? The only way I
> >> >> can think of is:
> >> >>
> >> >> 1) Create a new CF, say tracker
> >> >> 2) write email and time uuid to CF tracker
> >> >> 3) read from CF tracker
> >> >> 4) if you find a row other than yours then wait and read again
> >> >> from tracker after a few ms
> >> >> 5) read from USER CF
> >> >> 6) write if no rows in USER CF
> >> >> 7) delete from tracker
> >> >>
> >> >> Please note you might have to modify this logic a little bit,
> >> >> but this should give you some ideas of how to approach this
> >> >> problem without locking.
> >> > Distributed locking is pretty subtle; I haven't seen a correct
> >> > solution that uses just Cassandra, even with QUORUM read/write. I
> >> > suspect it's not possible.
> >> >
> >> > With the above proposal, in step 4 two processes could both have
> >> > inserted an entry in the tracker before either gets a chance to
> >> > check, so you need a way to order the requests. I don't think the
> >> > timestamp works for ordering, because it's set by the client
> >> > (even the internal timestamp is set by the client), and will
> >> > likely be different from when the data is actually committed and
> >> > available to read by other clients.
> >> >
> >> > For example:
> >> >
> >> > * At time 0ms, client 1 starts insert of u...@example.org
> >> > * At time 1ms, client 2 also starts insert for u...@example.org
> >> > * At time 2ms, client 2 data is committed
> >> > * At time 3ms, client 2 reads tracker and sees that it's the only
> >> > one, so enters the critical section
> >> > * At time 4ms, client 1 data is committed
> >> > * At time 5ms, client 1 reads tracker, and sees that it is not the
> >> > only one, but since it has the lowest timestamp (0ms vs 1ms), it
> >> > enters the critical section.
> >> >
> >> > I don't think Cassandra counters work for ordering either.
> >> >
> >> > This approach is similar to the Zookeeper lock recipe:
> >> > http://zookeeper.apache.org/doc/current/recipes.html#sc_recipes_Locks
> >> > but Zookeeper has sequence nodes, which provide a consistent way
> >> > of ordering the requests. Zookeeper also avoids the busy waiting.
> >> >
> >> > I'd be happy to be proven wrong. But even if it is possible, if
> >> > it involves a lot of complexity and busy waiting it's probably
> >> > not worth it. There's a reason people are using Zookeeper with
> >> > Cassandra.
> >> >
> >> > -Bryce
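
To make the race in the quoted timeline concrete, here is a minimal Python sketch that replays it. It is only a toy model, not Cassandra or Zookeeper client code: the Attempt class, the winners() function and the field names are all made up for illustration, each client reads the tracker exactly once (no retry loop), and ordering by commit time is just a stand-in for a server-assigned sequence such as Zookeeper's sequence nodes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attempt:
    """One client's lock attempt (hypothetical model, times in ms)."""
    client: str
    chosen_ts: int  # timestamp the client picked for its tracker write
    commit_at: int  # when that write became visible to readers
    read_at: int    # when the client read the tracker back


def winners(attempts, order_key):
    """Return the clients that conclude they may enter the critical section.

    A client sees every tracker entry committed by the time of its read,
    and enters if its own entry sorts first among the entries it sees.
    """
    entered = []
    for a in attempts:
        visible = [b for b in attempts if b.commit_at <= a.read_at]
        if min(visible, key=order_key).client == a.client:
            entered.append(a.client)
    return entered


# The timeline from the example: client 1 picks its timestamp first
# but its write lands after client 2 has already read the tracker.
attempts = [
    Attempt("client1", chosen_ts=0, commit_at=4, read_at=5),
    Attempt("client2", chosen_ts=1, commit_at=2, read_at=3),
]

# Ordering by client-chosen timestamp: both clients enter.
by_client_ts = winners(attempts, lambda a: a.chosen_ts)
print(by_client_ts)  # ['client1', 'client2']

# Ordering by commit order (stand-in for a server-side sequence): one enters.
by_commit_order = winners(attempts, lambda a: a.commit_at)
print(by_commit_order)  # ['client2']
```

With client-chosen timestamps both clients decide they hold the lock; with an order assigned at commit time only client 2 does, which is the property the sequence nodes are providing.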