Thanks for the explanation. Does mongo provide any guarantees about shipping the transaction log to other nodes before returning a successful write to clients? Unless it does (as chain replication does), you can't trust the values on the new master until the old node has come back up and finished shipping its transaction log. I don't think that would give you the level of consistency required for a locking system.
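
To make the concern concrete, here is a minimal sketch of the failover case. It is a toy model, not mongo's actual replication protocol: the Node class, acquire(), run() and the wait_for_replica flag are all hypothetical names made up for illustration.

```python
class Node:
    """One member of a hypothetical master/replica pair holding lock records."""

    def __init__(self):
        self.locks = {}      # lock name -> owner
        self.unshipped = []  # committed log entries not yet sent to the replica

    def test_and_set(self, name, owner):
        """Atomically claim `name` on this node if nobody holds it."""
        if name in self.locks:
            return False
        self.locks[name] = owner
        self.unshipped.append((name, owner))
        return True


def acquire(master, replica, name, owner, wait_for_replica):
    """Take the lock on the master; optionally ship the log before acking."""
    if not master.test_and_set(name, owner):
        return False
    if wait_for_replica:
        # Assumed synchronous mode: the replica applies the entry before
        # the client is told the write succeeded.
        for key, value in master.unshipped:
            replica.locks[key] = value
        master.unshipped.clear()
    return True


def run(wait_for_replica):
    """Client 1 locks, the master dies, the replica is promoted, client 2 tries."""
    master, replica = Node(), Node()
    first = acquire(master, replica, "object_id", "client-1", wait_for_replica)
    new_master = replica  # failover: anything left in master.unshipped is lost
    second = new_master.test_and_set("object_id", "client-2")
    return first, second
```

With wait_for_replica=False both clients are told they hold the lock; with wait_for_replica=True the second attempt is refused, at the cost of waiting for the replica on every write.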

BR,
Jon.

On Tue, Aug 2, 2011 at 10:59 AM, Jeffrey Kesselman <jef...@nphos.com> wrote:

> Replicas mirror the master. In a master/replica system changes are always
> made to the master. The replicas just follow behind, applying the
> transaction log after commit. They only become directly addressable if the
> master dies, in which case one of the replicas is nominated as new master.
>
> You can implement a single lock on a single object because it exists for
> update on only one shard (that shard's master). To try to get transactional
> syntax across multiple objects is going to take a
> multi-phase locking protocol that you implement yourself.
>
>
> On Tue, Aug 2, 2011 at 11:28 AM, Jon Meredith <jmered...@basho.com> wrote:
>
>> Thanks for the info.
>>
>> I haven't analyzed mongo's replication model and only did a quick scan of
>> the doc. It isn't clear how you would use the atomic test and set operation
>> you mentioned with sharding and replication to implement locks.
>>
>> Does the atomic test and set coordinate with all members in a replica set
>> to ensure a consistent update? If not, would you restrict yourself to a
>> single node in each shard to do the locking? Then you have to deal with
>> failover (client 1 discovers the locking node somehow, acquires a lock, the
>> locking node dies before the other members of the replica set are updated,
>> client 2 somehow discovers the replacement locking node and, as the update
>> didn't make it out of the original lock server, it also grants a lock).
>>
>> I can see how it could work with a chain replication scheme like Hibari
>> uses, but I'm not sure that's what mongo is doing. You take a performance
>> hit to do so.
>>
>> BR,
>> Jon
>>
>> On Tue, Aug 2, 2011 at 9:04 AM, Jeffrey Kesselman <jef...@nphos.com> wrote:
>>
>>> Mongo does its clustering a bit differently than Riak.
>>>
>>> It clusters in two dimensions. It shards for scalability,
>>> and replicates for reliability.
>>> In any group of shards, only one shard has
>>> a given piece of data. But that shard can be replicated in a master/slave
>>> manner for failover.
>>>
>>> See:
>>>
>>> http://www.mongodb.org/display/DOCS/Sharding+Introduction#ShardingIntroduction-ShardinginaNutshell
>>>
>>>
>>> On Tue, Aug 2, 2011 at 10:50 AM, Jon Meredith <jmered...@basho.com> wrote:
>>>
>>>> Hi Jeffrey,
>>>>
>>>> Do you know if Mongo provides locks that can be used on clusters of
>>>> machines and in the presence of network partitions/failures? Riak could
>>>> probably get close if you created a cluster with a single node and
>>>> performed all accesses with N=R=W=1, as updating a single vnode is
>>>> atomic; it's only when the order of vnode requests can be interleaved
>>>> that you get problems.
>>>> Of course you'd have a single point of failure....
>>>>
>>>> Soren: It may be worth looking at a separate lock service along the
>>>> lines of Zookeeper - you could take a look at the work Joe Blomstedt did on
>>>> riak_zab https://github.com/jtuple/riak_zab but as the FAQ suggests do
>>>> not use it in production.
>>>>
>>>> BR,
>>>> Jon.
>>>>
>>>>
>>>> On Tue, Aug 2, 2011 at 8:40 AM, Jeffrey Kesselman <jef...@nphos.com> wrote:
>>>>
>>>>> Jon gave a much better and more detailed description,
>>>>> but fundamentally no true lock is possible without an atomic test and
>>>>> set operation.
>>>>>
>>>>> So far, of all the NoSQL DBs I've looked at, only Mongo has that
>>>>> capability.
>>>>>
>>>>>
>>>>> On Sun, Jul 31, 2011 at 4:55 PM, Soren Hansen <so...@linux2go.dk> wrote:
>>>>>
>>>>>> I've seen a couple of posts here and there on the subject of a locking
>>>>>> mechanism for Riak, most notably:
>>>>>>
>>>>>> http://riak-users.197444.n3.nabble.com/Riak-and-Locks-td1866960.html
>>>>>>
>>>>>> While it would only serve as an advisory locking mechanism, wouldn't a
>>>>>> bucket with a reasonably high n, w and dw set equal to n, a
>>>>>> deterministic naming scheme for the object being locked, and a locking
>>>>>> algorithm such as:
>>>>>>
>>>>>> 1. PUT /locks/object_id
>>>>>>    If-None-Match: *
>>>>>>    Body: <some unique identifier for the client thread>
>>>>>>
>>>>>> 1a. If this fails, wait for a while, then try again.
>>>>>> 1b. If it succeeds, proceed to 2.
>>>>>>
>>>>>> 2. The doc for If-None-Match says "this does not prevent concurrent
>>>>>> writes; it is possible for the condition to evaluate to true for
>>>>>> multiple requests if the requests occur at the same time." I'm not
>>>>>> completely sure if n=w=dw protects me from concurrent writes (I'm not
>>>>>> familiar with the locking semantics of a single riak instance).
>>>>>> Anyway, if I'm in fact not protected, the next step is to read the
>>>>>> value back to make sure we're actually the ones holding the key. If
>>>>>> not, go back to step 1. If yes, proceed as planned.
>>>>>>
>>>>>> 3. Once you're done with the lock, just DELETE it.
>>>>>>
>>>>>> If this were really that simple, someone would have suggested it. So,
>>>>>> what is this Riak rookie (i.e. I) missing?
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Soren Hansen | http://linux2go.dk/
>>>>>> Ubuntu Developer | http://www.ubuntu.com/
>>>>>> OpenStack Developer | http://www.openstack.org/
>>>>>>
>>>>>> _______________________________________________
>>>>>> riak-users mailing list
>>>>>> riak-users@lists.basho.com
>>>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
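
For reference, the three steps in Soren's original message can be sketched roughly as follows. This is only an illustration under stated assumptions: FakeBucket is a hypothetical in-memory stand-in for the /locks bucket, not a Riak client, and put_if_none_match() merely models what a PUT with If-None-Match: * is meant to do. The attempts and delay parameters are also made up.

```python
import time


class FakeBucket:
    """Hypothetical in-memory stand-in for the /locks bucket (not a Riak client)."""

    def __init__(self):
        self._data = {}

    def put_if_none_match(self, key, body):
        """Model PUT with If-None-Match: * (succeeds only if the key is absent)."""
        if key in self._data:
            return False
        self._data[key] = body
        return True

    def get(self, key):
        """Return the stored body, or None if the key does not exist."""
        return self._data.get(key)

    def delete(self, key):
        """Remove the key if present."""
        self._data.pop(key, None)


def acquire(bucket, object_id, client_id, attempts=5, delay=0.01):
    """Steps 1 and 2: conditional PUT, then read back to confirm ownership."""
    for _ in range(attempts):
        if (bucket.put_if_none_match(object_id, client_id)
                and bucket.get(object_id) == client_id):
            return True
        time.sleep(delay)  # step 1a: wait for a while, then try again
    return False


def release(bucket, object_id):
    """Step 3: once done with the lock, just delete it."""
    bucket.delete(object_id)
```

A single in-process dict is trivially atomic, so this sketch shows the control flow only. It does not reproduce the interleaved vnode requests or the replication lag discussed in this thread, which is where the scheme is in doubt.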