Sorry for being dismissive; I do understand what you're after. I'm just saying that if your application needs those semantics, build them in -- don't expect Riak's vector clocks to do the work for you. Keep a list of the most recent "change" events either in that object or alongside it, or keep a copy of the last-seen version in your object -- whatever works to make those kinds of merges possible.
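As a rough illustration of the "keep a copy of the last-seen version in your object" option, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the envelope layout (a "value" plus the "ancestor" its writer last read) and the names three_way_merge and resolve_siblings are hypothetical, and no Riak client calls are shown. It only handles flat objects, and it gives up when both writers changed the same field differently.

```python
# Hypothetical sketch: each stored object carries the version its writer
# last read ("ancestor") next to the new "value". This is not a Riak API.

_MISSING = object()  # marks a field that is absent from a version


def three_way_merge(ancestor, left, right):
    """Merge two flat dicts against their common ancestor.

    Returns (merged, conflicts); conflicts lists the fields that both
    sides changed to different values.
    """
    merged, conflicts = {}, []
    for key in sorted(set(ancestor) | set(left) | set(right)):
        base = ancestor.get(key, _MISSING)
        a = left.get(key, _MISSING)
        b = right.get(key, _MISSING)
        if a == b:        # both sides agree (or both removed the field)
            value = a
        elif a == base:   # only the right side changed it
            value = b
        elif b == base:   # only the left side changed it
            value = a
        else:             # both changed it, in different ways
            conflicts.append(key)
            continue
        if value is not _MISSING:
            merged[key] = value
    return merged, conflicts


def resolve_siblings(left, right):
    """Return the merged value for two sibling envelopes, or None."""
    if left["ancestor"] != right["ancestor"]:
        return None  # no shared ancestor recorded; log it instead
    merged, conflicts = three_way_merge(
        left["ancestor"], left["value"], right["value"])
    return None if conflicts else merged


# The two conflicting siblings from the example quoted further down.
ancestor = {"name": "Jane Doe", "occupation": "secretary"}
sibling_a = {"ancestor": ancestor,
             "value": {"name": "Jane Doe", "occupation": "n/a"}}
sibling_b = {"ancestor": ancestor,
             "value": {"name": "Jane Blow", "husband": "Joe Blow",
                       "occupation": "secretary"}}
print(resolve_siblings(sibling_a, sibling_b))
```

With the "Jane Doe, secretary" ancestor this resolves to the married record (Jane Blow, occupation "n/a"); swapping in the other ancestor resolves to the "Jane Doe, secretary" record instead. The cost is that every write stores roughly two copies of the object, which is the kind of trade-off the application has to choose for itself.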
Interestingly, multiple people have explored the SCM-on-top-of-Riak thing, so I know it's doable; the key difference there is that multiple, independently written objects are used to represent the history of a single conceptual "object". Once written, nothing is overwritten, only new objects are created.

Sean Cribbs <s...@basho.com>
Developer Advocate
Basho Technologies, Inc.
http://basho.com/

On Apr 18, 2011, at 10:46 PM, Ben Tilly wrote:

> I'm not missing the point you think I am. Riak already has the
> ability to store more than one value for a key/value pair. I'd like
> an option - possibly named something new, that used this to store a
> limited amount of history so that clients could be presented with a
> common ancestor when that was required.
>
> In the case that I gave you, if the common ancestor is:
>
> {
>     "name": "Jane Doe",
>     "occupation": "secretary"
> }
>
> then a standard three-way merge would say that she got married and the
> correct result should be:
>
> {
>     "name": "Jane Blow",
>     "husband": "Joe Blow",
>     "occupation": "n/a"
> }
>
> while if the common ancestor is:
>
> {
>     "name": "Jane Blow",
>     "husband": "Joe Blow",
>     "occupation": "n/a"
> }
>
> then a standard 3-way merge would say that she dumped the jerk and got
> a job resulting in:
>
> {
>     "name": "Jane Doe",
>     "occupation": "secretary"
> }
>
> Without the common ancestor you know what changed, but not which
> direction the changes are going, and so have no sane way to resolve
> the conflict.
>
> Given the non-atomic nature of reads and writes in Riak, it is likely
> that neither of the two clients that wrote that data was in any way
> aware of the existence of the other write. This makes your suggestion
> of escalating to the user impossible. And there is no particular
> reason to believe that the third user to come along will necessarily
> know anything either.
>
> (Besides, I spent enough years maintaining batch systems to be wary of
> escalating to users at the drop of a hat.
> The "user" may well be a
> complete moron on autopilot.)
>
> On Mon, Apr 18, 2011 at 7:01 PM, Sean Cribbs <s...@basho.com> wrote:
>> I think you're missing a key point here, and that is that the vector clock
>> doesn't store copies of the *values*, only the individual "touches" of
>> identified clients. I'm not sure what computing the common ancestor is going
>> to give you if you don't have the value. Vector clocks are essentially
>> opaque to clients.
>>
>> That said, I think the use-case you gave is one that can clearly bubble up
>> to the user, e.g. "Someone else changed this record while you were editing
>> it. Can you resolve the differences?" (Give the other person's name perhaps,
>> highlight the fields that are different.)
>>
>> Sean Cribbs <s...@basho.com>
>> Developer Advocate
>> Basho Technologies, Inc.
>> http://basho.com/
>>
>> On Apr 18, 2011, at 9:12 PM, Ben Tilly wrote:
>>
>>> Riak's small_vclock, big_vclock, young_vclock, and old_vclock
>>> parameters already give control over pruning behavior. If there isn't
>>> enough history to compute a common ancestor, then return nothing for
>>> the common ancestor.
>>>
>>> The use case here really isn't an SCM. The use case is when two
>>> clients get simultaneous (within, say, 50 ms) requests to write to the
>>> same object. When a third one tries to read the data 5s later, it
>>> would be nice to have a way to figure out what to do. For this use
>>> case you can limit the amount of history quite severely without loss.
>>>
>>> Let's take a practical example of conflicting data structures:
>>>
>>> {
>>>     "name": "Jane Doe",
>>>     "occupation": "n/a"
>>> },
>>> {
>>>     "name": "Jane Blow",
>>>     "husband": "Joe Blow",
>>>     "occupation": "secretary"
>>> }
>>>
>>> What should it be resolved to? Perhaps Jane just got divorced and
>>> went to work as a secretary. Or she could have gotten married and
>>> left her job. If you give me the common ancestor I can tell which
>>> scenario to believe.
>>> Without it I can only guess badly. I don't want
>>> to keep a history here. I want to resolve the discrepancy the next
>>> time I see it (and log it somewhere important if I can't resolve it).
>>>
>>> On Mon, Apr 18, 2011 at 5:38 PM, Sean Cribbs <s...@basho.com> wrote:
>>>> Yes, but vector clocks are for resolution of race-conditions and network
>>>> partitions, not to provide an SCM history. Imagine how much space would
>>>> be consumed by the history long enough to disambiguate an object that has
>>>> been updated normally 1000 times, followed by one bad client that decides
>>>> to write to it without fetching the vector clock first.
>>>>
>>>> Coda Hale put it well in his talk at the recent Riak Meetup: your data
>>>> needs to be logically monotonic so that writes (and reads) can be retried
>>>> until resolution is reached.
>>>>
>>>> Also, we've found that assigning the client id to something that is
>>>> relevant to your domain, e.g. real people, will help reduce surprises (and
>>>> degenerate cases like sibling explosion) when it comes to vector-clock
>>>> resolution.
>>>>
>>>> Sean Cribbs <s...@basho.com>
>>>> Developer Advocate
>>>> Basho Technologies, Inc.
>>>> http://basho.com/
>>>>
>>>> On Apr 18, 2011, at 8:15 PM, Aphyr wrote:
>>>>
>>>>>> I actually had a question about that page. Why is it that when there
>>>>>> is a conflict we can only get the conflicting versions of the data?
>>>>>> If I'm going to try to resolve the conflict intelligently, I really
>>>>>> want the common ancestor as well so that I can try to do a 3-way
>>>>>> merge.
>>>>>
>>>>> Good call. If an ancestor were available it would make counting and
>>>>> merging orthogonal changes *much* simpler.
>>>>>
>>>>> _______________________________________________
>>>>> riak-users mailing list
>>>>> riak-users@lists.basho.com
>>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com