To save others the effort of watching a fairly long video: it
demonstrates how to show the conflict to the user.  In the application
where this is done (a wiki), that is an entirely appropriate UI.

On Mon, Apr 18, 2011 at 9:01 PM, Eric Moritz <e...@themoritzfamily.com> wrote:
> Ben,
>
> There's a little demo app that was written by someone at Basho that
> demonstrates a way to accomplish what you're talking about.
>
> http://forms.basho.com/riak-in-action-wriaki-p/
>
> Eric.
>
> On Mon, Apr 18, 2011 at 11:05 PM, Sean Cribbs <s...@basho.com> wrote:
>> Sorry for being dismissive, I do understand what you're after. I'm just 
>> saying that if your application needs those semantics, build them in -- 
>> don't expect Riak's vector clocks to do the work for you. Keep a list of the 
>> most recent "change" events either in that object or alongside, or keep a 
>> copy of the last-seen version in your object -- whatever works to make those 
>> kinds of merges possible.
>>
>> Interestingly, multiple people have explored the SCM-on-top-of-Riak thing, 
>> so I know it's doable; the key difference there is that multiple, 
>> independently written objects are used to represent the history of a single 
>> conceptual "object". Once written, nothing is overwritten, only new objects 
>> are created.
>>
>> Sean Cribbs <s...@basho.com>
>> Developer Advocate
>> Basho Technologies, Inc.
>> http://basho.com/
>>
>> On Apr 18, 2011, at 10:46 PM, Ben Tilly wrote:
>>
>>> I'm not missing the point you think I am.  Riak already has the
>>> ability to store more than one value for a key/value pair.  I'd like
>>> an option, possibly under a new name, that uses this to store a
>>> limited amount of history so that clients could be presented with a
>>> common ancestor when one is required.
>>>
>>> In the case that I gave you, if the common ancestor is:
>>>
>>>  {
>>>    "name": "Jane Doe",
>>>    "occupation": "secretary"
>>>  }
>>>
>>> then a standard three-way merge would say that she got married and the
>>> correct result should be:
>>>
>>>  {
>>>    "name": "Jane Blow",
>>>    "husband": "Joe Blow",
>>>    "occupation": "n/a"
>>>  }
>>>
>>> while if the common ancestor is:
>>>
>>>  {
>>>    "name": "Jane Blow",
>>>    "husband": "Joe Blow",
>>>    "occupation": "n/a"
>>>  }
>>>
>>> then a standard 3-way merge would say that she dumped the jerk and got
>>> a job resulting in:
>>>
>>>  {
>>>    "name": "Jane Doe",
>>>    "occupation": "secretary"
>>>  }
>>>
>>> Without the common ancestor you know what changed, but not which
>>> direction the changes are going, and so have no sane way to resolve
>>> the conflict.
>>>
>>> Given the non-atomic nature of reads and writes in Riak, it is likely
>>> that neither of the two clients that wrote that data was in any way
>>> aware of the existence of the other write.  This makes your suggestion
>>> of escalating to the user impossible.  And there is no particular
>>> reason to believe that the third user to come along will necessarily
>>> know anything either.
>>>
>>> (Besides, I spent enough years maintaining batch systems to be wary of
>>> escalating to users at the drop of a hat.  The "user" may well be a
>>> complete moron on autopilot.)
>>>
>>> On Mon, Apr 18, 2011 at 7:01 PM, Sean Cribbs <s...@basho.com> wrote:
>>>> I think you're missing a key point here, and that is that the vector clock 
>>>> doesn't store copies of the *values*, only the individual "touches" of 
>>>> identified clients. I'm not sure what computing the common ancestor is 
>>>> going to give you if you don't have the value.  Vector clocks are 
>>>> essentially opaque to clients.
>>>>
>>>> That said, I think the use-case you gave is one that can clearly bubble up 
>>>> to the user, e.g. "Someone else changed this record while you were editing 
>>>> it. Can you resolve the differences?" (Give the other person's name 
>>>> perhaps, highlight the fields that are different.)
>>>>
>>>> Sean Cribbs <s...@basho.com>
>>>> Developer Advocate
>>>> Basho Technologies, Inc.
>>>> http://basho.com/
>>>>
>>>> On Apr 18, 2011, at 9:12 PM, Ben Tilly wrote:
>>>>
>>>>> Riak's small_vclock, big_vclock, young_vclock, and old_vclock
>>>>> parameters already give control over pruning behavior.  If there isn't
>>>>> enough history to compute a common ancestor, then return nothing for
>>>>> the common ancestor.
>>>>>
>>>>> The use case here really isn't an SCM.  The use case is when two
>>>>> clients get simultaneous (within, say, 50 ms) requests to write to the
>>>>> same object.  When a third one tries to read the data 5s later, it
>>>>> would be nice to have a way to figure out what to do.  For this use
>>>>> case you can limit the amount of history quite severely without loss.
>>>>>
>>>>> Let's take a practical example of conflicting data structures:
>>>>>
>>>>>  {
>>>>>    "name": "Jane Doe",
>>>>>    "occupation": "n/a"
>>>>>  },
>>>>>  {
>>>>>    "name": "Jane Blow",
>>>>>    "husband": "Joe Blow",
>>>>>    "occupation": "secretary"
>>>>>  }
>>>>>
>>>>> What should it be resolved to?  Perhaps Jane just got divorced and
>>>>> went to work as a secretary.  Or she could have gotten married and
>>>>> left her job.  If you give me the common ancestor I can tell which
>>>>> scenario to believe.  Without it I can only guess badly.  I don't want
>>>>> to keep a history here.  I want to resolve the discrepancy the next
>>>>> time I see it (and log it somewhere important if I can't resolve it).
>>>>>
>>>>> On Mon, Apr 18, 2011 at 5:38 PM, Sean Cribbs <s...@basho.com> wrote:
>>>>>> Yes, but vector clocks are for resolution of race-conditions and network 
>>>>>> partitions, not to provide an SCM history.  Imagine how much space would 
>>>>>> be consumed by the history long enough to disambiguate an object that 
>>>>>> has been updated normally 1000 times, followed by one bad client that 
>>>>>> decides to write to it without fetching the vector clock first.
>>>>>>
>>>>>> Coda Hale put it well in his talk at the recent Riak Meetup: your data 
>>>>>> needs to be logically monotonic so that writes (and reads) can be 
>>>>>> retried until resolution is reached.
>>>>>>
>>>>>> Also, we've found that assigning the client id to something that is 
>>>>>> relevant to your domain, e.g. real people, will help reduce surprises 
>>>>>> (and degenerate cases like sibling explosion) when it comes to 
>>>>>> vector-clock resolution.
>>>>>>
>>>>>> Sean Cribbs <s...@basho.com>
>>>>>> Developer Advocate
>>>>>> Basho Technologies, Inc.
>>>>>> http://basho.com/
>>>>>>
>>>>>> On Apr 18, 2011, at 8:15 PM, Aphyr wrote:
>>>>>>
>>>>>>>> I actually had a question about that page.  Why is it that when there
>>>>>>>> is a conflict we can only get the conflicting versions of the data?
>>>>>>>> If I'm going to try to resolve the conflict intelligently, I really
>>>>>>>> want the common ancestor as well so that I can try to do a 3-way
>>>>>>>> merge.
>>>>>>>
>>>>>>> Good call. If an ancestor were available it would make counting and 
>>>>>>> merging orthogonal changes *much* simpler.
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> riak-users mailing list
>>>>>>> riak-users@lists.basho.com
>>>>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>>>>
>>>>>>
>>>>
>>>>
>
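For anyone skimming the thread, the three-way merge Ben describes can be sketched in a few lines of Python. This is only an illustration, not anything Riak provides: the function name `three_way_merge` and the `_MISSING` sentinel are made up for this example, the objects are assumed to be flat JSON-style dicts, and the common ancestor is assumed to have been stored by the application itself (as Sean suggests), since on a conflict Riak hands back only the conflicting versions.

```python
_MISSING = object()  # sentinel: tells "key absent" apart from a stored None


def three_way_merge(ancestor, left, right):
    """Merge two conflicting flat dicts against their common ancestor.

    Returns (merged, conflicts). `conflicts` lists the keys that both
    sides changed to different values; those keep the ancestor's value
    (or stay absent) and must be resolved some other way.
    """
    merged, conflicts = {}, []
    for key in sorted(set(ancestor) | set(left) | set(right)):
        base = ancestor.get(key, _MISSING)
        l = left.get(key, _MISSING)
        r = right.get(key, _MISSING)
        if l == r:
            value = l          # both sides agree (or both removed the key)
        elif l == base:
            value = r          # only the right side changed this key
        elif r == base:
            value = l          # only the left side changed this key
        else:
            conflicts.append(key)
            value = base       # both changed it differently: unresolved
        if value is not _MISSING:
            merged[key] = value
    return merged, conflicts


# The two siblings from Ben's example.
sibling_a = {"name": "Jane Doe", "occupation": "n/a"}
sibling_b = {"name": "Jane Blow", "husband": "Joe Blow",
             "occupation": "secretary"}

# Scenario 1: the ancestor is the unmarried secretary.
married, _ = three_way_merge(
    {"name": "Jane Doe", "occupation": "secretary"}, sibling_a, sibling_b)

# Scenario 2: the ancestor is the married Jane with no job.
divorced, _ = three_way_merge(
    {"name": "Jane Blow", "husband": "Joe Blow", "occupation": "n/a"},
    sibling_a, sibling_b)

# No ancestor at all: every differing field is an unresolvable conflict.
_, unresolved = three_way_merge({}, sibling_a, sibling_b)
```

The same two siblings merge to opposite results depending on the ancestor, and with an empty ancestor the function can only report `name` and `occupation` as conflicts, which is the point Ben is making.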
