Re: Future of Apache wave [Was: Re: Advantages of P2P messaging?]

Bruno Gonzalez (aka stenyak) Thu, 13 Jun 2013 13:16:15 -0700

I assume the "path" or "index" would be abstracted too. This way, OT can
also handle the (x,y) position of a pixel in an image, or any other kind of
position or range in which the operation must be applied.
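
A minimal sketch of the "OT only looks at positions" idea from the quoted
thread, assuming a list-like container addressed by integer index. The names
Insert, transform, apply and the site tie-breaker are hypothetical and are not
taken from Wave or ShareJS. The payload is never inspected, so it can be a
character, a contact card or anything else; a path or an (x,y) position would
need its own comparison rule in place of the integer test.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass(frozen=True)
class Insert:
    index: int    # position in a list-like container
    payload: Any  # opaque to the transform: a char, a record, an object...
    site: int     # hypothetical tie-breaker for inserts at the same index


def transform(op: Insert, other: Insert) -> Insert:
    """Rewrite op so it can be applied after other has already been applied."""
    other_goes_first = other.index < op.index or (
        other.index == op.index and other.site < op.site
    )
    if other_goes_first:
        # An earlier insert shifts everything after it one slot to the right.
        return Insert(op.index + 1, op.payload, op.site)
    return op


def apply(doc: List[Any], op: Insert) -> List[Any]:
    """Return a new document with the payload inserted at op.index."""
    return doc[:op.index] + [op.payload] + doc[op.index:]


doc = ["a", "b", "c"]
a = Insert(1, "X", site=1)
b = Insert(1, {"card": "Bob"}, site=2)

# Both application orders converge on the same document.
via_a_first = apply(apply(doc, a), transform(b, a))
via_b_first = apply(apply(doc, b), transform(a, b))
```

Only the index comparison is type-specific here, which is the part I assume
would be abstracted.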



On Thu, Jun 13, 2013 at 10:06 PM, Sam Nelson <so...@orcon.net.nz> wrote:

> Hi Michael,
>
> I'm trying to wrap my head around this too.
> Say you have some JSON object:
> {
>   "i" : 5,
>   "s" : "string",
>   "c" : { "i" : 2 },
>   "a" : [ { "i" : 3 } ]
> }
>
> What would the parameters be to delete "s" since a path is really required
> isn't it, rather than an index? (i.e. parameters are specific to the type
> they operate on)  And further, what would a delete operation do in this
> case?  remove the "s" member of the object, or just set its value to null?
>  That decision could be application implementation specific, sure, but if
> the application needed both concepts, how can you now define two abstract
> delete operations, in order for the application to implement them both for
> each case?
>
> -Sam
>
>
>
>
> On 14/06/2013 07:45, Michael MacFadden wrote:
>
>> Joseph,
>>
>> We are almost in sync now.  Let's go one step further.  Let's say you were
>> designing an application to be a rich text editor.  Forget OT, you are just
>> making an editor.  I assume your editor has to have some sort of model,
>> right?  Let's temporarily forget the persistence format.  You may save the
>> rich text to xml, or rtf, or whatever, but I am not worried about that.  I
>> am saying what is the in memory model that your editor uses to interact
>> with the document?  Build that.  Build it any way you like.
>>
>> Ok so now you have a rich text object model.  Your editor is going to
>> interact with that through some sort of object model API.  When the user
>> selects some text and presses the bold button, the editor makes some API
>> call to the model and says, make this part bold.  For the sake of
>> conversation, I don't care how that internally happens in the object data
>> model.
>>
>> OK.  So now if we have a sufficiently powerful OT operation set that can
>> describe manipulating objects, we can manipulate the object model with OT.
>> Really, what OT services are is robust message buses that describe how
>> one user is changing the objects to another user, accounting for
>> context transformations along the way.  So if you can build an abstract OT
>> operation set that lets you mess with objects and object structures, then
>> you have a shot at adapting that operation set to a whole slew of
>> applications.
>>
>> This is actually an ongoing area of research, that I presented a paper on
>> to the collaborative editing workshop at the ACM CSCW conference last
>> year.
>>
>> ~Michael
>>
>> On 6/13/13 8:34 PM, "Joseph Gentle" <jose...@gmail.com> wrote:
>>
>>  Interesting...
>>>
>>> The abstraction I use is to have a bunch of data types. Each data type
>>> defines what documents look like, what operations look like and they
>>> define a set of OT functions (transform, compose, apply, etc). Eg,
>>> Text documents are strings and their operations are lists of {skip:5},
>>> {insert:'hi'}, {delete:10}, etc. JSON documents are JSON and their
>>> operations are lists of path+what to do there. Eg, [{path: ['hi'],
>>> delete list element 5}, ...]
>>>
>>> It sounds like you're saying we should abstract over the ideas of
>>> ot-for-lists, ot-for-sets and so on. Is that right?
>>>
>>> ... But rich text isn't quite a list or a set. You can make annotation
>>> markers or something, but then they take up space. Maybe it's possible
>>> to ignore the final document space that an annotation takes up for the
>>> purpose of transformation?
>>>
>>> Another architecture I've thought about using is making all documents
>>> use the JSON OT code. Specialized types like rich text can exist as
>>> leaves in the JSON structure - and let you embed a rich text operation
>>> inside a JSON operation.
>>>
>>> -J
>>>
>>>
>>> On Thu, Jun 13, 2013 at 12:05 PM, Michael MacFadden
>>> <michael.macfad...@gmail.com> wrote:
>>>
>>>> As a follow up.  The reason you are struggling with the concept is that
>>>> you have tied the operation language directly to a specific data model, in
>>>> much the way wave did.  They created a conversation model and a specific
>>>> set of operations that act on that model.  When you do that, your
>>>> operations are making assumptions about how the object model works.  This
>>>> coupling is not a good idea.  Much of the OT community strongly recommends
>>>> avoiding this.
>>>>
>>>> Rather, create a generic set of operations that manipulate things in an
>>>> abstract way, and then let the application sort out what to do with the
>>>> operations when it receives them.  The OT stack only needs to understand how
>>>> the parameters of the operations interact, such as positional arguments
>>>> for insert and delete style operations.  The OT stack doesn't need to know
>>>> that the thing you are inserting is a character, a contact card, a
>>>> database record, or an object in a list.  It doesn't care.  It just knows
>>>> that if one insert happens before another it has to increment the index of
>>>> the second operation.
>>>>
>>>> If things are decoupled in this way, the whole OT stack becomes much more
>>>> flexible.  As one of the founders of OT says almost every time I see him,
>>>> "Let OT focus on what it is good at, and let it ignore everything else".
>>>>
>>>> ~Michael
>>>>
>>>> On 6/13/13 7:54 PM, "Joseph Gentle" <jose...@gmail.com> wrote:
>>>>
>>>>  So you're imagining storing rich text like this?
>>>>>
>>>>> {doc: 'hi there!', annotations: [{from:0, to:2, bold:true}]} or
>>>>> something?
>>>>>
>>>>> Every change to the document is going to need to manually update every
>>>>> single annotation which has start / end points after the edit. But it
>>>>> wouldn't work - if you insert some text and I edit an annotation later
>>>>> in the document, my annotation will float forwards / backwards when I
>>>>> get your op because I don't know how I should change it.
>>>>>
>>>>> This idea comes up about every 6 months on the sharejs mailing list.
>>>>> Several solutions have been proposed, but none of them work correctly.
>>>>> I think we just need a separate set of transform / apply / ...
>>>>> functions for rich text.
>>>>>
>>>>> -J
>>>>>
>>>>>
>>>>> On Thu, Jun 13, 2013 at 1:19 AM, Michael MacFadden
>>>>> <michael.macfad...@gmail.com> wrote:
>>>>>
>>>>>> Joseph,
>>>>>>
>>>>>> I disagree.  The annotations themselves are just another data structure.
>>>>>> You add them, remove them and modify them like anything else.  You can
>>>>>> manage annotations as another structure within the blip model.  There is
>>>>>> no reason why you can't interface with them through a JSON-style operations
>>>>>> structure.
>>>>>>
>>>>>> ~Michael
>>>>>>
>>>>>> On 6/13/13 12:11 AM, "Joseph Gentle" <jose...@gmail.com> wrote:
>>>>>>
>>>>>>  The conversation *model* yes, but not the rich text documents
>>>>>>> themselves. You can't really make text annotations work properly on
>>>>>>> top of JSON operations. We should keep something like the current
>>>>>>> system for actual blips.
>>>>>>>
>>>>>>> -J
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Jun 12, 2013 at 4:06 PM, Michael MacFadden
>>>>>>> <michael.macfad...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Actually I just went and took a look at your operations.  The JSON OT
>>>>>>>> type is probably the closest to what I would suggest we use.  JSON
>>>>>>>> objects are not just for JavaScript.  They define arbitrary object
>>>>>>>> structures.  We don't need a specific wave XML type; we could use the
>>>>>>>> JSON operations to modify the conversation model.
>>>>>>>>
>>>>>>>> Potentially.
>>>>>>>>
>>>>>>>>
>>>>>>>> On 6/12/13 10:55 PM, "Joseph Gentle" <jose...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>  Really?
>>>>>>>>>
>>>>>>>>> My method for ShareJS was to simply have a JSON OT type and a
>>>>>>>>> plaintext OT type. I'd like to add a rich text OT type as well. Then
>>>>>>>>> people can just pick which one based on what kind of data they have.
>>>>>>>>>
>>>>>>>>> For Wave I'd like to be able to do something similar - JSON is
>>>>>>>>> obviously useful for storing application data. It'd be nice to have
>>>>>>>>> some sort of hybrid for wavelets where we can put multiple different
>>>>>>>>> kinds of data inside a wavelet. One option is to use a JSON OT type as
>>>>>>>>> the root of all wavelets and support subdocuments at arbitrary paths
>>>>>>>>> (so the object could be:
>>>>>>>>> {projectName:"ruby on rails", files:[{name:'foo/bar.rb', ...}],
>>>>>>>>> documentation:{_type:richtext, _data:"<Rich text data>"}})
>>>>>>>>>
>>>>>>>>> Or wavelets could simply each have a type (defaulting to the current
>>>>>>>>> wavey XML type).
>>>>>>>>>
>>>>>>>>> -J
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Jun 12, 2013 at 2:41 PM, Michael MacFadden
>>>>>>>>> <michael.macfad...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> You have stumbled upon one of the weaknesses of wave OT.  Best practices
>>>>>>>>>> would say to NOT bind your OT directly to the data type, because then you
>>>>>>>>>> don't have an extensible model. For example, if you have all of your
>>>>>>>>>> operations figured out and validated, and then you need to change your
>>>>>>>>>> data model, you have to go back and mess with your transformation
>>>>>>>>>> functions.  Not good.  Or you have to try to bend new data models into
>>>>>>>>>> the existing one, also not good.
>>>>>>>>>>
>>>>>>>>>> Best practice is to create a generic OT model and operate on that.  There
>>>>>>>>>> is debate as to what the model should be, but most agree on the concept.
>>>>>>>>>>
>>>>>>>>>> For example, in wave they tried to create a map-like collection that OT
>>>>>>>>>> could operate on. Essentially, though, they had to implement the map as if
>>>>>>>>>> its underlying model was a bunch of XMLish type tags.  This was very
>>>>>>>>>> convoluted.
>>>>>>>>>>
>>>>>>>>>> ~Michael
>>>>>>>>>>
>>>>>>>>>> On 6/12/13 10:26 PM, "Joseph Gentle" <jose...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Yeah exactly. The google wave OT code uses special operations that can
>>>>>>>>>>> understand the XML structure. It doesn't just edit the plaintext.
>>>>>>>>>>> Formatting annotations are stored in a special way - operations can
>>>>>>>>>>> say something like "At position 10 add bold. At position 20 stop
>>>>>>>>>>> adding bold".
>>>>>>>>>>>
>>>>>>>>>>> -J
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Jun 12, 2013 at 1:56 PM, Bruno Gonzalez (aka stenyak)
>>>>>>>>>>> <sten...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I suspected something like that. I assume it also correctly handles
>>>>>>>>>>>> variable-length UTF8 characters, so it's not necessarily 1-byte
>>>>>>>>>>>> patches?
>>>>>>>>>>>>
>>>>>>>>>>>> This starts to make sense. OT can only compute conflict-free merges
>>>>>>>>>>>> using the "character" primitive (because that's how Wave was
>>>>>>>>>>>> originally designed). As an unfortunate consequence, you can then
>>>>>>>>>>>> only OT-operate on plain text. Otherwise you could get conflict-free
>>>>>>>>>>>> xml text that <loo<ks li<>ke>this>, and that of course isn't legal xml.
>>>>>>>>>>>> But we still want rich text in Google Wave, therefore all the
>>>>>>>>>>>> formatting stuff is stored some place else, specifically in the blip
>>>>>>>>>>>> annotations. The modifications to annotations are (sometimes) simply
>>>>>>>>>>>> derived from the transformations that the plain text suffers after
>>>>>>>>>>>> merges?
>>>>>>>>>>>>
>>>>>>>>>>>> I suppose there could be other OT algorithms that don't use a
>>>>>>>>>>>> "character" primitive, but rather an "xml tag" primitive, a json item,
>>>>>>>>>>>> a "pixel", or anything else, right?
>>>>>>>>>>>>
>>>>>>>>>>>> (sorry for only contributing with questions... :-)
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Jun 12, 2013 at 10:27 PM, Joseph Gentle
>>>>>>>>>>>> <jose...@gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>  On Wed, Jun 12, 2013 at 12:13 PM, Bruno Gonzalez (aka stenyak)
>>>>>>>>>>>>> <sten...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> My assumption was that conflicts were simply mathematically inevitable
>>>>>>>>>>>>>> in a DVCS, that's why your mention about lack of conflict markers
>>>>>>>>>>>>>> sparked my interest... you mention conflicts like they can be optional?
>>>>>>>>>>>>>> If so, are conflicts "eliminated" by choosing an arbitrary merging
>>>>>>>>>>>>>> strategy when conflicts *do* happen (e.g. "choose the last timestamped
>>>>>>>>>>>>>> patch and lose information on the way, we don't care"), or can they be
>>>>>>>>>>>>>> prevented from ever happening in the first place?
>>>>>>>>>>>>>>
>>>>>>>>>>>>> They're inevitable in patch based systems because patches usually have
>>>>>>>>>>>>> a line level granularity. OT usually uses individual character
>>>>>>>>>>>>> positions. In OT, if two operations both delete the same character,
>>>>>>>>>>>>> the character gets deleted once. If two clients insert a character at
>>>>>>>>>>>>> the same position, one of the characters will be first in the
>>>>>>>>>>>>> resultant document and one will be second. Conflict markers just
>>>>>>>>>>>>> aren't necessary.
>>>>>>>>>>>>>
>>>>>>>>>>>>> -J
>>>>>>>>>>>>>
>>>>>>>>>>>>>  --
>>>>>>>>>>>>>> Saludos,
>>>>>>>>>>>>>>       Bruno González
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>> Jabber: stenyak AT gmail.com
>>>>>>>>>>>>>> http://www.stenyak.com
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Saludos,
>>>>>>>>>>>>       Bruno González
>>>>>>>>>>>>
>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>> Jabber: stenyak AT gmail.com
>>>>>>>>>>>> http://www.stenyak.com
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>
>>
>>
>>
>


-- 
Saludos,
     Bruno González

_______________________________________________
Jabber: stenyak AT gmail.com
http://www.stenyak.com
