Re: Future of Apache wave [Was: Re: Advantages of P2P messaging?]

Joseph Gentle Thu, 13 Jun 2013 11:56:15 -0700

So you're imagining storing rich text like this?

{doc: 'hi there!', annotations: [{from:0, to:2, bold:true}]} or something?


Every change to the document is going to need to manually update every
single annotation which has start / end points after the edit. But it
wouldn't work - if you insert some text and I edit an annotation later
in the document, my annotation will float forwards / backwards when I
get your op because I don't know how I should change it.
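
A rough sketch of the problem, in Python. The helper names
(insert_text, set_annotation, covered) and the boundary rule for
shifting are made up for illustration; they are not ShareJS or Wave
APIs. It assumes annotations live next to the text as plain
{from, to} records and that a concurrent annotation edit arrives as an
ordinary "replace this object" op:

```python
def insert_text(state, pos, text):
    """Insert text and shift every annotation boundary after the edit."""
    doc = state["doc"][:pos] + text + state["doc"][pos:]
    shift = len(text)
    annotations = [
        {**a,
         # Assumed rule: a start at or after pos moves, an end strictly after pos moves.
         "from": a["from"] + shift if a["from"] >= pos else a["from"],
         "to": a["to"] + shift if a["to"] > pos else a["to"]}
        for a in state["annotations"]
    ]
    return {"doc": doc, "annotations": annotations}


def set_annotation(state, index, annotation):
    """Replace one annotation wholesale, the way a generic object replace would."""
    annotations = list(state["annotations"])
    annotations[index] = annotation
    return {"doc": state["doc"], "annotations": annotations}


def covered(state, index):
    """Return the text an annotation currently covers."""
    a = state["annotations"][index]
    return state["doc"][a["from"]:a["to"]]


start = {"doc": "hi there!", "annotations": [{"from": 0, "to": 2, "bold": True}]}

# Your op: insert text at the front. Existing annotations get shifted by hand.
after_insert = insert_text(start, 0, "oh, ")

# My concurrent op, written against `start`: move the bold onto "there".
mine = {"from": 3, "to": 8, "bold": True}
intended = covered(set_annotation(start, 0, mine), 0)

# Applied after your insert without any position-aware transform,
# the range now covers different characters than I meant.
actual = covered(set_annotation(after_insert, 0, mine), 0)
```

Here intended is "there" but actual is " hi t": the numbers inside my
op were never adjusted for your insert, because nothing in a generic
object replace says they are document positions.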

This idea comes up about every 6 months on the sharejs mailing list.
Several solutions have been proposed, but none of them work correctly.
I think we just need a separate set of transform / apply / ...
functions for rich text.
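
As a hedged sketch of what a rich-text-specific transform might look
like (Python; the op shapes and function names are assumptions for
illustration, not the Wave or ShareJS formats): if a formatting change
is an op with explicit positions, it can be transformed against a
concurrent insert just like a text op.

```python
def apply_insert(doc, ins):
    """Apply a plain text insert op to a string."""
    return doc[:ins["pos"]] + ins["text"] + doc[ins["pos"]:]


def transform_format(fmt, ins):
    """Shift a format op so it covers the same characters after a concurrent insert."""
    n = len(ins["text"])
    # Assumed boundary rule: text inserted exactly at the end does not extend the range.
    start = fmt["start"] + n if fmt["start"] >= ins["pos"] else fmt["start"]
    end = fmt["end"] + n if fmt["end"] > ins["pos"] else fmt["end"]
    return {**fmt, "start": start, "end": end}


doc = "hi there!"
ins = {"pos": 0, "text": "oh, "}            # your op
fmt = {"start": 3, "end": 8, "bold": True}  # my op: bold "there"

new_doc = apply_insert(doc, ins)
moved = transform_format(fmt, ins)
bolded = new_doc[moved["start"]:moved["end"]]
```

With the transform in place, bolded is still "there" after the insert.
A real rich text type would also need format-vs-format and
format-vs-delete cases, which this sketch leaves out.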

-J


On Thu, Jun 13, 2013 at 1:19 AM, Michael MacFadden
<michael.macfad...@gmail.com> wrote:
> Joseph,
>
> I disagree.  The annotations themselves are just another data structure.
> You add them, remove them and modify them like anything else.  You can
> manage annotations as another structure within the blip model.  There is
> no reason why you can't interface them through a JSON-style operations
> structure.
>
> ~Michael
>
> On 6/13/13 12:11 AM, "Joseph Gentle" <jose...@gmail.com> wrote:
>
>>The conversation *model* yes, but not the rich text documents
>>themselves. You can't really make text annotations work properly on
>>top of JSON operations. We should keep something like the current
>>system for actual blips.
>>
>>-J
>>
>>
>>On Wed, Jun 12, 2013 at 4:06 PM, Michael MacFadden
>><michael.macfad...@gmail.com> wrote:
>>> Actually I just went and took a look at your operations.  The JSON OT type
>>> is probably the closest to what I would suggest we use.  JSON objects are
>>> not just for JavaScript.  They define arbitrary object structures.  We
>>> don't need a specific wave XML type; we could use the JSON operations to
>>> modify the conversation model.
>>>
>>> Potentially.
>>>
>>>
>>> On 6/12/13 10:55 PM, "Joseph Gentle" <jose...@gmail.com> wrote:
>>>
>>>>Really?
>>>>
>>>>My method for ShareJS was to simply have a JSON OT type and a
>>>>plaintext OT type. I'd like to add a rich text OT type as well. Then
>>>>people can just pick which one based on what kind of data they have.
>>>>
>>>>For Wave I'd like to be able to do something similar - JSON is
>>>>obviously useful for storing application data. It'd be nice to have
>>>>some sort of hybrid for wavelets where we can put multiple different
>>>>kinds of data inside a wavelet. One option is to use a JSON OT type as
>>>>the root of all wavelets and support subdocuments at arbitrary paths
>>>>(so the object could be:
>>>>{projectName:"ruby on rails", files:[{name:'foo/bar.rb', ...}],
>>>>documentation:{_type:richtext, _data:"<Rich text data>"}}).
>>>>
>>>>Or wavelets could simply each have a type (defaulting to the current
>>>>wavey XML type).
>>>>
>>>>-J
>>>>
>>>>
>>>>On Wed, Jun 12, 2013 at 2:41 PM, Michael MacFadden
>>>><michael.macfad...@gmail.com> wrote:
>>>>> You have stumbled upon one of the weaknesses of wave OT.  Best practices
>>>>> would say to NOT bind your OT directly to the data type, because then you
>>>>> don't have an extendable model. For example, if you have all of your
>>>>> operations figured out and validated, and then you need to change your
>>>>> data model, you have to go back and mess with your transformation
>>>>> functions.  Not good.  Or you have to try to bend new data models into
>>>>> the existing one, also not good.
>>>>>
>>>>> Best practice is to create a generic OT model and operate on that.  There
>>>>> is debate as to what the model should be, but most agree on the concept.
>>>>>
>>>>> For example, in wave they tried to create a map-like collection that OT
>>>>> could operate on. Essentially, though, they had to implement the map as if
>>>>> its underlying model was a bunch of XML-ish tags.  This was very
>>>>> convoluted.
>>>>>
>>>>> ~Michael
>>>>>
>>>>> On 6/12/13 10:26 PM, "Joseph Gentle" <jose...@gmail.com> wrote:
>>>>>
>>>>>>Yeah exactly. The google wave OT code uses special operations that can
>>>>>>understand the XML structure. It doesn't just edit the plaintext.
>>>>>>Formatting annotations are stored in a special way - operations can
>>>>>>say something like "At position 10 add bold. At position 20 stop
>>>>>>adding bold".
>>>>>>
>>>>>>-J
>>>>>>
>>>>>>On Wed, Jun 12, 2013 at 1:56 PM, Bruno Gonzalez (aka stenyak)
>>>>>><sten...@gmail.com> wrote:
>>>>>>> I suspected something like that. I assume it also correctly handles
>>>>>>> variable-length UTF-8 characters, so it's not necessarily 1-byte patches?
>>>>>>>
>>>>>>> This starts to make sense. OT can only compute conflict-free merges using
>>>>>>> the "character" primitive (because that's how Wave was originally
>>>>>>> designed). As an unfortunate consequence, you can then only OT-operate on
>>>>>>> plain text. Otherwise you could get conflict-free xml text that <loo<ks
>>>>>>> li<>ke>this>, and that of course isn't legal xml.
>>>>>>> But we still want rich text in Google Wave, therefore all the formatting
>>>>>>> stuff is stored some place else, specifically in the blip annotations. The
>>>>>>> modifications to annotations are (sometimes) simply derived from the
>>>>>>> transformations that the plain text suffers after merges?
>>>>>>>
>>>>>>> I suppose there could be other OT algorithms that don't use a "character"
>>>>>>> primitive, but rather an "xml tag" primitive, a json item, a "pixel", or
>>>>>>> anything else, right?
>>>>>>>
>>>>>>> (sorry for only contributing with questions... :-)
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Jun 12, 2013 at 10:27 PM, Joseph Gentle <jose...@gmail.com>
>>>>>>>wrote:
>>>>>>>
>>>>>>>> On Wed, Jun 12, 2013 at 12:13 PM, Bruno Gonzalez (aka stenyak)
>>>>>>>> <sten...@gmail.com> wrote:
>>>>>>>> > My assumption was that conflicts were simply mathematically inevitable
>>>>>>>> > in DVCSs, that's why your mention about lack of conflict markers sparked
>>>>>>>> > my interest... you mention conflicts like they can be optional? If so,
>>>>>>>> > are conflicts "eliminated" by choosing an arbitrary merging strategy when
>>>>>>>> > conflicts *do* happen (e.g. "choose the last timestamped patch and lose
>>>>>>>> > information on the way, we don't care"), or can they be prevented from
>>>>>>>> > ever happening in the first place?
>>>>>>>>
>>>>>>>> They're inevitable in patch based systems because patches usually have
>>>>>>>> a line level granularity. OT usually uses individual character
>>>>>>>> positions. In OT, if two operations both delete the same character,
>>>>>>>> the character gets deleted once. If two clients insert a character at
>>>>>>>> the same position, one of the characters will be first in the
>>>>>>>> resultant document and one will be second. Conflict markers just
>>>>>>>> aren't necessary.
>>>>>>>>
>>>>>>>> -J
>>>>>>>>
>>>>>>>> > --
>>>>>>>> > Saludos,
>>>>>>>> >      Bruno González
>>>>>>>> >
>>>>>>>> > _______________________________________________
>>>>>>>> > Jabber: stenyak AT gmail.com
>>>>>>>> > http://www.stenyak.com
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Saludos,
>>>>>>>      Bruno González
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Jabber: stenyak AT gmail.com
>>>>>>> http://www.stenyak.com
>>>>>
>>>>>
>>>
>>>
>
>
