Re: [protobuf] Re: Streaming Serialization - Suggestion

'Feng Xiao' via Protocol Buffers Fri, 01 Apr 2016 17:44:24 -0700

On Wed, Mar 30, 2016 at 5:27 PM, Yoav H <[email protected]> wrote:


> I saw the start/end group tags, but I couldn't find any information on
> them or on how to use them.
>
> Your point about skipping fields makes sense.
> I think it is also solvable by applying the same idea of chunked
> encoding, even to sub-fields.
> So instead of writing the full length of the child field, you allow the
> serializer to write it in smaller chunks.
> The deserializer can then just read the chunk markers and skip the chunks.
> A very basic serializer can put just one chunk (which will be equivalent
> to the current implementation, plus one more zero marking at the end), but
> it allows a more efficient serializer to stream data.
>
> Regarding adding something to the encoding spec, are you allowing proto2
> serializers to call into proto3 deserializers and vice versa?
> I thought that if you have a protoX server, you expect clients to take the
> protoX file and generate a client out of it, which will match that proto
> version's encoding. Isn't that the case?
>
Proto2 and proto3 are wire-compatible. We already have a lot of proto3
clients communicating with proto2 servers or vice versa. Like Josh
mentioned, we can't change proto3's wire format now.


>
> Thanks,
> Yoav.
>
> On Tuesday, March 29, 2016 at 5:06:46 PM UTC-7, Feng Xiao wrote:
>
>>
>>
>> On Mon, Mar 28, 2016 at 10:53 PM, Yoav H <[email protected]> wrote:
>>
>>> They say on their website: "When evaluating new features, we look for
>>> additions that are very widely useful or very simple".
>>> What I'm suggesting here is both very useful (speeding up serialization
>>> and eliminating memory duplication) and very simple (simple additions to
>>> the encoding, no need to change the language).
>>> So far, no response from the Google guys...
>>>
>> Actually, protobuf already has a "start embedding" tag and an "end
>> embedding" tag:
>> https://developers.google.com/protocol-buffers/docs/encoding#structure
>>
>> Type  Meaning      Used For
>> 3     Start group  groups (deprecated)
>> 4     End group    groups (deprecated)
>>
>> They are deprecated though.
>>
>> You mentioned it will be a performance gain, but our experience at
>> Google says otherwise. For example, in a lot of places we are only
>> interested in a few fields and want to skip over all the others (if we are
>> building a proxy, or the field is simply an unknown field). The start
>> group/end group tag pair forces the parser to decode every single field in
>> the whole group, even if the whole group is to be ignored after parsing,
>> and that's a very significant drawback.
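The difference in skipping cost can be sketched roughly as follows. This is a minimal illustration, not code from any protobuf implementation; `read_varint` and `skip_field` are hypothetical helper names, and the buffer is assumed to be well-formed. A length-delimited field is skipped with one length read and one jump, while a group can only be skipped by walking every nested field until the matching end-group tag.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at buf[pos]; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def skip_field(buf, pos, wire_type, field_number):
    """Return the position just past one field's payload (its tag is already read)."""
    if wire_type == 0:  # varint
        _, pos = read_varint(buf, pos)
        return pos
    if wire_type == 1:  # fixed 64-bit
        return pos + 8
    if wire_type == 2:  # length-delimited: read the length, then jump
        length, pos = read_varint(buf, pos)
        return pos + length
    if wire_type == 3:  # start group: every nested field must be decoded
        while True:
            tag, pos = read_varint(buf, pos)
            nested_number, nested_type = tag >> 3, tag & 0x7
            if nested_type == 4:  # end group
                if nested_number != field_number:
                    raise ValueError("mismatched end-group tag")
                return pos
            pos = skip_field(buf, pos, nested_type, nested_number)
    if wire_type == 5:  # fixed 32-bit
        return pos + 4
    raise ValueError("cannot skip wire type %d" % wire_type)
```

The work for wire type 2 does not depend on the payload size, whereas the work for wire type 3 grows with the number of fields inside the group.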
>>
>> And adding a new wire type to protobuf is not a simple thing.
>> Actually, I don't think we have added any new wire type to protobuf before.
>> There are a lot of issues to consider. For example, isn't all code that
>> switches on protobuf wire types now suddenly broken? If a new serializer
>> uses the new wire type in its output, what will happen when parsers can't
>> understand it?
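To make the compatibility concern concrete, here is a small sketch of the kind of dispatch an existing parser might contain. The names `KNOWN_WIRE_TYPES` and `describe_tag` are made up for illustration. The wire type lives in the low three bits of a tag, and a parser that meets a value it does not know cannot tell how long the payload is, so it cannot even skip the field.

```python
# Wire types defined by the current encoding (values 6 and 7 are unassigned).
KNOWN_WIRE_TYPES = {
    0: "varint",
    1: "64-bit",
    2: "length-delimited",
    3: "start group",
    4: "end group",
    5: "32-bit",
}


def describe_tag(tag):
    """Split an already-decoded tag into (field_number, wire type name)."""
    field_number = tag >> 3
    wire_type = tag & 0x7
    if wire_type not in KNOWN_WIRE_TYPES:
        # An old parser has no way to find the end of this field.
        raise ValueError("unknown wire type %d" % wire_type)
    return field_number, KNOWN_WIRE_TYPES[wire_type]
```

An older parser built this way would reject the whole message as soon as a newer serializer emitted a tag with wire type 6.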
>>
>> Proto3 is already finalized and we will not add new wire types in proto3.
>> Whether to add it in proto4 depends on whether we have a good use for it
>> and whether we can mitigate the risks of rolling out a new wire type.
>>
>>
>>>
>>>
>>> On Monday, March 28, 2016 at 10:24:17 AM UTC-7, Peter Hultqvist wrote:
>>>>
>>>> This exact suggestion was up for discussion a long time ago (years?
>>>> before proto2?).
>>>>
>>>> When it comes to taking suggestions I'm only a 3rd party implementer
>>>> but my understanding is that the design process of protocol buffers and its
>>>> goals are internal to Google and they usually publish new versions of their
>>>> code implementing new features before you can read about them in the
>>>> documents.
>>>> On Mar 27, 2016 5:31 AM, "Yoav H" <[email protected]> wrote:
>>>>
>>>>> Any comment on this?
>>>>> Will you consider this for proto3?
>>>>>
>>>>> On Wednesday, March 23, 2016 at 11:50:36 AM UTC-7, Yoav H wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I have a suggestion for improving the protobuf encoding.
>>>>>> Is proto3 final?
>>>>>>
>>>>>> I like the simplicity of the encoding of protobuf.
>>>>>> But I think it has one issue when serializing to streams.
>>>>>> The problem is with length delimited fields and the fact that they
>>>>>> require knowing the length ahead of time.
>>>>>> If we have a very long string, we need to encode the entire string
>>>>>> before we know its length, so we basically duplicate the data in memory.
>>>>>> Same is true for embedded messages, where we need to encode the
>>>>>> entire embedded message before we can append it to the stream.
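The buffering described here can be sketched as below. This is a simplified illustration with hypothetical helper names (`encode_varint`, `write_length_delimited`, `write_embedded`), not how any particular protobuf library is written. Because the length prefix precedes the payload, the child message is serialized into a scratch buffer before anything reaches the output stream.

```python
import io


def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def write_length_delimited(stream, field_number, payload):
    """Write tag (wire type 2), payload length, then the payload itself."""
    stream.write(encode_varint((field_number << 3) | 2))
    stream.write(encode_varint(len(payload)))
    stream.write(payload)


def write_embedded(stream, field_number, write_child):
    """Serialize a child message; its bytes are held in memory first."""
    scratch = io.BytesIO()
    write_child(scratch)  # the whole child must exist before its length is known
    write_length_delimited(stream, field_number, scratch.getvalue())
```

Some implementations avoid the scratch buffer by computing all sizes in a separate pass first, which trades the extra memory for a second traversal of the message.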
>>>>>>
>>>>>> I think there is a simple solution for both issues.
>>>>>>
>>>>>> For strings and byte arrays, a simple solution is to use "chunked
>>>>>> encoding".
>>>>>> Which means that the byte array is split into chunks and every chunk
>>>>>> starts with the chunk length. End of array is indicated by length zero.
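A rough sketch of the proposed chunked layout follows. This format is hypothetical and is not part of the protobuf wire format; the function names are invented for illustration, and chunk lengths are assumed to be varints. Each chunk is written as soon as it is available, and a zero length ends the value.

```python
import io


def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def read_varint(stream):
    """Read one base-128 varint from a binary stream."""
    result = 0
    shift = 0
    while True:
        raw = stream.read(1)
        if not raw:
            raise EOFError("truncated varint")
        result |= (raw[0] & 0x7F) << shift
        if not raw[0] & 0x80:
            return result
        shift += 7


def write_chunked(stream, chunks):
    """Write each non-empty chunk with its length, then a zero terminator."""
    for chunk in chunks:
        if chunk:  # length zero is reserved for the end marker
            stream.write(encode_varint(len(chunk)))
            stream.write(chunk)
    stream.write(b"\x00")


def read_chunked(stream):
    """Reassemble the value by reading chunks until the zero terminator."""
    parts = []
    while True:
        length = read_varint(stream)
        if length == 0:
            return b"".join(parts)
        data = stream.read(length)
        if len(data) != length:
            raise EOFError("truncated chunk")
        parts.append(data)


def skip_chunked(stream):
    """Skip the value by hopping from chunk marker to chunk marker."""
    while True:
        length = read_varint(stream)
        if length == 0:
            return
        stream.seek(length, io.SEEK_CUR)
```

A serializer that emits a single chunk produces the same bytes as a length-prefixed value plus one trailing zero byte.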
>>>>>>
>>>>>> For embedded messages, the solution is to have a "start embedding"
>>>>>> tag and an "end embedding" tag.
>>>>>> Everything in between is the embedded message.
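Bracketing of this kind can be sketched with the existing group wire types, 3 (start group) and 4 (end group), which are deprecated but lay bytes out this way. The helpers `embedded` and `write_varint_field` are hypothetical names used only for this illustration. The nested fields go straight to the output stream, with no length computed up front.

```python
import io
from contextlib import contextmanager


def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


@contextmanager
def embedded(stream, field_number):
    """Bracket nested fields with start-group and end-group tags."""
    stream.write(encode_varint((field_number << 3) | 3))  # start group
    yield stream
    stream.write(encode_varint((field_number << 3) | 4))  # end group


def write_varint_field(stream, field_number, value):
    """Write a varint field (wire type 0) directly to the stream."""
    stream.write(encode_varint(field_number << 3))
    stream.write(encode_varint(value))
```

Used as `with embedded(out, 2): write_varint_field(out, 1, 150)`, the writer never holds the nested message in memory, at the cost that a reader has to parse the nested fields to find where the group ends.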
>>>>>>
>>>>>> By adding these two new features, serialization can be fully
>>>>>> streamable and there is no need to pre-serialize big chunks in memory
>>>>>> before writing them to the stream.
>>>>>>
>>>>>> Hope you'll find this suggestion useful and incorporate it into the
>>>>>> protocol.
>>>>>>
>>>>>> Thanks,
>>>>>> Yoav.
>>>>>>
>>>>>>
>>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "Protocol Buffers" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to [email protected].
>>>>> To post to this group, send email to [email protected].
>>>>> Visit this group at https://groups.google.com/group/protobuf.
>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>

