"new design/implementation in the next release" to significantly improve 
python sdk's performance, when?  the c++ binding has been experimental for 
years.

On Saturday, March 1, 2014 2:49:06 AM UTC+8, Feng Xiao wrote:
>
> For patch #1, you can submit it to the issue tracker. We don't have time 
> to look at it immediately but will get to it in the next release.
>
> For patch #2, we don't accept patches to the experimental C++ binding 
> support any more because it will be replaced by a new design/implementation 
> in the next release. Sorry for that but AFAIK the new design/implementation 
> is significantly better so it's worthwhile to try if you are currently 
> using the old one (which has hard-to-address memory ownership problems).
>
> On Wed, Feb 26, 2014 at 4:11 PM, <[email protected] <javascript:>> wrote:
>
>> Hey guys,
>>
>>
>> I wrote two different patches which optimize python proto performance. 
>>  Both patches are running in production at Dropbox.  What is the best way 
>> to upstream these changes?
>>
>>
>> Patrick
>>
>> ============================
>>
>> Patch #1. Python message patch (
>> https://www.dropbox.com/s/q0y44ypti0by779/protobuf-2.5.0.patch1):
>>
>> Changes:
>> - precompute various varint tables
>> - don't use proto's ByteSize function for serialization
>> - simplified some code (got rid of the listener)
>> - got rid of StringIO
>>
>> Internal benchmark:
>> - random repeated int32s - ~18% faster
>> - random repeated int64s - ~20% faster
>> - random repeated strings - 27% faster
>> - random repeated bytes - 27% faster
>> - repeated message with each with a single random string - ~20% faster
>>
>> NOTE:
>> - predefined_varints.py is generated by generate_predefined_varints.py
>>
>> ============================
>>
>> Patch #2. C++ experimental binding patch (
>> https://www.dropbox.com/s/5nr0v76nfraaxif/protobuf-2.5.0.patch2):
>>
>> Changes:
>> - fixed memory ownership / dangling pointer (see NOTE #1 for known issues)
>>     1. inc ref count parent message when accessing a field, 
>>     2. a cleared field's is freed only when the parent is deleted
>> - fixed MakeDescriptor to correctly generating simple proto (see NOTE #2)
>> - fixed MergeFrom to not crash on check failure due to self merge 
>> - fixed both repeated and non-repeated field clearing 
>> - modified varint deserialization to always return PyLong (to match 
>> existing python implementation) 
>> - always mark message as mutate when extending a repeated field (even 
>> when extending by an empty list) 
>> - deleted/updated bad tests from the protobuf test suite 
>>
>> Internal benchmark (relative to the first patch):
>> - 30x faster for repeated varints
>> - 8x faster for repeated string
>> - 6x faster for repeated bytes 
>> - 26x speed up for repeated nested msg
>>
>> NOTE: 
>> 1. In the current implementation, a new python object is created each 
>> time a field is accessed. To make this 100% correct, we should return the 
>> same python object whenever the same field is accessed; however, I don't 
>> think 
>> the accounting overhead is worth it.  Implications due to the current 
>> implementation:
>>     - repeatedly clearing / mutating the same message can cause memory 
>> blow up 
>>     - There's a subtle bug with clearing / mutating default message 
>> fields: 
>>
>>     This is correct. Holding a reference to a MUTATED field X, then 
>> clearing the parent, then mutate X. e.g., 
>>         child = parent.optional_nested_msg 
>>         child.field = 123 # this mutates the field 
>>         parent.Clear() 
>>         child.field = 321 
>>         assert not parent.HasField('child') # passes 
>>
>>     This is incorrect. Holding a reference to a UNMUTATED field X, then 
>> clearing the parent, then mutate X. 
>>         child = parent.optional_nested_msg 
>>         parent.Clear() 
>>         child.field = 321 # this inadvertently causes parent to generate 
>> a different empty msg for optional_nested_msg. 
>>         assert not parent.HasField('optional_nested_msg') # fail 
>>
>>     Luckily, these access patterns are extremely rare (at least at 
>> dropbox).
>>
>> 2. I wrote a fully functional MakeDescriptor for c++ protos when I was at 
>> google.  Talk to the F1 team (specifically Bart Samwel / Chad Whipkey) if 
>> you're interested in upstreaming that to the opensource community.
>>  
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "Protocol Buffers" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected] <javascript:>.
>> To post to this group, send email to [email protected]<javascript:>
>> .
>> Visit this group at http://groups.google.com/group/protobuf.
>> For more options, visit https://groups.google.com/groups/opt_out.
>>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Protocol Buffers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/protobuf.
For more options, visit https://groups.google.com/d/optout.

Reply via email to