"new design/implementation in the next release" to significantly improve python sdk's performance, when? the c++ binding has been experimental for years.
On Saturday, March 1, 2014 2:49:06 AM UTC+8, Feng Xiao wrote: > > For patch #1, you can submit it to the issue tracker. We don't have time > to look at it immediately but will get to it in the next release. > > For patch #2, we don't accept patches to the experimental C++ binding > support any more because it will be replaced by a new design/implementation > in the next release. Sorry for that but AFAIK the new design/implementation > is significantly better so it's worthwhile to try if you are currently > using the old one (which has hard-to-address memory ownership problems). > > On Wed, Feb 26, 2014 at 4:11 PM, <[email protected] <javascript:>> wrote: > >> Hey guys, >> >> >> I wrote two different patches which optimize python proto performance. >> Both patches are running in production at Dropbox. What is the best way >> to upstream these changes? >> >> >> Patrick >> >> ============================ >> >> Patch #1. Python message patch ( >> https://www.dropbox.com/s/q0y44ypti0by779/protobuf-2.5.0.patch1): >> >> Changes: >> - precompute various varint tables >> - don't use proto's ByteSize function for serialization >> - simplified some code (got rid of the listener) >> - got rid of StringIO >> >> Internal benchmark: >> - random repeated int32s - ~18% faster >> - random repeated int64s - ~20% faster >> - random repeated strings - 27% faster >> - random repeated bytes - 27% faster >> - repeated message with each with a single random string - ~20% faster >> >> NOTE: >> - predefined_varints.py is generated by generate_predefined_varints.py >> >> ============================ >> >> Patch #2. C++ experimental binding patch ( >> https://www.dropbox.com/s/5nr0v76nfraaxif/protobuf-2.5.0.patch2): >> >> Changes: >> - fixed memory ownership / dangling pointer (see NOTE #1 for known issues) >> 1. inc ref count parent message when accessing a field, >> 2. a cleared field's is freed only when the parent is deleted >> - fixed MakeDescriptor to correctly generating simple proto (see NOTE #2) >> - fixed MergeFrom to not crash on check failure due to self merge >> - fixed both repeated and non-repeated field clearing >> - modified varint deserialization to always return PyLong (to match >> existing python implementation) >> - always mark message as mutate when extending a repeated field (even >> when extending by an empty list) >> - deleted/updated bad tests from the protobuf test suite >> >> Internal benchmark (relative to the first patch): >> - 30x faster for repeated varints >> - 8x faster for repeated string >> - 6x faster for repeated bytes >> - 26x speed up for repeated nested msg >> >> NOTE: >> 1. In the current implementation, a new python object is created each >> time a field is accessed. To make this 100% correct, we should return the >> same python object whenever the same field is accessed; however, I don't >> think >> the accounting overhead is worth it. Implications due to the current >> implementation: >> - repeatedly clearing / mutating the same message can cause memory >> blow up >> - There's a subtle bug with clearing / mutating default message >> fields: >> >> This is correct. Holding a reference to a MUTATED field X, then >> clearing the parent, then mutate X. e.g., >> child = parent.optional_nested_msg >> child.field = 123 # this mutates the field >> parent.Clear() >> child.field = 321 >> assert not parent.HasField('child') # passes >> >> This is incorrect. Holding a reference to a UNMUTATED field X, then >> clearing the parent, then mutate X. >> child = parent.optional_nested_msg >> parent.Clear() >> child.field = 321 # this inadvertently causes parent to generate >> a different empty msg for optional_nested_msg. >> assert not parent.HasField('optional_nested_msg') # fail >> >> Luckily, these access patterns are extremely rare (at least at >> dropbox). >> >> 2. I wrote a fully functional MakeDescriptor for c++ protos when I was at >> google. Talk to the F1 team (specifically Bart Samwel / Chad Whipkey) if >> you're interested in upstreaming that to the opensource community. >> >> -- >> You received this message because you are subscribed to the Google Groups >> "Protocol Buffers" group. >> To unsubscribe from this group and stop receiving emails from it, send an >> email to [email protected] <javascript:>. >> To post to this group, send email to [email protected]<javascript:> >> . >> Visit this group at http://groups.google.com/group/protobuf. >> For more options, visit https://groups.google.com/groups/opt_out. >> > > -- You received this message because you are subscribed to the Google Groups "Protocol Buffers" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To post to this group, send email to [email protected]. Visit this group at http://groups.google.com/group/protobuf. For more options, visit https://groups.google.com/d/optout.
