If you have a change that makes protobuf code not rely on undefined behavior, and also doesn't sacrifice performance, then we should merge it.
Feel free to send a GitHub PR. But make sure the code is C++03 (protobuf doesn't use C++11 yet). On Tuesday, April 25, 2017 at 1:36:31 PM UTC-7, Wolfgang Brehm wrote: > > I know, but the proposed solution is still worth considering as it gives > you the best of both worlds with just a little compile time overhead. > It yields the same result on 2s complement machines as the current > function but also on other machines because it is well defined. > This means that > clang++ -fsanitize=undefined > will not have anything to complain either. > TL;DR: You loose nothing but time and gain better defined code, but > basically everything stays the same. > > Am Dienstag, 25. April 2017 20:17:33 UTC+2 schrieb Feng Xiao: >> >> Right now protobuf implementation assumes 2s complement and won't work on >> any other environments. That's undefined behavior according to C++ standard >> but in practice it's very unlikely to be a problem. >> >> On Tue, Apr 25, 2017 at 5:42 AM, Wolfgang Brehm <[email protected]> >> wrote: >> >>> right shifting negative integers is tecnically undefined, this means >>> that the implementation used for encoding integers: >>> (n<<1)^(n>>(digits-1)) >>> is tecnically undefined according to: >>> >>>> The value of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits >>>> are zero-filled. If E1 has an unsigned type, the value of the result is E1 >>>> × 2 E2, reduced modulo one more than the maximum value representable in >>>> the >>>> result type. Otherwise, if E1 has a signed type and non-negative value, >>>> and >>>> E1 × 2 E2 is representable in the corresponding unsigned type of the >>>> result >>>> type, then that value, converted to the result type, is the resulting >>>> value; otherwise, the behavior is undefined. >>> >>> >>> and the implementation for decoding: >>> (n>>1)^(-(n&1)) >>> depends on the 2s complement as well and is probably tecnically >>> undefined (but I cant find a quote from the standard). >>> >>> This is why I propose the following solution: >>> >>> template<typename S,class U = typename std::make_unsigned<S>::type>> >>> U constexpr zigzag_encode(const S &n){ >>> return (U(n)<<1)^(n<0?~U(0):U(0)); >>> } >>> >>> template<typename U,class S = typename std::make_signed<U>::type>> >>> S constexpr zigzag_decode(const U &n){ >>> return (n&1)?-1-S(n>>1):(n>>1); >>> } >>> >>> This does not look as cool, but modern compilers (llvm and gcc were >>> tested) will compile to bascially the same instructions, gcc will introduce >>> 1 additional instruction for decode. >>> >>> -- >>> You received this message because you are subscribed to the Google >>> Groups "Protocol Buffers" group. >>> To unsubscribe from this group and stop receiving emails from it, send >>> an email to [email protected]. >>> To post to this group, send email to [email protected]. >>> Visit this group at https://groups.google.com/group/protobuf. >>> For more options, visit https://groups.google.com/d/optout. >>> >> >> -- You received this message because you are subscribed to the Google Groups "Protocol Buffers" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To post to this group, send email to [email protected]. Visit this group at https://groups.google.com/group/protobuf. For more options, visit https://groups.google.com/d/optout.
