rjmccall added a comment.

Yes, the main result is defined to be the true value mod 2^k even when overflow 
occurs.  We do use the intrinsics to implement the existing fixed-width GCC 
builtins, which are meant to be useful not just for security checking, but for 
implementing arbitrary-precision arithmetic libraries.  (LLVM doesn't seem to 
reliably do the add/adc peepholes that make that performant, though.)


================
Comment at: lib/CodeGen/CGBuiltin.cpp:1601
@@ +1600,3 @@
+    auto RITy = IntegerWidthAndSignedness(CGM.getContext(), RQTy);
+    auto EITy = EncompassingIntegerType({XITy, YITy, RITy});
+
----------------
DavidEGrayson wrote:
> rjmccall wrote:
> > DavidEGrayson wrote:
> > > rjmccall wrote:
> > > > These are not the most evocative names you could have chosen. :)
> > > > 
> > > > Also, this strategy is correct, but it produces horrible code when the 
> > > > operands have the same signedness and the result is different, where 
> > > > the result would otherwise be an ordinary operation and a compare.  You 
> > > > really want to just merge the two operands and then max with the width 
> > > > of the result type.
> > > I don't think that would work.  It sounds like you want me to remove RITy 
> > > from the call to EncompassingIntegerType.  That means the ecompassing 
> > > type could be smaller than the result type, and the arithmetic intrinsic 
> > > would report an overflow even though the mathematical result is 
> > > representable in the result type.
> > > 
> > > For example, if I am multiplying ((uint8_t)100) by ((uint8_t)100) and 
> > > storing the result in a (uint16_t), the result needs to be 10000 and 
> > > there is no overflow.  To arrive at that result, we need to convert the 
> > > operands to 16-bit unsigned integers before multiplying them.  If we 
> > > multiply them as 8-bit unsigned integers, the multiplication intrinsic 
> > > will report an overflow and give a result of 16.  That seems like a messy 
> > > situation and I don't see how a max operation would fix it.
> > I think I was unclear.  I'm not saying that you should do a max as part of 
> > computing the dynamic result.  I'm saying that, when deciding the 
> > encompassing type, it should still be at least as wide as the result, but 
> > you should ignore the result type's signedness and then patch it up with a 
> > comparison at the end if necessary.  (I believe that in both cases, it's a 
> > signed comparison against zero.)
> > 
> > That is, when multiplying two uint8_t's and storing the result in an 
> > int16_t, we should just do an unsigned 16-bit multiply-with-overflow.  The 
> > computation overflows if the multiply overflows (which it provably doesn't, 
> > but we can let LLVM figure that out for now) or the result is signed-< 0.
> OK, I understand now.  As an optimization, you would to like the operation 
> type (the width/signedness that we use to perform the main math operation) to 
> sometimes be the same width as the result type (the type that the third 
> argument points to) but have a different sign.
> 
> I'm thinking about how that would work in all the different cases.  I have 
> concluded that it would work in 5 out of 6 cases, but we would have to think 
> carefully about each case and it is possible I have made some mistakes.  But 
> here are the 6 cases:
> 
> 1) `__builtin_add_overflow`, operation type signed, result type unsigned: It 
> would work but it would be complicated.  If the signed addition intrinsic 
> reports an overflow, we need to know whether the result was too big (which is 
> OK) or too small (which is bad).  The most extreme underflow would be 
> (-2^(N-1)) + (-2^(N-1)) = 0, so every underflow should give a non-negative 
> result.  Similarly, every overflow-proper (case where the result was too big) 
> would result in a negative number.  So the overflow bit returned from 
> `__builtin_add_overflow` should be the overflow bit from the intrinsic XORed 
> with a "less than zero" comparison.
> 
> 2) `__builtin_add_overflow`, operation type unsigned, result type signed:  
> This is easier, because the only possible overflow reported by the unsigned 
> addition intrinsic happens when the result is too big, which means it's 
> definitely too big for the signed type.  The overflow bit returned from 
> `__builtin_add_overflow` in this case would be the overflow bit from the 
> intrinsic ORed with a "less than zero" comparison.
> 
> 3) `__builtin_sub_overflow`, operation type signed, result type unsigned:  If 
> the signed subtraction intrinsic reports an overflow, we need to know whether 
> the result was too big (which is OK) or too small (which is bad).  The most 
> extreme underflow would be (-2^(N-1)) - (2^(N-1)-1) = 1, so every underflow 
> gives a positive value.  The most extreme overflow-proper would be 
> (2^(N-1)-1) - (-2^(N-1)) = -1, so every overflow-proper gives a negative 
> number.  Just like case #1, the overflow bit we return shoud be the overflow 
> bit from the intrinsic XORed with a "less than zero" comparison.
> 
> 4) `__builtin_sub_overflow`, operation type unsigned, result type signed:  If 
> the unsigned subtraction intrinsic reports an overflow, it simply means Y > 
> X.  That is OK as long as Y isn't too much greater than X.  The most extreme 
> case is 0 - (2^N-1) = 1.  So if the result from the intrinsic is less than 
> zero, then we are OK.  So the overflow bit returned from 
> __builtin_sub_overflow should be the overflow bit from the intrinsic XORed 
> with a "less than zero" comparison. 
> 
> 5) `__builtin_mul_overflow`, operation type signed, result type unsigned: I 
> don't think this case would work, so we would have to fall back to using an 
> operation type that actually encompasses the result type.  For example, if we 
> are multiplying two int16_t and the result is uint16_t, and the intrinsic 
> gives us an overflow, it's really hard to know whether that overflow was OK 
> (200 * 200) or not OK (200 * 527).
> 
> 6) `__builtin_mul_overflow`, operation type unsigned, result type signed: The 
> logic from case 2 applies here.
> 
> Do you want the patch to go in this direction?  Since we have to consider 
> these 6 cases, I think it would be more complicated than what you were 
> describing.
Oh, I see your point; I wasn't thinking about the fact that some of the 
overflows aren't necessarily overflows in the result type.  Going ahead with 
your original approach seems reasonable.


Repository:
  rL LLVM

http://reviews.llvm.org/D12793



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to