rjmccall added inline comments.

================
Comment at: clang/lib/Basic/FixedPoint.cpp:242
+  } else
+    Overflowed = Result < Min || Result > Max;
+
----------------
leonardchan wrote:
> ebevhan wrote:
> > rjmccall wrote:
> > > ebevhan wrote:
> > > > ebevhan wrote:
> > > > > rjmccall wrote:
> > > > > > If the maximum expressible value is *k*, and the fully-precise 
> > > > > > multiplication yields *k+e* for some epsilon *e* that isn't 
> > > > > > representable in the result semantics, is that considered an 
> > > > > > overflow?  If so, I think you need to do the shift after these 
> > > > > > bound checks, since the shift destroys the difference between *k* 
> > > > > > and *k+e*.  That is, unless there's a compelling mathematical 
> > > > > > argument that it's not possible to overflow only in the 
> > > > > > fully-precision multiplication — but while I think that's possibly 
> > > > > > true of `_Fract` (since *k^2 < k*), it seems unlikely to be true of 
> > > > > > `_Accum`, although I haven't looked for a counter-example.  And if 
> > > > > > there is a compelling argument, it should probably be at least 
> > > > > > alluded to in a comment.
> > > > > > 
> > > > > > Would this algorithm be simpler if you took advantage of the fact 
> > > > > > that `APFixedPointSemantics` doesn't have to correspond to a real 
> > > > > > type?  You could probably just convert to a double-width common 
> > > > > > semantics, right?
> > > > > > If the maximum expressible value is *k*, and the fully-precise 
> > > > > > multiplication yields *k+e* for some epsilon *e* that isn't 
> > > > > > representable in the result semantics, is that considered an 
> > > > > > overflow? If so, I think you need to do the shift after these bound 
> > > > > > checks, since the shift destroys the difference between *k* and 
> > > > > > *k+e*.
> > > > > 
> > > > > I don't think I would consider that to be overflow; that's precision 
> > > > > loss. E-C considers these to be different:
> > > > > 
> > > > > > If the source value cannot be represented exactly by the 
> > > > > > fixed-point type, the source value is rounded to either the closest 
> > > > > > fixed-point value greater than the source value (rounded up) or to 
> > > > > > the closest fixed-point value less than the source value (rounded 
> > > > > > down).
> > > > > >
> > > > > > When the source value does not fit within the range of the 
> > > > > > fixed-point type, the conversion overflows. [...]
> > > > > >
> > > > > > [...]
> > > > > >
> > > > > > If the result type of an arithmetic operation is a fixed-point 
> > > > > > type, [...] the calculated result is the mathematically exact 
> > > > > > result with overflow handling and rounding performed to the full 
> > > > > > precision of the result type as explained in 4.1.3. 
> > > > > 
> > > > > There is also no value of `e` that would affect saturation. Any full 
> > > > > precision calculation that gives `k+e` must be `k` after downscaling, 
> > > > > since the bits that represent `e` must come from the extra precision 
> > > > > range. Even though `k+e` is technically larger than `k`, saturation 
> > > > > would still just give us `k` after truncating out `e`, so the end 
> > > > > result is the same.
> > > > > 
> > > > > > Would this algorithm be simpler if you took advantage of the fact 
> > > > > > that APFixedPointSemantics doesn't have to correspond to a real 
> > > > > > type? You could probably just convert to a double-width common 
> > > > > > semantics, right?
> > > > > 
> > > > > It's likely possible to use APFixedPoint in the calculations here, 
> > > > > but I used APInt to make the behavior explicit and not accidentally 
> > > > > be dependent on the behavior of APFixedPoint's conversions or 
> > > > > operations.
> > > > Although.,. I guess I see your point in that an intermediate result of 
> > > > k+e technically "does not fit within the range of the fixed-point 
> > > > type"... but I wonder if treating such cases as overflow is 
> > > > particularly meaningful. I don't find there to be much of a distinction 
> > > > between such a case and the case where the exact result lands inbetween 
> > > > two representable values. We just end up with a less precise result.
> > > Right, I was wondering if there was an accepted answer here.  For 
> > > saturating arithmetic, it's equivalent to truncate this extra precision 
> > > down to *k* or to saturate to the maximum representable value, since by 
> > > assumption that was just *k*; but for non-saturating arithmetic, it 
> > > affects whether the operation has UB.  All else being the same, it's 
> > > better to have fewer corner-case sources of UB.
> > > 
> > > My read is that Embedded C is saying there's a sequence here: compute the 
> > > exact mathematical result; round that to the precision of the result 
> > > type; the operation overflows if the rounded result is not representable 
> > > in the result type.  Is the rounding direction completely unspecified, 
> > > down to being potentially operand-specific?  If so, we could just say 
> > > that we always  round to avoid overflow if possible.  The main 
> > > consideration here is that we need to give the operation the same 
> > > semantics statically and dynamically, and I don't know if there's any 
> > > situation where those semantics would affect the performance of the 
> > > operation when done dynamically.
> > > For saturating arithmetic, it's equivalent to truncate this extra 
> > > precision down to *k* or to saturate to the maximum representable value, 
> > > since by assumption that was just *k*; but for non-saturating arithmetic, 
> > > it affects whether the operation has UB.
> > 
> > I'm fairly sure that the conclusions here about k and e only hold if k 
> > truly is the maximum representable value. If k is anything else (even 
> > epsilon-of-the-representable-range less), k+e can never be greater than the 
> > maximum.
> > 
> > And actually, crunching the numbers on this... If we have integers a and b 
> > of width N, sign extended to the double bitwidth A and B, there can be no 
> > values for a and b for which A*B is greater than N_Max<<N (`k`). Taking 
> > 8-bit as an example: Max is 127, and Max<<8 is 32512. The maximum possible 
> > value attainable is -128*-128, which is 16384. That isn't even close to the 
> > k+e case.
> > 
> > I'm unsure if this reasoning applies in the minimum case as well.
> > 
> > > My read is that Embedded C is saying there's a sequence here: compute the 
> > > exact mathematical result; round that to the precision of the result 
> > > type; the operation overflows if the rounded result is not representable 
> > > in the result type.
> > 
> > I wonder if it's intended to be a sequence. It's starting to feel like it 
> > can't actually be both cases at the same time.
> > 
> > And actually, crunching the numbers on this... If we have integers a and b 
> > of width N, sign extended to the double bitwidth A and B, there can be no 
> > values for a and b for which A*B is greater than N_Max<<N (k). Taking 8-bit 
> > as an example: Max is 127, and Max<<8 is 32512. The maximum possible value 
> > attainable is -128*-128, which is 16384. That isn't even close to the k+e 
> > case.
> 
> I think you mean the scale instead of `N` for `N_Max<<N`, and we would run 
> into this case for `(N_max << scale) < (a * b) < ((N_max + 1) << scale)` 
> where `a` and `b` represent the scaled integers. An example is `1.75 * 2.25`, 
> represented as 4 bit unsigned ints with scales of 2:
> 
> ```
>   01.11 (1.75)
> x 10.01 (2.25)
> -------------
> 11.1111 (3.9375) -> shr 2 -> 11.11 (3.75)
> ```
> 
> where the our `e` in this < 0.25.
> 
> My interpretation of the spec (which could be wrong) is whenever they refer 
> to "source value", they mean the exact mathematical result (`3.9375`), so 
> precision loss and overflow can occur at the same time independently of each 
> other. For the non-saturating case, I'd consider the `k + e` to be UB because 
> of this.
Your logic only works if the entire integer is scaled, i.e. for `_Fract`; for 
`_Accum` types where the scale S can be less than N, it's possible to have an 
"epsilon" overflow.  For example, with S=4 and N=8, `(44/16) * (93/16) == 
(255/16) + (12/256)`.

Here's a program to brute-force search for counter-examples for an arbitrary 
unsigned fixed-point type: 
https://gist.github.com/rjmccall/562c2c7c9d289edd8cdf034edd6c1f17


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D73186/new/

https://reviews.llvm.org/D73186



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to