[Bug tree-optimization/116133] Missing mult_overflow detection for aarch64

pinskia at gcc dot gnu.org via Gcc-bugs Mon, 29 Jul 2024 22:04:30 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116133


--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Actually I am trying to understand the original reason for the extra checks
that were added in r14-992 when dealing with highpart:
    The reason for testing the presence of the optab
    handler is to make sure the generated code for it is short to ensure
    we don't actually pessimize code instead of optimizing it.
...

    So, the following patch matches what we do in internal-fn.cc and
    also pattern matches __builtin_mul_overflow_p if
    1) we only need the flag whether it overflowed (i.e. !use_seen)
    2) it is unsigned (i.e. !cast_stmt)
    3) umul_highpart is supported for the mode

But the code in optab handles this just fine.

In the case of can_mult_highpart_p == 1 (direct), it just generates 2
instructions (lower mult and upper mult) and one comparison, which is better
than a division. This will never be worse.

For can_mult_highpart_p == 2 (aka indirect), we will either use a widening
multiply (good) and then a comparison, or 2 widening instructions and a
multiply (also decent) and then a comparison with a shift.
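
To make the comparison concrete, here is a rough C-level sketch of the three
shapes for a 32-bit unsigned multiply. It is only an illustration, not what
the expander literally emits: umul_highpart32 is a hypothetical stand-in for a
target's highpart multiply instruction, and all function names are made up for
this example.

```c
#include <stdint.h>

/* The source-level idiom: multiply, then divide the truncated product
   back and compare.  This is the form that costs a division.  */
int
overflow_by_division (uint32_t x, uint32_t y, uint32_t *res)
{
  *res = x * y;
  return x && (*res / x) != y;
}

/* Hypothetical stand-in for a highpart multiply instruction: returns the
   upper 32 bits of the 64-bit product.  */
static uint32_t
umul_highpart32 (uint32_t x, uint32_t y)
{
  return (uint32_t) (((uint64_t) x * y) >> 32);
}

/* Direct highpart shape: lower mult, upper mult, one comparison.  */
int
overflow_by_highpart (uint32_t x, uint32_t y, uint32_t *res)
{
  *res = x * y;
  return umul_highpart32 (x, y) != 0;
}

/* Indirect shape: one widening multiply, then a shift and a comparison.  */
int
overflow_by_widening (uint32_t x, uint32_t y, uint32_t *res)
{
  uint64_t wide = (uint64_t) x * y;
  *res = (uint32_t) wide;
  return (wide >> 32) != 0;
}
```

All three return the same flag and store the same truncated product; only the
first one needs a division.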


Note the multover internal function uses can_widen_mult_without_libcall to see
if we can do the widening (which can handle more cases than
can_mult_highpart_p == 2, really due to using expand_mult).

So I think the real answer is just to remove the extra checks and test
!can_mult_highpart_p.

But then we still miss (because, as mentioned above,
can_widen_mult_without_libcall handles more than can_mult_highpart_p):
```
int
f1 (unsigned char x, unsigned char y, unsigned char *res)
{
  *res = x * y;
  return x && (*res / x) != y;
}
```
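
For reference, a small self-check of what f1 computes. This is only a sketch:
f1_wide and count_f1_mismatches are names made up here, and it assumes a
compiler that provides __builtin_mul_overflow (GCC 5 or later, or clang). It
walks all unsigned char inputs and compares the division idiom against a
widened multiply and against the builtin with an unsigned char result.

```c
#include <limits.h>

/* The function from above, unchanged.  */
int
f1 (unsigned char x, unsigned char y, unsigned char *res)
{
  *res = x * y;
  return x && (*res / x) != y;
}

/* Hypothetical reference: do the multiply in a wider type and check
   whether the product fits in unsigned char.  */
int
f1_wide (unsigned char x, unsigned char y, unsigned char *res)
{
  unsigned int wide = (unsigned int) x * y;
  *res = (unsigned char) wide;
  return wide > UCHAR_MAX;
}

/* Count the inputs where the three formulations disagree on either the
   overflow flag or the stored result.  */
int
count_f1_mismatches (void)
{
  int bad = 0;
  for (int x = 0; x <= UCHAR_MAX; x++)
    for (int y = 0; y <= UCHAR_MAX; y++)
      {
        unsigned char r1, r2, r3;
        int o1 = f1 ((unsigned char) x, (unsigned char) y, &r1);
        int o2 = f1_wide ((unsigned char) x, (unsigned char) y, &r2);
        int o3 = __builtin_mul_overflow ((unsigned char) x,
                                         (unsigned char) y, &r3);
        if (o1 != o2 || o1 != o3 || r1 != r2 || r1 != r3)
          bad++;
      }
  return bad;
}
```

If the three agree on every input, f1 is semantically an unsigned char
multiply overflow check, which is the pattern we would like to catch here.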

Note I don't think we can make can_mult_highpart_p dependent on
can_widen_mult_without_libcall either, as that is a recursive dependency.


So maybe:
      || (code == MULT_EXPR
          && optab_handler (cast_stmt ? mulv4_optab : umulv4_optab,
                            TYPE_MODE (type)) == CODE_FOR_nothing
          /* The target has widening (via high part or otherwise)
             multiply for this mode. */
          && !can_mult_highpart_p (TYPE_MODE (type), true)
          /* Or if the mode is smaller than the word size. */
          && GET_MODE_BITSIZE (SCALAR_INT_TYPE_MODE (type)) >= BITS_PER_WORD)

The main target where this could be an issue is avr I think. I have to think
about this more.
