[Bug tree-optimization/114995] C++23 Assume keyword not being used for vectorization

aldyh at gcc dot gnu.org via Gcc-bugs Tue, 14 May 2024 02:00:07 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114995


--- Comment #9 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #7)
> The above examples just show misunderstanding what __builtin_assume_aligned
> is and what it is not.  You need to use the result of the built-in function
> in the accesses to be able to use the alignment information, if you just try
> to compare __builtin_assume_aligned (x, 32) == x, it will just fold as
> always true.  The design of the builtin is to attach the alignment
> information to the result of the builtin function only.
> 
> CCing Aldy/Andrew for whether prange can or could be taught to handle the
> assume cases with uintptr_t and bitwise and + comparison.
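
To make the quoted point concrete, here is a minimal sketch of the intended usage, assuming GCC or Clang (both provide __builtin_assume_aligned). The function name sum_aligned is hypothetical and not part of the bug report. The accesses go through the pointer returned by the builtin, because only that result carries the alignment information.

```cpp
#include <cstddef>

// Hypothetical helper, for illustration only: sums n floats while promising
// the compiler that `array` points to 32-byte-aligned storage.
float sum_aligned (const float *array, std::size_t n)
{
  // Only `aligned` carries the alignment assumption; `array` itself does not.
  const float *aligned
    = static_cast<const float *> (__builtin_assume_aligned (array, 32));

  float total = 0.0f;
  for (std::size_t i = 0; i < n; ++i)
    total += aligned[i];	// access through the builtin's result
  return total;
}
```

The caller must actually pass 32-byte-aligned storage (for instance an array declared with alignas(32)); otherwise the behavior is undefined.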

All the pieces are there to make it work, both with the assume aligned and with
the uintptr_t case.  And we could probably get it all without prange.

For example:

#include <cstdint>

void foo (const float *);

void bar1 (const float *array)
{
  [[assume(array != nullptr)]];
  const float *aligned = (const float *) __builtin_assume_aligned (array, 32);
  foo (aligned);
}

The __builtin_assume_aligned hasn't been expanded by evrp, so we should be able
to add a range-op entry for it.  This is what evrp sees:

void bar1 (const float * array)
{
  const float * aligned;

  <bb 2> :
  aligned_2 = __builtin_assume_aligned (array_1(D), 32);
  foo (aligned_2);
  return;

}

All we need is a range-op implementation for __builtin_assume_aligned.  The
attached crude implementation does it.

=========== BB 2 ============
    <bb 2> :
    aligned_2 = __builtin_assume_aligned (array_1(D), 32);
    foo (aligned_2);
    return;

aligned_2 : [prange] const float * [0, +INF] MASK 0xffffffff00000000 VALUE 0x0

That is, the bottom 32 bits are cleared.
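
For comparison, a 32-byte alignment by itself only guarantees that the low 5 bits of the pointer are zero.  The sketch below computes the value/mask pair one would expect in that case.  It assumes the convention the dumps appear to use (a set MASK bit means "unknown", a clear bit means "known, equal to the matching VALUE bit") and 64-bit pointers.  The names known_bits and aligned_pointer_bits are hypothetical and are not GCC's range-op API.

```cpp
#include <cstdint>

// Hypothetical stand-in for a value/mask pair.  Assumed convention: a set
// bit in `mask` is unknown; a clear bit is known and equals the matching
// bit of `value`.
struct known_bits
{
  std::uint64_t value;
  std::uint64_t mask;
};

// Known bits of a 64-bit pointer aligned to `align` bytes.  `align` must be
// a power of two: the low log2(align) bits are known zero, the rest unknown.
constexpr known_bits aligned_pointer_bits (std::uint64_t align)
{
  return { 0, ~(align - 1) };
}

// A 32-byte alignment leaves only the low 5 bits known.
static_assert (aligned_pointer_bits (32).mask == 0xffffffffffffffe0ULL,
	       "low 5 bits are known zero");
```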

Andrew will have to comment on the uintptr_t idiom, because it gets expanded
into an .ASSUME() function which I'm unfamiliar with.

For this small function:

void bar2 (const float *array)
{
  [[assume((uintptr_t (array) & (32 - 1)) == 0)]];
  foo (array);
}

evrp expands to:

=========== BB 2 ============
Partial equiv (array.0_3 pe64 array_2(D))
    <bb 2> :
    array.0_3 = (long unsigned int) array_2(D);
    _4 = array.0_3 & 31;
    _5 = _4 == 0;
    return _5;

_4 : [irange] long unsigned int [0, 31] MASK 0x1f VALUE 0x0

I don't see any reason why we couldn't deduce that array.0_3 and array_2 are
aligned to 32 bytes.  Maybe we don't set the value/mask pair in
bitwise_and::op1_range?  The value/mask stuff is not very fleshed out,
especially for the op1_range operators.
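
As a rough illustration of the deduction being asked for, here is a minimal sketch of the backwards step through a bitwise AND with a constant.  It is not GCC's range-op code: known_bits and and_op1_known_bits are hypothetical names, and the value/mask convention (set mask bit = unknown) is assumed from the dumps above.  Given lhs = op1 & c, every bit of op1 selected by c must equal the corresponding bit of lhs, and nothing is learned about the other bits.

```cpp
#include <cstdint>

// Hypothetical value/mask pair.  Assumed convention: a set bit in `mask` is
// unknown; a clear bit is known and equals the matching bit of `value`.
struct known_bits
{
  std::uint64_t value;
  std::uint64_t mask;
};

// Given lhs = op1 & c for a constant c, derive what is known about op1.
// A bit of op1 is known only where c selects it and lhs knows it.
constexpr known_bits and_op1_known_bits (known_bits lhs, std::uint64_t c)
{
  return { lhs.value & c & ~lhs.mask, ~c | lhs.mask };
}

// On the path where _4 == 0, every bit of _4 is known zero.  With c == 31,
// the low 5 bits of array.0_3 are then known zero, i.e. 32-byte alignment.
static_assert (and_op1_known_bits ({ 0, 0 }, 31).mask
	       == 0xffffffffffffffe0ULL, "low 5 bits of op1 are known");
```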
