On Thu, 20 Mar 2025, Richard Biener wrote:

Pointer arithmetic -- POINTER_DIFF_EXPR
---------------------------------------
Subtracting a pointer q from a pointer p is done using POINTER_DIFF_EXPR.
  * It is UB if the difference does not fit in a signed integer with the
    same precision as the pointers.

Yep.

This implies that an object's size must be less than half the address
space; otherwise, POINTER_DIFF_EXPR cannot be used to compute sizes in C.
But there may be additional restrictions. For example, compiling the
function:

   void foo(int *p, int *q, int n)
   {
     for (int i = 0; i < n; i++)
       p[i] = q[i] + 1;
   }

causes the vectorized code to perform overlap checks like:

   _7 = q_11(D) + 4;
   _25 = p_12(D) - _7;
   _26 = (sizetype) _25;
   _27 = _26 > 8;
   _28 = _27;
   if (_28 != 0)
     goto <bb 11>;
   else
     goto <bb 12>;

which takes the difference between two pointers that may point to
different objects. This suggests that all objects must fit within half the
address space.
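
To make the constraint concrete, here is a rough C model of the check in the dump. It is only a sketch: addresses are modeled as plain 64-bit unsigned integers (assuming a flat 64-bit address space), and the names diff_fits_signed, check_27 and check_examples are made up for illustration, not anything in GCC. The first helper says when the subtraction would be defined as a POINTER_DIFF_EXPR; the second computes the value of _27.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Addresses are modeled as 64-bit unsigned integers (an assumption: a flat
   64-bit address space), so nothing here is UB in C itself.  */

/* True if the mathematical value p - q fits in a signed 64-bit integer,
   i.e. if POINTER_DIFF_EXPR on p and q would be defined.  */
bool diff_fits_signed(uint64_t p, uint64_t q)
{
  if (p >= q)
    return p - q <= (uint64_t)INT64_MAX;
  /* Negative difference: the most negative representable value is -2^63.  */
  return q - p <= (uint64_t)INT64_MAX + 1;
}

/* The value of _27 in the dump: (sizetype)(p - (q + 4)) > 8, computed
   with wrapping unsigned arithmetic.  */
bool check_27(uint64_t p, uint64_t q)
{
  return p - (q + 4) > 8;
}

/* A few cases, written as assertions.  */
void check_examples(void)
{
  /* Nearby objects: the difference fits.  */
  assert(diff_fits_signed(0x2000, 0x1000));
  /* Objects more than half the address space apart: it does not.  */
  assert(!diff_fits_signed(UINT64_C(0xF000000000000000), 0x1000 + 4));
  /* p == q passes the check; p == q + 4 does not.  */
  assert(check_27(0x1000, 0x1000));
  assert(!check_27(0x1004, 0x1000));
}
```

The second assertion is the case in question: two separate objects, each small, whose addresses are far enough apart that the signed difference is not representable.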

Interesting detail.  But yes, the above is either incorrect code (I didn't
double-check) or this condition must hold.  For 64bit virtual address-space
it likely holds in practice.  For 32bit virtual address-space it might not.

Question: What are the restrictions on valid address ranges and object
sizes?

The restriction on object size is documented, as well as that objects
may not cross/include 'NULL'.  Can you open a bugreport for the
vectorizer overlap test issue above?  I think we want to track this and
at least document the restriction (we could possibly add a target hook
that can tell whether such a simplified check is OK).

PR119399


Issues
------
With the semantics I've described above, smtgcc flags many miscompilations
that I haven't reported yet.

As mentioned earlier, the vectorizer can use POINTER_PLUS_EXPR to generate
pointers that extend up to a vector length past the object (and similarly,
up to one vector length before the object in a backwards loop). This is
invalid if the object is too close to address 0 or 0xffffffffffffffff.
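
As a rough illustration of "too close", the two conditions can be written down on integer addresses. This is a sketch under my own assumptions (64-bit flat addresses, hypothetical helper names, vec_bytes standing for the vector length in bytes), not a description of what the vectorizer checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Addresses are modeled as 64-bit unsigned integers; unsigned arithmetic
   wraps, so the model itself has no UB.  */

/* True if obj_end + vec_bytes wraps past 0xffffffffffffffff.  */
bool past_end_wraps(uint64_t obj_end, uint64_t vec_bytes)
{
  return obj_end + vec_bytes < obj_end;
}

/* True if obj_start - vec_bytes is 0 or wraps below 0.  */
bool before_start_invalid(uint64_t obj_start, uint64_t vec_bytes)
{
  return obj_start <= vec_bytes;
}

/* A few cases with a 16-byte vector, written as assertions.  */
void check_examples(void)
{
  assert(past_end_wraps(UINT64_C(0xfffffffffffffff8), 16));
  assert(!past_end_wraps(0x1000, 16));
  assert(before_start_invalid(16, 16));   /* lands exactly on 0 */
  assert(before_start_invalid(8, 16));    /* wraps below 0 */
  assert(!before_start_invalid(0x1000, 16));
}
```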

And this out-of-bounds distance can be made arbitrarily large, as can be
seen by compiling the following function for x86_64 with -O3:

#include <stdint.h>

void foo(int *p, uint64_t s, int64_t n)
{
   for (int64_t i = 0; i < n; i++)
     {
       int *q = p + i * s;
       for (int j = 0; j < 4; j++)
         *q++ = j;
     }
}

Here, the vectorized code adds s * 4 to the pointer ivtmp_8 at the end of
each iteration:

   <bb 5> :
   bnd.8_38 = (unsigned long) n_10(D);
   _21 = s_11(D) * 4;

   <bb 3> :
   # ivtmp_8 = PHI <ivtmp_7(6), p_12(D)(5)>
   # ivtmp_26 = PHI <ivtmp_20(6), 0(5)>
   MEM <vector(4) int> [(int *)ivtmp_8] = { 0, 1, 2, 3 };
   ivtmp_7 = ivtmp_8 + _21;
   ivtmp_20 = ivtmp_26 + 1;
   if (ivtmp_20 < bnd.8_38)
     goto <bb 6>;
   else
     goto <bb 7>;

   <bb 6> :
   goto <bb 3>;

This means that calling foo with a sufficiently large s is guaranteed to
make the pointer wrap or evaluate to 0, even though the original IR before
optimization neither wrapped nor evaluated to 0. For example, when calling
foo as:

   foo(p, -((uintptr_t)p / 4), 1);
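
The arithmetic behind that call can be checked with a small model of the update ivtmp_7 = ivtmp_8 + _21. This is only a sketch: addresses are 64-bit unsigned integers (an assumption), and next_ivtmp, wrapping_stride and check_examples are names invented here.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the vectorized loop's pointer update, ivtmp_7 = ivtmp_8 + _21
   with _21 = s * 4, on 64-bit integers standing in for addresses.
   Unsigned arithmetic wraps, so the model itself has no UB.  */
uint64_t next_ivtmp(uint64_t ivtmp_8, uint64_t s)
{
  uint64_t step = s * 4;   /* _21 in the dump */
  return ivtmp_8 + step;
}

/* The stride used in the call above: s = -((uintptr_t)p / 4).  */
uint64_t wrapping_stride(uint64_t p)
{
  return 0 - p / 4;
}

/* Two cases, written as assertions.  */
void check_examples(void)
{
  /* A 4-byte aligned p: the updated pointer is exactly 0.  */
  assert(next_ivtmp(0x7ffc1000, wrapping_stride(0x7ffc1000)) == 0);
  /* A misaligned p: the update leaves only the remainder p % 4.  */
  assert(next_ivtmp(0x1002, wrapping_stride(0x1002)) == 2);
}
```

With n = 1 the source loop only ever forms p + 0, so the value 0 appears solely in the vectorized code.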

I guess this is essentially the same issue as PR113590, and could be fixed
by moving all induction variable updates to the latch.

Yes.

But if that happens, the motivating examples for needing to handle invalid
pointer values would no longer occur. Would that mean smtgcc should adopt
a more restrictive semantics for pointer values?

In principle yes, but PR113590 blocks this I guess.  Could smtgcc consider
an SSA def to be happening only at the latest possible point, that is,
"virtually" sink it to the latch?  (I realize that when you have two
dependent defs that can only be moved together, such "virtual" sinking
could be somewhat complicated.)

I think it feels a bit strange (and brittle) to define the semantics by
virtually sinking everything as much as possible. But isn't this sinking
just another way of saying that the operation is UB only if the value is
used? That is, we can solve this essentially the same way LLVM does with
its deferred UB.

The idea is that POINTER_DIFF_EXPR, PLUS_EXPR, etc. are defined for all
inputs, but the result is a poison value when it doesn't fit in the return
type. Any use of a poison value is UB.

This is trivial to implement in smtgcc and would solve PR113590.
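
A minimal C model of that idea, to show the shape of it. This is a sketch and not smtgcc's implementation: the value type, the function names, and the choice of 64-bit operands passed as bit patterns are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Result of an operation: a bit pattern plus a poison flag.  */
typedef struct {
  bool poison;
  uint64_t bits;   /* two's complement bit pattern; meaningless if poison */
} value;

/* PLUS_EXPR on signed 64-bit operands given as bit patterns.  Defined for
   all inputs; the result is poison on signed overflow.  */
value plus_expr(uint64_t a, uint64_t b)
{
  uint64_t r = a + b;   /* wraps, no UB in the model */
  /* Signed overflow iff a and b share a sign bit that r does not.  */
  bool overflow = (((a ^ r) & (b ^ r)) >> 63) != 0;
  return (value){ overflow, r };
}

/* POINTER_DIFF_EXPR on integer addresses.  Defined for all inputs; the
   result is poison if the difference does not fit in 64 signed bits.  */
value pointer_diff_expr(uint64_t p, uint64_t q)
{
  uint64_t r = p - q;
  bool fits = p >= q ? r <= (uint64_t)INT64_MAX
                     : q - p <= (uint64_t)INT64_MAX + 1;
  return (value){ !fits, r };
}

/* Using a value is UB exactly when the value is poison.  */
bool use_is_ub(value v)
{
  return v.poison;
}

/* A few cases, written as assertions.  */
void check_examples(void)
{
  assert(!use_is_ub(plus_expr(1, 2)));
  assert(use_is_ub(plus_expr((uint64_t)INT64_MAX, 1)));
  assert(!use_is_ub(pointer_diff_expr(0x2000, 0x1000)));
  assert(use_is_ub(pointer_diff_expr(UINT64_C(0x8000000000000000), 0)));
}
```

In this model, computing an overflowing result is harmless by itself; only a later call to use_is_ub on it reports UB, which is the same effect as sinking the def to its use.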

   /Krister
