On Fri, Apr 4, 2025 at 3:37 AM Krister Walfridsson
<krister.walfrids...@gmail.com> wrote:
>
> On Thu, 3 Apr 2025, Richard Biener wrote:
>
> > On Thu, Apr 3, 2025 at 2:23 AM Krister Walfridsson via Gcc
> > <gcc@gcc.gnu.org> wrote:
> >>
> >> I have more questions about GIMPLE memory semantics for smtgcc.
> >>
> >> As before, each section starts with a description of the semantics I've
> >> implemented (or plan to implement), followed by concrete questions if
> >> relevant. Let me know if the described semantics are incorrect or
> >> incomplete.
> >>
> >>
> >> Accessing memory
> >> ----------------
> >> Memory access in GIMPLE is done using GIMPLE_ASSIGN statements where the
> >> lhs and/or rhs is a memory reference expression (such as MEM_REF). When
> >> both lhs and rhs access memory, one of the following must hold --
> >> otherwise the access is UB:
> >>   1. There is no overlap between lhs and rhs
> >>   2. lhs and rhs represent the same address
> >>
> >> A memory access is also UB in the following cases:
> >>   * Any accessed byte is outside valid memory
> >>   * The pointer violates the alignment requirements
> >>   * The pointer provenance doesn't match the object
> >>   * The type is incorrect from a TBAA perspective
> >>   * It's a store to constant memory
> >
> > correct.
> >
> > Note that GIMPLE_CALL and GIMPLE_ASM can also access memory.
> >
> >> smtgcc requires -fno-strict-aliasing for now, so I'll ignore TBAA in this
> >> mail. Provenance has its own issues, which I'll come back to in a separate
> >> mail.
> >>
> >>
> >> Checking memory access is within bounds
> >> ---------------------------------------
> >> A memory access may be represented by a chain of memory reference
> >> expressions such as MEM_REF, ARRAY_REF, COMPONENT_REF, etc. For example,
> >> accessing a structure:
> >>
> >>    struct s {
> >>      int x, y;
> >>    };
> >>
> >> as:
> >>
> >>    int foo (struct s * p)
> >>    {
> >>      int _3;
> >>
> >>      <bb 2> :
> >>      _3 = p_1(D)->x;
> >>      return _3;
> >>    }
> >>
> >> involves a MEM_REF for the whole object and a COMPONENT_REF to select the
> >> field. Conceptually, we load the entire structure and then pick out the
> >> element -- so all bytes of the structure must be in valid memory.
> >>
> >> We could also do the access as:
> >>
> >>    int foo (struct s * p)
> >>    {
> >>      int * q;
> >>      int _3;
> >>
> >>      <bb 2> :
> >>      q_2 = &p_1(D)->x;
> >>      _3 = *q_2;
> >>      return _3;
> >>    }
> >>
> >> This calculates the address of the element, and then reads it as an
> >> integer, so only the four bytes of x must be in valid memory.
> >>
> >> In other words, the compiler is not allowed to optimize:
> >>    q_2 = &p_1(D)->x;
> >>    _3 = *q_2;
> >> to
> >>    _3 = p_1(D)->x;
> >
> > Correct.  The reason that p_1(D)->x is considered accessing the whole
> > object is because of TBAA, so with -fno-strict-aliasing there is no UB
> > when the whole object isn't accessible (but the subsetted piece is).
>
> I'm still a bit confused... Assume we're using -fno-strict-aliasing and p
> points to a 4-byte buffer in the example above. Then:
>    _3 = p_1(D)->x;
> is valid. So that means my paragraph starting with "In other words..."
> must be incorrect? That is, the compiler is allowed to optimize:
>    q_2 = &p_1(D)->x;
>    _3 = *q_2;
> to
>    _3 = p_1(D)->x;
> since it's equally valid. But this optimization would be invalid for
> -fstrict-aliasing?

Yes.  Note we avoid this transform even with -fno-strict-aliasing.

> And a memory access like:
>    _4 = p_1(D)->y;
> would be invalid (for both -fno-strict-aliasing and -fstrict-aliasing)?
> That is, the subsetted piece must be accessible?

Of course.

Note even with -fno-strict-aliasing pointer provenance rules apply,
so if you manage to lay out i and j after each other

  extern int i, j;

accessing j via a pointer to i is still UB.

Richard.

>
>     /Krister

Reply via email to