On Fri, Apr 4, 2025 at 3:37 AM Krister Walfridsson <krister.walfrids...@gmail.com> wrote: > > On Thu, 3 Apr 2025, Richard Biener wrote: > > > On Thu, Apr 3, 2025 at 2:23 AM Krister Walfridsson via Gcc > > <gcc@gcc.gnu.org> wrote: > >> > >> I have more questions about GIMPLE memory semantics for smtgcc. > >> > >> As before, each section starts with a description of the semantics I've > >> implemented (or plan to implement), followed by concrete questions if > >> relevant. Let me know if the described semantics are incorrect or > >> incomplete. > >> > >> > >> Accessing memory > >> ---------------- > >> Memory access in GIMPLE is done using GIMPLE_ASSIGN statements where the > >> lhs and/or rhs is a memory reference expression (such as MEM_REF). When > >> both lhs and rhs access memory, one of the following must hold -- > >> otherwise the access is UB: > >> 1. There is no overlap between lhs and rhs > >> 2. lhs and rhs represent the same address > >> > >> A memory access is also UB in the following cases: > >> * Any accessed byte is outside valid memory > >> * The pointer violates the alignment requirements > >> * The pointer provenance doesn't match the object > >> * The type is incorrect from a TBAA perspective > >> * It's a store to constant memory > > > > correct. > > > > Note that GIMPLE_CALL and GIMPLE_ASM can also access memory. > > > >> smtgcc requires -fno-strict-aliasing for now, so I'll ignore TBAA in this > >> mail. Provenance has its own issues, which I'll come back to in a separate > >> mail. > >> > >> > >> Checking memory access is within bounds > >> --------------------------------------- > >> A memory access may be represented by a chain of memory reference > >> expressions such as MEM_REF, ARRAY_REF, COMPONENT_REF, etc. For example, > >> accessing a structure: > >> > >> struct s { > >> int x, y; > >> }; > >> > >> as: > >> > >> int foo (struct s * p) > >> { > >> int _3; > >> > >> <bb 2> : > >> _3 = p_1(D)->x; > >> return _3; > >> } > >> > >> involves a MEM_REF for the whole object and a COMPONENT_REF to select the > >> field. Conceptually, we load the entire structure and then pick out the > >> element -- so all bytes of the structure must be in valid memory. > >> > >> We could also do the access as: > >> > >> int foo (struct s * p) > >> { > >> int * q; > >> int _3; > >> > >> <bb 2> : > >> q_2 = &p_1(D)->x; > >> _3 = *q_2; > >> return _3; > >> } > >> > >> This calculates the address of the element, and then reads it as an > >> integer, so only the four bytes of x must be in valid memory. > >> > >> In other words, the compiler is not allowed to optimize: > >> q_2 = &p_1(D)->x; > >> _3 = *q_2; > >> to > >> _3 = p_1(D)->x; > > > > Correct. The reason that p_1(D)->x is considered accessing the whole > > object is because of TBAA, so with -fno-strict-aliasing there is no UB > > when the whole object isn't accessible (but the subsetted piece is). > > I'm still a bit confused... Assume we're using -fno-strict-aliasing and p > points to a 4-byte buffer in the example above. Then: > _3 = p_1(D)->x; > is valid. So that means my paragraph starting with "In other words..." > must be incorrect? That is, the compiler is allowed to optimize: > q_2 = &p_1(D)->x; > _3 = *q_2; > to > _3 = p_1(D)->x; > since it's equally valid. But this optimization would be invalid for > -fstrict-aliasing?
Yes. Note we avoid this transform even with -fno-strict-aliasing. > And a memory access like: > _4 = p_1(D)->y; > would be invalid (for both -fno-strict-aliasing and -fstrict-aliasing)? > That is, the subsetted piece must be accessible? Of course. Note even with -fno-strict-aliasing pointer provenance rules apply, so if you manage to lay out i and j after each other extern int i, j; accessing j via a pointer to i is still UB. Richard. > > /Krister