https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114404

            Bug ID: 114404
           Summary: [11] GCC reorders stores when it probably shouldn't
           Product: gcc
           Version: 11.4.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: iii at linux dot ibm.com
  Target Milestone: ---

Reproducible with gcc commit 1b5510a59163.
I'm writing this up as a result of the following linux kernel discussion:

https://lore.kernel.org/bpf/c9923c1d-971d-4022-8dc8-1364e929d...@gmail.com/
https://lore.kernel.org/bpf/20240320015515.11883-1-...@linux.ibm.com/

In the following code:

extern const char bpf_plt[];
extern const char bpf_plt_ret[];
extern const char bpf_plt_target[];
static void bpf_jit_plt(void *plt, void *ret, void *target)
{
        memcpy(plt, bpf_plt, BPF_PLT_SIZE);
        *(void **)((char *)plt + (bpf_plt_ret - bpf_plt)) = ret;
        *(void **)((char *)plt + (bpf_plt_target - bpf_plt)) = target ?: ret;
}

GCC 11's sched1 pass reorders the memcpy() and the two assignments, so the
stores end up before the copy.  In GCC 12 this behavior is gone after

commit 2e96b5f14e4025691b57d2301d71aa6092ed44bc
Author: Aldy Hernandez <al...@redhat.com>
Date:   Tue Jun 15 12:32:51 2021 +0200

    Backwards jump threader rewrite with ranger.

but this seems to be accidental.  Internally, output_dependence() for the
respective mems returns false, because it believes that they are based on
different SYMBOL_REFs.  This may be because on the C level we are not allowed
to subtract pointers to different objects.

Casting the pointers to long before subtracting them should work around this,
since the C pointer subtraction rules would then no longer apply, but in
practice it makes no difference.
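
For illustration, a minimal sketch of what such a long-cast variant might look
like.  This is not the attached minimized source.  The template blob
plt_template, the value of BPF_PLT_SIZE and the label offsets 16 and 24 are
assumptions made so the snippet compiles on its own; in the kernel the three
symbols are labels defined in assembly, which is what makes them distinct
SYMBOL_REFs.

```c
#include <string.h>

#define BPF_PLT_SIZE 32 /* assumed value, for illustration only */

/* Hypothetical stand-in for the assembly-defined PLT template: a single
 * blob, with the three "labels" modelled as pointers into it. */
static const char plt_template[BPF_PLT_SIZE];
static const char *const bpf_plt = plt_template;
static const char *const bpf_plt_ret = plt_template + 16;    /* assumed offset */
static const char *const bpf_plt_target = plt_template + 24; /* assumed offset */

static void bpf_jit_plt(void *plt, void *ret, void *target)
{
        /* Subtract integers rather than pointers, so the C rule about
         * subtracting pointers to different objects does not apply. */
        long ret_off = (long)bpf_plt_ret - (long)bpf_plt;
        long target_off = (long)bpf_plt_target - (long)bpf_plt;

        /* Copy the template first, then patch the two pointer slots. */
        memcpy(plt, bpf_plt, BPF_PLT_SIZE);
        *(void **)((char *)plt + ret_off) = ret;
        *(void **)((char *)plt + target_off) = target ? target : ret;
}
```

The intended result is that the two patched slots survive the copy, which only
holds if both stores stay after the memcpy().  Because everything here lives in
one object, this sketch is not expected to reproduce the reordering by itself.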

In the attached minimized preprocessed source with long casts we get:

        stg     %r3,232(%r2,%r15)
        ltgr    %r11,%r11
        locgrne %r3,%r11
        stg     %r3,232(%r1,%r15)
        la      %r2,0(%r1,%r9)
        la      %r3,232(%r1,%r15)
        mvc     232(16,%r15),0(%r5)
        mvc     248(16,%r15),16(%r5)
        lghi    %r4,8
        brasl   %r14,s390_kernel_write@PLT

so the assignments (the two stg instructions) are placed before the memcpy()
(the two mvc instructions).
