https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121773

            Bug ID: 121773
           Summary: Combine over-simplifies a subreg write
           Product: gcc
           Version: 16.0
            Status: UNCONFIRMED
          Keywords: wrong-code
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rearnsha at gcc dot gnu.org
                CC: segher at gcc dot gnu.org
  Target Milestone: ---
            Target: arm

With this testcase, compiled with -march=armv7-a+simd -mfpu=auto -marm
-mfloat-abi=hard

#include <arm_neon.h>

uint64x1_t foo() {
  uint64x2_t v36 = vdupq_n_u64(0x2020000012345678);
  uint64x1_t v48 = vget_low_u64(v36);
  uint64x1_t v50 = vadd_u64(v48, v48);
  return vpadal_u32(v50, vdup_n_u32(0));
}

is miscompiled to:

        vldr.64 d16, .L2        @ int
        vmov.i32        d17, #0  @ v2si
        vpadal.u32      d16, d17
        vmov    r0, r1, d16     @ int
        bx      lr
.L2:
        .word   0
        .word   1077936128

We get, prior to combine:

(insn 21 20 7 2 (set (reg:DI 101 [ _5 ])
        (const_int 0 [0])) "/home/rearnsha/gnusrc/gcc/master/gcc/config/arm/arm_neon.h":607:14 -1
     (nil))
(insn 7 21 8 2 (parallel [
            (set (reg:CC_C 80 cc)
                (compare:CC_C (plus:SI (reg:SI 104 [ _6 ])
                        (reg:SI 104 [ _6 ]))
                    (reg:SI 104 [ _6 ])))
            (set (subreg:SI (reg:DI 101 [ _5 ]) 0)
                (plus:SI (reg:SI 104 [ _6 ])
                    (reg:SI 104 [ _6 ])))
        ]) "/home/rearnsha/gnusrc/gcc/master/gcc/config/arm/arm_neon.h":607:14 17 {addsi3_compare_op1}
     (expr_list:REG_DEAD (reg:SI 104 [ _6 ])
        (nil)))
(insn 8 7 9 2 (set (subreg:SI (reg:DI 101 [ _5 ]) 4)
        (plus:SI (plus:SI (reg:SI 105 [ _6+4 ])
                (reg:SI 105 [ _6+4 ]))
            (ltu:SI (reg:CC_C 80 cc)
                (const_int 0 [0])))) "/home/rearnsha/gnusrc/gcc/master/gcc/config/arm/arm_neon.h":607:14 21 {addsi3_carryin}

That is:
insn 21 clears R101;
insn 7 writes the low part of R101 with an addition that also sets the carry
flag on unsigned overflow;
insn 8 writes the top part of R101 with an addition with carry-in.

In this specific test, R104 and R105 are known constants.  It appears that
combine tries to merge insns 21 and 8 with:

Trying 21 -> 8:
   21: r101:DI=0
    8: r101:DI#4=0x40400000
Successfully matched this instruction:
(set (reg:DI 101 [ _5 ])
    (const_int 4629700416936869888 [0x4040000000000000]))
i.e. it writes the whole of r101 with only the top part of the addition,
leaving the low part zero.

Somehow combine ignores that this will overwrite the intervening write of the
low part by insn 7, which subsequently becomes dead code and is eliminated.
