https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95650

            Bug ID: 95650
           Summary: aarch64: Missed optimization storing addition of two
                    shorts
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: acoplan at gcc dot gnu.org
  Target Milestone: ---

With the following C code:

void foo(unsigned short a, unsigned short b, unsigned short *ptr)
{
    *ptr = a + b;
}

AArch64 GCC at -O2 generates:

foo:
        and     w1, w1, 65535
        add     w0, w1, w0, uxth
        strh    w0, [x2]
        ret

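but the and instruction is redundant, and the add has no need to zero-extend
its operand: strh stores only the low 16 bits of the sum, and those bits do
not depend on the upper bits of either input.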
Indeed, clang generates:

foo:                                    // @foo
        add     w8, w1, w0
        strh    w8, [x2]
        ret
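
The redundancy can be checked with a minimal C sketch. It is an illustration
only, not part of the original report: the function names are hypothetical,
and the upper register bits are modelled as arbitrary "garbage" values on the
assumption that nothing guarantees they are zero on entry. One function
mirrors the masked GCC sequence, the other the plain add, and both results
are truncated to 16 bits as strh would do.

```c
#include <assert.h>
#include <stdint.h>

/* Models the AArch64 GCC sequence: zero-extend both inputs, add,
   then keep only bits 15:0 (what strh writes to memory). */
uint16_t store_sum_masked(uint32_t a, uint32_t b)
{
    uint32_t sum = (a & 0xFFFFu) + (b & 0xFFFFu);
    return (uint16_t)sum;
}

/* Models the clang sequence: add the full 32-bit registers,
   whatever their upper halves contain, then keep bits 15:0. */
uint16_t store_sum_plain(uint32_t a, uint32_t b)
{
    uint32_t sum = a + b;
    return (uint16_t)sum;
}

/* Returns 1 if both models agree for a sample of low halves combined
   with the given (hypothetical) upper-half contents, 0 otherwise. */
int sequences_agree(uint32_t garbage_a, uint32_t garbage_b)
{
    for (uint32_t lo_a = 0; lo_a <= 0xFFFFu; lo_a += 257u) {
        for (uint32_t lo_b = 0; lo_b <= 0xFFFFu; lo_b += 251u) {
            uint32_t a = (garbage_a << 16) | lo_a;
            uint32_t b = (garbage_b << 16) | lo_b;
            if (store_sum_masked(a, b) != store_sum_plain(a, b))
                return 0;
        }
    }
    return 1;
}

/* Spot checks: clean upper halves and non-zero upper halves. */
void check_redundant_mask(void)
{
    assert(sequences_agree(0u, 0u));
    assert(sequences_agree(0xFFFFu, 0x1234u));
}
```

Calling check_redundant_mask() from a test driver should pass silently: a
carry out of bit 15 only affects bits 16 and above, which the halfword store
discards.
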

Notably AArch32 GCC gives the optimal sequence:

foo:
        add     r1, r0, r1
        strh    r1, [r2]        @ movhi
        bx      lr

and comparing the RTL at expand time shows that AArch64 has HI -> SI
zero_extends to contend with (which AArch32 doesn't).

AArch32 RTL:

(insn 2 6 3 2 (set (reg/v:SI 111 [ a ])
        (reg:SI 0 r0 [ a ])) "./example.c":2 -1
     (nil))
(insn 3 2 4 2 (set (reg/v:SI 112 [ b ])
        (reg:SI 1 r1 [ b ])) "./example.c":2 -1
     (nil))
(insn 4 3 5 2 (set (reg/v/f:SI 113 [ ptr ])
        (reg:SI 2 r2 [ ptr ])) "./example.c":2 -1
     (nil))
(note 5 4 8 2 NOTE_INSN_FUNCTION_BEG)
(debug_insn 8 5 9 2 (debug_marker) "./example.c":3 -1
     (nil))
(insn 9 8 10 2 (set (reg:SI 114)
        (plus:SI (reg/v:SI 111 [ a ])
            (reg/v:SI 112 [ b ]))) "./example.c":3 -1
     (nil))
(insn 10 9 0 2 (set (mem:HI (reg/v/f:SI 113 [ ptr ]) [1 *ptr_5(D)+0 S2 A16])
        (subreg:HI (reg:SI 114) 0)) "./example.c":3 -1
     (nil))

AArch64 RTL:

(note 1 0 6 NOTE_INSN_DELETED)
(note 6 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(insn 2 6 3 2 (set (reg/v:SI 91 [ a ])
        (zero_extend:SI (reg:HI 0 x0 [ a ]))) "./example.c":2 -1
     (nil))
(insn 3 2 4 2 (set (reg/v:SI 92 [ b ])
        (zero_extend:SI (reg:HI 1 x1 [ b ]))) "./example.c":2 -1
     (nil))
(insn 4 3 5 2 (set (reg/v/f:DI 93 [ ptr ])
        (reg:DI 2 x2 [ ptr ])) "./example.c":2 -1
     (nil))
(note 5 4 8 2 NOTE_INSN_FUNCTION_BEG)
(debug_insn 8 5 9 2 (debug_marker) "./example.c":3 -1
     (nil))
(insn 9 8 10 2 (set (reg:SI 94)
        (plus:SI (reg/v:SI 91 [ a ])
            (reg/v:SI 92 [ b ]))) "./example.c":3 -1
     (nil))
(insn 10 9 11 2 (set (reg:HI 95)
        (subreg:HI (reg:SI 94) 0)) "./example.c":3 -1
     (nil))
(insn 11 10 0 2 (set (mem:HI (reg/v/f:DI 93 [ ptr ]) [1 *ptr_5(D)+0 S2 A16])
        (reg:HI 95)) "./example.c":3 -1
     (nil))
