https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95650

            Bug ID: 95650
           Summary: aarch64: Missed optimization storing addition of two
                    shorts
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: acoplan at gcc dot gnu.org
  Target Milestone: ---

With the following C code:

void foo(unsigned short a, unsigned short b, unsigned short *ptr)
{
  *ptr = a + b;
}

AArch64 GCC at -O2 generates:

foo:
        and     w1, w1, 65535
        add     w0, w1, w0, uxth
        strh    w0, [x2]
        ret

but the and is redundant (and there's no need for the zero-extended add).
Indeed, clang generates:

foo:                                    // @foo
        add     w8, w1, w0
        strh    w8, [x2]
        ret

Notably AArch32 GCC gives the optimal sequence:

foo:
        add     r1, r0, r1
        strh    r1, [r2]        @ movhi
        bx      lr

and comparing the RTL at expand time shows that AArch64 has HI -> SI
zero_extends to contend with (which AArch32 doesn't).

AArch32 RTL:

(insn 2 6 3 2 (set (reg/v:SI 111 [ a ])
        (reg:SI 0 r0 [ a ])) "./example.c":2 -1
     (nil))
(insn 3 2 4 2 (set (reg/v:SI 112 [ b ])
        (reg:SI 1 r1 [ b ])) "./example.c":2 -1
     (nil))
(insn 4 3 5 2 (set (reg/v/f:SI 113 [ ptr ])
        (reg:SI 2 r2 [ ptr ])) "./example.c":2 -1
     (nil))
(note 5 4 8 2 NOTE_INSN_FUNCTION_BEG)
(debug_insn 8 5 9 2 (debug_marker) "./example.c":3 -1
     (nil))
(insn 9 8 10 2 (set (reg:SI 114)
        (plus:SI (reg/v:SI 111 [ a ])
            (reg/v:SI 112 [ b ]))) "./example.c":3 -1
     (nil))
(insn 10 9 0 2 (set (mem:HI (reg/v/f:SI 113 [ ptr ]) [1 *ptr_5(D)+0 S2 A16])
        (subreg:HI (reg:SI 114) 0)) "./example.c":3 -1
     (nil))

AArch64 RTL:

(note 1 0 6 NOTE_INSN_DELETED)
(note 6 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(insn 2 6 3 2 (set (reg/v:SI 91 [ a ])
        (zero_extend:SI (reg:HI 0 x0 [ a ]))) "./example.c":2 -1
     (nil))
(insn 3 2 4 2 (set (reg/v:SI 92 [ b ])
        (zero_extend:SI (reg:HI 1 x1 [ b ]))) "./example.c":2 -1
     (nil))
(insn 4 3 5 2 (set (reg/v/f:DI 93 [ ptr ])
        (reg:DI 2 x2 [ ptr ])) "./example.c":2 -1
     (nil))
(note 5 4 8 2 NOTE_INSN_FUNCTION_BEG)
(debug_insn 8 5 9 2 (debug_marker) "./example.c":3 -1
     (nil))
(insn 9 8 10 2 (set (reg:SI 94)
        (plus:SI (reg/v:SI 91 [ a ])
            (reg/v:SI 92 [ b ]))) "./example.c":3 -1
     (nil))
(insn 10 9 11 2 (set (reg:HI 95)
        (subreg:HI (reg:SI 94) 0)) "./example.c":3 -1
     (nil))
(insn 11 10 0 2 (set (mem:HI (reg/v/f:DI 93 [ ptr ]) [1 *ptr_5(D)+0 S2 A16])
        (reg:HI 95)) "./example.c":3 -1
     (nil))
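
The redundancy claimed in this report can be checked with a small model in C. The sketch below is illustrative only and is not part of the original report: the function names add_with_extends and add_without_extends are hypothetical, and it assumes the upper 16 bits of the incoming 32-bit registers may hold arbitrary values. Because the store writes only the low 16 bits of the sum, and the low 16 bits of a sum depend only on the low 16 bits of each operand, masking the operands first should not change the stored value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the sequence GCC emits on AArch64 in the report:
   both operands are zero-extended from 16 bits before the 32-bit add,
   and only the low 16 bits of the sum are stored. */
static uint16_t add_with_extends(uint32_t a_reg, uint32_t b_reg)
{
    uint32_t a = a_reg & 0xFFFFu;   /* models the uxth operand extension */
    uint32_t b = b_reg & 0xFFFFu;   /* models "and w1, w1, 65535" */
    return (uint16_t)(a + b);       /* models strh: keep the low 16 bits */
}

/* Hypothetical model of the shorter sequence: a plain 32-bit add
   followed by the same 16-bit store, with no extension of the inputs. */
static uint16_t add_without_extends(uint32_t a_reg, uint32_t b_reg)
{
    return (uint16_t)(a_reg + b_reg);
}

/* Returns 1 when both models store the same halfword for these inputs. */
static int sequences_agree(uint32_t a_reg, uint32_t b_reg)
{
    return add_with_extends(a_reg, b_reg) == add_without_extends(a_reg, b_reg);
}
```

Calling sequences_agree with register values whose upper halves are non-zero (for example 0xABCD1234 and 0x5678FFFF) exercises the case where the extensions would actually change the operands, yet the stored halfword is the same in both models.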