Hi all,

One issue that we keep encountering on aarch64 is GCC not making good use of the flag-setting arithmetic instructions like ADDS, SUBS, ANDS etc. that perform an arithmetic operation and compare the result against zero. They are represented in a fairly standard way in the backend as PARALLEL patterns:

(parallel [(set (reg x1) (op (reg x2) (reg x3)))
           (set (reg cc) (compare (op (reg x2) (reg x3)) (const_int 0)))])
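
To make the pattern concrete at the source level, here is a minimal sketch. It is a hypothetical function, not the testcase from the patch, and the name add_is_zero is made up for illustration. The addition and the test of its result against zero correspond to the two halves of the PARALLEL, so a single ADDS could in principle produce both.

```c
#include <stdbool.h>

/* Hypothetical illustration, not taken from the patch or its testsuite.
   The addition and the comparison of its result against zero are the
   two sets that a flag-setting instruction such as ADDS can perform
   at once.  */
bool
add_is_zero (int x, int y, int *sum)
{
  *sum = x + y;        /* (set (reg x1) (plus (reg x2) (reg x3))) */
  return *sum == 0;    /* (set (reg cc) (compare (plus ...) (const_int 0))) */
}
```

Whether the compiler actually merges the two operations depends on the passes involved and on how the result is used around the comparison.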

GCC isn't forming these from separate arithmetic and comparison instructions as aggressively as it could. A particular pain point is when the result of the arithmetic insn is used before the comparison instruction. The testcase in this patch is one such example where we have:

(insn 7 35 33 2 (set (reg/v:SI 0 x0 [orig:73 <retval> ] [73])
        (plus:SI (reg:SI 0 x0 [ x ])
            (reg:SI 1 x1 [ y ]))) "comb.c":3 95 {*addsi3_aarch64}
     (nil))
(insn 33 7 34 2 (set (reg:SI 1 x1 [77])
        (plus:SI (reg/v:SI 0 x0 [orig:73 <retval> ] [73])
            (const_int 2 [0x2]))) "comb.c":4 95 {*addsi3_aarch64}
     (nil))
(insn 34 33 17 2 (set (reg:CC 66 cc)
        (compare:CC (reg/v:SI 0 x0 [orig:73 <retval> ] [73])
            (const_int 0 [0]))) "comb.c":4 391 {cmpsi}
     (nil))

This scares combine away as x0 is used in insn 33 as well as the comparison in insn 34.

I think the compare-elim pass can help us here. This patch extends it by handling comparisons against zero, finding the defining instruction of the compare and merging the comparison with the defining instruction into a PARALLEL that will hopefully match the form described above. If between the comparison and the defining insn we find an instruction that uses the condition registers or any control flow we bail out, and we don't cross basic blocks. This simple technique allows us to catch cases such as this example.

Bootstrapped and tested on arm-none-linux-gnueabihf, aarch64-none-linux-gnu and x86_64.

Ok for trunk?

2017-08-05  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>
            Michael Collison  <michael.colli...@arm.com>

	* compare-elim.c: Include emit-rtl.h.
	(can_merge_compare_into_arith): New function.
	(try_validate_parallel): Likewise.
	(try_merge_compare): Likewise.
	(try_eliminate_compare): Call the above when no previous clobber
	is available.
	(execute_compare_elim_after_reload): Add DF_UD_CHAIN and DF_DU_CHAIN
	dataflow problems.

2017-08-05  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>
            Michael Collison  <michael.colli...@arm.com>

	* gcc.target/aarch64/cmpelim_mult_uses_1.c: New test.
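
P.S. The bail-out rule described in this message (walk back from the compare to the insn defining the compared register, give up on an intervening condition-register use or control flow, never leave the basic block) can be modelled with a toy sketch. This is not the code in compare-elim.c: struct toy_insn, its fields and find_mergeable_def are hypothetical names invented for illustration, and a basic block is modelled as a plain array.

```c
#include <stdbool.h>

/* Toy model of an instruction.  The field names are hypothetical and
   do not correspond to GCC's rtx representation.  */
struct toy_insn
{
  int def_reg;        /* register written by the insn, or -1 if none */
  bool uses_cc;       /* insn reads the condition register */
  bool control_flow;  /* insn is a jump, call or other control flow */
};

/* BB models a single basic block and CMP is the index of the compare
   insn within it.  Scan backwards for the insn that defines REG.
   Return its index, or -1 if an insn between the two uses the
   condition register or is control flow, or if no definition of REG
   exists in this block.  */
int
find_mergeable_def (const struct toy_insn *bb, int cmp, int reg)
{
  for (int i = cmp - 1; i >= 0; i--)
    {
      if (bb[i].def_reg == reg)
        return i;                    /* defining insn found */
      if (bb[i].uses_cc || bb[i].control_flow)
        return -1;                   /* bail out */
    }
  return -1;                         /* would have to cross the block */
}
```

With the three insns above modelled as { {0, false, false}, {1, false, false}, {-1, false, false} }, calling find_mergeable_def (bb, 2, 0) returns index 0, i.e. insn 7: the intervening insn 33 only reads x0, so it does not block the merge in this model.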
pr5198v1.patch