This patch addresses PR rtl-optimization/106594, a P1 performance regression affecting aarch64.
This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32},
with no new failures. If someone (who can regression test this on
aarch64) could take this from here that would be much appreciated.

Thanks in advance.


2023-03-04  Roger Sayle  <ro...@nextmovesoftware.com>

gcc/ChangeLog
        PR rtl-optimization/106594
        * combine.cc (expand_compound_operation): Don't expand/transform
        ZERO_EXTEND or SIGN_EXTEND on targets where rtx_cost claims they
        are cheap.


Roger
--
diff --git a/gcc/combine.cc b/gcc/combine.cc
index 0538795..cf126c8 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -7288,7 +7288,17 @@ expand_compound_operation (rtx x)
           && (STORE_FLAG_VALUE & ~GET_MODE_MASK (inner_mode)) == 0)
         return SUBREG_REG (XEXP (x, 0));

+      /* If ZERO_EXTEND is cheap on this target, do nothing,
+         i.e. don't attempt to convert it to a pair of shifts.  */
+      if (set_src_cost (x, mode, optimize_this_for_speed_p)
+          <= COSTS_N_INSNS (1))
+        return x;
     }
+  /* Likewise, if SIGN_EXTEND is cheap, do nothing.  */
+  else if (GET_CODE (x) == SIGN_EXTEND
+           && set_src_cost (x, mode, optimize_this_for_speed_p)
+              <= COSTS_N_INSNS (1))
+    return x;

   /* If we reach here, we want to return a pair of shifts.  The inner
      shift is a left shift of BITSIZE - POS - LEN bits.  The outer