PR65693 exposes a case where combine does a worse job after my patches to split parallels before combining. We start with a parallel of a udiv and a umod, and a clobber; the umod is dead. That instruction is combined with one setting the divisor pseudo to a power-of-two constant, so we end up with a parallel of an lshiftrt, an and (dead), and a clobber. This is not a recognised instruction.
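For anyone wondering why the udiv/umod pair turns into an lshiftrt and an and: with an unsigned dividend and a divisor of 2**k, the quotient is a logical right shift by k and the remainder is a mask with 2**k - 1. Here is a minimal sketch of that identity in plain C. It is only an illustration, not combine's code; the helper names are hypothetical, and it assumes 32-bit operands and k < 32.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, for illustration only; none of this is GCC code.
   All of them assume SHIFT < 32.  */

/* Quotient of X by 2**SHIFT, computed as a logical right shift.  */
static uint32_t
udiv_pow2 (uint32_t x, unsigned shift)
{
  return x >> shift;
}

/* Remainder of X by 2**SHIFT, computed as a mask of the low bits.  */
static uint32_t
umod_pow2 (uint32_t x, unsigned shift)
{
  return x & ((UINT32_C (1) << shift) - 1);
}

/* Return nonzero if the shift and mask forms agree with / and % for X.  */
static int
check_pow2_identities (uint32_t x, unsigned shift)
{
  uint32_t d = UINT32_C (1) << shift;
  return udiv_pow2 (x, shift) == x / d && umod_pow2 (x, shift) == x % d;
}

/* Check the identities for every X below LIMIT.  */
static void
verify_range (uint32_t limit, unsigned shift)
{
  for (uint32_t x = 0; x < limit; x++)
    assert (check_pow2_identities (x, shift));
}
```

Calling verify_range (1024, 3) checks the identity for a divisor of 8. In the PR the remainder is dead, so only the shift is needed in the end.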
Before my patches this was a 2->1 combination; the combiner threw away the dead set and everyone was happy. After the patches it is a 3->1 combination. The combiner no longer throws away the dead set but tries to split the parallel into two insns instead. That does not work, because one of the resulting insns has to end up as i2, which is earlier than the original sets. The combiner gives up.

There already is code to throw away dead sets in the 3->1 case, but it only works for a parallel of two sets without any clobbers. This patch fixes that by making that code use is_parallel_of_n_reg_sets, which accepts trailing clobbers.

Tested on powerpc64-linux (-m32,-m32/-mpowerpc64,-m64,-m64/-mlra); no regressions. Also tested a cross to x86_64-linux on the PR65693 testcase; the patch fixes it.

Is this okay for current trunk?


Segher


2015-04-08  Segher Boessenkool  <seg...@kernel.crashing.org>

	* combine.c (is_parallel_of_n_reg_sets): Change first argument
	from an rtx_insn * to an rtx.
	(try_combine): Adjust both callers.  Use it once more.

---
 gcc/combine.c | 15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/gcc/combine.c b/gcc/combine.c
index 14df228..32950383 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -2493,13 +2493,11 @@ update_cfg_for_uncondjump (rtx_insn *insn)
 }
 
 #ifndef HAVE_cc0
-/* Return whether INSN is a PARALLEL of exactly N register SETs followed
+/* Return whether PAT is a PARALLEL of exactly N register SETs followed
    by an arbitrary number of CLOBBERs.  */
 static bool
-is_parallel_of_n_reg_sets (rtx_insn *insn, int n)
+is_parallel_of_n_reg_sets (rtx pat, int n)
 {
-  rtx pat = PATTERN (insn);
-
   if (GET_CODE (pat) != PARALLEL)
     return false;
 
@@ -2907,7 +2905,7 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
      decrement insn.  */
 
   if (i1 == 0
-      && is_parallel_of_n_reg_sets (i2, 2)
+      && is_parallel_of_n_reg_sets (PATTERN (i2), 2)
       && (GET_MODE_CLASS (GET_MODE (SET_DEST (XVECEXP (PATTERN (i2), 0, 0))))
           == MODE_CC)
       && GET_CODE (SET_SRC (XVECEXP (PATTERN (i2), 0, 0))) == COMPARE
@@ -2939,7 +2937,7 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
      make those two SETs separate I1 and I2 insns, and make an I0 that is
      the original I1.  */
   if (i0 == 0
-      && is_parallel_of_n_reg_sets (i2, 2)
+      && is_parallel_of_n_reg_sets (PATTERN (i2), 2)
       && can_split_parallel_of_n_reg_sets (i2, 2)
       && !reg_used_between_p (SET_DEST (XVECEXP (PATTERN (i2), 0, 0)), i2, i3)
       && !reg_used_between_p (SET_DEST (XVECEXP (PATTERN (i2), 0, 1)), i2, i3))
@@ -3460,10 +3458,7 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
      debug info less accurate.  */
 
   if (!(added_sets_2 && i1 == 0)
-      && GET_CODE (newpat) == PARALLEL
-      && XVECLEN (newpat, 0) == 2
-      && GET_CODE (XVECEXP (newpat, 0, 0)) == SET
-      && GET_CODE (XVECEXP (newpat, 0, 1)) == SET
+      && is_parallel_of_n_reg_sets (newpat, 2)
       && asm_noperands (newpat) < 0)
     {
       rtx set0 = XVECEXP (newpat, 0, 0);
-- 
1.8.1.4