luoxhu--- via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> From: Xionghu Luo <luo...@linux.ibm.com>
>
> This "subtract/extend/add" sequence has existed for a long time and still
> annoys us (PR37451, PR61837) when converting from 32 bits to 64 bits, as the
> ctr register is used as 64 bits on powerpc64.  Andrew Pinski had a patch, but
> it caused some issues and was reverted by Joseph S. Myers (PR37451, PR37782).
>
> Andrew:
> http://gcc.gnu.org/ml/gcc-patches/2008-09/msg01070.html
> http://gcc.gnu.org/ml/gcc-patches/2008-10/msg01321.html
> Joseph:
> https://gcc.gnu.org/legacy-ml/gcc-patches/2011-11/msg02405.html
>
> However, the doloop code has improved a lot in the years since:
> gcc.c-torture/execute/doloop-1.c is no longer a simple loop with constant
> desc->niter_expr since r125:SI#0 is not SImode, so it is not a valid doloop
> and no transform is done in doloop again.  Thus we can simplify
> "subtract/extend/add" to only the extend, as the condition in doloop will
> never be false based on loop ch's optimization.
> What's more, this patch is slightly different from Andrew's implementation:
> the check for ZERO_EXT and SImode guards against changing the count for
> char/short, which caused test cases to time out on slow platforms before.
> Any comments?  Thanks.
>
> doloop-1.c.257r.loop2_doloop
> ...
> 12: [r129:DI]=r123:SI
>       REG_DEAD r129:DI
>       REG_DEAD r123:SI
> 13: r125:SI=r120:DI#0-0x1
>       REG_DEAD r120:DI
> 14: r120:DI=zero_extend(r125:SI#0)
>       REG_DEAD r125:SI
> 16: r126:CC=cmp(r120:DI,0)
> 17: pc={(r126:CC!=0)?L43:pc}
>       REG_DEAD r126:CC
> ...
>
> Bootstrapped and regression tested on Power8-LE.
>
> gcc/ChangeLog
>
>         2020-04-15  Xiong Hu Luo  <luo...@linux.ibm.com>
>
>         PR rtl-optimization/37451, PR target/61837
>         loop-doloop.c (doloop_modify): Simplify (add -1; zero_ext; add +1)
>         to zero_ext.
> ---
>  gcc/loop-doloop.c | 26 +++++++++++++++++++++++++-
>  1 file changed, 25 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/loop-doloop.c b/gcc/loop-doloop.c
> index db6a014e43d..9f967fa3a0b 100644
> --- a/gcc/loop-doloop.c
> +++ b/gcc/loop-doloop.c
> @@ -477,7 +477,31 @@ doloop_modify (class loop *loop, class niter_desc *desc,
>      }
>
>    if (increment_count)
> -    count = simplify_gen_binary (PLUS, mode, count, const1_rtx);
> +    {
> +      /* Fold (add -1; zero_ext; add +1) operations to zero_ext based on addop0
> +         is never zero, as gimple pass loop ch will do optimization to simplify
> +         the loop to NO loop for loop condition is false.  */

IMO the code needs to prove this, rather than just assume that previous
passes have made it so.

Thanks,
Richard

> +      bool simplify_zext = false;
> +      rtx extop0 = XEXP (count, 0);
> +      if (mode == E_DImode
> +          && GET_CODE (count) == ZERO_EXTEND
> +          && GET_CODE (extop0) == PLUS)
> +        {
> +          rtx addop0 = XEXP (extop0, 0);
> +          rtx addop1 = XEXP (extop0, 1);
> +          if (CONST_SCALAR_INT_P (addop1)
> +              && GET_MODE (addop0) == E_SImode
> +              && addop1 == GEN_INT (-1))
> +            {
> +              count = simplify_gen_unary (ZERO_EXTEND, mode, addop0,
> +                                          GET_MODE (addop0));
> +              simplify_zext = true;
> +            }
> +        }
> +
> +      if (!simplify_zext)
> +        count = simplify_gen_binary (PLUS, mode, count, const1_rtx);
> +    }
>
>    /* Insert initialization of the count register into the loop header.  */
>    start_sequence ();
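
To illustrate why the "addop0 is never zero" assumption matters, here is a minimal standalone C sketch. It is not GCC code: it models SImode and DImode with uint32_t and uint64_t, and the function names count_unfolded and count_folded are hypothetical. The two expressions agree for every nonzero input but differ when the input is zero, because the 32-bit subtract wraps before the zero extension.

```c
#include <stdint.h>

/* Models the original count: (plus:DI (zero_extend:DI (plus:SI x -1)) 1).
   uint32_t stands in for SImode, uint64_t for DImode (an assumption made
   for illustration only).  */
uint64_t count_unfolded(uint32_t x)
{
    uint32_t dec = x - 1u;             /* 32-bit subtract, wraps modulo 2^32 */
    return (uint64_t)dec + 1u;         /* zero-extend to 64 bits, then add 1 */
}

/* Models the folded count proposed by the patch: (zero_extend:DI x).  */
uint64_t count_folded(uint32_t x)
{
    return (uint64_t)x;
}
```

For x == 0, count_unfolded yields 0x100000000 while count_folded yields 0, so the fold is only value-preserving if x is known to be nonzero at that point.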