Currently, the doloop.xx variable uses the same type as niter, which may be narrower than the word size. In some cases a word-size type is the better choice. For example, on some 64-bit systems, accessing a 32-bit niter may require a subreg. Using a 64-bit type avoids the subreg, provided the value can be represented in both the 32-bit and the 64-bit type.
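
To make the safety condition concrete, here is a minimal C sketch. It is not part of the patch, and it assumes a 32-bit niter on a 64-bit target. The helper names (can_widen_doloop_iv, trips_narrow, trips_wide) are hypothetical and exist only for illustration. The idea is that the start value niter + 1 is computed in the narrow type and then converted, so the wide counter gives the same trip count only when niter + 1 cannot wrap in the narrow type.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical guard, assuming a 32-bit niter: widening is safe only
   when the maximum niter is strictly below the narrow type's maximum,
   so that niter + 1 does not wrap to 0.  */
static bool can_widen_doloop_iv(uint32_t niter_max)
{
    return niter_max < UINT32_MAX;
}

/* Countdown loop with a counter of the narrow type.  */
static uint64_t trips_narrow(uint32_t niter)
{
    uint32_t iv = niter + 1u;            /* start value, narrow type */
    uint64_t trips = 0;
    do {
        trips++;
    } while (--iv != 0);
    return trips;
}

/* Same loop, but the start value is converted to a 64-bit counter
   after being computed in the narrow type.  */
static uint64_t trips_wide(uint32_t niter)
{
    uint64_t iv = (uint64_t)(niter + 1u); /* narrow add, then widen */
    uint64_t trips = 0;
    do {
        trips++;
    } while (--iv != 0);
    return trips;
}
```

When the guard holds, both loops run niter + 1 times. If niter could equal UINT32_MAX, the narrow start value would wrap to 0 and the two counters would no longer agree, which is why the widening has to be conditional.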
This patch updates the doloop iv to BITS_PER_WORD size when it is safe to do so.

Bootstrap and regtest pass on powerpc64le and x86. Is this OK for trunk?

BR,
Jiufu

gcc/ChangeLog:

2021-07-08  Jiufu Guo  <guoji...@linux.ibm.com>

	PR target/61837
	* tree-ssa-loop-ivopts.c (add_iv_candidate_for_doloop): Update iv on
	BITS_PER_WORD for niter.

gcc/testsuite/ChangeLog:

2021-07-08  Jiufu Guo  <guoji...@linux.ibm.com>

	PR target/61837
	* gcc.target/powerpc/pr61837.c: New test.

---
 gcc/testsuite/gcc.target/powerpc/pr61837.c | 16 ++++++++++++++++
 gcc/tree-ssa-loop-ivopts.c                 | 10 ++++++++++
 2 files changed, 26 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr61837.c

diff --git a/gcc/testsuite/gcc.target/powerpc/pr61837.c b/gcc/testsuite/gcc.target/powerpc/pr61837.c
new file mode 100644
index 00000000000..dc44eb9cb41
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr61837.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+void foo(int *p1, long *p2, int s)
+{
+  int n, v, i;
+
+  v = 0;
+  for (n = 0; n <= 100; n++) {
+    for (i = 0; i < s; i++)
+      if (p2[i] == n)
+        p1[i] = v;
+    v += 88;
+  }
+}
+
+/* { dg-final { scan-assembler-not {\mrldicl\M} } } */
diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
index 12a8a49a307..c3c2f97918d 100644
--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -5690,6 +5690,16 @@ add_iv_candidate_for_doloop (struct ivopts_data *data)

   tree base = fold_build2 (PLUS_EXPR, ntype, unshare_expr (niter),
                            build_int_cst (ntype, 1));
+
+  /* Use type in word size may fast. */
+  if (TYPE_PRECISION (ntype) < BITS_PER_WORD
+      && TYPE_PRECISION (long_unsigned_type_node) == BITS_PER_WORD
+      && wi::ltu_p (niter_desc->max, wi::to_widest (TYPE_MAX_VALUE (ntype))))
+    {
+      ntype = long_unsigned_type_node;
+      base = fold_convert (ntype, base);
+    }
+
   add_candidate (data, base, build_int_cst (ntype, -1), true, NULL, NULL, true);
 }

-- 
2.17.1