This avoids cases of PHI node vectorization that just cause us to insert vector CTORs inside loops for values only required outside of the loop.
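
As a rough illustration of the roll-back test this patch extends, here is a minimal standalone sketch in C. It is not the GCC code: the function name slp_roll_back_p and its plain parameters are hypothetical stand-ins for all_uniform_p, n_vector_builds, children.length () and the is_a <gphi *> check in vect_build_slp_tree_2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the roll-back condition, not a GCC API.
   Returns true when the SLP node should be built from scalars.  */
bool
slp_roll_back_p (bool all_uniform_p, size_t n_vector_builds,
                 size_t n_children, bool is_phi)
{
  return (all_uniform_p
          || n_vector_builds > 1
          /* New case: a PHI whose children would all need a vector CTOR.  */
          || (n_vector_builds == n_children && is_phi));
}

/* A few illustrative inputs for the model above.  */
void
check_examples (void)
{
  /* PHI with a single child that needs a CTOR: roll back.  */
  assert (slp_roll_back_p (false, 1, 1, true));
  /* Same counts on a non-PHI statement: keep the vector node.  */
  assert (!slp_roll_back_p (false, 1, 1, false));
}
```

In this model, the new clause only changes the outcome for a PHI with exactly one vector build covering all of its children, since more than one build already triggers the roll back.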

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

2021-01-27  Richard Biener  <rguent...@suse.de>

        PR tree-optimization/98854
        * tree-vect-slp.c (vect_build_slp_tree_2): Also build PHIs
        from scalars when the number of CTORs matches the number
        of children.

        * gcc.dg/vect/bb-slp-pr98854.c: New testcase.
---
 gcc/testsuite/gcc.dg/vect/bb-slp-pr98854.c | 24 ++++++++++++++++++++++
 gcc/tree-vect-slp.c                        |  5 ++++-
 2 files changed, 28 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/bb-slp-pr98854.c

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-pr98854.c b/gcc/testsuite/gcc.dg/vect/bb-slp-pr98854.c
new file mode 100644
index 00000000000..0c8141e1d17
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-pr98854.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+
+double a[1024];
+
+int bar();
+void foo (int n)
+{
+  double x = 0, y = 0;
+  int i = 1023;
+  do
+    {
+      x += a[i] + a[i+1];
+      y += a[i] / a[i+1];
+      if (bar ())
+        break;
+    }
+  while (--i);
+  /* We want to avoid vectorizing the LC PHI and insert vector CTORs
+     inside of the loop where it is only needed here.  */
+  a[0] = x;
+  a[1] = y;
+}
+
+/* { dg-final { scan-tree-dump-not "vectorizing SLP node starting from: ._\[0-9\]+ = PHI" "slp1" } } */
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 4465cf7494e..10b876ff5ed 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -1896,7 +1896,10 @@ fail:
                   n_vector_builds++;
                 }
             }
-          if (all_uniform_p || n_vector_builds > 1)
+          if (all_uniform_p
+              || n_vector_builds > 1
+              || (n_vector_builds == children.length ()
+                  && is_a <gphi *> (stmt_info->stmt)))
             {
               /* Roll back.  */
               matches[0] = false;
-- 
2.26.2