Hi,
I'd like to discuss how to go forward with getting the vectorizer to all-SLP for this stage1. While there is a personal branch with my ongoing work (users/rguenth/vect-force-slp), branches haven't proven to work well for collaboration. The branch isn't ready to be merged in full, but I have been picking improvements to trunk during last stage1 and some remaining bits in the past weeks. I have refrained from merging code paths that cannot be exercised on trunk.

There are two important sets of changes on the branch, both critical to get more testing on non-x86 targets.

 1. enable single-lane SLP discovery
 2. avoid splitting store groups (9315bfc661432c3 and 4336060fe2db8ec if you fetch the branch)

The first point is also the most annoying one for the testsuite, since doing SLP instead of interleaving changes what we dump, and thus tests start to fail in random ways when you switch between both modes. On the branch single-lane SLP discovery is gated with --param vect-single-lane-slp.

The branch has numerous changes to enable single-lane SLP for some code paths that do not have SLP implemented and where I did not bother to try supporting multi-lane SLP at this point. It also adds more SLP discovery entry points.

I'm not sure how to try merging these pieces to allow others to more easily help out. One possibility is to merge --param vect-single-lane-slp defaulted off and pick dependent changes even when they cause testsuite regressions with vect-single-lane-slp=1. Alternatively, adjust the testsuite by adding --param vect-single-lane-slp=0 and default to 1 (or keep the default). Or require a clean testsuite with --param vect-single-lane-slp defaulted to 1 but keep the --param for debugging (and allow FAILs with 0).

For fun I merged just single-lane discovery of non-grouped stores and have that enabled by default. On x86_64 this results in the set of FAILs below.

Any suggestions?

Thanks,
Richard.

FAIL: gcc.dg/vect/O3-pr39675-2.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1
XPASS: gcc.dg/vect/no-scevccp-outer-12.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED." 1
FAIL: gcc.dg/vect/no-section-anchors-vect-31.c scan-tree-dump-times vect "Alignment of access forced using peeling" 2
FAIL: gcc.dg/vect/no-section-anchors-vect-31.c scan-tree-dump-times vect "Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/no-section-anchors-vect-64.c scan-tree-dump-times vect "Alignment of access forced using peeling" 2
FAIL: gcc.dg/vect/no-section-anchors-vect-64.c scan-tree-dump-times vect "Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/no-section-anchors-vect-66.c scan-tree-dump-times vect "Alignment of access forced using peeling" 1
FAIL: gcc.dg/vect/no-section-anchors-vect-66.c scan-tree-dump-times vect "Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/no-section-anchors-vect-68.c scan-tree-dump-times vect "Alignment of access forced using peeling" 2
FAIL: gcc.dg/vect/no-section-anchors-vect-68.c scan-tree-dump-times vect "Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/slp-12a.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorizing stmts using SLP" 1
FAIL: gcc.dg/vect/slp-12a.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1
FAIL: gcc.dg/vect/slp-19a.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorizing stmts using SLP" 1
FAIL: gcc.dg/vect/slp-19a.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1
FAIL: gcc.dg/vect/slp-19b.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorizing stmts using SLP" 1
FAIL: gcc.dg/vect/slp-19b.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1
FAIL: gcc.dg/vect/slp-19c.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorized 1 loops" 1
FAIL: gcc.dg/vect/slp-19c.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorizing stmts using SLP" 1
FAIL: gcc.dg/vect/slp-19c.c scan-tree-dump-times vect "vectorized 1 loops" 1
FAIL: gcc.dg/vect/slp-19c.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1
XPASS: gcc.dg/vect/tsvc/vect-tsvc-s1115.c -flto -ffat-lto-objects scan-tree-dump vect "vectorized 1 loops"
XPASS: gcc.dg/vect/tsvc/vect-tsvc-s1115.c scan-tree-dump vect "vectorized 1 loops"
XPASS: gcc.dg/vect/tsvc/vect-tsvc-s114.c -flto -ffat-lto-objects scan-tree-dump vect "vectorized 1 loops"
XPASS: gcc.dg/vect/tsvc/vect-tsvc-s114.c scan-tree-dump vect "vectorized 1 loops"
XPASS: gcc.dg/vect/tsvc/vect-tsvc-s1232.c -flto -ffat-lto-objects scan-tree-dump vect "vectorized 1 loops"
XPASS: gcc.dg/vect/tsvc/vect-tsvc-s1232.c scan-tree-dump vect "vectorized 1 loops"
XPASS: gcc.dg/vect/tsvc/vect-tsvc-s257.c -flto -ffat-lto-objects scan-tree-dump vect "vectorized 1 loops"
XPASS: gcc.dg/vect/tsvc/vect-tsvc-s257.c scan-tree-dump vect "vectorized 1 loops"
FAIL: gcc.dg/vect/vect-26.c -flto -ffat-lto-objects scan-tree-dump-times vect "Alignment of access forced using peeling" 1
FAIL: gcc.dg/vect/vect-26.c -flto -ffat-lto-objects scan-tree-dump-times vect "Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/vect-26.c scan-tree-dump-times vect "Alignment of access forced using peeling" 1
FAIL: gcc.dg/vect/vect-26.c scan-tree-dump-times vect "Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/vect-54.c -flto -ffat-lto-objects scan-tree-dump-times vect "Alignment of access forced using peeling" 1
FAIL: gcc.dg/vect/vect-54.c -flto -ffat-lto-objects scan-tree-dump-times vect "Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/vect-54.c scan-tree-dump-times vect "Alignment of access forced using peeling" 1
FAIL: gcc.dg/vect/vect-54.c scan-tree-dump-times vect "Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/vect-56.c -flto -ffat-lto-objects scan-tree-dump-times vect "Alignment of access forced using peeling" 1
FAIL: gcc.dg/vect/vect-56.c -flto -ffat-lto-objects scan-tree-dump-times vect "Vectorizing an unaligned access" 1
FAIL: gcc.dg/vect/vect-56.c scan-tree-dump-times vect "Alignment of access forced using peeling" 1
FAIL: gcc.dg/vect/vect-56.c scan-tree-dump-times vect "Vectorizing an unaligned access" 1
FAIL: gcc.dg/vect/vect-58.c -flto -ffat-lto-objects scan-tree-dump-times vect "Alignment of access forced using peeling" 1
FAIL: gcc.dg/vect/vect-58.c -flto -ffat-lto-objects scan-tree-dump-times vect "Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/vect-58.c scan-tree-dump-times vect "Alignment of access forced using peeling" 1
FAIL: gcc.dg/vect/vect-58.c scan-tree-dump-times vect "Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/vect-60.c -flto -ffat-lto-objects scan-tree-dump-times vect "Alignment of access forced using peeling" 1
FAIL: gcc.dg/vect/vect-60.c -flto -ffat-lto-objects scan-tree-dump-times vect "Vectorizing an unaligned access" 1
FAIL: gcc.dg/vect/vect-60.c scan-tree-dump-times vect "Alignment of access forced using peeling" 1
FAIL: gcc.dg/vect/vect-60.c scan-tree-dump-times vect "Vectorizing an unaligned access" 1
FAIL: gcc.dg/vect/vect-89-big-array.c -flto -ffat-lto-objects scan-tree-dump-times vect "Alignment of access forced using peeling" 1
FAIL: gcc.dg/vect/vect-89-big-array.c -flto -ffat-lto-objects scan-tree-dump-times vect "Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/vect-89-big-array.c scan-tree-dump-times vect "Alignment of access forced using peeling" 1
FAIL: gcc.dg/vect/vect-89-big-array.c scan-tree-dump-times vect "Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/vect-89.c -flto -ffat-lto-objects scan-tree-dump-times vect "Alignment of access forced using peeling" 1
FAIL: gcc.dg/vect/vect-89.c -flto -ffat-lto-objects scan-tree-dump-times vect "Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/vect-89.c scan-tree-dump-times vect "Alignment of access forced using peeling" 1
FAIL: gcc.dg/vect/vect-89.c scan-tree-dump-times vect "Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/vect-92.c -flto -ffat-lto-objects scan-tree-dump-times vect "Alignment of access forced using peeling" 3
FAIL: gcc.dg/vect/vect-92.c -flto -ffat-lto-objects scan-tree-dump-times vect "Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/vect-92.c scan-tree-dump-times vect "Alignment of access forced using peeling" 3
FAIL: gcc.dg/vect/vect-92.c scan-tree-dump-times vect "Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/vect-early-break_25.c -flto -ffat-lto-objects scan-tree-dump-times vect "Alignment of access forced using peeling" 1
FAIL: gcc.dg/vect/vect-early-break_25.c scan-tree-dump-times vect "Alignment of access forced using peeling" 1
FAIL: gcc.dg/vect/vect-multitypes-1.c -flto -ffat-lto-objects scan-tree-dump-times vect "Alignment of access forced using peeling" 2
FAIL: gcc.dg/vect/vect-multitypes-1.c scan-tree-dump-times vect "Alignment of access forced using peeling" 2
XPASS: gcc.dg/vect/vect-outer-4e.c -flto -ffat-lto-objects scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1
XPASS: gcc.dg/vect/vect-outer-4e.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1
FAIL: gcc.dg/vect/vect-peel-1.c -flto -ffat-lto-objects scan-tree-dump-times vect "Alignment of access forced using peeling" 1
FAIL: gcc.dg/vect/vect-peel-1.c -flto -ffat-lto-objects scan-tree-dump-times vect "Vectorizing an unaligned access" 1
FAIL: gcc.dg/vect/vect-peel-1.c scan-tree-dump-times vect "Alignment of access forced using peeling" 1
FAIL: gcc.dg/vect/vect-peel-1.c scan-tree-dump-times vect "Vectorizing an unaligned access" 1
FAIL: gcc.dg/vect/vect-peel-2.c -flto -ffat-lto-objects scan-tree-dump-times vect "Alignment of access forced using peeling" 1
FAIL: gcc.dg/vect/vect-peel-2.c -flto -ffat-lto-objects scan-tree-dump-times vect "Vectorizing an unaligned access" 1
FAIL: gcc.dg/vect/vect-peel-2.c scan-tree-dump-times vect "Alignment of access forced using peeling" 1
FAIL: gcc.dg/vect/vect-peel-2.c scan-tree-dump-times vect "Vectorizing an unaligned access" 1
FAIL: gcc.target/i386/cond_op_fma__Float16-1.c scan-assembler-times vfmadd132ph[ \\\\t]+[^{\\n]*%ymm[0-9]+{%k[1-7]}(?:\\n|[ \\\\t]+#) 1
FAIL: gcc.target/i386/cond_op_fma__Float16-1.c scan-assembler-times vfmsub132ph[ \\\\t]+[^{\\n]*%ymm[0-9]+{%k[1-7]}(?:\\n|[ \\\\t]+#) 1
FAIL: gcc.target/i386/cond_op_fma__Float16-1.c scan-assembler-times vfnmadd132ph[ \\\\t]+[^{\\n]*%ymm[0-9]+{%k[1-7]}(?:\\n|[ \\\\t]+#) 1
FAIL: gcc.target/i386/cond_op_fma__Float16-1.c scan-assembler-times vfnmsub132ph[ \\\\t]+[^{\\n]*%ymm[0-9]+{%k[1-7]}(?:\\n|[ \\\\t]+#) 1
FAIL: gcc.target/i386/cond_op_fma_double-1.c scan-assembler-times vfmadd132pd[ \\\\t]+[^{\\n]*%ymm[0-9]+{%k[1-7]}(?:\\n|[ \\\\t]+#) 1
FAIL: gcc.target/i386/cond_op_fma_double-1.c scan-assembler-times vfmsub132pd[ \\\\t]+[^{\\n]*%ymm[0-9]+{%k[1-7]}(?:\\n|[ \\\\t]+#) 1
FAIL: gcc.target/i386/cond_op_fma_double-1.c scan-assembler-times vfnmadd132pd[ \\\\t]+[^{\\n]*%ymm[0-9]+{%k[1-7]}(?:\\n|[ \\\\t]+#) 1
FAIL: gcc.target/i386/cond_op_fma_double-1.c scan-assembler-times vfnmsub132pd[ \\\\t]+[^{\\n]*%ymm[0-9]+{%k[1-7]}(?:\\n|[ \\\\t]+#) 1
FAIL: gcc.target/i386/cond_op_fma_float-1.c scan-assembler-times vfmadd132ps[ \\\\t]+[^{\\n]*%ymm[0-9]+{%k[1-7]}(?:\\n|[ \\\\t]+#) 1
FAIL: gcc.target/i386/cond_op_fma_float-1.c scan-assembler-times vfmsub132ps[ \\\\t]+[^{\\n]*%ymm[0-9]+{%k[1-7]}(?:\\n|[ \\\\t]+#) 1
FAIL: gcc.target/i386/cond_op_fma_float-1.c scan-assembler-times vfnmadd132ps[ \\\\t]+[^{\\n]*%ymm[0-9]+{%k[1-7]}(?:\\n|[ \\\\t]+#) 1
FAIL: gcc.target/i386/cond_op_fma_float-1.c scan-assembler-times vfnmsub132ps[ \\\\t]+[^{\\n]*%ymm[0-9]+{%k[1-7]}(?:\\n|[ \\\\t]+#) 1
FAIL: gcc.target/i386/pr101950-2.c scan-assembler-times \\txor[ql]\\t 2
FAIL: gcc.target/i386/pr88531-2b.c scan-assembler-times vmulps 1
FAIL: gcc.target/i386/pr88531-2c.c scan-assembler-times vmulps 1
FAIL: gcc.target/i386/vectorize1.c scan-tree-dump vect "vect_cst"
FAIL: gfortran.dg/temporary_3.f90 -O2 execution test
FAIL: gfortran.dg/vect/fast-math-mgrid-resid.f -O scan-tree-dump pcom "Executing predictive commoning without unrolling"
FAIL: gfortran.dg/vect/vect-8.f90 -O scan-tree-dump-times vect "vectorized 2[234] loops" 1
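
To illustrate the testsuite-adjustment option mentioned in the mail (adding --param vect-single-lane-slp=0 to affected tests), a test could pin the non-SLP path roughly as sketched below. This is a hypothetical test file, not one taken from the branch: the function name add_one and the dump expectation are assumptions for illustration, and the param is only accepted by a compiler that carries the branch's --param vect-single-lane-slp.

```c
/* Hypothetical test, for illustration only; not part of the branch.  */
/* { dg-do compile } */
/* { dg-require-effective-target vect_int } */
/* Pin the non-SLP code path so dump scans written for it keep matching
   regardless of the default value of the param.  */
/* { dg-additional-options "--param vect-single-lane-slp=0" } */

/* A loop with one non-grouped store per iteration, i.e. the kind of
   statement that single-lane SLP discovery would otherwise pick up.  */
void
add_one (int *restrict a, const int *restrict b, int n)
{
  for (int i = 0; i < n; i++)
    a[i] = b[i] + 1;
}

/* Assumed expectation: the loop is still reported as vectorized.  */
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
```

With the param defaulted to 1, only the tests whose dump scans depend on the old path would need such a line; the others would exercise single-lane SLP unchanged.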