Hi all,

This patch set adds support for vector load/store with length.
Power ISA 3.0 introduces the lxvl/stxvl instructions, which perform
vector loads/stores with an explicit length. They are a good fit for
cases where there isn't enough data to fill a whole vector, such as
loop epilogues.

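To illustrate the semantics only, here is a rough, portable C emulation
of a load/store with length. The type vec128 and the helpers
load_with_len/store_with_len are made-up names for this sketch and are
not part of the patches; the sketch also ignores how bytes are ordered
within the register.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VEC_BYTES 16 /* one 128-bit vector register */

/* Hypothetical stand-in for a vector register. */
typedef struct { uint8_t b[VEC_BYTES]; } vec128;

/* Emulated load with length: read at most VEC_BYTES bytes from src,
   leave the remaining bytes of the vector zeroed.  */
static vec128 load_with_len(const void *src, size_t len)
{
    vec128 v;
    memset(v.b, 0, sizeof v.b);
    if (len > VEC_BYTES)
        len = VEC_BYTES;
    memcpy(v.b, src, len);
    return v;
}

/* Emulated store with length: write only the first len bytes of the
   vector, so memory past dst + len is never touched.  */
static void store_with_len(vec128 v, void *dst, size_t len)
{
    if (len > VEC_BYTES)
        len = VEC_BYTES;
    memcpy(dst, v.b, len);
}
```

The point for epilogues is that the tail of an array can be processed
with one partial vector access instead of a scalar loop, without
reading or writing past the end of the data.
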
This support is mainly modeled on the existing handling of
fully-predicated loops, but it also covers epilogue usage. It currently
supports two modes, controlled by the parameter vect-with-length-scope:
it can use length for any loop, or only for cases whose iteration
count is smaller than VF, such as epilogues.

I don't have an environment ready to benchmark this yet, but given the
currently inefficient length generation, I don't think it's a good idea
to adopt vector with length for all loops. For a main loop that would
have been vectorized anyway, it increases register pressure and adds
extra computation for the length, and the icache benefit doesn't seem
to make up for that. Still, I think it's worth keeping the parameter
for functionality testing, further benchmarking, and potential future
support in other ports.

Since there is no benchmarking data, this support isn't enabled by
default for any particular CPU; all testing was done with the
parameter set explicitly.

Bootstrapped on powerpc64le-linux-gnu P9 with all
vect-with-length-scope settings (0/1/2). Regression testing passed with
vect-with-length-scope 0. For the other two settings, several
vector-related test cases need to be updated, but no remarkable
failures were found.

BTW, P9 is the one that supports the functionality, but it isn't ready
for evaluating performance.

There are still many things to be supported or improved, including but
not limited to:

  - reduction/live-out support
  - Cost model adding/tweaking
  - IFN gimple folding
  - Improving some unnecessary ops, e.g. the vector_size check
  - Some possible refactoring

I'll support these and post the patches gradually.

Any comments are highly appreciated.

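To make the two scopes concrete, here is a scalar C sketch of the two
loop shapes. It is an illustration under assumptions, not the code the
vectorizer generates: VF = 4 and the names add_lanes,
add_full_with_length and add_epilogue_with_length are hypothetical, and
add_lanes stands in for one vector iteration built from a load, an add
and a store with length.

```c
#include <stddef.h>
#include <string.h>

#define VF 4 /* assumed vectorization factor: 4 ints per vector */

/* Stand-in for one vector iteration that only touches the first
   n_active lanes in memory (n_active <= VF).  */
static void add_lanes(int *dst, const int *a, const int *b, size_t n_active)
{
    int va[VF] = {0}, vb[VF] = {0};
    memcpy(va, a, n_active * sizeof(int));   /* load with length */
    memcpy(vb, b, n_active * sizeof(int));   /* load with length */
    for (size_t lane = 0; lane < VF; lane++) /* full-width vector add */
        va[lane] += vb[lane];
    memcpy(dst, va, n_active * sizeof(int)); /* store with length */
}

/* Length used for the whole loop: no epilogue, but every iteration
   has to compute a length.  */
static void add_full_with_length(int *dst, const int *a, const int *b,
                                 size_t n)
{
    for (size_t i = 0; i < n; i += VF) {
        size_t len = (n - i < VF) ? n - i : VF;
        add_lanes(dst + i, a + i, b + i, len);
    }
}

/* Length used only for the remaining iterations (fewer than VF):
   the main loop keeps using full vectors.  */
static void add_epilogue_with_length(int *dst, const int *a, const int *b,
                                     size_t n)
{
    size_t i = 0;
    for (; i + VF <= n; i += VF)
        add_lanes(dst + i, a + i, b + i, VF);
    if (i < n)
        add_lanes(dst + i, a + i, b + i, n - i);
}
```

The first shape pays for the length computation in every iteration,
which is the extra cost mentioned above for main loops; the second
confines it to a single tail iteration.
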
BR,
Kewen

-----

Patch set outline:

  [PATCH 1/7] ifn/optabs: Support vector load/store with length
  [PATCH 2/7] rs6000: lenload/lenstore optab support
  [PATCH 3/7] vect: Factor out codes for niters smaller than vf check
  [PATCH 4/7] hook/rs6000: Add vectorize length mode for vector with length
  [PATCH 5/7] vect: Support vector load/store with length in vectorizer
  [PATCH 6/7] ivopts: Add handlings for vector with length IFNs
  [PATCH 7/7] rs6000/testsuite: Vector with length test cases

 gcc/config/rs6000/rs6000.c | 3 +
 gcc/config/rs6000/vsx.md | 30 ++++++++++
 gcc/doc/invoke.texi | 7 +++
 gcc/doc/md.texi | 16 ++++++
 gcc/doc/tm.texi | 6 ++
 gcc/doc/tm.texi.in | 2 +
 gcc/internal-fn.c | 13 ++++-
 gcc/internal-fn.def | 6 ++
 gcc/optabs.def | 2 +
 gcc/params.opt | 4 ++
 gcc/target.def | 7 +++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-1.h | 18 ++++++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-2.h | 17 ++++++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-3.h | 31 +++++++++++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-4.h | 24 ++++++++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-5.h | 29 ++++++++++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-6.h | 32 +++++++++++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-1.c | 15 +++++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-2.c | 15 +++++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-3.c | 18 ++++++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-4.c | 15 +++++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-5.c | 15 +++++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-6.c | 16 ++++++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-run-1.c | 10 ++++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-run-2.c | 10 ++++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-run-3.c | 10 ++++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-run-4.c | 10 ++++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-run-5.c | 10 ++++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-run-6.c | 10 ++++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-1.c | 16 ++++++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-2.c | 16 ++++++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-3.c | 17 ++++++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-4.c | 16 ++++++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-5.c | 16 ++++++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-6.c | 16 ++++++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-run-1.c | 10 ++++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-run-2.c | 10 ++++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-run-3.c | 10 ++++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-run-4.c | 10 ++++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-run-5.c | 10 ++++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-run-6.c | 10 ++++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-run-1.h | 34 ++++++++++++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-run-2.h | 36 ++++++++++++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-run-3.h | 34 ++++++++++++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-run-4.h | 62 +++++++++++++++++++++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-run-5.h | 45 +++++++++++++++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-run-6.h | 52 +++++++++++++++++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length.h | 14 +++++
 gcc/tree-ssa-loop-ivopts.c | 4 ++
 gcc/tree-vect-loop-manip.c | 268 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 gcc/tree-vect-loop.c | 272 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----
 gcc/tree-vect-stmts.c | 152 ++++++++++++++++++++++++++++++++++++++++++++++++++
 gcc/tree-vectorizer.h | 32 +++++++++++
 53 files changed, 1545 insertions(+), 18 deletions(-)