On Wed, Dec 13, 2017 at 04:59:00PM +0000, Jeff Law wrote:
> On 11/17/2017 08:29 AM, Richard Sandiford wrote:
> > This patch uses SVE CLASTB to optimise conditional reductions.  It means
> > that we no longer need to maintain a separate index vector to record
> > the most recent valid value, and no longer need to worry about overflow
> > cases.
> >
> > Tested on aarch64-linux-gnu (with and without SVE), x86_64-linux-gnu
> > and powerpc64le-linux-gnu.  OK to install?
> >
> > Richard
> >
> >
> > 2017-11-17  Richard Sandiford  <richard.sandif...@linaro.org>
> >             Alan Hayward  <alan.hayw...@arm.com>
> >             David Sherwood  <david.sherw...@arm.com>
> >
> > gcc/
> >     * doc/md.texi (fold_extract_last_@var{m}): Document.
> >     * doc/sourcebuild.texi (vect_fold_extract_last): Likewise.
> >     * optabs.def (fold_extract_last_optab): New optab.
> >     * internal-fn.def (FOLD_EXTRACT_LAST): New internal function.
> >     * internal-fn.c (fold_extract_direct): New macro.
> >     (expand_fold_extract_optab_fn): Likewise.
> >     (direct_fold_extract_optab_supported_p): Likewise.
> >     * tree-vectorizer.h (EXTRACT_LAST_REDUCTION): New vect_reduction_type.
> >     * tree-vect-loop.c (vect_model_reduction_cost): Handle
> >     EXTRACT_LAST_REDUCTION.
> >     (get_initial_def_for_reduction): Do not create an initial vector
> >     for EXTRACT_LAST_REDUCTION reductions.
> >     (vectorizable_reduction): Leave the scalar phi in place for
> >     EXTRACT_LAST_REDUCTIONs.  Try using EXTRACT_LAST_REDUCTION
> >     ahead of INTEGER_INDUC_COND_REDUCTION.  Do not check for an
> >     epilogue code for EXTRACT_LAST_REDUCTION and defer the
> >     transform phase to vectorizable_condition.
> >     * tree-vect-stmts.c (vect_finish_stmt_generation_1): New function,
> >     split out from...
> >     (vect_finish_stmt_generation): ...here.
> >     (vect_finish_replace_stmt): New function.
> >     (vectorizable_condition): Handle EXTRACT_LAST_REDUCTION.
> >     * config/aarch64/aarch64-sve.md (fold_extract_last_<mode>): New
> >     pattern.
> >     * config/aarch64/aarch64.md (UNSPEC_CLASTB): New unspec.
> >
> > gcc/testsuite/
> >     * lib/target-supports.exp
> >     (check_effective_target_vect_fold_extract_last): New proc.
> >     * gcc.dg/vect/pr65947-1.c: Update dump messages.  Add markup
> >     for fold_extract_last.
> >     * gcc.dg/vect/pr65947-2.c: Likewise.
> >     * gcc.dg/vect/pr65947-3.c: Likewise.
> >     * gcc.dg/vect/pr65947-4.c: Likewise.
> >     * gcc.dg/vect/pr65947-5.c: Likewise.
> >     * gcc.dg/vect/pr65947-6.c: Likewise.
> >     * gcc.dg/vect/pr65947-9.c: Likewise.
> >     * gcc.dg/vect/pr65947-10.c: Likewise.
> >     * gcc.dg/vect/pr65947-12.c: Likewise.
> >     * gcc.dg/vect/pr65947-13.c: Likewise.
> >     * gcc.dg/vect/pr65947-14.c: Likewise.
> >     * gcc.target/aarch64/sve_clastb_1.c: New test.
> >     * gcc.target/aarch64/sve_clastb_1_run.c: Likewise.
> >     * gcc.target/aarch64/sve_clastb_2.c: Likewise.
> >     * gcc.target/aarch64/sve_clastb_2_run.c: Likewise.
> >     * gcc.target/aarch64/sve_clastb_3.c: Likewise.
> >     * gcc.target/aarch64/sve_clastb_3_run.c: Likewise.
> >     * gcc.target/aarch64/sve_clastb_4.c: Likewise.
> >     * gcc.target/aarch64/sve_clastb_4_run.c: Likewise.
> >     * gcc.target/aarch64/sve_clastb_5.c: Likewise.
> >     * gcc.target/aarch64/sve_clastb_5_run.c: Likewise.
> >     * gcc.target/aarch64/sve_clastb_6.c: Likewise.
> >     * gcc.target/aarch64/sve_clastb_6_run.c: Likewise.
> >     * gcc.target/aarch64/sve_clastb_7.c: Likewise.
> >     * gcc.target/aarch64/sve_clastb_7_run.c: Likewise.

At some point I suppose (this is a problem for Advanced SIMD too) we ought
to clean up the AArch64 tests into more meaningful folders.  To be clear,
that doesn't block any of the patches; it is just an observation.

> Like some of the other patches, I focused just on the generic bits and
> did not look at the aarch64 target bits.  The generic bits are OK.

The AArch64 parts are also OK.

Thanks,
James
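
For readers unfamiliar with the term, the "conditional reduction" the quoted
patch description refers to can be illustrated with a small scalar model. The
sketch below is not taken from the patch or from the pr65947 tests; the
function names, the chunk width VL and the mask representation are all
hypothetical, and the "fold extract last" helper only models the behaviour
assumed here (return the last element whose mask bit is set, or the previous
scalar value if no bit is set). It shows why carrying a scalar from chunk to
chunk removes the need for a separate index vector.

```c
/* Hypothetical chunk width, chosen only for illustration; real SVE
   vector lengths are not fixed at compile time.  */
#define VL 4

/* Scalar form of a conditional reduction: remember the last a[i]
   that satisfies the condition, or INIT if none does.  */
static int cond_reduction_scalar(const int *a, int n, int min_v, int init)
{
    int last = init;
    for (int i = 0; i < n; i++)
        if (a[i] < min_v)
            last = a[i];
    return last;
}

/* Assumed model of a fold-extract-last operation on one chunk:
   return the last element of VEC whose MASK entry is nonzero,
   or PREV if no entry is set.  */
static int fold_extract_last_model(int prev, const int *mask,
                                   const int *vec, int len)
{
    int result = prev;
    for (int i = 0; i < len; i++)
        if (mask[i])
            result = vec[i];
    return result;
}

/* The same reduction processed chunk by chunk.  Only the scalar LAST
   is carried between chunks; no vector of indices is maintained.  */
static int cond_reduction_chunked(const int *a, int n, int min_v, int init)
{
    int last = init;
    for (int base = 0; base < n; base += VL) {
        int len = (n - base < VL) ? n - base : VL;
        int mask[VL];
        for (int i = 0; i < len; i++)
            mask[i] = a[base + i] < min_v;
        last = fold_extract_last_model(last, mask, a + base, len);
    }
    return last;
}
```

Both functions should agree for any input, since each chunk either overrides
the carried value with its own last matching element or passes it through
unchanged.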