On Fri, 17 Sep 2021, Andre Vieira (lists) wrote: > Hi all, > > This patch adds the ability to define a target hook to unroll the main > vectorized loop. It also introduces --param's vect-unroll and > vect-unroll-reductions to control this through a command-line. I found this > useful to experiment and believe can help when tuning, so I decided to leave > it in. > We only unroll the main loop and have disabled unrolling epilogues for now. We > also do not support unrolling of any loop that has a negative step and we do > not support unrolling a loop with any reduction other than a > TREE_CODE_REDUCTION. > > Bootstrapped and regression tested on aarch64-linux-gnu as part of the series.
I wonder why we want to change the vector modes used for the epilogue, we're either making it more likely to need to fall through to the scalar epilogue or require another vectorized epilogue. That said, for simplicity I'd only change the VF of the main loop. There I wonder why you need to change vect_update_vf_for_slp or vect_determine_partial_vectors_and_peeling and why it's not enough to adjust the VF in a single place, I'd do that here: /* This is the point where we can re-start analysis with SLP forced off. */ start_over: /* Now the vectorization factor is final. */ poly_uint64 vectorization_factor = LOOP_VINFO_VECT_FACTOR (loop_vinfo); gcc_assert (known_ne (vectorization_factor, 0U)); ----> call vect_update_vf_for_unroll () note there's also loop->unroll (from #pragma GCC unroll) which we could include in what you look at in vect_unroll_value. I don't like add_stmt_cost_for_unroll - how should a target go and decide based on what it is fed? You could as well feed it the scalar body or the vinfo so it can get a shot at all the vectorizers meta data - but feeding it individual stmt_infos does not add any meaningful abstraction and thus what's the point? I _think_ what would make some sense is when we actually cost the vector body (with the not unrolled VF) ask the target "well, how about unrolling this?" because there it has the chance to look at the actual vector stmts produced (in "cost form"). And if the target answers "yeah - go ahead and try x4" we signal that to the iteration and have "mode N with x4 unroll" validated and costed. So instead of a new target hook amend the finish_cost hook to produce a suggested unroll value and cost both the unrolled and not unrolled body. Sorry for steering in a different direction ;) Thanks, Richard. > gcc/ChangeLog: > > * doc/tm.texi: Document TARGET_VECTORIZE_UNROLL_FACTOR > and TARGET_VECTORIZE_ADD_STMT_COST_FOR_UNROLL. > * doc/tm.texi.in: Add entries for target hooks above. > * params.opt: Add vect-unroll and vect-unroll-reductions > parameters. > * target.def: Define hooks TARGET_VECTORIZE_UNROLL_FACTOR > and TARGET_VECTORIZE_ADD_STMT_COST_FOR_UNROLL. > * targhooks.c (default_add_stmt_cost_for_unroll): New. > (default_unroll_factor): Likewise. > * targhooks.h (default_add_stmt_cost_for_unroll): Likewise. > (default_unroll_factor): Likewise. > * tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Initialize > par_unrolling_factor. > (vect_update_vf_for_slp): Use unrolling factor to update > vectorization > factor. > (vect_determine_partial_vectors_and_peeling): Account for > unrolling. > (vect_determine_unroll_factor): Determine how much to unroll > vectorized > main loop. > (vect_analyze_loop_2): Call vect_determine_unroll_factor. > (vect_analyze_loop): Allow for epilogue vectorization when > unrolling > and rewalk vector_mode array for the epilogues. > (vectorizable_reduction): Disable single_defuse_cycle when > unrolling. > * tree-vectorizer.h (vect_unroll_value): Declare par_unrolling_factor > as a member of loop_vec_info. > -- Richard Biener <rguent...@suse.de> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)