Hi,
I don't think I ever ended up posting the rebased version on top of the
epilogue mode patch. So here it is, I think I had a conditional OK if I
split the epilogue mode patch, but just want to double check this is OK
for trunk?
gcc/ChangeLog:
* tree-vect-loop.c (vect_estimate_min_profitable_iters): Pass
new argument
suggested_unroll_factor.
(vect_analyze_loop_costing): Likewise.
(_loop_vec_info::_loop_vec_info): Initialize new member
suggested_unroll_factor.
(vect_determine_partial_vectors_and_peeling): Make epilogue of
unrolled
main loop use partial vectors.
(vect_analyze_loop_2): Pass and use new argument
suggested_unroll_factor.
(vect_analyze_loop_1): Likewise.
(vect_analyze_loop): Change to intialize local
suggested_unroll_factor and use it.
(vectorizable_reduction): Don't use single_defuse_cycle when
unrolling.
* tree-vectorizer.h (_loop_vec_info::_loop_vec_info): Add new
member suggested_unroll_factor.
(vector_costs::vector_costs): Add new member
m_suggested_unroll_factor.
(vector_costs::suggested_unroll_factor): New getter function.
(finish_cost): Set return argument suggested_unroll_factor.
Regards,
Andre
On 30/11/2021 13:56, Richard Biener wrote:
On Tue, 30 Nov 2021, Andre Vieira (lists) wrote:
On 25/11/2021 12:46, Richard Biener wrote:
Oops, my fault, yes, it does. I would suggest to refactor things so
that the mode_i = first_loop_i case is there only once. I also wonder
if all the argument about starting at 0 doesn't apply to the
not unrolled LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P as well? So
what's the reason to differ here? So in the end I'd just change
the existing
if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo))
{
to
if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo)
|| first_loop_vinfo->suggested_unroll_factor > 1)
{
and maybe revisit this when we have an actual testcase showing that
doing sth else has a positive effect?
Thanks,
Richard.
So I had a quick chat with Richard Sandiford and he is suggesting resetting
mode_i to 0 for all cases.
He pointed out that for some tunings the SVE mode might come after the NEON
mode, which means that even for not-unrolled loop_vinfos we could end up with
a suboptimal choice of mode for the epilogue. I.e. it could be that we pick
V16QI for main vectorization, but that's VNx16QI + 1 in the array, so we'd not
try VNx16QI for the epilogue.
This would simplify the mode selecting cases, by just simply restarting at
mode_i in all epilogue cases. Is that something you'd be OK?
Works for me with an updated comment. Even better with showing a
testcase exercising such tuning.
Richard.
Hi,
I don't think I ever ended up posting the rebased version on top of the
epilogue mode patch. So here it is.
OK for trunk?
gcc/ChangeLog:
* tree-vect-loop.c (vect_estimate_min_profitable_iters): Pass new
argument
suggested_unroll_factor.
(vect_analyze_loop_costing): Likewise.
(_loop_vec_info::_loop_vec_info): Initialize new member
suggested_unroll_factor.
(vect_determine_partial_vectors_and_peeling): Make epilogue of unrolled
main loop use partial vectors.
(vect_analyze_loop_2): Pass and use new argument
suggested_unroll_factor.
(vect_analyze_loop_1): Likewise.
(vect_analyze_loop): Change to intialize local suggested_unroll_factor
and use it.
(vectorizable_reduction): Don't use single_defuse_cycle when unrolling.
* tree-vectorizer.h (_loop_vec_info::_loop_vec_info): Add new member
suggested_unroll_factor.
(vector_costs::vector_costs): Add new member m_suggested_unroll_factor.
(vector_costs::suggested_unroll_factor): New getter function.
(finish_cost): Set return argument suggested_unroll_factor.
Regards,
Andre