On Mon, 10 Jan 2022, Andre Vieira (lists) wrote: > Hi, > > I don't think I ever ended up posting the rebased version on top of the > epilogue mode patch. So here it is, I think I had a conditional OK if I split > the epilogue mode patch, but just want to double check this is OK for trunk?
Yes, I think I acked this. Richard. > > gcc/ChangeLog: > > * tree-vect-loop.c (vect_estimate_min_profitable_iters): Pass new > argument > suggested_unroll_factor. > (vect_analyze_loop_costing): Likewise. > (_loop_vec_info::_loop_vec_info): Initialize new member > suggested_unroll_factor. > (vect_determine_partial_vectors_and_peeling): Make epilogue of > unrolled > main loop use partial vectors. > (vect_analyze_loop_2): Pass and use new argument > suggested_unroll_factor. > (vect_analyze_loop_1): Likewise. > (vect_analyze_loop): Change to intialize local > suggested_unroll_factor and use it. > (vectorizable_reduction): Don't use single_defuse_cycle when > unrolling. > * tree-vectorizer.h (_loop_vec_info::_loop_vec_info): Add new member > suggested_unroll_factor. > (vector_costs::vector_costs): Add new member > m_suggested_unroll_factor. > (vector_costs::suggested_unroll_factor): New getter function. > (finish_cost): Set return argument suggested_unroll_factor. > > > > Regards, > Andre > > On 30/11/2021 13:56, Richard Biener wrote: > > On Tue, 30 Nov 2021, Andre Vieira (lists) wrote: > > > >> On 25/11/2021 12:46, Richard Biener wrote: > >>> Oops, my fault, yes, it does. I would suggest to refactor things so > >>> that the mode_i = first_loop_i case is there only once. I also wonder > >>> if all the argument about starting at 0 doesn't apply to the > >>> not unrolled LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P as well? So > >>> what's the reason to differ here? So in the end I'd just change > >>> the existing > >>> > >>> if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo)) > >>> { > >>> > >>> to > >>> > >>> if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo) > >>> || first_loop_vinfo->suggested_unroll_factor > 1) > >>> { > >>> > >>> and maybe revisit this when we have an actual testcase showing that > >>> doing sth else has a positive effect? > >>> > >>> Thanks, > >>> Richard. > >> So I had a quick chat with Richard Sandiford and he is suggesting resetting > >> mode_i to 0 for all cases. > >> > >> He pointed out that for some tunings the SVE mode might come after the NEON > >> mode, which means that even for not-unrolled loop_vinfos we could end up > >> with > >> a suboptimal choice of mode for the epilogue. I.e. it could be that we pick > >> V16QI for main vectorization, but that's VNx16QI + 1 in the array, so we'd > >> not > >> try VNx16QI for the epilogue. > >> > >> This would simplify the mode selecting cases, by just simply restarting at > >> mode_i in all epilogue cases. Is that something you'd be OK? > > Works for me with an updated comment. Even better with showing a > > testcase exercising such tuning. > > > > Richard. > -- Richard Biener <rguent...@suse.de> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)