On Wed, May 18, 2016 at 5:29 PM, Kumar, Venkataramanan <venkataramanan.ku...@amd.com> wrote: > Hi Richard, > >> -----Original Message----- >> From: Richard Biener [mailto:richard.guent...@gmail.com] >> Sent: Tuesday, May 17, 2016 5:40 PM >> To: Kumar, Venkataramanan <venkataramanan.ku...@amd.com> >> Cc: gcc-patches@gcc.gnu.org >> Subject: Re: [Patch V2] Fix SLP PR58135. >> >> On Tue, May 17, 2016 at 1:56 PM, Kumar, Venkataramanan >> <venkataramanan.ku...@amd.com> wrote: >> > Hi Richard, >> > >> > I created the patch by passing -b option to git. Now the patch is more >> readable. >> > >> > As per your suggestion I tried to fix the PR by splitting the SLP store >> > group at >> vector boundary after the SLP tree is built. >> > >> > Boot strap PASSED on x86_64. >> > Checked the patch with check_GNU_style.sh. >> > >> > The gfortran.dg/pr46519-1.f test now does SLP vectorization. Hence it >> generated 2 more vzeroupper. >> > As recommended I adjusted the test case by adding -fno-tree-slp-vectorize >> to make it as expected after loop vectorization. >> > >> > The following tests are now passing. >> > >> > ------ Snip----- >> > Tests that now work, but didn't before: >> > >> > gcc.dg/vect/bb-slp-19.c -flto -ffat-lto-objects scan-tree-dump-times >> > slp2 "basic block vectorized" 1 >> > >> > gcc.dg/vect/bb-slp-19.c scan-tree-dump-times slp2 "basic block >> > vectorized" 1 >> > >> > New tests that PASS: >> > >> > gcc.dg/vect/pr58135.c (test for excess errors) gcc.dg/vect/pr58135.c >> > -flto -ffat-lto-objects (test for excess errors) >> > >> > ------ Snip----- >> > >> > ChangeLog >> > >> > 2016-05-14 Venkataramanan Kumar >> <venkataramanan.ku...@amd.com> >> > PR tree-optimization/58135 >> > * tree-vect-slp.c: When group size is not multiple of vector size, >> > allow splitting of store group at vector boundary. >> > >> > Test suite ChangeLog >> > 2016-05-14 Venkataramanan Kumar >> <venkataramanan.ku...@amd.com> >> > * gcc.dg/vect/bb-slp-19.c: Remove XFAIL. >> > * gcc.dg/vect/pr58135.c: Add new. >> > * gfortran.dg/pr46519-1.f: Adjust test case. >> > >> > The attached patch Ok for trunk? >> >> >> Please avoid the excessive vertical space around the vect_build_slp_tree >> call. > Yes fixed in the attached patch. >> >> + /* Calculate the unrolling factor. */ >> + unrolling_factor = least_common_multiple >> + (nunits, group_size) / group_size; >> ... >> + else >> { >> /* Calculate the unrolling factor based on the smallest type. */ >> if (max_nunits > nunits) >> - unrolling_factor = least_common_multiple (max_nunits, group_size) >> - / group_size; >> + unrolling_factor >> + = least_common_multiple (max_nunits, >> + group_size)/group_size; >> >> please compute the "correct" unroll factor immediately and move the >> "unrolling of BB required" error into the if() case by post-poning the >> nunits < >> group_size check (and use max_nunits here). >> > Yes fixed in the attached patch. > >> + if (is_a <bb_vec_info> (vinfo) >> + && nunits < group_size >> + && unrolling_factor != 1 >> + && is_a <bb_vec_info> (vinfo)) >> + { >> + dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, >> + "Build SLP failed: store group " >> + "size not a multiple of the vector size " >> + "in basic block SLP\n"); >> + /* Fatal mismatch. */ >> + matches[nunits] = false; >> >> this is too pessimistic - you want to add the extra 'false' at group_size / >> max_nunits * max_nunits. > Yes fixed in attached patch. > >> >> It looks like you leak 'node' in the if () path as well. You need >> >> vect_free_slp_tree (node); >> loads.release (); >> >> thus treat it as a failure case. > > Yes fixed. I added an else part before scalar_stmts.release call for the case > when SLP tree is not built. This avoids double freeing. > Bootstrapped and reg tested on X86_64. > > Ok for trunk ?
+ /*Calculate the unrolling factor based on the smallest type. */ + unrolling_factor Space after /* missing. + unrolling_factor + = least_common_multiple (max_nunits, group_size)/group_size; spaces before/after the '/' + if (max_nunits > nunits + && unrolling_factor != 1 + && is_a <bb_vec_info> (vinfo)) { if (dump_enabled_p ()) - dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, - "Build SLP failed: unrolling required in basic" - " block SLP\n"); + dump_printf_loc (MSG_MISSED_OPTIMIZATION, + vect_location, + "Build SLP failed: unrolling " + "required in basic block SLP\n"); vect_free_slp_tree (node); loads.release (); return false; } + if (is_a <bb_vec_info> (vinfo) + && max_nunits < group_size + && unrolling_factor != 1) please merge these. I think you have a not covered case of max_nunits == nunits, max_nunits > group_size. So do if (unrolling_factor != 1 && is_a <...>) { if (max_nunits >= group_size) { first error } ... signal fatal mismatch instead... } else { Ok with that change. Thanks, Richard. >> >> Thanks, >> Richard. >> >> > Regards, >> > Venkat. >> > > > Regards, > Venkat.