On 25 May 2016 at 05:29, Kumar, Venkataramanan <venkataramanan.ku...@amd.com> wrote: > Hi Christophe, > >> -----Original Message----- >> From: Christophe Lyon [mailto:christophe.l...@linaro.org] >> Sent: Tuesday, May 24, 2016 8:45 PM >> To: Kumar, Venkataramanan <venkataramanan.ku...@amd.com> >> Cc: Richard Biener <richard.guent...@gmail.com>; gcc-patches@gcc.gnu.org >> Subject: Re: [Patch V2] Fix SLP PR58135. >> >> Hi Venkat, >> >> >> On 23 May 2016 at 11:54, Kumar, Venkataramanan >> <venkataramanan.ku...@amd.com> wrote: >> > Hi Richard, >> > >> >> -----Original Message----- >> >> From: Richard Biener [mailto:richard.guent...@gmail.com] >> >> Sent: Thursday, May 19, 2016 4:08 PM >> >> To: Kumar, Venkataramanan <venkataramanan.ku...@amd.com> >> >> Cc: gcc-patches@gcc.gnu.org >> >> Subject: Re: [Patch V2] Fix SLP PR58135. >> >> >> >> On Wed, May 18, 2016 at 5:29 PM, Kumar, Venkataramanan >> >> <venkataramanan.ku...@amd.com> wrote: >> >> > Hi Richard, >> >> > >> >> >> -----Original Message----- >> >> >> From: Richard Biener [mailto:richard.guent...@gmail.com] >> >> >> Sent: Tuesday, May 17, 2016 5:40 PM >> >> >> To: Kumar, Venkataramanan <venkataramanan.ku...@amd.com> >> >> >> Cc: gcc-patches@gcc.gnu.org >> >> >> Subject: Re: [Patch V2] Fix SLP PR58135. >> >> >> >> >> >> On Tue, May 17, 2016 at 1:56 PM, Kumar, Venkataramanan >> >> >> <venkataramanan.ku...@amd.com> wrote: >> >> >> > Hi Richard, >> >> >> > >> >> >> > I created the patch by passing -b option to git. Now the patch >> >> >> > is more >> >> >> readable. >> >> >> > >> >> >> > As per your suggestion I tried to fix the PR by splitting the >> >> >> > SLP store group at >> >> >> vector boundary after the SLP tree is built. >> >> >> > >> >> >> > Boot strap PASSED on x86_64. >> >> >> > Checked the patch with check_GNU_style.sh. >> >> >> > >> >> >> > The gfortran.dg/pr46519-1.f test now does SLP vectorization. >> >> >> > Hence it >> >> >> generated 2 more vzeroupper. >> >> >> > As recommended I adjusted the test case by adding >> >> >> > -fno-tree-slp-vectorize >> >> >> to make it as expected after loop vectorization. >> >> >> > >> >> >> > The following tests are now passing. >> >> >> > >> >> >> > ------ Snip----- >> >> >> > Tests that now work, but didn't before: >> >> >> > >> >> >> > gcc.dg/vect/bb-slp-19.c -flto -ffat-lto-objects >> >> >> > scan-tree-dump-times >> >> >> > slp2 "basic block vectorized" 1 >> >> >> > >> >> >> > gcc.dg/vect/bb-slp-19.c scan-tree-dump-times slp2 "basic block >> >> >> > vectorized" 1 >> >> >> > >> >> >> > New tests that PASS: >> >> >> > >> >> >> > gcc.dg/vect/pr58135.c (test for excess errors) >> >> >> > gcc.dg/vect/pr58135.c -flto -ffat-lto-objects (test for excess >> >> >> > errors) >> >> >> > >> >> >> > ------ Snip----- >> >> >> > >> >> >> > ChangeLog >> >> >> > >> >> >> > 2016-05-14 Venkataramanan Kumar >> >> >> <venkataramanan.ku...@amd.com> >> >> >> > PR tree-optimization/58135 >> >> >> > * tree-vect-slp.c: When group size is not multiple of vector >> >> >> > size, >> >> >> > allow splitting of store group at vector boundary. >> >> >> > >> >> >> > Test suite ChangeLog >> >> >> > 2016-05-14 Venkataramanan Kumar >> >> >> <venkataramanan.ku...@amd.com> >> >> >> > * gcc.dg/vect/bb-slp-19.c: Remove XFAIL. >> >> >> > * gcc.dg/vect/pr58135.c: Add new. >> >> >> > * gfortran.dg/pr46519-1.f: Adjust test case. >> >> >> > >> >> >> > The attached patch Ok for trunk? >> >> >> >> >> >> >> >> >> Please avoid the excessive vertical space around the >> >> >> vect_build_slp_tree >> >> call. >> >> > Yes fixed in the attached patch. >> >> >> >> >> >> + /* Calculate the unrolling factor. */ >> >> >> + unrolling_factor = least_common_multiple >> >> >> + (nunits, group_size) / group_size; >> >> >> ... >> >> >> + else >> >> >> { >> >> >> /* Calculate the unrolling factor based on the smallest >> >> >> type. */ >> >> >> if (max_nunits > nunits) >> >> >> - unrolling_factor = least_common_multiple (max_nunits, >> group_size) >> >> >> - / group_size; >> >> >> + unrolling_factor >> >> >> + = least_common_multiple (max_nunits, >> >> >> + group_size)/group_size; >> >> >> >> >> >> please compute the "correct" unroll factor immediately and move >> >> >> the "unrolling of BB required" error into the if() case by >> >> >> post-poning the nunits < group_size check (and use max_nunits here). >> >> >> >> >> > Yes fixed in the attached patch. >> >> > >> >> >> + if (is_a <bb_vec_info> (vinfo) >> >> >> + && nunits < group_size >> >> >> + && unrolling_factor != 1 >> >> >> + && is_a <bb_vec_info> (vinfo)) >> >> >> + { >> >> >> + dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, >> >> >> + "Build SLP failed: store group " >> >> >> + "size not a multiple of the vector size " >> >> >> + "in basic block SLP\n"); >> >> >> + /* Fatal mismatch. */ >> >> >> + matches[nunits] = false; >> >> >> >> >> >> this is too pessimistic - you want to add the extra 'false' at >> >> >> group_size / max_nunits * max_nunits. >> >> > Yes fixed in attached patch. >> >> > >> >> >> >> >> >> It looks like you leak 'node' in the if () path as well. You need >> >> >> >> >> >> vect_free_slp_tree (node); >> >> >> loads.release (); >> >> >> >> >> >> thus treat it as a failure case. >> >> > >> >> > Yes fixed. I added an else part before scalar_stmts.release call >> >> > for the case >> >> when SLP tree is not built. This avoids double freeing. >> >> > Bootstrapped and reg tested on X86_64. >> >> > >> >> This patch is causing regressions on armeb: >> http://people.linaro.org/~christophe.lyon/cross- >> validation/gcc/trunk/236591/report-build-info.html >> The following fortran tests now fail at runtime: >> >> gfortran.dg/c_f_pointer_logical.f03 -O3 -fomit-frame-pointer >> -funroll-loops -fpeel-loops -ftracer -finline-functions execution test >> gfortran.dg/c_f_pointer_logical.f03 -O3 -g execution test >> gfortran.dg/intrinsic_pack_1.f90 -O3 -fomit-frame-pointer >> -funroll-loops -fpeel-loops -ftracer -finline-functions execution test >> gfortran.dg/intrinsic_pack_1.f90 -O3 -g execution test >> >> Can you have a look? >> > > Sure. I will take a look. > Thanks, I've filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71270 to keep track of this.
Christophe >> Thanks, >> >> Christophe. >> >> >> >> >> > Ok for trunk ? >> >> >> >> + /*Calculate the unrolling factor based on the smallest type. */ >> >> + unrolling_factor >> >> >> >> Space after /* missing. >> >> >> >> + unrolling_factor >> >> + = least_common_multiple (max_nunits, group_size)/group_size; >> >> >> >> spaces before/after the '/' >> >> >> >> + if (max_nunits > nunits >> >> + && unrolling_factor != 1 >> >> + && is_a <bb_vec_info> (vinfo)) >> >> { >> >> if (dump_enabled_p ()) >> >> - dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, >> >> - "Build SLP failed: unrolling required in >> >> basic" >> >> - " block SLP\n"); >> >> + dump_printf_loc (MSG_MISSED_OPTIMIZATION, >> >> + vect_location, >> >> + "Build SLP failed: unrolling " >> >> + "required in basic block SLP\n"); >> >> vect_free_slp_tree (node); >> >> loads.release (); >> >> return false; >> >> } >> >> >> >> + if (is_a <bb_vec_info> (vinfo) >> >> + && max_nunits < group_size >> >> + && unrolling_factor != 1) >> >> >> >> please merge these. I think you have a not covered case of >> >> max_nunits == nunits, max_nunits > group_size. >> >> So do >> >> >> >> if (unrolling_factor != 1 >> >> && is_a <...>) >> >> { >> >> if (max_nunits >= group_size) >> >> { >> >> first error >> >> } >> >> ... signal fatal mismatch instead... >> >> } >> >> else >> >> { >> >> >> >> Ok with that change. >> > >> > Thank you for the review. I updated and patch and up streamed it to trunk. >> > Ref: https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=236582 >> > >> > Regards, >> > Venkat. > > Regards, > Venkat.