On Tue, May 17, 2016 at 1:56 PM, Kumar, Venkataramanan <venkataramanan.ku...@amd.com> wrote: > Hi Richard, > > I created the patch by passing -b option to git. Now the patch is more > readable. > > As per your suggestion I tried to fix the PR by splitting the SLP store group > at vector boundary after the SLP tree is built. > > Boot strap PASSED on x86_64. > Checked the patch with check_GNU_style.sh. > > The gfortran.dg/pr46519-1.f test now does SLP vectorization. Hence it > generated 2 more vzeroupper. > As recommended I adjusted the test case by adding -fno-tree-slp-vectorize to > make it as expected after loop vectorization. > > The following tests are now passing. > > ------ Snip----- > Tests that now work, but didn't before: > > gcc.dg/vect/bb-slp-19.c -flto -ffat-lto-objects scan-tree-dump-times slp2 > "basic block vectorized" 1 > > gcc.dg/vect/bb-slp-19.c scan-tree-dump-times slp2 "basic block vectorized" 1 > > New tests that PASS: > > gcc.dg/vect/pr58135.c (test for excess errors) gcc.dg/vect/pr58135.c -flto > -ffat-lto-objects (test for excess errors) > > ------ Snip----- > > ChangeLog > > 2016-05-14 Venkataramanan Kumar <venkataramanan.ku...@amd.com> > PR tree-optimization/58135 > * tree-vect-slp.c: When group size is not multiple of vector size, > allow splitting of store group at vector boundary. > > Test suite ChangeLog > 2016-05-14 Venkataramanan Kumar <venkataramanan.ku...@amd.com> > * gcc.dg/vect/bb-slp-19.c: Remove XFAIL. > * gcc.dg/vect/pr58135.c: Add new. > * gfortran.dg/pr46519-1.f: Adjust test case. > > The attached patch Ok for trunk?
Please avoid the excessive vertical space around the vect_build_slp_tree call. + /* Calculate the unrolling factor. */ + unrolling_factor = least_common_multiple + (nunits, group_size) / group_size; ... + else { /* Calculate the unrolling factor based on the smallest type. */ if (max_nunits > nunits) - unrolling_factor = least_common_multiple (max_nunits, group_size) - / group_size; + unrolling_factor + = least_common_multiple (max_nunits, group_size)/group_size; please compute the "correct" unroll factor immediately and move the "unrolling of BB required" error into the if() case by post-poning the nunits < group_size check (and use max_nunits here). + if (is_a <bb_vec_info> (vinfo) + && nunits < group_size + && unrolling_factor != 1 + && is_a <bb_vec_info> (vinfo)) + { + dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, + "Build SLP failed: store group " + "size not a multiple of the vector size " + "in basic block SLP\n"); + /* Fatal mismatch. */ + matches[nunits] = false; this is too pessimistic - you want to add the extra 'false' at group_size / max_nunits * max_nunits. It looks like you leak 'node' in the if () path as well. You need vect_free_slp_tree (node); loads.release (); thus treat it as a failure case. Thanks, Richard. > Regards, > Venkat. >