On Fri, 8 Nov 2019, Andre Vieira (lists) wrote:

> Hi,
>
> As I mentioned in the patch to disable epilogue vectorization for loops with
> SIMDUID set, there were still some aarch64 libgomp failures. This patch fixes
> those.
>
> The problem was that we were vectorizing a reduction that was only using one
> of the parts from a complex number, creating data accesses with gaps. For this
> we set PEELING_FOR_GAPS which forces us to peel an extra scalar iteration.
>
> What was happening in the testcase I looked at was that we had a known niters
> of 10. The first VF was 4, leaving 10 % 4 = 2 scalar iterations. The epilogue
> had VF 2, which meant the current code thought we could do it. However, given
> the PEELING_FOR_GAPS it would create a scalar epilogue and we would end up
> doing too many iterations, surprisingly 12 as I think the code assumed we
> hadn't created said epilogue.
>
> I ran a local check where I upped the iterations of the fortran test to 11 and
> I see GCC vectorizing the epilogue with VF = 2 and a scalar epilogue for one
> iteration, so that looks good too. I have transformed it into a test that
> would reproduce the issue in C and without openacc so I can run it in gcc's
> normal testsuite more easily.
>
> Bootstrap on aarch64 and x86_64.
>
> Is this OK for trunk?
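To make the iteration accounting in the quoted message concrete, here is a minimal C sketch. It is not the code in vect_do_peeling: the helper names leftover_iters and epilogue_fits are hypothetical, and it assumes the simple niters % VF accounting used in the message (no prologue peeling, and gap peeling modelled only for the epilogue).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not GCC's code: iterations left over after the
   main vector loop, using the plain niters % VF accounting from the
   message.  Assumes no prologue peeling.  */
static unsigned leftover_iters(unsigned niters, unsigned main_vf)
{
    assert(main_vf > 0);
    return niters % main_vf;
}

/* Hypothetical check: a vectorized epilogue with vectorization factor
   epilogue_vf is only worthwhile if the leftover iterations cover one
   full vector iteration plus the scalar iterations that peeling for
   gaps reserves (epilogue_gaps is 0 or 1 in this model).  */
static bool epilogue_fits(unsigned leftover, unsigned epilogue_vf,
                          unsigned epilogue_gaps)
{
    return leftover >= epilogue_vf + epilogue_gaps;
}
```

With niters = 10 and a main VF of 4, leftover_iters gives 2. epilogue_fits(2, 2, 0) is true, which corresponds to the old decision, while epilogue_fits(2, 2, 1) is false once the iteration reserved for gaps is counted. With niters = 11 the leftover is 3, so a VF 2 epilogue fits and one scalar iteration remains, matching the local check described above.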

OK.

Richard.

> Cheers,
> Andre
>
> gcc/ChangeLog:
> 2019-11-08  Andre Vieira  <andre.simoesdiasvie...@arm.com>
>
>     * tree-vect-loop-manip.c (vect_do_peeling): Take epilogue gaps
>     into account when checking if there are enough iterations to
>     vectorize epilogue.
>
> gcc/testsuite/ChangeLog:
> 2019-11-08  Andre Vieira  <andre.simoesdiasvie...@arm.com>
>
>     * gcc.dg/vect/vect-reduc-epilogue-gaps.c: New test.
>

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)