On Fri, 8 Nov 2019, Andre Vieira (lists) wrote:

> Hi,
> 
> As I mentioned in the patch to disable epilogue vectorization for loops with
> SIMDUID set, there were still some aarch64 libgomp failures. This patch fixes
> those.
> 
> The problem was that we were vectorizing a reduction that was only using one
> of the parts from a complex number, creating data accesses with gaps. For this
> we set PEELING_FOR_GAPS which forces us to peel an extra scalar iteration.
> 
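For illustration, a reduction of the kind described above might look like the sketch below. This is not the test added by the patch; the `struct cplx` type, the function name and the array length are assumptions made for the example. Only the `re` member is read, so the loads skip over every `im` member, which is the "data accesses with gaps" situation. A vectorizer that loads whole groups can touch elements the scalar loop never reads, which is why a trailing scalar iteration is kept.

```c
#define N 10  /* assumed trip count, matching the niters of 10 in the report */

/* Hypothetical stand-in for a complex number: two adjacent doubles. */
struct cplx
{
  double re;
  double im;
};

/* Sum only the real parts of x[0..N-1].  The imaginary parts are never
   read, so consecutive loads are separated by a gap of one double. */
double
sum_real_parts (const struct cplx *x)
{
  double sum = 0.0;
  for (int i = 0; i < N; i++)
    sum += x[i].re;
  return sum;
}
```

Whether a given compiler actually vectorizes this loop depends on the target and on the flags used, since reordering a floating-point reduction generally needs to be permitted explicitly.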
> In the testcase I looked at, niters was known to be 10. The first VF was 4,
> leaving 10 % 4 = 2 scalar iterations. The epilogue had VF 2, so the current
> code thought we could vectorize it. However, because of PEELING_FOR_GAPS it
> would also create a scalar epilogue, and we would end up doing too many
> iterations (surprisingly 12), as I think the code assumed we hadn't created
> said epilogue.
> 
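The arithmetic above can be sketched as a small check. This is a simplified model, not the code in `vect_do_peeling`; the function names are hypothetical, and the case where niters is an exact multiple of the main VF is deliberately not modelled. The idea is that when peeling for gaps is required, the vector epilogue needs one iteration more than its VF, because one iteration has to be left over for the scalar loop.

```c
#include <stdbool.h>

/* Iterations left over once the main vector loop has run. */
unsigned
remaining_iters (unsigned niters, unsigned main_vf)
{
  return niters % main_vf;
}

/* Simplified form of the check that ignores gaps: the epilogue is
   vectorized whenever at least one full vector of iterations remains. */
bool
epilogue_ok_ignoring_gaps (unsigned remaining, unsigned epilogue_vf)
{
  return remaining >= epilogue_vf;
}

/* Gap-aware form: when peeling for gaps is needed, reserve one extra
   iteration for the scalar loop that follows the vector epilogue. */
bool
epilogue_ok_with_gaps (unsigned remaining, unsigned epilogue_vf,
                       bool peeling_for_gaps)
{
  return remaining >= epilogue_vf + (peeling_for_gaps ? 1u : 0u);
}
```

With niters = 10, a main VF of 4 and an epilogue VF of 2, two iterations remain: the gap-unaware check accepts the vector epilogue, while the gap-aware one rejects it. With niters = 11, three iterations remain and the gap-aware check accepts, leaving one iteration for the scalar loop.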
> I ran a local check where I upped the iterations of the Fortran test to 11,
> and I see GCC vectorizing the epilogue with VF = 2 and a scalar epilogue for
> one iteration, so that looks good too. I have transformed it into a test in
> C, without OpenACC, that reproduces the issue, so I can run it in GCC's
> normal testsuite more easily.
> 
> Bootstrapped on aarch64 and x86_64.
> 
> Is this OK for trunk?

OK.

Richard.

> Cheers,
> Andre
> 
> gcc/ChangeLog:
> 2019-11-08  Andre Vieira  <andre.simoesdiasvie...@arm.com>
> 
>       * tree-vect-loop-manip.c (vect_do_peeling): Take epilogue gaps
>         into account when checking if there are enough iterations to
>         vectorize epilogue.
> 
> gcc/testsuite/ChangeLog:
> 2019-11-08  Andre Vieira  <andre.simoesdiasvie...@arm.com>
> 
>       * gcc.dg/vect/vect-reduc-epilogue-gaps.c: New test.
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
