On 25/04/12 15:31, Richard Guenther wrote:
> On Wed, Apr 25, 2012 at 4:27 PM, Greta Yorsh <greta.yo...@arm.com> wrote:
>> Richard Guenther wrote:
>>> On Wed, Apr 25, 2012 at 3:34 PM, Greta Yorsh <greta.yo...@arm.com> wrote:
>>>> Richard Guenther wrote:
>>>>> On Wed, Apr 25, 2012 at 1:51 PM, Greta Yorsh <greta.yo...@arm.com> wrote:
>>>>>> The test gcc.dg/vect/slp-perm-8.c fails on arm-none-eabi with neon enabled:
>>>>>> FAIL: gcc.dg/vect/slp-perm-8.c scan-tree-dump-times vect "vectorized 1 loops" 2
>>>>>>
>>>>>> The test expects 2 loops to be vectorized, while gcc successfully vectorizes
>>>>>> 3 loops in this test using neon on arm. This patch adjusts the expected
>>>>>> output. Fixed test passes on qemu for arm and powerpc.
>>>>>>
>>>>>> OK for trunk?
>>>>>
>>>>> I think the proper fix is to instead of
>>>>>
>>>>>   for (i = 0; i < N; i++)
>>>>>     {
>>>>>       input[i] = i;
>>>>>       output[i] = 0;
>>>>>       if (input[i] > 256)
>>>>>         abort ();
>>>>>     }
>>>>>
>>>>> use
>>>>>
>>>>>   for (i = 0; i < N; i++)
>>>>>     {
>>>>>       input[i] = i;
>>>>>       output[i] = 0;
>>>>>       __asm__ volatile ("");
>>>>>     }
>>>>>
>>>>> to prevent vectorization of initialization loops.
>>>>
>>>> Actually, it looks like both arm and powerpc vectorize this
>>>> initialization loop (line 31), because the control flow is hoisted
>>>> outside the loop by previous optimizations. In addition, arm with neon
>>>> vectorizes the second loop (line 39), but powerpc does not:
>>>>
>>>> 39: not vectorized: relevant stmt not supported: D.2163_8 = i_40 * 9;
>>>>
>>>> If this is the expected behaviour for powerpc, then the patch I
>>>> proposed is still needed to fix the test failure on arm. Also, there
>>>> would be no need to disable vectorization of the initialization loop,
>>>> right?
>>>
>>> Ah, I thought that was what changed. Btw, the if () abort () tries to
>>> disable vectorization but does not succeed in doing so.
>>>
>>> Richard.
>>
>> Here is an updated patch.
>> It prevents vectorization of the initialization
>> loop, as Richard suggested, and updates the expected number of vectorized
>> loops accordingly. This patch assumes that the second loop in main (line 39)
>> should only be vectorized on arm with neon. The test passes for arm and
>> powerpc.
>>
>> OK for trunk?
>
> If arm cannot handle 9 * i then the appropriate condition would be
> vect_int_mult, not arm_neon_ok.
>
The issue is that arm has (well, should be marked as having)
vect_char_mult. The difference in the count of vectorized loops is based
on that.

R.

> Ok with that change.
>
> Richard.
>
>> Thank you,
>> Greta
>>
>> gcc/testsuite/ChangeLog
>>
>> 2012-04-25  Greta Yorsh  <greta.yo...@arm.com>
>>
>>         * gcc.dg/vect/slp-perm-8.c (main): Prevent
>>         vectorization of initialization loop.
>>         (dg-final): Adjust the expected number of
>>         vectorized loops.