I'm attaching an updated version of the patch, addressing the comments from http://gcc.gnu.org/ml/gcc-patches/2012-04/msg01615.html
This patch adds arm32 to targets that support vect_char_mult. In addition,
the test is updated to prevent vectorization of the initialization loop.
The expected number of vectorized loops is adjusted accordingly.

No regression with check-gcc on qemu for arm-none-eabi cortex-a9 neon softfp
arm/thumb.

OK for trunk?

Thanks,
Greta

ChangeLog

gcc/testsuite

2012-05-30  Greta Yorsh  <Greta.Yorsh at arm.com>

	* gcc.dg/vect/slp-perm-8.c (main): Prevent vectorization
	of the initialization loop.
	(dg-final): Adjust the expected number of vectorized loops
	depending on vect_char_mult target selector.
	* lib/target-supports.exp (check_effective_target_vect_char_mult):
	Add arm32 to targets

> -----Original Message-----
> From: Richard Earnshaw [mailto:rearn...@arm.com]
> Sent: 25 April 2012 17:30
> To: Richard Guenther
> Cc: Greta Yorsh; gcc-patches@gcc.gnu.org; mikest...@comcast.net;
> r...@cebitec.uni-bielefeld.de
> Subject: Re: [Patch, testsuite] fix failure in test gcc.dg/vect/slp-perm-8.c
>
> On 25/04/12 15:31, Richard Guenther wrote:
> > On Wed, Apr 25, 2012 at 4:27 PM, Greta Yorsh <greta.yo...@arm.com> wrote:
> >> Richard Guenther wrote:
> >>> On Wed, Apr 25, 2012 at 3:34 PM, Greta Yorsh <greta.yo...@arm.com> wrote:
> >>>> Richard Guenther wrote:
> >>>>> On Wed, Apr 25, 2012 at 1:51 PM, Greta Yorsh <greta.yo...@arm.com> wrote:
> >>>>>> The test gcc.dg/vect/slp-perm-8.c fails on arm-none-eabi with neon
> >>>>>> enabled:
> >>>>>> FAIL: gcc.dg/vect/slp-perm-8.c scan-tree-dump-times vect "vectorized 1
> >>>>>> loops" 2
> >>>>>>
> >>>>>> The test expects 2 loops to be vectorized, while gcc successfully
> >>>>>> vectorizes 3 loops in this test using neon on arm. This patch adjusts
> >>>>>> the expected output. Fixed test passes on qemu for arm and powerpc.
> >>>>>>
> >>>>>> OK for trunk?
> >>>>>
> >>>>> I think the proper fix is to instead of
> >>>>>
> >>>>> for (i = 0; i < N; i++)
> >>>>>   {
> >>>>>     input[i] = i;
> >>>>>     output[i] = 0;
> >>>>>     if (input[i] > 256)
> >>>>>       abort ();
> >>>>>   }
> >>>>>
> >>>>> use
> >>>>>
> >>>>> for (i = 0; i < N; i++)
> >>>>>   {
> >>>>>     input[i] = i;
> >>>>>     output[i] = 0;
> >>>>>     __asm__ volatile ("");
> >>>>>   }
> >>>>>
> >>>>> to prevent vectorization of initialization loops.
> >>>>
> >>>> Actually, it looks like both arm and powerpc vectorize this
> >>>> initialization loop (line 31), because the control flow is hoisted
> >>>> outside the loop by previous optimizations. In addition, arm with neon
> >>>> vectorizes the second loop (line 39), but powerpc does not:
> >>>>
> >>>> 39: not vectorized: relevant stmt not supported: D.2163_8 = i_40 * 9;
> >>>>
> >>>> If this is the expected behaviour for powerpc, then the patch I
> >>>> proposed is still needed to fix the test failure on arm. Also, there
> >>>> would be no need to disable vectorization of the initialization loop,
> >>>> right?
> >>>
> >>> Ah, I thought that was what changed. Btw, the if () abort () tries to
> >>> disable vectorization but does not succeed in doing so.
> >>>
> >>> Richard.
> >>
> >> Here is an updated patch. It prevents vectorization of the initialization
> >> loop, as Richard suggested, and updates the expected number of vectorized
> >> loops accordingly. This patch assumes that the second loop in main
> >> (line 39) should only be vectorized on arm with neon. The test passes for
> >> arm and powerpc.
> >>
> >> OK for trunk?
> >
> > If arm cannot handle 9 * i then the appropriate condition would be
> > vect_int_mult, not arm_neon_ok.
> >
>
> The issue is that arm has (well, should be marked as having)
> vect_char_mult. The difference in count of vectorized loops is based on
> that.
>
> R.
>
> > Ok with that change.
> >
> > Richard.
> >
> >> Thank you,
> >> Greta
> >>
> >> gcc/testsuite/ChangeLog
> >>
> >> 2012-04-25  Greta Yorsh  <greta.yo...@arm.com>
> >>
> >>	* gcc.dg/vect/slp-perm-8.c (main): Prevent
> >>	vectorization of initialization loop.
> >>	(dg-final): Adjust the expected number of
> >>	vectorized loops.
> >>
diff --git a/gcc/testsuite/gcc.dg/vect/slp-perm-8.c b/gcc/testsuite/gcc.dg/vect/slp-perm-8.c
index d211ef9..c4854d5 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-perm-8.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-perm-8.c
@@ -32,8 +32,7 @@ int main (int argc, const char* argv[])
     {
       input[i] = i;
       output[i] = 0;
-      if (input[i] > 256)
-        abort ();
+      __asm__ volatile ("");
     }
 
   for (i = 0; i < N / 3; i++)
@@ -52,7 +51,8 @@ int main (int argc, const char* argv[])
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { target vect_perm_byte } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { target { vect_perm_byte && vect_char_mult } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_perm_byte && {! vect_char_mult } } } } } */
 /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target vect_perm_byte } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index b93dc5c..d249404 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -3462,7 +3462,8 @@ proc check_effective_target_vect_char_mult { } {
 	set et_vect_char_mult_saved 0
 	if { [istarget ia64-*-*]
 	     || [istarget i?86-*-*]
-	     || [istarget x86_64-*-*] } {
+	     || [istarget x86_64-*-*]
+	     || [check_effective_target_arm32] } {
 	   set et_vect_char_mult_saved 1
 	}
     }
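
For readers unfamiliar with the idiom used in the first hunk, here is a
minimal standalone sketch of an initialization loop that contains an empty
volatile asm statement, as suggested in the quoted thread. It is only an
illustration, not the test itself: the array size N, the element type, and
the helper names init_arrays and verify_arrays are assumptions made for
this example and are not taken from slp-perm-8.c. The empty asm is a GCC
extension, so the sketch assumes GCC or a compatible compiler.

```c
#include <assert.h>

#define N 200  /* hypothetical size, chosen only for this sketch */

static unsigned char input[N];
static unsigned char output[N];

/* Initialize both arrays.  The empty volatile asm emits no instructions;
   it is placed in the loop body so that the vectorizer leaves this loop
   alone, which is the purpose described in the thread.  */
static void
init_arrays (void)
{
  int i;

  for (i = 0; i < N; i++)
    {
      input[i] = i;
      output[i] = 0;
      __asm__ volatile ("");
    }
}

/* Check that the asm statement did not change the loop's result.  */
static void
verify_arrays (void)
{
  int i;

  for (i = 0; i < N; i++)
    {
      assert (input[i] == (unsigned char) i);
      assert (output[i] == 0);
    }
}
```

Calling init_arrays () followed by verify_arrays () from main should pass
regardless of optimization level. Whether the loop is in fact left
unvectorized can be checked by compiling with -O2 -ftree-vectorize
-fdump-tree-vect-details and inspecting the vect dump.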