On Mon, 2022-02-28 at 13:37 +0800, Kewen.Lin via Gcc-patches wrote:
> Hi,
> 
> As PR103196 shows, p9-vec-length-full-7.c needs to be adjusted as the
> complete unrolling can happen on some of its loops.  This patch is to
> use pragma "GCC unroll 0" to disable all possible loop unrollings.
> Hope it can help the case not that fragile.

ok

Is the lack of effectiveness of "-fno-unroll-loops" otherwise
understood, or is there further issue behind that option? 

I would
expect the effect of the option, versus the pragma, two to roughly
equivalent.   Obviously it is not.  :-)
> 
> There are some other p9-vec-length* cases, I noticed that some of them
> use either bigger or unknown loop iteration counts, and
> "p9-vec-length-3*" have considered the effects of complete unrolling.
> So I just leave them alone for now.
> 
> Tested on powerpc64-linux-gnu P8 and powerpc64le-linux-gnu P9 and P10.
> 
> Is it ok for trunk?
> 
> BR,
> Kewen
> -----
>       PR testsuite/103196
> 
> gcc/testsuite/ChangeLog:
> 
>       * gcc.target/powerpc/p9-vec-length-7.h: Add DO_PRAGMA macro.
>       * gcc.target/powerpc/p9-vec-length-epil-7.c: Use unroll pragma to
>       disable any unrollings.
>       * gcc.target/powerpc/p9-vec-length-full-7.c: Remove useless option.
>       * gcc.target/powerpc/p9-vec-length.h: Likewise.

I suggest a slight rearrangement and correction.

The -fno-unroll-loops options are removed from *-epil-7.c and *-full-7.c.

p9-vec-length.h  adds the DO_PRAGMA macro.

p9-vec-length-7.h updates (corrects?) whitespace and adds the PRAGMA call for 
"GCC unroll 0" around the test loop. 




> > ---
> >  .../gcc.target/powerpc/p9-vec-length-7.h        | 17 +++++++++++------
> >  .../gcc.target/powerpc/p9-vec-length-epil-7.c   |  2 +-
> >  .../gcc.target/powerpc/p9-vec-length-full-7.c   |  2 +-
> >  .../gcc.target/powerpc/p9-vec-length.h          |  2 ++
> >  4 files changed, 15 insertions(+), 8 deletions(-)
> > 
> > diff --git a/gcc/testsuite/gcc.target/powerpc/p9-vec-length-7.h 
> > b/gcc/testsuite/gcc.target/powerpc/p9-vec-length-7.h
> > index 4ef8f974a04..4f338565619 100644
> > --- a/gcc/testsuite/gcc.target/powerpc/p9-vec-length-7.h
> > +++ b/gcc/testsuite/gcc.target/powerpc/p9-vec-length-7.h
> > @@ -7,14 +7,19 @@
> >  #define START 1
> >  #define END 59
> >  
> > +/* Note that we use pragma unroll to disable any loop unrollings.  */
> > +
> >  #define test(TYPE)                                                         
> >     \
> > -  TYPE x_##TYPE[N] __attribute__((aligned(16)));                           
> >      \
> > -  void __attribute__((noinline, noclone)) test_npeel_##TYPE() {            
> >     \
> > +  TYPE x_##TYPE[N] __attribute__ ((aligned (16)));                         
> >     \
> > +  void __attribute__ ((noinline, noclone)) test_npeel_##TYPE ()            
> >     \
> > +  {                                                                        
> >     \
> >      TYPE v = 0;                                                            
> >     \
> > -    for (unsigned int i = START; i < END; i++) {                           
> >     \
> > -      x_##TYPE[i] = v;                                                     
> >     \
> > -      v += 1;                                                              
> >     \
> > -    }                                                                      
> >     \
> > +    DO_PRAGMA (GCC unroll 0)                                               
> >     \
> > +    for (unsigned int i = START; i < END; i++)                             
> >     \
> > +      {                                                                    
> >     \
> > +   x_##TYPE[i] = v;                                                       \
> > +   v += 1;                                                                \
> > +      }                                                                    
> >     \
> >    }


Some whitespace fix-ups (ok), and the addition of
the "DO_PRAGMA (GCC unroll 0)".

ok.


> >  
> >  TEST_ALL (test)
> > diff --git a/gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-7.c 
> > b/gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-7.c
> > index a27ee347ca1..859fedd5679 100644
> > --- a/gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-7.c
> > +++ b/gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-7.c
> > @@ -1,5 +1,5 @@
> >  /* { dg-do compile { target { lp64 && powerpc_p9vector_ok } } } */
> > -/* { dg-options "-mdejagnu-cpu=power9 -O2 -ftree-vectorize 
> > -fno-vect-cost-model -fno-unroll-loops -ffast-math" } */
> > +/* { dg-options "-mdejagnu-cpu=power9 -O2 -ftree-vectorize 
> > -fno-vect-cost-model -ffast-math" } */

ok

> >  
> >  /* { dg-additional-options "--param=vect-partial-vector-usage=1" } */
> >  
> > diff --git a/gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-7.c 
> > b/gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-7.c
> > index 89ff38443e7..5fe542bba20 100644
> > --- a/gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-7.c
> > +++ b/gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-7.c
> > @@ -1,5 +1,5 @@
> >  /* { dg-do compile { target { lp64 && powerpc_p9vector_ok } } } */
> > -/* { dg-options "-mdejagnu-cpu=power9 -O2 -ftree-vectorize 
> > -fno-vect-cost-model -fno-unroll-loops -ffast-math" } */
> > +/* { dg-options "-mdejagnu-cpu=power9 -O2 -ftree-vectorize 
> > -fno-vect-cost-model -ffast-math" } */
> >  

ok


> >  /* { dg-additional-options "--param=vect-partial-vector-usage=2" } */
> >  
> > diff --git a/gcc/testsuite/gcc.target/powerpc/p9-vec-length.h 
> > b/gcc/testsuite/gcc.target/powerpc/p9-vec-length.h
> > index 83418b0b641..7aefc9b308d 100644
> > --- a/gcc/testsuite/gcc.target/powerpc/p9-vec-length.h
> > +++ b/gcc/testsuite/gcc.target/powerpc/p9-vec-length.h
> > @@ -1,5 +1,7 @@
> >  #include <stdint.h>
> >  
> > +#define DO_PRAGMA(x) _Pragma (#x)
> > +

I note that I'm used to seeing a different form of the pragma usage,
i.e. 
        #pragma GCC optimize("O3,unroll-loops")

though there is one other case in the powerpc testsuite with the
_Pragma style.   
        _Pragma ("__vector")

I expect this is fine either way.    Though
with just the one call, perhaps the addition of the DO_PRAGMA() macro
is unnecessary?   (thoughts?)


Thanks,
-Will


> >  #define TEST_ALL(T)                                                        
> >     \
> >    T (int8_t)                                                               
> >     \
> >    T (uint8_t)                                                              
> >     \
> > -- 
> > 2.27.0
> > 

Reply via email to