On Tue, Jul 27, 2010 at 06:47:53PM -0500, Sebastian Pop wrote:
> Hi,
> 
> I ran the following script to gather data with trunk (from 20100615)
> and Graphite branch (today).
> 
> for i in `ls -1 *.f90`; do
>     echo -n $i
>     $FC $OPT -c ./$i &> out
>     grep "LOOP VECTORIZED" out | wc
> done
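One note on the counting step in the loop above: plain wc prints lines, words, and bytes, and only the first of those is the loop count. A minimal sketch of the same loop using grep -c, which prints just the number of matching lines, follows. The function names count_vectorized and gather_counts are hypothetical, and FC and OPT are assumed to be set by the caller as in the quoted script.

```shell
#!/bin/sh
# Hypothetical helper: print the number of lines in a log file that
# contain "LOOP VECTORIZED".
count_vectorized() {
    # grep -c prints the count of matching lines; it exits non-zero when
    # the count is 0, so "|| true" keeps the helper usable under "set -e".
    grep -c "LOOP VECTORIZED" "$1" || true
}

# Hypothetical driver mirroring the quoted loop; FC and OPT are expected
# to be set by the caller, as in the original script.
gather_counts() {
    for src in ./*.f90; do
        [ -e "$src" ] || continue   # no .f90 files: skip the literal glob
        # $OPT is left unquoted on purpose so it splits into separate flags.
        "$FC" $OPT -c "$src" > out 2>&1
        printf '%s\t%s\n' "${src#./}" "$(count_vectorized out)"
    done
}
```

Calling gather_counts in the benchmark directory would print one tab-separated "file count" line per source file.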
> 
> The following columns correspond to the number of lines reported by wc.
> 
> Trunk0: OPT="-ftree-vectorizer-verbose=2 -O3 -ffast-math"
> Trunk1: OPT="-ftree-vectorizer-verbose=2 -O3 -ffast-math -fgraphite-identity"
> Gr0: OPT="-ftree-vectorizer-verbose=2 -O3 -ffast-math"
> Gr1: OPT="-ftree-vectorizer-verbose=2 -O3 -ffast-math
> -fgraphite-identity -fno-loop-strip-mine -fno-loop-interchange
> -fno-loop-block"
> 
>                 Trunk0  Trunk1  Gr0     Gr1
> ac.f90          30      30      29      29
> aermod.f90      151     110     147     147
> air.f90         4       3       4       4
> capacita.f90    17      11      13      13
> channel.f90     15      14      14      14
> doduc.f90       155     146     155     155
> fatigue.f90     15      15      15      15
> gas_dyn.f90     44      42      41      41
> induct.f90      9       5       5       5
> linpk.f90       14      3       14      14
> mdbx.f90        12      8       12      12
> nf.f90          51      34      50      50
> protein.f90     31      31      31      31
> rnflow.f90      87      75      85      85
> test_fpu.f90    80      65      78      78
> tfft.f90        4       3       4       4
> 
> Overall, with the recent changes that I pushed to the Graphite branch
> and that should be stable by now, we improved the vectorization of
> loops generated by Graphite.
> 
> The improvements in today's Graphite branch Gr1 with respect to
> Trunk1, that is, trunk with -fgraphite-identity, are the difference
> between Gr1 and Trunk1 (higher is more loops vectorized by Gr1):
> 
> ac.f90          -1
> aermod.f90      37
> air.f90         1
> capacita.f90    2
> channel.f90     0
> doduc.f90       9
> fatigue.f90     0
> gas_dyn.f90     -1
> induct.f90      0
> linpk.f90       11
> mdbx.f90        4
> nf.f90          16
> protein.f90     0
> rnflow.f90      10
> test_fpu.f90    13
> tfft.f90        1
> 
> There are still some missed vectorization cases; see the difference
> between Trunk0 and Gr0 (higher is more loops missed by Gr0):
> 
> ac.f90          1
> aermod.f90      4
> air.f90         0
> capacita.f90    4
> channel.f90     1
> doduc.f90       0
> fatigue.f90     0
> gas_dyn.f90     3
> induct.f90      4
> linpk.f90       0
> mdbx.f90        0
> nf.f90          1
> protein.f90     0
> rnflow.f90      2
> test_fpu.f90    2
> tfft.f90        0
> 
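The two difference columns quoted above can be recomputed from the four-column table. A minimal sketch, assuming the table has been saved as whitespace-separated text with the quote markers stripped and a header row in place; the function name vect_diffs and the file name results.txt are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: read a table with a header row and the columns
# "file Trunk0 Trunk1 Gr0 Gr1" and print, per file,
# Gr1 - Trunk1 and Trunk0 - Gr0, separated by tabs.
vect_diffs() {
    # NR > 1 skips the header; NF == 5 ignores blank or malformed rows.
    awk 'NR > 1 && NF == 5 {
        printf "%s\t%d\t%d\n", $1, $5 - $3, $2 - $4
    }' "$1"
}
```

With the aermod.f90 row (151 110 147 147), "vect_diffs results.txt" prints 37 and 4 for that file, matching the figures in the quoted lists.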

Sebastian,
    When do you think we may start to see the vectorizations in
Gr1 exceed those from Gr0? Will that require upgrading to the
newer cloog?
            Jack
P.S. If the vectorizations using -fgraphite-identity eventually reach
parity with those without that option, would -fgraphite-identity
become enabled by default for gcc builds with graphite support
(assuming minimal compile time increases)?

> After these changes are merged to trunk, we should revisit the
> following PRs:
> 
> http://gcc.gnu.org/PR38846: 35% slower using -floop* than without graphite
> http://gcc.gnu.org/PR40979: induct benchmark 60% slower when compiled
> with -fgraphite
> http://gcc.gnu.org/PR43359: gas_dyn benchmark exhibits missed
> vectorization with graphite
> 
> Sebastian Pop
> --
> AMD / Open Source Compiler Engineering / GNU Tools
