Re: GCC Summit 2010 topic (potentially).

Richard Guenther Thu, 04 Jun 2009 13:53:07 -0700

On Thu, Jun 4, 2009 at 7:33 PM, Toon Moene <t...@moene.org> wrote:
> L.S.,
>
> This year I'm unable to attend the GCC Summit (both due to time and money
> constraints).
>
> In 2008, I considered giving a talk about the effect of link time
> optimization on typical Fortran programs -
>
> That is, until my attention got hijacked by the geo-politically more
> pressing question of Coarrays in Fortran.
>
> However, the issue still stands.  So I'm thinking ahead of next year
> (assuming LTO will work by that time for most front-end languages):
>
> What will LTO bring for Fortran ?
>
> Here's a run-of-the-mill example from our code:
>
>      SUBROUTINE VERINT (
>     I   KLON   , KLAT   , KLEV   , KINT  , KHALO
>     I , KLON1  , KLON2  , KLAT1  , KLAT2
>     I , KP     , KQ     , KR
>     R , PARG   , PRES
>     R , PALFH  , PBETH
>     R , PALFA  , PBETA  , PGAMA   )
> ...
>      DO JY = KLAT1,KLAT2
>      DO JX = KLON1,KLON2
>         IDX  = KP(JX,JY)
>         IDY  = KQ(JX,JY)
>         ILEV = KR(JX,JY)
> C
>         PRES(JX,JY) = PGAMA(JX,JY,1)*(
> C
>     +   PBETA(JX,JY,1)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY-1,ILEV-1)
>     +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY-1,ILEV-1) )
>     + + PBETA(JX,JY,2)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY  ,ILEV-1)
>     +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY  ,ILEV-1) ) )
> C    +
>     +               + PGAMA(JX,JY,2)*(
> C    +
>     +   PBETA(JX,JY,1)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY-1,ILEV  )
>     +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY-1,ILEV  ) )
>     + + PBETA(JX,JY,2)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY  ,ILEV  )
>     +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY  ,ILEV  ) ) )
>      ENDDO
>      ENDDO
> ...
>      RETURN
>      END
>
> There are several issues a link time optimization pass could determine:
>
> 1. Whether or not the arrays PALFA, PARG, ... are suitably aligned for
>   vectorization (forgoing a run time check for that).
>
> 2. Whether KLON{1,2}, KLAT{1,2} are actually invariant throughout an
>   invocation of the executable (as they are in our case)
>   (CSE of vectorization criteria).
>
> However, with a little bit of extra effort (instrumentation outside the
> program), the following can be determined:
>
> 3. KLON{1,2}, KLAT{1,2} are in fact known constants, which only happen
>   to be variables because the executable is built to accommodate
>   arbitrary grid sizes.
>
> Would it help to provide GCC with knowledge about KLON, KLAT (and thereby,
> KLON{1,2}, KLAT{1,2}) ?
>
> Note that this question is less academic than it seems.  We often run on the
> same grid for years without changing an executable, so this optimization
> makes sense.


IPA-CP (interprocedural constant propagation) would be the candidate to
propagate that information.  But if both are sufficiently large, the
performance improvement from knowing them is likely minimal.
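
To make the idea concrete, here is a minimal sketch in C (not the Fortran
routine above) of what propagating the loop bounds amounts to.  All names
and the grid size are hypothetical, and the second function is written by
hand to approximate what a constant-propagated clone might look like; it
is not actual compiler output.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical grid extents, standing in for KLON and KLAT. */
enum { NLON = 8, NLAT = 4 };

/* General version: the loop bounds arrive as run-time parameters,
   as KLON1/KLON2/KLAT1/KLAT2 do in the Fortran routine. */
void scale_general(double *res, const double *arg, int nlon,
                   int lon1, int lon2, int lat1, int lat2)
{
    assert(lon1 >= 0 && lon2 < nlon && lat1 >= 0);
    for (int jy = lat1; jy <= lat2; jy++)
        for (int jx = lon1; jx <= lon2; jx++)
            res[(size_t)jy * nlon + jx] = 2.0 * arg[(size_t)jy * nlon + jx];
}

/* Hand-written specialization: the bounds are compile-time constants,
   so trip counts are known and no run-time bound checks are needed. */
void scale_specialized(double *res, const double *arg)
{
    for (int jy = 0; jy < NLAT; jy++)
        for (int jx = 0; jx < NLON; jx++)
            res[jy * NLON + jx] = 2.0 * arg[jy * NLON + jx];
}
```

If every call in the program is scale_general(res, arg, NLON, 0, NLON-1,
0, NLAT-1), the two functions compute the same result.  With large
extents the per-call setup that the specialization saves is amortized
over many iterations, which is why the gain is likely small.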

Richard.

> Kind regards,
>
> --
> Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
> Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
> At home: http://moene.org/~toon/
> Progress of GNU Fortran: http://gcc.gnu.org/gcc-4.4/changes.html
>
