On Thu, Jun 4, 2009 at 7:33 PM, Toon Moene <t...@moene.org> wrote:
> L.S.,
>
> This year I'm unable to attend the GCC Summit (both due to time and money
> constraints).
>
> In 2008, I pondered to talk about the effect of link time optimization on
> typical Fortran programs -
>
> That is, until my attention got hijacked by the geo-politically more
> pressing question of Coarrays in Fortran.
>
> However, the issue still stands.  So I'm thinking ahead of next year
> (assuming LTO will work by that time for most front-end languages):
>
> What will LTO bring for Fortran ?
>
> Here's a run-of-the-mill example from our code:
>
>       SUBROUTINE VERINT (
>      I   KLON   , KLAT   , KLEV   , KINT   , KHALO
>      I , KLON1  , KLON2  , KLAT1  , KLAT2
>      I , KP     , KQ     , KR
>      R , PARG   , PRES
>      R , PALFH  , PBETH
>      R , PALFA  , PBETA  , PGAMA  )
> ...
>       DO JY = KLAT1,KLAT2
>       DO JX = KLON1,KLON2
>          IDX  = KP(JX,JY)
>          IDY  = KQ(JX,JY)
>          ILEV = KR(JX,JY)
> C
>          PRES(JX,JY) = PGAMA(JX,JY,1)*(
> C
>      +   PBETA(JX,JY,1)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY-1,ILEV-1)
>      +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY-1,ILEV-1) )
>      + + PBETA(JX,JY,2)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY  ,ILEV-1)
>      +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY  ,ILEV-1) ) )
> C    +
>      + + PGAMA(JX,JY,2)*(
> C    +
>      +   PBETA(JX,JY,1)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY-1,ILEV  )
>      +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY-1,ILEV  ) )
>      + + PBETA(JX,JY,2)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY  ,ILEV  )
>      +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY  ,ILEV  ) ) )
>       ENDDO
>       ENDDO
> ...
>       RETURN
>       END
>
> There are several issues a link time optimization pass could determine:
>
> 1. Whether or not the arrays PALFA, PARG, ... are suitably aligned for
>    vectorization (forgoing a run time check for that).
>
> 2. Whether KLON{1,2}, KLAT{1,2} are actually invariant throughout an
>    invocation of the executable (as they are in our case)
>    (CSE of vectorization criteria).
>
> However, with a little bit of extra effort (instrumentation outside the
> program), the following can be determined:
>
> 3. KLON{1,2}, KLAT{1,2} are in fact known constants, which only happen
>    to be variables because the executable is built to accommodate
>    arbitrary grid sizes.
>
> Would it help to provide GCC with knowledge about KLON, KLAT (and thereby,
> KLON{1,2}, KLAT{1,2}) ?
>
> Note that this question is less academic than it seems.  We often run on the
> same grid for years without changing an executable, so this optimization
> makes sense.

IPA-CP would be the candidate to propagate that information.  But if they
are both sufficiently large the performance improvement from knowing them
is likely minimal.

Richard.

> Kind regards,
>
> --
> Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
> Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
> At home: http://moene.org/~toon/
> Progress of GNU Fortran: http://gcc.gnu.org/gcc-4.4/changes.html
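
To make the constant-propagation point concrete, here is a minimal sketch in C (a C analogue, not the Fortran from the thread).  Everything in it is hypothetical: the names weighted_sum and weighted_sum_fixed and the bounds GRID_LO and GRID_HI are invented for illustration and merely stand in for loop bounds such as KLON1 and KLON2.  The second function is a hand-written version of roughly what a compiler could derive from the first if it could prove that every caller passes the same constant bounds; it is not a claim about the code GCC actually generates.

```c
#include <stddef.h>

/* Hypothetical fixed grid bounds, standing in for KLON1..KLON2. */
enum { GRID_LO = 1, GRID_HI = 8 };

/* General version: the loop bounds arrive as run-time arguments,
 * so the trip count is unknown when this function is compiled alone. */
double weighted_sum(const double *w, const double *x, int lo, int hi)
{
    double acc = 0.0;
    for (int i = lo; i <= hi; ++i)
        acc += w[i] * x[i];
    return acc;
}

/* Specialized version: the same loop with the bounds folded in as
 * compile-time constants, so the trip count is known statically. */
double weighted_sum_fixed(const double *w, const double *x)
{
    double acc = 0.0;
    for (int i = GRID_LO; i <= GRID_HI; ++i)
        acc += w[i] * x[i];
    return acc;
}
```

Both functions return the same value whenever the general one is called with GRID_LO and GRID_HI.  With a known trip count the compiler can, for example, drop run-time checks on the loop length; as noted above, for large bounds that saving is likely to be small relative to the work done inside the loop.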