https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89578

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
So it looks like the change from r269097 to r269098 is only 2.5% (I've included
the followup fix on top of r269098):

481.wrf         11170        245       45.7 *   11170        251       44.5 S
481.wrf         11170        246       45.4 S   11170        251       44.5 *
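As a quick check of the 2.5% figure, here is a minimal sketch of the arithmetic. It assumes the two run-time columns are seconds for BASE (r269097) and PEAK (r269098 + followup) and that the rows marked `*` are the selected runs. The helper name `slowdown_percent` is made up for illustration.

```c
/* Relative slowdown of peak versus base, in percent.
   Assumes both arguments are run times in seconds and base_s > 0. */
double slowdown_percent(double base_s, double peak_s)
{
    return (peak_s - base_s) / base_s * 100.0;
}
```

With the selected runs above, `slowdown_percent(245.0, 251.0)` evaluates to roughly 2.45, which matches the quoted 2.5% after rounding.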

Checking r269096 reveals (BASE = r269096, PEAK = r269098 + followup):

481.wrf         11170        244       45.7 S   11170        254       44.0 S
481.wrf         11170        244       45.8 *   11170        252       44.3 *

Note that Fortran's guarantees on parameter aliasing are stronger than
those of restrict-qualified pointers, so modelling them with restrict
isn't perfect and it might suffer from the correctness fix unnecessarily.
In a similar fashion, inlining might make PTA make more conservative
choices than when looking at the fnspec guarantees.
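For readers less familiar with restrict, the following is a small C sketch of the kind of promise being discussed. It is an illustration only: the functions are hypothetical and not taken from 481.wrf or from GCC. In the first variant, restrict tells the compiler that the object modified through `out` is not also accessed through `in` inside the function, so loads from `in` need not be redone after each store. In the second variant the compiler has to allow for overlap.

```c
#include <stddef.h>

/* Hypothetical kernel: out[i] += k * in[i].
   restrict promises that out and in do not overlap within this call. */
void scale_add(size_t n, double *restrict out,
               const double *restrict in, double k)
{
    for (size_t i = 0; i < n; i++)
        out[i] += k * in[i];
}

/* Same kernel without restrict: out and in may alias, so the compiler
   must assume a store to out[i] can change a later in[j]. */
void scale_add_may_alias(size_t n, double *out,
                         const double *in, double k)
{
    for (size_t i = 0; i < n; i++)
        out[i] += k * in[i];
}
```

Both variants compute the same result when the arrays are disjoint; the difference is only in what the optimizer is allowed to assume. How much of that information survives inlining and a points-to recompute is the question raised above.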

For r269096 there's the possibility of simply never recomputing restrict
(gate it with !cfun->after_inlining).

Overall I see

   9.90%        201223  wrf_peak.amd64-  wrf_peak.amd64-m64-gcc42-nn  [.] solve_interface_
   8.50%        172668  wrf_base.amd64-  wrf_base.amd64-m64-gcc42-nn  [.] solve_interface_

which explains why we only see this with -flto (this function does nothing
but call other functions...).  It doesn't really explain why r269096 makes
such a big difference though (I guess it must be the PRE PTA recompute
triggering this).

But the function is a huge mess and the profile quite flat and similar...
