Richard Guenther wrote:
> I think the underlying issue is
>
> phtask/41(-1) @0x7fd198c35100 availability:local 26416 time, 4268
> benefit 4541 size, 880 benefit 480 bytes stack usage reachable body
> local finalized inlinable
>   called by:
> phcall/33(-1) @0x7fd198c33a00 availability:local 8281 time, 972
> benefit 1351 size, 291 benefit 984 bytes stack usage reachable body
> local finalized inlinable
>   called by:
>
> that these are not called but still reachable (they should not be reachable
> anymore, instead the clones are now reachable). I think there already is
> a bug about cloning not updating cgraph reachability and not reclaiming
> nodes after IPA transform application.

You don't happen to recall the bug number?
The last time I did this sort of optimization was in 1992.
f2c (the Fortran-to-C compiler) gave me C equivalents of all Fortran
code in the forecasting executable.
I spent a rainy Sunday afternoon pasting them into one giant source
file, ordering them correctly (all called subroutines first) and then
slapping "static inline" on them.
Subsequently, I compiled the 30,000-line C file with gcc -O3. The
resulting executable was about 10% faster than the original (which was
also compiled by f2c - g77 didn't exist at that time).
So my hopes for this optimization (when done right) are quite high :-)
--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
At home: http://moene.org/~toon/
Progress of GNU Fortran: http://gcc.gnu.org/gcc-4.5/changes.html