Hi,
after IRA I've re-done the x86-64 SPECint testing, also with the new code
to eliminate arguments.  (The SPECfp, CSiBE and C++ benchmark runs failed
because the tree was broken at that point; I will get results tomorrow,
but there were no surprises in the earlier runs.)  Luis also did PPC SPEC
runs.

The most important regression I am aware of is equake on PPC, about 5%
with cloning enabled.  The problem here is that IPCP correctly propagates
the operands array into the initialization functions.  This makes them
called just once (since cloning works around the fact that equake lacks
static modifiers) and we inline them.  Inlining causes the branch
prediction code to misidentify the hot portion of the program as the
initialization loop instead of the simulation loop.  The same problem
happens e.g. with -fwhole-program or if "static" keywords are added.
Equake consists of a single file.
So I don't think it is IPCP's fault per se, we are just unlucky.
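
To make the mechanism concrete, here is a minimal hand-written sketch of
what cloning effectively does in a case like this.  It is not equake
source and not actual GCC output: the names operands, init_operands,
init_operands_clone and setup_and_sum are all made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define N 4

double operands[N];   /* hypothetical global, stands in for the real data */

/* Not static, so without -fwhole-program the compiler must assume other
   files may call it and cannot treat it as "called once".  */
void init_operands(double *a, size_t n, double value)
{
    assert(a != NULL);
    for (size_t i = 0; i < n; i++)
        a[i] = value;
}

/* Hand-written stand-in for the clone: the only arguments seen in this
   unit (operands, N) are folded in.  It is local and has a single
   caller, which makes it an obvious inlining candidate.  */
static void init_operands_clone(double value)
{
    for (size_t i = 0; i < N; i++)
        operands[i] = value;
}

/* The single call site, redirected to the clone.  */
double setup_and_sum(double value)
{
    double sum = 0.0;
    init_operands_clone(value);
    for (size_t i = 0; i < N; i++)
        sum += operands[i];
    return sum;
}
```

Once the clone is inlined, the initialization loop sits in the same
function body as the loop after it, which is the situation where the
hot-path guess can go wrong.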

Another interesting issue happens with vortex, where we get quite high
code size growth, 660KB->720KB.  The problem here is that vortex has
statistics code that is implemented by passing the __FILE__ and __LINE__
macros.  SPEC for some reason redefines them to NULL, 0.  This causes IPCP
to propagate those two constants into just about every function and we end
up with a lot of cloning.  The new operand removal code helps code size
quite a lot (originally we needed 800KB), so I think it is within bounds
for -O3.  Argument skipping also makes propagation actually help
performance here (1400->1460).
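
As a rough illustration of the vortex pattern (again a hand-written
sketch with hypothetical names record, lookup and lookup_clone, not
vortex code and not compiler output), this is the shape of a function
before and after the two constant arguments are propagated and removed:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical statistics hook; all names here are illustrative.  */
static int calls_recorded;

static void record(const char *file, int line)
{
    assert(line >= 0);
    /* With file == NULL and line == 0 known, this test folds away.  */
    if (file != NULL && line != 0)
        calls_recorded++;
}

/* Original shape: every caller passes the two bookkeeping arguments.  */
int lookup(int key, const char *file, int line)
{
    record(file, line);
    return key * 2;
}

/* Stand-in for the clone after propagating (NULL, 0) and dropping the
   two parameters that became constant.  */
static int lookup_clone(int key)
{
    return key * 2;
}

/* A call site as it looks once the macros expand to NULL, 0 ...  */
int caller_before(int key) { return lookup(key, NULL, 0); }

/* ... and after being redirected to the clone.  */
int caller_after(int key) { return lookup_clone(key); }
```

Each clone is small, but since nearly every function takes these two
arguments we get a clone of nearly every function, hence the growth.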

So my current plan is to wait for tomorrow's results of the failed
benchmarks, also get IA-64 results, and enable IPCP by default by
tomorrow.  I also hope that the argument skipping code will get reviewed
before stage1 so we don't get unnecessary code growth on VRP, or we can
throttle down cloning even further for the release.

Honza

> Hi,
> Since most of the issues with IPCP should be fixed now and it should be as
> strong as possible with the elementary textbook quality algorithm it
> uses, I would like to enable it by default.  I've tested it on SPEC and
> C++ benchmarks yesterday and didn't measure any significant improvements.
> There is quite a lot of cloning happening now (as can be seen in the size
> increase) on SPECint, but the benchmark performance doesn't care much.
> Today I fixed some issues and added code to avoid code size growth, so
> I expect IPCP to be mostly neutral.  Will re-run the tests tonight.
> 
> To some degree I would say it is expected, as all those codebases are quite
> well hand optimized and this is the kind of optimization one does by hand
> if needed.
> 
> IPCP can now run in two modes: in pure constant propagation, when cloning is
> not happening (well, in fact it does: we clone the function and then remove the
> original, as in-place replacement is not implemented (yet)), and with cloning.
> With cloning, overall unit growth is limited to 10% and IPCP performs a very
> simplistic analysis of effectiveness and will clone functions in priority
> order until the limit is met.
> 
> On the cc1 binary I get the following results:
>                    stripped size   cloned functions
> no ipcp                  8863256                  0
> ipcp only                8773432                 45
> ipcp & cloning           8772344                154
> (with unlimited cloning we get about 180 clones)
> 
> Additionally, the ipcp and ipcp & cloning binaries seem consistently
> 0.7%-1.5% faster on compiling C objects.
> 
> Since IPCP seems essentially free for compile time (i.e. intraprocedural
> analysis is performed for inlining anyway and the interprocedural step is very
> cheap when nothing is transformed) and causes 1% code size savings on cc1
> and speeds it up enough to pay back, I would like to propose:
>   - enable IPCP for -O2 and -Os
>   - enable IPCP cloning for -O3
> 
> IPCP also carries basic IPA infrastructure we want to keep exercised
> (jump function analysis, cloning, propagation to function bodies and
> infrastructure to solve cgraph updates for future whopr mode).
> 
> I would still like to work on a better IPCP cost model (i.e. estimates of how
> much a function will simplify with constant propagation) and also to allow
> producing multiple clones when a function is called with multiple constant
> arguments.
> 
> If there are no complaints, I will enable IPCP as proposed after the remaining
> patches are tested and committed (that would be about the day after tomorrow).
> Honza
