> > 
> > 
> > > On 16.11.2024 at 14:08, Jan Hubicka <hubi...@ucw.cz> wrote:
> > > 
> > > Ignore conditions guarding __builtin_unreachable in inliner metrics
> > > 
> > > This patch extends last year's attempt to make the inliner metrics ignore
> > > conditionals guarding __builtin_unreachable.  Compared to the previous
> > > patch, this one implements a "mini-dce" in ipa-fnsummary to avoid
> > > accounting for all statements that are only used to determine conditionals
> > > guarding __builtin_unreachable.  These will be removed later once value
> > > ranges are determined.
> > > 
> > > While working on this, I noticed that we have a lot of dead code while
> > > computing fnsummary for early inlining.  Those are only used to apply
> > > large-function growth, but there seems to be enough dead code to make this
> > > value kind of irrelevant.  Also, there seem to be quite a lot of const/pure
> > > calls that can be cheaply removed before we inline them.  So I wonder
> > > whether we want to run one DCE pass before early inlining.
> > 
> > I would not have expected a 'lot' of dead const function calls.  By the
> > same argument we should rather run CCP before inlining, as that tends to
> > prune most dead code early?
Just to quantify "a lot": on tramp3d there are 2263 calls declared dead
and 34206 declared live (by my mini-dce) in fnsummary1, so about 6% of
calls are dead.  Overall there are 3797 unnecessary stmts and 97263
necessary ones, so most of the dead stuff is actually calls.
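
To make the arithmetic explicit, here is a small sketch that recomputes the ratios from the counts above. It assumes the dead calls are counted among the unnecessary stmts (my reading of the numbers, not something the dump states); the variable names are made up for illustration.

```python
# Counts quoted above for tramp3d in fnsummary1.
dead_calls, live_calls = 2263, 34206
dead_stmts, live_stmts = 3797, 97263

# Share of all calls that the mini-dce declares dead (about 6.2%).
dead_call_share = dead_calls / (dead_calls + live_calls)

# Share of all statements that are unnecessary (about 3.8%).
dead_stmt_share = dead_stmts / (dead_stmts + live_stmts)

# Assumption: dead calls are a subset of the unnecessary stmts,
# so this is the fraction of dead statements that are calls (about 59.6%).
calls_among_dead = dead_calls / dead_stmts

print(f"dead calls / all calls:      {dead_call_share:.1%}")
print(f"dead stmts / all stmts:      {dead_stmt_share:.1%}")
print(f"dead calls / all dead stmts: {calls_among_dead:.1%}")
```

Under that assumption, roughly three out of five dead statements are calls, which is what "most of the dead stuff is calls" refers to.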

In IPA fnsummary there are 4 dead calls, all of them .part clones.
This looks like a missed optimization pre ipa-fnsplit.

These are the most frequent ones:
     12   skipping unnecesary stmt Evaluator<RemoteMultiPatchEvaluatorTag>::Evaluator (&evaluator);
     12   skipping unnecesary stmt Pooma::DummyMutex::lock (_1);
     16   skipping unnecesary stmt ForEach<Scalar<double>, DomainFunctorTag, DomainFunctorTag>::apply (_2, f_7(D), c_8(D));
     23   skipping unnecesary stmt operator delete (_5, _2);
     24   skipping unnecesary stmt Evaluator<RemoteMultiPatchEvaluatorTag>::~Evaluator (&evaluator);
     28   skipping unnecesary stmt ForEach<Scalar<double>, DomainFunctorTag, DomainFunctorTag>::apply (_1, f_6(D), c_7(D));
     37   skipping unnecesary stmt _37 = Field<UniformRectilinearMesh<MeshTraits<3, double, UniformRectilinearTag, CartesianTag, 3> >, double, BrickView>::engine (_9);
     37   skipping unnecesary stmt _39 = engineFunctor<Engine<3, double, BrickView>, DataObjectRequest<BlockAffinity> > (_10, &getAffinity);
     43   skipping unnecesary stmt DataObjectRequest<BlockAffinity>::DataObjectRequest (&getAffinity);
     43   skipping unnecesary stmt MultiArgEvaluator<SinglePatchEvaluatorTag>::MultiArgEvaluator (&speval);
     43   skipping unnecesary stmt Smarts::Iterate<Smarts::Stub>::hintAffinity (_8, _11);
     86   skipping unnecesary stmt MultiArgEvaluator<SinglePatchEvaluatorTag>::~MultiArgEvaluator (&speval);
    227   skipping unnecesary stmt PoomaCTAssert<true>::test ();

Mostly those are functions which are empty at release_ssa time.  Perhaps
we could special-case pure/const calls with no LHS and get rid of them
during early inlining.  This is cheaper than doing the actual inlining,
which triggers a lot of logic in tree-inline...
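
As a rough illustration of that special case, here is a toy model, not GCC code: the `Call` class, its fields and both helper functions are hypothetical stand-ins, and the flags on the example calls are made up. The `may_loop` field reflects that GCC distinguishes looping const/pure functions (ECF_LOOPING_CONST_OR_PURE), which presumably could not be dropped this way.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Call:
    """Toy stand-in for a call statement (hypothetical, not a GCC type)."""
    callee: str
    lhs: Optional[str] = None     # SSA name receiving the result, if any
    pure_or_const: bool = False   # callee is known to have no side effects
    may_loop: bool = False        # callee might not terminate


def is_trivially_dead(call: Call) -> bool:
    # A side-effect-free call whose result is unused does nothing
    # observable, provided it is guaranteed to return.
    return call.pure_or_const and call.lhs is None and not call.may_loop


def drop_trivially_dead_calls(body: List[Call]) -> List[Call]:
    # Keep every call that is not trivially dead; no inlining needed.
    return [c for c in body if not is_trivially_dead(c)]


# Example body; the flags are assumptions chosen for illustration only.
body = [
    Call("PoomaCTAssert<true>::test", pure_or_const=True),
    Call("operator delete"),
    Call("engineFunctor", lhs="_39", pure_or_const=True),
    Call("spin", pure_or_const=True, may_loop=True),
]
kept = drop_trivially_dead_calls(body)
print([c.callee for c in kept])
```

Note that the call with an LHS is kept here even if its result is never used; catching that case needs real use information, which is what the mini-dce provides.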

Honza
