On 10/17/13 04:41, Richard Biener wrote:
I don't see this as the major benefit of type demotion.  Yes, there is some
value in shrinking constants and the like, but in my experience the benefits
are relatively small and often get lost in things like partial register
stalls on x86, the PA and probably others (yes, the PA has partial register
stalls, it's just that nobody used that term).

What I really want to get at here is avoiding having a large number of
optimizers looking back through the use-def chains and attempting to elide
typecasts in the middle of a chain of statements of interest.

Hmm, off the top of my head only forwprop and VRP look back through
use-def chains to elide typecasts.  And they do that to optimize those
casts, thus it is their job ...?  Other cases are around, but those
are of the sorts of "is op1 available in type X and/or can I safely cast
it to type X?" that code isn't going to be simplified by generic
promotion / demotion because that code isn't going to know what
type pass Y in the end wants.
I strongly suspect if we were to look hard at why various optimizations weren't being applied in cases where intuitively we think they should, we'd find that type conversions are often the culprit.

And so we'd go off fixing the vectorizer, DOM, and god knows what else to start looking through the type conversions. I want to stop this before it starts.

Based on my prior investigations, I'm *certain* that to do this well we're going to need a mess of additional cases in tree-ssa-forwprop.c. A large part of the reason I stopped that work was that I could already see the code was ultimately going to be an utter mess.


Abstracting functions that can answer those questions instead of
repeating N variants of it would of course be nice.
Or we can move the type conversions out of the way so they don't impact our optimizers.




Likewise reducing the number of places we perform promotion / demotion
(remove it from frontend code and fold, add it in the GIMPLE combiner).

Also making the GIMPLE combiner available as a utility to apply
to a single statement (see my very original GIMPLE-fold proposal)
would be very useful.
Having already worked through some of these issues privately, I strongly believe the GIMPLE combiner is not the place to handle promotion/demotion. It was that investigative work which led me to look more closely at what Kai was doing with the promotion/demotion work.



As for promotion / demotion (if you are not talking about applying
PROMOTE_MODE which rather forces promotion of variables and
requires inserting compensation code), you want to optimize

  op1 = (T) op1';
  op2 = (T) op2';
  x = op1 OP op2; (*)
  y = (T2) x;

to either carry out OP in type T2 or in a type derived from the types
of op1' and op2'.
That's part of the benefit, but you also want to be able to look at where op1' and op2' came from and possibly do something even more significant than just changing the type of OP. Getting the casts out of the way makes that a lot easier. And that's one of the reasons why you want both promotion and demotion: both expose those kinds of opportunities.
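
To make the quoted pattern concrete, here is a rough C-level sketch (not GCC code). It assumes T is uint64_t, T2 is uint32_t, OP is addition, and int is no wider than 32 bits; the function names are invented for illustration. For unsigned addition the low bits of the result do not depend on the high bits of the operands, so OP can be carried out directly in T2.

```c
#include <assert.h>
#include <stdint.h>

/* The sequence as written: widen both operands to T (uint64_t),
   apply OP there, then truncate the result to T2 (uint32_t). */
uint32_t add_wide(uint32_t op1_orig, uint32_t op2_orig)
{
    uint64_t op1 = (uint64_t) op1_orig;
    uint64_t op2 = (uint64_t) op2_orig;
    uint64_t x = op1 + op2;
    return (uint32_t) x;
}

/* The demoted form: OP is carried out directly in T2, with no casts.
   Unsigned arithmetic wraps modulo 2^32, matching the truncation above. */
uint32_t add_narrow(uint32_t op1_orig, uint32_t op2_orig)
{
    return op1_orig + op2_orig;
}

/* Hypothetical helper: both forms must agree for a given pair. */
void check_pair(uint32_t a, uint32_t b)
{
    assert(add_wide(a, b) == add_narrow(a, b));
}
```

Calling check_pair on values that overflow 32 bits (for example 0xFFFFFFFF and 1) exercises the interesting case. The same argument does not hold for every OP (division and right shifts depend on the high bits), which is part of why deciding when demotion is safe takes care.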


For the simple case combine-like pattern matching is ok.  It gets
more complicated if there are a series of statements here (*), but
even that case is handled by iteratively applying the combiner
patterns (which forwprop does).
Right, but you're still missing the point: every time a type conversion appears in a stream of interesting statements, you have to special-case the optimization to deal with it.

With Kai's work that special casing goes away and thus our existing reassociation & forwprop passes do a better job without needing a ton of special cases.
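
As a hedged illustration of the special casing described above (a sketch with invented names, not taken from Kai's patches, and assuming int is no wider than 32 bits): a simplification such as "(a + b) - b -> a" is easy to spot when both statements are in one type, but with widening and narrowing casts between them a pattern matcher has to look through the conversions to see it.

```c
#include <assert.h>
#include <stdint.h>

/* Casts mid-stream: the add and the subtract are separated by a
   truncation and a re-widening, so a plain "(a + b) - b" pattern
   does not match without looking through the conversions. */
uint32_t with_casts(uint32_t a, uint32_t b)
{
    uint64_t t = (uint64_t) a + (uint64_t) b;
    uint32_t u = (uint32_t) t;
    uint64_t v = (uint64_t) u - (uint64_t) b;
    return (uint32_t) v;
}

/* After demotion everything is in uint32_t, and the add and the
   subtract are adjacent statements in a single type. */
uint32_t demoted(uint32_t a, uint32_t b)
{
    uint32_t u = a + b;
    return u - b;
}

/* Hypothetical helper: both forms reduce to plain "a". */
void check_roundtrip(uint32_t a, uint32_t b)
{
    assert(with_casts(a, b) == a);
    assert(demoted(a, b) == a);
}
```

Both functions return a for every input, since unsigned arithmetic wraps modulo 2^32. Only the second form presents the simplification to an optimizer without any knowledge of type conversions.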


If you split out promotion / demotion into a separate pass then
you introduce pass ordering issues as combining may introduce
promotion / demotion opportunities and the other way around.
Right, which is why you promote, optimize, demote, optimize. Both promotion and demotion have the potential to expose optimizable sequences.

It's not perfect, but it's a hell of a lot better than what we do now.


If we remove the ad-hoc frontend code and strip down fold then an
early combine phase (before CSE wrecks single-use cases) will
more reliably handle what frontends and fold do.  Conveniently the first
forwprop is already placed very early.
But again, you're burdening every transformation in forwprop with being aware that there may be type conversions mid-stream and having to deal with them. So consider a slightly different approach where we promote, run forwprop, demote, run forwprop, all before PRE, DOM, etc. wreck the single-use cases.




As far as dealing with the target dependencies, there's no clear "this is
best".  I vaguely recall discussions with Kai where we decided that handling
PROMOTE_MODE was relatively easy from a coding standpoint -- it's more a
matter of where does that fit into the entire optimization pipeline.  I
could make arguments either way.

One thing is honoring PROMOTE_MODE for deciding what types
to promote/demote to, another thing is applying PROMOTE_MODE
somewhen during GIMPLE optimizations with the goal to remove
its handling from RTL expansion (I'd really like to move most of
RTL expansions side-effects such as PROMOTE_MODE or
strict-align bitfield memory stuff to GIMPLE).
Can we please deal with PROMOTE_MODE independently from Kai's initial work? Kai's work may make it easier to implement what you want, but it has significant value independently of using it to reimplement PROMOTE_MODE in a better place in the pipeline.

jeff

