On 10/16/13 03:31, Richard Biener wrote:
I see two primary effects of type sinking.
Note it was called type demotion ;)
;) It's a mental block of mine; it's been called type hoisting/sinking
in various contexts and I see parallels between the code motion
algorithms and how the type promotion/demotion exposes unnecessary type
conversions. So I keep calling them hoisting/sinking. I'll try to use
promotion/demotion.
First, and probably the most important in my mind, is that by sinking
a cast through its uses, the various transformations we already perform
are more likely to apply *without* needing to handle optimizing through
typecasts explicitly.
I would say it is desirable to express arithmetic in the smallest possible
types (see what premature optimization the C family frontends do
to narrow operations again after C integer promotion is applied).
I don't see this as the major benefit of type demotion. Yes, there is
some value in shrinking constants and the like, but in my experience the
benefits are relatively small and often get lost in things like partial
register stalls on x86, the PA and probably others (yes, the PA has
partial register stalls, it's just that nobody used that term).
What I really want to get at here is avoiding having a large number of
optimizers looking back through the use-def chains and attempting to
elide typecasts in the middle of a chain of statements of interest.
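
As a rough illustration of a typecast sitting in the middle of a chain, here is a minimal C sketch. The function names are hypothetical and nothing here is taken from GCC or from the patch; it only assumes that int is wider than unsigned char. In the first function the narrowing cast sits between the add and the subtract, so a matcher for (a + 1) - 1 has to look back through the conversion. Once the chain is demoted to unsigned char the pattern is visible directly.

```c
#include <assert.h>
#include <limits.h>

/* Shape the optimizers see today: the narrowing cast sits between the
   add and the subtract, so a pattern such as (a + 1) - 1 -> a only
   matches if the matcher looks back through the conversion.  */
unsigned char with_casts(unsigned char a)
{
    int wide = (int)a + 1;                      /* promoted add */
    unsigned char narrow = (unsigned char)wide; /* cast in mid-chain */
    return (unsigned char)(narrow - 1);         /* subtract */
}

/* Shape after demotion: conceptually the whole chain is carried out in
   unsigned char, where (a + 1) - 1 folds to a with no cast handling.  */
unsigned char demoted(unsigned char a)
{
    return a;
}

/* Exhaustively confirm that both shapes agree for every input value.  */
void check_demotion(void)
{
    for (int i = 0; i <= UCHAR_MAX; i++)
        assert(with_casts((unsigned char)i) == demoted((unsigned char)i));
}
```

Calling check_demotion() from a test driver should complete without tripping an assertion, since unsigned narrowing is modular and the wraparound at UCHAR_MAX cancels out.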
You need some kind of range information to do this, thus either integrate
it into VRP (there is already code that does this there) or use range
information from VRP which we now preserve.
If the primary goal is to shrink types, then yes, you want to use
whatever information you can, including VRP. But that's not the primary
goal in my mind, at least not at this stage.
There's no reason why this pass couldn't utilize VRP information to
provide more opportunities to demote types and achieve the goal you
want. But I'd consider that a follow-on opportunity.
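
To show why range information matters for shrinking types, here is a small C sketch under stated assumptions: unsigned char is 8 bits, the constant 27 is arbitrary, and all function names are hypothetical rather than anything in VRP. It compares an add done in int with the same add done in unsigned char, and reports whether the two agree over a given value range.

```c
#include <assert.h>

/* The add as written, carried out in int.  */
int add_wide(int x)
{
    return x + 27;
}

/* The same add after a hypothetical demotion to unsigned char.  */
int add_narrow(int x)
{
    unsigned char n = (unsigned char)x;
    return (unsigned char)(n + 27);
}

/* Return 1 if the narrow form matches the wide form for every x in
   [lo, hi], i.e. if knowing that range would make the demotion safe.  */
int range_allows_demotion(int lo, int hi)
{
    for (int x = lo; x <= hi; x++)
        if (add_wide(x) != add_narrow(x))
            return 0;
    return 1;
}

/* Usage: a known range of [0, 100] permits the demotion, while an
   unconstrained range such as [0, 300] does not.  */
void check_ranges(void)
{
    assert(range_allows_demotion(0, 100) == 1);
    assert(range_allows_demotion(0, 300) == 0);
}
```

With 8-bit chars the narrow add first wraps at x = 229 (229 + 27 = 256), which is why a range ending at 100 is safe and one ending at 300 is not.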
The second primary effect is, given two casts where the first indirectly
feeds the second (i.e., the first feeds some statement, which then feeds the
second cast), if we're able to sink the first cast, we end up with the first
cast directly feeding the second cast. When this occurs one of the two
casts can often be eliminated. Sadly, I didn't keep any of those test
files, but I regularly saw them in GCC bootstraps.
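
Since the original test files were not kept, the following is only a hand-written C sketch of the shape being described, with hypothetical function names and an arbitrary mask as the intervening statement. It shows the first cast feeding a statement that feeds the second cast, then the same code with the first cast sunk below that statement so the two casts are adjacent, and finally the form with the redundant cast pair removed.

```c
#include <assert.h>
#include <limits.h>

/* Original shape: the first cast feeds the mask, and the mask feeds
   the second cast.  */
signed char original(signed char c)
{
    int wide = (int)c;           /* first cast */
    int masked = wide & 0x7f;    /* intervening statement */
    return (signed char)masked;  /* second cast */
}

/* After sinking the first cast below the mask: the mask is conceptually
   a narrow operation and the first cast now feeds the second directly.  */
signed char cast_sunk(signed char c)
{
    signed char masked = (signed char)(c & 0x7f);
    int wide = (int)masked;      /* first cast, now adjacent ... */
    return (signed char)wide;    /* ... to the second cast */
}

/* A widening cast immediately followed by a narrowing cast back to the
   same type is a no-op, so the pair can be dropped.  */
signed char casts_removed(signed char c)
{
    return (signed char)(c & 0x7f);
}

/* Exhaustively confirm that all three shapes agree.  */
void check_cast_sinking(void)
{
    for (int i = SCHAR_MIN; i <= SCHAR_MAX; i++) {
        signed char c = (signed char)i;
        assert(original(c) == cast_sunk(c));
        assert(cast_sunk(c) == casts_removed(c));
    }
}
```

Note that C itself promotes the operands of & to int, so the "narrow" mask above is only a stand-in for what a narrow operation would look like in GIMPLE.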
This transformation is applied both by fold-const.c and by SSA forwprop
(our GIMPLE combiner). Doing it in yet another pass looks wrong
(and it isn't type demotion but also can be promotion).
Yes, I know. And we need to get this back down to a single
implementation. I don't much care which of the 3 implementations we
keep, but it really should just be one and it needs to be reusable.
I probably should have stated this differently -- the second primary
effect is to expose more cases where type conversions can be eliminated
via type promotion/demotion. I don't much care which of the 3 blobs of
code to eliminate the conversions we use -- I do care that we've got a
consistent way to promote/demote conversions to expose the unnecessary
type conversions.
In contrast to the desire of expressing operations in the smallest required
type there is the desire of exposing the effect of PROMOTE_MODE on
GIMPLE instead of only during RTL expansion. This is because the
truncations (sext and zext) PROMOTE_MODE introduced are
easier to optimize away when range information is available (see the
attempts to address this at RTL expansion time from Kugan from Linaro).
Right. I'm aware of this work and the problem he's trying to solve and
have been loosely watching it -- primarily for the persistent VRP
information.
Similarly, I know there's a type hoisting patch that's also queued up. I
think it should be handled separately as well.
I think we need to paint a picture of the final result - what is the
main objective of the various(?!) passes in question? Where do
we do the same kind of transformation already?
I thought we'd done this at a high level already. The heart of this
work is to:
  1. Isolate, to the fullest extent possible, code which promotes and
     demotes types. We have this stuff all over the place right now
     and it's very ad-hoc.

  2. Promote/demote types to allow our optimizers to not concern
     themselves with walking back through type conversions when applying
     optimizations.

  3. Promote/demote types to expose unnecessary type conversions.

If we look at #2 and #3 we can expect that we'd want a structure which
allows for a simplification/optimization step to occur after types are
promoted or demoted, i.e., a pipeline that looks like:

    promote types -> optimize1 -> demote types -> optimize2

Now where that little mini pipeline lands is still a big question to me.
optimize1 may be a fairly significant hunk of our pipeline. optimize2
probably isn't (may just be a final tree-ssa-forwprop pass).
We have no pass that tries to promote or demote the types of
variables using a data-flow approach (VRP comes closest,
but the transform is again pattern-matching, thus combine-like).
I do not object to adding this kind of pass, but I suggest
looking at the target's desires when implementing it - which eventually
means to honor PROMOTE_MODE (be careful about pass
placement here - you want this after loop optimizations like
vectorization but possibly before induction variable optimization).
Placement is one of the biggest questions in my mind. If I think about
something like the old SGI compiler, they did a very early promotion,
then lowered/demoted and got reasonable results with it.
As far as dealing with the target dependencies, there's no clear "this
is best". I vaguely recall discussions with Kai where we decided that
handling PROMOTE_MODE was relatively easy from a coding standpoint --
it's more a matter of where does that fit into the entire optimization
pipeline. I could make arguments either way.
Jeff