On Thu, Oct 17, 2013 at 9:32 PM, Jeff Law <l...@redhat.com> wrote:
> On 10/17/13 04:41, Richard Biener wrote:
>>>
>>> I don't see this as the major benefit of type demotion.  Yes, there is some
>>> value in shrinking constants and the like, but in my experience the benefits
>>> are relatively small and often get lost in things like partial register
>>> stalls on x86, the PA and probably others (yes, the PA has partial register
>>> stalls, it's just that nobody used that term).
>>>
>>> What I really want to get at here is avoiding having a large number of
>>> optimizers looking back through the use-def chains and attempting to elide
>>> typecasts in the middle of a chain of statements of interest.
>>
>>
>> Hmm, off the top of my head only forwprop and VRP look back through
>> use-def chains to elide typecasts.  And they do that to optimize those
>> casts, thus it is their job ...?  Other cases are around, but those
>> are of the sorts of "is op1 available in type X and/or can I safely cast
>> it to type X?" that code isn't going to be simplified by generic
>> promotion / demotion because that code isn't going to know what
>> type pass Y in the end wants.
>
> I strongly suspect if we were to look hard at why various optimizations
> weren't being applied in cases where intuitively we think they should, we'd
> find that type conversions are often the culprit.
>
> And so we'd go off fixing the vectorizer, DOM, and god knows what else to
> start looking through the type conversions.  I want to stop this before it
> starts.
>
> I'm *certain* that to do this well, we're going to need a mess of additional
> cases in tree-ssa-forwprop.c based on my prior investigations.  A large part
> of the reason I stopped with that work was I could already see the code was
> ultimately going to be an utter mess.
>
>
>
>> Abstracting functions that can answer those questions instead of
>> repeating N variants of it would of course be nice.
>
> Or we can move the type conversions out of the way so they don't impact our
> optimizers.

You can't move type conversions "out of the way" in most cases as
GIMPLE is strongly typed and data sources and sinks can obviously
not be "promoted" (nor can function arguments).  So you'll very likely
not be able to remove the code from the optimizers, it will only
maybe trigger less often.

>> Likewise reducing the number of places we perform promotion / demotion
>> (remove it from frontend code and fold, add it in the GIMPLE combiner).
>>
>> Also making the GIMPLE combiner available as a utility to apply
>> to a single statement (see my very original GIMPLE-fold proposal)
>> would be very useful.
>
> I strongly believe the gimple combiner is not the place to handle
> promotion/demotion based on already working through some of these issues
> privately.  It was that investigative work which led me to look more closely
> at what Kai was doing with the promotion/demotion work.
>
>
>
>>
>> As for promotion / demotion (if you are not talking about applying
>> PROMOTE_MODE which rather forces promotion of variables and
>> requires inserting compensation code), you want to optimize
>>
>>   op1 = (T) op1';
>>   op2 = (T) op2';
>>   x = op1 OP op2; (*)
>>   y = (T2) x;
>>
>> to either carry out OP in type T2 or in a type derived from the types
>> of op1' and op2'.
>
> That's part of the benefit, but you also want to be able to look at where
> op1' and op2' came from and possibly do something even more significant than
> just changing the type of OP.  Getting the casts out of the way makes that a
> lot easier.  And that's one of the reasons why you want both promotion and
> demotion, both expose those kinds of opportunities.
>
>
>>
>> For the simple case combine-like pattern matching is ok.  It gets
>> more complicated if there are a series of statements here (*), but
>> even that case is handled by iteratively applying the combiner
>> patterns (which forwprop does).
>
> Right, but you're still missing the point that every time a type conversion
> appears in a stream of interesting statements that you have to special case
> the optimization to deal with the type conversions.
>
> With Kai's work that special casing goes away and thus our existing
> reassociation & forwprop passes do a better job without needing a ton of
> special cases.

See above - you can't remove the special casing.

>> If you split out promotion / demotion into a separate pass then
>> you introduce pass ordering issues as combining may introduce
>> promotion / demotion opportunities and the other way around.
>
> Right, which is why you promote, optimize, demote, optimize.  Both promotion
> and demotion have the potential to expose optimizable sequences.
>
> It's not perfect, but it's a hell of a lot better than what we do now.

I'm not sure ;)  Keep an eye on compile-time.

>> If we remove the ad-hoc frontend code and strip down fold then an
>> early combine phase (before CSE wrecks single-use cases) will
>> more reliably handle what frontends and fold do.  Conveniently the first
>> forwprop is already placed very early.
>
> But again, you're burdening every transformation in forwprop with being
> aware that there may be type conversions mid-stream and having to deal with
> them.  So consider a slightly different approach where we promote, run
> forwprop, demote, run forwprop, all before PRE/DOM, etc. wreck the single use
> cases.

Fact is that conversions mid-stream cannot simply be ignored.  If we
can remove them, then a combiner pattern can possibly remove them,
which will make the transform that only works without them trigger
subsequently.

The proposed patch doesn't add a single testcase, nor does it remove
any special code from other optimizations, so it is hard to see what
it tries to enable that doesn't already work.

>>> As far as dealing with the target dependencies, there's no clear "this is
>>> best".
>>> I vaguely recall discussions with Kai where we decided that handling
>>> PROMOTE_MODE was relatively easy from a coding standpoint -- it's more a
>>> matter of where does that fit into the entire optimization pipeline.  I
>>> could make arguments either way.
>>
>>
>> One thing is honoring PROMOTE_MODE for deciding what types
>> to promote/demote to, another thing is applying PROMOTE_MODE
>> at some point during GIMPLE optimizations with the goal to remove
>> its handling from RTL expansion (I'd really like to move most of
>> RTL expansion's side-effects such as PROMOTE_MODE or
>> strict-align bitfield memory stuff to GIMPLE).
>
> Can we please deal with PROMOTE_MODE independently from Kai's initial work?
> Kai's work may make it easier to implement what you want, but Kai's work has
> significant value independently of using it to reimplement PROMOTE_MODE in a
> better place in the pipeline.

I think it is related in a way because PROMOTE_MODE has the issue
that it introduces tons of unnecessary casts if done naively.  So
the pass, if it works properly, has to show that if we apply PROMOTE_MODE
as a "cost model" it will remove most of the unnecessary sign-/zero-extensions
(and you'll quickly find out that with strongly typed GIMPLE this
gets interesting).

Richard.

> jeff
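
To make the op1/op2 pattern quoted above concrete, here is a minimal C
sketch.  It is not taken from the patch or from GCC; it assumes T is
uint64_t, T2 and the source operands are uint32_t, and that int is no
wider than 32 bits.  All function names are made up for illustration.
For OP = '+' the truncated result only depends on the low 32 bits, so the
operation can be carried out in the narrow type.  Once a right shift sits
between the addition and the final cast, the carry bit matters and the
narrow form gives a different answer, which is one way to see why
mid-stream conversions cannot simply be ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Wide form: op1 = (T) a; op2 = (T) b; x = op1 + op2; y = (T2) x;  */
uint32_t add_wide(uint32_t a, uint32_t b)
{
    uint64_t op1 = (uint64_t)a;
    uint64_t op2 = (uint64_t)b;
    uint64_t x = op1 + op2;
    return (uint32_t)x;
}

/* Demoted form: the addition is done directly in the 32-bit type and
   wraps modulo 2^32, which matches truncating the 64-bit sum.  */
uint32_t add_narrow(uint32_t a, uint32_t b)
{
    return a + b;
}

/* Wide form with a shift after the addition: bit 32 of the sum is
   shifted down into the result before truncation.  */
uint32_t half_sum_wide(uint32_t a, uint32_t b)
{
    uint64_t x = (uint64_t)a + (uint64_t)b;
    return (uint32_t)(x >> 1);
}

/* Naive demotion of the same sequence: the carry out of bit 31 is lost
   before the shift, so this is NOT equivalent to half_sum_wide.  */
uint32_t half_sum_narrow(uint32_t a, uint32_t b)
{
    return (a + b) >> 1;
}

/* Spot checks for the two claims in the comments above.  */
void check_examples(void)
{
    /* Addition: wide and narrow agree even when the sum overflows.  */
    assert(add_wide(0xFFFFFFFFu, 1u) == add_narrow(0xFFFFFFFFu, 1u));
    /* Add-then-shift: the two forms disagree on overflow.  */
    assert(half_sum_wide(0x80000000u, 0x80000000u)
           != half_sum_narrow(0x80000000u, 0x80000000u));
}
```

Calling check_examples() from a small main() exercises both cases.  A
demotion pass would have to prove the first kind of equivalence per
operation; the second pair is the kind of sequence it must leave alone.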