http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35545

--- Comment #9 from Jan Hubicka <hubicka at ucw dot cz> ---
> It's not a matter of cost model, but if propagating the values to their uses. 
> I haven't looked closely at the tracer, but wouldn't it benefit by having
> constants in particular propagated to their uses?

Tracer depends on the usual estimate_num_insns limits
 (it is 12 years since I wrote it, so what I recall)
It collects BBs that are interesting starts of traces, takes them in priority
order and duplicates from the seeds until code growth is met or it runs out of
interesting candidates by other criteria.

I think it generally tends to starve on candidates as the definition of trace
is
relatively strong, but I am not 100% sure on it. So it is not that much
dependent
on bounds given by code size metric.

If we had unlimited time, it would  be better to propagate constants and
cleanup
both before and after tracer.

If we can chose whether we want to do tracer before last pass that is able
to propagate and fold constants or after, I would chose before for the reason
I mentioned on begginig; the whole point of the tail duplication is to simplify
CFG and allow better propagation.

I think missed tracing here and there is less painful than missed optimizatoin
in duplicated code.

We may even consider pushing tracer before DOM, since tail duplication may
enable
DOM to produce more useful threading/propagation and code after tracer is not
too painfuly obstructated. Sure you can end up with PHI that has only one
constant
argument.  I can see that DOM may miss optimization here.
> Propagating the constant for x' in BBm and eliminating the degenerate is what
> the phi-only cprop pass does.  If the tracer generates similar things, then
> running phi-only cprop after it might be useful as well.  It *should* be very
> fast.

Yes, tracer does similar things.  You can think about it as about speculative
jump threading - if one path through meet points seems more likely than the
other based on profile, tracer will duplicate it in a hope that later
optimization pass will prove some of conditionals constant over the duplicated
path.  For that it needs subsequent propagation pass (CCP or better VRP) to
match.  That is why its current place in pass queue is unlucky.  Possible
benefits of tail duplications are of course not limited to threading.

We can do one extra cleanup pass, too.  Tracer is on by default only with
-fprofile-use so extra phi-only cprop with -ftracer probably is not dangerous
to overall compile time experience.

Honza

Reply via email to