http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35545
--- Comment #9 from Jan Hubicka <hubicka at ucw dot cz> ---
> It's not a matter of cost model, but if propagating the values to their uses.
> I haven't looked closely at the tracer, but wouldn't it benefit by having
> constants in particular propagated to their uses?

Tracer depends on the usual estimate_num_insns limits (it is 12 years since I wrote it, so this is what I recall). It collects BBs that are interesting starts of traces, takes them in priority order and duplicates from the seeds until the code growth limit is met or it runs out of interesting candidates by other criteria. I think it generally tends to starve on candidates, as the definition of a trace is relatively strong, but I am not 100% sure of it. So it is not that much dependent on the bounds given by the code size metric.

If we had unlimited time, it would be better to propagate constants and clean up both before and after tracer. If we can choose whether to run tracer before or after the last pass that is able to propagate and fold constants, I would choose before, for the reason I mentioned at the beginning: the whole point of tail duplication is to simplify the CFG and allow better propagation. I think missed tracing here and there is less painful than missed optimization in duplicated code.

We may even consider pushing tracer before DOM, since tail duplication may enable DOM to produce more useful threading/propagation, and code after tracer is not too painfully obfuscated.

Sure, you can end up with a PHI that has only one constant argument. I can see that DOM may miss the optimization here.

> Propagating the constant for x' in BBm and eliminating the degenerate is what
> the phi-only cprop pass does. If the tracer generates similar things, then
> running phi-only cprop after it might be useful as well. It *should* be very
> fast.

Yes, tracer does similar things.
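To make the degenerate PHI case concrete, here is a rough source-level sketch of what tail duplication does to a meet point. It is only an illustration: the function names, the `likely` flag and the constants are invented for this example, and the real pass works on the compiler's CFG rather than on C source.

```c
/* Before duplication: both arms meet, so at the test x is PHI <1, y>
   and the comparison cannot be folded.  */
int before_dup(int likely, int y)
{
    int x;

    if (likely)
        x = 1;          /* assumed to be the hot arm in the profile */
    else
        x = y;

    /* Meet point: x = PHI <1, y>.  */
    if (x == 1)
        return 10;
    return 2 * x;
}

/* After duplication: the tail is copied into the hot arm.  In that copy
   the PHI has a single constant argument, so a later propagation pass
   can fold the test and drop the dead return.  */
int after_dup(int likely, int y)
{
    if (likely) {
        int x = 1;      /* degenerate PHI <1> in the duplicated tail */
        if (x == 1)     /* always true once the constant is propagated */
            return 10;
        return 2 * x;   /* unreachable after folding */
    } else {
        int x = y;      /* original tail, still needs the runtime test */
        if (x == 1)
            return 10;
        return 2 * x;
    }
}
```

Both functions return the same value for every input; the second form just exposes the constant on the hot path, and that only pays off if a propagation pass runs afterwards.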
You can think of it as speculative jump threading: if one path through a meet point seems more likely than the other based on the profile, tracer will duplicate it in the hope that a later optimization pass will prove some of the conditionals constant over the duplicated path. For that it needs a subsequent propagation pass (CCP or, better, VRP) to match. That is why its current place in the pass queue is unlucky. Possible benefits of tail duplication are of course not limited to threading.

We can do one extra cleanup pass, too. Tracer is on by default only with -fprofile-use, so an extra phi-only cprop with -ftracer is probably not dangerous to the overall compile time experience.

Honza