On Fri, Sep 09, 2016 at 09:26:39AM -0600, Jeff Law wrote: > >>I think one of the questions (and I haven't looked through the whole > >>thread yet to see if it's answered) is why the basic shrink-wrapping > >>algorithm can't be applied to each of the prologue components -- though > >>you may have answered it with your loop example below. > > > >You get code size explosion with the "basic" algorithm. And a lot of > >that isn't really worth it: avoiding the whole prologue/epilogue is > >usually worth copying some blocks for, but avoiding just a single register > >save/restore pair? More doubtful. > Understood. We have similar balancing problems elsewhere. How much > duplication should we allow to expose a CSE or eliminate a conditional > jump on a path through the CFG. > > So I think sticking with this as a design decision makes sense -- does > it impact the issue around running a particular component's prologue > more than once?
I don't follow, sorry; could you rephrase? > >>Thanks. I'd been wondering about when it'd be useful to push prologue > >>code into a loop nest when I saw the patches fly by, but didn't think > >>about it too much. I haven't looked at the shrink-wrapping literature > >>in years, but I don't recall it having any concept that there were cases > >>where sinking into a loop nest was profitable. > > > >It isn't common, but it does happen. If you use a proper cost metric > >based on executiuon frequency with some algorithm then that algorithm will > >naturally avoid placing *logues into loops. > Right and the cases where it's placing them into loops it's going to be > doing so on paths that are unlikely executed within the loop. ISTM that > using frequencies is a significant step forward over the older placement > algorithms in this space (which IIRC were structured as simple dataflow > solvers) -- with the added advantage that if we have profiling data we > can make better decisions. Using the known/guessed execution frequencies is not really new, just not forty years old :-) > Does this impact the compile time computation complexity issue that was > raised elsewhere? I'm not sure what you mean here either, sorry. It is all O(NM) with N the number of BBs and M the number of components (and assuming dom lookups are constant time, as usual). Segher