On 5/8/19 6:20 AM, Richard Biener wrote: > On Wed, 8 May 2019, Jiufu Guo wrote: > >> Hi, >> >> Thanks Richard, Segher, Andrew and all. >> >> Segher Boessenkool <seg...@kernel.crashing.org> writes: >> >>> Let me try to answer some of this... >>> >>> On Tue, May 07, 2019 at 03:31:27PM +0200, Richard Biener wrote: >>>> On Mon, 6 May 2019, Jiufu Guo wrote: >>>>> This patch implements the optimization in PR77820. The optimization >>>>> eliminates phi and phi's basic block, if the phi is used only by >>>>> condition branch, and the phi's incoming value in the result of a >>>>> CMP result. >>> >>>> I'm not sure I like a new pass here ;) The transform is basically >>>> tail-duplicating the PHI block because the exit conditional can >>>> be "simplfied" - that's something jump threading generally does >>>> though it relies on "simplified" being a little bit more simplified >>>> than above. >>> >> Adding a new pass is just because it may be relatively easy to extend >> and maintain. >> >> Adding this micor-optimization into other passes is also a good >> sugguestion. >> >> - Similar with jump threading, this new transform is replacing jump >> old destination with new destination. While for this case, the new >> destination can not be determined at compile time. >> >> - And as Andrew said: forwprop/backprop are similar, but this case is in >> opposite: it is spread common code(phi+gcond) into different preds. >> >> - And there is phiopt pass which focus on making 'phi' stmt better. >> it also do some similar thing: >> bb0: >> if (a != b) goto bb2; else goto bb1; >> bb1: >> bb2: >> x = PHI <a (bb1), b (bb0), ...>; >> >> tranform to: >> >> bb0: >> bb2: >> x = PHI <b (bb0), ...>; >> >> If I avoid to add new pass, any suggustions about which exiting pass >> (jumpthreading/forwprop/phiopt/...) would be more suitable to extend to >> support this new transform? > > The main thing the transform does is tail-duplicate the PHI block, > if we'd just do that followup transforms would do the rest. One might even claim this is really a transformation for cfg cleanups. If we ignore the PHI what we have is a unconditional jump to a conditional jump. The obvious way to optimize that is to replace the unconditional jump with a copy of the conditional jump.
>> Currently this pass handles the case if there is only one PHI and the >> phi is used by gcond. If there are other suitable cases, we can just >> extend this tranform. > > I was just thinking of a small amount of unrelated stmts in those > blocks or the case where just one PHI argument is fed by a conditional > thus only one path would simplify (but maybe the hot one if you look > at edge probabilities). Perhaps, but do these happen enough in practice? With the code motions we do I'd be a bit surprised. > >>> >>>> Any specific reason for the pass placement between PRE and sinking? >>>> tracer and path splitting run much later, jump threading runs >>>> all over the place. >>> >>> Dunno. Jiufu, does the pass placement matter much? >> This pass would just need be inserted after ssa, and before the last dce >> and forwprop which could help to eliminate temp conversion from bool to >> int: >> _1 = a_8(D) < b_9(D); >> t_14 = (int) _1; >> if (_1 != 0) >> >> Putting it after PRE is just because some case is better be handled by PREx, >> like below case: if (x) t = a < b; else t = a < b; if (t) xx. >> While actually, this pass is ok to insert at other placement. > > OK, so all of the suggestions above would work since we have a late > forwprop and DCE pass to clean up. I'd expect this transformation to be useful for jump threading, VRP, DOM, etc. > > Btw, I wonder if on RTL basic-block reordering (which also does > some tail duplication) could be a place to do such transform? > Or is it too late to do the desired cleanups after that? > Possibly since we're after RA.IIRC we've had code in jump.c to do this in the > past. It may have morphed through the years, but I'm pretty sure we've done this kind of thing on a low level before. jeff