On 18 October 2013 17:53, Richard Biener <richard.guent...@gmail.com> wrote: > On Fri, Oct 18, 2013 at 11:21 AM, Zhenqiang Chen > <zhenqiang.c...@linaro.org> wrote: >> On 18 October 2013 00:58, Jeff Law <l...@redhat.com> wrote: >>> On 10/17/13 05:03, Richard Biener wrote: >>>>>> >>>>>> Is it OK for trunk? >>>>> >>>>> >>>>> I had a much simpler change which did basically the same from 4.7 (I >>>>> can update it if people think this is a better approach). >>>> >>>> >>>> I like that more (note you can now use is_gimple_condexpr as predicate >>>> for force_gimple_operand). >>> >>> The obvious question is whether or not Andrew's simpler change picks up as >>> many transformations as Zhenqiang's change. If not are the things missed >>> important. >>> >>> Zhenqiang, can you do some testing of your change vs Andrew P.'s change? >> >> Here is a rough compare: >> >> 1) Andrew P.'s change can not handle ssa-ifcombine-ccmp-3.c (included >> in my patch). Root cause is that it does not skip "LABEL". The guard >> to do this opt should be the same the bb_has_overhead_p in my patch. > > I think we want a "proper" predicate in tree-cfg.c for this, like maybe > a subset of tree_forwarder_block_p or whatever it will end up looking > like (we need "effectively empty BB" elsewhere, for example in vectorization, > add a flag to allow a condition ending the BB and the predicate is done). > >> 2) Andrew P.'s change always generate TRUTH_AND_EXPR, which is not >> efficient for "||". e.g. For ssa-ifcombine-ccmp-6.c, it will generate >> >> _3 = a_2(D) > 0; >> _5 = b_4(D) > 0; >> _6 = _3 | _5; >> _9 = c_7(D) <= 0; >> _10 = ~_6; >> _11 = _9 & _10; >> if (_11 == 0) >> >> With my patch, it will generate >> >> _3 = a_2(D) > 0; >> _5 = b_4(D) > 0; >> _6 = _3 | _5; >> _9 = c_7(D) > 0; >> _10 = _6 | _9; >> if (_10 != 0) > > But that seems like a missed simplification in predicate combining > which should be fixed more generally. > >> 3) The good thing of Andrew P.'s change is that "Inverse the order of >> the basic block walk" so it can do combine recursively. >> >> But I think we need some heuristic to control the number of ifs. Move >> too much compares from >> the inner_bb to outer_bb is not good. > > True, but that's what fold-const.c does, no?
Based on current fold-const, we can not generate more than "two" compares in a basic block. But if ifs can be combined recursively in ifcombine, we might generate many compares in a basic block. >> 4) Another good thing of Andrew P.'s change is that it reuses some >> existing functions. So it looks much simple. > > Indeed - that's what I like about it. > >>>> >>>> With that we should be able to kill the fold-const.c transform? >>> >>> That would certainly be nice and an excellent follow-up for Zhenqiang. >> >> That's my final goal to "kill the fold-const.c transform". I think we >> may combine the two changes to make a "simple" and "good" patch. > > Thanks, > Richard. > >> Thanks! >> -Zhenqiang