On Sat, Oct 26, 2013 at 2:30 PM, Andrew Pinski <pins...@gmail.com> wrote: > On Fri, Oct 18, 2013 at 2:21 AM, Zhenqiang Chen > <zhenqiang.c...@linaro.org> wrote: >> On 18 October 2013 00:58, Jeff Law <l...@redhat.com> wrote: >>> On 10/17/13 05:03, Richard Biener wrote: >>>>>> >>>>>> Is it OK for trunk? >>>>> >>>>> >>>>> I had a much simpler change which did basically the same from 4.7 (I >>>>> can update it if people think this is a better approach). >>>> >>>> >>>> I like that more (note you can now use is_gimple_condexpr as predicate >>>> for force_gimple_operand). >>> >>> The obvious question is whether or not Andrew's simpler change picks up as >>> many transformations as Zhenqiang's change. If not are the things missed >>> important. >>> >>> Zhenqiang, can you do some testing of your change vs Andrew P.'s change? >> >> Here is a rough compare: >> >> 1) Andrew P.'s change can not handle ssa-ifcombine-ccmp-3.c (included >> in my patch). Root cause is that it does not skip "LABEL". The guard >> to do this opt should be the same the bb_has_overhead_p in my patch. > > This should be an easy change, I am working on this right now. > >> >> 2) Andrew P.'s change always generate TRUTH_AND_EXPR, which is not >> efficient for "||". e.g. For ssa-ifcombine-ccmp-6.c, it will generate >> >> _3 = a_2(D) > 0; >> _5 = b_4(D) > 0; >> _6 = _3 | _5; >> _9 = c_7(D) <= 0; >> _10 = ~_6; >> _11 = _9 & _10; >> if (_11 == 0) >> >> With my patch, it will generate >> >> _3 = a_2(D) > 0; >> _5 = b_4(D) > 0; >> _6 = _3 | _5; >> _9 = c_7(D) > 0; >> _10 = _6 | _9; >> if (_10 != 0) > > As mentioned otherwise, this seems like a missed optimization inside > forwprop. When I originally wrote this code there used to be two > cases one for & and one for |, but this was removed sometime and I > just made the code evolve with that.
Actually I can make a small patch (3 lines) to my current patch which causes tree-ssa-ifcombine.c to produce the behavior of your patch. if (result_inv) { t = fold_build1 (TRUTH_NOT_EXPR, TREE_TYPE (t), t); result_inv = false; } I will submit a new patch after some testing of my current patch which fixes items 1 and 2. Thanks, Andrew Pinski > >> >> 3) The good thing of Andrew P.'s change is that "Inverse the order of >> the basic block walk" so it can do combine recursively. >> >> But I think we need some heuristic to control the number of ifs. Move >> too much compares from >> the inner_bb to outer_bb is not good. > > > I think this depends on the target. For MIPS we don't want an upper > bound as integer comparisons go directly to GPRs. I wrote this patch > with MIPS in mind as that was the target I was working on. > >> >> 4) Another good thing of Andrew P.'s change is that it reuses some >> existing functions. So it looks much simple. > > > Thanks, > Andrew Pinski > >> >>>> >>>> With that we should be able to kill the fold-const.c transform? >>> >>> That would certainly be nice and an excellent follow-up for Zhenqiang. >> >> That's my final goal to "kill the fold-const.c transform". I think we >> may combine the two changes to make a "simple" and "good" patch. >> >> Thanks! >> -Zhenqiang