On Fri, Jun 22, 2018 at 5:11 PM, Kugan Vivekanandarajah <kugan.vivekanandara...@linaro.org> wrote: > When we set niter with maybe_zero, currently final_value_relacement > will not happen due to expression_expensive_p not handling. Patch 1 > adds this. > > With that we have the following optimized gimple. > > <bb 2> [local count: 118111601]: > if (b_4(D) != 0) > goto <bb 3>; [89.00%] > else > goto <bb 4>; [11.00%] > > <bb 3> [local count: 105119324]: > _2 = (unsigned long) b_4(D); > _9 = __builtin_popcountl (_2); > c_3 = b_4(D) != 0 ? _9 : 1; > > <bb 4> [local count: 118111601]: > # c_12 = PHI <c_3(3), 0(2)> > > I assume that 1 in b_4(D) != 0 ? _9 : 1; is OK (?) because when the No, it doesn't make much sense. when b_4(D) == 0, the popcount computed should be 0. Point is you can never get b_4(D) == 0 with guard condition in basic block 2. So the result should simply be:
> <bb 2> [local count: 118111601]: > if (b_4(D) != 0) > goto <bb 3>; [89.00%] > else > goto <bb 4>; [11.00%] > > <bb 3> [local count: 105119324]: > _2 = (unsigned long) b_4(D); > c_3 = __builtin_popcountl (_2); > > <bb 4> [local count: 118111601]: > # c_12 = PHI <c_3(3), 0(2)> I think this is the code generated if maybe_zero is not set? which it should not be set here. For the same reason, it can be further optimized into: > <bb 2> [local count: 118111601]: > _2 = (unsigned long) b_4(D); > c_12 = __builtin_popcountl (_2); > > latch execute zero times for b_4 == 0 means that the body will execute > ones. You never get zero times latch here with the aforementioned guard condition. BTW, I didn't look at following patches which could be wanted optimizations. Thanks, bin > > The issue here is, since we are checking if (b_4(D) != 0) before > entering the loop means we don't need to set maybe_zero. Patch 2 > handles this. > > With that we have > <bb 2> [local count: 118111601]: > if (b_4(D) != 0) > goto <bb 3>; [89.00%] > else > goto <bb 4>; [11.00%] > > <bb 3> [local count: 105119324]: > _2 = (unsigned long) b_4(D); > _9 = __builtin_popcountl (_2); > > <bb 4> [local count: 118111601]: > # c_12 = PHI <0(2), _9(3)> > > As advised earlier, patch 3 adds phiopt support to remove this. > > Bootstrap and regression testing are ongoing. > > Is this OK for trunk. > > Thanks, > Kugan