On Thu, Feb 1, 2018 at 5:07 AM, Kugan Vivekanandarajah <kugan.vivekanandara...@linaro.org> wrote: > Hi Richard, > > On 31 January 2018 at 21:39, Richard Biener <richard.guent...@gmail.com> > wrote: >> On Wed, Jan 31, 2018 at 11:28 AM, Kugan Vivekanandarajah >> <kugan.vivekanandara...@linaro.org> wrote: >>> Hi Richard, >>> >>> Thanks for the review. >>> On 25 January 2018 at 20:04, Richard Biener <richard.guent...@gmail.com> >>> wrote: >>>> On Wed, Jan 24, 2018 at 10:56 PM, Kugan Vivekanandarajah >>>> <kugan.vivekanandara...@linaro.org> wrote: >>>>> Hi All, >>>>> >>>>> Here is a patch for popcount builtin detection similar to LLVM. I >>>>> would like to queue this for review for next stage 1. >>>>> >>>>> 1. This is done part of loop-distribution and effective for -O3 and above. >>>>> 2. This does not distribute loop to detect popcount (like >>>>> memcpy/memmove). I dont think that happens in practice. Please correct >>>>> me if I am wrong. >>>> >>>> But then it has no business inside loop distribution but instead is >>>> doing final value >>>> replacement, right? You are pattern-matching the whole loop after all. I >>>> think >>>> final value replacement would already do the correct thing if you >>>> teached number of >>>> iteration analysis that niter for >>>> >>>> <bb 3> [local count: 955630224]: >>>> # b_11 = PHI <b_5(5), b_8(6)> >>>> _1 = b_11 + -1; >>>> b_8 = _1 & b_11; >>>> if (b_8 != 0) >>>> goto <bb 6>; [89.00%] >>>> else >>>> goto <bb 8>; [11.00%] >>>> >>>> <bb 6> [local count: 850510900]: >>>> goto <bb 3>; [100.00%] >>> >>> I am looking into this approach. What should be the scalar evolution >>> for b_8 (i.e. b & (b -1) in a loop) should be? This is not clear to me >>> and can this be represented with the scev? >> >> No, it's not affine and thus cannot be represented. You only need the >> scalar evolution of the counting IV which is already handled and >> the number of iteration analysis needs to handle the above IV - this >> is the missing part. > Thanks for the clarification. I am now matching this loop pattern in > number_of_iterations_exit when number_of_iterations_exit_assumptions > fails. If the pattern matches, I am inserting the _builtin_popcount in > the loop preheater and setting the loop niter with this. This will be > used by the final value replacement. Is this what you wanted?
No, you shouldn't insert a popcount stmt but instead the niter GENERIC tree should be a CALL_EXPR to popcount with the appropriate argument. > In general, there is also a condition gating the loop like > > if (b_4 != 0) > goto loop; > else > end: > > This of course will not be removed by final value replacement. Since > popcount (0) is defined, this is redundant and could be removed but > not removed. Yeah, that's probably sth for another pass though. I suppose the end: case just uses zero in which case you'll have a PHI and you can optimize if (b != 0) return popcount (b); return 0; in phiopt. Richard. > Thanks, > Kugan > >> >> Richard. >> >>> Thanks, >>> Kugan >>>> >>>> is related to popcount (b_5). >>>> >>>> Richard. >>>> >>>>> Bootstrapped and regression tested on aarch64-linux-gnu with no new >>>>> regressions. >>>>> >>>>> Thanks, >>>>> Kugan >>>>> >>>>> gcc/ChangeLog: >>>>> >>>>> 2018-01-25 Kugan Vivekanandarajah <kug...@linaro.org> >>>>> >>>>> PR middle-end/82479 >>>>> * tree-loop-distribution.c (handle_popcount): New. >>>>> (pass_loop_distribution::execute): Use handle_popcount. >>>>> >>>>> gcc/testsuite/ChangeLog: >>>>> >>>>> 2018-01-25 Kugan Vivekanandarajah <kug...@linaro.org> >>>>> >>>>> PR middle-end/82479 >>>>> * gcc.dg/tree-ssa/popcount.c: New test.