On Mon, Jun 25, 2018 at 11:37 AM, Kugan Vivekanandarajah
<kugan.vivekanandara...@linaro.org> wrote:
> Hi Bin,
>
> Thanks for your comments.
>
> On 25 June 2018 at 11:15, Bin.Cheng <amker.ch...@gmail.com> wrote:
>> On Fri, Jun 22, 2018 at 5:11 PM, Kugan Vivekanandarajah
>> <kugan.vivekanandara...@linaro.org> wrote:
>>> When we set niter with maybe_zero, currently final_value_relacement
>>> will not happen due to expression_expensive_p not handling. Patch 1
>>> adds this.
>>>
>>> With that we have the following optimized gimple.
>>>
>>>   <bb 2> [local count: 118111601]:
>>>   if (b_4(D) != 0)
>>>     goto <bb 3>; [89.00%]
>>>   else
>>>     goto <bb 4>; [11.00%]
>>>
>>>   <bb 3> [local count: 105119324]:
>>>   _2 = (unsigned long) b_4(D);
>>>   _9 = __builtin_popcountl (_2);
>>>   c_3 = b_4(D) != 0 ? _9 : 1;
>>>
>>>   <bb 4> [local count: 118111601]:
>>>   # c_12 = PHI <c_3(3), 0(2)>
>>>
>>> I assume that 1 in  b_4(D) != 0 ? _9 : 1; is OK (?) because when the
>> No, it doesn't make much sense.  when b_4(D) == 0, the popcount
>> computed should be 0.  Point is you can never get b_4(D) == 0 with
>> guard condition in basic block 2.  So the result should simply be:
>
> When we do  calculate niter (for the copy header case), we set
> may_be_zero as (which I think is correct)
> niter->may_be_zero = fold_build2 (EQ_EXPR, boolean_type_node, src,
>                   build_zero_cst
>                   (TREE_TYPE (src)));
>
> Then in final_value_replacement_loop (struct loop *loop)
>
> for the PHI stmt for which we are going to do the final value replacement,
> we analyze_scalar_evolution_in_loop which is POLYNOMIAL_CHREC.
>
> then we do
> compute_overall_effect_of_inner_loop (struct loop *loop, tree evolution_fn)
>
> where when we do chrec_apply to the polynomial_chrec with niter from
> popcount which also has the may_be_zero, we end up with the 1.
> Looking at this, I am not sure if this is wrong. May be I am missing 
> something.
I think it is wrong.  How could you get popcount == 1 when b_4(D) ==
0?  Though it never happens in this case.
>
> In this testcase, before we enter the loop we have a check for (b_4(D)
>> 0). Thus, setting niter->may_be_zero is not strictly necessary but
> conservatively correct (?).
Yes, but not necessarily.  Setting maybe_zero could confuse following
optimizations and we should avoid doing that whenever possible.  If
any pass goes wrong because it's not set conservatively, it is that
pass' responsibility and should be fixed accordingly.  Here IMHO, we
don't need to set it.

Thanks,
bin
>
> Thanks,
> Kugan
>
>>
>>>   <bb 2> [local count: 118111601]:
>>>   if (b_4(D) != 0)
>>>     goto <bb 3>; [89.00%]
>>>   else
>>>     goto <bb 4>; [11.00%]
>>>
>>>   <bb 3> [local count: 105119324]:
>>>   _2 = (unsigned long) b_4(D);
>>>   c_3 = __builtin_popcountl (_2);
>>>
>>>   <bb 4> [local count: 118111601]:
>>>   # c_12 = PHI <c_3(3), 0(2)>
>>
>> I think this is the code generated if maybe_zero is not set?  which it
>> should not be set here.
>> For the same reason, it can be further optimized into:
>>
>>>   <bb 2> [local count: 118111601]:
>>>   _2 = (unsigned long) b_4(D);
>>>   c_12 = __builtin_popcountl (_2);
>>>
>>
>>> latch execute zero times for b_4 == 0 means that the body will execute
>>> ones.
>> You never get zero times latch here with the aforementioned guard condition.
>>
>> BTW, I didn't look at following patches which could be wanted optimizations.
>>
>> Thanks,
>> bin
>>>
>>> The issue here is, since we are checking if (b_4(D) != 0) before
>>> entering the loop means we don't need to set maybe_zero. Patch 2
>>> handles this.
>>>
>>> With that we have
>>>   <bb 2> [local count: 118111601]:
>>>   if (b_4(D) != 0)
>>>     goto <bb 3>; [89.00%]
>>>   else
>>>     goto <bb 4>; [11.00%]
>>>
>>>   <bb 3> [local count: 105119324]:
>>>   _2 = (unsigned long) b_4(D);
>>>   _9 = __builtin_popcountl (_2);
>>>
>>>   <bb 4> [local count: 118111601]:
>>>   # c_12 = PHI <0(2), _9(3)>
>>>
>>> As advised earlier, patch 3 adds phiopt support to remove this.
>>>
>>> Bootstrap and regression testing are ongoing.
>>>
>>> Is this OK for trunk.
>>>
>>> Thanks,
>>> Kugan

Reply via email to