Re: [PATCH, RFC, rs6000] PR80791 Consider doloop in ivopts

Kewen.Lin Sat, 04 May 2019 22:27:36 -0700

on 2019/4/27 上午11:44, Bin.Cheng wrote:
> On Fri, Apr 26, 2019 at 2:44 PM Kewen.Lin <li...@linux.ibm.com> wrote:
>> +
>> +      /* zero cost use makes it easier to select memory based iv cand
>> +         for replacement of non memory based iv and its use.   But if
>> +         the setup sequence are too costly, loop iv analysis can NOT
>> +         easily figure out it's finite, it's possible to stop the
>> +         low-overhead loop transformation and get unexpected code.  */
>> +      if (use->zero_cost_p && cand->iv->base_object && !use->iv->base_object
>> +         && elim_cost.cost >= 30)
>> +       dont_elim_p = true;
> No, we'd like to avoid such things in general.  The conditions look
> like a hack to me.  elim_cost is compared to express_cost, adding
> another check on it at different place isn't really good, especially
> 30 itself is a magic number.  It's most likely improvement in some
> cases, deterioration in others.
>


Yes, I agree it's too hacky and unacceptable as a formal fix.  And I
tried to investigate more whether it's a general issue but never got
exposed.

> Also it punishes one pass (IVOPTs here) because of other pass' (RTL)
> problem.  It does't mean we can't do such transformations, only it has
> to be as precise/conservative as possible.  For example, if RTL loop
> iv is improved to handle the case in the future, who would remember to
> come back and adjust this?
> 

Good question!

> GCC lacks the capability passing information to later passes.  Gimple
> analyzer worked hard collecting various information but discards it
> entering RTL or earlier.  Other examples are like runtime alias
> information, non-wrapping information for specific operations, etc.
> IMHO, this is what needs to be done.  As for this case, it could be
> finite loop info, or non-wrapping info of the iv_var's increment
> operation.  By passing more information, RTL passes can be simplified
> too.
> 

Thanks for the information! Is there any under development work for this?
That would be fine if we can pass down those information to downstream
passes based on upcoming feature.

> Thanks,
> bin
>> +
>>        /* The bound is a loop invariant, so it will be only computed
>>          once.  */
>>        elim_cost.cost = adjust_setup_cost (data, elim_cost.cost);
>> @@ -5184,7 +5195,7 @@ determine_group_iv_cost_cond (struct ivopts_data *data,
>>    express_cost += bound_cost;
>>
>>    /* Choose the better approach, preferring the eliminated IV. */
>> -  if (elim_cost <= express_cost)
>> +  if (elim_cost <= express_cost && !dont_elim_p)
>>      {
>>
>>
>> I was thinking whether this zero cost change just exposed
>> an existing problem, then this kind of check should be for all
>> cases not only for zero cost use, similar to
>> expression_expensive_p?  But why doesn't it happen before?
>> Need more investigation.
>>
>>>
>>>> Btw, this is for GCC10.
>>>
>>> *Phew* ;-)
>>>
>>>
>>> Some trivial comments:
>>>
>>>> +static bool
>>>> +invalid_insn_for_doloop_p (struct loop *loop)
>>>> +{
>>>> +  basic_block *body = get_loop_body (loop);
>>>> +  unsigned num_nodes = loop->num_nodes;
>>>> +  gimple_stmt_iterator gsi;
>>>> +  unsigned i;
>>>> +
>>>> +  for (i = 0; i < num_nodes; i++)
>>>
>>>   for (unsigned i = 0; i < num_nodes; i++)
>>>
>>> (and maybe you can just say loop->num_nodes here; I don't know if that
>>> generates worse code, or if that even matters).
>>
>> Good idea, will fix it.
>>
>>>
>>>> +        if (dump_file && (dump_flags & TDF_DETAILS))
>>>> +          fprintf (
>>>> +            dump_file,
>>>> +            "predict doloop failure due to finding computed jump.\n");
>>>
>>> We don't normally end lines in (.  There are other solutions to why you
>>> did that here; you can use string pasting, to break the string into two,
>>> or factor out (some part of) the loop body to reduce indentation.
>>>
>>
>> Will adjust it.
>>
>>>
>>> Segher
>>>
>>
>

Re: [PATCH, RFC, rs6000] PR80791 Consider doloop in ivopts

Reply via email to