Re: [PATCH] tree-optimization/102155 - fix LIM fill_always_executed_in CFG walk

Xionghu Luo via Gcc-patches Thu, 02 Sep 2021 03:00:26 -0700

On 2021/9/2 16:50, Richard Biener wrote:
> On Thu, 2 Sep 2021, Richard Biener wrote:
> 
>> On Thu, 2 Sep 2021, Xionghu Luo wrote:
>>
>>>
>>>
>>> On 2021/9/1 17:58, Richard Biener wrote:
>>>> This fixes the CFG walk order of fill_always_executed_in to use
>>>> RPO oder rather than the dominator based order computed by
>>>> get_loop_body_in_dom_order.  That fixes correctness issues with
>>>> unordered dominator children.
>>>>
>>>> The RPO order computed by rev_post_order_and_mark_dfs_back_seme in
>>>> its for-iteration mode is a good match for the algorithm.
>>>>
>>>> Xionghu, I've tried to only fix the CFG walk order issue and not
>>>> change anything else with this so we have a more correct base
>>>> to work against.  The code still walks inner loop bodies
>>>> up to loop depth times and thus is quadratic in the loop depth.
>>>>
>>>> Bootstrapped and tested on x86_64-unknown-linux-gnu, if you don't
>>>> have any comments I plan to push this and then revisit what we
>>>> were circling around.
>>>
>>> LGTM, thanks.
>>
>> I pushed it, thought again in the attempt to build a testcase and
>> concluded I was wrong with the appearant mishandling of
>> contains_call - get_loop_body_in_dom_order seems to be exactly
>> correct for this specific case.  So I reverted the commit again.
> 
> And I figured what the
> 
>                /* In a loop that is always entered we may proceed anyway.
>                   But record that we entered it and stop once we leave it.
> */
> 
> comment was about.  The code was present before the fix for PR78185
> and it was supposed to catch the case where the entered inner loop
> is not finite.  Just as the testcase from PR78185 shows the
> stopping was done too late when the exit block was already marked
> as to be always executed.  A simpler fix for PR78185 would have been
> to move
> 
>            if (!flow_bb_inside_loop_p (inn_loop, bb))
>              break;
> 
> before setting of last = bb.  In fact the installed fix was more
> pessimistic than that given it terminated already when entering
> a possibly infinite loop.  So we can improve that by doing
> sth like which should also improve the situation for some of
> the cases you were looking at?
> 
> What remains is that we continue to stop when entering a
> not always executed loop:
> 
>            if (bb->loop_father->header == bb)
>              {
>                if (!dominated_by_p (CDI_DOMINATORS, loop->latch, bb))
>                  break;

Yes.  This will cause blocks after inner loop missed to be check
if they are actually ALWAYS_EXECUTED.   I am afraid O(N^2) is 
inevitable here...

> 
> that I can at this point only explain by possible efficiency
> concerns?  Any better idea on that one?
>From experiment, early break from inner loop seems not cost shorter
time than full inner loop walk.  I will take more precise
measurement and larger data set on the function fill_always_executed_in_1
if necessary.   

My previous v2 patch also tried to update inn_loop level by level
when exiting from inn_loops, but it is proved to be  unnecessary
but you worried about the dominance order by get_loop_body_in_dom_order.

> 
> I'm going to test the patch below which improves the situation for
> 
> volatile int flag, bar;
> double foo (double *valp)
> {
>    double sum = 0;
>    for (int i = 0; i < 256; ++i)
>      {
>        for (int j = 0; j < 256; ++j)
>          bar = flag;
>        if (flag)
>          sum += 1.;
>        sum += *valp;
>      }
>    return sum;
> }

The patch still fails to handle cases like this:


struct X { int i; int j; int k;};
volatile int m;

void bar (struct X *x, int n, int l, int k)
{
  for (int i = 0; i < l; i++)
    {
     if (k)
        for (int j = 0; j < l; j++)
          {
            if (m)
              x->i = m;
            else
              x->i = 1 - m;

            int *r = &x->k;
            int tem2 = *r;
            x->k += tem2 * j;
        }

      x->i = m;
    }
}

x->i is still not marked ALWAYS_EXECUTED for outer loop.


> 
> Thanks,
> Richard.
> 
> diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
> index d9f75d5025e..f0c93d6a882 100644
> --- a/gcc/tree-ssa-loop-im.c
> +++ b/gcc/tree-ssa-loop-im.c
> @@ -3044,23 +3044,27 @@ fill_always_executed_in_1 (class loop *loop,
> sbitmap contains_call)
>            edge_iterator ei;
>            bb = bbs[i];
>   
> +         if (!flow_bb_inside_loop_p (inn_loop, bb))
> +           {
> +             /* When we are leaving a possibly infinite inner loop
> +                we have to stop processing.  */
> +             if (!finite_loop_p (inn_loop))
> +               break;
> +             /* If the loop was finite we can continue with processing
> +                the loop we exited to.  */
> +             inn_loop = bb->loop_father;
> +           }
> +
>            if (dominated_by_p (CDI_DOMINATORS, loop->latch, bb))
>              last = bb;
>   
>            if (bitmap_bit_p (contains_call, bb->index))
>              break;
>   
> +         /* If LOOP exits from this BB stop processing.  */
>            FOR_EACH_EDGE (e, ei, bb->succs)
> -           {
> -             /* If there is an exit from this BB.  */
> -             if (!flow_bb_inside_loop_p (loop, e->dest))
> -               break;
> -             /* Or we enter a possibly non-finite loop.  */
> -             if (flow_loop_nested_p (bb->loop_father,
> -                                     e->dest->loop_father)
> -                 && ! finite_loop_p (e->dest->loop_father))
> -               break;
> -           }
> +           if (!flow_bb_inside_loop_p (loop, e->dest))
> +             break;
>            if (e)
>              break;
>   
> @@ -3069,16 +3073,14 @@ fill_always_executed_in_1 (class loop *loop,
> sbitmap contains_call)
>            if (bb->flags & BB_IRREDUCIBLE_LOOP)
>              break;
>   
> -         if (!flow_bb_inside_loop_p (inn_loop, bb))
> -           break;
> -
>            if (bb->loop_father->header == bb)
>              {
>                if (!dominated_by_p (CDI_DOMINATORS, loop->latch, bb))
>                  break;
>   
>                /* In a loop that is always entered we may proceed anyway.
> -                But record that we entered it and stop once we leave it.
> */
> +                But record that we entered it and stop once we leave it
> +                since it might not be finite.  */
>                inn_loop = bb->loop_father;
>              }
>          }
> 

-- 
Thanks,
Xionghu
Re: [PATCH] tree-optimization/102155 - fix LIM fill_always_executed_in CFG walk

Reply via email to