On 2021/9/2 16:50, Richard Biener wrote:
> On Thu, 2 Sep 2021, Richard Biener wrote:
>
>> On Thu, 2 Sep 2021, Xionghu Luo wrote:
>>
>>>
>>>
>>> On 2021/9/1 17:58, Richard Biener wrote:
>>>> This fixes the CFG walk order of fill_always_executed_in to use
>>>> RPO oder rather than the dominator based order computed by
>>>> get_loop_body_in_dom_order. That fixes correctness issues with
>>>> unordered dominator children.
>>>>
>>>> The RPO order computed by rev_post_order_and_mark_dfs_back_seme in
>>>> its for-iteration mode is a good match for the algorithm.
>>>>
>>>> Xionghu, I've tried to only fix the CFG walk order issue and not
>>>> change anything else with this so we have a more correct base
>>>> to work against. The code still walks inner loop bodies
>>>> up to loop depth times and thus is quadratic in the loop depth.
>>>>
>>>> Bootstrapped and tested on x86_64-unknown-linux-gnu, if you don't
>>>> have any comments I plan to push this and then revisit what we
>>>> were circling around.
>>>
>>> LGTM, thanks.
>>
>> I pushed it, thought again in the attempt to build a testcase and
>> concluded I was wrong with the appearant mishandling of
>> contains_call - get_loop_body_in_dom_order seems to be exactly
>> correct for this specific case. So I reverted the commit again.
>
> And I figured what the
>
> /* In a loop that is always entered we may proceed anyway.
> But record that we entered it and stop once we leave it.
> */
>
> comment was about. The code was present before the fix for PR78185
> and it was supposed to catch the case where the entered inner loop
> is not finite. Just as the testcase from PR78185 shows the
> stopping was done too late when the exit block was already marked
> as to be always executed. A simpler fix for PR78185 would have been
> to move
>
> if (!flow_bb_inside_loop_p (inn_loop, bb))
> break;
>
> before setting of last = bb. In fact the installed fix was more
> pessimistic than that given it terminated already when entering
> a possibly infinite loop. So we can improve that by doing
> sth like which should also improve the situation for some of
> the cases you were looking at?
>
> What remains is that we continue to stop when entering a
> not always executed loop:
>
> if (bb->loop_father->header == bb)
> {
> if (!dominated_by_p (CDI_DOMINATORS, loop->latch, bb))
> break;
Yes. This will cause blocks after inner loop missed to be check
if they are actually ALWAYS_EXECUTED. I am afraid O(N^2) is
inevitable here...
>
> that I can at this point only explain by possible efficiency
> concerns? Any better idea on that one?
>From experiment, early break from inner loop seems not cost shorter
time than full inner loop walk. I will take more precise
measurement and larger data set on the function fill_always_executed_in_1
if necessary.
My previous v2 patch also tried to update inn_loop level by level
when exiting from inn_loops, but it is proved to be unnecessary
but you worried about the dominance order by get_loop_body_in_dom_order.
>
> I'm going to test the patch below which improves the situation for
>
> volatile int flag, bar;
> double foo (double *valp)
> {
> double sum = 0;
> for (int i = 0; i < 256; ++i)
> {
> for (int j = 0; j < 256; ++j)
> bar = flag;
> if (flag)
> sum += 1.;
> sum += *valp;
> }
> return sum;
> }
The patch still fails to handle cases like this:
struct X { int i; int j; int k;};
volatile int m;
void bar (struct X *x, int n, int l, int k)
{
for (int i = 0; i < l; i++)
{
if (k)
for (int j = 0; j < l; j++)
{
if (m)
x->i = m;
else
x->i = 1 - m;
int *r = &x->k;
int tem2 = *r;
x->k += tem2 * j;
}
x->i = m;
}
}
x->i is still not marked ALWAYS_EXECUTED for outer loop.
>
> Thanks,
> Richard.
>
> diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
> index d9f75d5025e..f0c93d6a882 100644
> --- a/gcc/tree-ssa-loop-im.c
> +++ b/gcc/tree-ssa-loop-im.c
> @@ -3044,23 +3044,27 @@ fill_always_executed_in_1 (class loop *loop,
> sbitmap contains_call)
> edge_iterator ei;
> bb = bbs[i];
>
> + if (!flow_bb_inside_loop_p (inn_loop, bb))
> + {
> + /* When we are leaving a possibly infinite inner loop
> + we have to stop processing. */
> + if (!finite_loop_p (inn_loop))
> + break;
> + /* If the loop was finite we can continue with processing
> + the loop we exited to. */
> + inn_loop = bb->loop_father;
> + }
> +
> if (dominated_by_p (CDI_DOMINATORS, loop->latch, bb))
> last = bb;
>
> if (bitmap_bit_p (contains_call, bb->index))
> break;
>
> + /* If LOOP exits from this BB stop processing. */
> FOR_EACH_EDGE (e, ei, bb->succs)
> - {
> - /* If there is an exit from this BB. */
> - if (!flow_bb_inside_loop_p (loop, e->dest))
> - break;
> - /* Or we enter a possibly non-finite loop. */
> - if (flow_loop_nested_p (bb->loop_father,
> - e->dest->loop_father)
> - && ! finite_loop_p (e->dest->loop_father))
> - break;
> - }
> + if (!flow_bb_inside_loop_p (loop, e->dest))
> + break;
> if (e)
> break;
>
> @@ -3069,16 +3073,14 @@ fill_always_executed_in_1 (class loop *loop,
> sbitmap contains_call)
> if (bb->flags & BB_IRREDUCIBLE_LOOP)
> break;
>
> - if (!flow_bb_inside_loop_p (inn_loop, bb))
> - break;
> -
> if (bb->loop_father->header == bb)
> {
> if (!dominated_by_p (CDI_DOMINATORS, loop->latch, bb))
> break;
>
> /* In a loop that is always entered we may proceed anyway.
> - But record that we entered it and stop once we leave it.
> */
> + But record that we entered it and stop once we leave it
> + since it might not be finite. */
> inn_loop = bb->loop_father;
> }
> }
>
--
Thanks,
Xionghu