Re: [PATCH] Fix loop split incorrect count and probability

Xionghu Luo via Gcc-patches Sun, 08 Aug 2021 19:37:46 -0700

Thanks,

On 2021/8/6 19:46, Richard Biener wrote:
> On Tue, 3 Aug 2021, Xionghu Luo wrote:
> 
>> loop split condition is moved between loop1 and loop2, the split bb's
>> count and probability should also be duplicated instead of (100% vs INV),
>> secondly, the original loop1 and loop2 count need be propotional from the
>> original loop.
>>
>>
>> diff base/loop-cond-split-1.c.151t.lsplit  
>> patched/loop-cond-split-1.c.151t.lsplit:
>> ...
>>     int prephitmp_16;
>>     int prephitmp_25;
>>
>>     <bb 2> [local count: 118111600]:
>>     if (n_7(D) > 0)
>>       goto <bb 4>; [89.00%]
>>     else
>>       goto <bb 3>; [11.00%]
>>
>>     <bb 3> [local count: 118111600]:
>>     return;
>>
>>     <bb 4> [local count: 105119324]:
>>     pretmp_3 = ga;
>>
>> -  <bb 5> [local count: 955630225]:
>> +  <bb 5> [local count: 315357973]:
>>     # i_13 = PHI <i_10(20), 0(4)>
>>     # prephitmp_12 = PHI <prephitmp_5(20), pretmp_3(4)>
>>     if (prephitmp_12 != 0)
>>       goto <bb 6>; [33.00%]
>>     else
>>       goto <bb 7>; [67.00%]
>>
>> -  <bb 6> [local count: 315357972]:
>> +  <bb 6> [local count: 104068130]:
>>     _2 = do_something ();
>>     ga = _2;
>>
>> -  <bb 7> [local count: 955630225]:
>> +  <bb 7> [local count: 315357973]:
>>     # prephitmp_5 = PHI <prephitmp_12(5), _2(6)>
>>     i_10 = inc (i_13);
>>     if (n_7(D) > i_10)
>>       goto <bb 21>; [89.00%]
>>     else
>>       goto <bb 11>; [11.00%]
>>
>>     <bb 11> [local count: 105119324]:
>>     goto <bb 3>; [100.00%]
>>
>> -  <bb 21> [local count: 850510901]:
>> +  <bb 21> [local count: 280668596]:
>>     if (prephitmp_12 != 0)
>> -    goto <bb 20>; [100.00%]
>> +    goto <bb 20>; [33.00%]
>>     else
>> -    goto <bb 19>; [INV]
>> +    goto <bb 19>; [67.00%]
>>
>> -  <bb 20> [local count: 850510901]:
>> +  <bb 20> [local count: 280668596]:
>>     goto <bb 5>; [100.00%]
>>
>> -  <bb 19> [count: 0]:
>> +  <bb 19> [local count: 70429947]:
>>     # i_23 = PHI <i_10(21)>
>>     # prephitmp_25 = PHI <prephitmp_5(21)>
>>
>> -  <bb 12> [local count: 955630225]:
>> +  <bb 12> [local count: 640272252]:
>>     # i_15 = PHI <i_23(19), i_22(16)>
>>     # prephitmp_16 = PHI <prephitmp_25(19), prephitmp_16(16)>
>>     i_22 = inc (i_15);
>>     if (n_7(D) > i_22)
>>       goto <bb 16>; [89.00%]
>>     else
>>       goto <bb 11>; [11.00%]
>>
>> -  <bb 16> [local count: 850510901]:
>> +  <bb 16> [local count: 569842305]:
>>     goto <bb 12>; [100.00%]
>>
>>   }
>>
>> gcc/ChangeLog:
>>
>>      * tree-ssa-loop-split.c (split_loop): Fix incorrect probability.
>>      (do_split_loop_on_cond): Likewise.
>> ---
>>   gcc/tree-ssa-loop-split.c | 16 ++++++++--------
>>   1 file changed, 8 insertions(+), 8 deletions(-)
>>
>> diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c
>> index 3a09bbc39e5..8e5a7ded0f7 100644
>> --- a/gcc/tree-ssa-loop-split.c
>> +++ b/gcc/tree-ssa-loop-split.c
>> @@ -583,10 +583,10 @@ split_loop (class loop *loop1)
>>      basic_block cond_bb;


        if (!initial_true)
-         cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond); 
+         cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond);
+
+       edge true_edge = EDGE_SUCC (bbs[i], 0)->flags & EDGE_TRUE_VALUE
+                          ? EDGE_SUCC (bbs[i], 0)
+                          : EDGE_SUCC (bbs[i], 1);

>>   
>>      class loop *loop2 = loop_version (loop1, cond, &cond_bb,
>> -                                       profile_probability::always (),
>> -                                       profile_probability::always (),
>> -                                       profile_probability::always (),
>> -                                       profile_probability::always (),
>> +                                       true_edge->probability,
>> +                                       true_edge->probability.invert (),
>> +                                       true_edge->probability,
>> +                                       true_edge->probability.invert (),
>>                                         true);
> 
> there is no 'true_edge' variable at this point.

Sorry, missed the above hunk when split the patch. 

> 
>>      gcc_assert (loop2);
>>   
>> @@ -1486,10 +1486,10 @@ do_split_loop_on_cond (struct loop *loop1, edge 
>> invar_branch)
>>     initialize_original_copy_tables ();
>>   
>>     struct loop *loop2 = loop_version (loop1, boolean_true_node, NULL,
>> -                                 profile_probability::always (),
>> -                                 profile_probability::never (),
>> -                                 profile_probability::always (),
>> -                                 profile_probability::always (),
>> +                                 invar_branch->probability.invert (),
>> +                                 invar_branch->probability,
>> +                                 invar_branch->probability.invert (),
>> +                                 invar_branch->probability,
>>                                   true);
>>     if (!loop2)
>>       {
> 
> The patch introduction seems to talk about do_split_loop_on_cond only.

split_loop faces similar issue though it sets the two branches to 100% vs 100%
and no scaling which seems also incorrect.

> Since loop versioning inserts a condition with the passed probabilities
> but in this case a 'boolean_true_node' condition the then and else
> probabilities passed look correct.  It's just the scaling arguments
> that look wrong?  This loop_version call should get a comment as to
> why we are passing probabilities the way we do.

This optimization is transforming from:

for (i = 0; i < n; i = inc (i))
    {
      if (ga)
        ga = do_something ();
}

to: 

  for (i = 0; i < x; i = inc (i))
{
    if (true)
         ga = do_something ();
        if (!ga)
          break;
}
  for (; i < n; i = inc (i))
{
    if (false)
         ga = do_something ();
}


`boolean_true_node` is passed in to make the first loop's condition
statement to be 'true', after returning from loop_version, there is a
piece of code forcing the condition in second loop to be false,
and the original condition is moved from *in loop* to *exit edge*
between loop1 and loop2. 

@fxue may have inputs about this since he contributed it years ago.

> 
> It does seem that scaling the loop by the invar_branch probability
> is correct.  Since this does similar things to unswitching, I see
> that unswitching does
> 
> prob_true = edge_true->probability;
> loop_version (loop, unshare_expr (cond),
>                         NULL, prob_true,
>                         prob_true.invert (),
>                         prob_true, prob_true.invert (),
>                         false);
> 
> which maybe suggests that your invar_branch based passing should
> depend on 'true_invar'?
> 
> Also compared to unswitching the first loop is always entered,
> so I wonder if the scaling is correct with respect to that
> given unswitching where only ever one loop is entered?


Scaling is based on the probability calculated on "if (ga)", if it is
33% vs 67% before split, then it is reasonable to scale loop1 to 33%
and loop2 to 67% suppose the branch probability is correct enough?

unswitch also scaled the two loops based on prob_true, if prob_true
is 50%, it decreases X's count to HALF unexpectedly since it should be
executed on both branches?  while loop split still kept execute both first
loop and second loop, it seems even more accurate than loop unswitching?

tree-ssa-loop-unswitch.c:

   while (A)
     {
       if (inv)
         B;

       X;

       if (!inv)
         C;
     }

   where inv is the loop invariant, into

   if (inv)
     {
       while (A)
         {
           B;
           X;
         }
     }
   else
     {
       while (A)
         {
           X;
           C;
         }
     }

-- 
Thanks,
Xionghu

Re: [PATCH] Fix loop split incorrect count and probability

Reply via email to