Yes. Condition to to switch two versioned loops is "true", the first two 
arguments should be 100% and 0%.

It is different from normal loop split, we could not deduce exactly precise 
probability for
condition-based loop split, since cfg inside loop2 would be changed. 
(invar-branch is replaced
to "true", as shown in the comment on do_split_loop_on_cond). Any way, your way 
of scaling
two loops' probabilities according to that of invar-branch, seems to be a 
better heuristics than
original, which would give us more reasonable execution count, at least for 
loop header bb.

Thanks,
Feng

________________________________________
From: Gcc-patches <gcc-patches-bounces+fxue=os.amperecomputing....@gcc.gnu.org> 
on behalf of Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org>
Sent: Friday, August 6, 2021 7:46 PM
To: Xionghu Luo
Cc: seg...@kernel.crashing.org; wschm...@linux.ibm.com; li...@gcc.gnu.org; 
gcc-patches@gcc.gnu.org; hubi...@ucw.cz; dje....@gmail.com
Subject: Re: [PATCH] Fix loop split incorrect count and probability

On Tue, 3 Aug 2021, Xionghu Luo wrote:

> loop split condition is moved between loop1 and loop2, the split bb's
> count and probability should also be duplicated instead of (100% vs INV),
> secondly, the original loop1 and loop2 count need be propotional from the
> original loop.
>
> Regression tested pass, OK for master?
>
> diff base/loop-cond-split-1.c.151t.lsplit  
> patched/loop-cond-split-1.c.151t.lsplit:
> ...
>    int prephitmp_16;
>    int prephitmp_25;
>
>    <bb 2> [local count: 118111600]:
>    if (n_7(D) > 0)
>      goto <bb 4>; [89.00%]
>    else
>      goto <bb 3>; [11.00%]
>
>    <bb 3> [local count: 118111600]:
>    return;
>
>    <bb 4> [local count: 105119324]:
>    pretmp_3 = ga;
>
> -  <bb 5> [local count: 955630225]:
> +  <bb 5> [local count: 315357973]:
>    # i_13 = PHI <i_10(20), 0(4)>
>    # prephitmp_12 = PHI <prephitmp_5(20), pretmp_3(4)>
>    if (prephitmp_12 != 0)
>      goto <bb 6>; [33.00%]
>    else
>      goto <bb 7>; [67.00%]
>
> -  <bb 6> [local count: 315357972]:
> +  <bb 6> [local count: 104068130]:
>    _2 = do_something ();
>    ga = _2;
>
> -  <bb 7> [local count: 955630225]:
> +  <bb 7> [local count: 315357973]:
>    # prephitmp_5 = PHI <prephitmp_12(5), _2(6)>
>    i_10 = inc (i_13);
>    if (n_7(D) > i_10)
>      goto <bb 21>; [89.00%]
>    else
>      goto <bb 11>; [11.00%]
>
>    <bb 11> [local count: 105119324]:
>    goto <bb 3>; [100.00%]
>
> -  <bb 21> [local count: 850510901]:
> +  <bb 21> [local count: 280668596]:
>    if (prephitmp_12 != 0)
> -    goto <bb 20>; [100.00%]
> +    goto <bb 20>; [33.00%]
>    else
> -    goto <bb 19>; [INV]
> +    goto <bb 19>; [67.00%]
>
> -  <bb 20> [local count: 850510901]:
> +  <bb 20> [local count: 280668596]:
>    goto <bb 5>; [100.00%]
>
> -  <bb 19> [count: 0]:
> +  <bb 19> [local count: 70429947]:
>    # i_23 = PHI <i_10(21)>
>    # prephitmp_25 = PHI <prephitmp_5(21)>
>
> -  <bb 12> [local count: 955630225]:
> +  <bb 12> [local count: 640272252]:
>    # i_15 = PHI <i_23(19), i_22(16)>
>    # prephitmp_16 = PHI <prephitmp_25(19), prephitmp_16(16)>
>    i_22 = inc (i_15);
>    if (n_7(D) > i_22)
>      goto <bb 16>; [89.00%]
>    else
>      goto <bb 11>; [11.00%]
>
> -  <bb 16> [local count: 850510901]:
> +  <bb 16> [local count: 569842305]:
>    goto <bb 12>; [100.00%]
>
>  }
>
> gcc/ChangeLog:
>
>       * tree-ssa-loop-split.c (split_loop): Fix incorrect probability.
>       (do_split_loop_on_cond): Likewise.
> ---
>  gcc/tree-ssa-loop-split.c | 16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c
> index 3a09bbc39e5..8e5a7ded0f7 100644
> --- a/gcc/tree-ssa-loop-split.c
> +++ b/gcc/tree-ssa-loop-split.c
> @@ -583,10 +583,10 @@ split_loop (class loop *loop1)
>       basic_block cond_bb;
>
>       class loop *loop2 = loop_version (loop1, cond, &cond_bb,
> -                                        profile_probability::always (),
> -                                        profile_probability::always (),
> -                                        profile_probability::always (),
> -                                        profile_probability::always (),
> +                                        true_edge->probability,
> +                                        true_edge->probability.invert (),
> +                                        true_edge->probability,
> +                                        true_edge->probability.invert (),
>                                          true);

there is no 'true_edge' variable at this point.

>       gcc_assert (loop2);
>
> @@ -1486,10 +1486,10 @@ do_split_loop_on_cond (struct loop *loop1, edge 
> invar_branch)
>    initialize_original_copy_tables ();
>
>    struct loop *loop2 = loop_version (loop1, boolean_true_node, NULL,
> -                                  profile_probability::always (),
> -                                  profile_probability::never (),
> -                                  profile_probability::always (),
> -                                  profile_probability::always (),
> +                                  invar_branch->probability.invert (),
> +                                  invar_branch->probability,
> +                                  invar_branch->probability.invert (),
> +                                  invar_branch->probability,
>                                    true);
>    if (!loop2)
>      {

The patch introduction seems to talk about do_split_loop_on_cond only.
Since loop versioning inserts a condition with the passed probabilities
but in this case a 'boolean_true_node' condition the then and else
probabilities passed look correct.  It's just the scaling arguments
that look wrong?  This loop_version call should get a comment as to
why we are passing probabilities the way we do.

It does seem that scaling the loop by the invar_branch probability
is correct.  Since this does similar things to unswitching, I see
that unswitching does

prob_true = edge_true->probability;
loop_version (loop, unshare_expr (cond),
                       NULL, prob_true,
                       prob_true.invert (),
                       prob_true, prob_true.invert (),
                       false);

which maybe suggests that your invar_branch based passing should
depend on 'true_invar'?

Also compared to unswitching the first loop is always entered,
so I wonder if the scaling is correct with respect to that
given unswitching where only ever one loop is entered?

Thanks,
Richard.


Reply via email to