Re: Ping: [PATCH 1/2] correct BB frequencies after loop changed

guojiufu via Gcc-patches Sun, 04 Jul 2021 20:13:56 -0700

Hi Honza and All,

After more checks, I'm thinking these patches may still be useful.
For patch 1:


https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555871.html

This patch recalculates the loop's BB-count and could correct
some BB-count mismatches for loops that have a single exit.
From the test results, we could say it reduces mismatched BB-counts
slightly.

For patch 2:
https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555872.html
I updated as below:

It resets the loop's probability when the loop count becomes
unrealistically small. In theory, this seems to be the right direction.
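
For readers who want the check in isolation, here is a minimal sketch of
the reset rule from patch 2. It is not the GCC code: raw doubles stand in
for profile_count and profile_probability, the names adjust_unroll_scale
and UnrollScale are hypothetical, and 0.8 is only an assumed placeholder
for profile_probability::likely ().

```cpp
#include <cassert>

// Illustrative stand-ins only: GCC uses profile_count and
// profile_probability, not raw doubles.
struct UnrollScale {
  double scale_main;  // probability applied to the unrolled loop body
  bool was_reset;     // true when the fallback probability replaced it
};

// likely_prob is an assumed placeholder for profile_probability::likely ().
UnrollScale adjust_unroll_scale(double count_in, double preheader_count,
                                double scale_main, bool reliable_or_estimated,
                                double likely_prob = 0.8) {
  assert(scale_main >= 0.0 && scale_main <= 1.0);
  // A reliable count or a recorded iteration estimate is trusted as is.
  if (reliable_or_estimated)
    return {scale_main, false};
  // Header count the unrolled loop would get from the guessed scale.
  double new_count_in = count_in * scale_main;
  // If a tenth of it is still below the preheader count, the loop would
  // look nearly cold, so fall back to the "likely" probability.
  if (new_count_in / 10.0 < preheader_count)
    return {likely_prob, true};
  return {scale_main, false};
}
```

With count_in = 1000, preheader_count = 100 and a guessed scale of 0.25,
the scaled count is 250; a tenth of that (25) is below 100, so the scale
is reset. With count_in = 100000 the guessed scale is kept.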

Bootstrap/regtest on powerpc64le with no new regressions. I'm thinking
if this is acceptable for trunk?

BR,
Jiufu Guo

Subject: Reset edge probability and BB-count for peeled/unrolled loop

This patch handles the case where unrolling with an unreliable count
can cause a loop to no longer look hot and therefore not get aligned.
The patch scales by profile_probability::likely () if the unrolled count
gets unrealistically small. It also fixes the COUNT/PROB of the peeled
loop.


gcc/ChangeLog:
2021-07-01  Jiufu Guo   <guoji...@linux.ibm.com>
            Pat Haugen  <pthau...@us.ibm.com>

        PR rtl-optimization/68212
        * cfgloopmanip.c (duplicate_loop_to_header_edge): Reset probability
        of unrolled/peeled loop.

testsuite/ChangeLog:
2021-07-01  Jiufu Guo   <guoji...@linux.ibm.com>
            Pat Haugen  <pthau...@us.ibm.com>
        PR rtl-optimization/68212
        * gcc.dg/pr68212.c: New test.


---
 gcc/cfgloopmanip.c             | 20 ++++++++++++++++++--
 gcc/testsuite/gcc.dg/pr68212.c | 13 +++++++++++++
 2 files changed, 31 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr68212.c

diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c
index 4a9ab74642c..29d858c878a 100644
--- a/gcc/cfgloopmanip.c
+++ b/gcc/cfgloopmanip.c
@@ -1258,14 +1258,30 @@ duplicate_loop_to_header_edge (class loop *loop, edge e,

          /* If original loop is executed COUNT_IN times, the unrolled
             loop will account SCALE_MAIN_DEN times.  */
          scale_main = count_in.probability_in (scale_main_den);
+
+         /* If we are guessing at the number of iterations and count_in
+            becomes unrealistically small, reset probability.  */
+         if (!(count_in.reliable_p () || loop->any_estimate))
+           {
+             profile_count new_count_in = count_in.apply_probability (scale_main);
+             profile_count preheader_count = loop_preheader_edge (loop)->count ();
+             if (new_count_in.apply_scale (1, 10) < preheader_count)
+               scale_main = profile_probability::likely ();
+           }
+
          scale_act = scale_main * prob_pass_main;
        }
       else
        {
+         profile_count new_loop_count;
          profile_count preheader_count = e->count ();
-         for (i = 0; i < ndupl; i++)
-           scale_main = scale_main * scale_step[i];
          scale_act = preheader_count.probability_in (count_in);
+         /* Compute final preheader count after peeling NDUPL copies.  */
+         for (i = 0; i < ndupl; i++)
+           preheader_count = preheader_count.apply_probability (scale_step[i]);
+         /* Subtract out exit(s) from peeled copies.  */
+         new_loop_count = count_in - (e->count () - preheader_count);
+         scale_main = new_loop_count.probability_in (count_in);
        }
     }

diff --git a/gcc/testsuite/gcc.dg/pr68212.c b/gcc/testsuite/gcc.dg/pr68212.c
new file mode 100644
index 00000000000..e0cf71d5202
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr68212.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-tree-vectorize -funroll-loops --param max-unroll-times=4 -fdump-rtl-alignments" } */
+
+void foo(long int *a, long int *b, long int n)
+{
+  long int i;
+
+  for (i = 0; i < n; i++)
+    a[i] = *b;
+}
+
+/* { dg-final { scan-rtl-dump-times "internal loop alignment added" 1 "alignments" } } */
+
--
2.17.1
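
The peeling branch of the patch above can be followed with plain numbers.
The sketch below is an illustration under assumptions, not GCC code:
doubles stand in for profile_count and profile_probability, and the names
peel_scales and PeelScale are hypothetical. It mirrors the three steps in
the diff: scale the entry count through each peeled copy, subtract the
count that left through the peeled copies' exits, and express the
remainder relative to count_in.

```cpp
#include <cassert>
#include <vector>

struct PeelScale {
  double scale_act;   // entry count relative to the original header count
  double scale_main;  // remaining loop count relative to the original
};

// count_in:    header count of the original loop (assumed > 0)
// entry_count: count on the edge entering the loop (e->count () in the diff)
// scale_step:  probability of continuing past each peeled copy
PeelScale peel_scales(double count_in, double entry_count,
                      const std::vector<double>& scale_step) {
  assert(count_in > 0.0);
  double scale_act = entry_count / count_in;
  // Count that still reaches the loop after all peeled copies.
  double preheader_count = entry_count;
  for (double p : scale_step)
    preheader_count *= p;
  // Subtract what exited through the peeled copies.
  double new_loop_count = count_in - (entry_count - preheader_count);
  return {scale_act, new_loop_count / count_in};
}
```

For example, with count_in = 1024, entry_count = 128 and two peeled copies
that each continue with probability 0.5, 32 of the 128 entries still reach
the loop, so 96 are subtracted and scale_main is 928/1024.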



On 2021-06-18 16:24, guojiufu via Gcc-patches wrote:

On 2021-06-15 12:57, guojiufu via Gcc-patches wrote:
On 2021-06-14 17:16, Jan Hubicka wrote:
On 5/6/2021 8:36 PM, guojiufu via Gcc-patches wrote:
> Gentle ping.
>
> Original message:
> https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555871.html
I think you need a more aggressive ping  :-)
OK for the trunk. Sorry for the long delay. I kept hoping someone else
would step in and look at it.
Sorry, the patch was on my todo list to think through for a while :(
It seems to me that both old and new code need a bit more work.  First
the exit loop frequency is set to
prob = profile_probability::always ().apply_scale (1, new_est_niter + 1);
which is only correct if the estimated number of iterations is accurate.
If we do not have profile feedback and the trip count is not known
precisely, in most cases it won't be. We estimate loops to iterate about
3 times, and then niter_for_unrolled_loop will apply the capping to 5
iterations, which is completely arbitrary.

Forcing the exit probability to be precise may then disable further loop
optimizations, since after the change we will think we know the loop
iterates 5 times and thus it is not worthy for loop opt (which is quite
the opposite of the fact that we are just unrolling it thinking it is
hot).
Thanks, I understand your concern: both the new and old code assume
the number of iterations is accurate.
Maybe we could add code to reset exit probability for the case
where "!count_in.reliable_p ()".
Old code does
 1) scale body down so only one iteration is done
 2) set exit edge probability to be 1/(new_est_iter+1)
    precisely
 3) scale up according to the 1/new_nonexit_prob
    which would be correct if the nonexit probability was updated to
    1-exit_probability but that does not seem to happen.

New code does
Yes, this is intended: we know that the enter-count should be
equal to the exit-count of one loop, and then the
"loop-body-count * exit-probability = exit-count".
Also, the entry count of the loop would not be changed before and after
one optimization (or would change only slightly, e.g. the peeling count).

Based on this, we could adjust the loop body count according to
exit-count (or say enter-count) and exit-probability, when the
exit-probability is easy to estimate.
 1) give up when there are multiple exits.
    I wonder how common this is - we do outer loop vectorization
Hi Honza, and guys:

I gathered some statistics from bootstrap/test and the spec2017 build and
found that this code is hit ~1700 times for single-exit loops; in the
spec2017 build, it hits 226 single-exit loops, and multi-exit loops are
not hit.
I also ran a test with profile-report to check the "mismatch count". With
these patches we may say the "mismatch count" is mitigated slightly, but
not very aggressively:
150 mismatch counts are reduced.
But 119 mismatch counts are increased.

Any comments about this patch? Is it acceptable for the trunk? Thanks.


BR,
Jiufu Guo.
The computation in the new code is based on a single exit. This is
also a requirement of old code, and it would be true when run to here.
 2) adjust loop body count according to the exit
 3) update profile of BB after the exit edge.
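
Step 2 rests on the identity quoted earlier in this thread,
"loop-body-count * exit-probability = exit-count", together with the
single-exit assumption that the exit count equals the entry count. A
minimal worked sketch follows, assuming plain doubles instead of
profile_count and using hypothetical helper names (exit_probability,
header_count_from_entry); the 1/(niter + 1) form is the one from the
apply_scale (1, new_est_niter + 1) line discussed above.

```cpp
#include <cassert>

// Exit probability implied by an estimated iteration count.
double exit_probability(unsigned est_niter) {
  return 1.0 / (est_niter + 1.0);
}

// For a single-exit loop, exit count == entry count, so
// header_count * exit_prob == entry_count.
double header_count_from_entry(double entry_count, double exit_prob) {
  assert(exit_prob > 0.0);
  return entry_count / exit_prob;
}
```

With an estimate of 3 iterations the exit probability is 0.25, so a loop
entered 100 times gets a header count of 400. The result is only as good
as the iteration estimate, which is the concern raised in the review.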
Why do you need:
+  if (current_ir_type () != IR_GIMPLE)
+    update_br_prob_note (exit->src);
It is tree_transform_and_unroll_loop, so I think we should always have
IR_GIMPLE?
These two lines are added to "recompute_loop_frequencies", which can be
used in rtl, like the second patch of this:
https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555872.html
Oh, maybe these two lines of code should be put into
tree_transform_and_unroll_loop instead of the common code
recompute_loop_frequencies.

Thanks a lot for the review in your busy time!

BR.
Jiufu Guo
Honza
jeff
