On Tue, Dec 18, 2018 at 7:15 PM Bin.Cheng <amker.ch...@gmail.com> wrote:
>
> On Sun, Dec 16, 2018 at 9:11 AM Andi Kleen <a...@linux.intel.com> wrote:
> >
> > "bin.cheng" <bin.ch...@linux.alibaba.com> writes:
> >
> > > Hi,
> > >
> > > Due to ICE and mal-functional bugs, indirect call value profile 
> > > transformation
> > > is disabled on GCC-7/8/trunk.  This patch restores the transformation.  
> > > The
> > > main issue is AutoFDO should store cgraph_node's profile_id of callee 
> > > func in
> > > the first histogram value's counter, rather than pointer to callee's name 
> > > string
> > > as it is now.
> > > With the patch, some "Indirect call -> direct call" tests pass with 
> > > autofdo, while
> > > others are unstable.  I think the instability is caused by poor perf data 
> > > collected
> > > during regrets run, and can confirm these tests pass if good perf data 
> > > could be
> > > collected in manual experiments.
> >
> > Would be good to make the tests stable, otherwise we'll just have
> > regressions in the future again.
> >
> > The problem is that the tests don't run long enough and don't get enough 
> > samples?
> Yes, take g++.dg/tree-prof/morefunc.C as an example:
> -  int i;
> -  for (i = 0; i < 1000; i++)
> +  int i, j;
> +  for (i = 0; i < 1000000; i++)
> +    for (j = 0; j < 50; j++)
>       g += tc->foo();
>     if (g<100) g++;
>  }
> @@ -27,8 +28,9 @@ void test1 (A *tc)
>  static __attribute__((always_inline))
>  void test2 (B *tc)
>  {
> -  int i;
> +  int i, j;
>    for (i = 0; i < 1000000; i++)
> +    for (j = 0; j < 50; j++)
>
> I have to increase loop count like this to get stable pass on my
> machine.  The original count (1000) is too small to be sampled.
>
> >
> > Could add some loop?
> > Or possibly increase the sampling frequency in perf (-F or -c)?
> Maybe, I will have a try.
Turned out all "Indirect call" test can be resolved by adding -c 100
to perf command line:
diff --git a/gcc/config/i386/gcc-auto-profile b/gcc/config/i386/gcc-auto-profile
...
-exec perf record -e $E -b "$@"
+exec perf record -e $E -c 100 -b "$@"

Is 100 too small here?  Or is it fine for all scenarios?

Thanks,
bin

> > Or run them multiple times and use gcov_merge to merge the files?
> Without changing loop count or sampling frequency, this is not likely
> to be helpful, since perf doesn't hit the small loop in most cases.
>
> Thanks,
> bin
> >
> >
> > > FYI, an update about AutoFDO status:
> > > All AutoFDO ICEs in regtest are fixed, while several tests still failing 
> > > fall in below
> > > three categories:
> >
> > Great!
> >
> > Of course it still ICEs with LTO?
> >
> > Right now there is no test case for this I think. Probably one should be 
> > added.
> >
> > -Andi

Reply via email to