On Fri, 28 Aug 2020 at 17:33, Alexander Monakov <amona...@ispras.ru> wrote:
>
> On Fri, 28 Aug 2020, Prathamesh Kulkarni via Gcc wrote:
>
> > I wonder if that's (one of) the main factor(s) behind the slowdown, or
> > whether it's not too relevant?
>
> Probably not. Some advice to make your search more directed:
>
> Pass '-n' to 'perf report'. Relative sample ratios are hard to reason about
> when they are computed against different bases; it's much easier to see that
> a loop is slowing down if it went from 4000 to 4500 in absolute sample count
> than if it went from 90% to 91% in relative sample ratio.
>
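
To make the point about absolute versus relative counts concrete, here is a
small arithmetic sketch. The totals (4444 and 4945) are hypothetical, picked
only so that the loop counts of 4000 and 4500 land at roughly 90% and 91%;
they are not measurements from the benchmark under discussion.

```python
def relative_share(loop_samples, total_samples):
    """Percentage of all samples that fall inside the loop."""
    return 100.0 * loop_samples / total_samples


# Hypothetical sample counts, not real 'perf report -n' output.
before = {"loop": 4000, "total": 4444}
after = {"loop": 4500, "total": 4945}

# Growth of the loop measured in absolute samples.
abs_growth = 100.0 * (after["loop"] - before["loop"]) / before["loop"]

# The same change expressed as a share of each run's own total.
share_before = relative_share(before["loop"], before["total"])
share_after = relative_share(after["loop"], after["total"])

print(f"absolute: {before['loop']} -> {after['loop']} samples ({abs_growth:+.1f}%)")
print(f"relative: {share_before:.1f}% -> {share_after:.1f}%")
```

Under these assumed totals the loop collects 12.5% more samples, yet its
relative share moves by only about one percentage point, because the base it
is divided by grew as well.
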
> Before diving down 'perf report', be sure to fully account for differences
> in 'perf stat' output. Do the programs execute the same number of
> instructions, so the difference is only in scheduling? Do the programs
> suffer from the same number of branch mispredictions? Please show the output
> of 'perf stat' on the mailing list too, so everyone is on the same page
> about that.
>
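
One possible way to line up two 'perf stat' runs before looking at
'perf report' is sketched below. The helper compare_runs and all counter
values are hypothetical placeholders, not results; the keys are the usual
perf event names (cycles, instructions, branches, branch-misses), and the
numbers would have to be copied in by hand from the real output.

```python
def compare_runs(base, new):
    """Return per-counter percentage change and the IPC of each run."""
    delta = {
        name: 100.0 * (new[name] - base[name]) / base[name]
        for name in base
    }
    ipc_base = base["instructions"] / base["cycles"]
    ipc_new = new["instructions"] / new["cycles"]
    return delta, ipc_base, ipc_new


# Hypothetical counter values for illustration only.
base = {"cycles": 1_000_000, "instructions": 2_000_000,
        "branches": 400_000, "branch-misses": 4_000}
new = {"cycles": 1_200_000, "instructions": 2_020_000,
       "branches": 420_000, "branch-misses": 12_000}

delta, ipc_base, ipc_new = compare_runs(base, new)
for name, pct in delta.items():
    print(f"{name:>14}: {pct:+.1f}%")
print(f"IPC: {ipc_base:.2f} -> {ipc_new:.2f}")
```

If the instruction count is nearly unchanged while cycles and branch-misses
grow, as in these made-up numbers, that would point toward branch prediction
or scheduling rather than extra work being executed.
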
> I also suspect that the dramatic slowdown has to do with the extra branch.
> Your CPU might have some specialized counters for branch prediction, see
> 'perf list'.
Hi Alexander,
Thanks for the suggestions! I am in the process of doing the benchmarking
experiments, and will post the results soon.

Thanks,
Prathamesh
>
> Alexander
