Re: [PATCH v4] Repeat jump threading after combine

Segher Boessenkool Wed, 28 Nov 2018 12:51:49 -0800

Hi!

On Tue, Nov 27, 2018 at 05:07:11PM +0100, Ilya Leoshkevich wrote:
> perf diff -c wdiff:1,1 shows that there is just one function
> (htab_traverse) that is significantly slower now:
> 
>      2.98%     11768891764  exe                [.] htab_traverse
>      1.91%       563949986  exe                [.] compute_dominance_frontiers_1
> 
> The additional cycles consumed by this function match the overall
> number of additionally consumed cycles, and the contribution of the
> runner-up (compute_dominance_frontiers_1) is 20 times smaller, so I
> think it's really just this one function.
> 
> However, the generated assembly is completely identical in both cases!


Ugh.  We have seen this before :-(

Thanks for investigating.  I don't consider the Power degradation as really
caused by your patch, then.

> I saw similar situations in the past, so I tried adding a nop to
> htab_traverse:
> 
> --- hashtab.c
> +++ hashtab.c
> @@ -529,6 +529,8 @@ htab_traverse (htab, callback, info)
>       htab_trav callback;
>       PTR info;
>  {
> +  __asm__ volatile("nop\n");
> +
>    PTR *slot = htab->entries;
>    PTR *limit = slot + htab->size;
> 
> and made a 5x re-run.  The new measurements are 227.01s and 227.44s
> (+0.19%).  With two nops I get 227.25s and 227.29s (+0.02%), which also
> looks like noise.
> 
> Can this be explained by some microarchitectural quirk after all?

Two frequent branch targets that get thrown into the same bin for prediction.
Results change based on random compiler changes, ASLR settings, phase of the
moon, how many people in your neighbourhood have had porridge for breakfast
this morning, etc.
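
To make the "same bin" effect concrete, here is a toy sketch.  It is not
how any real predictor works: the table size, the index function and the
addresses used below are all made up for illustration.  It only shows why
moving code by one 4-byte nop can make a collision appear or disappear.

```c
#include <stdint.h>

/* Hypothetical predictor table size.  Real hardware uses other sizes
   and usually hashes in branch history as well.  */
#define TOY_BINS 1024u

/* Hypothetical index function: drop the two alignment bits of a 4-byte
   instruction address and keep the low bits as the bin number.  */
static unsigned
toy_bin (uint64_t addr)
{
  return (unsigned) ((addr >> 2) % TOY_BINS);
}

/* Return nonzero if two branch targets land in the same toy bin.  */
static int
toy_collide (uint64_t a, uint64_t b)
{
  return toy_bin (a) == toy_bin (b);
}
```

With the made-up targets 0x10001000 and 0x10002000, toy_collide reports a
collision; move the second one by a single nop (to 0x10002004) and it goes
away, although the instructions themselves are unchanged.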


Segher
