I wrote:

> "Richard Biener" <richard.guent...@gmail.com> wrote:

[...]
 
>> Whether or not the branch is predicted taken does not matter, what
>> matters is that the continuation is not data dependent on the branch
>> target computation and thus can execute in parallel to it.
> 
> My benchmark shows that this doesn't matter!

>>> I already presented measured numbers: with random data, the branch-free
>>> code is faster, with ordered data the original code.

[...]

>> We do know that mispredicted branches are bad.
>> You show well-predicted branches are good.
> 
> Wrong: I show that having no branches at all is still faster.
> 
>> By simple statistics singling out 4 out of 255 values will make the
>> branches well-predicted.
> 
> Your statistics are wrong:
> 1. the branch singles out 224 of 256 values, i.e. 7/8 of all data;
> 2. whitespace lies in the 1/8 which is not singled out.
> 
>>> Now perform a linear interpolation and find the break-even point at
>>> p=0.4, with p=0 for ordered data and p=1 for random data, or just use
>>> the average of these numbers: 2.9 cycles vs. 2.75 cycles.
>>> That's small, but measurable!

I recommend you ALL start to read Daniel Lemire's article
<https://www.infoq.com/articles/making-code-faster-taming-branches/>

| Even when it is impossible to remove all branches, reducing the number
| of branches "almost always taken" or "almost never taken" may help the
| processor better predict the remaining branches.

JFTR: I hadn't seen his article before, but I hope that you are willing
      to learn.

Stefan
