cbrt_tablegen: unroll table generation loop

Ganesh Ajjanagadde Thu, 31 Dec 2015 09:21:06 -0800

On Thu, Dec 31, 2015 at 8:46 AM, Ronald S. Bultje <[email protected]> wrote:
> Hi,
>
> On Thu, Dec 31, 2015 at 11:39 AM, Ganesh Ajjanagadde
> <[email protected]> wrote:
>>
>> This patch does not seem to have measurable impact, at least on x86-64,
>> though
>> there could be benefits for less than stellar branch predictors.
>
> [..]
>>
>> -        for (i = 0; i < 1<<13; i++) {
>> -            if (!(i & 7))
>> -                cbrt_tab[i].f = 16 * cbrt_tab[i>>3].f;
>> -            else
>> -                cbrt_tab[i].f = i * cbrt(i);
>> +        for (i = 0; i < 1<<13; i+=8) {
>> +            cbrt_tab[i].f = 16 * cbrt_tab[i>>3].f;
>> +            cbrt_tab[i+1].f = (i+1) * cbrt(i+1);
>> +            cbrt_tab[i+2].f = (i+2) * cbrt(i+2);
>> +            cbrt_tab[i+3].f = (i+3) * cbrt(i+3);
>> +            cbrt_tab[i+4].f = (i+4) * cbrt(i+4);
>> +            cbrt_tab[i+5].f = (i+5) * cbrt(i+5);
>> +            cbrt_tab[i+6].f = (i+6) * cbrt(i+6);
>> +            cbrt_tab[i+7].f = (i+7) * cbrt(i+7);
>
>
> gcc (and most other compilers) will unroll the loop automatically, I
> suspect. Check disassembly to confirm?


checked, it does not on gcc at least. I would in fact suspect the
opposite;  this increases the binary size non-negligibly (unlike an
unrolling by two for instance), and is slightly nontrivial for a
compiler due to the i&7 business.

>
> (That doesn't mean the patch shouldn't go in, I'm just trying to help you
> explain the result. I have no comment on the patch itself.)

I think the branch predictor explanation is reasonable, these are very
predictable due to the relatively long length of the loop, and simple,
periodic branches. On the other hand, this varies considerably across
even generations of intel CPU's; hence I posted the patch. Thanks for
the idea though.

>
> Ronald
_______________________________________________
ffmpeg-devel mailing list
[email protected]
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH 3/3] lavc/cbrt_tablegen: unroll table generation loop

Reply via email to