ffmpeg | branch: master | Ganesh Ajjanagadde <gajjanaga...@gmail.com> | Wed Dec 16 17:39:28 2015 -0800| [05434b0eea3f959b8b44be97c56bad6ab6a0dc22] | committer: Ganesh Ajjanagadde
lavc/cook: get rid of wasteful pow in init_pow2table

The table is highly structured, so pow (or exp2 for that matter) can entirely
be avoided, yielding a ~ 40x speedup with no loss of accuracy.

sample benchmark (Haswell, GNU/Linux):
new:
4449 decicycles in init_pow2table(loop 1000), 254 runs, 2 skips
4411 decicycles in init_pow2table(loop 1000), 510 runs, 2 skips
4391 decicycles in init_pow2table(loop 1000), 1022 runs, 2 skips

old:
183673 decicycles in init_pow2table(loop 1000), 256 runs, 0 skips
182142 decicycles in init_pow2table(loop 1000), 512 runs, 0 skips
182104 decicycles in init_pow2table(loop 1000), 1024 runs, 0 skips

Reviewed-by: Clément Bœsch <u...@pkh.me>
Signed-off-by: Ganesh Ajjanagadde <gajjanaga...@gmail.com>

> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=05434b0eea3f959b8b44be97c56bad6ab6a0dc22
---

 libavcodec/cook.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/libavcodec/cook.c b/libavcodec/cook.c
index d8fb736..1b38019 100644
--- a/libavcodec/cook.c
+++ b/libavcodec/cook.c
@@ -166,10 +166,17 @@ static float rootpow2tab[127];
 /* table generator */
 static av_cold void init_pow2table(void)
 {
+    /* fast way of computing 2^i and 2^(0.5*i) for -63 <= i < 64 */
     int i;
+    static const float exp2_tab[2] = {1, M_SQRT2};
+    float exp2_val = powf(2, -63);
+    float root_val = powf(2, -32);
     for (i = -63; i < 64; i++) {
-        pow2tab[63 + i] = pow(2, i);
-        rootpow2tab[63 + i] = sqrt(pow(2, i));
+        if (!(i & 1))
+            root_val *= 2;
+        pow2tab[63 + i] = exp2_val;
+        rootpow2tab[63 + i] = root_val * exp2_tab[i & 1];
+        exp2_val *= 2;
     }
 }

_______________________________________________
ffmpeg-cvslog mailing list
ffmpeg-cvslog@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-cvslog
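
For readers who want to try the recurrence outside the tree, here is a minimal standalone sketch of the same idea. It is not the code in libavcodec/cook.c: av_cold is dropped, the hex-float literals 0x1p-63f and 0x1p-32f stand in for powf(2, -63) and powf(2, -32) so nothing from libm has to be linked, and count_bad_roots() is a hypothetical helper added only for checking. It assumes IEEE-754 single precision and a two's-complement int, so that i & 1 is 1 for negative odd i.

```c
#include <math.h>   /* M_SQRT2 (a common extension, not strict ISO C) */

#ifndef M_SQRT2
#define M_SQRT2 1.41421356237309504880 /* fallback for strict ISO C modes */
#endif

#define TAB_SIZE 127

static float pow2tab[TAB_SIZE];      /* pow2tab[63 + i]     = 2^i       */
static float rootpow2tab[TAB_SIZE];  /* rootpow2tab[63 + i] = 2^(0.5*i) */

/* Same recurrence as the patch: both tables are built by repeated
 * doubling, and odd exponents pick up one factor of sqrt(2). */
static void init_pow2table(void)
{
    static const float exp2_tab[2] = {1, M_SQRT2};
    float exp2_val = 0x1p-63f;  /* assumption: replaces powf(2, -63) */
    float root_val = 0x1p-32f;  /* assumption: replaces powf(2, -32) */
    int i;

    for (i = -63; i < 64; i++) {
        if (!(i & 1))           /* even i: root_val becomes 2^(i/2) */
            root_val *= 2;
        pow2tab[63 + i]     = exp2_val;
        rootpow2tab[63 + i] = root_val * exp2_tab[i & 1];
        exp2_val *= 2;
    }
}

/* Hypothetical helper: counts entries whose squared root deviates from
 * the matching power of two by more than rel_tol (relative error). */
static int count_bad_roots(float rel_tol)
{
    int k, bad = 0;

    for (k = 0; k < TAB_SIZE; k++) {
        float sq   = rootpow2tab[k] * rootpow2tab[k];
        float diff = sq > pow2tab[k] ? sq - pow2tab[k] : pow2tab[k] - sq;
        if (diff > rel_tol * pow2tab[k])
            bad++;
    }
    return bad;
}
```

Calling init_pow2table() once and then count_bad_roots(1e-6f) gives a quick self-consistency check. Every multiplication in the loop is by a power of two, which is exact in binary floating point as long as nothing overflows or underflows, so the only rounding in either table is the single conversion of M_SQRT2 to float.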