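This exploits a very simple property of the cbrt function: cbrt(8*x) = 2*cbrt(x), so (8*i) * cbrt(8*i) = 16 * (i * cbrt(i)). Every eighth table entry can therefore be derived from an earlier entry with one multiplication instead of a cbrt call, obtaining a non-negligible speed-up. Tables turn out to be identical on GNU/Linux+gcc.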

Sample benchmark (Haswell, GNU/Linux+gcc):
new:
6632898 decicycles in cbrt_tableinit, 256 runs, 0 skips
6623909 decicycles in cbrt_tableinit, 512 runs, 0 skips

prev:
7582339 decicycles in cbrt_tableinit, 256 runs, 0 skips
7563556 decicycles in cbrt_tableinit, 512 runs, 0 skips

i.e. very close to the estimated 12.5% speedup.

Tested with FATE.

Signed-off-by: Ganesh Ajjanagadde <gajjanaga...@gmail.com>
---
 libavcodec/cbrt_tablegen.h | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/libavcodec/cbrt_tablegen.h b/libavcodec/cbrt_tablegen.h
index ef4c099..d3614d8 100644
--- a/libavcodec/cbrt_tablegen.h
+++ b/libavcodec/cbrt_tablegen.h
@@ -43,9 +43,13 @@ static union av_intfloat32 cbrt_tab[1 << 13];
 static av_cold void AAC_RENAME(cbrt_tableinit)(void)
 {
     if (!cbrt_tab[(1<<13) - 1].i) {
+        cbrt_tab[0].f = 0;
         int i;
         for (i = 0; i < 1<<13; i++) {
-            cbrt_tab[i].f = i * cbrt(i);
+            if (!(i & 7))
+                cbrt_tab[i].f = 16 * cbrt_tab[i>>3].f;
+            else
+                cbrt_tab[i].f = i * cbrt(i);
         }
 #if USE_FIXED
         for (i = 0; i < 1<<13; i++) {
-- 
2.6.4