This removes the branches inside the build_xlaw_table loops, which speeds up table generation slightly (the gain is larger for mulaw than for alaw). The generated tables are identical to before; tested with FATE.
Sample benchmark (Haswell, GNU/Linux+gcc):
old:
313494 decicycles in build_alaw_table,    4094 runs,      2 skips
315959 decicycles in build_alaw_table,    8190 runs,      2 skips
323599 decicycles in build_ulaw_table,    4095 runs,      1 skips
318849 decicycles in build_ulaw_table,    8188 runs,      4 skips

new:
261902 decicycles in build_alaw_table,    4096 runs,      0 skips
266519 decicycles in build_alaw_table,    8192 runs,      0 skips
209657 decicycles in build_ulaw_table,    4096 runs,      0 skips
232656 decicycles in build_ulaw_table,    8192 runs,      0 skips

Signed-off-by: Ganesh Ajjanagadde <gajjanaga...@gmail.com>
---
 libavcodec/pcm_tablegen.h | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/libavcodec/pcm_tablegen.h b/libavcodec/pcm_tablegen.h
index 1387210..7269977 100644
--- a/libavcodec/pcm_tablegen.h
+++ b/libavcodec/pcm_tablegen.h
@@ -87,21 +87,21 @@ static av_cold void build_xlaw_table(uint8_t *linear_to_xlaw,
 {
     int i, j, v, v1, v2;

-    j = 0;
-    for(i=0;i<128;i++) {
-        if (i != 127) {
-            v1 = xlaw2linear(i ^ mask);
-            v2 = xlaw2linear((i + 1) ^ mask);
-            v = (v1 + v2 + 4) >> 3;
-        } else {
-            v = 8192;
-        }
-        for(;j<v;j++) {
+    j = 1;
+    linear_to_xlaw[8192] = mask;
+    for(i=0;i<127;i++) {
+        v1 = xlaw2linear(i ^ mask);
+        v2 = xlaw2linear((i + 1) ^ mask);
+        v = (v1 + v2 + 4) >> 3;
+        for(;j<v;j+=1) {
+            linear_to_xlaw[8192 - j] = (i ^ (mask ^ 0x80));
             linear_to_xlaw[8192 + j] = (i ^ mask);
-            if (j > 0)
-                linear_to_xlaw[8192 - j] = (i ^ (mask ^ 0x80));
         }
     }
+    for(;j<8192;j++) {
+        linear_to_xlaw[8192 - j] = (127 ^ (mask ^ 0x80));
+        linear_to_xlaw[8192 + j] = (127 ^ mask);
+    }
     linear_to_xlaw[0] = linear_to_xlaw[1];
 }

-- 
2.6.4
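
For anyone who wants to check the "identical tables" claim without a full FATE run, here is a minimal standalone sketch. It copies the old and new loops from the diff into two functions and compares their output byte for byte. The two decoders are hypothetical stand-ins written from the usual G.711 expansion formulas and may not match the helpers in pcm_tablegen.h line for line; the masks 0xd5 (alaw) and 0xff (ulaw) and the 16384-entry table size are likewise assumptions about how build_xlaw_table is called. The two loops only agree when the first threshold v is at least 1, which holds for both stand-in decoders.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TABLE_SIZE 16384 /* assumed size: indices 0..16383, centred on 8192 */

typedef int (*xlaw2linear_fn)(unsigned char);

/* Hypothetical stand-in: usual G.711 A-law expansion, not copied from FFmpeg. */
int alaw2linear_sketch(unsigned char a_val)
{
    int t, seg;
    a_val ^= 0x55;
    t   = a_val & 0x0f;
    seg = (a_val & 0x70) >> 4;
    t   = seg ? (t + t + 1 + 32) << (seg + 2) : (t + t + 1) << 3;
    return (a_val & 0x80) ? t : -t;
}

/* Hypothetical stand-in: usual G.711 mu-law expansion, not copied from FFmpeg. */
int ulaw2linear_sketch(unsigned char u_val)
{
    int t;
    u_val = (unsigned char)~u_val;
    t   = ((u_val & 0x0f) << 3) + 0x84;
    t <<= (u_val & 0x70) >> 4;
    return (u_val & 0x80) ? (0x84 - t) : (t - 0x84);
}

/* Loop as it was before the patch: two branches inside the loops. */
static void build_old(uint8_t *tab, xlaw2linear_fn xlaw2linear, int mask)
{
    int i, j = 0, v, v1, v2;
    for (i = 0; i < 128; i++) {
        if (i != 127) {
            v1 = xlaw2linear(i ^ mask);
            v2 = xlaw2linear((i + 1) ^ mask);
            v  = (v1 + v2 + 4) >> 3;
        } else {
            v = 8192;
        }
        assert(v <= 8192); /* guard against writing past the table */
        for (; j < v; j++) {
            tab[8192 + j] = (i ^ mask);
            if (j > 0)
                tab[8192 - j] = (i ^ (mask ^ 0x80));
        }
    }
    tab[0] = tab[1];
}

/* Loop as in the patch: centre entry and last segment are peeled out. */
static void build_new(uint8_t *tab, xlaw2linear_fn xlaw2linear, int mask)
{
    int i, j = 1, v, v1, v2;
    tab[8192] = mask;
    for (i = 0; i < 127; i++) {
        v1 = xlaw2linear(i ^ mask);
        v2 = xlaw2linear((i + 1) ^ mask);
        v  = (v1 + v2 + 4) >> 3;
        assert(v <= 8192); /* guard against writing past the table */
        for (; j < v; j++) {
            tab[8192 - j] = (i ^ (mask ^ 0x80));
            tab[8192 + j] = (i ^ mask);
        }
    }
    for (; j < 8192; j++) {
        tab[8192 - j] = (127 ^ (mask ^ 0x80));
        tab[8192 + j] = (127 ^ mask);
    }
    tab[0] = tab[1];
}

/* Returns 1 if both loops produce the same table for this decoder and mask. */
int tables_match(xlaw2linear_fn xlaw2linear, int mask)
{
    static uint8_t old_tab[TABLE_SIZE], new_tab[TABLE_SIZE];
    memset(old_tab, 0, sizeof(old_tab));
    memset(new_tab, 0, sizeof(new_tab));
    build_old(old_tab, xlaw2linear, mask);
    build_new(new_tab, xlaw2linear, mask);
    return memcmp(old_tab, new_tab, TABLE_SIZE) == 0;
}
```

Calling tables_match(alaw2linear_sketch, 0xd5) and tables_match(ulaw2linear_sketch, 0xff) from a small main() should both return 1 under the assumptions above. This is only a sanity check on the loop transformation and not a replacement for FATE.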