pcm_tablegen: slight speedup of table generation

Michael Niedermayer Sun, 03 Jan 2016 19:05:21 -0800

On Wed, Dec 30, 2015 at 08:34:55PM -0800, Ganesh Ajjanagadde wrote:
> This gets rid of some branches to speed up table generation slightly
> (impact higher on mulaw than alaw). Tables are identical to before,
> tested with FATE.
> 
> Sample benchmark (Haswell, GNU/Linux+gcc):
> old:
>  313494 decicycles in build_alaw_table,    4094 runs,      2 skips
>  315959 decicycles in build_alaw_table,    8190 runs,      2 skips
> 
>  323599 decicycles in build_ulaw_table,    4095 runs,      1 skips
>  318849 decicycles in build_ulaw_table,    8188 runs,      4 skips
> 
> new:
>  261902 decicycles in build_alaw_table,    4096 runs,      0 skips
>  266519 decicycles in build_alaw_table,    8192 runs,      0 skips
> 
>  209657 decicycles in build_ulaw_table,    4096 runs,      0 skips
>  232656 decicycles in build_ulaw_table,    8192 runs,      0 skips
> 
> Signed-off-by: Ganesh Ajjanagadde <[email protected]>
> ---
>  libavcodec/pcm_tablegen.h | 24 ++++++++++++------------
>  1 file changed, 12 insertions(+), 12 deletions(-)
> 
> diff --git a/libavcodec/pcm_tablegen.h b/libavcodec/pcm_tablegen.h
> index 1387210..7269977 100644
> --- a/libavcodec/pcm_tablegen.h
> +++ b/libavcodec/pcm_tablegen.h
> @@ -87,21 +87,21 @@ static av_cold void build_xlaw_table(uint8_t *linear_to_xlaw,
>  {
>      int i, j, v, v1, v2;
>  
> -    j = 0;
> -    for(i=0;i<128;i++) {
> -        if (i != 127) {
> -            v1 = xlaw2linear(i ^ mask);
> -            v2 = xlaw2linear((i + 1) ^ mask);
> -            v = (v1 + v2 + 4) >> 3;
> -        } else {
> -            v = 8192;
> -        }
> -        for(;j<v;j++) {
> +    j = 1;
> +    linear_to_xlaw[8192] = mask;
> +    for(i=0;i<127;i++) {
> +        v1 = xlaw2linear(i ^ mask);
> +        v2 = xlaw2linear((i + 1) ^ mask);
> +        v = (v1 + v2 + 4) >> 3;
> +        for(;j<v;j+=1) {
> +            linear_to_xlaw[8192 - j] = (i ^ (mask ^ 0x80));
>              linear_to_xlaw[8192 + j] = (i ^ mask);
> -            if (j > 0)
> -                linear_to_xlaw[8192 - j] = (i ^ (mask ^ 0x80));
>          }
>      }
> +    for(;j<8192;j++) {
> +        linear_to_xlaw[8192 - j] = (127 ^ (mask ^ 0x80));
> +        linear_to_xlaw[8192 + j] = (127 ^ mask);
> +    }
>      linear_to_xlaw[0] = linear_to_xlaw[1];


I think you can make the tables 8 times smaller.
The points in the table where the values transition always seemed to be
a multiple of 8 apart, so just adjusting the offset in
pcm_encode_frame() would allow changing the >> 2 to >> 5.

If that works out, it would make the table generation 8 times faster,
reduce the memory needed, and speed up the code at runtime due to lower
pressure on the L1/L2 caches.
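
A minimal sketch of how the "multiple of 8 apart" observation could be
checked before shrinking the tables. The helper names transition_phase()
and decimate_table() are hypothetical and not part of libavcodec. The
sketch only assumes a byte table such as linear_to_xlaw, and makes no claim
about what the real A-law/mu-law tables return.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: a transition is an index t with
 * table[t] != table[t - 1]. Returns the common value of t % factor over
 * all transitions, 0 if the table has no transitions, or -1 if the
 * transitions do not share one phase (the table cannot be decimated
 * by 'factor' with a single offset). */
static int transition_phase(const uint8_t *table, size_t size, unsigned factor)
{
    int phase = -1;

    assert(factor > 0);
    for (size_t t = 1; t < size; t++) {
        int p;

        if (table[t] == table[t - 1])
            continue;
        p = (int)(t % factor);
        if (phase < 0)
            phase = p;          /* first transition fixes the phase */
        else if (p != phase)
            return -1;          /* transitions are not aligned */
    }
    return phase < 0 ? 0 : phase;
}

/* Hypothetical helper: builds the reduced table so that
 * small_table[(i + factor - phase) / factor] == table[i] for every i,
 * provided transition_phase() returned 'phase'.
 * small_table must hold size / factor + 1 entries. */
static void decimate_table(uint8_t *small_table, const uint8_t *table,
                           size_t size, unsigned factor, unsigned phase)
{
    assert(factor > 0 && phase < factor);
    for (size_t i = 0; i < size; i++)
        small_table[(i + factor - phase) / factor] = table[i];
}
```

If the phase check succeeds with factor 8, the lookup index
(i + 8 - phase) >> 3 replaces i. With i = (v + 32768) >> 2 that amounts to
a single >> 5 on the sample plus an adjusted offset, which is the change
suggested above.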


[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

During times of universal deceit, telling the truth becomes a
revolutionary act. -- George Orwell



Re: [FFmpeg-devel] [PATCH 1/2] lavc/pcm_tablegen: slight speedup of table generation
