Re: [FFmpeg-devel] [PATCH] avcodec/nellymoserenc: avoid wasteful pow

Kacper Michajlow Fri, 18 Dec 2015 01:06:49 -0800

One minor nitpick about the commit message: you could mention which
compiler was used to generate the code for the benchmark. For example,
Clang 3.7 replaces pow(2, ...) with an exp2(...) call by itself, so you
probably used gcc. Anyway, since it is already merged, take my reply as
a hint for next time :)


Regards,
Kacper
On 17 Dec 2015 5:14 PM, "Ganesh Ajjanagadde" <gajja...@mit.edu> wrote:

> On Tue, Dec 15, 2015 at 6:40 PM, Ganesh Ajjanagadde <gajja...@mit.edu>
> wrote:
> > On Tue, Dec 15, 2015 at 5:25 PM, Ganesh Ajjanagadde <gajja...@mit.edu>
> wrote:
> >> On Tue, Dec 15, 2015 at 2:23 AM, Michael Niedermayer <michae...@gmx.at>
> wrote:
> >>> On Wed, Dec 09, 2015 at 06:55:25PM -0500, Ganesh Ajjanagadde wrote:
> > [...]
> >>>>
> >>>> diff --git a/libavcodec/nellymoserenc.c b/libavcodec/nellymoserenc.c
> >>>> index d998dba..e6023e3 100644
> >>>> --- a/libavcodec/nellymoserenc.c
> >>>> +++ b/libavcodec/nellymoserenc.c
> >>>> @@ -179,8 +179,15 @@ static av_cold int encode_init(AVCodecContext *avctx)
> >>>>
> >>>>      /* Generate overlap window */
> >>>>      ff_init_ff_sine_windows(7);
> >>>> -    for (i = 0; i < POW_TABLE_SIZE; i++)
> >>>> -        pow_table[i] = pow(2, -i / 2048.0 - 3.0 + POW_TABLE_OFFSET);
> >>>> +    pow_table[0] = 1;
> >>>> +    pow_table[1024] = M_SQRT1_2;
> >>>> +    for (i = 1; i < 513; i++) {
> >>>> +        double tmp = exp2(-i / 2048.0);
> >>>> +        pow_table[i] = tmp;
> >>>> +        pow_table[1024-i] = M_SQRT1_2 / tmp;
> >>>> +        pow_table[1024+i] = tmp * M_SQRT1_2;
> >>>> +        pow_table[2048-i] = 0.5 / tmp;
> >>>
> >>> how much overall init time is gained by this ?
> >>> that is time in ffmpeg main() from start to finish when just opening
> >>> the file with no decoding aka ./ffmpeg -i somefile
> >>
> >> Don't know, all I know is cycles are unnecessarily wasted. Will put in
> >> cycle numbers.
> >>
> >
> > Here they are:
> > proposed: 424160 decicycles in pow_table,     512 runs,      0 skips
> > exp2 only: 1262093 decicycles in pow_table,     512 runs,      0 skips
> > old: 2849085 decicycles in pow_table,     512 runs,      0 skips
> >
> > Thus old to exp2 is roughly 2.25x speedup, exp2 to proposed roughly 3x
> > speedup, net ~ 6.7x speedup.
>
> took Michael's comment as a general ack, so pushed with addition of a
> comment and cycle numbers.
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
