On Fri, Dec 18, 2015 at 1:11 AM, Kacper Michajlow <kaspe...@gmail.com> wrote:
> On 18 Dec 2015 10:06 AM, "Kacper Michajlow" <kaspe...@gmail.com> wrote:
>>
>> One minor nitpick about commit message. You could mention which compiler
>> was used to generate code for benchmark. For example Clang 3.7 replaces
>> pow(2,...) with exp2(...) call by itself. So you probably did use gcc.
>> Anyway since it is already merged I guess take my reply as a hint for next
>> time :)
Thanks: yes, I have been sloppy about this.

>>
>> Regards,
>> Kacper
>>
>> On 17 Dec 2015 5:14 PM, "Ganesh Ajjanagadde" <gajja...@mit.edu> wrote:
>>>
>>> On Tue, Dec 15, 2015 at 6:40 PM, Ganesh Ajjanagadde <gajja...@mit.edu> wrote:
>>> > On Tue, Dec 15, 2015 at 5:25 PM, Ganesh Ajjanagadde <gajja...@mit.edu> wrote:
>>> >> On Tue, Dec 15, 2015 at 2:23 AM, Michael Niedermayer <michae...@gmx.at> wrote:
>>> >>> On Wed, Dec 09, 2015 at 06:55:25PM -0500, Ganesh Ajjanagadde wrote:
>>> > [...]
>>> >>>>
>>> >>>> diff --git a/libavcodec/nellymoserenc.c b/libavcodec/nellymoserenc.c
>>> >>>> index d998dba..e6023e3 100644
>>> >>>> --- a/libavcodec/nellymoserenc.c
>>> >>>> +++ b/libavcodec/nellymoserenc.c
>>> >>>> @@ -179,8 +179,15 @@ static av_cold int encode_init(AVCodecContext *avctx)
>>> >>>>
>>> >>>>      /* Generate overlap window */
>>> >>>>      ff_init_ff_sine_windows(7);
>>> >>>> -    for (i = 0; i < POW_TABLE_SIZE; i++)
>>> >>>> -        pow_table[i] = pow(2, -i / 2048.0 - 3.0 + POW_TABLE_OFFSET);
>>> >>>> +    pow_table[0] = 1;
>>> >>>> +    pow_table[1024] = M_SQRT1_2;
>>> >>>> +    for (i = 1; i < 513; i++) {
>>> >>>> +        double tmp = exp2(-i / 2048.0);
>>> >>>> +        pow_table[i] = tmp;
>>> >>>> +        pow_table[1024-i] = M_SQRT1_2 / tmp;
>>> >>>> +        pow_table[1024+i] = tmp * M_SQRT1_2;
>>> >>>> +        pow_table[2048-i] = 0.5 / tmp;
>>> >>>
>>> >>> how much overall init time is gained by this ?
>>> >>> that is time in ffmpeg main() from start to finish when just opening
>>> >>> the file with no decoding aka ./ffmpeg -i somefile
>>> >>
>>> >> Don't know, all I know is cycles are unnecessarily wasted. Will put in
>>> >> cycle numbers.
>>> >>
>>> >
>>> > Here they are:
>>> > proposed:  424160 decicycles in pow_table, 512 runs, 0 skips
>>> > exp2 only: 1262093 decicycles in pow_table, 512 runs, 0 skips
>>> > old:       2849085 decicycles in pow_table, 512 runs, 0 skips
>>> >
>>> > Thus old to exp2 is roughly 2.25x speedup, exp2 to proposed roughly 3x
>>> > speedup, net ~ 6.7x speedup.
>>>
>>> took Michael's comment as a general ack, so pushed with addition of a
>>> comment and cycle numbers.
>>> _______________________________________________
>>> ffmpeg-devel mailing list
>>> ffmpeg-devel@ffmpeg.org
>>> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> Sorry for top post.

No problem.

>
> -Kacper