Hi,

On Mon, Mar 7, 2016 at 10:48 PM, Ganesh Ajjanagadde <gajja...@gmail.com> wrote:
> This is ~2x faster for y not an integer on Haswell+GCC, and should
> generally be faster due to the fact that anyway powf essentially does
> this under the hood.
>
> Note that there are some accuracy differences, that should generally be
> negligible. In particular, FATE still passes on this platform.
>
> Results in ~ 7% speedup in aac encoding with -march=native, Haswell+GCC.
> before:
> ffmpeg -i sin.flac -acodec aac -y sin_new.aac 6.05s user 0.06s system
> 104% cpu 5.821 total
>
> after:
> ffmpeg -i sin.flac -acodec aac -y sin_new.aac 5.67s user 0.03s system
> 105% cpu 5.416 total
>
> This is also faster than an alternative approach that pulls in powf, gets
> rid of the crufty NaN checks and other special cases, exploits knowledge
> about the intervals, etc.
> This of course does not exclude smarter approaches; just suggests that
> there would need to be significant work on this front of lower utility than
> searches for hotspots elsewhere.
>
> Signed-off-by: Ganesh Ajjanagadde <gajja...@gmail.com>
> ---
>  libavcodec/aacenc_utils.h | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/libavcodec/aacenc_utils.h b/libavcodec/aacenc_utils.h
> index 56e3462..b7f80c6 100644
> --- a/libavcodec/aacenc_utils.h
> +++ b/libavcodec/aacenc_utils.h
> @@ -121,7 +121,10 @@ static inline float find_form_factor(int group_len, int swb_size, float thresh,
>              if (s >= ethresh) {
>                  nzl += 1.0f;
>              } else {
> -                nzl += powf(s / ethresh, nzslope);
> +                if (nzslope == 2.f)
> +                    nzl += (s / ethresh) * (s / ethresh);
> +                else
> +                    nzl += expf(logf(s / ethresh) * nzslope);
>              }
>          }
>

There's two changes here. Which gives the speedup? I don't like the second
(pow -> exp(log())) if it doesn't give a speedup (I don't see why it would,
also).

Ronald