On 09.03.2016, at 04:16, Ganesh Ajjanagadde <gajja...@gmail.com> wrote:

> Yields 2x improvement in function performance, and boosts aac encoding
> speed by ~ 4% overall. Sample benchmark (Haswell+GCC under -march=native):
> after:
> ffmpeg -i sin.flac -acodec aac -y sin_new.aac  5.22s user 0.03s system 105% cpu 4.970 total
> 
> before:
> ffmpeg -i sin.flac -acodec aac -y sin_new.aac  5.40s user 0.05s system 105% cpu 5.162 total
> 
> Big shame that len-1 is -1 mod 4; 0 mod 4 would have yielded a further
> 2x through additional symmetry. Of course, one could approximate with
> the 0 mod 4 variant, error would essentially be ~ 1/len in the worst
> case.

Note that I have no idea why we are using double here (is there a good
reason?). It doesn't really matter for the rest of the code, but cosf is
also at least twice as fast as cos.
Switching to float probably gives a smaller error than fudging for
symmetry, and should be enough to push the speed cost of this function
close to negligible.
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
