On Mon, Jul 4, 2016 at 5:20 PM, Dan Parrot <dan.par...@mail.com> wrote:
>> Why is this not faster?
> Surprisingly, gcc is producing some badly suboptimal assembly. I need to
> follow up with IBM's Linux Technology Center. The major issue is that
> multiplication of vector quantities in C is generating as many
> multiplications in assembly as would scalar multiplication in a loop. No
> way that should be occurring.
>
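
For anyone following along, here is a minimal sketch of the kind of comparison described above. It is not the code from the patch: the `v4si` type, the 32-bit integer element type, and the function names `mul_scalar` and `mul_vector` are all assumptions made for illustration, using the generic GCC/Clang vector extension rather than any platform-specific intrinsics.

```c
/* Hypothetical illustration only; not taken from the patch under review.
 * v4si uses the GCC/Clang vector_size extension: four 32-bit ints
 * packed into one 16-byte vector. */
typedef int v4si __attribute__((vector_size(16)));

/* Scalar reference: four independent multiplies in a loop. */
static void mul_scalar(const int *a, const int *b, int *out)
{
    for (int i = 0; i < 4; i++)
        out[i] = a[i] * b[i];
}

/* Vector form: element-wise multiply written as a single C expression.
 * Whether this becomes one vector multiply or several scalar ones
 * depends on the compiler, the target, and the flags used. */
static v4si mul_vector(v4si a, v4si b)
{
    return a * b;
}
```

Compiling both functions with something like `gcc -O2 -S` and comparing the number of multiply instructions emitted for each is one way to check whether the vector form is actually being lowered to vector instructions on a given target.
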
This is why we generally don't allow intrinsics-based optimizations and ask
people to write full assembly instead. Hand-written assembly behaves more
consistently everywhere.

- Hendrik
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel