Re: [FFmpeg-devel] [PATCH] PPC64: Add versions of functions in libswscale/input.c optimized for POWER8 VSX SIMD.

Hendrik Leppkes Mon, 04 Jul 2016 11:56:41 -0700

On Mon, Jul 4, 2016 at 5:20 PM, Dan Parrot <dan.par...@mail.com> wrote:
>> Why is this not faster?
> Surprisingly, gcc is producing some badly suboptimal assembly. I need to
> follow up with IBM's Linux Technology Center. The major issue is that
> multiplication of vector quantities in C is generating as many
> multiplications in assembly as would scalar multiplication in a loop. No
> way that should be occurring.
>


This is the reason why we generally don't allow intrinsics-based
optimizations and ask people to write full assembly instead.
Hand-written assembly behaves more consistently everywhere.

- Hendrik
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
