Re: [FFmpeg-devel] [PATCH] PPC64: Add versions of functions in libswscale/input.c optimized for POWER8 VSX SIMD.

Dan Parrot Mon, 04 Jul 2016 12:01:39 -0700

On Mon, 2016-07-04 at 20:55 +0200, Hendrik Leppkes wrote:
> On Mon, Jul 4, 2016 at 5:20 PM, Dan Parrot <dan.par...@mail.com> wrote:
> >> Why is this not faster?
> > Surprisingly, gcc is producing some badly suboptimal assembly. I need to
> > follow up with IBM's Linux Technology Center. The major issue is that
> > multiplication of vector quantities in C is generating as many
> > multiplications in assembly as would scalar multiplication in a loop. No
> > way that should be occurring.
> >
> 
> This is the reason why we generally don't allow intrinsic
> optimizations and ask people to write full assembly instead.
> It behaves more consistently everywhere.


Is this then a requirement to abandon the use of intrinsics for PPC64
SIMD and instead re-implement in assembly?
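
For anyone who wants to reproduce the code-generation issue outside the patch, here is a minimal sketch of the comparison. It is not the code from the patch: it uses GCC's generic vector extension rather than the AltiVec/VSX intrinsics, and the names v4u32, mul_vector and mul_scalar are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* GCC/Clang generic vector type: four 32-bit lanes in 16 bytes.
 * (Hypothetical name; the patch itself uses AltiVec/VSX types.) */
typedef uint32_t v4u32 __attribute__((vector_size(16)));

/* Vector form: a single '*' on vector operands. Ideally the compiler
 * turns this into one SIMD multiply rather than four scalar ones. */
void mul_vector(uint32_t *dst, const uint32_t *a, const uint32_t *b)
{
    v4u32 va, vb, vr;
    memcpy(&va, a, sizeof va);   /* load without assuming alignment */
    memcpy(&vb, b, sizeof vb);
    vr = va * vb;                /* lane-wise multiply, modulo 2^32 */
    memcpy(dst, &vr, sizeof vr);
}

/* Scalar reference: four separate multiplies in a loop. */
void mul_scalar(uint32_t *dst, const uint32_t *a, const uint32_t *b)
{
    for (int i = 0; i < 4; i++)
        dst[i] = a[i] * b[i];
}
```

Compiling this with something like gcc -O2 -mcpu=power8 -S and counting the multiply instructions emitted for each function should show whether the vector form is really being lowered to scalar multiplies.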




_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
