Re: [FFmpeg-devel] [PATCH] swscale/ppc: VSX-optimize hscale_fast

Lauri Kasanen Tue, 30 Apr 2019 04:39:01 -0700

On Wed, 24 Apr 2019 14:02:16 +0300
Lauri Kasanen <c...@gmx.com> wrote:


> ./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 -sws_flags fast_bilinear \
>         -s 2400x720 -f rawvideo -vframes 5 -pix_fmt abgr -nostats test.raw
>
> 4.27 speedup for hyscale_fast:
>   24796 UNITS in hyscale_fast,    4096 runs,      0 skips
>    5797 UNITS in hyscale_fast,    4096 runs,      0 skips
>
> 4.48 speedup for hcscale_fast:
>   19911 UNITS in hcscale_fast,    4095 runs,      1 skips
>    4437 UNITS in hcscale_fast,    4096 runs,      0 skips
>
> Signed-off-by: Lauri Kasanen <c...@gmx.com>
> ---
>  libswscale/ppc/swscale_vsx.c | 196 +++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 196 insertions(+)
>
> This has the same limit as the x86 version: it only handles output that is
> the same width as the input or larger. Shrinking would require a gather load,
> which doesn't exist on PPC and is slow even on x86 AVX. I tried a manual
> gather load, and the vector function was 20% slower than C.
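
For readers following the width limit above, here is a minimal scalar sketch
of fast-bilinear horizontal scaling with a 16.16 fixed-point position,
modelled on the general scheme of the C fallback. It is not the code from the
patch. The names hscale_fast_ref and src_span are hypothetical, and the
in-loop clamp at the right edge is a simplification chosen so the example
never reads past the source buffer. src_span counts how many consecutive
source bytes a block of n consecutive outputs can touch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar reference, not the patch code.
 * x_inc is the source step per output pixel in 16.16 fixed point,
 * i.e. (src_w << 16) / dst_w. */
static void hscale_fast_ref(int16_t *dst, int dst_w,
                            const uint8_t *src, int src_w, unsigned x_inc)
{
    unsigned xpos = 0;

    assert(src_w > 0 && dst_w > 0);
    for (int i = 0; i < dst_w; i++) {
        unsigned xx  = xpos >> 16;                   /* integer source index */
        int xalpha   = (int)((xpos & 0xFFFF) >> 9);  /* 7-bit blend weight   */

        if (xx >= (unsigned)(src_w - 1)) {
            /* Right edge: no neighbour to blend with, replicate last pixel. */
            dst[i] = (int16_t)(src[src_w - 1] * 128);
        } else {
            /* Blend src[xx] and src[xx + 1]; output carries 7 extra bits. */
            dst[i] = (int16_t)((src[xx] << 7) +
                               (src[xx + 1] - src[xx]) * xalpha);
        }
        xpos += x_inc;
    }
}

/* Hypothetical helper: number of consecutive source bytes that n consecutive
 * outputs starting at xpos can read (each output reads xx and xx + 1). */
static unsigned src_span(unsigned xpos, unsigned x_inc, int n)
{
    unsigned first = xpos >> 16;
    unsigned last  = ((xpos + (unsigned)(n - 1) * x_inc) >> 16) + 1;

    return last - first + 1;
}
```

With x_inc at or below 65536 (same width or larger), 16 outputs read at most
17 adjacent source bytes, so the inputs sit in a small contiguous window. With
x_inc = 131072 (2x shrink) the same 16 outputs spread over 32 bytes, and the
window keeps growing with the shrink factor, which is the situation a gather
load would be needed for. This is my reading of the limit, stated as an
assumption.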

Applying.

- Lauri
