On Fri, Oct 2, 2015 at 6:57 PM, Paul B Mahol <one...@gmail.com> wrote:
> +INIT_XMM sse2
> +cglobal blend_xor, 9, 10, 2, 0, top, top_linesize, bottom, bottom_linesize, 
> dst, dst_linesize, width, start, end
[...]
> +cglobal blend_or, 9, 10, 2, 0, top, top_linesize, bottom, bottom_linesize, 
> dst, dst_linesize, width, start, end
[...]
> +cglobal blend_and, 9, 10, 2, 0, top, top_linesize, bottom, bottom_linesize, 
> dst, dst_linesize, width, start, end

You could do those with the floating-point bitwise instructions (xorps,
orps, andps); then you only need SSE instead of SSE2 (and AVX instead
of AVX2 if you want to make versions using ymm registers).
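To illustrate the idea outside of x86inc, here is a rough C sketch using only the SSE header. The helper name blend_xor_row_sse and the scalar tail loop are made up for this example and are not part of the patch; _mm_xor_ps is the intrinsic that maps to xorps.

```c
#include <stddef.h>
#include <stdint.h>
#include <xmmintrin.h>  /* SSE only: _mm_loadu_ps, _mm_xor_ps, _mm_storeu_ps */

/* Hypothetical helper (not from the patch): XOR one row of bytes.
 * The data is treated as opaque 128-bit chunks; no float arithmetic is
 * performed, xorps only flips bits. */
static void blend_xor_row_sse(const uint8_t *top, const uint8_t *bottom,
                              uint8_t *dst, size_t width)
{
    size_t x = 0;

    /* 16 bytes per iteration with unaligned loads and stores */
    for (; x + 16 <= width; x += 16) {
        __m128 a = _mm_loadu_ps((const float *)(top + x));
        __m128 b = _mm_loadu_ps((const float *)(bottom + x));
        _mm_storeu_ps((float *)(dst + x), _mm_xor_ps(a, b));
    }

    /* scalar tail for widths that are not a multiple of 16 */
    for (; x < width; x++)
        dst[x] = top[x] ^ bottom[x];
}
```

The or/and variants would presumably differ only in the intrinsic used (_mm_or_ps, _mm_and_ps).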

> +cglobal blend_addition, 9, 10, 3, 0, top, top_linesize, bottom, 
> bottom_linesize, dst, dst_linesize, width, start, end
[...]
> +        punpcklbw       m0, m2
> +        punpcklbw       m1, m2
> +        paddw           m0, m1
> +        packuswb        m0, m0
> +        movh    [dstq + x], m0
> +        add           r10q, mmsize / 2

paddusb does the saturating unsigned byte add directly, so the
unpack/pack is unnecessary.
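A small C sketch of why the two are equivalent, assuming SSE2 intrinsics. Both function names are hypothetical. The first mirrors the quoted unpack/paddw/packuswb sequence on 8 bytes, the second uses _mm_adds_epu8 (paddusb). The word sums are at most 510, so packuswb clamps them to 255, which is exactly what the saturating byte add does.

```c
#include <stdint.h>
#include <emmintrin.h>  /* SSE2 integer intrinsics */

/* Mirrors the quoted sequence: widen to words, add, pack with unsigned
 * saturation. Processes 8 bytes. */
static void add8_unpack_pack(const uint8_t *top, const uint8_t *bottom,
                             uint8_t *dst)
{
    const __m128i zero = _mm_setzero_si128();
    __m128i a = _mm_unpacklo_epi8(_mm_loadl_epi64((const __m128i *)top), zero);
    __m128i b = _mm_unpacklo_epi8(_mm_loadl_epi64((const __m128i *)bottom), zero);
    __m128i sum = _mm_add_epi16(a, b);

    _mm_storel_epi64((__m128i *)dst, _mm_packus_epi16(sum, sum));
}

/* Same 8 bytes with a single saturating byte add (paddusb). */
static void add8_paddusb(const uint8_t *top, const uint8_t *bottom,
                         uint8_t *dst)
{
    __m128i a = _mm_loadl_epi64((const __m128i *)top);
    __m128i b = _mm_loadl_epi64((const __m128i *)bottom);

    _mm_storel_epi64((__m128i *)dst, _mm_adds_epu8(a, b));
}
```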

> +cglobal blend_subtract, 9, 10, 3, 0, top, top_linesize, bottom, 
> bottom_linesize, dst, dst_linesize, width, start, end
[...]
> +        punpcklbw       m0, m2
> +        punpcklbw       m1, m2
> +        psubw           m0, m1
> +        packuswb        m0, m0

psubusb does the saturating unsigned byte subtract directly; no
unpack/pack needed.
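As a sketch in C, assuming SSE2 intrinsics: _mm_subs_epu8 maps to psubusb and clamps negative differences to 0, which matches what psubw followed by packuswb produces. The helper name blend_subtract_row_sse2 and the scalar tail are illustrative only and not taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <emmintrin.h>  /* SSE2 integer intrinsics */

/* Hypothetical helper (not from the patch): dst = max(top - bottom, 0)
 * per byte, 16 bytes per iteration. */
static void blend_subtract_row_sse2(const uint8_t *top, const uint8_t *bottom,
                                    uint8_t *dst, size_t width)
{
    size_t x = 0;

    for (; x + 16 <= width; x += 16) {
        __m128i a = _mm_loadu_si128((const __m128i *)(top + x));
        __m128i b = _mm_loadu_si128((const __m128i *)(bottom + x));
        /* psubusb: unsigned subtract with saturation at 0 */
        _mm_storeu_si128((__m128i *)(dst + x), _mm_subs_epu8(a, b));
    }

    /* scalar tail for the remaining bytes */
    for (; x < width; x++)
        dst[x] = top[x] > bottom[x] ? (uint8_t)(top[x] - bottom[x]) : 0;
}
```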

> +cglobal blend_darken, 9, 10, 2, 0, top, top_linesize, bottom, 
> bottom_linesize, dst, dst_linesize, width, start, end
[...]
> +        movh            m0, [topq + x]
> +        movh            m1, [bottomq + x]
> +        pminub          m0, m1
> +        movh    [dstq + x], m0
[...]
> +cglobal blend_lighten, 9, 10, 2, 0, top, top_linesize, bottom, 
> bottom_linesize, dst, dst_linesize, width, start, end
[...]
> +        movh            m0, [topq + x]
> +        movh            m1, [bottomq + x]
> +        pmaxub          m0, m1
> +        movh    [dstq + x], m0

You're only utilizing the lower half of the registers here (movh loads
and stores just 8 bytes of an xmm register).
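For comparison, a C sketch of the darken case that fills the whole 128-bit register, assuming SSE2 intrinsics. The unaligned full-width load corresponds roughly to using movu instead of movh. The helper name blend_darken_row_sse2 and the scalar tail are hypothetical; whether the real buffers allow aligned access or overreading is not something the quoted patch shows.

```c
#include <stddef.h>
#include <stdint.h>
#include <emmintrin.h>  /* SSE2 integer intrinsics */

/* Hypothetical helper (not from the patch): dst = min(top, bottom) per
 * byte, using all 16 bytes of the register per iteration. */
static void blend_darken_row_sse2(const uint8_t *top, const uint8_t *bottom,
                                  uint8_t *dst, size_t width)
{
    size_t x = 0;

    for (; x + 16 <= width; x += 16) {
        __m128i a = _mm_loadu_si128((const __m128i *)(top + x));
        __m128i b = _mm_loadu_si128((const __m128i *)(bottom + x));
        /* pminub on the full register: 16 pixels instead of 8 */
        _mm_storeu_si128((__m128i *)(dst + x), _mm_min_epu8(a, b));
    }

    /* scalar tail for the remaining bytes */
    for (; x < width; x++)
        dst[x] = top[x] < bottom[x] ? top[x] : bottom[x];
}
```

The lighten case would presumably be the same loop with _mm_max_epu8 (pmaxub).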