vf_blend : add avx2 version for 8b func (WIP)

Henrik Gramner Wed, 13 Dec 2017 08:43:44 -0800

On Sat, Dec 9, 2017 at 1:11 PM, Martin Vignali <martin.vign...@gmail.com> wrote:
> the idea in AVX2 is to load 128 bits of data (2x 64 bits),
> then shuffle across lanes so that the two 64-bit halves end up in the low
> part of each lane, to keep the rest of the process similar
> to the SSE version


What about using pmovzxbw instead of movu + vpermq + punpcklbw?
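
To illustrate, here is a rough sketch of both load variants written with C
intrinsics rather than the x86inc asm the patch itself uses. The function
names are made up for this example, and the target attribute is a GCC/Clang
extension used here only so the snippet builds without -mavx2. Both variants
should leave the 16 input bytes zero-extended to 16 words, in order:

```c
#include <immintrin.h>
#include <stdint.h>

/* Assumption: GCC/Clang. Enables AVX2 for these functions only. */
#define AVX2_FN __attribute__((target("avx2")))

/* Variant from the quoted description: movu + vpermq + punpcklbw.
 * (Hypothetical helper name, not taken from the patch.) */
AVX2_FN void widen_permute_unpack(const uint8_t src[16], uint16_t dst[16])
{
    /* movu: 128-bit load; the upper half of the ymm register is unused. */
    __m256i v = _mm256_castsi128_si256(_mm_loadu_si128((const __m128i *)src));
    /* vpermq: qword 0 -> low lane, qword 1 -> high lane. */
    v = _mm256_permute4x64_epi64(v, _MM_SHUFFLE(1, 1, 0, 0));
    /* punpcklbw with zero: widen the low 8 bytes of each lane to words. */
    v = _mm256_unpacklo_epi8(v, _mm256_setzero_si256());
    _mm256_storeu_si256((__m256i *)dst, v);
}

/* Suggested variant: a single pmovzxbw (in asm it can take the
 * 128-bit memory operand directly). */
AVX2_FN void widen_pmovzxbw(const uint8_t src[16], uint16_t dst[16])
{
    __m256i v = _mm256_cvtepu8_epi16(_mm_loadu_si128((const __m128i *)src));
    _mm256_storeu_si256((__m256i *)dst, v);
}
```

The second form needs no zero register and no cross-lane shuffle, which is
the point of the suggestion.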

> for the store, the idea is similar in the opposite way (shuffle before
> store)

You could also do vextracti128 + 128-bit packuswb instead of 256-bit
packuswb + vpermq.
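
Again as a rough sketch in C intrinsics (hypothetical function names,
GCC/Clang target attribute assumed, not the actual asm from the patch), the
two store variants would look something like this. Both should produce the
16 words saturated to unsigned bytes, in order:

```c
#include <immintrin.h>
#include <stdint.h>

/* Assumption: GCC/Clang. Enables AVX2 for these functions only. */
#define AVX2_FN __attribute__((target("avx2")))

/* Variant from the quoted description: 256-bit packuswb + vpermq.
 * (Hypothetical helper name, not taken from the patch.) */
AVX2_FN void narrow_pack_permute(const int16_t src[16], uint8_t dst[16])
{
    __m256i v = _mm256_loadu_si256((const __m256i *)src);
    /* packuswb works per lane, so each lane holds its 8 bytes twice. */
    v = _mm256_packus_epi16(v, v);
    /* vpermq: gather qword 0 and qword 2 into the low 128 bits. */
    v = _mm256_permute4x64_epi64(v, _MM_SHUFFLE(3, 1, 2, 0));
    _mm_storeu_si128((__m128i *)dst, _mm256_castsi256_si128(v));
}

/* Suggested variant: vextracti128 + 128-bit packuswb. */
AVX2_FN void narrow_extract_pack(const int16_t src[16], uint8_t dst[16])
{
    __m256i v  = _mm256_loadu_si256((const __m256i *)src);
    __m128i lo = _mm256_castsi256_si128(v);      /* words 0..7  */
    __m128i hi = _mm256_extracti128_si256(v, 1); /* words 8..15 */
    _mm_storeu_si128((__m128i *)dst, _mm_packus_epi16(lo, hi));
}
```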
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
