I guess it could also be scaled to ymm if you're a big Skylake fan :P (in which case you'd probably want to reorder the shuffle indices so that chroma comes first, i.e. movq [u] + movhps [v] + vinserti32x4[y])

What shuffle or permute did you have in mind when you suggested this for Skylake? Without the permute I'm not sure how the change in ordering helps. Aren't we stuck with data in separate lanes? I'm probably missing something though.
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Reply via email to