+        %else
+            pand       m1, m6, m1
+            pandn      m0, m6, m0
+            por        m0, m0, m1
+        %endif

Isn't that pattern a vpblendb or some such?

I think Kieran already responded to this on IRC, but I will too. Unfortunately not: this blend is at the bit level, whereas the blend instructions select whole bytes or wider elements. This is v210, so the packing has the middle sample overlapping with the bottom sample in the second byte of each 32-bit word.
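
To illustrate the point, here is a minimal C sketch of the same mask-and-merge shape as the pand/pandn/por sequence. It assumes the usual v210 layout of three 10-bit samples in bits 0-9, 10-19 and 20-29 of a little-endian 32-bit word. The helper names and the mask value are hypothetical and are not taken from the patch:

```c
#include <stdint.h>

/* Bitwise select: take bits from a where mask is 1 and from b where it is 0.
 * This mirrors pand (a & mask), pandn (b & ~mask), por (merge). */
static inline uint32_t bit_select(uint32_t mask, uint32_t a, uint32_t b)
{
    return (a & mask) | (b & ~mask);
}

/* Pack three 10-bit samples into one v210-style 32-bit word
 * (assumed layout: s0 in bits 0-9, s1 in bits 10-19, s2 in bits 20-29). */
static inline uint32_t v210_pack(uint32_t s0, uint32_t s1, uint32_t s2)
{
    return (s0 & 0x3FFu) | ((s1 & 0x3FFu) << 10) | ((s2 & 0x3FFu) << 20);
}

/* Hypothetical mask covering only the middle sample, bits 10-19. */
#define V210_MIDDLE_MASK (0x3FFu << 10)

/* Replace only the middle sample of dst with the middle sample of src. */
static inline uint32_t v210_replace_middle(uint32_t dst, uint32_t src)
{
    return bit_select(V210_MIDDLE_MASK, src, dst);
}
```

In this layout the second byte of the middle-sample mask is 0xFC: its top six bits belong to the middle sample and its low two bits to the bottom sample. A blend that picks whole bytes therefore cannot separate the two.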

I also want to amend my performance numbers on Broadwell. Kieran disagreed with my earlier result, and I can now confirm his: I can reproduce the 10% speedup on it:
    1676±14.6 vs 1426±20.9

I will re-check Zen and amend the commit message as necessary.
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

