Hi,

On Sun, Feb 25, 2024 at 5:30 PM Henrik Gramner via ffmpeg-devel <ffmpeg-devel@ffmpeg.org> wrote:

> On Sun, Feb 25, 2024 at 5:42 PM Ronald S. Bultje <rsbul...@gmail.com> wrote:
> > +    mova            m13, [pw_8]
> > +    paddw           m10, m12, m12
> > +    paddw           m12, m10 ; 9 * (q0 - p0) - 3 * ( q1 - p1 )
> >      paddw           m12, m13; + 8
>
> Memory operand
>
> > +    paddw           m10, m13, m13
> > +    paddw           m13, m10 ; abs(9 * (q0 - p0) - 3 * ( q1 - p1 ))
> > +    paddw           m13, [pw_8]
> [...]
> > +    paddw           m13, m12, m12
> > +    paddw           m13, m12 ; 3*abs(m12)
> > +    paddw           m13, [pw_8]
>
> Another minor improvement would be to reorder the adds like (x + x) +
> (x + 8) instead of ((x + x) + x) + 8 to allow for more
> instruction-level parallelism.

New version attached.

Ronald
0001-hevc-x86-deblock-fix-12bit-overflow.patch