On Mon, May 20, 2024 at 7:23 AM Ronald S. Bultje <rsbul...@gmail.com> wrote:
> Hi,
>
> This is mostly good, the following are tiny nitpicks.
>
> On Sun, May 19, 2024 at 8:46 PM Stone Chen <chen.stonec...@gmail.com>
> wrote:
>
>> +%macro INIT_OFFSET 6 ; src1, src2, dxq, dyq, off1, off2
>>
>
> The macro is only used once, so you could inline it in the calling
> function.
>
>>
>> +    imul    %5, 128
>> +    imul    %6, 128
>>
>
> I believe shl is typically preferred over imul for powers of two.
>
>
>> +    add     %5, 2
>> +    add     %6, 2
>>
>
> And these can be integrated as a constant offset in the lea below (lea %1,
> [%1 + %5 * 2 + 2 * 2], same for %2).
>
>
>> +    add     %5, %3
>> +    sub     %6, %3
>> +
>> +    lea     %1, [%1 + %5 * 2]
>> +    lea     %2, [%2 + %6 * 2]
>
> [..]
>
>> +cglobal vvc_sad, 6, 11, 5, src1, src2, dx, dy, block_w, block_h, off1,
>> off2, row_idx, dx2, dy2
>> +    movsxd  dx2q, dxd
>> +    movsxd  dy2q, dyd
>>
>
> If you change the argument type from int to intptr_t, this is not
> necessary anymore.
>
>
>> +    vvc_sad_16_128:
>> +        .loop_height:
>> +        mov     off1q, src1q
>> +        mov     off2q, src2q
>> +        mov     row_idxd, block_wd
>> +        sar     row_idxd, 4
>>
>
> You could right-shift block_wd by 4 outside the loop (before .loop_height).
>
> Ronald
>

On Mon, May 20, 2024 at 11:53 AM Ronald S. Bultje <rsbul...@gmail.com> wrote:

> Hi,
>
> one more, I forgot.
>
> On Sun, May 19, 2024 at 8:46 PM Stone Chen <chen.stonec...@gmail.com>
> wrote:
>
>> +pw_1: dw 1
>>
> [..]
>
>> +    vpbroadcastw    m4, [pw_1]
>>
>
> We typically suggest to use vpbroadcastd, not w (and then pw_1: times 2 dw
> 1). Agner shows that on e.g. Haswell, the former (d) is 1 uop with 5
> cycles latency, whereas the latter (w) is 3 uops with 7 cycles latency, or
> more generally d is faster than w.
>
> Ronald
>

Hi Ronald,

I've sent a v5 incorporating all the above, thank you for the feedback!

-Stone
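
For anyone reading along, here is a rough C model of why the suggested rewrites should not change behaviour. It is only a sketch, not the patch's code: the helper names (byte_offset_original, byte_offset_folded, broadcast_word, broadcast_dword, run_checks) are made up for illustration, registers are modeled as uint64_t so wraparound matches two's-complement hardware, the value held in %5 before the multiply is assumed to be some signed row count, and m4 is assumed to be a 256-bit register with 16 word lanes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Byte offset as the quoted macro computes it. For the second operand
 * (sub %6, %3) pass -dx instead of dx. */
static uint64_t byte_offset_original(int64_t row, int64_t dx)
{
    uint64_t off = (uint64_t)row * 128u;   /* imul %5, 128 */
    off += 2;                              /* add  %5, 2 */
    off += (uint64_t)dx;                   /* add  %5, %3 */
    return off * 2;                        /* lea  %1, [%1 + %5 * 2] */
}

/* Same offset with the review suggestions applied: shift instead of
 * multiply, and the "+ 2" folded into the lea as a constant 2 * 2. */
static uint64_t byte_offset_folded(int64_t row, int64_t dx)
{
    uint64_t off = (uint64_t)row << 7;     /* shl %5, 7 */
    off += (uint64_t)dx;                   /* add %5, %3 */
    return off * 2 + 2 * 2;                /* lea %1, [%1 + %5 * 2 + 2 * 2] */
}

/* Model of vpbroadcastw m4, [pw_1] with "pw_1: dw 1":
 * one 16-bit word is copied into every word lane. */
static void broadcast_word(uint16_t lanes[16], const uint16_t *src)
{
    for (int i = 0; i < 16; i++)
        lanes[i] = src[0];
}

/* Model of vpbroadcastd m4, [pw_1] with "pw_1: times 2 dw 1":
 * one 32-bit dword (two words) is copied into every dword lane. */
static void broadcast_dword(uint16_t lanes[16], const uint16_t *src)
{
    uint32_t d;
    memcpy(&d, src, sizeof d);
    for (int i = 0; i < 8; i++)
        memcpy(&lanes[2 * i], &d, sizeof d);
}

/* Compare both variants over a small range of signed inputs. */
static void run_checks(void)
{
    for (int64_t row = -64; row <= 64; row++)
        for (int64_t dx = -64; dx <= 64; dx++)
            assert(byte_offset_original(row, dx) == byte_offset_folded(row, dx));

    const uint16_t pw_1_w[1] = { 1 };      /* pw_1: dw 1 */
    const uint16_t pw_1_d[2] = { 1, 1 };   /* pw_1: times 2 dw 1 */
    uint16_t a[16], b[16];
    broadcast_word(a, pw_1_w);
    broadcast_dword(b, pw_1_d);
    assert(memcmp(a, b, sizeof a) == 0);
}
```

Calling run_checks() from a small main() exercises both comparisons. The model only speaks to the results being identical; the uop and latency difference between the two broadcasts is a hardware property it cannot show.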