Re: [FFmpeg-devel] [PATCH 7/7] avcodec/riscv: add h264 qpel

2024-08-28 Thread Niklas Haas
On Wed, 28 Aug 2024 13:30:02 +0200 Niklas Haas wrote:
> On Tue, 27 Aug 2024 21:47:59 +0300 Rémi Denis-Courmont wrote:
> > On 27 August 2024 17:12:03 GMT+03:00, Niklas Haas wrote:
> > >> > +.irp x, \vregs
> > >> > +vmax.vx \x, \x, zero
> > >> > +.endr
> > >> > +

Re: [FFmpeg-devel] [PATCH 7/7] avcodec/riscv: add h264 qpel

2024-08-28 Thread Niklas Haas
On Tue, 27 Aug 2024 21:47:59 +0300 Rémi Denis-Courmont wrote:
> On 27 August 2024 17:12:03 GMT+03:00, Niklas Haas wrote:
> >> > +.irp x, \vregs
> >> > +vmax.vx \x, \x, zero
> >> > +.endr
> >> > +vsetvli zero, zero, e8, \lmul, ta, ma
> >> > +.irp x, \vr

Re: [FFmpeg-devel] [PATCH 7/7] avcodec/riscv: add h264 qpel

2024-08-27 Thread Rémi Denis-Courmont
On 27 August 2024 17:12:03 GMT+03:00, Niklas Haas wrote:
>> > +.irp x, \vregs
>> > +vmax.vx \x, \x, zero
>> > +.endr
>> > +vsetvli zero, zero, e8, \lmul, ta, ma
>> > +.irp x, \vregs
>> > +vnclipu.wi \x, \x, \shifti
>> > +.endr
>> > +.en
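
The macro fragment quoted in this message applies vmax.vx against the zero register and then, after switching to e8, vnclipu.wi with an immediate shift. As a rough illustration only, the per-element effect can be modelled in scalar C as below. The function name clip_s16_to_u8 is hypothetical, and the rounding step assumes the vxrm CSR is set to round-to-nearest-up (rnu), which the thread excerpt does not show.

```c
#include <stdint.h>

/*
 * Scalar sketch of one vector lane (hypothetical helper, not FFmpeg code):
 *   1. clamp a signed 16-bit value at zero      (vmax.vx  \x, \x, zero)
 *   2. shift right with rounding, saturate to 8 bits (vnclipu.wi \x, \x, \shifti)
 * Assumption: vxrm = rnu, i.e. half of the shifted-out range is added first.
 * For a 16-bit source only shift amounts 0..15 are meaningful.
 */
static uint8_t clip_s16_to_u8(int16_t x, unsigned shift)
{
    uint32_t v = x < 0 ? 0u : (uint32_t)x;          /* negative inputs become 0 */

    if (shift > 0)
        v = (v + (1u << (shift - 1))) >> shift;     /* rounding right shift */

    return v > UINT8_MAX ? UINT8_MAX : (uint8_t)v;  /* unsigned saturation */
}
```

With a shift of 5, for instance, an input of 512 maps to 16, any negative input maps to 0, and the largest 16-bit input saturates to 255.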

Re: [FFmpeg-devel] [PATCH 7/7] avcodec/riscv: add h264 qpel

2024-08-27 Thread Niklas Haas
On Mon, 19 Aug 2024 21:27:38 +0300 Rémi Denis-Courmont wrote:
> On Tuesday, 13 August 2024, 17:03:36 EEST, J. Dekker wrote:
> > +#include "libavutil/riscv/asm.S"
> > +
> > +.macro vnclipsu.wi shifti, lmul, lmul2, vregs:vararg
> > +vsetvli zero, zero, e16, \lmul2, ta, ma

Re: [FFmpeg-devel] [PATCH 7/7] avcodec/riscv: add h264 qpel

2024-08-19 Thread Rémi Denis-Courmont
On Tuesday, 13 August 2024, 17:03:36 EEST, J. Dekker wrote:
> +#include "libavutil/riscv/asm.S"
> +
> +.macro vnclipsu.wi shifti, lmul, lmul2, vregs:vararg
> +vsetvli zero, zero, e16, \lmul2, ta, ma
We don't typically do that for a very good reason. The vsetvli is most o

[FFmpeg-devel] [PATCH 7/7] avcodec/riscv: add h264 qpel

2024-08-13 Thread J. Dekker
From: Niklas Haas
checkasm: bench runs 131072 (1 << 17)
avg_h264_qpel_4_mc00_8_c: 37.6 ( 1.00x)
avg_h264_qpel_4_mc00_8_rvv_i32: 27.4 ( 1.37x)
avg_h264_qpel_4_mc01_8_c: 214.6 ( 1.00x)
avg_h264_qpel_4_mc01_8_rvv_i32: