From: daichengrong
On Banana PI F3:
hevc_idct_32x32_8_c:118807.4 ( 1.00x)
hevc_idct_32x32_8_rvv_i64: 13853.3 ( 8.58x)
hevc_idct_32x32_8_rvv_i64: 20247.3 ( 5.92x) (before)
Changes in v5
From: daichengrong
Since there were no comments on v2 and v3, we continued optimizing
based on the comments on v1.
We spilled the slides to memory to improve performance, and optimized the
extraction of elements from vector registers.
On Banana PI F3:
hevc_idct_32x32_8_c
From: daichengrong
On Banana PI F3:
hevc_idct_32x32_8_c:118833.7 ( 1.00x)
hevc_idct_32x32_8_rvv_i64: 28718.3 ( 4.14x)
Changes in v3:
remove the slides in transposition and spill values from vector
registers to
ping~
From: daichengrong
riscv/hevcdsp_idct_rvv: Optimize idct_32x32_8
On Banana PI F3:
hevc_idct_32x32_8_c:118945.0 ( 1.00x)
hevc_idct_32x32_8_rvv_i64: 28503.7 ( 4.17x)
Signed-off-by: daichengrong
From: daichengrong
riscv/hevcdsp_idct_rvv: Optimize idct_32x32_8
On Banana PI F3:
hevc_idct_32x32_8_c:118945.0 ( 1.00x)
hevc_idct_32x32_8_rvv_i64: 28503.7 ( 4.17x)
Signed-off-by: daichengrong
---
libavcodec/riscv
From: daichengrong
riscv/hevcdsp_idct_rvv: Optimize idct_32x32_8
On Banana PI F3:
hevc_idct_32x32_8_c:119579.3 ( 1.00x)
hevc_idct_32x32_8_rvv_i64: 51254.4 ( 2.33x)
Signed-off-by: daichengrong
---
libavcodec/riscv
On 2025/3/20 19:17:21, Rémi Denis-Courmont wrote:
Hi,
On 20 March 2025 11:27:39 GMT+02:00, daichengrong wrote:
Availability of RVV and ZVBB should be determined with dl_hwcap.
No. That's completely superfluous since we already check for kernel support
with hwprobe().
No. If the oper
From: daichengrong
Availability of RVV and ZVBB should be determined with dl_hwcap.
As those extensions rely on vector registers, kernel vector support
is required to save and restore the vector state across context switches.
FFmpeg requires hwprobe for hardware capability detection, and cooperates
with dl_hwcap
Hi,
Your reply was mistakenly classified as spam, so I did not see it in
time.
Sorry for the late reply.
On 2025/3/15 12:03:09, Rémi Denis-Courmont wrote:
Hi,
On 14 March 2025 17:32:57 GMT+07:00, daichengr...@iscas.ac.cn wrote:
From: daichengrong
Availability of RVV and ZVBB should be
From: daichengrong
This patch introduces an RVV-optimized conv_flt_to_s16.
On Banana PI F3, it yields an average improvement of 5% for 2 SAMPLES.
---
libswresample/audioconvert.c | 2 +
libswresample/riscv/Makefile | 3 ++
libswresample/riscv/audio_convert_init.c
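For context, a float to signed 16-bit conversion can be sketched in scalar C as below. This is only an illustration and is not taken from the patch: the function name flt_to_s16_scalar is hypothetical, scaling by 2^15 with saturation is assumed as the usual convention, and rounding is half away from zero, which may differ from what the real routine does.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference: convert n float samples (nominally in
 * [-1.0, 1.0)) to saturated signed 16-bit samples. */
static void flt_to_s16_scalar(int16_t *dst, const float *src, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        float v = src[i] * 32768.0f;      /* scale by 2^15 (assumed convention) */

        if (v != v)                       /* NaN: map to silence */
            v = 0.0f;
        if (v > 32767.0f)                 /* saturate before the cast so the */
            v = 32767.0f;                 /* conversion stays in range */
        if (v < -32768.0f)
            v = -32768.0f;

        /* Round half away from zero, then truncate toward zero. */
        dst[i] = (int16_t)(v >= 0.0f ? v + 0.5f : v - 0.5f);
    }
}
```

A vectorized version would process several samples per iteration with the same scale, round and saturate steps.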
From: daichengrong
Availability of RVV and ZVBB should be determined with dl_hwcap.
As those extensions rely on vector registers, kernel vector support
is required to save and restore the vector state across context switches.
FFmpeg requires hwprobe for hardware capability detection, and cooperates
with dl_hwcap
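For readers unfamiliar with the mechanism, the following is a minimal sketch of how a single-letter ISA extension bit such as 'V' can be tested in the Linux AT_HWCAP auxiliary vector value. It is not the code from the patch: the helper names hwcap_has_ext and kernel_reports_vector are hypothetical, and the sketch covers only the single-letter bit encoding, not multi-letter extensions such as Zvbb.

```c
#include <stdbool.h>
#if defined(__linux__)
#include <sys/auxv.h>   /* getauxval(), AT_HWCAP */
#endif

/* Hypothetical helper: on Linux/RISC-V, bit (letter - 'A') of AT_HWCAP
 * is set when the kernel reports that single-letter ISA extension. */
static bool hwcap_has_ext(unsigned long hwcap, char letter)
{
    if (letter < 'A' || letter > 'Z')
        return false;
    return ((hwcap >> (letter - 'A')) & 1UL) != 0;
}

/* Hypothetical wrapper: returns true only on Linux/RISC-V when the
 * kernel sets the 'V' bit; on other platforms it reports false. */
static bool kernel_reports_vector(void)
{
#if defined(__linux__) && defined(__riscv)
    return hwcap_has_ext(getauxval(AT_HWCAP), 'V');
#else
    return false;
#endif
}
```

Whether such a check is needed in addition to hwprobe is the point under discussion in this thread; the sketch takes no position on that.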