[FFmpeg-devel] [PATCH v5] libavcodec/riscv:add RVV optimized idct_32x32_8 for HEVC

2025-05-30 Thread daichengrong
From: daichengrong
On Banana PI F3:
hevc_idct_32x32_8_c:       118807.4 ( 1.00x)
hevc_idct_32x32_8_rvv_i64:  13853.3 ( 8.58x)
hevc_idct_32x32_8_rvv_i64:  20247.3 ( 5.92x) (before)
Changes in v5

[FFmpeg-devel] [PATCH v4] libavcodec/riscv:add RVV optimized idct_32x32_8 for HEVC

2025-05-20 Thread daichengrong
From: daichengrong Since v2 and v3 received no comments, we continued optimizing based on the comments on v1. We spilled the slides to memory to improve performance, and optimized the extraction of elements from vector registers. On Banana PI F3: hevc_idct_32x32_8_c

[FFmpeg-devel] [PATCH v3] libavcodec/riscv:add RVV optimized idct_32x32_8 for HEVC:

2025-05-07 Thread daichengrong
From: daichengrong
On Banana PI F3:
hevc_idct_32x32_8_c:       118833.7 ( 1.00x)
hevc_idct_32x32_8_rvv_i64:  28718.3 ( 4.14x)
Changes in v3: remove the slides in transposition and spill values from vector registers to

Re: [FFmpeg-devel] [PATCH v2] libavcodec/riscv:add RVV optimized for idct_32x32_8:

2025-05-05 Thread daichengrong
ping~
From: daichengrong
riscv/hevcdsp_idct_rvv: Optimize idct_32x32_8
On Banana PI F3:
hevc_idct_32x32_8_c:       118945.0 ( 1.00x)
hevc_idct_32x32_8_rvv_i64:  28503.7 ( 4.17x)
Signed-off-by: daichengrong

[FFmpeg-devel] [PATCH v v2] libavcodec/riscv:add RVV optimized for idct_32x32_8:

2025-04-28 Thread daichengrong
From: daichengrong
riscv/hevcdsp_idct_rvv: Optimize idct_32x32_8
On Banana PI F3:
hevc_idct_32x32_8_c:       118945.0 ( 1.00x)
hevc_idct_32x32_8_rvv_i64:  28503.7 ( 4.17x)
Signed-off-by: daichengrong
---
 libavcodec/riscv

[FFmpeg-devel] [PATCH] libavcodec/riscv:add RVV optimized for idct_32x32_8:

2025-04-15 Thread daichengrong
From: daichengrong
riscv/hevcdsp_idct_rvv: Optimize idct_32x32_8
On Banana PI F3:
hevc_idct_32x32_8_c:       119579.3 ( 1.00x)
hevc_idct_32x32_8_rvv_i64:  51254.4 ( 2.33x)
Signed-off-by: daichengrong
---
 libavcodec/riscv
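
The parenthesised factors in these benchmark lines appear to be the C reference figure divided by the RVV figure. A minimal sketch of that arithmetic follows; the helper name `speedup` is hypothetical, and the unit of the raw figures is not stated in the snippet, so it is treated here only as "lower is better".

```c
#include <stdio.h>

/* Hypothetical helper: ratio of the baseline figure to the optimized one.
 * Assumes both figures use the same unit and that lower is better. */
static double speedup(double baseline, double optimized)
{
    return baseline / optimized;
}

/* Print one benchmark line in a layout similar to the one quoted above. */
static void print_line(const char *name, double value, double baseline)
{
    printf("%s: %.1f (%5.2fx)\n", name, value, speedup(baseline, value));
}
```

For the figures in this entry, `speedup(119579.3, 51254.4)` lands close to the quoted 2.33x.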

Re: [FFmpeg-devel] [PATCH] RISC-V:update ff_get_cpu_flags_riscv for RVV

2025-04-05 Thread daichengrong
On 2025/3/20 19:17:21, Rémi Denis-Courmont wrote: Hi, On 20 March 2025 11:27:39 GMT+02:00, daichengrong wrote: Availability of RVV and ZVBB should be determined with dl_hwcap. No. That's completely superfluous since we already check for kernel support with hwprobe(). No. If the oper

[FFmpeg-devel] [PATCH v2] libavutil/riscv:update hwprobe for RVV and ZVBB with dl_hwcap

2025-04-05 Thread daichengrong
From: daichengrong Availability of RVV and ZVBB should be determined with dl_hwcap. As those extensions rely on vector registers, kernel vector support is required to save and restore the vector state across context switches. FFmpeg requires hwprobe for hardware capability detection, and cooperates with dl_hwcap
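
As a rough illustration of the kind of check being proposed, the sketch below reads the Linux auxiliary vector and tests the bit for the single-letter 'V' extension. It assumes that "dl_hwcap" in the message refers to the `AT_HWCAP` value exposed through `getauxval()`, and that single-letter RISC-V extensions map to bit `letter - 'A'`. The helper names are hypothetical and this is not the code from the patch. ZVBB is a multi-letter extension and is not covered by this single-letter scheme.

```c
#include <sys/auxv.h>

/* Assumed mapping: on Linux/RISC-V, a single-letter ISA extension is
 * reported in AT_HWCAP at bit (letter - 'A'). */
static unsigned long hwcap_bit(char letter)
{
    return 1ul << (letter - 'A');
}

/* Hypothetical helper: nonzero if the given hwcap word advertises the
 * single-letter extension. */
static int hwcap_has_ext(unsigned long hwcap, char letter)
{
    return (hwcap & hwcap_bit(letter)) != 0;
}

/* Hypothetical helper: query the running process's AT_HWCAP and test 'V'.
 * The result is only meaningful on a RISC-V Linux system. */
static int kernel_advertises_vector(void)
{
    return hwcap_has_ext(getauxval(AT_HWCAP), 'V');
}
```

A caller could combine `kernel_advertises_vector()` with the existing hwprobe() result and enable RVV code paths only when both agree, which seems to be the cooperation the message describes.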

Re: [FFmpeg-devel] [PATCH] RISC-V:update ff_get_cpu_flags_riscv for RVV

2025-03-20 Thread daichengrong
Hi, the reply email was mistakenly classified as spam, so I did not see it in time. Sorry for the late reply. On 2025/3/15 12:03:09, Rémi Denis-Courmont wrote: Hi, On 14 March 2025 17:32:57 GMT+07:00, daichengr...@iscas.ac.cn wrote: From: daichengrong Availability of RVV and ZVBB should be

[FFmpeg-devel] [PATCH] libswresample/riscv:add RVV optimized for conv_flt_to_s16

2025-03-20 Thread daichengrong
From: daichengrong
This patch introduces an RVV-optimized conv_flt_to_s16. On Banana PI F3, it gets an average improvement of 5% for 2 SAMPLES.
---
 libswresample/audioconvert.c | 2 +
 libswresample/riscv/Makefile | 3 ++
 libswresample/riscv/audio_convert_init.c
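
For readers unfamiliar with the routine, a float-to-s16 conversion scales each sample by 2^15 and saturates it to the 16-bit range. The scalar sketch below shows that idea only. It is not the patch's RVV code, the function names are hypothetical, and it rounds half away from zero as a simplifying assumption, which may differ from the rounding the library's C path uses.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference: convert one float sample in [-1.0, 1.0)
 * to signed 16-bit, saturating out-of-range input. */
static int16_t flt_to_s16(float x)
{
    float scaled = x * 32768.0f;

    if (scaled != scaled)          /* NaN: map to silence */
        return 0;
    if (scaled >= 32767.0f)        /* saturate before the int conversion */
        return INT16_MAX;
    if (scaled <= -32768.0f)
        return INT16_MIN;
    /* Assumption: round half away from zero. */
    return (int16_t)(int32_t)(scaled + (scaled >= 0.0f ? 0.5f : -0.5f));
}

/* Convert n samples from src to dst. */
static void conv_flt_to_s16_ref(int16_t *dst, const float *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = flt_to_s16(src[i]);
}
```

A vectorized version would process many samples per loop iteration with the same scale, round and saturate steps.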

[FFmpeg-devel] [PATCH] RISC-V:update ff_get_cpu_flags_riscv for RVV

2025-03-14 Thread daichengrong
From: daichengrong Availability of RVV and ZVBB should be determined with dl_hwcap. As those extensions rely on vector registers, kernel vector support is required to save and restore the vector state across context switches. FFmpeg requires hwprobe for hardware capability detection, and cooperates with dl_hwcap