Re: [FFmpeg-devel] [PATCH v3 7/7] avutil/la: Add function performance testing
On 2023/5/20 5:38 PM, Rémi Denis-Courmont wrote:
> On Saturday, 20 May 2023, 10:27:19 EEST, Hao Chen wrote:
>> From: yuanhecai
>>
>> This patch supports the use of the "checkasm --bench" testing feature
>> on loongarch platform.
>>
>> Change-Id: I42790388d057c9ade0dfa38a19d9c1fd44ca0bc3
>> ---
>>  libavutil/loongarch/timer.h | 48 +
>>  libavutil/timer.h           |  2 ++
>>  2 files changed, 50 insertions(+)
>>  create mode 100644 libavutil/loongarch/timer.h
>>
>> diff --git a/libavutil/loongarch/timer.h b/libavutil/loongarch/timer.h
>> new file mode 100644
>> index 00..44ed786409
>> --- /dev/null
>> +++ b/libavutil/loongarch/timer.h
>> @@ -0,0 +1,48 @@
>> +/*
>> + * Copyright (c) 2023 Loongson Technology Corporation Limited
>> + * Contributed by Hecai Yuan
>> + *
>> + * This file is part of FFmpeg.
>> + *
>> + * FFmpeg is free software; you can redistribute it and/or
>> + * modify it under the terms of the GNU Lesser General Public
>> + * License as published by the Free Software Foundation; either
>> + * version 2.1 of the License, or (at your option) any later version.
>> + *
>> + * FFmpeg is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> + * Lesser General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU Lesser General Public
>> + * License along with FFmpeg; if not, write to the Free Software
>> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
>> + */
>> +
>> +#ifndef AVUTIL_LOONGARCH_TIMER_H
>> +#define AVUTIL_LOONGARCH_TIMER_H
>> +
>> +#include <stdint.h>
>> +#include "config.h"
>> +
>> +#if HAVE_INLINE_ASM
>> +
>> +#define AV_READ_TIME read_time
>> +
>> +static inline uint64_t read_time(void)
>> +{
>> +
>> +#if ARCH_LOONGARCH64
>> +    uint64_t a, id = 0;
>
> Initial value is never used.
>
>> +    __asm__ volatile ( "rdtime.d %0, %1" : "=r"(a), "=r"(id) :: "memory" );
>> +    return a;
>> +#else
>> +    uint32_t a, id = 0;
>> +    __asm__ volatile ( "rdtimel.w %0, %1" : "=r"(a), "=r"(id) :: "memory" );
>> +    return (uint64_t)a;
>> +#endif
>
> Why do you clobber memory here?
>
>> +}
>> +
>> +#endif /* HAVE_INLINE_ASM */
>> +
>> +#endif /* AVUTIL_LOONGARCH_TIMER_H */
>> diff --git a/libavutil/timer.h b/libavutil/timer.h
>> index d3db5a27ef..861ba7e9d7 100644
>> --- a/libavutil/timer.h
>> +++ b/libavutil/timer.h
>> @@ -61,6 +61,8 @@
>>  #   include "riscv/timer.h"
>>  #elif ARCH_X86
>>  #   include "x86/timer.h"
>> +#elif ARCH_LOONGARCH
>> +#   include "loongarch/timer.h"
>>  #endif
>>
>>  #if !defined(AV_READ_TIME)

Thanks for your advice. As described in LoongArch's instruction manual,
the rdtime.d instruction is used as follows: rdtime.d rd, rj.
The rj register stores the counter ID. In this application, the value of
the counter ID is equal to 0.
In addition, clobbering memory is indeed unnecessary; I will remove it.

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
[FFmpeg-devel] Add LSX optimization in avcodec and swscale.
v1: Add LSX optimization in avcodec and swscale, because the 2K series CPUs only support LSX.
v2: Modified the implementation of some functions and added support for the checkasm --bench feature.
v3: Fix whitespace errors in patch.
v4: Remove clobbering memory in libavutil/loongarch/timer.h

[PATCH v4 1/7] avcodec/la: add LSX optimization for h264 idct.
[PATCH v4 2/7] avcodec/la: Add LSX optimization for loop filter.
[PATCH v4 3/7] avcodec/la: Add LSX optimization for h264 chroma and
[PATCH v4 4/7] avcodec/la: Add LSX optimization for h264 qpel.
[PATCH v4 5/7] swscale/la: Optimize the functions of the swscale
[PATCH v4 6/7] swscale/la: Add following builtin optimized functions
[PATCH v4 7/7] avutil/la: Add function performance testing
[FFmpeg-devel] [PATCH v4 1/7] avcodec/la: add LSX optimization for h264 idct.
From: Shiyou Yin loongson_asm.S is LoongArch asm optimization helper. Add functions: ff_h264_idct_add_8_lsx ff_h264_idct8_add_8_lsx ff_h264_idct_dc_add_8_lsx ff_h264_idct8_dc_add_8_lsx ff_h264_idct_add16_8_lsx ff_h264_idct8_add4_8_lsx ff_h264_idct_add8_8_lsx ff_h264_idct_add8_422_8_lsx ff_h264_idct_add16_intra_8_lsx ff_h264_luma_dc_dequant_idct_8_lsx Replaced function(LSX is sufficient for these functions): ff_h264_idct_add_lasx ff_h264_idct4x4_addblk_dc_lasx ff_h264_idct_add16_lasx ff_h264_idct8_add4_lasx ff_h264_idct_add8_lasx ff_h264_idct_add8_422_lasx ff_h264_idct_add16_intra_lasx ff_h264_deq_idct_luma_dc_lasx Renamed functions: ff_h264_idct8_addblk_lasx ==> ff_h264_idct8_add_8_lasx ff_h264_idct8_dc_addblk_lasx ==> ff_h264_idct8_dc_add_8_lasx ./configure --disable-lasx ffmpeg -i 1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -y /dev/null -an before: 155fps after: 161fps --- libavcodec/loongarch/Makefile | 3 +- libavcodec/loongarch/h264_deblock_lasx.c | 2 +- libavcodec/loongarch/h264dsp_init_loongarch.c | 39 +- libavcodec/loongarch/h264dsp_lasx.c | 2 +- .../{h264dsp_lasx.h => h264dsp_loongarch.h} | 60 +- libavcodec/loongarch/h264idct.S | 658 libavcodec/loongarch/h264idct_lasx.c | 498 - libavcodec/loongarch/h264idct_loongarch.c | 184 libavcodec/loongarch/loongson_asm.S | 945 ++ 9 files changed, 1848 insertions(+), 543 deletions(-) rename libavcodec/loongarch/{h264dsp_lasx.h => h264dsp_loongarch.h} (68%) create mode 100644 libavcodec/loongarch/h264idct.S delete mode 100644 libavcodec/loongarch/h264idct_lasx.c create mode 100644 libavcodec/loongarch/h264idct_loongarch.c create mode 100644 libavcodec/loongarch/loongson_asm.S diff --git a/libavcodec/loongarch/Makefile b/libavcodec/loongarch/Makefile index c1b5de5c44..34ebbbe133 100644 --- a/libavcodec/loongarch/Makefile +++ b/libavcodec/loongarch/Makefile @@ -12,7 +12,6 @@ OBJS-$(CONFIG_HEVC_DECODER) += loongarch/hevcdsp_init_loongarch.o LASX-OBJS-$(CONFIG_H264CHROMA)+= loongarch/h264chroma_lasx.o 
LASX-OBJS-$(CONFIG_H264QPEL) += loongarch/h264qpel_lasx.o LASX-OBJS-$(CONFIG_H264DSP) += loongarch/h264dsp_lasx.o \ - loongarch/h264idct_lasx.o \ loongarch/h264_deblock_lasx.o LASX-OBJS-$(CONFIG_H264PRED) += loongarch/h264_intrapred_lasx.o LASX-OBJS-$(CONFIG_VC1_DECODER) += loongarch/vc1dsp_lasx.o @@ -31,3 +30,5 @@ LSX-OBJS-$(CONFIG_HEVC_DECODER) += loongarch/hevcdsp_lsx.o \ loongarch/hevc_mc_bi_lsx.o \ loongarch/hevc_mc_uni_lsx.o \ loongarch/hevc_mc_uniw_lsx.o +LSX-OBJS-$(CONFIG_H264DSP)+= loongarch/h264idct.o \ + loongarch/h264idct_loongarch.o diff --git a/libavcodec/loongarch/h264_deblock_lasx.c b/libavcodec/loongarch/h264_deblock_lasx.c index c89bea9a84..eead931dcf 100644 --- a/libavcodec/loongarch/h264_deblock_lasx.c +++ b/libavcodec/loongarch/h264_deblock_lasx.c @@ -20,7 +20,7 @@ */ #include "libavcodec/bit_depth_template.c" -#include "h264dsp_lasx.h" +#include "h264dsp_loongarch.h" #include "libavutil/loongarch/loongson_intrinsics.h" #define H264_LOOP_FILTER_STRENGTH_ITERATION_LASX(edges, step, mask_mv, dir, \ diff --git a/libavcodec/loongarch/h264dsp_init_loongarch.c b/libavcodec/loongarch/h264dsp_init_loongarch.c index 37633c3e51..cb07deb398 100644 --- a/libavcodec/loongarch/h264dsp_init_loongarch.c +++ b/libavcodec/loongarch/h264dsp_init_loongarch.c @@ -21,13 +21,32 @@ */ #include "libavutil/loongarch/cpu.h" -#include "h264dsp_lasx.h" +#include "h264dsp_loongarch.h" av_cold void ff_h264dsp_init_loongarch(H264DSPContext *c, const int bit_depth, const int chroma_format_idc) { int cpu_flags = av_get_cpu_flags(); +if (have_lsx(cpu_flags)) { +if (bit_depth == 8) { +c->h264_idct_add = ff_h264_idct_add_8_lsx; +c->h264_idct8_add= ff_h264_idct8_add_8_lsx; +c->h264_idct_dc_add = ff_h264_idct_dc_add_8_lsx; +c->h264_idct8_dc_add = ff_h264_idct8_dc_add_8_lsx; + +if (chroma_format_idc <= 1) +c->h264_idct_add8 = ff_h264_idct_add8_8_lsx; +else +c->h264_idct_add8 = ff_h264_idct_add8_422_8_lsx; + +c->h264_idct_add16 = ff_h264_idct_add16_8_lsx; +c->h264_idct8_add4 = 
ff_h264_idct8_add4_8_lsx; +c->h264_luma_dc_dequant_idct = ff_h264_luma_dc_dequant_idct_8_lsx; +c->h264_idct_add16intra = ff_h264_idct_add16_intra_8_lsx; +} +} +#if HAVE_LASX if (have_lasx(cpu_flags)) { if (chroma_format_idc <= 1) c->h264_loop_filter
[FFmpeg-devel] [PATCH v4 3/7] avcodec/la: Add LSX optimization for h264 chroma and intrapred.
From: Lu Wang ./configure --disable-lasx ffmpeg -i 1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -y /dev/null -an before: 199fps after: 214fps --- libavcodec/loongarch/Makefile |4 +- .../loongarch/h264_intrapred_init_loongarch.c | 18 +- libavcodec/loongarch/h264_intrapred_lasx.c| 121 -- ...pred_lasx.h => h264_intrapred_loongarch.h} | 12 +- libavcodec/loongarch/h264chroma.S | 966 + .../loongarch/h264chroma_init_loongarch.c | 10 +- libavcodec/loongarch/h264chroma_lasx.c| 1280 - libavcodec/loongarch/h264chroma_lasx.h| 36 - libavcodec/loongarch/h264chroma_loongarch.h | 41 + libavcodec/loongarch/h264intrapred.S | 299 10 files changed, 1342 insertions(+), 1445 deletions(-) delete mode 100644 libavcodec/loongarch/h264_intrapred_lasx.c rename libavcodec/loongarch/{h264_intrapred_lasx.h => h264_intrapred_loongarch.h} (70%) create mode 100644 libavcodec/loongarch/h264chroma.S delete mode 100644 libavcodec/loongarch/h264chroma_lasx.c delete mode 100644 libavcodec/loongarch/h264chroma_lasx.h create mode 100644 libavcodec/loongarch/h264chroma_loongarch.h create mode 100644 libavcodec/loongarch/h264intrapred.S diff --git a/libavcodec/loongarch/Makefile b/libavcodec/loongarch/Makefile index 111bc23e4e..a563055161 100644 --- a/libavcodec/loongarch/Makefile +++ b/libavcodec/loongarch/Makefile @@ -9,11 +9,9 @@ OBJS-$(CONFIG_HPELDSP)+= loongarch/hpeldsp_init_loongarch.o OBJS-$(CONFIG_IDCTDSP)+= loongarch/idctdsp_init_loongarch.o OBJS-$(CONFIG_VIDEODSP) += loongarch/videodsp_init.o OBJS-$(CONFIG_HEVC_DECODER) += loongarch/hevcdsp_init_loongarch.o -LASX-OBJS-$(CONFIG_H264CHROMA)+= loongarch/h264chroma_lasx.o LASX-OBJS-$(CONFIG_H264QPEL) += loongarch/h264qpel_lasx.o LASX-OBJS-$(CONFIG_H264DSP) += loongarch/h264dsp_lasx.o \ loongarch/h264_deblock_lasx.o -LASX-OBJS-$(CONFIG_H264PRED) += loongarch/h264_intrapred_lasx.o LASX-OBJS-$(CONFIG_VC1_DECODER) += loongarch/vc1dsp_lasx.o LASX-OBJS-$(CONFIG_HPELDSP) += loongarch/hpeldsp_lasx.o LASX-OBJS-$(CONFIG_IDCTDSP) += loongarch/simple_idct_lasx.o \ 
@@ -33,3 +31,5 @@ LSX-OBJS-$(CONFIG_HEVC_DECODER) += loongarch/hevcdsp_lsx.o \ LSX-OBJS-$(CONFIG_H264DSP)+= loongarch/h264idct.o \ loongarch/h264idct_loongarch.o \ loongarch/h264dsp.o +LSX-OBJS-$(CONFIG_H264CHROMA) += loongarch/h264chroma.o +LSX-OBJS-$(CONFIG_H264PRED) += loongarch/h264intrapred.o diff --git a/libavcodec/loongarch/h264_intrapred_init_loongarch.c b/libavcodec/loongarch/h264_intrapred_init_loongarch.c index 12620bd842..c415fa30da 100644 --- a/libavcodec/loongarch/h264_intrapred_init_loongarch.c +++ b/libavcodec/loongarch/h264_intrapred_init_loongarch.c @@ -21,7 +21,7 @@ #include "libavutil/loongarch/cpu.h" #include "libavcodec/h264pred.h" -#include "h264_intrapred_lasx.h" +#include "h264_intrapred_loongarch.h" av_cold void ff_h264_pred_init_loongarch(H264PredContext *h, int codec_id, const int bit_depth, @@ -30,6 +30,22 @@ av_cold void ff_h264_pred_init_loongarch(H264PredContext *h, int codec_id, int cpu_flags = av_get_cpu_flags(); if (bit_depth == 8) { +if (have_lsx(cpu_flags)) { +if (chroma_format_idc <= 1) { +} +if (codec_id == AV_CODEC_ID_VP7 || codec_id == AV_CODEC_ID_VP8) { +} else { +if (chroma_format_idc <= 1) { +} +if (codec_id == AV_CODEC_ID_SVQ3) { +h->pred16x16[PLANE_PRED8x8] = ff_h264_pred16x16_plane_svq3_8_lsx; +} else if (codec_id == AV_CODEC_ID_RV40) { +h->pred16x16[PLANE_PRED8x8] = ff_h264_pred16x16_plane_rv40_8_lsx; +} else { +h->pred16x16[PLANE_PRED8x8] = ff_h264_pred16x16_plane_h264_8_lsx; +} +} +} if (have_lasx(cpu_flags)) { if (chroma_format_idc <= 1) { } diff --git a/libavcodec/loongarch/h264_intrapred_lasx.c b/libavcodec/loongarch/h264_intrapred_lasx.c deleted file mode 100644 index c38cd611b8..00 --- a/libavcodec/loongarch/h264_intrapred_lasx.c +++ /dev/null @@ -1,121 +0,0 @@ -/* - * Copyright (c) 2021 Loongson Technology Corporation Limited - * Contributed by Hao Chen - * - * This file is part of FFmpeg. 
- * - * FFmpeg is free software; you can redistribute it and/or - * modify it under the terms of the GNU Lesser General Public - * License as published by the Free Software Foundation; either - * version 2.1 of the License, or (at your option) any later version. - * - * FFmpeg is distributed in the hope that it will be useful, - * but WITHOUT ANY WARR
[FFmpeg-devel] [PATCH v4 4/7] avcodec/la: Add LSX optimization for h264 qpel.
From: yuanhecai ./configure --disable-lasx ffmpeg -i 1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -y /dev/null -an before: 214fps after: 274fps --- libavcodec/loongarch/Makefile |2 + libavcodec/loongarch/h264qpel.S | 1686 + .../loongarch/h264qpel_init_loongarch.c | 74 +- libavcodec/loongarch/h264qpel_lasx.c | 401 +--- libavcodec/loongarch/h264qpel_lasx.h | 158 -- libavcodec/loongarch/h264qpel_loongarch.h | 312 +++ libavcodec/loongarch/h264qpel_lsx.c | 487 + 7 files changed, 2561 insertions(+), 559 deletions(-) create mode 100644 libavcodec/loongarch/h264qpel.S delete mode 100644 libavcodec/loongarch/h264qpel_lasx.h create mode 100644 libavcodec/loongarch/h264qpel_loongarch.h create mode 100644 libavcodec/loongarch/h264qpel_lsx.c diff --git a/libavcodec/loongarch/Makefile b/libavcodec/loongarch/Makefile index a563055161..06cfab5c20 100644 --- a/libavcodec/loongarch/Makefile +++ b/libavcodec/loongarch/Makefile @@ -31,5 +31,7 @@ LSX-OBJS-$(CONFIG_HEVC_DECODER) += loongarch/hevcdsp_lsx.o \ LSX-OBJS-$(CONFIG_H264DSP)+= loongarch/h264idct.o \ loongarch/h264idct_loongarch.o \ loongarch/h264dsp.o +LSX-OBJS-$(CONFIG_H264QPEL) += loongarch/h264qpel.o \ + loongarch/h264qpel_lsx.o LSX-OBJS-$(CONFIG_H264CHROMA) += loongarch/h264chroma.o LSX-OBJS-$(CONFIG_H264PRED) += loongarch/h264intrapred.o diff --git a/libavcodec/loongarch/h264qpel.S b/libavcodec/loongarch/h264qpel.S new file mode 100644 index 00..3f885b6ce2 --- /dev/null +++ b/libavcodec/loongarch/h264qpel.S @@ -0,0 +1,1686 @@ +/* + * Loongson LSX optimized h264qpel + * + * Copyright (c) 2023 Loongson Technology Corporation Limited + * Contributed by Hecai Yuan + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. 
+ * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "loongson_asm.S" + +.macro VLD_QPEL8_H_SSRANI_LSX in0, in1, in2, in3, in4 +vld vr0,\in4, 0 +vldx vr1,\in4, a2 +QPEL8_H_LSX \in0, \in1 +vssrani.bu.h \in0, \in2, 5 +vssrani.bu.h \in1, \in3, 5 +.endm + +.macro VLDX_QPEL8_H_SSRANI_LSX in0, in1, in2, in3, in4 +vldx vr0,\in4, t1 +vldx vr1,\in4, t2 +QPEL8_H_LSX \in0, \in1 +vssrani.bu.h \in0, \in2, 5 +vssrani.bu.h \in1, \in3, 5 +.endm + +.macro VLD_DOUBLE_QPEL8_H_SSRANI_LSX in0, in1, in2, in3, in4, in5, in6, in7, in8 +vld vr0,\in8, 0 +vldx vr1,\in8, a2 +QPEL8_H_LSX \in0, \in1 +vssrani.bu.h \in0, \in4, 5 +vssrani.bu.h \in1, \in5, 5 +vldx vr0,\in8, t1 +vldx vr1,\in8, t2 +QPEL8_H_LSX \in2, \in3 +vssrani.bu.h \in2, \in6, 5 +vssrani.bu.h \in3, \in7, 5 +.endm + +function ff_put_h264_qpel16_mc00_lsx +slli.dt0, a2, 1 +add.d t1, t0, a2 +slli.dt2, t0, 1 +.rept 4 +vld vr0,a1, 0 +vldx vr1,a1, a2 +vldx vr2,a1, t0 +vldx vr3,a1, t1 +add.d a1, a1, t2 +vst vr0,a0, 0 +vstx vr1,a0, a2 +vstx vr2,a0, t0 +vstx vr3,a0, t1 +add.d a0, a0, t2 +.endr +endfunc + +.macro QPEL8_H_LSX out0, out1 +vbsrl.v vr2,vr0,1 +vbsrl.v vr3,vr1,1 +vbsrl.v vr4,vr0,2 +vbsrl.v vr5,vr1,2 +vbsrl.v vr6,vr0,3 +vbsrl.v vr7,vr1,3 +vbsrl.v vr8,vr0,4 +vbsrl.v vr9,vr1,4 +vbsrl.v vr10, vr0,5 +vbsrl.v vr11, vr1,5 + +vilvl.b vr6,vr4,vr6 +vilvl.b vr7,vr5,vr7 +vilvl.b vr8,vr2,vr8 +vilvl.b vr9,vr3,vr9 +vilvl.b vr10, vr0,vr10 +vilvl.b vr11, vr1,vr11 +vhaddw.hu.bu vr6,vr6,vr6 +vhaddw.hu.bu vr7,vr7,vr7 +vhaddw.hu.bu vr8,vr8,vr8 +vh
[FFmpeg-devel] [PATCH v4 7/7] avutil/la: Add function performance testing
From: yuanhecai

This patch supports the use of the "checkasm --bench" testing feature
on loongarch platform.

Change-Id: I42790388d057c9ade0dfa38a19d9c1fd44ca0bc3
---
 libavutil/loongarch/timer.h | 48 +
 libavutil/timer.h           |  2 ++
 2 files changed, 50 insertions(+)
 create mode 100644 libavutil/loongarch/timer.h

diff --git a/libavutil/loongarch/timer.h b/libavutil/loongarch/timer.h
new file mode 100644
index 00..d70b88c859
--- /dev/null
+++ b/libavutil/loongarch/timer.h
@@ -0,0 +1,48 @@
+/*
+ * Copyright (c) 2023 Loongson Technology Corporation Limited
+ * Contributed by Hecai Yuan
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVUTIL_LOONGARCH_TIMER_H
+#define AVUTIL_LOONGARCH_TIMER_H
+
+#include <stdint.h>
+#include "config.h"
+
+#if HAVE_INLINE_ASM
+
+#define AV_READ_TIME read_time
+
+static inline uint64_t read_time(void)
+{
+
+#if ARCH_LOONGARCH64
+    uint64_t a, id = 0;
+    __asm__ volatile ( "rdtime.d %0, %1" : "=r"(a), "=r"(id));
+    return a;
+#else
+    uint32_t a, id = 0;
+    __asm__ volatile ( "rdtimel.w %0, %1" : "=r"(a), "=r"(id));
+    return (uint64_t)a;
+#endif
+}
+
+#endif /* HAVE_INLINE_ASM */
+
+#endif /* AVUTIL_LOONGARCH_TIMER_H */
diff --git a/libavutil/timer.h b/libavutil/timer.h
index d3db5a27ef..861ba7e9d7 100644
--- a/libavutil/timer.h
+++ b/libavutil/timer.h
@@ -61,6 +61,8 @@
 #   include "riscv/timer.h"
 #elif ARCH_X86
 #   include "x86/timer.h"
+#elif ARCH_LOONGARCH
+#   include "loongarch/timer.h"
 #endif

 #if !defined(AV_READ_TIME)
--
2.20.1
[FFmpeg-devel] [PATCH v4 6/7] swscale/la: Add following builtin optimized functions
From: Jin Bo yuv420_rgb24_lsx yuv420_bgr24_lsx yuv420_rgba32_lsx yuv420_argb32_lsx yuv420_bgra32_lsx yuv420_abgr32_lsx ./configure --disable-lasx ffmpeg -i ~/media/1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -pix_fmt rgb24 -y /dev/null -an before: 184fps after: 207fps --- libswscale/loongarch/Makefile | 3 +- libswscale/loongarch/swscale_init_loongarch.c | 30 +- libswscale/loongarch/swscale_loongarch.h | 18 + libswscale/loongarch/yuv2rgb_lsx.c| 361 ++ 4 files changed, 410 insertions(+), 2 deletions(-) create mode 100644 libswscale/loongarch/yuv2rgb_lsx.c diff --git a/libswscale/loongarch/Makefile b/libswscale/loongarch/Makefile index c0b6a449c0..c35ba309a4 100644 --- a/libswscale/loongarch/Makefile +++ b/libswscale/loongarch/Makefile @@ -8,4 +8,5 @@ LSX-OBJS-$(CONFIG_SWSCALE) += loongarch/swscale.o \ loongarch/swscale_lsx.o \ loongarch/input.o \ loongarch/output.o \ - loongarch/output_lsx.o + loongarch/output_lsx.o \ + loongarch/yuv2rgb_lsx.o diff --git a/libswscale/loongarch/swscale_init_loongarch.c b/libswscale/loongarch/swscale_init_loongarch.c index c13a1662ec..53e4f970b6 100644 --- a/libswscale/loongarch/swscale_init_loongarch.c +++ b/libswscale/loongarch/swscale_init_loongarch.c @@ -90,8 +90,8 @@ av_cold void rgb2rgb_init_loongarch(void) av_cold SwsFunc ff_yuv2rgb_init_loongarch(SwsContext *c) { -#if HAVE_LASX int cpu_flags = av_get_cpu_flags(); +#if HAVE_LASX if (have_lasx(cpu_flags)) { switch (c->dstFormat) { case AV_PIX_FMT_RGB24: @@ -121,5 +121,33 @@ av_cold SwsFunc ff_yuv2rgb_init_loongarch(SwsContext *c) } } #endif // #if HAVE_LASX +if (have_lsx(cpu_flags)) { +switch (c->dstFormat) { +case AV_PIX_FMT_RGB24: +return yuv420_rgb24_lsx; +case AV_PIX_FMT_BGR24: +return yuv420_bgr24_lsx; +case AV_PIX_FMT_RGBA: +if (CONFIG_SWSCALE_ALPHA && isALPHA(c->srcFormat)) { +break; +} else +return yuv420_rgba32_lsx; +case AV_PIX_FMT_ARGB: +if (CONFIG_SWSCALE_ALPHA && isALPHA(c->srcFormat)) { +break; +} else +return yuv420_argb32_lsx; +case AV_PIX_FMT_BGRA: +if 
(CONFIG_SWSCALE_ALPHA && isALPHA(c->srcFormat)) { +break; +} else +return yuv420_bgra32_lsx; +case AV_PIX_FMT_ABGR: +if (CONFIG_SWSCALE_ALPHA && isALPHA(c->srcFormat)) { +break; +} else +return yuv420_abgr32_lsx; +} +} return NULL; } diff --git a/libswscale/loongarch/swscale_loongarch.h b/libswscale/loongarch/swscale_loongarch.h index bc29913ac6..0514abae21 100644 --- a/libswscale/loongarch/swscale_loongarch.h +++ b/libswscale/loongarch/swscale_loongarch.h @@ -62,6 +62,24 @@ void ff_yuv2planeX_8_lsx(const int16_t *filter, int filterSize, av_cold void ff_sws_init_output_lsx(SwsContext *c); +int yuv420_rgb24_lsx(SwsContext *c, const uint8_t *src[], int srcStride[], + int srcSliceY, int srcSliceH, uint8_t *dst[], int dstStride[]); + +int yuv420_bgr24_lsx(SwsContext *c, const uint8_t *src[], int srcStride[], + int srcSliceY, int srcSliceH, uint8_t *dst[], int dstStride[]); + +int yuv420_rgba32_lsx(SwsContext *c, const uint8_t *src[], int srcStride[], + int srcSliceY, int srcSliceH, uint8_t *dst[], int dstStride[]); + +int yuv420_bgra32_lsx(SwsContext *c, const uint8_t *src[], int srcStride[], + int srcSliceY, int srcSliceH, uint8_t *dst[], int dstStride[]); + +int yuv420_argb32_lsx(SwsContext *c, const uint8_t *src[], int srcStride[], + int srcSliceY, int srcSliceH, uint8_t *dst[], int dstStride[]); + +int yuv420_abgr32_lsx(SwsContext *c, const uint8_t *src[], int srcStride[], + int srcSliceY, int srcSliceH, uint8_t *dst[], int dstStride[]); + #if HAVE_LASX void ff_hscale_8_to_15_lasx(SwsContext *c, int16_t *dst, int dstW, const uint8_t *src, const int16_t *filter, diff --git a/libswscale/loongarch/yuv2rgb_lsx.c b/libswscale/loongarch/yuv2rgb_lsx.c new file mode 100644 index 00..11cd2f79d9 --- /dev/null +++ b/libswscale/loongarch/yuv2rgb_lsx.c @@ -0,0 +1,361 @@ +/* + * Copyright (C) 2023 Loongson Technology Co. Ltd. + * Contributed by Bo Jin(ji...@loongson.cn) + * All rights reserved. + * + * This file is part of FFmpeg. 
+ * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by th
Re: [FFmpeg-devel] [PATCH v3 7/7] avutil/la: Add function performance testing
On 24 May 2023 10:39:59 GMT+03:00, Hao Chen wrote:
>
>On 2023/5/20 5:38 PM, Rémi Denis-Courmont wrote:
>> On Saturday, 20 May 2023, 10:27:19 EEST, Hao Chen wrote:
>>> From: yuanhecai
>>>
>>> This patch supports the use of the "checkasm --bench" testing feature
>>> on loongarch platform.
>>>
>>> Change-Id: I42790388d057c9ade0dfa38a19d9c1fd44ca0bc3
>>> ---
>>>  libavutil/loongarch/timer.h | 48 +
>>>  libavutil/timer.h           |  2 ++
>>>  2 files changed, 50 insertions(+)
>>>  create mode 100644 libavutil/loongarch/timer.h
>>>
>>> diff --git a/libavutil/loongarch/timer.h b/libavutil/loongarch/timer.h
>>> new file mode 100644
>>> index 00..44ed786409
>>> --- /dev/null
>>> +++ b/libavutil/loongarch/timer.h
>>> @@ -0,0 +1,48 @@
>>> +/*
>>> + * Copyright (c) 2023 Loongson Technology Corporation Limited
>>> + * Contributed by Hecai Yuan
>>> + *
>>> + * This file is part of FFmpeg.
>>> + *
>>> + * FFmpeg is free software; you can redistribute it and/or
>>> + * modify it under the terms of the GNU Lesser General Public
>>> + * License as published by the Free Software Foundation; either
>>> + * version 2.1 of the License, or (at your option) any later version.
>>> + *
>>> + * FFmpeg is distributed in the hope that it will be useful,
>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>>> + * Lesser General Public License for more details.
>>> + *
>>> + * You should have received a copy of the GNU Lesser General Public
>>> + * License along with FFmpeg; if not, write to the Free Software
>>> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
>>> + */
>>> +
>>> +#ifndef AVUTIL_LOONGARCH_TIMER_H
>>> +#define AVUTIL_LOONGARCH_TIMER_H
>>> +
>>> +#include <stdint.h>
>>> +#include "config.h"
>>> +
>>> +#if HAVE_INLINE_ASM
>>> +
>>> +#define AV_READ_TIME read_time
>>> +
>>> +static inline uint64_t read_time(void)
>>> +{
>>> +
>>> +#if ARCH_LOONGARCH64
>>> +    uint64_t a, id = 0;
>> Initial value is never used.
>>
>>> +    __asm__ volatile ( "rdtime.d %0, %1" : "=r"(a), "=r"(id) :: "memory" );
>>> +    return a;
>>> +#else
>>> +    uint32_t a, id = 0;
>>> +    __asm__ volatile ( "rdtimel.w %0, %1" : "=r"(a), "=r"(id) :: "memory" );
>>> +    return (uint64_t)a;
>>> +#endif
>> Why do you clobber memory here?
>>
>>> +}
>>> +
>>> +#endif /* HAVE_INLINE_ASM */
>>> +
>>> +#endif /* AVUTIL_LOONGARCH_TIMER_H */
>>> diff --git a/libavutil/timer.h b/libavutil/timer.h
>>> index d3db5a27ef..861ba7e9d7 100644
>>> --- a/libavutil/timer.h
>>> +++ b/libavutil/timer.h
>>> @@ -61,6 +61,8 @@
>>>  #   include "riscv/timer.h"
>>>  #elif ARCH_X86
>>>  #   include "x86/timer.h"
>>> +#elif ARCH_LOONGARCH
>>> +#   include "loongarch/timer.h"
>>>  #endif
>>>
>>>  #if !defined(AV_READ_TIME)
>>>
>> Thanks for your advice. As described in loongarch's instruction manual,
>> the rdtime.d instruction is used as follows:
>> rdtime.d rd, rj. The rj register stores the counter ID. In this application,
>> the value of counter ID is equal to 0.

You're setting a value, zero, to a variable `id`, that is then used as an
output operand. As far as the compiler is concerned, the value zero is never
used and the initialisation can be elided. The value of register %1 is
unspecified.

If you meant for `id` to be an input operand, the constraints are incorrect.
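The operand-constraint point can be sketched as follows. This is only an illustration under stated assumptions, not the form the patch eventually took: it assumes, as the manual description quoted in the thread suggests, that `rdtime.d rd, rj` writes the counter ID into rj, so both operands are pure outputs and `id` needs no initial value. The name `read_time_sketch` and the `clock()` fallback for non-LoongArch hosts are hypothetical additions so that the snippet builds anywhere.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper, not part of FFmpeg. */
static inline uint64_t read_time_sketch(void)
{
#if defined(__loongarch__) && __loongarch_grlen == 64
    /* Both registers are written by the instruction (assumption taken
     * from the manual description in the thread), so both are declared
     * as outputs and neither variable is initialised. */
    uint64_t ticks, id;
    __asm__ volatile ("rdtime.d %0, %1" : "=r"(ticks), "=r"(id));
    (void)id; /* the counter ID is not needed here */
    return ticks;
#else
    /* Assumed fallback for other hosts: processor time from the C library. */
    clock_t c = clock();
    assert(c != (clock_t)-1);
    return (uint64_t)c;
#endif
}
```

If the counter ID were meant to be passed in rather than read back, `id` would have to appear in the input operand list instead; which of the two applies depends on the instruction's definition.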
[FFmpeg-devel] [PATCH] fftools/ffmpeg_dec: abort if avcodec_send_packet() returns EAGAIN
As the comment in the code mentions, EAGAIN is not an expected value here
because we call avcodec_receive_frame() until all frames have been returned.
avcodec_send_packet() returning EAGAIN means a packet is still buffered, which
hints that the underlying decoder is buggy and not fetching packets as it should.

An example of this behavior was in the libdav1d wrapper before f209614290, where
feeding it split frames (or individual OBUs) would result in the CLI eventually
printing the unhelpful "Error submitting packet to decoder: Resource temporarily
unavailable" error message, and just keep going until EOF without returning new
frames.

Signed-off-by: James Almer
---
Now compiling.

 fftools/ffmpeg_dec.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/fftools/ffmpeg_dec.c b/fftools/ffmpeg_dec.c
index e06747d9c4..b1db9b30d0 100644
--- a/fftools/ffmpeg_dec.c
+++ b/fftools/ffmpeg_dec.c
@@ -390,6 +390,11 @@ int dec_packet(InputStream *ist, const AVPacket *pkt, int no_eof)
     if (ret < 0 && !(ret == AVERROR_EOF && !pkt)) {
         // In particular, we don't expect AVERROR(EAGAIN), because we read all
         // decoded frames with avcodec_receive_frame() until done.
+        if (ret == AVERROR(EAGAIN)) {
+            av_log(ist, AV_LOG_FATAL, "A decoder returned an unexpected error code. "
+                   "This is a bug, please report it.\n");
+            exit_program(1);
+        }
         av_log(ist, AV_LOG_ERROR, "Error submitting %s to decoder: %s\n",
                pkt ? "packet" : "EOF", av_err2str(ret));
         if (exit_on_error)
--
2.40.1
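The reasoning in the commit message can be illustrated with a toy model of the send/receive contract. This is a sketch under stated assumptions, not FFmpeg code: `ToyDecoder`, `toy_send_packet()`, `toy_receive_frame()`, `toy_decode_all()` and `TOY_EAGAIN` are hypothetical stand-ins for a decoder that buffers a single packet. It shows that when the caller drains every frame after each submission, a well-behaved decoder has no reason to refuse the next packet.

```c
#include <assert.h>

#define TOY_EAGAIN (-11) /* hypothetical stand-in for AVERROR(EAGAIN) */

typedef struct ToyDecoder {
    int has_packet; /* 1 while a submitted packet is still buffered */
    int pending;    /* payload of the buffered packet */
} ToyDecoder;

/* Accept a packet unless the previous one has not been consumed yet. */
static int toy_send_packet(ToyDecoder *d, int payload)
{
    if (d->has_packet)
        return TOY_EAGAIN;
    d->has_packet = 1;
    d->pending    = payload;
    return 0;
}

/* Produce one frame per buffered packet, then ask for more input. */
static int toy_receive_frame(ToyDecoder *d, int *frame)
{
    if (!d->has_packet)
        return TOY_EAGAIN;
    *frame        = d->pending;
    d->has_packet = 0;
    return 0;
}

/* Caller loop: submit one packet, then drain until the decoder wants input.
 * Returns the number of frames, or a negative value if a send fails. */
static int toy_decode_all(ToyDecoder *d, const int *packets, int nb_packets)
{
    int frames = 0;

    assert(d);
    for (int i = 0; i < nb_packets; i++) {
        int frame;
        int ret = toy_send_packet(d, packets[i]);
        if (ret < 0)
            return ret; /* with the full drain below, this points to a decoder bug */
        while (toy_receive_frame(d, &frame) == 0)
            frames++;
    }
    return frames;
}
```

In this model the only way to see `TOY_EAGAIN` from the send side is to skip the drain loop, which mirrors why the patch treats that return value as a bug rather than a recoverable condition.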
Re: [FFmpeg-devel] [PATCH 15/15] fftools/sync_queue: make sure non-limiting streams are not used as queue head
On 5/23/2023 10:58 AM, Anton Khirnov wrote:
> A non-limiting stream could mistakenly end up being the queue head,
> which would then produce incorrect synchronization, seen e.g. in
> fate-matroska-flac-extradata-update for certain number of frame threads
> (e.g. 5).
>
> Found-By: James Almer

Strictly speaking, it was FATE.

> ---
>  fftools/sync_queue.c | 13 +++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/fftools/sync_queue.c b/fftools/sync_queue.c
> index c0f33e9235..bc107ba4fe 100644
> --- a/fftools/sync_queue.c
> +++ b/fftools/sync_queue.c
> @@ -217,17 +217,26 @@ static void finish_stream(SyncQueue *sq, unsigned int stream_idx)
>
>  static void queue_head_update(SyncQueue *sq)
>  {
> +    av_assert0(sq->have_limiting);
> +
>      if (sq->head_stream < 0) {
> +        unsigned first_limiting = UINT_MAX;
> +
>          /* wait for one timestamp in each stream before determining
>           * the queue head */
>          for (unsigned int i = 0; i < sq->nb_streams; i++) {
>              SyncQueueStream *st = &sq->streams[i];
> -            if (st->limiting && st->head_ts == AV_NOPTS_VALUE)
> +            if (!st->limiting)
> +                continue;
> +            if (st->head_ts == AV_NOPTS_VALUE)
>                  return;
> +            if (first_limiting == UINT_MAX)
> +                first_limiting = i;
>          }
>
>          // placeholder value, correct one will be found below
> -        sq->head_stream = 0;
> +        av_assert0(first_limiting < UINT_MAX);
> +        sq->head_stream = first_limiting;
>      }
>
>      for (unsigned int i = 0; i < sq->nb_streams; i++) {

Can confirm it fixes the issue.
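The selection rule in the quoted hunk can be shown in isolation. The following is a simplified sketch, not the sync_queue code: `ToyStream`, `toy_initial_head()` and `TOY_NOPTS` are hypothetical names, and where the patch asserts that a limiting stream exists, this version returns -1 instead so that the behaviour is easy to check.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define TOY_NOPTS INT64_MIN /* hypothetical stand-in for AV_NOPTS_VALUE */

typedef struct ToyStream {
    int     limiting; /* non-zero if this stream limits the queue */
    int64_t head_ts;  /* first queued timestamp, or TOY_NOPTS */
} ToyStream;

/* Returns the index of the first limiting stream once every limiting
 * stream has a timestamp; returns -1 if one is still missing or if no
 * stream is limiting. Non-limiting streams are never considered. */
static int toy_initial_head(const ToyStream *streams, unsigned nb_streams)
{
    unsigned first_limiting = UINT_MAX;

    for (unsigned i = 0; i < nb_streams; i++) {
        const ToyStream *st = &streams[i];

        if (!st->limiting)
            continue;
        if (st->head_ts == TOY_NOPTS)
            return -1;
        if (first_limiting == UINT_MAX)
            first_limiting = i;
    }
    return first_limiting == UINT_MAX ? -1 : (int)first_limiting;
}
```

With the previous placeholder of 0, a non-limiting stream at index 0 would have been picked as the starting head; skipping non-limiting streams up front avoids that.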
Re: [FFmpeg-devel] [PATCH] avcodec/videotoolboxenc: replace VT_H264Profile with avctx profile
On Mon, May 22, 2023 at 12:17 AM 徐福隆 <839789...@qq.com> wrote:

> It's my mistake that I forgot to remove H264_PROF_AUTO. I will fix that.
> Any other suggestions about the profile_options? Thanks.
>
Nothing else from me - just the default profile selection behavior zhilizhao mentioned.

>
> -- Original message --
> *From:* ""zhilizhao(赵志立)"" ;
> *Sent:* Monday, May 22, 2023, 11:11 AM
> *To:* "FFmpeg development discussions and patches"<ffmpeg-devel@ffmpeg.org>;
> *Cc:* "徐福隆"<839789...@qq.com>;"Rick Kern";
> *Subject:* Re: [FFmpeg-devel] [PATCH] avcodec/videotoolboxenc: replace
> VT_H264Profile with avctx profile
>
>
>
> > On May 22, 2023, at 11:05, zhilizhao(赵志立) wrote:
> >
> >> On May 21, 2023, at 22:41, xufuji456 <839789...@qq.com> wrote:
> >>
> >> For compatibility with constrained_baseline in the future,
> >> replace VT_H264Profile/VT_HEVCProfile with avctx->profile.
> >>
> >> Signed-off-by: xufuji456 <839789...@qq.com>
> >> ---
> >>  libavcodec/videotoolboxenc.c | 55 +++-
> >>  1 file changed, 16 insertions(+), 39 deletions(-)
> >>
> >> diff --git a/libavcodec/videotoolboxenc.c b/libavcodec/videotoolboxenc.c
> >> index b017c90c36..4966ab36ae 100644
> >> --- a/libavcodec/videotoolboxenc.c
> >> +++ b/libavcodec/videotoolboxenc.c
> >> @@ -190,28 +190,12 @@ static void loadVTEncSymbols(void){
> >>          "EnableLowLatencyRateControl");
> >>  }
> >>
> >> -typedef enum VT_H264Profile {
> >> -    H264_PROF_AUTO,
> >> -    H264_PROF_BASELINE,
> >> -    H264_PROF_MAIN,
> >> -    H264_PROF_HIGH,
> >> -    H264_PROF_EXTENDED,
> >> -    H264_PROF_COUNT
> >> -} VT_H264Profile;
> >> -
> >>  typedef enum VTH264Entropy{
> >>      VT_ENTROPY_NOT_SET,
> >>      VT_CAVLC,
> >>      VT_CABAC
> >>  } VTH264Entropy;
> >>
> >> -typedef enum VT_HEVCProfile {
> >> -    HEVC_PROF_AUTO,
> >> -    HEVC_PROF_MAIN,
> >> -    HEVC_PROF_MAIN10,
> >> -    HEVC_PROF_COUNT
> >> -} VT_HEVCProfile;
> >> -
> >>  static const uint8_t start_code[] = { 0, 0, 0, 1 };
> >>
> >>  typedef struct ExtraSEI {
> >> @@ -730,18 +714,13 @@ static bool get_vt_h264_profile_level(AVCodecContext *avctx,
> >>      VTEncContext *vtctx = avctx->priv_data;
> >>      int64_t profile = vtctx->profile;
> >>
> >> -    if (profile == H264_PROF_AUTO && vtctx->level) {
> >> -        //Need to pick a profile if level is not auto-selected.
> >> -        profile = vtctx->has_b_frames ? H264_PROF_MAIN : H264_PROF_BASELINE;
> >> -    }
> >> -
> >>      *profile_level_val = NULL;
> >>
> >>      switch (profile) {
> >>      case H264_PROF_AUTO:
> >>          return true;
> >
> > Doesn't it fail to build, since H264_PROF_AUTO isn't defined?
>
> Please be sure to compile and test before submitting a patch.
>
> >>
> >> -    case H264_PROF_BASELINE:
> >> +    case FF_PROFILE_H264_BASELINE:
> >>          switch (vtctx->level) {
> >>          case 0: *profile_level_val =
> >>                      compat_keys.kVTProfileLevel_H264_Baseline_AutoLevel; break;
> >> @@ -763,7 +742,7 @@ static bool get_vt_h264_profile_level(AVCodecContext *avctx,
> >>          }
> >>          break;
> >>
> >> -    case H264_PROF_MAIN:
> >> +    case FF_PROFILE_H264_MAIN:
> >>          switch (vtctx->level) {
> >>          case 0: *profile_level_val =
> >>                      compat_keys.kVTProfileLevel_H264_Main_AutoLevel; break;
> >> @@ -782,7 +761,7 @@ static bool get_vt_h264_profile_level(AVCodecContext *avctx,
> >>          }
> >>          break;
> >>
> >> -    case H264_PROF_HIGH:
> >> +    case FF_PROFILE_H264_HIGH:
> >>          switch (vtctx->level) {
> >>          case 0: *profile_level_val =
> >>                      compat_keys.kVTProfileLevel_H264_High_AutoLevel; break;
> >> @@ -805,7 +784,7 @@ static bool get_vt_h264_profile_level(AVCodecContext *avctx,
> >>                      compat_keys.kVTProfileLevel_H264_High_5_2; break;
> >>          }
> >>          break;
> >> -    case H264_PROF_EXTENDED:
> >> +    case FF_PROFILE_H264_EXTENDED:
> >>          switch (vtctx->level) {
> >>          case 0: *profile_level_val =
> >>                      compat_keys.kVTProfileLevel_H264_Extended_AutoLevel; break;
> >> @@ -838,13 +817,11 @@ static bool get_vt_hevc_profile_level(AVCodecContext *avctx,
> >>      *profile_level_val = NULL;
> >>
> >>      switch (profile) {
> >> -    case HEVC_PROF_AUTO:
> >> -        return true;
> >> -    case HEVC_PROF_MAIN:
> >> +    case FF_PROFILE_HEVC_MAIN:
> >>          *profile_level_val =
> >>              compat_keys.kVTProfileLevel_HEVC_Main_AutoLevel;
> >>          break;
> >> -    case HEVC_PROF_MAIN10:
> >> +    case FF_PROFILE_HEVC_MAIN_10:
> >>          *profile_level_val =
> >>              compat_keys.kVTProfileLevel_HEVC_Main10_AutoLevel;
> >>          break;
> >> @@ -1515,12 +1492,12 @@ static int vtenc_configure_encoder(AVCodecContext *avctx)
> >>          vtctx->get_param_set_func = CMVideoFormatDescriptionGetH264ParameterSetAtIndex;
> >>
> >>          vtctx->has_b_frames = avctx->ma
Re: [FFmpeg-devel] [PATCH v4] lavc/h264chroma: RISC-V V add motion compensation for 8x8 chroma blocks
On Wednesday, 24 May 2023, 8:28:08 EEST, Arnie Chang wrote:
> diff --git a/libavcodec/riscv/h264_mc_chroma.S b/libavcodec/riscv/h264_mc_chroma.S
> new file mode 100644
> index 00..9fcd2e34b3
> --- /dev/null
> +++ b/libavcodec/riscv/h264_mc_chroma.S
> @@ -0,0 +1,307 @@
> +/*
> + * Copyright (c) 2023 SiFive, Inc. All rights reserved.
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +#include "libavutil/riscv/asm.S"
> +
> +.macro h264_chroma_mc8 type
> +func h264_\type\()_chroma_mc8_rvv, zve32x
> +        csrw    vxrm, zero
> +        slli    t2, a5, 3
> +        mulw    t1, a5, a4

This should probably be mul.

> +        sh3add  a5, a4, t2
> +        slli    a4, a4, 3
> +        sub     a5, t1, a5
> +        sub     a7, a4, t1
> +        addi    a6, a5, 64
> +        sub     t0, t2, t1
> +        vsetivli t3, 8, e8, m1, ta, mu
> +        beqz    t1, 2f
> +        blez    a3, 8f
> +        li      t4, 0
> +        li      t2, 0
> +        li      t5, 1
> +        addi    a5, t3, 1
> +        slli    t3, a2, 2

-- 
雷米‧德尼-库尔蒙
http://www.remlab.net/

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
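The review note on `mulw` gives no reason for preferring `mul`. One plausible reading, offered here as an assumption and not as the reviewer's stated reasoning, is portability: `mulw` is an RV64-only instruction, while `mul` exists on both RV32 and RV64. The C sketch below emulates the two instructions with hypothetical helpers (`emu_mul`, `emu_mulw`, not FFmpeg functions) and assumes the operands are the chroma fractional offsets, which lie in 0..7. It relies on the usual two's-complement narrowing conversion of GCC and Clang.

```c
#include <stdint.h>

/* Hypothetical emulation of RV64 `mul`: low 64 bits of the product. */
static int64_t emu_mul(int64_t a, int64_t b)
{
    return (int64_t)((uint64_t)a * (uint64_t)b);
}

/* Hypothetical emulation of RV64 `mulw`: multiply the low 32 bits of each
 * operand, keep the low 32 bits of the result, then sign-extend to 64 bits. */
static int64_t emu_mulw(int64_t a, int64_t b)
{
    uint32_t lo = (uint32_t)a * (uint32_t)b;
    return (int64_t)(int32_t)lo;
}
```

For operands this small the two helpers agree, so under the stated assumption nothing is lost by using the more widely available instruction. They diverge only once the product no longer fits in 32 bits.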
[FFmpeg-devel] [PATCH] lavu/tx: stop using av_log(NULL, )
Patch attached.

From 2813dcb5b885bdf0c3f78f8aead43f4b11149a70 Mon Sep 17 00:00:00 2001
From: Lynne
Date: Wed, 24 May 2023 21:57:25 +0200
Subject: [PATCH] lavu/tx: stop using av_log(NULL, )

---
 libavutil/tx.c | 16 +++-
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/libavutil/tx.c b/libavutil/tx.c
index e25abf998f..34fbe3f6c7 100644
--- a/libavutil/tx.c
+++ b/libavutil/tx.c
@@ -29,6 +29,12 @@
     ((x) == AV_TX_DOUBLE_ ## type) || \
     ((x) == AV_TX_INT32_ ## type))
 
+static AVClass tx_class = {
+    .class_name= "tx",
+    .item_name = av_default_item_name,
+    .version = LIBAVUTIL_VERSION_INT,
+};
+
 /* Calculates the modular multiplicative inverse */
 static av_always_inline int mulinv(int n, int m)
 {
@@ -631,7 +637,7 @@ static void print_cd_info(const FFTXCodelet *cd, int prio, int len, int print_pr
     if (print_prio)
         av_bprintf(&bp, ", prio: %i", prio);
 
-    av_log(NULL, AV_LOG_DEBUG, "%s\n", bp.str);
+    av_log((void *)&tx_class, AV_LOG_DEBUG, "%s\n", bp.str);
 }
 
 static void print_tx_structure(AVTXContext *s, int depth)
@@ -639,7 +645,7 @@ static void print_tx_structure(AVTXContext *s, int depth)
     const FFTXCodelet *cd = s->cd_self;
 
     for (int i = 0; i <= depth; i++)
-        av_log(NULL, AV_LOG_DEBUG, "");
+        av_log((void *)&tx_class, AV_LOG_DEBUG, "");
 
     print_cd_info(cd, cd->prio, s->len, 0);
 
@@ -798,10 +804,10 @@ av_cold int ff_tx_init_subtx(AVTXContext *s, enum AVTXType type,
     AV_QSORT(cd_matches, nb_cd_matches, TXCodeletMatch, cmp_matches);
 
 #if !CONFIG_SMALL
-    av_log(NULL, AV_LOG_DEBUG, "%s\n", bp.str);
+    av_log((void *)&tx_class, AV_LOG_DEBUG, "%s\n", bp.str);
 
     for (int i = 0; i < nb_cd_matches; i++) {
-        av_log(NULL, AV_LOG_DEBUG, "%i: ", i + 1);
+        av_log((void *)&tx_class, AV_LOG_DEBUG, "%i: ", i + 1);
         print_cd_info(cd_matches[i].cd, cd_matches[i].prio, 0, 1);
     }
 #endif
@@ -909,7 +915,7 @@ av_cold int av_tx_init(AVTXContext **ctx, av_tx_fn *tx, enum AVTXType type,
     *tx = tmp.fn[0];
 
 #if !CONFIG_SMALL
-    av_log(NULL, AV_LOG_DEBUG, "Transform tree:\n");
+    av_log((void *)&tx_class, AV_LOG_DEBUG, "Transform tree:\n");
     print_tx_structure(*ctx, 0);
 #endif
 
-- 
2.40.0
Re: [FFmpeg-devel] [PATCH] lavu/tx: stop using av_log(NULL, )
On 5/24/23, Lynne wrote:
> Patch attached.

Probably fine.
Re: [FFmpeg-devel] [PATCH] lavu/tx: stop using av_log(NULL, )
On 5/24/23 16:35, Lynne wrote:
> Patch attached.
>
> +av_log((void *)&tx_class, AV_LOG_DEBUG, "%s\n", bp.str);

The type of the first argument to av_log should be AVClass **, but this appears to be only AVClass *. See libavutil/log.c line 428.

- Leo Izen
Re: [FFmpeg-devel] [PATCH v4 1/7] avcodec/la: add LSX optimization for h264 idct.
On Wed, May 24, 2023 at 03:48:27PM +0800, Hao Chen wrote:
> From: Shiyou Yin
>
> loongson_asm.S is LoongArch asm optimization helper.
> Add functions:
> ff_h264_idct_add_8_lsx
> ff_h264_idct8_add_8_lsx
> ff_h264_idct_dc_add_8_lsx
> ff_h264_idct8_dc_add_8_lsx
> ff_h264_idct_add16_8_lsx
> ff_h264_idct8_add4_8_lsx
> ff_h264_idct_add8_8_lsx
> ff_h264_idct_add8_422_8_lsx
> ff_h264_idct_add16_intra_8_lsx
> ff_h264_luma_dc_dequant_idct_8_lsx
> Replaced function(LSX is sufficient for these functions):
> ff_h264_idct_add_lasx
> ff_h264_idct4x4_addblk_dc_lasx
> ff_h264_idct_add16_lasx
> ff_h264_idct8_add4_lasx
> ff_h264_idct_add8_lasx
> ff_h264_idct_add8_422_lasx
> ff_h264_idct_add16_intra_lasx
> ff_h264_deq_idct_luma_dc_lasx
> Renamed functions:
> ff_h264_idct8_addblk_lasx ==> ff_h264_idct8_add_8_lasx
> ff_h264_idct8_dc_addblk_lasx ==> ff_h264_idct8_dc_add_8_lasx
>
> ./configure --disable-lasx
> ffmpeg -i 1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -y /dev/null -an
> before: 155fps
> after: 161fps
> ---
> libavcodec/loongarch/Makefile | 3 +-
> libavcodec/loongarch/h264_deblock_lasx.c | 2 +-
> libavcodec/loongarch/h264dsp_init_loongarch.c | 39 +-
> libavcodec/loongarch/h264dsp_lasx.c | 2 +-
> .../{h264dsp_lasx.h => h264dsp_loongarch.h} | 60 +-
> libavcodec/loongarch/h264idct.S | 658
> libavcodec/loongarch/h264idct_lasx.c | 498 -
> libavcodec/loongarch/h264idct_loongarch.c | 184
> libavcodec/loongarch/loongson_asm.S | 945 ++
> 9 files changed, 1848 insertions(+), 543 deletions(-)
> rename libavcodec/loongarch/{h264dsp_lasx.h => h264dsp_loongarch.h} (68%)
> create mode 100644 libavcodec/loongarch/h264idct.S
> delete mode 100644 libavcodec/loongarch/h264idct_lasx.c
> create mode 100644 libavcodec/loongarch/h264idct_loongarch.c
> create mode 100644 libavcodec/loongarch/loongson_asm.S

Applying: avcodec/la: add LSX optimization for h264 idct.
.git/rebase-apply/patch:1431: tab in indent.
} else if (nnz) {
warning: 1 line adds whitespace errors.

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

If the United States is serious about tackling the national security threats related to an insecure 5G network, it needs to rethink the extent to which it values corporate profits and government espionage over security.
-Bruce Schneier
Re: [FFmpeg-devel] [PATCH 55/97] Vulkan patchset part 2 - hwcontext rewrite and filtering
May 22, 2023, 10:26 by d...@lynne.ee:

> Planning on pushing this partially (no encoding) tomorrow unless there are more comments.
> All known issues have been fixed, and if there are more issues, they can be found as users test it.
> Added APIchanges and bumped minor for lavu and lavc.

Planning to push this in 2 days unless there are more comments. All known issues have been addressed.
Re: [FFmpeg-devel] [PATCH] lavu/tx: stop using av_log(NULL, )
May 24, 2023, 23:24 by leo.i...@gmail.com:

> On 5/24/23 16:35, Lynne wrote:
>
>> Patch attached.
>
> +av_log((void *)&tx_class, AV_LOG_DEBUG, "%s\n", bp.str);
>
> The type of the first argument to av_log should be AVClass **, but this only appears to be AVClass *. See libavutil/log.c line 428.
>
> - Leo Izen

Right, thanks, changed to:

static const AVClass tx_class = {
    .class_name = "tx",
    .item_name  = av_default_item_name,
    .version    = LIBAVUTIL_VERSION_INT,
};

static const struct {
    const AVClass *tx_class;
} tx_log = {
    &tx_class,
};

Will push this tomorrow.
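The point raised in this exchange is that the logger dereferences its context argument once to find the class, so the context must be an object whose first member is a pointer to the class, not the class itself. The sketch below illustrates that convention only. `MiniClass` and `mini_log_name` are hypothetical stand-ins and not libavutil code; the names `tx_class` and `tx_log` mirror the message above, but the member is called `cls` here for readability.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for AVClass; only the name field is modelled. */
typedef struct MiniClass {
    const char *class_name;
} MiniClass;

/* Hypothetical stand-in for the logger's class lookup: it reads a class
 * pointer from the start of whatever object ctx points to. */
static const char *mini_log_name(const void *ctx)
{
    const MiniClass *cls = ctx ? *(const MiniClass *const *)ctx : NULL;
    return cls ? cls->class_name : "(none)";
}

static const MiniClass tx_class = { .class_name = "tx" };

/* Wrapper whose first member points at the class, as in the revised patch. */
static const struct {
    const MiniClass *cls;
} tx_log = { &tx_class };
```

Passing `&tx_log` gives the lookup a valid class pointer. Passing `&tx_class` directly would make it treat the `class_name` string pointer as a class pointer, which is the mismatch the review caught.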
Re: [FFmpeg-devel] [PATCH v3 7/7] avutil/la: Add function performance testing
On 2023/5/24 7:03 PM, Rémi Denis-Courmont wrote:
> On 24 May 2023 10:39:59 GMT+03:00, Hao Chen wrote:
>> On 2023/5/20 5:38 PM, Rémi Denis-Courmont wrote:
>>> On Saturday, 20 May 2023, 10:27:19 EEST, Hao Chen wrote:
>>>> From: yuanhecai
>>>>
>>>> This patch supports the use of the "checkasm --bench" testing feature
>>>> on loongarch platform.
>>>>
>>>> Change-Id: I42790388d057c9ade0dfa38a19d9c1fd44ca0bc3
>>>> ---
>>>> libavutil/loongarch/timer.h | 48 +
>>>> libavutil/timer.h | 2 ++
>>>> 2 files changed, 50 insertions(+)
>>>> create mode 100644 libavutil/loongarch/timer.h
>>>>
>>>> diff --git a/libavutil/loongarch/timer.h b/libavutil/loongarch/timer.h
>>>> new file mode 100644
>>>> index 00..44ed786409
>>>> --- /dev/null
>>>> +++ b/libavutil/loongarch/timer.h
>>>> @@ -0,0 +1,48 @@
>>>> +/*
>>>> + * Copyright (c) 2023 Loongson Technology Corporation Limited
>>>> + * Contributed by Hecai Yuan
>>>> + *
>>>> + * This file is part of FFmpeg.
>>>> + *
>>>> + * FFmpeg is free software; you can redistribute it and/or
>>>> + * modify it under the terms of the GNU Lesser General Public
>>>> + * License as published by the Free Software Foundation; either
>>>> + * version 2.1 of the License, or (at your option) any later version.
>>>> + *
>>>> + * FFmpeg is distributed in the hope that it will be useful,
>>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>>>> + * Lesser General Public License for more details.
>>>> + *
>>>> + * You should have received a copy of the GNU Lesser General Public
>>>> + * License along with FFmpeg; if not, write to the Free Software
>>>> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
>>>> + */
>>>> +
>>>> +#ifndef AVUTIL_LOONGARCH_TIMER_H
>>>> +#define AVUTIL_LOONGARCH_TIMER_H
>>>> +
>>>> +#include
>>>> +#include "config.h"
>>>> +
>>>> +#if HAVE_INLINE_ASM
>>>> +
>>>> +#define AV_READ_TIME read_time
>>>> +
>>>> +static inline uint64_t read_time(void)
>>>> +{
>>>> +
>>>> +#if ARCH_LOONGARCH64
>>>> +    uint64_t a, id = 0;
>>>
>>> Initial value is never used.
>>>
>>>> +    __asm__ volatile ( "rdtime.d %0, %1" : "=r"(a), "=r"(id) :: "memory" );
>>>> +    return a;
>>>> +#else
>>>> +    uint32_t a, id = 0;
>>>> +    __asm__ volatile ( "rdtimel.w %0, %1" : "=r"(a), "=r"(id) :: "memory" );
>>>> +    return (uint64_t)a;
>>>> +#endif
>>>
>>> Why do you clobber memory here?
>>>
>>>> +}
>>>> +
>>>> +#endif /* HAVE_INLINE_ASM */
>>>> +
>>>> +#endif /* AVUTIL_LOONGARCH_TIMER_H */
>>>> diff --git a/libavutil/timer.h b/libavutil/timer.h
>>>> index d3db5a27ef..861ba7e9d7 100644
>>>> --- a/libavutil/timer.h
>>>> +++ b/libavutil/timer.h
>>>> @@ -61,6 +61,8 @@
>>>>  # include "riscv/timer.h"
>>>>  #elif ARCH_X86
>>>>  # include "x86/timer.h"
>>>> +#elif ARCH_LOONGARCH
>>>> +# include "loongarch/timer.h"
>>>>  #endif
>>>>
>>>>  #if !defined(AV_READ_TIME)
>>
>> Thanks for your advice. As described in loongarch's instruction manual, the rdtime.d instruction is used as follows: rdtime.d rd, rj. The rj register stores the counter ID. In this application, the value of counter ID is equal to 0.
>
> You're setting a value, zero, to a variable `id`, that is then used as output operand. As far as the compiler is concerned, the value zero is never used and the initialisation can be elided. The value of register %1 is unspecified.
>
> If you meant for `id` to be an input operand, the constraints are incorrect.

You are right! Thank you very much for your reminder. I will correct it.
Re: [FFmpeg-devel] [PATCH v4 1/7] avcodec/la: add LSX optimization for h264 idct.
On 2023/5/25 5:28 AM, Michael Niedermayer wrote:
> On Wed, May 24, 2023 at 03:48:27PM +0800, Hao Chen wrote:
>> From: Shiyou Yin
>>
>> loongson_asm.S is LoongArch asm optimization helper.
>> Add functions:
>> ff_h264_idct_add_8_lsx
>> ff_h264_idct8_add_8_lsx
>> ff_h264_idct_dc_add_8_lsx
>> ff_h264_idct8_dc_add_8_lsx
>> ff_h264_idct_add16_8_lsx
>> ff_h264_idct8_add4_8_lsx
>> ff_h264_idct_add8_8_lsx
>> ff_h264_idct_add8_422_8_lsx
>> ff_h264_idct_add16_intra_8_lsx
>> ff_h264_luma_dc_dequant_idct_8_lsx
>> Replaced function(LSX is sufficient for these functions):
>> ff_h264_idct_add_lasx
>> ff_h264_idct4x4_addblk_dc_lasx
>> ff_h264_idct_add16_lasx
>> ff_h264_idct8_add4_lasx
>> ff_h264_idct_add8_lasx
>> ff_h264_idct_add8_422_lasx
>> ff_h264_idct_add16_intra_lasx
>> ff_h264_deq_idct_luma_dc_lasx
>> Renamed functions:
>> ff_h264_idct8_addblk_lasx ==> ff_h264_idct8_add_8_lasx
>> ff_h264_idct8_dc_addblk_lasx ==> ff_h264_idct8_dc_add_8_lasx
>>
>> ./configure --disable-lasx
>> ffmpeg -i 1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -y /dev/null -an
>> before: 155fps
>> after: 161fps
>> ---
>> libavcodec/loongarch/Makefile | 3 +-
>> libavcodec/loongarch/h264_deblock_lasx.c | 2 +-
>> libavcodec/loongarch/h264dsp_init_loongarch.c | 39 +-
>> libavcodec/loongarch/h264dsp_lasx.c | 2 +-
>> .../{h264dsp_lasx.h => h264dsp_loongarch.h} | 60 +-
>> libavcodec/loongarch/h264idct.S | 658
>> libavcodec/loongarch/h264idct_lasx.c | 498 -
>> libavcodec/loongarch/h264idct_loongarch.c | 184
>> libavcodec/loongarch/loongson_asm.S | 945 ++
>> 9 files changed, 1848 insertions(+), 543 deletions(-)
>> rename libavcodec/loongarch/{h264dsp_lasx.h => h264dsp_loongarch.h} (68%)
>> create mode 100644 libavcodec/loongarch/h264idct.S
>> delete mode 100644 libavcodec/loongarch/h264idct_lasx.c
>> create mode 100644 libavcodec/loongarch/h264idct_loongarch.c
>> create mode 100644 libavcodec/loongarch/loongson_asm.S
>
> Applying: avcodec/la: add LSX optimization for h264 idct.
> .git/rebase-apply/patch:1431: tab in indent.
> } else if (nnz) {
> warning: 1 line adds whitespace errors.
>
> [...]

Thank you for your feedback. My local git does not have the core.whitespace option set, so this problem went undetected. I will retest all patches and try to avoid similar problems in the future.
Re: [FFmpeg-devel] [PATCH v4 1/7] avcodec/la: add LSX optimization for h264 idct.
> On May 25, 2023, at 05:28, Michael Niedermayer wrote:
>
> On Wed, May 24, 2023 at 03:48:27PM +0800, Hao Chen wrote:
>> From: Shiyou Yin
>>
>> loongson_asm.S is LoongArch asm optimization helper.
>> Add functions:
>> ff_h264_idct_add_8_lsx
>> ff_h264_idct8_add_8_lsx
>> ff_h264_idct_dc_add_8_lsx
>> ff_h264_idct8_dc_add_8_lsx
>> ff_h264_idct_add16_8_lsx
>> ff_h264_idct8_add4_8_lsx
>> ff_h264_idct_add8_8_lsx
>> ff_h264_idct_add8_422_8_lsx
>> ff_h264_idct_add16_intra_8_lsx
>> ff_h264_luma_dc_dequant_idct_8_lsx
>> Replaced function(LSX is sufficient for these functions):
>> ff_h264_idct_add_lasx
>> ff_h264_idct4x4_addblk_dc_lasx
>> ff_h264_idct_add16_lasx
>> ff_h264_idct8_add4_lasx
>> ff_h264_idct_add8_lasx
>> ff_h264_idct_add8_422_lasx
>> ff_h264_idct_add16_intra_lasx
>> ff_h264_deq_idct_luma_dc_lasx
>> Renamed functions:
>> ff_h264_idct8_addblk_lasx ==> ff_h264_idct8_add_8_lasx
>> ff_h264_idct8_dc_addblk_lasx ==> ff_h264_idct8_dc_add_8_lasx
>>
>> ./configure --disable-lasx
>> ffmpeg -i 1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -y /dev/null -an
>> before: 155fps
>> after: 161fps
>> ---
>> libavcodec/loongarch/Makefile | 3 +-
>> libavcodec/loongarch/h264_deblock_lasx.c | 2 +-
>> libavcodec/loongarch/h264dsp_init_loongarch.c | 39 +-
>> libavcodec/loongarch/h264dsp_lasx.c | 2 +-
>> .../{h264dsp_lasx.h => h264dsp_loongarch.h} | 60 +-
>> libavcodec/loongarch/h264idct.S | 658
>> libavcodec/loongarch/h264idct_lasx.c | 498 -
>> libavcodec/loongarch/h264idct_loongarch.c | 184
>> libavcodec/loongarch/loongson_asm.S | 945 ++
>> 9 files changed, 1848 insertions(+), 543 deletions(-)
>> rename libavcodec/loongarch/{h264dsp_lasx.h => h264dsp_loongarch.h} (68%)
>> create mode 100644 libavcodec/loongarch/h264idct.S
>> delete mode 100644 libavcodec/loongarch/h264idct_lasx.c
>> create mode 100644 libavcodec/loongarch/h264idct_loongarch.c
>> create mode 100644 libavcodec/loongarch/loongson_asm.S
>
> Applying: avcodec/la: add LSX optimization for h264 idct.
> .git/rebase-apply/patch:1431: tab in indent.
> } else if (nnz) {
> warning: 1 line adds whitespace errors.

Thanks, I will set core.whitespace in gitconfig to avoid these errors.
Re: [FFmpeg-devel] [PATCH v3 7/7] avutil/la: Add function performance testing
> On May 25, 2023, at 10:36, Hao Chen wrote:
>
> On 2023/5/24 7:03 PM, Rémi Denis-Courmont wrote:
>> On 24 May 2023 10:39:59 GMT+03:00, Hao Chen wrote:
>>> On 2023/5/20 5:38 PM, Rémi Denis-Courmont wrote:
>>>> On Saturday, 20 May 2023, 10:27:19 EEST, Hao Chen wrote:
>>>>> From: yuanhecai
>>>>>
>>>>> This patch supports the use of the "checkasm --bench" testing feature
>>>>> on loongarch platform.
>>>>>
>>>>> Change-Id: I42790388d057c9ade0dfa38a19d9c1fd44ca0bc3
>>>>> ---
>>>>> libavutil/loongarch/timer.h | 48 +
>>>>> libavutil/timer.h | 2 ++
>>>>> 2 files changed, 50 insertions(+)
>>>>> create mode 100644 libavutil/loongarch/timer.h
>>>>>
>>>>> diff --git a/libavutil/loongarch/timer.h b/libavutil/loongarch/timer.h
>>>>> new file mode 100644
>>>>> index 00..44ed786409
>>>>> --- /dev/null
>>>>> +++ b/libavutil/loongarch/timer.h
>>>>> @@ -0,0 +1,48 @@
>>>>> +/*
>>>>> + * Copyright (c) 2023 Loongson Technology Corporation Limited
>>>>> + * Contributed by Hecai Yuan
>>>>> + *
>>>>> + * This file is part of FFmpeg.
>>>>> + *
>>>>> + * FFmpeg is free software; you can redistribute it and/or
>>>>> + * modify it under the terms of the GNU Lesser General Public
>>>>> + * License as published by the Free Software Foundation; either
>>>>> + * version 2.1 of the License, or (at your option) any later version.
>>>>> + *
>>>>> + * FFmpeg is distributed in the hope that it will be useful,
>>>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>>>>> + * Lesser General Public License for more details.
>>>>> + *
>>>>> + * You should have received a copy of the GNU Lesser General Public
>>>>> + * License along with FFmpeg; if not, write to the Free Software
>>>>> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
>>>>> + */
>>>>> +
>>>>> +#ifndef AVUTIL_LOONGARCH_TIMER_H
>>>>> +#define AVUTIL_LOONGARCH_TIMER_H
>>>>> +
>>>>> +#include
>>>>> +#include "config.h"
>>>>> +
>>>>> +#if HAVE_INLINE_ASM
>>>>> +
>>>>> +#define AV_READ_TIME read_time
>>>>> +
>>>>> +static inline uint64_t read_time(void)
>>>>> +{
>>>>> +
>>>>> +#if ARCH_LOONGARCH64
>>>>> +    uint64_t a, id = 0;
>>>>
>>>> Initial value is never used.
>>>>
>>>>> +    __asm__ volatile ( "rdtime.d %0, %1" : "=r"(a), "=r"(id) :: "memory" );
>>>>> +    return a;
>>>>> +#else
>>>>> +    uint32_t a, id = 0;
>>>>> +    __asm__ volatile ( "rdtimel.w %0, %1" : "=r"(a), "=r"(id) :: "memory" );
>>>>> +    return (uint64_t)a;
>>>>> +#endif
>>>>
>>>> Why do you clobber memory here?
>>>>
>>>>> +}
>>>>> +
>>>>> +#endif /* HAVE_INLINE_ASM */
>>>>> +
>>>>> +#endif /* AVUTIL_LOONGARCH_TIMER_H */
>>>>> diff --git a/libavutil/timer.h b/libavutil/timer.h
>>>>> index d3db5a27ef..861ba7e9d7 100644
>>>>> --- a/libavutil/timer.h
>>>>> +++ b/libavutil/timer.h
>>>>> @@ -61,6 +61,8 @@
>>>>>  # include "riscv/timer.h"
>>>>>  #elif ARCH_X86
>>>>>  # include "x86/timer.h"
>>>>> +#elif ARCH_LOONGARCH
>>>>> +# include "loongarch/timer.h"
>>>>>  #endif
>>>>>
>>>>>  #if !defined(AV_READ_TIME)
>>>
>>> Thanks for your advice. As described in loongarch's instruction manual, the rdtime.d instruction is used as follows: rdtime.d rd, rj. The rj register stores the counter ID. In this application, the value of counter ID is equal to 0.
>>
>> You're setting a value, zero, to a variable `id`, that is then used as output operand. As far as the compiler is concerned, the value zero is never used and the initialisation can be elided. The value of register %1 is unspecified.
>>
>> If you meant for `id` to be an input operand, the constraints are incorrect.
>
> You are right! Thank you very much for your reminder. I will correct it.

id is an output operand, so the constraints are correct; only the initialization of id is unnecessary.
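The conclusion reached in this thread is that `rdtime.d rd, rj` writes both registers, so both C variables are output operands and neither needs an initial value. The sketch below shows that form. It is an illustration under stated assumptions, not the committed FFmpeg code: the function name `read_time_sketch` is hypothetical, the `__loongarch__` and `__loongarch_grlen` macros are assumed to be provided by the toolchain, and the `clock()` fallback exists only so the example also compiles on other hosts. The "memory" clobber is left out here because the thread questions it without settling it.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical helper; a sketch of the operand usage discussed above. */
static inline uint64_t read_time_sketch(void)
{
#if defined(__loongarch__) && defined(__loongarch_grlen) && __loongarch_grlen == 64
    /* Both operands are written by the instruction, so both use "=r"
     * and neither variable is initialised beforehand. */
    uint64_t count, id;
    __asm__ volatile ("rdtime.d %0, %1" : "=r"(count), "=r"(id));
    (void)id;   /* counter ID is not needed for benchmarking */
    return count;
#else
    /* Assumption for portability of this example only: fall back to the
     * standard C processor-time clock on non-LoongArch hosts. */
    return (uint64_t)clock();
#endif
}
```

Two consecutive calls should return non-decreasing values, which is all a benchmark harness needs when it subtracts a start reading from an end reading.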