Re: [FFmpeg-devel] [PATCH v3 7/7] avutil/la: Add function performance testing

2023-05-24 Thread Hao Chen


On 2023/5/20 at 5:38 PM, Rémi Denis-Courmont wrote:

On Saturday, 20 May 2023, 10:27:19 EEST, Hao Chen wrote:

From: yuanhecai 

This patch supports the use of the "checkasm --bench" testing feature
on loongarch platform.

Change-Id: I42790388d057c9ade0dfa38a19d9c1fd44ca0bc3
---
  libavutil/loongarch/timer.h | 48 +
  libavutil/timer.h   |  2 ++
  2 files changed, 50 insertions(+)
  create mode 100644 libavutil/loongarch/timer.h

diff --git a/libavutil/loongarch/timer.h b/libavutil/loongarch/timer.h
new file mode 100644
index 00..44ed786409
--- /dev/null
+++ b/libavutil/loongarch/timer.h
@@ -0,0 +1,48 @@
+/*
+ * Copyright (c) 2023 Loongson Technology Corporation Limited
+ * Contributed by Hecai Yuan 
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVUTIL_LOONGARCH_TIMER_H
+#define AVUTIL_LOONGARCH_TIMER_H
+
+#include 
+#include "config.h"
+
+#if HAVE_INLINE_ASM
+
+#define AV_READ_TIME read_time
+
+static inline uint64_t read_time(void)
+{
+
+#if ARCH_LOONGARCH64
+uint64_t a, id = 0;

Initial value is never used.


+__asm__ volatile ( "rdtime.d  %0, %1" : "=r"(a), "=r"(id) :: "memory" );
+return a;
+#else
+uint32_t a, id = 0;
+__asm__ volatile ( "rdtimel.w  %0, %1" : "=r"(a), "=r"(id) :: "memory" );
+return (uint64_t)a;
+#endif

Why do you clobber memory here?


+}
+
+#endif /* HAVE_INLINE_ASM */
+
+#endif /* AVUTIL_LOONGARCH_TIMER_H */
diff --git a/libavutil/timer.h b/libavutil/timer.h
index d3db5a27ef..861ba7e9d7 100644
--- a/libavutil/timer.h
+++ b/libavutil/timer.h
@@ -61,6 +61,8 @@
  #   include "riscv/timer.h"
  #elif ARCH_X86
  #   include "x86/timer.h"
+#elif ARCH_LOONGARCH
+#   include "loongarch/timer.h"
  #endif

  #if !defined(AV_READ_TIME)

    Thanks for your advice. As described in the LoongArch instruction
manual, the rdtime.d instruction is used as follows:
rdtime.d rd, rj. The rj register stores the counter ID. In this
application, the counter ID is 0.
    In addition, clobbering memory is indeed unnecessary. I will
remove it.


___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


[FFmpeg-devel] Add LSX optimization in avcodec and swscale.

2023-05-24 Thread Hao Chen
v1: Add LSX optimization in avcodec and swscale, because the 2K series CPUs
only support LSX.
v2: Modified the implementation of some functions and added support for the 
checkasm --bench feature.
v3: Fix whitespace errors in patch.
v4: Remove clobbering memory in libavutil/loongarch/timer.h

[PATCH v4 1/7] avcodec/la: add LSX optimization for h264 idct.
[PATCH v4 2/7] avcodec/la: Add LSX optimization for loop filter.
[PATCH v4 3/7] avcodec/la: Add LSX optimization for h264 chroma and
[PATCH v4 4/7] avcodec/la: Add LSX optimization for h264 qpel.
[PATCH v4 5/7] swscale/la: Optimize the functions of the swscale
[PATCH v4 6/7] swscale/la: Add following builtin optimized functions
[PATCH v4 7/7] avutil/la: Add function performance testing



[FFmpeg-devel] [PATCH v4 1/7] avcodec/la: add LSX optimization for h264 idct.

2023-05-24 Thread Hao Chen
From: Shiyou Yin 

loongson_asm.S is LoongArch asm optimization helper.
Add functions:
  ff_h264_idct_add_8_lsx
  ff_h264_idct8_add_8_lsx
  ff_h264_idct_dc_add_8_lsx
  ff_h264_idct8_dc_add_8_lsx
  ff_h264_idct_add16_8_lsx
  ff_h264_idct8_add4_8_lsx
  ff_h264_idct_add8_8_lsx
  ff_h264_idct_add8_422_8_lsx
  ff_h264_idct_add16_intra_8_lsx
  ff_h264_luma_dc_dequant_idct_8_lsx
Replaced function(LSX is sufficient for these functions):
  ff_h264_idct_add_lasx
  ff_h264_idct4x4_addblk_dc_lasx
  ff_h264_idct_add16_lasx
  ff_h264_idct8_add4_lasx
  ff_h264_idct_add8_lasx
  ff_h264_idct_add8_422_lasx
  ff_h264_idct_add16_intra_lasx
  ff_h264_deq_idct_luma_dc_lasx
Renamed functions:
  ff_h264_idct8_addblk_lasx ==> ff_h264_idct8_add_8_lasx
  ff_h264_idct8_dc_addblk_lasx ==> ff_h264_idct8_dc_add_8_lasx

./configure --disable-lasx
ffmpeg -i 1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -y /dev/null -an
before: 155fps
after:  161fps
---
 libavcodec/loongarch/Makefile |   3 +-
 libavcodec/loongarch/h264_deblock_lasx.c  |   2 +-
 libavcodec/loongarch/h264dsp_init_loongarch.c |  39 +-
 libavcodec/loongarch/h264dsp_lasx.c   |   2 +-
 .../{h264dsp_lasx.h => h264dsp_loongarch.h}   |  60 +-
 libavcodec/loongarch/h264idct.S   | 658 
 libavcodec/loongarch/h264idct_lasx.c  | 498 -
 libavcodec/loongarch/h264idct_loongarch.c | 184 
 libavcodec/loongarch/loongson_asm.S   | 945 ++
 9 files changed, 1848 insertions(+), 543 deletions(-)
 rename libavcodec/loongarch/{h264dsp_lasx.h => h264dsp_loongarch.h} (68%)
 create mode 100644 libavcodec/loongarch/h264idct.S
 delete mode 100644 libavcodec/loongarch/h264idct_lasx.c
 create mode 100644 libavcodec/loongarch/h264idct_loongarch.c
 create mode 100644 libavcodec/loongarch/loongson_asm.S

diff --git a/libavcodec/loongarch/Makefile b/libavcodec/loongarch/Makefile
index c1b5de5c44..34ebbbe133 100644
--- a/libavcodec/loongarch/Makefile
+++ b/libavcodec/loongarch/Makefile
@@ -12,7 +12,6 @@ OBJS-$(CONFIG_HEVC_DECODER)   += 
loongarch/hevcdsp_init_loongarch.o
 LASX-OBJS-$(CONFIG_H264CHROMA)+= loongarch/h264chroma_lasx.o
 LASX-OBJS-$(CONFIG_H264QPEL)  += loongarch/h264qpel_lasx.o
 LASX-OBJS-$(CONFIG_H264DSP)   += loongarch/h264dsp_lasx.o \
- loongarch/h264idct_lasx.o \
  loongarch/h264_deblock_lasx.o
 LASX-OBJS-$(CONFIG_H264PRED)  += loongarch/h264_intrapred_lasx.o
 LASX-OBJS-$(CONFIG_VC1_DECODER)   += loongarch/vc1dsp_lasx.o
@@ -31,3 +30,5 @@ LSX-OBJS-$(CONFIG_HEVC_DECODER)   += 
loongarch/hevcdsp_lsx.o \
  loongarch/hevc_mc_bi_lsx.o \
  loongarch/hevc_mc_uni_lsx.o \
  loongarch/hevc_mc_uniw_lsx.o
+LSX-OBJS-$(CONFIG_H264DSP)+= loongarch/h264idct.o \
+ loongarch/h264idct_loongarch.o
diff --git a/libavcodec/loongarch/h264_deblock_lasx.c 
b/libavcodec/loongarch/h264_deblock_lasx.c
index c89bea9a84..eead931dcf 100644
--- a/libavcodec/loongarch/h264_deblock_lasx.c
+++ b/libavcodec/loongarch/h264_deblock_lasx.c
@@ -20,7 +20,7 @@
  */
 
 #include "libavcodec/bit_depth_template.c"
-#include "h264dsp_lasx.h"
+#include "h264dsp_loongarch.h"
 #include "libavutil/loongarch/loongson_intrinsics.h"
 
 #define H264_LOOP_FILTER_STRENGTH_ITERATION_LASX(edges, step, mask_mv, dir, \
diff --git a/libavcodec/loongarch/h264dsp_init_loongarch.c 
b/libavcodec/loongarch/h264dsp_init_loongarch.c
index 37633c3e51..cb07deb398 100644
--- a/libavcodec/loongarch/h264dsp_init_loongarch.c
+++ b/libavcodec/loongarch/h264dsp_init_loongarch.c
@@ -21,13 +21,32 @@
  */
 
 #include "libavutil/loongarch/cpu.h"
-#include "h264dsp_lasx.h"
+#include "h264dsp_loongarch.h"
 
 av_cold void ff_h264dsp_init_loongarch(H264DSPContext *c, const int bit_depth,
const int chroma_format_idc)
 {
 int cpu_flags = av_get_cpu_flags();
 
+if (have_lsx(cpu_flags)) {
+if (bit_depth == 8) {
+c->h264_idct_add = ff_h264_idct_add_8_lsx;
+c->h264_idct8_add= ff_h264_idct8_add_8_lsx;
+c->h264_idct_dc_add  = ff_h264_idct_dc_add_8_lsx;
+c->h264_idct8_dc_add = ff_h264_idct8_dc_add_8_lsx;
+
+if (chroma_format_idc <= 1)
+c->h264_idct_add8 = ff_h264_idct_add8_8_lsx;
+else
+c->h264_idct_add8 = ff_h264_idct_add8_422_8_lsx;
+
+c->h264_idct_add16 = ff_h264_idct_add16_8_lsx;
+c->h264_idct8_add4 = ff_h264_idct8_add4_8_lsx;
+c->h264_luma_dc_dequant_idct = ff_h264_luma_dc_dequant_idct_8_lsx;
+c->h264_idct_add16intra = ff_h264_idct_add16_intra_8_lsx;
+}
+}
+#if HAVE_LASX
 if (have_lasx(cpu_flags)) {
 if (chroma_format_idc <= 1)
 c->h264_loop_filter

[FFmpeg-devel] [PATCH v4 3/7] avcodec/la: Add LSX optimization for h264 chroma and intrapred.

2023-05-24 Thread Hao Chen
From: Lu Wang 

./configure --disable-lasx
ffmpeg -i 1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -y /dev/null -an
before: 199fps
after:  214fps
---
 libavcodec/loongarch/Makefile |4 +-
 .../loongarch/h264_intrapred_init_loongarch.c |   18 +-
 libavcodec/loongarch/h264_intrapred_lasx.c|  121 --
 ...pred_lasx.h => h264_intrapred_loongarch.h} |   12 +-
 libavcodec/loongarch/h264chroma.S |  966 +
 .../loongarch/h264chroma_init_loongarch.c |   10 +-
 libavcodec/loongarch/h264chroma_lasx.c| 1280 -
 libavcodec/loongarch/h264chroma_lasx.h|   36 -
 libavcodec/loongarch/h264chroma_loongarch.h   |   41 +
 libavcodec/loongarch/h264intrapred.S  |  299 
 10 files changed, 1342 insertions(+), 1445 deletions(-)
 delete mode 100644 libavcodec/loongarch/h264_intrapred_lasx.c
 rename libavcodec/loongarch/{h264_intrapred_lasx.h => 
h264_intrapred_loongarch.h} (70%)
 create mode 100644 libavcodec/loongarch/h264chroma.S
 delete mode 100644 libavcodec/loongarch/h264chroma_lasx.c
 delete mode 100644 libavcodec/loongarch/h264chroma_lasx.h
 create mode 100644 libavcodec/loongarch/h264chroma_loongarch.h
 create mode 100644 libavcodec/loongarch/h264intrapred.S

diff --git a/libavcodec/loongarch/Makefile b/libavcodec/loongarch/Makefile
index 111bc23e4e..a563055161 100644
--- a/libavcodec/loongarch/Makefile
+++ b/libavcodec/loongarch/Makefile
@@ -9,11 +9,9 @@ OBJS-$(CONFIG_HPELDSP)+= 
loongarch/hpeldsp_init_loongarch.o
 OBJS-$(CONFIG_IDCTDSP)+= loongarch/idctdsp_init_loongarch.o
 OBJS-$(CONFIG_VIDEODSP)   += loongarch/videodsp_init.o
 OBJS-$(CONFIG_HEVC_DECODER)   += loongarch/hevcdsp_init_loongarch.o
-LASX-OBJS-$(CONFIG_H264CHROMA)+= loongarch/h264chroma_lasx.o
 LASX-OBJS-$(CONFIG_H264QPEL)  += loongarch/h264qpel_lasx.o
 LASX-OBJS-$(CONFIG_H264DSP)   += loongarch/h264dsp_lasx.o \
  loongarch/h264_deblock_lasx.o
-LASX-OBJS-$(CONFIG_H264PRED)  += loongarch/h264_intrapred_lasx.o
 LASX-OBJS-$(CONFIG_VC1_DECODER)   += loongarch/vc1dsp_lasx.o
 LASX-OBJS-$(CONFIG_HPELDSP)   += loongarch/hpeldsp_lasx.o
 LASX-OBJS-$(CONFIG_IDCTDSP)   += loongarch/simple_idct_lasx.o  \
@@ -33,3 +31,5 @@ LSX-OBJS-$(CONFIG_HEVC_DECODER)   += 
loongarch/hevcdsp_lsx.o \
 LSX-OBJS-$(CONFIG_H264DSP)+= loongarch/h264idct.o \
  loongarch/h264idct_loongarch.o \
  loongarch/h264dsp.o
+LSX-OBJS-$(CONFIG_H264CHROMA) += loongarch/h264chroma.o
+LSX-OBJS-$(CONFIG_H264PRED)   += loongarch/h264intrapred.o
diff --git a/libavcodec/loongarch/h264_intrapred_init_loongarch.c 
b/libavcodec/loongarch/h264_intrapred_init_loongarch.c
index 12620bd842..c415fa30da 100644
--- a/libavcodec/loongarch/h264_intrapred_init_loongarch.c
+++ b/libavcodec/loongarch/h264_intrapred_init_loongarch.c
@@ -21,7 +21,7 @@
 
 #include "libavutil/loongarch/cpu.h"
 #include "libavcodec/h264pred.h"
-#include "h264_intrapred_lasx.h"
+#include "h264_intrapred_loongarch.h"
 
 av_cold void ff_h264_pred_init_loongarch(H264PredContext *h, int codec_id,
  const int bit_depth,
@@ -30,6 +30,22 @@ av_cold void ff_h264_pred_init_loongarch(H264PredContext *h, 
int codec_id,
 int cpu_flags = av_get_cpu_flags();
 
 if (bit_depth == 8) {
+if (have_lsx(cpu_flags)) {
+if (chroma_format_idc <= 1) {
+}
+if (codec_id == AV_CODEC_ID_VP7 || codec_id == AV_CODEC_ID_VP8) {
+} else {
+if (chroma_format_idc <= 1) {
+}
+if (codec_id == AV_CODEC_ID_SVQ3) {
+h->pred16x16[PLANE_PRED8x8] = ff_h264_pred16x16_plane_svq3_8_lsx;
+} else if (codec_id == AV_CODEC_ID_RV40) {
+h->pred16x16[PLANE_PRED8x8] = ff_h264_pred16x16_plane_rv40_8_lsx;
+} else {
+h->pred16x16[PLANE_PRED8x8] = ff_h264_pred16x16_plane_h264_8_lsx;
+}
+}
+}
 if (have_lasx(cpu_flags)) {
 if (chroma_format_idc <= 1) {
 }
diff --git a/libavcodec/loongarch/h264_intrapred_lasx.c 
b/libavcodec/loongarch/h264_intrapred_lasx.c
deleted file mode 100644
index c38cd611b8..00
--- a/libavcodec/loongarch/h264_intrapred_lasx.c
+++ /dev/null
@@ -1,121 +0,0 @@
-/*
- * Copyright (c) 2021 Loongson Technology Corporation Limited
- * Contributed by Hao Chen 
- *
- * This file is part of FFmpeg.
- *
- * FFmpeg is free software; you can redistribute it and/or
- * modify it under the terms of the GNU Lesser General Public
- * License as published by the Free Software Foundation; either
- * version 2.1 of the License, or (at your option) any later version.
- *
- * FFmpeg is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARR

[FFmpeg-devel] [PATCH v4 4/7] avcodec/la: Add LSX optimization for h264 qpel.

2023-05-24 Thread Hao Chen
From: yuanhecai 

./configure --disable-lasx
ffmpeg -i 1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -y /dev/null -an
before: 214fps
after:  274fps
---
 libavcodec/loongarch/Makefile |2 +
 libavcodec/loongarch/h264qpel.S   | 1686 +
 .../loongarch/h264qpel_init_loongarch.c   |   74 +-
 libavcodec/loongarch/h264qpel_lasx.c  |  401 +---
 libavcodec/loongarch/h264qpel_lasx.h  |  158 --
 libavcodec/loongarch/h264qpel_loongarch.h |  312 +++
 libavcodec/loongarch/h264qpel_lsx.c   |  487 +
 7 files changed, 2561 insertions(+), 559 deletions(-)
 create mode 100644 libavcodec/loongarch/h264qpel.S
 delete mode 100644 libavcodec/loongarch/h264qpel_lasx.h
 create mode 100644 libavcodec/loongarch/h264qpel_loongarch.h
 create mode 100644 libavcodec/loongarch/h264qpel_lsx.c

diff --git a/libavcodec/loongarch/Makefile b/libavcodec/loongarch/Makefile
index a563055161..06cfab5c20 100644
--- a/libavcodec/loongarch/Makefile
+++ b/libavcodec/loongarch/Makefile
@@ -31,5 +31,7 @@ LSX-OBJS-$(CONFIG_HEVC_DECODER)   += 
loongarch/hevcdsp_lsx.o \
 LSX-OBJS-$(CONFIG_H264DSP)+= loongarch/h264idct.o \
  loongarch/h264idct_loongarch.o \
  loongarch/h264dsp.o
+LSX-OBJS-$(CONFIG_H264QPEL)   += loongarch/h264qpel.o \
+ loongarch/h264qpel_lsx.o
 LSX-OBJS-$(CONFIG_H264CHROMA) += loongarch/h264chroma.o
 LSX-OBJS-$(CONFIG_H264PRED)   += loongarch/h264intrapred.o
diff --git a/libavcodec/loongarch/h264qpel.S b/libavcodec/loongarch/h264qpel.S
new file mode 100644
index 00..3f885b6ce2
--- /dev/null
+++ b/libavcodec/loongarch/h264qpel.S
@@ -0,0 +1,1686 @@
+/*
+ * Loongson LSX optimized h264qpel
+ *
+ * Copyright (c) 2023 Loongson Technology Corporation Limited
+ * Contributed by Hecai Yuan 
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "loongson_asm.S"
+
+.macro VLD_QPEL8_H_SSRANI_LSX in0, in1, in2, in3, in4
+vld   vr0,\in4,   0
+vldx  vr1,\in4,   a2
+QPEL8_H_LSX   \in0,   \in1
+vssrani.bu.h  \in0,   \in2,   5
+vssrani.bu.h  \in1,   \in3,   5
+.endm
+
+.macro VLDX_QPEL8_H_SSRANI_LSX in0, in1, in2, in3, in4
+vldx  vr0,\in4,   t1
+vldx  vr1,\in4,   t2
+QPEL8_H_LSX   \in0,   \in1
+vssrani.bu.h  \in0,   \in2,   5
+vssrani.bu.h  \in1,   \in3,   5
+.endm
+
+.macro VLD_DOUBLE_QPEL8_H_SSRANI_LSX in0, in1, in2, in3, in4, in5, in6, in7, in8
+vld   vr0,\in8,   0
+vldx  vr1,\in8,   a2
+QPEL8_H_LSX   \in0,   \in1
+vssrani.bu.h  \in0,   \in4,   5
+vssrani.bu.h  \in1,   \in5,   5
+vldx  vr0,\in8,   t1
+vldx  vr1,\in8,   t2
+QPEL8_H_LSX   \in2,   \in3
+vssrani.bu.h  \in2,   \in6,   5
+vssrani.bu.h  \in3,   \in7,   5
+.endm
+
+function ff_put_h264_qpel16_mc00_lsx
+slli.dt0, a2, 1
+add.d t1, t0, a2
+slli.dt2, t0, 1
+.rept 4
+vld   vr0,a1, 0
+vldx  vr1,a1, a2
+vldx  vr2,a1, t0
+vldx  vr3,a1, t1
+add.d a1, a1, t2
+vst   vr0,a0, 0
+vstx  vr1,a0, a2
+vstx  vr2,a0, t0
+vstx  vr3,a0, t1
+add.d a0, a0, t2
+.endr
+endfunc
+
+.macro QPEL8_H_LSX out0, out1
+vbsrl.v   vr2,vr0,1
+vbsrl.v   vr3,vr1,1
+vbsrl.v   vr4,vr0,2
+vbsrl.v   vr5,vr1,2
+vbsrl.v   vr6,vr0,3
+vbsrl.v   vr7,vr1,3
+vbsrl.v   vr8,vr0,4
+vbsrl.v   vr9,vr1,4
+vbsrl.v   vr10,   vr0,5
+vbsrl.v   vr11,   vr1,5
+
+vilvl.b   vr6,vr4,vr6
+vilvl.b   vr7,vr5,vr7
+vilvl.b   vr8,vr2,vr8
+vilvl.b   vr9,vr3,vr9
+vilvl.b   vr10,   vr0,vr10
+vilvl.b   vr11,   vr1,vr11
+vhaddw.hu.bu  vr6,vr6,vr6
+vhaddw.hu.bu  vr7,vr7,vr7
+vhaddw.hu.bu  vr8,vr8,vr8
+vh

[FFmpeg-devel] [PATCH v4 7/7] avutil/la: Add function performance testing

2023-05-24 Thread Hao Chen
From: yuanhecai 

This patch supports the use of the "checkasm --bench" testing feature
on loongarch platform.

Change-Id: I42790388d057c9ade0dfa38a19d9c1fd44ca0bc3
---
 libavutil/loongarch/timer.h | 48 +
 libavutil/timer.h   |  2 ++
 2 files changed, 50 insertions(+)
 create mode 100644 libavutil/loongarch/timer.h

diff --git a/libavutil/loongarch/timer.h b/libavutil/loongarch/timer.h
new file mode 100644
index 00..d70b88c859
--- /dev/null
+++ b/libavutil/loongarch/timer.h
@@ -0,0 +1,48 @@
+/*
+ * Copyright (c) 2023 Loongson Technology Corporation Limited
+ * Contributed by Hecai Yuan 
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVUTIL_LOONGARCH_TIMER_H
+#define AVUTIL_LOONGARCH_TIMER_H
+
+#include 
+#include "config.h"
+
+#if HAVE_INLINE_ASM
+
+#define AV_READ_TIME read_time
+
+static inline uint64_t read_time(void)
+{
+
+#if ARCH_LOONGARCH64
+uint64_t a, id = 0;
+__asm__ volatile ( "rdtime.d  %0, %1" : "=r"(a), "=r"(id));
+return a;
+#else
+uint32_t a, id = 0;
+__asm__ volatile ( "rdtimel.w  %0, %1" : "=r"(a), "=r"(id));
+return (uint64_t)a;
+#endif
+}
+
+#endif /* HAVE_INLINE_ASM */
+
+#endif /* AVUTIL_LOONGARCH_TIMER_H */
diff --git a/libavutil/timer.h b/libavutil/timer.h
index d3db5a27ef..861ba7e9d7 100644
--- a/libavutil/timer.h
+++ b/libavutil/timer.h
@@ -61,6 +61,8 @@
 #   include "riscv/timer.h"
 #elif ARCH_X86
 #   include "x86/timer.h"
+#elif ARCH_LOONGARCH
+#   include "loongarch/timer.h"
 #endif
 
 #if !defined(AV_READ_TIME)
-- 
2.20.1



[FFmpeg-devel] [PATCH v4 6/7] swscale/la: Add following builtin optimized functions

2023-05-24 Thread Hao Chen
From: Jin Bo 

yuv420_rgb24_lsx
yuv420_bgr24_lsx
yuv420_rgba32_lsx
yuv420_argb32_lsx
yuv420_bgra32_lsx
yuv420_abgr32_lsx
./configure --disable-lasx
ffmpeg -i ~/media/1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo
-pix_fmt rgb24 -y /dev/null -an
before: 184fps
after:  207fps
---
 libswscale/loongarch/Makefile |   3 +-
 libswscale/loongarch/swscale_init_loongarch.c |  30 +-
 libswscale/loongarch/swscale_loongarch.h  |  18 +
 libswscale/loongarch/yuv2rgb_lsx.c| 361 ++
 4 files changed, 410 insertions(+), 2 deletions(-)
 create mode 100644 libswscale/loongarch/yuv2rgb_lsx.c

diff --git a/libswscale/loongarch/Makefile b/libswscale/loongarch/Makefile
index c0b6a449c0..c35ba309a4 100644
--- a/libswscale/loongarch/Makefile
+++ b/libswscale/loongarch/Makefile
@@ -8,4 +8,5 @@ LSX-OBJS-$(CONFIG_SWSCALE)  += loongarch/swscale.o \
loongarch/swscale_lsx.o \
loongarch/input.o   \
loongarch/output.o  \
-   loongarch/output_lsx.o
+   loongarch/output_lsx.o  \
+   loongarch/yuv2rgb_lsx.o
diff --git a/libswscale/loongarch/swscale_init_loongarch.c 
b/libswscale/loongarch/swscale_init_loongarch.c
index c13a1662ec..53e4f970b6 100644
--- a/libswscale/loongarch/swscale_init_loongarch.c
+++ b/libswscale/loongarch/swscale_init_loongarch.c
@@ -90,8 +90,8 @@ av_cold void rgb2rgb_init_loongarch(void)
 
 av_cold SwsFunc ff_yuv2rgb_init_loongarch(SwsContext *c)
 {
-#if HAVE_LASX
 int cpu_flags = av_get_cpu_flags();
+#if HAVE_LASX
 if (have_lasx(cpu_flags)) {
 switch (c->dstFormat) {
 case AV_PIX_FMT_RGB24:
@@ -121,5 +121,33 @@ av_cold SwsFunc ff_yuv2rgb_init_loongarch(SwsContext *c)
 }
 }
 #endif // #if HAVE_LASX
+if (have_lsx(cpu_flags)) {
+switch (c->dstFormat) {
+case AV_PIX_FMT_RGB24:
+return yuv420_rgb24_lsx;
+case AV_PIX_FMT_BGR24:
+return yuv420_bgr24_lsx;
+case AV_PIX_FMT_RGBA:
+if (CONFIG_SWSCALE_ALPHA && isALPHA(c->srcFormat)) {
+break;
+} else
+return yuv420_rgba32_lsx;
+case AV_PIX_FMT_ARGB:
+if (CONFIG_SWSCALE_ALPHA && isALPHA(c->srcFormat)) {
+break;
+} else
+return yuv420_argb32_lsx;
+case AV_PIX_FMT_BGRA:
+if (CONFIG_SWSCALE_ALPHA && isALPHA(c->srcFormat)) {
+break;
+} else
+return yuv420_bgra32_lsx;
+case AV_PIX_FMT_ABGR:
+if (CONFIG_SWSCALE_ALPHA && isALPHA(c->srcFormat)) {
+break;
+} else
+return yuv420_abgr32_lsx;
+}
+}
 return NULL;
 }
diff --git a/libswscale/loongarch/swscale_loongarch.h 
b/libswscale/loongarch/swscale_loongarch.h
index bc29913ac6..0514abae21 100644
--- a/libswscale/loongarch/swscale_loongarch.h
+++ b/libswscale/loongarch/swscale_loongarch.h
@@ -62,6 +62,24 @@ void ff_yuv2planeX_8_lsx(const int16_t *filter, int 
filterSize,
 
 av_cold void ff_sws_init_output_lsx(SwsContext *c);
 
+int yuv420_rgb24_lsx(SwsContext *c, const uint8_t *src[], int srcStride[],
+ int srcSliceY, int srcSliceH, uint8_t *dst[], int dstStride[]);
+
+int yuv420_bgr24_lsx(SwsContext *c, const uint8_t *src[], int srcStride[],
+ int srcSliceY, int srcSliceH, uint8_t *dst[], int dstStride[]);
+
+int yuv420_rgba32_lsx(SwsContext *c, const uint8_t *src[], int srcStride[],
+  int srcSliceY, int srcSliceH, uint8_t *dst[], int dstStride[]);
+
+int yuv420_bgra32_lsx(SwsContext *c, const uint8_t *src[], int srcStride[],
+  int srcSliceY, int srcSliceH, uint8_t *dst[], int dstStride[]);
+
+int yuv420_argb32_lsx(SwsContext *c, const uint8_t *src[], int srcStride[],
+  int srcSliceY, int srcSliceH, uint8_t *dst[], int dstStride[]);
+
+int yuv420_abgr32_lsx(SwsContext *c, const uint8_t *src[], int srcStride[],
+  int srcSliceY, int srcSliceH, uint8_t *dst[], int dstStride[]);
+
 #if HAVE_LASX
 void ff_hscale_8_to_15_lasx(SwsContext *c, int16_t *dst, int dstW,
 const uint8_t *src, const int16_t *filter,
diff --git a/libswscale/loongarch/yuv2rgb_lsx.c 
b/libswscale/loongarch/yuv2rgb_lsx.c
new file mode 100644
index 00..11cd2f79d9
--- /dev/null
+++ b/libswscale/loongarch/yuv2rgb_lsx.c
@@ -0,0 +1,361 @@
+/*
+ * Copyright (C) 2023 Loongson Technology Co. Ltd.
+ * Contributed by Bo Jin(ji...@loongson.cn)
+ * All rights reserved.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by th

Re: [FFmpeg-devel] [PATCH v3 7/7] avutil/la: Add function performance testing

2023-05-24 Thread Rémi Denis-Courmont


On 24 May 2023 10:39:59 GMT+03:00, Hao Chen wrote:
>
>On 2023/5/20 at 5:38 PM, Rémi Denis-Courmont wrote:
>> On Saturday, 20 May 2023, 10:27:19 EEST, Hao Chen wrote:
>>> From: yuanhecai 
>>> 
>>> This patch supports the use of the "checkasm --bench" testing feature
>>> on loongarch platform.
>>> 
>>> Change-Id: I42790388d057c9ade0dfa38a19d9c1fd44ca0bc3
>>> ---
>>>   libavutil/loongarch/timer.h | 48 +
>>>   libavutil/timer.h   |  2 ++
>>>   2 files changed, 50 insertions(+)
>>>   create mode 100644 libavutil/loongarch/timer.h
>>> 
>>> diff --git a/libavutil/loongarch/timer.h b/libavutil/loongarch/timer.h
>>> new file mode 100644
>>> index 00..44ed786409
>>> --- /dev/null
>>> +++ b/libavutil/loongarch/timer.h
>>> @@ -0,0 +1,48 @@
>>> +/*
>>> + * Copyright (c) 2023 Loongson Technology Corporation Limited
>>> + * Contributed by Hecai Yuan 
>>> + *
>>> + * This file is part of FFmpeg.
>>> + *
>>> + * FFmpeg is free software; you can redistribute it and/or
>>> + * modify it under the terms of the GNU Lesser General Public
>>> + * License as published by the Free Software Foundation; either
>>> + * version 2.1 of the License, or (at your option) any later version.
>>> + *
>>> + * FFmpeg is distributed in the hope that it will be useful,
>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>> + * Lesser General Public License for more details.
>>> + *
>>> + * You should have received a copy of the GNU Lesser General Public
>>> + * License along with FFmpeg; if not, write to the Free Software
>>> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
>>> + */
>>> +
>>> +#ifndef AVUTIL_LOONGARCH_TIMER_H
>>> +#define AVUTIL_LOONGARCH_TIMER_H
>>> +
>>> +#include 
>>> +#include "config.h"
>>> +
>>> +#if HAVE_INLINE_ASM
>>> +
>>> +#define AV_READ_TIME read_time
>>> +
>>> +static inline uint64_t read_time(void)
>>> +{
>>> +
>>> +#if ARCH_LOONGARCH64
>>> +uint64_t a, id = 0;
>> Initial value is never used.
>> 
>>> +__asm__ volatile ( "rdtime.d  %0, %1" : "=r"(a), "=r"(id) :: "memory" );
>>> +return a;
>>> +#else
>>> +uint32_t a, id = 0;
>>> +__asm__ volatile ( "rdtimel.w  %0, %1" : "=r"(a), "=r"(id) :: "memory" );
>>> +return (uint64_t)a;
>>> +#endif
>> Why do you clobber memory here?
>> 
>>> +}
>>> +
>>> +#endif /* HAVE_INLINE_ASM */
>>> +
>>> +#endif /* AVUTIL_LOONGARCH_TIMER_H */
>>> diff --git a/libavutil/timer.h b/libavutil/timer.h
>>> index d3db5a27ef..861ba7e9d7 100644
>>> --- a/libavutil/timer.h
>>> +++ b/libavutil/timer.h
>>> @@ -61,6 +61,8 @@
>>>   #   include "riscv/timer.h"
>>>   #elif ARCH_X86
>>>   #   include "x86/timer.h"
>>> +#elif ARCH_LOONGARCH
>>> +#   include "loongarch/timer.h"
>>>   #endif
>>> 
>>>   #if !defined(AV_READ_TIME)
>>> 
>>     Thanks for your advice. As described in the LoongArch instruction manual,
>> the rdtime.d instruction is used as follows:
>> rdtime.d rd, rj. The rj register stores the counter ID. In this application,
>> the counter ID is 0.

You're assigning a value, zero, to a variable `id` that is then used as an
output operand. As far as the compiler is concerned, the value zero is never
used and the initialisation can be elided. The value of register %1 before the
instruction executes is unspecified.

If you meant for `id` to be an input operand, the constraints are incorrect.
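
To illustrate the point about operand constraints, here is a minimal sketch, not the code from the patch. It assumes, as the quoted reply states, that rdtime.d stores the counter ID in rj, so both operands are outputs and neither needs an initializer or a memory clobber. The helper name sketch_read_time, the __loongarch_grlen check, and the clock_gettime() fallback for other architectures are assumptions made for illustration only.

```c
#define _POSIX_C_SOURCE 200809L  /* for clock_gettime() in the fallback path */

#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper; the patch itself names its function read_time(). */
static inline uint64_t sketch_read_time(void)
{
#if defined(__loongarch__) && __loongarch_grlen == 64
    /* Both variables are output operands ("=r"), so the compiler treats
     * any initial value as dead. They are left uninitialized on purpose. */
    uint64_t count, id;
    __asm__ volatile ("rdtime.d %0, %1" : "=r"(count), "=r"(id));
    (void)id; /* counter ID is not needed by the caller */
    return count;
#else
    /* Portable fallback so the sketch also builds on other targets. */
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc;
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
#endif
}
```

If `id` were really meant to be read by the instruction, it would have to appear in the input operand list instead (or use a "+r" constraint); with "=r" the compiler is free to discard whatever was stored in it beforehand.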


[FFmpeg-devel] [PATCH] fftools/ffmpeg_dec: abort if avcodec_send_packet() returns EAGAIN

2023-05-24 Thread James Almer
As the comment in the code mentions, EAGAIN is not an expected value here
because we call avcodec_receive_frame() until all frames have been returned.
avcodec_send_packet() returning EAGAIN means a packet is still buffered, which
hints that the underlying decoder is buggy and not fetching packets as it
should.

An example of this behavior was in the libdav1d wrapper before f209614290,
where feeding it split frames (or individual OBUs) would result in the CLI
eventually printing the unhelpful "Error submitting packet to decoder: Resource
temporarily unavailable" error message, and then just carrying on until EOF
without returning new frames.

Signed-off-by: James Almer 
---
Now compiling.

 fftools/ffmpeg_dec.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/fftools/ffmpeg_dec.c b/fftools/ffmpeg_dec.c
index e06747d9c4..b1db9b30d0 100644
--- a/fftools/ffmpeg_dec.c
+++ b/fftools/ffmpeg_dec.c
@@ -390,6 +390,11 @@ int dec_packet(InputStream *ist, const AVPacket *pkt, int 
no_eof)
 if (ret < 0 && !(ret == AVERROR_EOF && !pkt)) {
 // In particular, we don't expect AVERROR(EAGAIN), because we read all
 // decoded frames with avcodec_receive_frame() until done.
+if (ret == AVERROR(EAGAIN)) {
+av_log(ist, AV_LOG_FATAL, "A decoder returned an unexpected error code. "
+   "This is a bug, please report it.\n");
+exit_program(1);
+}
 av_log(ist, AV_LOG_ERROR, "Error submitting %s to decoder: %s\n",
pkt ? "packet" : "EOF", av_err2str(ret));
 if (exit_on_error)
-- 
2.40.1
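
The reasoning in the commit message can be sketched with a toy model, under stated assumptions. This is not the libavcodec API: ToyDecoder, toy_send_packet, toy_receive_frame and toy_decode are hypothetical names, and the decoder is reduced to a one-packet input buffer. The sketch only shows why a caller that drains all frames after each packet should never see EAGAIN from the send side unless the decoder fails to fetch its buffered packet.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for a decoder with a one-packet input buffer. */
typedef struct ToyDecoder {
    bool has_packet; /* a packet is buffered and not yet consumed */
    bool buggy;      /* if set, the decoder never fetches the buffered packet */
} ToyDecoder;

/* Returns 0 on success, -EAGAIN if the previous packet is still buffered. */
static int toy_send_packet(ToyDecoder *d)
{
    if (d->has_packet)
        return -EAGAIN;
    d->has_packet = true;
    return 0;
}

/* Returns 0 when a frame is produced, -EAGAIN when more input is needed. */
static int toy_receive_frame(ToyDecoder *d)
{
    if (!d->has_packet || d->buggy)
        return -EAGAIN;
    d->has_packet = false; /* the buffered packet is consumed */
    return 0;
}

/* Caller pattern: send one packet, then drain every available frame.
 * Returns the number of frames, or -EAGAIN if sending unexpectedly failed. */
static int toy_decode(ToyDecoder *d, int nb_packets)
{
    int frames = 0;

    assert(d);
    for (int i = 0; i < nb_packets; i++) {
        int ret = toy_send_packet(d);
        if (ret < 0)
            return ret; /* after a full drain, this points at a decoder bug */
        while (toy_receive_frame(d) == 0)
            frames++;
    }
    return frames;
}
```

With a well-behaved toy decoder, every packet yields a frame and the send side never fails; with `buggy` set, the second send returns -EAGAIN, which is the situation the patch turns into a fatal error.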



Re: [FFmpeg-devel] [PATCH 15/15] fftools/sync_queue: make sure non-limiting streams are not used as queue head

2023-05-24 Thread James Almer

On 5/23/2023 10:58 AM, Anton Khirnov wrote:

A non-limiting stream could mistakenly end up being the queue head,
which would then produce incorrect synchronization, seen e.g. in
fate-matroska-flac-extradata-update for certain number of frame threads
(e.g. 5).

Found-By: James Almer


Strictly speaking, it was FATE.


---
  fftools/sync_queue.c | 13 +++--
  1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/fftools/sync_queue.c b/fftools/sync_queue.c
index c0f33e9235..bc107ba4fe 100644
--- a/fftools/sync_queue.c
+++ b/fftools/sync_queue.c
@@ -217,17 +217,26 @@ static void finish_stream(SyncQueue *sq, unsigned int 
stream_idx)
  
  static void queue_head_update(SyncQueue *sq)

  {
+av_assert0(sq->have_limiting);
+
  if (sq->head_stream < 0) {
+unsigned first_limiting = UINT_MAX;
+
  /* wait for one timestamp in each stream before determining
   * the queue head */
  for (unsigned int i = 0; i < sq->nb_streams; i++) {
  SyncQueueStream *st = &sq->streams[i];
-if (st->limiting && st->head_ts == AV_NOPTS_VALUE)
+if (!st->limiting)
+continue;
+if (st->head_ts == AV_NOPTS_VALUE)
  return;
+if (first_limiting == UINT_MAX)
+first_limiting = i;
  }
  
  // placeholder value, correct one will be found below

-sq->head_stream = 0;
+av_assert0(first_limiting < UINT_MAX);
+sq->head_stream = first_limiting;
  }
  
  for (unsigned int i = 0; i < sq->nb_streams; i++) {


Can confirm it fixes the issue.
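
For readers following along, the selection logic of the patch can be sketched in isolation. This is a simplified model, not the code from fftools/sync_queue.c: ToyStream, TOY_NOPTS and toy_initial_head are hypothetical names, and the sketch returns -1 where the patch returns early or asserts.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NOPTS INT64_MIN /* hypothetical stand-in for AV_NOPTS_VALUE */

typedef struct ToyStream {
    bool    limiting; /* only limiting streams may become the queue head */
    int64_t head_ts;  /* timestamp of the first queued frame, or TOY_NOPTS */
} ToyStream;

/* Returns the index of the first limiting stream once every limiting stream
 * has a timestamp; returns -1 while still waiting or if none is limiting. */
static int toy_initial_head(const ToyStream *st, unsigned nb_streams)
{
    unsigned first_limiting = UINT_MAX;

    for (unsigned i = 0; i < nb_streams; i++) {
        if (!st[i].limiting)
            continue;  /* non-limiting streams are never candidates */
        if (st[i].head_ts == TOY_NOPTS)
            return -1; /* wait for one timestamp in each limiting stream */
        if (first_limiting == UINT_MAX)
            first_limiting = i;
    }
    return first_limiting == UINT_MAX ? -1 : (int)first_limiting;
}
```

The point of the change is visible in the first case below: with the old placeholder of 0, a non-limiting stream at index 0 would have been picked as the initial head.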


Re: [FFmpeg-devel] [PATCH] avcodec/videotoolboxenc: replace VT_H264Profile with avctx profile

2023-05-24 Thread Rick Kern
On Mon, May 22, 2023 at 12:17 AM 徐福隆 <839789...@qq.com> wrote:

> It was my mistake; I forgot to remove H264_PROF_AUTO. I will fix that.
> Any other suggestions about the profile_options? Thanks.
>
Nothing else from me - just the default profile selection behavior
zhilizhao mentioned.


>
> -- Original Message --
> *From:* ""zhilizhao(赵志立)"" ;
> *Sent:* Monday, May 22, 2023, 11:11 AM
> *To:* "FFmpeg development discussions and patches"<
> ffmpeg-devel@ffmpeg.org>;
> *Cc:* "徐福隆"<839789...@qq.com>;"Rick Kern";
> *Subject:* Re: [FFmpeg-devel] [PATCH] avcodec/videotoolboxenc: replace
> VT_H264Profile with avctx profile
>
>
>
> > On May 22, 2023, at 11:05, zhilizhao(赵志立) wrote:
> >
> >> On May 21, 2023, at 22:41, xufuji456 <839789...@qq.com> wrote:
> >>
> >> For compatibility with constrained_baseline in the future,
> >> replace VT_H264Profile/VT_HEVCProfile with avctx->profile.
> >>
> >> Signed-off-by: xufuji456 <839789...@qq.com>
> >> ---
> >> libavcodec/videotoolboxenc.c | 55 +++-
> >> 1 file changed, 16 insertions(+), 39 deletions(-)
> >>
> >> diff --git a/libavcodec/videotoolboxenc.c b/libavcodec/videotoolboxenc.c
> >> index b017c90c36..4966ab36ae 100644
> >> --- a/libavcodec/videotoolboxenc.c
> >> +++ b/libavcodec/videotoolboxenc.c
> >> @@ -190,28 +190,12 @@ static void loadVTEncSymbols(void){
> >>"EnableLowLatencyRateControl");
> >> }
> >>
> >> -typedef enum VT_H264Profile {
> >> -H264_PROF_AUTO,
> >> -H264_PROF_BASELINE,
> >> -H264_PROF_MAIN,
> >> -H264_PROF_HIGH,
> >> -H264_PROF_EXTENDED,
> >> -H264_PROF_COUNT
> >> -} VT_H264Profile;
> >> -
> >> typedef enum VTH264Entropy{
> >>VT_ENTROPY_NOT_SET,
> >>VT_CAVLC,
> >>VT_CABAC
> >> } VTH264Entropy;
> >>
> >> -typedef enum VT_HEVCProfile {
> >> -HEVC_PROF_AUTO,
> >> -HEVC_PROF_MAIN,
> >> -HEVC_PROF_MAIN10,
> >> -HEVC_PROF_COUNT
> >> -} VT_HEVCProfile;
> >> -
> >> static const uint8_t start_code[] = { 0, 0, 0, 1 };
> >>
> >> typedef struct ExtraSEI {
> >> @@ -730,18 +714,13 @@ static bool
> get_vt_h264_profile_level(AVCodecContext *avctx,
> >>VTEncContext *vtctx = avctx->priv_data;
> >>int64_t profile = vtctx->profile;
> >>
> >> -if (profile == H264_PROF_AUTO && vtctx->level) {
> >> -//Need to pick a profile if level is not auto-selected.
> >> -profile = vtctx->has_b_frames ? H264_PROF_MAIN :
> H264_PROF_BASELINE;
> >> -}
> >> -
> >>*profile_level_val = NULL;
> >>
> >>switch (profile) {
> >>case H264_PROF_AUTO:
> >>return true;
>
> Doesn't this fail to build, since H264_PROF_AUTO is no longer defined?
>
Please be sure to compile and test before submitting a patch.
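
One way the leftover case label could be handled is sketched below. This is an assumption on my part, not the submitted fix: it treats the "unknown" profile value as the auto case. The numeric values are the ones libavcodec uses for FF_PROFILE_UNKNOWN and the FF_PROFILE_H264_* constants, but they are redefined under hypothetical SKETCH_ names so the example builds without FFmpeg headers, and the VideoToolbox keys are returned as plain strings purely for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Same numeric values as libavcodec's FF_PROFILE_* constants, under
 * hypothetical SKETCH_ names so no FFmpeg header is needed. */
enum {
    SKETCH_PROFILE_UNKNOWN       = -99,
    SKETCH_PROFILE_H264_BASELINE = 66,
    SKETCH_PROFILE_H264_MAIN     = 77,
    SKETCH_PROFILE_H264_HIGH     = 100,
};

/* Maps a profile to the name of an auto-level key.
 * Returns true on success; *key stays NULL for the auto case. */
static bool sketch_h264_profile_key(int profile, const char **key)
{
    *key = NULL;

    switch (profile) {
    case SKETCH_PROFILE_UNKNOWN:      /* replaces the removed H264_PROF_AUTO */
        return true;
    case SKETCH_PROFILE_H264_BASELINE:
        *key = "kVTProfileLevel_H264_Baseline_AutoLevel";
        break;
    case SKETCH_PROFILE_H264_MAIN:
        *key = "kVTProfileLevel_H264_Main_AutoLevel";
        break;
    case SKETCH_PROFILE_H264_HIGH:
        *key = "kVTProfileLevel_H264_High_AutoLevel";
        break;
    }

    /* Any profile without a mapping is reported as a failure. */
    return *key != NULL;
}
```

The point is only that every enumerator removed by the patch needs a replacement label in the switch, otherwise the file cannot compile.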


>
> >>
> >> -case H264_PROF_BASELINE:
> >> +case FF_PROFILE_H264_BASELINE:
> >>switch (vtctx->level) {
> >>case  0: *profile_level_val =
> >>
> compat_keys.kVTProfileLevel_H264_Baseline_AutoLevel; break;
> >> @@ -763,7 +742,7 @@ static bool
> get_vt_h264_profile_level(AVCodecContext *avctx,
> >>}
> >>break;
> >>
> >> -case H264_PROF_MAIN:
> >> +case FF_PROFILE_H264_MAIN:
> >>switch (vtctx->level) {
> >>case  0: *profile_level_val =
> >>
> compat_keys.kVTProfileLevel_H264_Main_AutoLevel; break;
> >> @@ -782,7 +761,7 @@ static bool
> get_vt_h264_profile_level(AVCodecContext *avctx,
> >>}
> >>break;
> >>
> >> -case H264_PROF_HIGH:
> >> +case FF_PROFILE_H264_HIGH:
> >>switch (vtctx->level) {
> >>case  0: *profile_level_val =
> >>
> compat_keys.kVTProfileLevel_H264_High_AutoLevel; break;
> >> @@ -805,7 +784,7 @@ static bool
> get_vt_h264_profile_level(AVCodecContext *avctx,
> >>
> compat_keys.kVTProfileLevel_H264_High_5_2;   break;
> >>}
> >>break;
> >> -case H264_PROF_EXTENDED:
> >> +case FF_PROFILE_H264_EXTENDED:
> >>switch (vtctx->level) {
> >>case  0: *profile_level_val =
> >>
> compat_keys.kVTProfileLevel_H264_Extended_AutoLevel; break;
> >> @@ -838,13 +817,11 @@ static bool
> get_vt_hevc_profile_level(AVCodecContext *avctx,
> >>*profile_level_val = NULL;
> >>
> >>switch (profile) {
> >> -case HEVC_PROF_AUTO:
> >> -return true;
> >> -case HEVC_PROF_MAIN:
> >> +case FF_PROFILE_HEVC_MAIN:
> >>*profile_level_val =
> >>compat_keys.kVTProfileLevel_HEVC_Main_AutoLevel;
> >>break;
> >> -case HEVC_PROF_MAIN10:
> >> +case FF_PROFILE_HEVC_MAIN_10:
> >>*profile_level_val =
> >>compat_keys.kVTProfileLevel_HEVC_Main10_AutoLevel;
> >>break;
> >> @@ -1515,12 +1492,12 @@ static int
> vtenc_configure_encoder(AVCodecContext *avctx)
> >>vtctx->get_param_set_func =
> CMVideoFormatDescriptionGetH264ParameterSetAtIndex;
> >>
> >>vtctx->has_b_frames = avctx->ma

Re: [FFmpeg-devel] [PATCH v4] lavc/h264chroma: RISC-V V add motion compensation for 8x8 chroma blocks

2023-05-24 Thread Rémi Denis-Courmont
On Wednesday, 24 May 2023, 8:28:08 EEST, Arnie Chang wrote:
> diff --git a/libavcodec/riscv/h264_mc_chroma.S b/libavcodec/riscv/h264_mc_chroma.S
> new file mode 100644
> index 00..9fcd2e34b3
> --- /dev/null
> +++ b/libavcodec/riscv/h264_mc_chroma.S
> @@ -0,0 +1,307 @@
> +/*
> + * Copyright (c) 2023 SiFive, Inc. All rights reserved.
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +#include "libavutil/riscv/asm.S"
> +
> +.macro  h264_chroma_mc8 type
> +func h264_\type\()_chroma_mc8_rvv, zve32x
> +    csrw    vxrm, zero
> +    slli    t2, a5, 3
> +    mulw    t1, a5, a4

This should probably be mul.

> +    sh3add  a5, a4, t2
> +    slli    a4, a4, 3
> +    sub     a5, t1, a5
> +    sub     a7, a4, t1
> +    addi    a6, a5, 64
> +    sub     t0, t2, t1
> +    vsetivli t3, 8, e8, m1, ta, mu
> +    beqz    t1, 2f
> +    blez    a3, 8f
> +    li      t4, 0
> +    li      t2, 0
> +    li      t5, 1
> +    addi    a5, t3, 1
> +    slli    t3, a2, 2
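
The comment above does not spell out the reasoning. One likely motivation is that mulw only exists on RV64, whereas mul is available on both RV32 and RV64 with the M extension. As a rough illustration of how the two differ, here is a C model of their RV64 semantics; model_mul and model_mulw are hypothetical helper names, and the conversions to signed types rely on the usual two's-complement behaviour of mainstream compilers.

```c
#include <stdint.h>

/* Models RV64 `mul`: the low 64 bits of the full product. */
static int64_t model_mul(int64_t a, int64_t b)
{
    return (int64_t)((uint64_t)a * (uint64_t)b);
}

/* Models RV64 `mulw`: multiply only the low 32 bits of each operand,
 * then sign-extend the 32-bit result to 64 bits. */
static int64_t model_mulw(int64_t a, int64_t b)
{
    uint32_t lo = (uint32_t)a * (uint32_t)b;
    return (int64_t)(int32_t)lo;
}
```

For small non-negative operands (which I assume a4 and a5 are here, being the x and y offsets) both give the same result, so switching to mul should not change behaviour while keeping the code assemblable for 32-bit targets.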

-- 
雷米‧德尼-库尔蒙
http://www.remlab.net/





[FFmpeg-devel] [PATCH] lavu/tx: stop using av_log(NULL, )

2023-05-24 Thread Lynne
Patch attached.

>From 2813dcb5b885bdf0c3f78f8aead43f4b11149a70 Mon Sep 17 00:00:00 2001
From: Lynne 
Date: Wed, 24 May 2023 21:57:25 +0200
Subject: [PATCH] lavu/tx: stop using av_log(NULL, )

---
 libavutil/tx.c | 16 +++-
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/libavutil/tx.c b/libavutil/tx.c
index e25abf998f..34fbe3f6c7 100644
--- a/libavutil/tx.c
+++ b/libavutil/tx.c
@@ -29,6 +29,12 @@
  ((x) == AV_TX_DOUBLE_ ## type) || \
  ((x) == AV_TX_INT32_ ## type))
 
+static AVClass tx_class = {
+.class_name= "tx",
+.item_name = av_default_item_name,
+.version   = LIBAVUTIL_VERSION_INT,
+};
+
 /* Calculates the modular multiplicative inverse */
 static av_always_inline int mulinv(int n, int m)
 {
@@ -631,7 +637,7 @@ static void print_cd_info(const FFTXCodelet *cd, int prio, int len, int print_pr
 if (print_prio)
 av_bprintf(&bp, ", prio: %i", prio);
 
-av_log(NULL, AV_LOG_DEBUG, "%s\n", bp.str);
+av_log((void *)&tx_class, AV_LOG_DEBUG, "%s\n", bp.str);
 }
 
 static void print_tx_structure(AVTXContext *s, int depth)
@@ -639,7 +645,7 @@ static void print_tx_structure(AVTXContext *s, int depth)
 const FFTXCodelet *cd = s->cd_self;
 
 for (int i = 0; i <= depth; i++)
-av_log(NULL, AV_LOG_DEBUG, "");
+av_log((void *)&tx_class, AV_LOG_DEBUG, "");
 
 print_cd_info(cd, cd->prio, s->len, 0);
 
@@ -798,10 +804,10 @@ av_cold int ff_tx_init_subtx(AVTXContext *s, enum AVTXType type,
 AV_QSORT(cd_matches, nb_cd_matches, TXCodeletMatch, cmp_matches);
 
 #if !CONFIG_SMALL
-av_log(NULL, AV_LOG_DEBUG, "%s\n", bp.str);
+av_log((void *)&tx_class, AV_LOG_DEBUG, "%s\n", bp.str);
 
 for (int i = 0; i < nb_cd_matches; i++) {
-av_log(NULL, AV_LOG_DEBUG, "%i: ", i + 1);
+av_log((void *)&tx_class, AV_LOG_DEBUG, "%i: ", i + 1);
 print_cd_info(cd_matches[i].cd, cd_matches[i].prio, 0, 1);
 }
 #endif
@@ -909,7 +915,7 @@ av_cold int av_tx_init(AVTXContext **ctx, av_tx_fn *tx, enum AVTXType type,
 *tx  = tmp.fn[0];
 
 #if !CONFIG_SMALL
-av_log(NULL, AV_LOG_DEBUG, "Transform tree:\n");
+av_log((void *)&tx_class, AV_LOG_DEBUG, "Transform tree:\n");
 print_tx_structure(*ctx, 0);
 #endif
 
-- 
2.40.0



Re: [FFmpeg-devel] [PATCH] lavu/tx: stop using av_log(NULL, )

2023-05-24 Thread Paul B Mahol
On 5/24/23, Lynne  wrote:
> Patch attached.
>
>

Probably fine.


Re: [FFmpeg-devel] [PATCH] lavu/tx: stop using av_log(NULL, )

2023-05-24 Thread Leo Izen

On 5/24/23 16:35, Lynne wrote:

Patch attached.



+av_log((void *)&tx_class, AV_LOG_DEBUG, "%s\n", bp.str);

The first argument to av_log should point to a struct whose first member 
is an AVClass * (effectively an AVClass **), but this passes a plain 
AVClass *. See libavutil/log.c line 428.
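
To illustrate the convention, here is a small self-contained mock-up. MiniClass, MiniContext and mini_log_name are hypothetical stand-ins and not libavutil API; the only thing carried over is the assumption that the logger treats its context as a pointer to a struct whose first member is a class pointer.

```c
#include <stddef.h>

/* Hypothetical miniature of the class idea: a class carries a name. */
typedef struct MiniClass {
    const char *class_name;
} MiniClass;

/* A loggable context: its FIRST member must be a pointer to its class. */
typedef struct MiniContext {
    const MiniClass *cls;
    int payload;
} MiniContext;

/* The logger only receives a void *. It reinterprets it as a pointer to
 * a class pointer and dereferences once to reach the class itself. */
static const char *mini_log_name(void *ctx)
{
    const MiniClass *cls = ctx ? *(const MiniClass *const *)ctx : NULL;
    return cls ? cls->class_name : "(none)";
}

static const MiniClass   tx_class_sketch = { "tx" };
static const MiniContext tx_ctx_sketch   = { &tx_class_sketch, 0 };
```

Calling mini_log_name((void *)&tx_ctx_sketch) finds the class through the wrapper. Passing &tx_class_sketch directly would make the logger read class_name as if it were a class pointer, which is the mismatch described above.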


- Leo Izen


Re: [FFmpeg-devel] [PATCH v4 1/7] avcodec/la: add LSX optimization for h264 idct.

2023-05-24 Thread Michael Niedermayer
On Wed, May 24, 2023 at 03:48:27PM +0800, Hao Chen wrote:
> From: Shiyou Yin 
> 
> loongson_asm.S is LoongArch asm optimization helper.
> Add functions:
>   ff_h264_idct_add_8_lsx
>   ff_h264_idct8_add_8_lsx
>   ff_h264_idct_dc_add_8_lsx
>   ff_h264_idct8_dc_add_8_lsx
>   ff_h264_idct_add16_8_lsx
>   ff_h264_idct8_add4_8_lsx
>   ff_h264_idct_add8_8_lsx
>   ff_h264_idct_add8_422_8_lsx
>   ff_h264_idct_add16_intra_8_lsx
>   ff_h264_luma_dc_dequant_idct_8_lsx
> Replaced function(LSX is sufficient for these functions):
>   ff_h264_idct_add_lasx
>   ff_h264_idct4x4_addblk_dc_lasx
>   ff_h264_idct_add16_lasx
>   ff_h264_idct8_add4_lasx
>   ff_h264_idct_add8_lasx
>   ff_h264_idct_add8_422_lasx
>   ff_h264_idct_add16_intra_lasx
>   ff_h264_deq_idct_luma_dc_lasx
> Renamed functions:
>   ff_h264_idct8_addblk_lasx ==> ff_h264_idct8_add_8_lasx
>   ff_h264_idct8_dc_addblk_lasx ==> ff_h264_idct8_dc_add_8_lasx
> 
> ./configure --disable-lasx
> ffmpeg -i 1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -y /dev/null -an
> before: 155fps
> after:  161fps
> ---
>  libavcodec/loongarch/Makefile |   3 +-
>  libavcodec/loongarch/h264_deblock_lasx.c  |   2 +-
>  libavcodec/loongarch/h264dsp_init_loongarch.c |  39 +-
>  libavcodec/loongarch/h264dsp_lasx.c   |   2 +-
>  .../{h264dsp_lasx.h => h264dsp_loongarch.h}   |  60 +-
>  libavcodec/loongarch/h264idct.S   | 658 
>  libavcodec/loongarch/h264idct_lasx.c  | 498 -
>  libavcodec/loongarch/h264idct_loongarch.c | 184 
>  libavcodec/loongarch/loongson_asm.S   | 945 ++
>  9 files changed, 1848 insertions(+), 543 deletions(-)
>  rename libavcodec/loongarch/{h264dsp_lasx.h => h264dsp_loongarch.h} (68%)
>  create mode 100644 libavcodec/loongarch/h264idct.S
>  delete mode 100644 libavcodec/loongarch/h264idct_lasx.c
>  create mode 100644 libavcodec/loongarch/h264idct_loongarch.c
>  create mode 100644 libavcodec/loongarch/loongson_asm.S

Applying: avcodec/la: add LSX optimization for h264 idct.
.git/rebase-apply/patch:1431: tab in indent.
} else if (nnz) {
warning: 1 line adds whitespace errors.

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

If the United States is serious about tackling the national security threats 
related to an insecure 5G network, it needs to rethink the extent to which it
values corporate profits and government espionage over security.-Bruce Schneier




Re: [FFmpeg-devel] [PATCH 55/97] Vulkan patchset part 2 - hwcontext rewrite and filtering

2023-05-24 Thread Lynne
May 22, 2023, 10:26 by d...@lynne.ee:

> Planning on pushing this partially (no encoding) tomorrow unless there are 
> more comments.
> All known issues have been fixed, and if there are more issues, they can be 
> found as users test it.
>

Added APIchanges and bumped minor for lavu and lavc.
Planning to push this in 2 days unless there are more comments.
All known issues have been addressed.


Re: [FFmpeg-devel] [PATCH] lavu/tx: stop using av_log(NULL, )

2023-05-24 Thread Lynne
May 24, 2023, 23:24 by leo.i...@gmail.com:

> On 5/24/23 16:35, Lynne wrote:
>
>> Patch attached.
>>
>
> +av_log((void *)&tx_class, AV_LOG_DEBUG, "%s\n", bp.str);
>
> The type of the first argument to av_log should be AVClass **, but this only 
> appears to be AVClass *. See libavutil/log.c line 428.
>
> - Leo Izen
>

Right, thanks, changed to:

> static const AVClass tx_class = {
>     .class_name = "tx",
>     .item_name  = av_default_item_name,
>     .version    = LIBAVUTIL_VERSION_INT,
> };
>
> static const struct {
>     const AVClass *tx_class;
> } tx_log = {
>     &tx_class,
> };

Will push this tomorrow.


Re: [FFmpeg-devel] [PATCH v3 7/7] avutil/la: Add function performance testing

2023-05-24 Thread Hao Chen


On 2023/5/24 at 7:03 PM, Rémi Denis-Courmont wrote:


On 24 May 2023 10:39:59 GMT+03:00, Hao Chen wrote:

On 2023/5/20 at 5:38 PM, Rémi Denis-Courmont wrote:

On Saturday, 20 May 2023, 10:27:19 EEST, Hao Chen wrote:

From: yuanhecai 

This patch supports the use of the "checkasm --bench" testing feature
on loongarch platform.

Change-Id: I42790388d057c9ade0dfa38a19d9c1fd44ca0bc3
---
   libavutil/loongarch/timer.h | 48 +
   libavutil/timer.h   |  2 ++
   2 files changed, 50 insertions(+)
   create mode 100644 libavutil/loongarch/timer.h

diff --git a/libavutil/loongarch/timer.h b/libavutil/loongarch/timer.h
new file mode 100644
index 00..44ed786409
--- /dev/null
+++ b/libavutil/loongarch/timer.h
@@ -0,0 +1,48 @@
+/*
+ * Copyright (c) 2023 Loongson Technology Corporation Limited
+ * Contributed by Hecai Yuan 
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVUTIL_LOONGARCH_TIMER_H
+#define AVUTIL_LOONGARCH_TIMER_H
+
+#include 
+#include "config.h"
+
+#if HAVE_INLINE_ASM
+
+#define AV_READ_TIME read_time
+
+static inline uint64_t read_time(void)
+{
+
+#if ARCH_LOONGARCH64
+uint64_t a, id = 0;

Initial value is never used.


+__asm__ volatile ( "rdtime.d  %0, %1" : "=r"(a), "=r"(id) :: "memory" );
+return a;
+#else
+uint32_t a, id = 0;
+__asm__ volatile ( "rdtimel.w  %0, %1" : "=r"(a), "=r"(id) :: "memory" );
+return (uint64_t)a;
+#endif

Why do you clobber memory here?


+}
+
+#endif /* HAVE_INLINE_ASM */
+
+#endif /* AVUTIL_LOONGARCH_TIMER_H */
diff --git a/libavutil/timer.h b/libavutil/timer.h
index d3db5a27ef..861ba7e9d7 100644
--- a/libavutil/timer.h
+++ b/libavutil/timer.h
@@ -61,6 +61,8 @@
   #   include "riscv/timer.h"
   #elif ARCH_X86
   #   include "x86/timer.h"
+#elif ARCH_LOONGARCH
+#   include "loongarch/timer.h"
   #endif

   #if !defined(AV_READ_TIME)


Thanks for your advice. As described in the LoongArch instruction manual, 
the rdtime.d instruction is used as follows:
rdtime.d rd, rj. The rj register stores the counter ID. In this application, 
the counter ID is 0.

You're setting a value, zero, to a variable `id`, that is then used as output 
operand. As far as the compiler is concerned, the value zero is never used and 
the initialisation can be elided. The value of register %1 is unspecified.

If you meant for `id` to be an input operand, the constraints are incorrect.
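
To make the distinction concrete, here is a small portable sketch using GNU C extended asm with an empty template. It does not say which kind of operand `id` should be for rdtime; it only shows how an input operand is expressed, so that the compiler is obliged to materialise its value. pass_through_input is a hypothetical name.

```c
/* Requires a compiler with GNU extended asm (GCC or Clang).
 *
 * "=r"(out) alone would be an output-only operand: whatever `out` held
 * before the asm statement is dead as far as the compiler is concerned.
 *
 * The "0" matching constraint makes `in` an INPUT operand placed in the
 * same register as operand 0, so its value must be loaded before the
 * asm runs. With an empty template nothing modifies that register, and
 * `out` therefore reads back the value of `in`. */
static inline unsigned long pass_through_input(unsigned long in)
{
    unsigned long out;
    __asm__ volatile ("" : "=r"(out) : "0"(in));
    return out;
}
```

If `id` were meant to supply a value to the instruction, it would need to appear in the input list (or use a "+r" read-write constraint); as an "=r" output its initialiser has no effect.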



You are right! Thank you very much for your reminder. I will correct it.



Re: [FFmpeg-devel] [PATCH v4 1/7] avcodec/la: add LSX optimization for h264 idct.

2023-05-24 Thread Hao Chen


On 2023/5/25 at 5:28 AM, Michael Niedermayer wrote:

On Wed, May 24, 2023 at 03:48:27PM +0800, Hao Chen wrote:

From: Shiyou Yin 

loongson_asm.S is LoongArch asm optimization helper.
Add functions:
   ff_h264_idct_add_8_lsx
   ff_h264_idct8_add_8_lsx
   ff_h264_idct_dc_add_8_lsx
   ff_h264_idct8_dc_add_8_lsx
   ff_h264_idct_add16_8_lsx
   ff_h264_idct8_add4_8_lsx
   ff_h264_idct_add8_8_lsx
   ff_h264_idct_add8_422_8_lsx
   ff_h264_idct_add16_intra_8_lsx
   ff_h264_luma_dc_dequant_idct_8_lsx
Replaced function(LSX is sufficient for these functions):
   ff_h264_idct_add_lasx
   ff_h264_idct4x4_addblk_dc_lasx
   ff_h264_idct_add16_lasx
   ff_h264_idct8_add4_lasx
   ff_h264_idct_add8_lasx
   ff_h264_idct_add8_422_lasx
   ff_h264_idct_add16_intra_lasx
   ff_h264_deq_idct_luma_dc_lasx
Renamed functions:
   ff_h264_idct8_addblk_lasx ==> ff_h264_idct8_add_8_lasx
   ff_h264_idct8_dc_addblk_lasx ==> ff_h264_idct8_dc_add_8_lasx

./configure --disable-lasx
ffmpeg -i 1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -y /dev/null -an
before: 155fps
after:  161fps
---
  libavcodec/loongarch/Makefile |   3 +-
  libavcodec/loongarch/h264_deblock_lasx.c  |   2 +-
  libavcodec/loongarch/h264dsp_init_loongarch.c |  39 +-
  libavcodec/loongarch/h264dsp_lasx.c   |   2 +-
  .../{h264dsp_lasx.h => h264dsp_loongarch.h}   |  60 +-
  libavcodec/loongarch/h264idct.S   | 658 
  libavcodec/loongarch/h264idct_lasx.c  | 498 -
  libavcodec/loongarch/h264idct_loongarch.c | 184 
  libavcodec/loongarch/loongson_asm.S   | 945 ++
  9 files changed, 1848 insertions(+), 543 deletions(-)
  rename libavcodec/loongarch/{h264dsp_lasx.h => h264dsp_loongarch.h} (68%)
  create mode 100644 libavcodec/loongarch/h264idct.S
  delete mode 100644 libavcodec/loongarch/h264idct_lasx.c
  create mode 100644 libavcodec/loongarch/h264idct_loongarch.c
  create mode 100644 libavcodec/loongarch/loongson_asm.S

Applying: avcodec/la: add LSX optimization for h264 idct.
.git/rebase-apply/patch:1431: tab in indent.
} else if (nnz) {
warning: 1 line adds whitespace errors.

[...]




Thank you for your feedback. My local git does not have the 
core.whitespace option set, so this problem went undetected. I 
will retest all patches and try to avoid similar problems in the 
future.
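
For reference, the condition behind the "tab in indent" warning is simple enough to check by hand. The following is only a rough C sketch of what such a check looks for, assuming it means "a tab anywhere in the leading whitespace of a line"; git's actual implementation may differ, and has_tab_in_indent is a hypothetical name.

```c
#include <stdbool.h>

/* Returns true if the leading whitespace of `line` contains a tab.
 * Tabs that appear after the first non-blank character are ignored. */
static bool has_tab_in_indent(const char *line)
{
    for (; *line == ' ' || *line == '\t'; line++)
        if (*line == '\t')
            return true;
    return false;
}
```

Running each added line of a patch through a check like this before sending would flag the `} else if (nnz) {` line reported above.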




Re: [FFmpeg-devel] [PATCH v4 1/7] avcodec/la: add LSX optimization for h264 idct.

2023-05-24 Thread Shiyou Yin


> On May 25, 2023, at 05:28, Michael Niedermayer wrote:
> 
> On Wed, May 24, 2023 at 03:48:27PM +0800, Hao Chen wrote:
>> From: Shiyou Yin 
>> 
>> loongson_asm.S is LoongArch asm optimization helper.
>> Add functions:
>> ff_h264_idct_add_8_lsx
>> ff_h264_idct8_add_8_lsx
>> ff_h264_idct_dc_add_8_lsx
>> ff_h264_idct8_dc_add_8_lsx
>> ff_h264_idct_add16_8_lsx
>> ff_h264_idct8_add4_8_lsx
>> ff_h264_idct_add8_8_lsx
>> ff_h264_idct_add8_422_8_lsx
>> ff_h264_idct_add16_intra_8_lsx
>> ff_h264_luma_dc_dequant_idct_8_lsx
>> Replaced function(LSX is sufficient for these functions):
>> ff_h264_idct_add_lasx
>> ff_h264_idct4x4_addblk_dc_lasx
>> ff_h264_idct_add16_lasx
>> ff_h264_idct8_add4_lasx
>> ff_h264_idct_add8_lasx
>> ff_h264_idct_add8_422_lasx
>> ff_h264_idct_add16_intra_lasx
>> ff_h264_deq_idct_luma_dc_lasx
>> Renamed functions:
>> ff_h264_idct8_addblk_lasx ==> ff_h264_idct8_add_8_lasx
>> ff_h264_idct8_dc_addblk_lasx ==> ff_h264_idct8_dc_add_8_lasx
>> 
>> ./configure --disable-lasx
>> ffmpeg -i 1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -y /dev/null -an
>> before: 155fps
>> after: 161fps
>> ---
>> libavcodec/loongarch/Makefile | 3 +-
>> libavcodec/loongarch/h264_deblock_lasx.c | 2 +-
>> libavcodec/loongarch/h264dsp_init_loongarch.c | 39 +-
>> libavcodec/loongarch/h264dsp_lasx.c | 2 +-
>> .../{h264dsp_lasx.h => h264dsp_loongarch.h} | 60 +-
>> libavcodec/loongarch/h264idct.S | 658 
>> libavcodec/loongarch/h264idct_lasx.c | 498 -
>> libavcodec/loongarch/h264idct_loongarch.c | 184 
>> libavcodec/loongarch/loongson_asm.S | 945 ++
>> 9 files changed, 1848 insertions(+), 543 deletions(-)
>> rename libavcodec/loongarch/{h264dsp_lasx.h => h264dsp_loongarch.h} (68%)
>> create mode 100644 libavcodec/loongarch/h264idct.S
>> delete mode 100644 libavcodec/loongarch/h264idct_lasx.c
>> create mode 100644 libavcodec/loongarch/h264idct_loongarch.c
>> create mode 100644 libavcodec/loongarch/loongson_asm.S
> 
> Applying: avcodec/la: add LSX optimization for h264 idct.
> .git/rebase-apply/patch:1431: tab in indent.
>   } else if (nnz) {
> warning: 1 line adds whitespace errors.
> 
Thanks, I will set core.whitespace in my gitconfig to avoid these errors.



Re: [FFmpeg-devel] [PATCH v3 7/7] avutil/la: Add function performance testing

2023-05-24 Thread Shiyou Yin


> On May 25, 2023, at 10:36, Hao Chen wrote:
> 
> 
> On 2023/5/24 at 7:03 PM, Rémi Denis-Courmont wrote:
>> 
>> On 24 May 2023 10:39:59 GMT+03:00, Hao Chen wrote:
>>> On 2023/5/20 at 5:38 PM, Rémi Denis-Courmont wrote:
 On Saturday, 20 May 2023, 10:27:19 EEST, Hao Chen wrote:
> From: yuanhecai 
> 
> This patch supports the use of the "checkasm --bench" testing feature
> on loongarch platform.
> 
> Change-Id: I42790388d057c9ade0dfa38a19d9c1fd44ca0bc3
> ---
>   libavutil/loongarch/timer.h | 48 +
>   libavutil/timer.h   |  2 ++
>   2 files changed, 50 insertions(+)
>   create mode 100644 libavutil/loongarch/timer.h
> 
> diff --git a/libavutil/loongarch/timer.h b/libavutil/loongarch/timer.h
> new file mode 100644
> index 00..44ed786409
> --- /dev/null
> +++ b/libavutil/loongarch/timer.h
> @@ -0,0 +1,48 @@
> +/*
> + * Copyright (c) 2023 Loongson Technology Corporation Limited
> + * Contributed by Hecai Yuan 
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +#ifndef AVUTIL_LOONGARCH_TIMER_H
> +#define AVUTIL_LOONGARCH_TIMER_H
> +
> +#include 
> +#include "config.h"
> +
> +#if HAVE_INLINE_ASM
> +
> +#define AV_READ_TIME read_time
> +
> +static inline uint64_t read_time(void)
> +{
> +
> +#if ARCH_LOONGARCH64
> +uint64_t a, id = 0;
 Initial value is never used.
 
> +__asm__ volatile ( "rdtime.d  %0, %1" : "=r"(a), "=r"(id) :: "memory" );
> +return a;
> +#else
> +uint32_t a, id = 0;
> +__asm__ volatile ( "rdtimel.w  %0, %1" : "=r"(a), "=r"(id) :: "memory" );
> +return (uint64_t)a;
> +#endif
 Why do you clobber memory here?
 
> +}
> +
> +#endif /* HAVE_INLINE_ASM */
> +
> +#endif /* AVUTIL_LOONGARCH_TIMER_H */
> diff --git a/libavutil/timer.h b/libavutil/timer.h
> index d3db5a27ef..861ba7e9d7 100644
> --- a/libavutil/timer.h
> +++ b/libavutil/timer.h
> @@ -61,6 +61,8 @@
>   #   include "riscv/timer.h"
>   #elif ARCH_X86
>   #   include "x86/timer.h"
> +#elif ARCH_LOONGARCH
> +#   include "loongarch/timer.h"
>   #endif
> 
>   #if !defined(AV_READ_TIME)
> 
 Thanks for your advice. As described in the LoongArch instruction 
 manual, the rdtime.d instruction is used as follows:
 rdtime.d rd, rj. The rj register stores the counter ID. In this 
 application, the counter ID is 0.
>> You're setting a value, zero, to a variable `id`, that is then used as 
>> output operand. As far as the compiler is concerned, the value zero is never 
>> used and the initialisation can be elided. The value of register %1 is 
>> unspecified.
>> 
>> If you meant for `id` to be an input operand, the constraints are incorrect.
>> 
> 
> 
> You are right! Thank you very much for your reminder. I will correct it.
> 

id is an output operand, so the constraints are correct; the initialization 
of id is simply not necessary.
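
Following that conclusion, the 64-bit path could be reduced to something like the sketch below. Several things here are my assumptions rather than part of the patch: the function is renamed read_time_sketch, the guard uses the compiler's __loongarch__ and __loongarch_grlen macros instead of FFmpeg's ARCH_LOONGARCH64, the "memory" clobber questioned earlier in the thread is left out, and a clock() fallback is added only so the example builds on other machines.

```c
#include <stdint.h>
#include <time.h>

static inline uint64_t read_time_sketch(void)
{
#if defined(__loongarch__) && defined(__loongarch_grlen) && __loongarch_grlen == 64
    /* Both registers are written by the instruction, so both operands
     * are outputs and neither variable needs an initial value. */
    uint64_t a, id;
    __asm__ volatile ("rdtime.d  %0, %1" : "=r"(a), "=r"(id));
    (void)id;   /* the counter ID is not needed by the caller */
    return a;
#else
    /* Fallback for other architectures; not part of the patch. */
    return (uint64_t)clock();
#endif
}
```

Whether the clobber can really be dropped was not settled in the thread, so treat that part as an open question.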
