On 4/17/2025 2:07 PM, Andreas Rheinhardt wrote:
gkha...@spectre-music.com:
From: Guillaume Khayat <gkha...@spectre-music.com>

Improve performance (+17%) of the ebur128 filter using AVX2 and FMA instructions in 
the body of the filter_frame function.

## Benchmark

Tested with hyperfine

hyperfine --warmup 2 "./ffmpeg_reference -i ~/test.wav -vn -af ebur128=peak=none:framelog=quiet -f null -" "./ffmpeg_avx -i ~/test.wav -vn -af ebur128=peak=none:framelog=quiet -f null -"
Benchmark 1: ./ffmpeg_reference -i ~/test.wav -vn -af ebur128=peak=none:framelog=quiet -f null -
   Time (mean ± σ):      7.118 s ±  0.037 s    [User: 9.114 s, System: 1.038 s]
   Range (min … max):    7.073 s …  7.177 s    10 runs
Benchmark 2: ./ffmpeg_avx -i ~/test.wav -vn -af ebur128=peak=none:framelog=quiet -f null -
   Time (mean ± σ):      6.073 s ±  0.108 s    [User: 7.903 s, System: 1.058 s]
   Range (min … max):    5.955 s …  6.327 s    10 runs
Summary
   ./ffmpeg_avx -i ~/test.wav -vn -af ebur128=peak=none:framelog=quiet -f null - ran
     1.17 ± 0.02 times faster than ./ffmpeg_reference -i ~/test.wav -vn -af ebur128=peak=none:framelog=quiet -f null -

## Tests

- All FATE tests pass, tested on Darwin/arm64 and Linux/x86_64 w/ AVX2/FMA 
support
- On an AVX2/FMA-capable system, all test files from the EBU yield the exact same 
output values (I/LRA) before and after optimization. See 
https://tech.ebu.ch/publications/ebu_loudness_test_set

Disclaimer: this is my first ever patch submission to FFmpeg, and first ever 
time using git send-email to submit a patch anywhere.

Signed-off-by: Cesar Matheus <cesar.math...@telecom-paris.fr>
Signed-off-by: Guillaume Khayat <gkha...@spectre-music.com>
---
  libavfilter/f_ebur128.c | 246 ++++++++++++++++++++++++++++++++++------
  1 file changed, 214 insertions(+), 32 deletions(-)

diff --git a/libavfilter/f_ebur128.c b/libavfilter/f_ebur128.c
index 768f062bac..e305b0a3ce 100644
--- a/libavfilter/f_ebur128.c
+++ b/libavfilter/f_ebur128.c
@@ -28,7 +28,7 @@
  #include <float.h>
  #include <math.h>
-
+#include "libavutil/intmath.h"
  #include "libavutil/avassert.h"
  #include "libavutil/channel_layout.h"
  #include "libavutil/dict.h"
@@ -199,7 +199,7 @@ static const AVOption ebur128_options[] = {
  };
  AVFILTER_DEFINE_CLASS(ebur128);
-
+#define MIN(a, b) ((a) < (b) ? (a) : (b))
  static const uint8_t graph_colors[] = {
      0xdd, 0x66, 0x66,   // value above 1LU non reached below -1LU (impossible)
      0x66, 0x66, 0xdd,   // value below 1LU non reached below -1LU
@@ -628,13 +628,61 @@ static int gate_update(struct integrator *integ, double power,
  static int filter_frame(AVFilterLink *inlink, AVFrame *insamples)
  {
-    int i, ch, idx_insample, ret;
+
+    int i, ch, idx_insample, ret,bin_id_400,bin_id_3000;
      AVFilterContext *ctx = inlink->dst;
      EBUR128Context *ebur128 = ctx->priv;
      const int nb_channels = ebur128->nb_channels;
      const int nb_samples  = insamples->nb_samples;
      const double *samples = (double *)insamples->data[0];
      AVFrame *pic;
+
+#if HAVE_AVX2_EXTERNAL && HAVE_AVX2

This is completely wrong: This only checks whether your assembler
supports AVX2 and whether it was not disabled in configure. But this
does not imply that the CPU where this code runs is actually capable of
AVX2; I don't even know whether this check ensures that the compiler
understands __m256d.
For actual runtime support you need to check via av_get_cpu_flags(). See
how other DSP code does it.

- Andreas

This also needs to be written in NASM syntax assembly, not Intel intrinsics, and it should be in a separate file in the x86/ folder, using function pointers like every other SIMD implementation.
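
To illustrate the structure being asked for, here is a minimal, self-contained sketch of runtime dispatch through a function pointer. It is not the actual FFmpeg code: DemoEBUR128DSP, demo_ebur128_dsp_init(), the sum_squares kernel and the DEMO_CPU_FLAG_* bits are all hypothetical names made up for this example. In FFmpeg itself the flags would come from av_get_cpu_flags() and be tested against AV_CPU_FLAG_AVX2 / AV_CPU_FLAG_FMA3 (typically via the EXTERNAL_*() macros in libavutil/x86/cpu.h), and the fast path would be a NASM-syntax function in a file under libavfilter/x86/ rather than the plain C placeholder used here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for AV_CPU_FLAG_AVX2 / AV_CPU_FLAG_FMA3.
 * The bit values are arbitrary and do not match libavutil. */
#define DEMO_CPU_FLAG_AVX2 (1 << 0)
#define DEMO_CPU_FLAG_FMA3 (1 << 1)

/* Hypothetical DSP context: one function pointer per optimizable kernel. */
typedef struct DemoEBUR128DSP {
    double (*sum_squares)(const double *samples, size_t n);
} DemoEBUR128DSP;

/* Portable C reference version; always available. */
static double sum_squares_c(const double *samples, size_t n)
{
    double acc = 0.0;
    for (size_t i = 0; i < n; i++)
        acc += samples[i] * samples[i];
    return acc;
}

/* Placeholder for the SIMD version. In a real patch this would be an
 * external assembly function; here it is plain C with four independent
 * accumulators so that the sketch builds on any platform. */
static double sum_squares_simd_placeholder(const double *samples, size_t n)
{
    double acc[4] = { 0.0, 0.0, 0.0, 0.0 };
    size_t i = 0;

    for (; i + 4 <= n; i += 4)
        for (int k = 0; k < 4; k++)
            acc[k] += samples[i + k] * samples[i + k];

    double total = (acc[0] + acc[1]) + (acc[2] + acc[3]);
    for (; i < n; i++)              /* scalar tail */
        total += samples[i] * samples[i];
    return total;
}

/* Select the implementation once, at init time, from flags that describe
 * the CPU the code is actually running on (not the build configuration). */
static void demo_ebur128_dsp_init(DemoEBUR128DSP *dsp, int cpu_flags)
{
    assert(dsp != NULL);
    dsp->sum_squares = sum_squares_c;
    if ((cpu_flags & DEMO_CPU_FLAG_AVX2) && (cpu_flags & DEMO_CPU_FLAG_FMA3))
        dsp->sum_squares = sum_squares_simd_placeholder;
}
```

With this layout the filter would call the init function once during setup and then only ever call dsp->sum_squares() from filter_frame(), so no #if on HAVE_AVX2 is needed in the hot path and the C version remains the fallback on CPUs without AVX2/FMA.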
