Re: [FFmpeg-devel] [PATCH 00/17] swscale v2: new framework [RFC]

2025-05-02 Thread Niklas Haas
On Sat, 26 Apr 2025 19:41:04 +0200 Niklas Haas  wrote:
> Hi all,
>
> After extensive amounts of refactoring and iteration on the design and API,
> and the implementation of an x86 SIMD backend, I'm happy to present the
> revised version of my ongoing swscale rewrite. Now with 100% less reliance on
> compiler autovectorization.
>
> As before, I recommend (re)reading the design document to understand the
> motivation, structure and implementation details of this rewrite. At this
> point, I expect the major API and internal organization decisions to remain
> stable.
>
> I will preface with some benchmark figures, on my (new) AMD Ryzen 9 9950X3D:
>
> All formats:
>   - single thread: Overall speedup=2.109x faster, min=0.018x max=40.309x
>   - multi thread:  Overall speedup=2.607x faster, min=0.112x max=254.738x
>
> "Common" formats: (referenced >100 times in FFmpeg source code)
>   - single thread: Overall speedup=2.797x faster, min=0.408x max=16.514x
>   - multi thread:  Overall speedup=2.870x faster, min=0.715x max=21.983x

Small update: I noticed that one code path was accidentally not enabled. I
also implemented asm for the remaining bit-packed formats. After those two
changes, the new numbers are:

All formats:
  - single thread: Overall speedup=4.247x faster, min=0.177x max=224.809x
  - multi thread:  Overall speedup=4.000x faster, min=0.256x max=968.725x

"Common" formats:
  - single thread: Overall speedup=3.174x faster, min=0.596x max=12.616x
  - multi thread:  Overall speedup=3.005x faster, min=0.617x max=14.739x

>
> However, the main goal of this rewrite is not to improve performance, but to
> improve the maintainability, extensibility and correctness of the code. Most of
> the slowdowns for "common" formats are due to increased correctness (e.g.
> accurate rounding and dithering), and not the result of a regression per se.
>
> All of the remaining slowdowns (notably, the 0.1x cases) are due to incomplete
> coverage of the x86 SIMD. Notably, this currently affects bit packed formats
> (e.g. rgb8, rgb4). (I also did not yet incorporate any AVX-512 code, which
> some of the existing routines take advantage of)
>
> While I will continue working on this and expanding coverage to all remaining
> operations, I felt that now is a good point in time to get some code review
> and feedback regardless. I would especially appreciate code review of the x86
> SIMD code inside libswscale/x86/ops_*.asm, as this is my first time writing
> x86 assembly code.
>
>  doc/APIchanges                |   3 +
>  doc/scaler.texi               |   3 +
>  doc/swscale-v2.txt            | 344 +++
>  libswscale/Makefile           |   9 +
>  libswscale/format.c           | 945
>  libswscale/format.h           |  29 ++-
>  libswscale/graph.c            | 151
>  libswscale/graph.h            |  37 ++-
>  libswscale/ops.c              | 850
>  libswscale/ops.h              | 263 +
>  libswscale/ops_backend.c      | 101
>  libswscale/ops_backend.h      | 181 ++
>  libswscale/ops_chain.c        | 291 +++
>  libswscale/ops_chain.h        | 108 +
>  libswscale/ops_internal.h     | 103
>  libswscale/ops_optimizer.c    | 810
>  libswscale/ops_tmpl_common.c  | 176 ++
>  libswscale/ops_tmpl_float.c   | 255
>  libswscale/ops_tmpl_int.c     | 609
>  libswscale/options.c          |   1 +
>  libswscale/swscale.h          |   7 +
>  libswscale/tests/swscale.c    |  11 +-
>  libswscale/version.h          |   2 +-
>  libswscale/x86/Makefile       |   3 +
>  libswscale/x86/ops.c          | 735
>  libswscale/x86/ops_common.asm | 208
>  libswscale/x86/ops_float.asm  | 376 +
>  libswscale/x86/ops_int.asm    | 882
>  tests/checkasm/Makefile       |   8 +-
>  tests/checkasm/checkasm.c     |   4 +-
>  tests/checkasm/checkasm.h     |  26 +-
>  tests/checkasm/sw_ops.c       | 748
>  32 files changed, 8206 insertions(+), 73 deletions(-)
>
> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH] lavc: Add unit test for APV entropy decode

2025-05-02 Thread Michael Niedermayer
On Wed, Apr 30, 2025 at 10:09:11PM +0100, Mark Thompson wrote:
> ---
> Cleaned up a bit from use earlier.  This program allows testing of changes to 
> entropy decode in a form much easier to debug than a whole stream.
> 
> I had a more complete randomised test of transform here as well along with 
> test code for checking intermediates, though it seems like the checkasm only 
> might be preferred.  Can add it if useful.
> 
> Thanks,
> 
> - Mark
> 
>  libavcodec/Makefile   |   1 +
>  libavcodec/tests/apv.c| 306 ++
>  tests/fate/libavcodec.mak |   5 +
>  3 files changed, 312 insertions(+)
>  create mode 100644 libavcodec/tests/apv.c

breaks make -j32 fate-apv

TEST    apv-422-10
--- /dev/null   2025-01-24 23:36:38.397802529 +0100
+++ tests/data/fate/apv-422-10  2025-05-02 19:49:42.277166925 +0200
@@ -0,0 +1,8 @@
+#tb 0: 1/30
+#media_type 0: video
+#codec_id 0: rawvideo
+#dimensions 0: 320x180
+#sar 0: 1/1
+0,  0,  0,1,   230400, 0x07f1e56d
+0,  1,  1,1,   230400, 0x0bd1c913
+0,  2,  2,1,   230400, 0xefd02824
Test apv-422-10 failed. Look at tests/data/fate/apv-422-10.err for details.
make: *** [tests/Makefile:317: fate-apv-422-10] Error 1

Note "make fate-apv-422-10" works

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

It is what and why we do it that matters, not just one of them.




Re: [FFmpeg-devel] [PATCH] lavc: Add unit test for APV entropy decode

2025-05-02 Thread James Almer

On 4/30/2025 6:09 PM, Mark Thompson wrote:

diff --git a/tests/fate/libavcodec.mak b/tests/fate/libavcodec.mak
index ef6e6ec40e..5f7ec97ff3 100644
--- a/tests/fate/libavcodec.mak
+++ b/tests/fate/libavcodec.mak
@@ -3,6 +3,11 @@ fate-av1-levels: libavcodec/tests/av1_levels$(EXESUF)
  fate-av1-levels: CMD = run libavcodec/tests/av1_levels$(EXESUF)
  fate-av1-levels: REF = /dev/null
  
+FATE_LIBAVCODEC-$(CONFIG_APV_DECODER) += fate-apv

+fate-apv: libavcodec/tests/apv$(EXESUF)
+fate-apv: CMD = run libavcodec/tests/apv$(EXESUF)
+fate-apv: REF = /dev/null


fate-apv already exists in tests/fate/apv.mak, as a target to run all 
apv decoding tests (only one so far).


Maybe just rename it to fate-apv-entropy.





[FFmpeg-devel] [PATCH 1/2] avcodec/hevc/hevcdec: ensure a bit was read when checking for alignment_bit_equal_to_one

2025-05-02 Thread James Almer
Prevents printing bogus errors about the value being 0, when in fact we
overread the available slice buffer.

Signed-off-by: James Almer 
---
 libavcodec/hevc/hevcdec.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavcodec/hevc/hevcdec.c b/libavcodec/hevc/hevcdec.c
index 83acf3511f..df186d6194 100644
--- a/libavcodec/hevc/hevcdec.c
+++ b/libavcodec/hevc/hevcdec.c
@@ -1154,7 +1154,7 @@ static int hls_slice_header(SliceHeader *sh, const HEVCContext *s, GetBitContext
     }
 
     ret = get_bits1(gb);
-    if (!ret) {
+    if (!ret && get_bits_left(gb) >= 0) {
         av_log(s->avctx, AV_LOG_ERROR, "alignment_bit_equal_to_one=0\n");
         return AVERROR_INVALIDDATA;
     }
-- 
2.49.0
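
To illustrate the reasoning behind the extra condition, here is a minimal sketch. It assumes a bit reader that returns 0 once it runs past the end of the buffer and whose remaining-bit count then goes negative. BitReader, br_get_bit(), br_bits_left() and check_alignment_bit() are hypothetical names made up for this example; they are not FFmpeg's GetBitContext API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical minimal bit reader; NOT FFmpeg's GetBitContext. */
typedef struct BitReader {
    const uint8_t *buf;
    int size_bits;   /* number of valid bits in buf */
    int pos;         /* current read position in bits */
} BitReader;

/* Goes negative once the reader has consumed more bits than available. */
static int br_bits_left(const BitReader *br)
{
    return br->size_bits - br->pos;
}

/* Reads one bit, MSB first. Past the end it returns 0 but still advances,
 * so a 0 result alone cannot distinguish "bit was 0" from "no bit left". */
static int br_get_bit(BitReader *br)
{
    int bit = 0;
    assert(br && br->buf);
    if (br->pos < br->size_bits)
        bit = (br->buf[br->pos >> 3] >> (7 - (br->pos & 7))) & 1;
    br->pos++;
    return bit;
}

enum { ALIGN_OK, ALIGN_BAD_BIT, ALIGN_OVERREAD };

/* Reports a bad alignment bit only if a bit was actually read. */
static int check_alignment_bit(BitReader *br)
{
    int bit = br_get_bit(br);
    if (!bit && br_bits_left(br) >= 0)
        return ALIGN_BAD_BIT;    /* a real 0 bit was read */
    if (br_bits_left(br) < 0)
        return ALIGN_OVERREAD;   /* nothing was read; report the overread */
    return ALIGN_OK;
}
```

With a single byte 0x80, reading at position 0 gives ALIGN_OK, at position 1 gives ALIGN_BAD_BIT, and at position 8 (past the end) gives ALIGN_OVERREAD rather than a misleading "bit was 0" error.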



[FFmpeg-devel] [PATCH 2/2] avcodec/hevc/hevcdec: move the slice header buffer overread check up in the function

2025-05-02 Thread James Almer
Abort as soon as we're done reading the slice header instead of running extra 
checks
that assume slice data may follow.

Signed-off-by: James Almer 
---
 libavcodec/hevc/hevcdec.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/libavcodec/hevc/hevcdec.c b/libavcodec/hevc/hevcdec.c
index df186d6194..a7a91769fe 100644
--- a/libavcodec/hevc/hevcdec.c
+++ b/libavcodec/hevc/hevcdec.c
@@ -1160,6 +1160,12 @@ static int hls_slice_header(SliceHeader *sh, const HEVCContext *s, GetBitContext
     }
     sh->data_offset = align_get_bits(gb) - gb->buffer;
 
+    if (get_bits_left(gb) < 0) {
+        av_log(s->avctx, AV_LOG_ERROR,
+               "Overread slice header by %d bits\n", -get_bits_left(gb));
+        return AVERROR_INVALIDDATA;
+    }
+
     // Inferred parameters
     sh->slice_qp = 26U + pps->pic_init_qp_minus26 + sh->slice_qp_delta;
     if (sh->slice_qp > 51 ||
@@ -1180,12 +1186,6 @@ static int hls_slice_header(SliceHeader *sh, const HEVCContext *s, GetBitContext
         return AVERROR_INVALIDDATA;
     }
 
-    if (get_bits_left(gb) < 0) {
-        av_log(s->avctx, AV_LOG_ERROR,
-               "Overread slice header by %d bits\n", -get_bits_left(gb));
-        return AVERROR_INVALIDDATA;
-    }
-
     return 0;
 }
 
-- 
2.49.0



Re: [FFmpeg-devel] [PATCH 5/5] checkasm: add vvc_sao

2025-05-02 Thread Nuo Mi
On Fri, May 2, 2025 at 3:49 PM Martin Storsjö  wrote:

> On Fri, 2 May 2025, Nuo Mi wrote:
>
> > From: Shaun Loo 
> >
> > This is a part of Google Summer of Code 2023
> >
> > AVX2:
> > - vvc_sao.sao_band [OK]
> > - vvc_sao.sao_edge [OK]
> >
> > Co-authored-by: Nuo Mi 
> > ---
> > tests/checkasm/Makefile   |   2 +-
> > tests/checkasm/checkasm.c |   1 +
> > tests/checkasm/checkasm.h |   1 +
> > tests/checkasm/vvc_sao.c  | 161 ++
> > 4 files changed, 164 insertions(+), 1 deletion(-)
> > create mode 100644 tests/checkasm/vvc_sao.c
> >
> > diff --git a/tests/checkasm/Makefile b/tests/checkasm/Makefile
> > index 193c1e4633..fabbf595b4 100644
> > --- a/tests/checkasm/Makefile
> > +++ b/tests/checkasm/Makefile
> > @@ -47,7 +47,7 @@ AVCODECOBJS-$(CONFIG_V210_DECODER)  += v210dec.o
> > AVCODECOBJS-$(CONFIG_V210_ENCODER)  += v210enc.o
> > AVCODECOBJS-$(CONFIG_VORBIS_DECODER)+= vorbisdsp.o
> > AVCODECOBJS-$(CONFIG_VP9_DECODER)   += vp9dsp.o
> > -AVCODECOBJS-$(CONFIG_VVC_DECODER)   += vvc_alf.o vvc_mc.o
> > +AVCODECOBJS-$(CONFIG_VVC_DECODER)   += vvc_alf.o vvc_mc.o vvc_sao.o
> >
> > CHECKASMOBJS-$(CONFIG_AVCODEC)  += $(AVCODECOBJS-yes)
> >
> > diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
> > index 3bb82ed0e5..0734cd26bf 100644
> > --- a/tests/checkasm/checkasm.c
> > +++ b/tests/checkasm/checkasm.c
> > @@ -256,6 +256,7 @@ static const struct {
> > #if CONFIG_VVC_DECODER
> > { "vvc_alf", checkasm_check_vvc_alf },
> > { "vvc_mc",  checkasm_check_vvc_mc  },
> > +{ "vvc_sao", checkasm_check_vvc_sao },
> > #endif
> > #endif
> > #if CONFIG_AVFILTER
> > diff --git a/tests/checkasm/checkasm.h b/tests/checkasm/checkasm.h
> > index a6b5965e02..146bfdec35 100644
> > --- a/tests/checkasm/checkasm.h
> > +++ b/tests/checkasm/checkasm.h
> > @@ -149,6 +149,7 @@ void checkasm_check_videodsp(void);
> > void checkasm_check_vorbisdsp(void);
> > void checkasm_check_vvc_alf(void);
> > void checkasm_check_vvc_mc(void);
> > +void checkasm_check_vvc_sao(void);
> >
> > struct CheckasmPerf;
> >
> > diff --git a/tests/checkasm/vvc_sao.c b/tests/checkasm/vvc_sao.c
> > new file mode 100644
> > index 00..026078ff02
> > --- /dev/null
> > +++ b/tests/checkasm/vvc_sao.c
> > @@ -0,0 +1,161 @@
> > +/*
> > + * Copyright (c) 2018 Yingming Fan 
> > + *
> > + * This file is part of FFmpeg.
> > + *
> > + * FFmpeg is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License as published by
> > + * the Free Software Foundation; either version 2 of the License, or
> > + * (at your option) any later version.
> > + *
> > + * FFmpeg is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > + * GNU General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU General Public License
> along
> > + * with FFmpeg; if not, write to the Free Software Foundation, Inc.,
> > + * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
> > + */
> > +
> > +#include 
> > +
> > +#include "libavutil/intreadwrite.h"
> > +#include "libavutil/mem_internal.h"
> > +
> > +#include "libavcodec/vvc/dsp.h"
> > +#include "libavcodec/vvc/ctu.h"
> > +
> > +#include "checkasm.h"
> > +
> > +static const uint32_t pixel_mask[3] = { 0x, 0x03ff03ff,
> 0x0fff0fff };
> > +static const uint32_t sao_size[] = {8, 16, 32, 48, 64, 80, 96, 112,
> 128};
> > +
> > +#define SIZEOF_PIXEL ((bit_depth + 7) / 8)
> > +#define PIXEL_STRIDE (2*MAX_PB_SIZE + AV_INPUT_BUFFER_PADDING_SIZE)
> //same with sao_edge src_stride
> > +#define BUF_SIZE (PIXEL_STRIDE * (MAX_PB_SIZE+2) * 2) //+2 for top and
> bottom row, *2 for high bit depth
> > +#define OFFSET_THRESH (1 << (bit_depth - 5))
> > +#define OFFSET_LENGTH 5
> > +
> > +#define randomize_buffers(buf0, buf1, size) \
> > +do {\
> > +uint32_t mask = pixel_mask[(bit_depth - 8) >> 1];   \
> > +int k;  \
> > +for (k = 0; k < size; k += 4) { \
> > +uint32_t r = rnd() & mask;  \
> > +AV_WN32A(buf0 + k, r);  \
> > +AV_WN32A(buf1 + k, r);  \
> > +}   \
> > +} while (0)
> > +
> > +#define randomize_buffers2(buf, size)   \
> > +do {\
> > +uint32_t max_offset = OFFSET_THRESH;\
> > +int k;  \
> > +if (bit_depth == 8) {   \
> > +for (k = 0; k < size; k++) {\
> > +uint

Re: [FFmpeg-devel] [PATCH 16/17] swscale/format: add new format decode/encode logic

2025-05-02 Thread Michael Niedermayer
On Sat, Apr 26, 2025 at 07:41:20PM +0200, Niklas Haas wrote:
> From: Niklas Haas 
> 
> This patch adds format handling code for the new operations. This entails
> fully decoding a format to standardized RGB, and the inverse.
> 
> Handling it this way means we can always guarantee that a conversion path
> exists from A to B without having to explicitly cover logic for each path;
> and choosing RGB instead of YUV as the intermediate (as was done in swscale
> v1) is more flexible with regards to enabling further operations such as
> primaries conversions, linear scaling, etc.
> 
> In the case of YUV->YUV transform, the redundant matrix multiplication will
> be canceled out anyways.
> ---
>  libswscale/format.c | 925 
>  libswscale/format.h |  23 ++
>  2 files changed, 948 insertions(+)
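
As a side note on the YUV->YUV remark in the commit message: the cancellation can be pictured as a decode matrix followed by its inverse composing to the identity, at which point the combined step can be dropped. The sketch below is only an illustration under that assumption. Mat3, mat3_mul(), mat3_is_identity() and the toy matrices are made up for the example and are not taken from the patch; how the optimizer in this series actually detects the case is not shown here.

```c
#include <assert.h>

/* Hypothetical 3x3 matrix type, for illustration only. */
typedef struct Mat3 { double m[3][3]; } Mat3;

/* Returns a * b, i.e. the matrix that applies b first and then a. */
static Mat3 mat3_mul(const Mat3 *a, const Mat3 *b)
{
    Mat3 r;
    assert(a && b);
    for (int i = 0; i < 3; i++) {
        for (int j = 0; j < 3; j++) {
            double sum = 0.0;
            for (int k = 0; k < 3; k++)
                sum += a->m[i][k] * b->m[k][j];
            r.m[i][j] = sum;
        }
    }
    return r;
}

/* Returns 1 if every element is within eps of the identity matrix. */
static int mat3_is_identity(const Mat3 *a, double eps)
{
    for (int i = 0; i < 3; i++) {
        for (int j = 0; j < 3; j++) {
            double d = a->m[i][j] - (i == j ? 1.0 : 0.0);
            if (d < -eps || d > eps)
                return 0;
        }
    }
    return 1;
}

/* Toy "decode" matrix and its exact inverse; NOT real YUV/RGB coefficients. */
static const Mat3 toy_decode = {{ { 2.0, 0.0, 0.0 },
                                  { 0.0, 4.0, 0.0 },
                                  { 1.0, 0.0, 1.0 } }};
static const Mat3 toy_encode = {{ {  0.5, 0.0,  0.0 },
                                  {  0.0, 0.25, 0.0 },
                                  { -0.5, 0.0,  1.0 } }};
```

Multiplying toy_encode by toy_decode yields the identity, so a pass that merges adjacent linear operations and then checks mat3_is_identity() could skip the multiplication entirely.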

this or rather the equivalent from your repo breaks here:

In file included from libswscale/ops.h:24,
 from libswscale/ops_internal.h:26,
 from libswscale/format.c:28:
libswscale/format.c: In function ‘trc_is_hdr’:
libswscale/format.c:1249:9: error: a label can only be part of a statement and a declaration is not a statement
 1249 | static_assert(AVCOL_TRC_NB == 19, "Update this list when adding TRCs");
  | ^
make: *** [ffbuild/common.mak:81: libswscale/format.o] Error 1
make: *** Waiting for unfinished jobs


[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Some people wanted to paint the bikeshed green, some blue and some pink.
People argued and fought, when they finally agreed, only rust was left.




Re: [FFmpeg-devel] [PATCH 16/17] swscale/format: add new format decode/encode logic

2025-05-02 Thread Niklas Haas
On Fri, 02 May 2025 16:10:57 +0200 Michael Niedermayer wrote:
> On Sat, Apr 26, 2025 at 07:41:20PM +0200, Niklas Haas wrote:
> > From: Niklas Haas 
> >
> > This patch adds format handling code for the new operations. This entails
> > fully decoding a format to standardized RGB, and the inverse.
> >
> > Handling it this way means we can always guarantee that a conversion path
> > exists from A to B without having to explicitly cover logic for each path;
> > and choosing RGB instead of YUV as the intermediate (as was done in swscale
> > v1) is more flexible with regards to enabling further operations such as
> > primaries conversions, linear scaling, etc.
> >
> > In the case of YUV->YUV transform, the redundant matrix multiplication will
> > be canceled out anyways.
> > ---
> >  libswscale/format.c | 925 
> >  libswscale/format.h |  23 ++
> >  2 files changed, 948 insertions(+)
>
> this or rather the equivalent from your repo breaks here:
>
> In file included from libswscale/ops.h:24,
>  from libswscale/ops_internal.h:26,
>  from libswscale/format.c:28:
> libswscale/format.c: In function ‘trc_is_hdr’:
> libswscale/format.c:1249:9: error: a label can only be part of a statement and a declaration is not a statement
>  1249 | static_assert(AVCOL_TRC_NB == 19, "Update this list when adding TRCs");
>   | ^
> make: *** [ffbuild/common.mak:81: libswscale/format.o] Error 1
> make: *** Waiting for unfinished jobs

Fixed (by moving the static_assert out of the switch/case).
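
For reference, a minimal sketch of the pattern. Before C23 a case label must be followed by a statement, and static_assert is a declaration, so placing it directly after a label is rejected by stricter compilers. The enum and its values below are made up for illustration and are not the real AVCOL_TRC_* list.

```c
#include <assert.h>   /* provides the static_assert macro in C11 */

/* Hypothetical enum standing in for the real TRC list. */
enum trc { TRC_SDR, TRC_PQ, TRC_HLG, TRC_NB };

static int trc_is_hdr(enum trc trc)
{
    /* Placed before the switch instead of directly after a case label,
     * where a declaration is not allowed in C before C23. */
    static_assert(TRC_NB == 3, "Update this list when adding TRCs");

    switch (trc) {
    case TRC_SDR:
        return 0;
    case TRC_PQ:
    case TRC_HLG:
        return 1;
    case TRC_NB:
        break;
    }
    return 0;
}
```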

>
>
> [...]
> --
> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> Some people wanted to paint the bikeshed green, some blue and some pink.
> People argued and fought, when they finally agreed, only rust was left.


Re: [FFmpeg-devel] [PATCH 10/17] swscale/ops_backend: add reference backend based on C templates

2025-05-02 Thread Michael Niedermayer
On Sat, Apr 26, 2025 at 07:41:14PM +0200, Niklas Haas wrote:
> From: Niklas Haas 
> 
> This will serve as a reference for the SIMD backends to come. That said,
> with auto-vectorization enabled, the performance of this is not atrocious, and
> can often beat even the old SIMD.
> 
> In theory, we can dramatically speed it up by using GCC vectors instead of
> arrays, but the performance gains from this are too dependent on exact GCC
> versions and flags, so in practice it's not a substitute for a SIMD
> implementation.
> ---
>  libswscale/Makefile  |   6 +
>  libswscale/ops.c |   3 +
>  libswscale/ops.h |   2 -
>  libswscale/ops_backend.c | 101 ++
>  libswscale/ops_backend.h | 181 +++
>  libswscale/ops_tmpl_common.c | 176 ++
>  libswscale/ops_tmpl_float.c  | 255 +++
>  libswscale/ops_tmpl_int.c| 609 +++
>  8 files changed, 1331 insertions(+), 2 deletions(-)
>  create mode 100644 libswscale/ops_backend.c
>  create mode 100644 libswscale/ops_backend.h
>  create mode 100644 libswscale/ops_tmpl_common.c
>  create mode 100644 libswscale/ops_tmpl_float.c
>  create mode 100644 libswscale/ops_tmpl_int.c

This breaks the build on ARM:

CC  libswscale/ops_backend.o
In file included from src/libswscale/ops_backend.c:21:0:
src/libswscale/ops_tmpl_int.c:492:12: error: initializer element is not constant
 fn(op_read_planar1),
^
src/libswscale/ops_backend.h:78:27: note: in definition of macro ‘bitfn2’
 #define bitfn2(name, ext) name ## _ ## ext
   ^~~~
src/libswscale/ops_backend.h:82:19: note: in expansion of macro ‘bitfn’
 #define fn(name)  bitfn(name, FN_SUFFIX)
   ^
src/libswscale/ops_tmpl_int.c:492:9: note: in expansion of macro ‘fn’
 fn(op_read_planar1),
 ^~
src/libswscale/ops_tmpl_int.c:492:12: note: (near initialization for ‘op_table_int_u8.entries[0]’)
 fn(op_read_planar1),
^
src/libswscale/ops_backend.h:78:27: note: in definition of macro ‘bitfn2’
 #define bitfn2(name, ext) name ## _ ## ext
   ^~~~
src/libswscale/ops_backend.h:82:19: note: in expansion of macro ‘bitfn’
 #define fn(name)  bitfn(name, FN_SUFFIX)
   ^
src/libswscale/ops_tmpl_int.c:492:9: note: in expansion of macro ‘fn’
 fn(op_read_planar1),
 ^~
src/libswscale/ops_tmpl_int.c:493:12: error: initializer element is not constant
 fn(op_read_planar2),
^
src/libswscale/ops_backend.h:78:27: note: in definition of macro ‘bitfn2’
 #define bitfn2(name, ext) name ## _ ## ext
   ^~~~
src/libswscale/ops_backend.h:82:19: note: in expansion of macro ‘bitfn’
 #define fn(name)  bitfn(name, FN_SUFFIX)
   ^
src/libswscale/ops_tmpl_int.c:493:9: note: in expansion of macro ‘fn’
 fn(op_read_planar2),
 ^~
src/libswscale/ops_tmpl_int.c:493:12: note: (near initialization for ‘op_table_int_u8.entries[1]’)



[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

In a rich man's house there is no place to spit but his face.
-- Diogenes of Sinope




Re: [FFmpeg-devel] [PATCH v2 2/2] ogg/{vorbis, flac, opus}: Remove header packets from subsequent ogg streams from the demuxer output.

2025-05-02 Thread Michael Niedermayer
On Tue, Apr 29, 2025 at 07:49:54PM -0500, Romain Beauxis wrote:
> Le mar. 29 avr. 2025 à 19:17, Michael Niedermayer 
> a écrit :
> >
> > Hi Romain
> >
> > On Tue, Apr 29, 2025 at 05:42:22PM -0500, Romain Beauxis wrote:
> > > Le mar. 29 avr. 2025 à 16:35, Michael Niedermayer
> > >  a écrit :
> > > >
> > > > Hi
> > > >
> > > > On Mon, Apr 28, 2025 at 06:31:36PM -0500, Romain Beauxis wrote:
> > > > > ---
> > > > >  libavformat/oggdec.c   | 26
> ++--
> > > > >  libavformat/oggdec.h   |  6 +
> > > > >  libavformat/oggparseflac.c | 28
> --
> > > > >  libavformat/oggparseopus.c | 12 ++
> > > > >  libavformat/oggparsevorbis.c   | 12 --
> > > > >  tests/ref/fate/ogg-flac-chained-meta.txt   |  2 --
> > > > >  tests/ref/fate/ogg-opus-chained-meta.txt   |  1 -
> > > > >  tests/ref/fate/ogg-vorbis-chained-meta.txt |  3 ---
> > > > >  8 files changed, 68 insertions(+), 22 deletions(-)
> > > > >
> > > > > diff --git a/libavformat/oggdec.c b/libavformat/oggdec.c
> > > > > index 5339fdd32c..5557eb4a14 100644
> > > > > --- a/libavformat/oggdec.c
> > > > > +++ b/libavformat/oggdec.c
> > > > > @@ -239,10 +239,6 @@ static int ogg_replace_stream(AVFormatContext
> *s, uint32_t serial, char *magic,
> > > > >  os->start_trimming = 0;
> > > > >  os->end_trimming = 0;
> > > > >
> > > > > -/* Chained files have extradata as a new packet */
> > > > > -if (codec == &ff_opus_codec)
> > > > > -os->header = -1;
> > > > > -
> > > > >  return i;
> > > > >  }
> > > > >
> > > > > @@ -605,20 +601,26 @@ static int ogg_packet(AVFormatContext *s, int
> *sid, int *dstart, int *dsize,
> > > > >  } else {
> > > > >  os->pflags= 0;
> > > > >  os->pduration = 0;
> > > > > +
> > > > > +ret = 0;
> > > > >  if (os->codec && os->codec->packet) {
> > > > >  if ((ret = os->codec->packet(s, idx)) < 0) {
> > > > >  av_log(s, AV_LOG_ERROR, "Packet processing failed:
> %s\n", av_err2str(ret));
> > > > >  return ret;
> > > > >  }
> > > > >  }
> > > > > -if (sid)
> > > > > -*sid = idx;
> > > > > -if (dstart)
> > > > > -*dstart = os->pstart;
> > > > > -if (dsize)
> > > > > -*dsize = os->psize;
> > > > > -if (fpos)
> > > > > -*fpos = os->sync_pos;
> > > > > +
> > > > > +if (!ret) {
> > > > > +if (sid)
> > > > > +*sid = idx;
> > > > > +if (dstart)
> > > > > +*dstart = os->pstart;
> > > > > +if (dsize)
> > > > > +*dsize = os->psize;
> > > > > +if (fpos)
> > > > > +*fpos = os->sync_pos;
> > > > > +}
> > > > > +
> > > > >  os->pstart  += os->psize;
> > > > >  os->psize= 0;
> > > > >  if(os->pstart == os->bufpos)
> > > >
> > > > > diff --git a/libavformat/oggdec.h b/libavformat/oggdec.h
> > > > > index 43df23f4cb..09f698f99a 100644
> > > > > --- a/libavformat/oggdec.h
> > > > > +++ b/libavformat/oggdec.h
> > > > > @@ -38,6 +38,12 @@ struct ogg_codec {
> > > > >   * -1 if an error occurred or for unsupported stream
> > > > >   */
> > > > >  int (*header)(AVFormatContext *, int);
> > > > > +/**
> > > > > + * Attempt to process a packet as a data packet
> > > > > + * @return 1 if the packet was a header from a chained
> bitstream.
> > > > > + * 0 if the packet was a regular data packet.
> > > > > + * -1 if an error occurred or for unsupported stream
> > > > > + */
> > > > >  int (*packet)(AVFormatContext *, int);
> > > > >  /**
> > > > >   * Translate a granule into a timestamp.
> > > >
> > > > ok, but seems unrelated
> > >
> > > This is a new convention to allow the parser to know when to skip a
> > > packet. Previous to that, return value of 1 did not have specific
> > > meaning.
> >
> > Do you think this would merit a separate patch?
> > I mean patch #1 changing the packet() return value and clearly stating
> > that in the commit message
> > and patch #2 using the new value ?
> >
> > I think it would make things clearer
> 
> Sure thing!
> 
> libavformat/oggdec.c: Skip packets when packet function returns 1:
> 
> https://github.com/toots/FFmpeg/commit/b78acea3b2840320bb68e065b2e712d753cb8d26
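
The return-value convention discussed above (1 = header from a chained bitstream, 0 = regular data packet, negative = error) can be sketched roughly as follows. This is a simplified illustration only: packet_cb, filter_packets() and toy_packet() are hypothetical names and do not correspond to the actual oggdec.c code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a codec's packet() callback:
 * 1 = chained-stream header (skip), 0 = data packet, <0 = error. */
typedef int (*packet_cb)(int packet_id);

/* Copies the ids of packets that should reach the caller into out[].
 * Returns the number of ids written, or the callback's negative error. */
static int filter_packets(packet_cb cb, const int *ids, size_t n, int *out)
{
    int count = 0;
    assert(cb && ids && out);
    for (size_t i = 0; i < n; i++) {
        int ret = cb(ids[i]);
        if (ret < 0)
            return ret;          /* propagate errors unchanged */
        if (ret > 0)
            continue;            /* header from a chained stream: skip */
        out[count++] = ids[i];   /* regular data packet: pass through */
    }
    return count;
}

/* Toy callback: negative ids are errors, multiples of 10 are headers. */
static int toy_packet(int packet_id)
{
    if (packet_id < 0)
        return -1;
    return packet_id % 10 == 0;
}
```

Keeping the "skip" decision in the return value means the caller does not need any per-codec knowledge of what a chained header looks like.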


> commit 16f4a594af15aee1b68e87f17d00018592528ce6
> Author: Romain Beauxis 
> Date:   Mon Apr 28 18:08:58 2025 -0500
>
> libavformat/oggdec.c: Skip packets when packet function returns 1.
>
...
> diff --git a/libavformat/oggdec.h b/libavformat/oggdec.h
> index 43df23f4cb9..09f698f99ab 100644
> --- a/libavformat/oggdec.h
> +++ b/libavformat/oggdec.h
> @@ -38,6 +38,12 @@ struct ogg_codec {
>   * -1 if an error occurred or for unsupported stream
>   */
>  int (*header)(AVFormatContext *, int

Re: [FFmpeg-devel] [PATCH] avfilter/vf_setparams: Fix chroma_location being cleared

2025-05-02 Thread Michael Niedermayer
On Wed, Apr 30, 2025 at 02:56:58PM +0200, Tobias Rapp wrote:
> On 30/04/2025 14:09, Tobias Rapp wrote:
> 
> > Fix chroma_location being cleared by setrange and setfield filters.
> > This was forgotten in 201f1cba150d44de6fedfeee4e8647170ed5fbca.
> > 
> > Signed-off-by: Tobias Rapp 
> > ---
> >   libavfilter/vf_setparams.c | 2 ++
> >   1 file changed, 2 insertions(+)
> > 
> > diff --git a/libavfilter/vf_setparams.c b/libavfilter/vf_setparams.c
> > index 751750e..1e37876 100644
> > --- a/libavfilter/vf_setparams.c
> > +++ b/libavfilter/vf_setparams.c
> > @@ -236,6 +236,7 @@ static av_cold int init_setrange(AVFilterContext *ctx)
> >   s->color_primaries = -1;
> >   s->color_trc   = -1;
> >   s->colorspace  = -1;
> > +s->chroma_location = -1;
> >   return 0;
> >   }
> > @@ -272,6 +273,7 @@ static av_cold int init_setfield(AVFilterContext *ctx)
> >   s->color_primaries = -1;
> >   s->color_trc   = -1;
> >   s->colorspace  = -1;
> > +s->chroma_location = -1;
> >   return 0;
> >   }
> 
> If approved, I would also like to backport the fix to the release/7.1
> branch.

backport ok
201f1cba150d44de6fedfeee4e8647170ed5fbca is Niklas's change, so give him a
bit of time to reply, but I agree it looks like this was forgotten

thx
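
For readers following along, a minimal sketch of why the missing initialization matters, assuming the usual convention that a negative value means "leave the frame property untouched". Params, Frame, apply_params() and init_range_only() are hypothetical simplifications, not the real vf_setparams structures.

```c
#include <assert.h>

/* Hypothetical, simplified stand-ins; NOT the real vf_setparams types. */
typedef struct Params { int range; int chroma_location; } Params;
typedef struct Frame  { int range; int chroma_location; } Frame;

/* A negative parameter means "leave the frame's property untouched". */
static void apply_params(const Params *p, Frame *f)
{
    assert(p && f);
    if (p->range >= 0)
        f->range = p->range;
    if (p->chroma_location >= 0)
        f->chroma_location = p->chroma_location;
}

/* Init for a filter that only sets the range: every other field has to be
 * set to -1 explicitly, because a zero-initialized field would be applied
 * to the frame and overwrite its existing value. */
static void init_range_only(Params *p, int range)
{
    assert(p);
    p->range           = range;
    p->chroma_location = -1;
}
```

If chroma_location were left at its zero default instead of -1, apply_params() would overwrite the frame's value with 0, which matches the "being cleared" symptom described in the patch.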

[...]

-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Elect your leaders based on what they did after the last election, not
based on what they say before an election.





Re: [FFmpeg-devel] [PATCH 5/5] checkasm: add vvc_sao

2025-05-02 Thread Martin Storsjö

On Fri, 2 May 2025, Nuo Mi wrote:


On Fri, May 2, 2025 at 3:49 PM Martin Storsjö  wrote:
  On Fri, 2 May 2025, Nuo Mi wrote:

  > From: Shaun Loo 
  >
  > This is a part of Google Summer of Code 2023
  >
  > AVX2:
  > - vvc_sao.sao_band [OK]
  > - vvc_sao.sao_edge [OK]
  >
  > Co-authored-by: Nuo Mi 
  > ---
  > tests/checkasm/Makefile   |   2 +-
  > tests/checkasm/checkasm.c |   1 +
  > tests/checkasm/checkasm.h |   1 +
  > tests/checkasm/vvc_sao.c  | 161
  ++
  > 4 files changed, 164 insertions(+), 1 deletion(-)
  > create mode 100644 tests/checkasm/vvc_sao.c
  >
  > diff --git a/tests/checkasm/Makefile b/tests/checkasm/Makefile
  > index 193c1e4633..fabbf595b4 100644
  > --- a/tests/checkasm/Makefile
  > +++ b/tests/checkasm/Makefile
  > @@ -47,7 +47,7 @@ AVCODECOBJS-$(CONFIG_V210_DECODER)      +=
  v210dec.o
  > AVCODECOBJS-$(CONFIG_V210_ENCODER)      += v210enc.o
  > AVCODECOBJS-$(CONFIG_VORBIS_DECODER)    += vorbisdsp.o
  > AVCODECOBJS-$(CONFIG_VP9_DECODER)       += vp9dsp.o
  > -AVCODECOBJS-$(CONFIG_VVC_DECODER)       += vvc_alf.o vvc_mc.o
  > +AVCODECOBJS-$(CONFIG_VVC_DECODER)       += vvc_alf.o vvc_mc.o
  vvc_sao.o
  >
  > CHECKASMOBJS-$(CONFIG_AVCODEC)          += $(AVCODECOBJS-yes)
  >
  > diff --git a/tests/checkasm/checkasm.c
  b/tests/checkasm/checkasm.c
  > index 3bb82ed0e5..0734cd26bf 100644
  > --- a/tests/checkasm/checkasm.c
  > +++ b/tests/checkasm/checkasm.c
  > @@ -256,6 +256,7 @@ static const struct {
  >     #if CONFIG_VVC_DECODER
  >         { "vvc_alf", checkasm_check_vvc_alf },
  >         { "vvc_mc",  checkasm_check_vvc_mc  },
  > +        { "vvc_sao", checkasm_check_vvc_sao },
  >     #endif
  > #endif
  > #if CONFIG_AVFILTER
  > diff --git a/tests/checkasm/checkasm.h
  b/tests/checkasm/checkasm.h
  > index a6b5965e02..146bfdec35 100644
  > --- a/tests/checkasm/checkasm.h
  > +++ b/tests/checkasm/checkasm.h
  > @@ -149,6 +149,7 @@ void checkasm_check_videodsp(void);
  > void checkasm_check_vorbisdsp(void);
  > void checkasm_check_vvc_alf(void);
  > void checkasm_check_vvc_mc(void);
  > +void checkasm_check_vvc_sao(void);
  >
  > struct CheckasmPerf;
  >
  > diff --git a/tests/checkasm/vvc_sao.c
  b/tests/checkasm/vvc_sao.c
  > new file mode 100644
  > index 00..026078ff02
  > --- /dev/null
  > +++ b/tests/checkasm/vvc_sao.c
  > @@ -0,0 +1,161 @@
  > +/*
  > + * Copyright (c) 2018 Yingming Fan 
  > + *
  > + * This file is part of FFmpeg.
  > + *
  > + * FFmpeg is free software; you can redistribute it and/or
  modify
  > + * it under the terms of the GNU General Public License as
  published by
  > + * the Free Software Foundation; either version 2 of the
  License, or
  > + * (at your option) any later version.
  > + *
  > + * FFmpeg is distributed in the hope that it will be useful,
  > + * but WITHOUT ANY WARRANTY; without even the implied
  warranty of
  > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See
  the
  > + * GNU General Public License for more details.
  > + *
  > + * You should have received a copy of the GNU General Public
  License along
  > + * with FFmpeg; if not, write to the Free Software
  Foundation, Inc.,
  > + * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301
  USA.
  > + */
  > +
  > +#include 
  > +
  > +#include "libavutil/intreadwrite.h"
  > +#include "libavutil/mem_internal.h"
  > +
  > +#include "libavcodec/vvc/dsp.h"
  > +#include "libavcodec/vvc/ctu.h"
  > +
  > +#include "checkasm.h"
  > +
  > +static const uint32_t pixel_mask[3] = { 0x,
  0x03ff03ff, 0x0fff0fff };
  > +static const uint32_t sao_size[] = {8, 16, 32, 48, 64, 80,
  96, 112, 128};
  > +
  > +#define SIZEOF_PIXEL ((bit_depth + 7) / 8)
  > +#define PIXEL_STRIDE (2*MAX_PB_SIZE +
  AV_INPUT_BUFFER_PADDING_SIZE) //same with sao_edge src_stride
  > +#define BUF_SIZE (PIXEL_STRIDE * (MAX_PB_SIZE+2) * 2) //+2
  for top and bottom row, *2 for high bit depth
  > +#define OFFSET_THRESH (1 << (bit_depth - 5))
  > +#define OFFSET_LENGTH 5
  > +
  > +#define randomize_buffers(buf0, buf1, size)                 \
  > +    do {                                                    \
  > +        uint32_t mask = pixel_mask[(bit_depth - 8) >> 1];   \
  > +        int k;                                              \
  > +        for (k = 0; k < size; k += 4) {                     \
  > +            uint32_t r = rnd() & mask;                      \
  > +            AV_WN32A(buf0 + k, r);                          \
  > +            AV_WN3

Re: [FFmpeg-devel] [PATCH 1/2] avformat/hls: Split allowed_segment_extensions off allowed_extensions

2025-05-02 Thread Michael Niedermayer
On Wed, Apr 30, 2025 at 01:44:04AM +0200, Michael Niedermayer wrote:
> This allows the user to set only the one that is needed to ALL or a
> specific "wrong" extension like html
> 
> Signed-off-by: Michael Niedermayer 
> ---
>  libavformat/hls.c | 17 +
>  1 file changed, 13 insertions(+), 4 deletions(-)

will apply patchset

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The greatest way to live with honor in this world is to be what we pretend
to be. -- Socrates


___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH 1/7] postproc/tests: Add test for temporal denoise

2025-05-02 Thread Michael Niedermayer
On Fri, May 02, 2025 at 12:27:29AM +0200, Michael Niedermayer wrote:
> Sponsored-by: Sovereign Tech Fund
> Signed-off-by: Michael Niedermayer 
> ---
>  libpostproc/Makefile |   1 +
>  libpostproc/tests/temptest.c | 120 +
>  tests/fate/libpostproc.mak   |   4 +
>  tests/ref/fate/temptest  | 336 +++
>  4 files changed, 461 insertions(+)
>  create mode 100644 libpostproc/tests/temptest.c
>  create mode 100644 tests/ref/fate/temptest

will apply patchset

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

it is not once nor twice but times without number that the same ideas make
their appearance in the world. -- Aristotle




Re: [FFmpeg-devel] [PATCH v8 14/15] fftools/graphprint: Add execution graph printing

2025-05-02 Thread softworkz .



> -Original Message-
> From: ffmpeg-devel  On Behalf Of Michael
> Niedermayer
> Sent: Freitag, 2. Mai 2025 02:12
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH v8 14/15] fftools/graphprint: Add execution
> graph printing
> 
> On Tue, Apr 29, 2025 at 08:33:50PM +, softworkz . wrote:
> >
> >
> > > -Original Message-
> > > From: ffmpeg-devel  On Behalf Of Michael
> > > Niedermayer
> > > Sent: Dienstag, 29. April 2025 21:35
> > > To: FFmpeg development discussions and patches 
> > > Subject: Re: [FFmpeg-devel] [PATCH v8 14/15] fftools/graphprint: Add
> execution
> > > graph printing
> > >
> > > Hi softworkz
> > >
> > > On Tue, Apr 29, 2025 at 01:00:03AM +, softworkz wrote:
> > > > From: softworkz 
> > > >
> > > > The key benefits are:
> > > >
> > > > - Different to other graph printing methods, this is outputting:
> > > >   - all graphs with runtime state
> > > > (including auto-inserted filters)
> > > >   - each graph with its inputs and outputs
> > > >   - all filters with their in- and output pads
> > > >   - all connections between all input- and output pads
> > > >   - for each connection:
> > > > - the runtime-negotiated format and media type
> > > > - the hw context
> > > > - if video hw context, both: hw pixfmt + sw pixfmt
> > > > - Output can either be printed to stdout or written to specified file
> > > > - Output is machine-readable
> > > > - Use the same output implementation as ffprobe, supporting multiple
> > > >   formats
> > >
> > > fails on arm cross compile (no zlib maybe)
> > >
> > > /usr/lib/gcc-cross/arm-linux-gnueabi/7/../../../../arm-linux-
> gnueabi/bin/ld:
> > > fftools/resources/resman.o: in function `decompress_gzip':
> > > ffmpeg/arm/src/fftools/resources/resman.c:82: undefined reference to
> > > `inflateInit2_'
> > > /usr/lib/gcc-cross/arm-linux-gnueabi/7/../../../../arm-linux-
> gnueabi/bin/ld:
> > > ffmpeg/arm/src/fftools/resources/resman.c:94: undefined reference to
> `inflate'
> > > /usr/lib/gcc-cross/arm-linux-gnueabi/7/../../../../arm-linux-
> gnueabi/bin/ld:
> > > ffmpeg/arm/src/fftools/resources/resman.c:110: undefined reference to
> > > `inflateEnd'
> >
> >
> > Ouch! I thought that zlib could be taken for granted.
> >
> > What should I do? Guard this and everything else that is depending on it
> > by #if blocks?
> >
> > Thanks for any advice, I'm not sure how to approach this exactly...
> 
> #if or make the whole tool depend on zlib in configure
> 
> or write a basic zlib implementation ;)
> 
> thx
> 
> [...]
> --
> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB


Hi,

I managed to avoid making the whole feature conditional: the absence of
zlib now affects only the compression of resources, which are otherwise
included uncompressed. This works much the same way as ptx compression,
so you can also do:

./configure --disable-resource-compression

to disable it.
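
A rough sketch of what such a guard could look like in C follows. The macro
SKETCH_RESOURCE_COMPRESSION, the Resource struct and resource_get() are
hypothetical names made up for illustration; they are not the identifiers
used in resman.c. Only the zlib calls (inflateInit2, inflate, inflateEnd)
are real API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical switch; a real build would derive this from configure. */
#ifndef SKETCH_RESOURCE_COMPRESSION
#define SKETCH_RESOURCE_COMPRESSION 0
#endif

#if SKETCH_RESOURCE_COMPRESSION
#include <zlib.h>
#endif

/* Illustrative stand-in for an embedded resource blob. */
typedef struct Resource {
    const unsigned char *data;
    size_t               size;
} Resource;

/* Copies (or inflates) the resource into out. Returns 0 on success, -1 on error. */
static int resource_get(const Resource *r, unsigned char *out,
                        size_t out_cap, size_t *out_len)
{
    assert(r && out && out_len);
#if SKETCH_RESOURCE_COMPRESSION
    z_stream zs;
    int ret;

    memset(&zs, 0, sizeof(zs));
    /* 16 + MAX_WBITS tells zlib to expect a gzip wrapper. */
    if (inflateInit2(&zs, 16 + MAX_WBITS) != Z_OK)
        return -1;
    zs.next_in   = (Bytef *)r->data;
    zs.avail_in  = (uInt)r->size;
    zs.next_out  = out;
    zs.avail_out = (uInt)out_cap;
    ret = inflate(&zs, Z_FINISH);
    *out_len = zs.total_out;
    inflateEnd(&zs);
    return ret == Z_STREAM_END ? 0 : -1;
#else
    /* No zlib: the blob was embedded uncompressed, so just copy it. */
    if (r->size > out_cap)
        return -1;
    memcpy(out, r->data, r->size);
    *out_len = r->size;
    return 0;
#endif
}
```

With this shape, only one function body depends on zlib, and callers do not
need their own #if blocks.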

Does this configure change require any other accompanying change in any
file (doc, versions, api)?


Probably the compression wasn't really worth the effort considering absolute
numbers, even though the relative savings are nice:

graph.css      7752
graph.html     2153

graph.css.min 6655
(css is always minified)

No Compression

graph.css.c  40026
graph.css.o   9344 (6688)
graph.html.c 13016
graph.html.o  4848 (2186)

With Compression

graph.css.c  10206
graph.css.o   4368 (1718)
graph.html.c  5725
graph.html.o  3632 (971)

Numbers in brackets: .rodata size from 'size -Ax -d *.o'
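
To put the relative savings into numbers, here is a small arithmetic sketch
in C. The helper names bytes_saved() and saved_permille() are made up for
this example; the inputs are the .rodata sizes quoted above.

```c
#include <assert.h>

/* Absolute saving in bytes. */
static long bytes_saved(long before, long after)
{
    assert(before >= after);
    return before - after;
}

/* Relative saving in permille (tenths of a percent), truncated. */
static long saved_permille(long before, long after)
{
    assert(before > 0);
    return (before - after) * 1000 / before;
}
```

For graph.css, bytes_saved(6688, 1718) is 4970 bytes, or about 74%. For
graph.html, bytes_saved(2186, 971) is 1215 bytes, or about 55%. Together
that is 6185 bytes of .rodata.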


Thanks,
sw







Re: [FFmpeg-devel] [PATCH] hwcontext_vulkan: fix exporting multi-plane DRM modifiers

2025-05-02 Thread Russell Greene
Is this documented anywhere? Should I maybe use a fixed-size stack
buffer and fall back to an allocation, if it's just a rule of thumb that
it'll be less than some known small number? I somehow doubt this
is in the Vulkan spec or guaranteed in any way.
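
For reference, the stack-buffer-with-fallback pattern I have in mind would
look roughly like the sketch below. INLINE_MODS, ModProps and ModBuf are
hypothetical names invented for this example, and the inline capacity is a
guess rather than anything the spec promises.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical inline capacity: a rule-of-thumb guess, not a spec guarantee. */
#define INLINE_MODS 16

/* Illustrative stand-in for the per-modifier properties struct. */
typedef struct ModProps {
    uint64_t modifier;
    uint32_t plane_count;
} ModProps;

typedef struct ModBuf {
    ModProps  inline_buf[INLINE_MODS]; /* used when the count is small */
    ModProps *data;                    /* points at inline_buf or the heap */
    int       on_heap;
} ModBuf;

/* Returns 0 on success, -1 if the heap fallback could not be allocated. */
static int modbuf_init(ModBuf *b, uint32_t count)
{
    assert(b);
    memset(b->inline_buf, 0, sizeof(b->inline_buf));
    b->data    = b->inline_buf;
    b->on_heap = 0;
    if (count > INLINE_MODS) {
        b->data = calloc(count, sizeof(*b->data));
        if (!b->data)
            return -1;
        b->on_heap = 1;
    }
    return 0;
}

/* Frees the heap buffer only if the fallback was actually taken. */
static void modbuf_free(ModBuf *b)
{
    assert(b);
    if (b->on_heap)
        free(b->data);
    b->data    = NULL;
    b->on_heap = 0;
}
```

Note that a ModBuf points into itself and so must not be copied by value.
The common case then costs no allocation, and the large case still works.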

On Thu, May 1, 2025 at 6:10 PM Lynne  wrote:
>
> On 01/05/2025 07:05, Russell Greene wrote:
> > From: Russell Greene 
> >
> > Previously, it was assumed that `drmFormatModifierPlaneCount` was one
> > for all modifiers when exporting, which is not always the case, in
> > particular for AMD GPUs and maybe others.
> >
> > Fetch the number of memory planes and fill the structs appropriately in 
> > this situation.
> >
> > The encoded stream is still bad in the case where modifiers are involved,
> > but I think this patch still stands on its own and I suspect that may be a 
> > driver bug.
> >
> > A potential improvement that could be made is to cache the format
> > information, so we can avoid the two GetPhysicalDeviceFormatProperties2
> > calls for each export, as well as the allocation. I doubt this is very
> > expensive, but seemed worth noting.
> >
> > Signed-off-by: Russell Greene 
> > ---
> >   libavutil/hwcontext_vulkan.c | 76 +++-
> >   1 file changed, 67 insertions(+), 9 deletions(-)
> >
> > diff --git a/libavutil/hwcontext_vulkan.c b/libavutil/hwcontext_vulkan.c
> > index ade0235ef1..d14fa4655b 100644
> > --- a/libavutil/hwcontext_vulkan.c
> > +++ b/libavutil/hwcontext_vulkan.c
> > @@ -3787,6 +3787,17 @@ static inline uint32_t vulkan_fmt_to_drm(VkFormat 
> > vkfmt)
> >   return DRM_FORMAT_INVALID;
> >   }
> >
> > +#define MAX_MEMORY_PLANES 4
> > +static VkImageAspectFlags plane_index_to_aspect(int plane) {
> > +if (plane == 0) return VK_IMAGE_ASPECT_MEMORY_PLANE_0_BIT_EXT;
> > +if (plane == 1) return VK_IMAGE_ASPECT_MEMORY_PLANE_1_BIT_EXT;
> > +if (plane == 2) return VK_IMAGE_ASPECT_MEMORY_PLANE_2_BIT_EXT;
> > +if (plane == 3) return VK_IMAGE_ASPECT_MEMORY_PLANE_3_BIT_EXT;
> > +
> > +av_assert2 (false && "Invalid plane index");
> > +return VK_IMAGE_ASPECT_MEMORY_PLANE_0_BIT_EXT;
> > +}
> > +
> >   static int vulkan_map_to_drm(AVHWFramesContext *hwfc, AVFrame *dst,
> >const AVFrame *src, int flags)
> >   {
> > @@ -3855,14 +3866,65 @@ static int vulkan_map_to_drm(AVHWFramesContext 
> > *hwfc, AVFrame *dst,
> >
> >   drm_desc->nb_layers = planes;
> >   for (int i = 0; i < drm_desc->nb_layers; i++) {
> > -VkSubresourceLayout layout;
> > -VkImageSubresource sub = {
> > -.aspectMask = VK_IMAGE_ASPECT_MEMORY_PLANE_0_BIT_EXT,
> > +VkDrmFormatModifierPropertiesListEXT modp = {
> > +.sType = 
> > VK_STRUCTURE_TYPE_DRM_FORMAT_MODIFIER_PROPERTIES_LIST_EXT,
> > +};
> > +VkFormatProperties2 fmtp = {
> > +.sType = VK_STRUCTURE_TYPE_FORMAT_PROPERTIES_2,
> > +.pNext = &modp,
> >   };
> >   VkFormat plane_vkfmt = av_vkfmt_from_pixfmt(hwfc->sw_format)[i];
> >
> > -drm_desc->layers[i].format= vulkan_fmt_to_drm(plane_vkfmt);
> > -drm_desc->layers[i].nb_planes = 1;
> > +drm_desc->layers[i].format = vulkan_fmt_to_drm(plane_vkfmt);
> > +
> > +/* query drmFormatModifierCount by keeping 
> > pDrmFormatModifierProperties NULL */
> > +vk->GetPhysicalDeviceFormatProperties2(hwctx->phys_dev, 
> > plane_vkfmt, &fmtp);
> > +
> > +modp.pDrmFormatModifierProperties =
> > +av_calloc(modp.drmFormatModifierCount, 
> > sizeof(*modp.pDrmFormatModifierProperties));
> > +if (!modp.pDrmFormatModifierProperties) {
> > +err = AVERROR(ENOMEM);
> > +goto end;
> > +}
> > +vk->GetPhysicalDeviceFormatProperties2(hwctx->phys_dev, 
> > plane_vkfmt, &fmtp);
> > +
> > +VkDrmFormatModifierPropertiesEXT *mod_props = NULL;
> > +for (uint32_t i = 0; i < modp.drmFormatModifierCount; ++i) {
> > +VkDrmFormatModifierPropertiesEXT *m = 
> > &modp.pDrmFormatModifierProperties[i];
> > +if (m->drmFormatModifier == drm_mod.drmFormatModifier) {
> > +mod_props = m;
> > +break;
> > +}
> > +}
> > +
> > +if (!mod_props) {
> > +av_free(modp.pDrmFormatModifierProperties);
> > +av_log(hwfc, AV_LOG_ERROR, "Cannot fetch modifier properties for modifier %"PRIu64"!\n",
> > +   drm_mod.drmFormatModifier);
> > +err = AVERROR_EXTERNAL;
> > +goto end;
> > +}
> > +drm_desc->layers[i].nb_planes = 
> > mod_props->drmFormatModifierPlaneCount;
> > +av_free(modp.pDrmFormatModifierProperties);
> > +
> > +if (drm_desc->layers[i].nb_planes > MAX_MEMORY_PLANES) {
> > +av_log(hwfc, AV_LOG_ERROR, "Too many memory planes for DRM 
> > format!\n");
> > +err = AVERROR_EXTERNAL;
> > +goto end;
> > + 

Re: [FFmpeg-devel] [PATCH] configure: Enable -fno-common for Darwin targets, avoid linker warnings

2025-05-02 Thread Martin Storsjö

On Tue, 29 Apr 2025, Martin Storsjö wrote:


Since GCC 10 and llvm.org Clang 11, -fno-common is the default.
However Apple's Xcode Clang hasn't followed suit yet, and still
defaults to -fcommon.

Compiling with -fcommon causes uninitialized global variables to
be treated as "common" (which allows multiple object files to have
similar definitions).

Common variables seem to have the issue that their intended alignment
isn't signaled, so the linker assumes that they may need alignment
according to their full size.

With large global tables, this can lead to linker warnings like
this, with Xcode 16.3:

   ld: warning: reducing alignment of section __DATA,__common from 0x8000 to 0x4000 because it exceeds segment maximum alignment

This can be reproduced with a small snippet like this:

   char table[16385];
   int main(int argc, char* argv[]) { return 0; }

Compiling with -fno-common avoids this issue and warning, and
matches the default behaviour of other compilers. (Compiling with
-fno-common also avoids the risk of accidentally accepting
duplicate definitions of global variables, as long as they are
uninitialized.)
---
configure | 7 +++
1 file changed, 7 insertions(+)


Will push soon.

// Martin


[FFmpeg-devel] [PATCH 2/5] x86/hevcdec: sao, refact out h26x macros

2025-05-02 Thread Nuo Mi
From: Shaun Loo 

This is a part of Google Summer of Code 2023

Co-authored-by: Nuo Mi 
---
 libavcodec/x86/h26x/h2656_sao.asm   | 301 
 libavcodec/x86/h26x/h2656_sao_10bit.asm | 301 
 libavcodec/x86/hevc/sao.asm | 278 +-
 libavcodec/x86/hevc/sao_10bit.asm   | 277 +-
 4 files changed, 610 insertions(+), 547 deletions(-)
 create mode 100644 libavcodec/x86/h26x/h2656_sao.asm
 create mode 100644 libavcodec/x86/h26x/h2656_sao_10bit.asm

diff --git a/libavcodec/x86/h26x/h2656_sao.asm b/libavcodec/x86/h26x/h2656_sao.asm
new file mode 100644
index 00..504fcb388b
--- /dev/null
+++ b/libavcodec/x86/h26x/h2656_sao.asm
@@ -0,0 +1,301 @@
+;**
+;* SIMD optimized SAO functions for HEVC/VVC 8bit decoding
+;*
+;* Copyright (c) 2013 Pierre-Edouard LEPERE
+;* Copyright (c) 2014 James Almer
+;*
+;* This file is part of FFmpeg.
+;*
+;* FFmpeg is free software; you can redistribute it and/or
+;* modify it under the terms of the GNU Lesser General Public
+;* License as published by the Free Software Foundation; either
+;* version 2.1 of the License, or (at your option) any later version.
+;*
+;* FFmpeg is distributed in the hope that it will be useful,
+;* but WITHOUT ANY WARRANTY; without even the implied warranty of
+;* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+;* Lesser General Public License for more details.
+;*
+;* You should have received a copy of the GNU Lesser General Public
+;* License along with FFmpeg; if not, write to the Free Software
+;* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+;**
+
+%include "libavutil/x86/x86util.asm"
+
+SECTION_RODATA 32
+
+pb_edge_shuffle: times 2 db 1, 2, 0, 3, 4, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1
+pb_eo:   db -1, 0, 1, 0, 0, -1, 0, 1, -1, -1, 1, 1, 1, -1, -1, 1
+cextern pb_1
+cextern pb_2
+
+SECTION .text
+
+;**
+;SAO Band Filter
+;**
+
+%macro H2656_SAO_BAND_FILTER_INIT 0
+and leftq, 31
+movd xm0, leftd
+add leftq, 1
+and leftq, 31
+movd xm1, leftd
+add leftq, 1
+and leftq, 31
+movd xm2, leftd
+add leftq, 1
+and leftq, 31
+movd xm3, leftd
+
+SPLATW m0, xm0
+SPLATW m1, xm1
+SPLATW m2, xm2
+SPLATW m3, xm3
+%if mmsize > 16
+SPLATW m4, [offsetq + 2]
+SPLATW m5, [offsetq + 4]
+SPLATW m6, [offsetq + 6]
+SPLATW m7, [offsetq + 8]
+%else
+movq  m7, [offsetq + 2]
+SPLATW m4, m7, 0
+SPLATW m5, m7, 1
+SPLATW m6, m7, 2
+SPLATW m7, m7, 3
+%endif
+
+%if ARCH_X86_64
+pxor m14, m14
+
+%else ; ARCH_X86_32
+mova  [rsp+mmsize*0], m0
+mova  [rsp+mmsize*1], m1
+mova  [rsp+mmsize*2], m2
+mova  [rsp+mmsize*3], m3
+mova  [rsp+mmsize*4], m4
+mova  [rsp+mmsize*5], m5
+mova  [rsp+mmsize*6], m6
+pxor  m0, m0
+%assign MMSIZE mmsize
+%define m14 m0
+%define m13 m1
+%define  m9 m2
+%define  m8 m3
+%endif ; ARCH
+DEFINE_ARGS dst, src, dststride, srcstride, offset, height
+mov  heightd, r7m
+%endmacro
+
+%macro H2656_SAO_BAND_FILTER_COMPUTE 2
+psraw %1, %2, 3
+%if ARCH_X86_64
+pcmpeqw  m10, %1, m0
+pcmpeqw  m11, %1, m1
+pcmpeqw  m12, %1, m2
+pcmpeqw   %1, m3
+pand m10, m4
+pand m11, m5
+pand m12, m6
+pand  %1, m7
+por  m10, m11
+por  m12, %1
+por  m10, m12
+paddw %2, m10
+%else ; ARCH_X86_32
+pcmpeqw   m4, %1, [rsp+MMSIZE*0]
+pcmpeqw   m5, %1, [rsp+MMSIZE*1]
+pcmpeqw   m6, %1, [rsp+MMSIZE*2]
+pcmpeqw   %1, [rsp+MMSIZE*3]
+pand  m4, [rsp+MMSIZE*4]
+pand  m5, [rsp+MMSIZE*5]
+pand  m6, [rsp+MMSIZE*6]
+pand  %1, m7
+por   m4, m5
+por   m6, %1
+por   m4, m6
+paddw %2, m4
+%endif ; ARCH
+%endmacro
+
+;void ff_{hevc, vvc}_sao_band_filter__8_(uint8_t *_dst, const 
uint8_t *_src, ptrdiff_t _stride_dst, ptrdiff_t _stride_src,
+; int16_t *sao_offset_val, int 
sao_left_class, int width, int height);
+%macro H2656_SAO_BAND_FILTER 3
+cglobal %1_sao_band_filter_%2_8, 6, 6, 15, 7*mmsi

[FFmpeg-devel] [PATCH 5/5] checkasm: add vvc_sao

2025-05-02 Thread Nuo Mi
From: Shaun Loo 

This is a part of Google Summer of Code 2023

AVX2:
 - vvc_sao.sao_band [OK]
 - vvc_sao.sao_edge [OK]
checkasm: all 54 tests passed
vvc_sao_band_8_8_c:157.4 ( 1.00x)
vvc_sao_band_8_8_avx2:  30.7 ( 5.12x)
vvc_sao_band_8_10_c:   119.4 ( 1.00x)
vvc_sao_band_8_10_avx2: 29.2 ( 4.09x)
vvc_sao_band_8_12_c:   144.6 ( 1.00x)
vvc_sao_band_8_12_avx2: 30.0 ( 4.82x)
vvc_sao_band_16_8_c:   446.5 ( 1.00x)
vvc_sao_band_16_8_avx2:103.3 ( 4.32x)
vvc_sao_band_16_10_c:  399.2 ( 1.00x)
vvc_sao_band_16_10_avx2:64.3 ( 6.21x)
vvc_sao_band_16_12_c:  472.9 ( 1.00x)
vvc_sao_band_16_12_avx2:56.5 ( 8.37x)
vvc_sao_band_32_8_c:  2430.9 ( 1.00x)
vvc_sao_band_32_8_avx2:203.3 (11.96x)
vvc_sao_band_32_10_c: 1405.7 ( 1.00x)
vvc_sao_band_32_10_avx2:   208.5 ( 6.74x)
vvc_sao_band_32_12_c: 2054.3 ( 1.00x)
vvc_sao_band_32_12_avx2:   213.0 ( 9.64x)
vvc_sao_band_48_8_c:  3835.4 ( 1.00x)
vvc_sao_band_48_8_avx2:604.2 ( 6.35x)
vvc_sao_band_48_10_c: 3624.6 ( 1.00x)
vvc_sao_band_48_10_avx2:   468.8 ( 7.73x)
vvc_sao_band_48_12_c: 3752.4 ( 1.00x)
vvc_sao_band_48_12_avx2:   477.5 ( 7.86x)
vvc_sao_band_64_8_c:  6061.1 ( 1.00x)
vvc_sao_band_64_8_avx2:803.9 ( 7.54x)
vvc_sao_band_64_10_c: 6142.5 ( 1.00x)
vvc_sao_band_64_10_avx2:   827.3 ( 7.43x)
vvc_sao_band_64_12_c: 6106.6 ( 1.00x)
vvc_sao_band_64_12_avx2:   839.9 ( 7.27x)
vvc_sao_band_80_8_c:  9478.0 ( 1.00x)
vvc_sao_band_80_8_avx2:   1516.7 ( 6.25x)
vvc_sao_band_80_10_c:10300.5 ( 1.00x)
vvc_sao_band_80_10_avx2:  1298.7 ( 7.93x)
vvc_sao_band_80_12_c: 8941.1 ( 1.00x)
vvc_sao_band_80_12_avx2:  1315.3 ( 6.80x)
vvc_sao_band_96_8_c: 13351.5 ( 1.00x)
vvc_sao_band_96_8_avx2:   1815.4 ( 7.35x)
vvc_sao_band_96_10_c:13197.5 ( 1.00x)
vvc_sao_band_96_10_avx2:  1872.4 ( 7.05x)
vvc_sao_band_96_12_c:11969.0 ( 1.00x)
vvc_sao_band_96_12_avx2:  1895.8 ( 6.31x)
vvc_sao_band_112_8_c:19936.9 ( 1.00x)
vvc_sao_band_112_8_avx2:  2802.3 ( 7.11x)
vvc_sao_band_112_10_c:   19534.9 ( 1.00x)
vvc_sao_band_112_10_avx2: 2635.0 ( 7.41x)
vvc_sao_band_112_12_c:   16520.6 ( 1.00x)
vvc_sao_band_112_12_avx2: 2591.8 ( 6.37x)
vvc_sao_band_128_8_c:25967.5 ( 1.00x)
vvc_sao_band_128_8_avx2:  3155.3 ( 8.23x)
vvc_sao_band_128_10_c:   24002.6 ( 1.00x)
vvc_sao_band_128_10_avx2: 3374.6 ( 7.11x)
vvc_sao_band_128_12_c:   20829.4 ( 1.00x)
vvc_sao_band_128_12_avx2: 3377.0 ( 6.17x)
vvc_sao_edge_8_8_c:174.6 ( 1.00x)
vvc_sao_edge_8_8_avx2:  37.0 ( 4.72x)
vvc_sao_edge_8_10_c:   174.4 ( 1.00x)
vvc_sao_edge_8_10_avx2: 58.5 ( 2.98x)
vvc_sao_edge_8_12_c:   171.1 ( 1.00x)
vvc_sao_edge_8_12_avx2: 58.5 ( 2.93x)
vvc_sao_edge_16_8_c:   677.7 ( 1.00x)
vvc_sao_edge_16_8_avx2: 72.2 ( 9.39x)
vvc_sao_edge_16_10_c:  724.8 ( 1.00x)
vvc_sao_edge_16_10_avx2:   106.4 ( 6.81x)
vvc_sao_edge_16_12_c:  647.0 ( 1.00x)
vvc_sao_edge_16_12_avx2:   106.6 ( 6.07x)
vvc_sao_edge_32_8_c:  3001.8 ( 1.00x)
vvc_sao_edge_32_8_avx2:157.6 (19.04x)
vvc_sao_edge_32_10_c: 3071.1 ( 1.00x)
vvc_sao_edge_32_10_

[FFmpeg-devel] [PATCH 4/5] x86/vvcdec: sao, add avx2 support

2025-05-02 Thread Nuo Mi
From: Shaun Loo 

This is a part of Google Summer of Code 2023

Co-authored-by: Nuo Mi 
---
 libavcodec/x86/h26x/h2656_sao.asm |   8 +--
 libavcodec/x86/vvc/Makefile   |   2 +
 libavcodec/x86/vvc/dsp_init.c |  41 +++
 libavcodec/x86/vvc/sao.asm|  73 +++
 libavcodec/x86/vvc/sao_10bit.asm  | 113 ++
 5 files changed, 233 insertions(+), 4 deletions(-)
 create mode 100644 libavcodec/x86/vvc/sao.asm
 create mode 100644 libavcodec/x86/vvc/sao_10bit.asm

diff --git a/libavcodec/x86/h26x/h2656_sao.asm b/libavcodec/x86/h26x/h2656_sao.asm
index 504fcb388b..a80ee26178 100644
--- a/libavcodec/x86/h26x/h2656_sao.asm
+++ b/libavcodec/x86/h26x/h2656_sao.asm
@@ -147,7 +147,7 @@ align 16
 %assign i i+mmsize
 %endrep
 
-%if %2 == 48
+%if %2 == 48 || %2 == 80 || %2 == 112
 INIT_XMM cpuname
 
 mova m13, [srcq + i]
@@ -160,7 +160,7 @@ INIT_XMM cpuname
 %if cpuflag(avx2)
 INIT_YMM cpuname
 %endif
-%endif ; %2 == 48
+%endif ; %2 == 48 || %2 == 80 || %2 == 112
 
 add dstq, dststrideq ; dst += dststride
 add srcq, srcstrideq ; src += srcstride
@@ -280,7 +280,7 @@ align 16
 %assign i i+mmsize
 %endrep
 
-%if %2 == 48
+%if %2 == 48 || %2 == 80 || %2 == 112
 INIT_XMM cpuname
 
 mova  m1, [srcq + i]
@@ -291,7 +291,7 @@ INIT_XMM cpuname
 %if cpuflag(avx2)
 INIT_YMM cpuname
 %endif
-%endif
+%endif ; %2 == 48 || %2 == 80 || %2 == 112
 
 add dstq, dststrideq
 add srcq, EDGE_SRCSTRIDE
diff --git a/libavcodec/x86/vvc/Makefile b/libavcodec/x86/vvc/Makefile
index 86a6c8ba7c..c426b156c1 100644
--- a/libavcodec/x86/vvc/Makefile
+++ b/libavcodec/x86/vvc/Makefile
@@ -8,4 +8,6 @@ X86ASM-OBJS-$(CONFIG_VVC_DECODER)  += x86/vvc/alf.o 
\
   x86/vvc/mc.o  \
   x86/vvc/of.o  \
   x86/vvc/sad.o \
+  x86/vvc/sao.o \
+  x86/vvc/sao_10bit.o   \
   x86/h26x/h2656_inter.o
diff --git a/libavcodec/x86/vvc/dsp_init.c b/libavcodec/x86/vvc/dsp_init.c
index bb68ba0b1e..cbcfa40a66 100644
--- a/libavcodec/x86/vvc/dsp_init.c
+++ b/libavcodec/x86/vvc/dsp_init.c
@@ -215,6 +215,44 @@ ALF_FUNCS(16, 12, avx2)
 
 #endif
 
+#define SAO_FILTER_FUNC(wd, bitd, opt) 
  \
+void ff_vvc_sao_band_filter_##wd##_##bitd##_##opt(uint8_t *_dst, const uint8_t 
*_src, ptrdiff_t _stride_dst, ptrdiff_t _stride_src,  \
+const int16_t *sao_offset_val, int sao_left_class, int width, int height); 
  \
+void ff_vvc_sao_edge_filter_##wd##_##bitd##_##opt(uint8_t *_dst, const uint8_t 
*_src, ptrdiff_t stride_dst,  \
+const int16_t *sao_offset_val, int eo, int width, int height); 
  \
+
+#define SAO_FILTER_FUNCS(bitd, opt) \
+SAO_FILTER_FUNC(8,   bitd, opt) \
+SAO_FILTER_FUNC(16,  bitd, opt) \
+SAO_FILTER_FUNC(32,  bitd, opt) \
+SAO_FILTER_FUNC(48,  bitd, opt) \
+SAO_FILTER_FUNC(64,  bitd, opt) \
+SAO_FILTER_FUNC(80,  bitd, opt) \
+SAO_FILTER_FUNC(96,  bitd, opt) \
+SAO_FILTER_FUNC(112, bitd, opt) \
+SAO_FILTER_FUNC(128, bitd, opt) \
+
+SAO_FILTER_FUNCS(8,  avx2)
+SAO_FILTER_FUNCS(10, avx2)
+SAO_FILTER_FUNCS(12, avx2)
+
+#define SAO_FILTER_INIT(type, bitd, opt) do {  
 \
+c->sao.type##_filter[0] = ff_vvc_sao_##type##_filter_8_##bitd##_##opt;\
+c->sao.type##_filter[1] = ff_vvc_sao_##type##_filter_16_##bitd##_##opt;   \
+c->sao.type##_filter[2] = ff_vvc_sao_##type##_filter_32_##bitd##_##opt;   \
+c->sao.type##_filter[3] = ff_vvc_sao_##type##_filter_48_##bitd##_##opt;   \
+c->sao.type##_filter[4] = ff_vvc_sao_##type##_filter_64_##bitd##_##opt;   \
+c->sao.type##_filter[5] = ff_vvc_sao_##type##_filter_80_##bitd##_##opt;   \
+c->sao.type##_filter[6] = ff_vvc_sao_##type##_filter_96_##bitd##_##opt;   \
+c->sao.type##_filter[7] = ff_vvc_sao_##type##_filter_112_##bitd##_##opt;  \
+c->sao.type##_filter[8] = ff_vvc_sao_##type##_filter_128_##bitd##_##opt;  \
+} while (0)
+
+#define SAO_INIT(bitd, opt) do { \
+SAO_FILTER_INIT(band, bitd, opt);\
+SAO_FILTER_INIT(edge, bitd, opt);\
+} while (0)
+
 #define AVG_INIT(bd, opt) do {   \
 c->inter.avg= bf(vvc_avg, bd, opt);  \
 c->inter.w_avg  = bf(vvc_w_avg, bd, opt);\
@@ -329,6 +367,7 @@ void ff_vvc_ds

[FFmpeg-devel] [PATCH 3/5] x86/hevcdec: refact, remove duplicate code in HEVC_SAO_{BAND, EDGE}_FILTER

2025-05-02 Thread Nuo Mi
From: Shaun Loo 

This is a part of Google Summer of Code 2023

Co-authored-by: Nuo Mi 
---
 libavcodec/x86/hevc/sao_10bit.asm | 100 ++
 1 file changed, 48 insertions(+), 52 deletions(-)

diff --git a/libavcodec/x86/hevc/sao_10bit.asm b/libavcodec/x86/hevc/sao_10bit.asm
index 77967db5e6..8173509d6f 100644
--- a/libavcodec/x86/hevc/sao_10bit.asm
+++ b/libavcodec/x86/hevc/sao_10bit.asm
@@ -28,18 +28,17 @@
 H2656_SAO_BAND_FILTER hevc, %1, %2, %3
 %endmacro
 
+%macro HEVC_SAO_BAND_FILTER_FUNCS 1
+HEVC_SAO_BAND_FILTER %1,  8, 1
+HEVC_SAO_BAND_FILTER %1, 16, 2
+HEVC_SAO_BAND_FILTER %1, 32, 4
+HEVC_SAO_BAND_FILTER %1, 48, 6
+HEVC_SAO_BAND_FILTER %1, 64, 8
+%endmacro
+
 %macro HEVC_SAO_BAND_FILTER_FUNCS 0
-HEVC_SAO_BAND_FILTER 10,  8, 1
-HEVC_SAO_BAND_FILTER 10, 16, 2
-HEVC_SAO_BAND_FILTER 10, 32, 4
-HEVC_SAO_BAND_FILTER 10, 48, 6
-HEVC_SAO_BAND_FILTER 10, 64, 8
-
-HEVC_SAO_BAND_FILTER 12,  8, 1
-HEVC_SAO_BAND_FILTER 12, 16, 2
-HEVC_SAO_BAND_FILTER 12, 32, 4
-HEVC_SAO_BAND_FILTER 12, 48, 6
-HEVC_SAO_BAND_FILTER 12, 64, 8
+HEVC_SAO_BAND_FILTER_FUNCS 10
+HEVC_SAO_BAND_FILTER_FUNCS 12
 %endmacro
 
 INIT_XMM sse2
@@ -48,54 +47,51 @@ INIT_XMM avx
 HEVC_SAO_BAND_FILTER_FUNCS
 
 %if HAVE_AVX2_EXTERNAL
-INIT_XMM avx2
-HEVC_SAO_BAND_FILTER 10,  8, 1
-INIT_YMM avx2
-HEVC_SAO_BAND_FILTER 10, 16, 1
-HEVC_SAO_BAND_FILTER 10, 32, 2
-HEVC_SAO_BAND_FILTER 10, 48, 3
-HEVC_SAO_BAND_FILTER 10, 64, 4
-
-INIT_XMM avx2
-HEVC_SAO_BAND_FILTER 12,  8, 1
-INIT_YMM avx2
-HEVC_SAO_BAND_FILTER 12, 16, 1
-HEVC_SAO_BAND_FILTER 12, 32, 2
-HEVC_SAO_BAND_FILTER 12, 48, 3
-HEVC_SAO_BAND_FILTER 12, 64, 4
+
+%macro HEVC_SAO_BAND_FILTER_FUNCS_AVX2 1
+INIT_XMM avx2
+HEVC_SAO_BAND_FILTER %1,  8, 1
+INIT_YMM avx2
+HEVC_SAO_BAND_FILTER %1, 16, 1
+HEVC_SAO_BAND_FILTER %1, 32, 2
+HEVC_SAO_BAND_FILTER %1, 48, 3
+HEVC_SAO_BAND_FILTER %1, 64, 4
+%endmacro
+
+HEVC_SAO_BAND_FILTER_FUNCS_AVX2 10
+HEVC_SAO_BAND_FILTER_FUNCS_AVX2 12
+
 %endif
 
 %macro HEVC_SAO_EDGE_FILTER 3
 H2656_SAO_EDGE_FILTER hevc, %1, %2, %3
 %endmacro
 
+%macro HEVC_SAO_EDGE_FILTER_FUNCS 1
+HEVC_SAO_EDGE_FILTER %1,  8, 1
+HEVC_SAO_EDGE_FILTER %1, 16, 2
+HEVC_SAO_EDGE_FILTER %1, 32, 4
+HEVC_SAO_EDGE_FILTER %1, 48, 6
+HEVC_SAO_EDGE_FILTER %1, 64, 8
+%endmacro
+
 INIT_XMM sse2
-HEVC_SAO_EDGE_FILTER 10,  8, 1
-HEVC_SAO_EDGE_FILTER 10, 16, 2
-HEVC_SAO_EDGE_FILTER 10, 32, 4
-HEVC_SAO_EDGE_FILTER 10, 48, 6
-HEVC_SAO_EDGE_FILTER 10, 64, 8
-
-HEVC_SAO_EDGE_FILTER 12,  8, 1
-HEVC_SAO_EDGE_FILTER 12, 16, 2
-HEVC_SAO_EDGE_FILTER 12, 32, 4
-HEVC_SAO_EDGE_FILTER 12, 48, 6
-HEVC_SAO_EDGE_FILTER 12, 64, 8
+HEVC_SAO_EDGE_FILTER_FUNCS 10
+HEVC_SAO_EDGE_FILTER_FUNCS 12
 
 %if HAVE_AVX2_EXTERNAL
-INIT_XMM avx2
-HEVC_SAO_EDGE_FILTER 10,  8, 1
-INIT_YMM avx2
-HEVC_SAO_EDGE_FILTER 10, 16, 1
-HEVC_SAO_EDGE_FILTER 10, 32, 2
-HEVC_SAO_EDGE_FILTER 10, 48, 3
-HEVC_SAO_EDGE_FILTER 10, 64, 4
-
-INIT_XMM avx2
-HEVC_SAO_EDGE_FILTER 12,  8, 1
-INIT_YMM avx2
-HEVC_SAO_EDGE_FILTER 12, 16, 1
-HEVC_SAO_EDGE_FILTER 12, 32, 2
-HEVC_SAO_EDGE_FILTER 12, 48, 3
-HEVC_SAO_EDGE_FILTER 12, 64, 4
+
+%macro HEVC_SAO_EDGE_FILTER_FUNCS_AVX2 1
+INIT_XMM avx2
+HEVC_SAO_EDGE_FILTER %1,  8, 1
+INIT_YMM avx2
+HEVC_SAO_EDGE_FILTER %1, 16, 1
+HEVC_SAO_EDGE_FILTER %1, 32, 2
+HEVC_SAO_EDGE_FILTER %1, 48, 3
+HEVC_SAO_EDGE_FILTER %1, 64, 4
+%endmacro
+
+HEVC_SAO_EDGE_FILTER_FUNCS_AVX2 10
+HEVC_SAO_EDGE_FILTER_FUNCS_AVX2 12
+
 %endif
-- 
2.34.1



[FFmpeg-devel] [PATCH 1/2] avformat/rtsp: set AVFMTCTX_UNSEEKABLE flag

2025-05-02 Thread Kaarle Ritvanen via ffmpeg-devel
for live RTP streams. Some external applications, such as Qt Multimedia,
depend on this flag being set correctly.

Signed-off-by: Kaarle Ritvanen 
---
 libavformat/rtsp.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/libavformat/rtsp.c b/libavformat/rtsp.c
index 5ea471b40c..e7c00c69fa 100644
--- a/libavformat/rtsp.c
+++ b/libavformat/rtsp.c
@@ -629,8 +629,8 @@ static void sdp_parse_line(AVFormatContext *s, SDPParseState *s1,
 rtsp_parse_range_npt(p, &start, &end);
 s->start_time = start;
 /* AV_NOPTS_VALUE means live broadcast (and can't seek) */
-s->duration   = (end == AV_NOPTS_VALUE) ?
-AV_NOPTS_VALUE : end - start;
+if (end != AV_NOPTS_VALUE)
+s->duration = end - start;
 } else if (av_strstart(p, "lang:", &p)) {
 if (s->nb_streams > 0) {
 get_word(buf1, sizeof(buf1), &p);
@@ -720,6 +720,8 @@ int ff_sdp_parse(AVFormatContext *s, const char *content)
 char buf[SDP_MAX_SIZE], *q;
 SDPParseState sdp_parse_state = { { 0 } }, *s1 = &sdp_parse_state;
 
+s->duration = AV_NOPTS_VALUE;
+
 p = content;
 for (;;) {
 p += strspn(p, SPACE_CHARS);
@@ -753,6 +755,9 @@ int ff_sdp_parse(AVFormatContext *s, const char *content)
 av_freep(&s1->default_exclude_source_addrs[i]);
 av_freep(&s1->default_exclude_source_addrs);
 
+if (s->duration == AV_NOPTS_VALUE)
+s->ctx_flags |= AVFMTCTX_UNSEEKABLE;
+
 return 0;
 }
 #endif /* CONFIG_RTPDEC */
-- 
2.49.0



[FFmpeg-devel] [PATCH 2/2] avformat/seek: fail seeking immediately

2025-05-02 Thread Kaarle Ritvanen via ffmpeg-devel
when AVFMTCTX_UNSEEKABLE is set. Depending on the codec, the execution
of this function may take several seconds. This is an optimization for
the case where the stream is already known unseekable.

Signed-off-by: Kaarle Ritvanen 
---
 libavformat/seek.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/libavformat/seek.c b/libavformat/seek.c
index c0d94371e6..1a7d3d6741 100644
--- a/libavformat/seek.c
+++ b/libavformat/seek.c
@@ -643,6 +643,9 @@ int av_seek_frame(AVFormatContext *s, int stream_index,
 {
 int ret;
 
+if (s->ctx_flags & AVFMTCTX_UNSEEKABLE)
+return AVERROR(ENOSYS);
+
 if (ffifmt(s->iformat)->read_seek2 && !ffifmt(s->iformat)->read_seek) {
 int64_t min_ts = INT64_MIN, max_ts = INT64_MAX;
 if ((flags & AVSEEK_FLAG_BACKWARD))
-- 
2.49.0
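
The effect of the early check can be illustrated with a self-contained
mock. SketchCtx, SKETCH_UNSEEKABLE, slow_seek() and sketch_seek() are
stand-ins invented for this example, not libavformat types or functions;
the sketch only assumes that the real flag is tested before any expensive
seeking work begins, as in the hunk above.

```c
#include <assert.h>
#include <errno.h>

/* Stand-in for AVFMTCTX_UNSEEKABLE; the value is arbitrary in this mock. */
#define SKETCH_UNSEEKABLE 0x0002

typedef struct SketchCtx {
    int ctx_flags;
    int slow_seek_calls; /* counts how often the expensive path ran */
} SketchCtx;

/* Placeholder for the generic seek path that may take seconds. */
static int slow_seek(SketchCtx *s, long long ts)
{
    (void)ts;
    s->slow_seek_calls++;
    return 0;
}

/* Fails immediately for contexts already known to be unseekable. */
static int sketch_seek(SketchCtx *s, long long ts)
{
    assert(s);
    if (s->ctx_flags & SKETCH_UNSEEKABLE)
        return -ENOSYS; /* negated errno, in the spirit of AVERROR(ENOSYS) */
    return slow_seek(s, ts);
}
```

A context flagged as unseekable gets -ENOSYS without touching the slow
path, while an unflagged one proceeds exactly as before.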



[FFmpeg-devel] [PATCH 1/5] x86/vvcdec: misc, reordered functions in dsp_init for improved readability

2025-05-02 Thread Nuo Mi
---
 libavcodec/x86/vvc/dsp_init.c | 48 +--
 1 file changed, 29 insertions(+), 19 deletions(-)

diff --git a/libavcodec/x86/vvc/dsp_init.c b/libavcodec/x86/vvc/dsp_init.c
index dc833bb0f1..bb68ba0b1e 100644
--- a/libavcodec/x86/vvc/dsp_init.c
+++ b/libavcodec/x86/vvc/dsp_init.c
@@ -215,6 +215,18 @@ ALF_FUNCS(16, 12, avx2)
 
 #endif
 
+#define AVG_INIT(bd, opt) do {   \
+c->inter.avg= bf(vvc_avg, bd, opt);  \
+c->inter.w_avg  = bf(vvc_w_avg, bd, opt);\
+} while (0)
+
+#define DMVR_INIT(bd) do {   \
+c->inter.dmvr[0][0]   = ff_vvc_dmvr_##bd##_avx2; \
+c->inter.dmvr[0][1]   = ff_vvc_dmvr_h_##bd##_avx2;   \
+c->inter.dmvr[1][0]   = ff_vvc_dmvr_v_##bd##_avx2;   \
+c->inter.dmvr[1][1]   = ff_vvc_dmvr_hv_##bd##_avx2;  \
+} while (0)
+
 #define PEL_LINK(dst, C, W, idx1, idx2, name, D, opt)  
\
 dst[C][W][idx1][idx2] = vvc_put_## name ## _ ## D ## _##opt;   
\
 dst ## _uni[C][W][idx1][idx2] = ff_h2656_put_uni_ ## name ## _ ## D ## 
_##opt; \
@@ -280,17 +292,8 @@ ALF_FUNCS(16, 12, avx2)
 MC_TAP_LINKS_16BPC_AVX2(LUMA,   8, bd);  \
 MC_TAP_LINKS_16BPC_AVX2(CHROMA, 4, bd);
 
-#define AVG_INIT(bd, opt) do {   \
-c->inter.avg= bf(vvc_avg, bd, opt);  \
-c->inter.w_avg  = bf(vvc_w_avg, bd, opt);\
-} while (0)
-
-#define DMVR_INIT(bd) do {   \
-c->inter.dmvr[0][0]   = ff_vvc_dmvr_##bd##_avx2; \
-c->inter.dmvr[0][1]   = ff_vvc_dmvr_h_##bd##_avx2;   \
-c->inter.dmvr[1][0]   = ff_vvc_dmvr_v_##bd##_avx2;   \
-c->inter.dmvr[1][1]   = ff_vvc_dmvr_hv_##bd##_avx2;  \
-} while (0)
+int ff_vvc_sad_avx2(const int16_t *src0, const int16_t *src1, int dx, int dy, int block_w, int block_h);
+#define SAD_INIT() c->inter.sad = ff_vvc_sad_avx2
 
 #define ALF_INIT(bd) do {\
 c->alf.filter[LUMA]   = vvc_alf_filter_luma_##bd##_avx2; \
@@ -298,8 +301,6 @@ ALF_FUNCS(16, 12, avx2)
 c->alf.classify   = vvc_alf_classify_##bd##_avx2;\
 } while (0)
 
-int ff_vvc_sad_avx2(const int16_t *src0, const int16_t *src1, int dx, int dy, int block_w, int block_h);
-#define SAD_INIT() c->inter.sad = ff_vvc_sad_avx2
 #endif
 
 
@@ -319,12 +320,15 @@ void ff_vvc_dsp_init_x86(VVCDSPContext *const c, const int bd)
 #endif
 #if HAVE_AVX2_EXTERNAL
 if (EXTERNAL_AVX2_FAST(cpu_flags)) {
-ALF_INIT(8);
+// inter
 AVG_INIT(8, avx2);
+DMVR_INIT(8);
 MC_LINKS_AVX2(8);
 OF_INIT(8);
-DMVR_INIT(8);
 SAD_INIT();
+
+// filter
+ALF_INIT(8);
 }
 #endif
 break;
@@ -336,13 +340,16 @@ void ff_vvc_dsp_init_x86(VVCDSPContext *const c, const int bd)
 #endif
 #if HAVE_AVX2_EXTERNAL
 if (EXTERNAL_AVX2_FAST(cpu_flags)) {
-ALF_INIT(10);
+// inter
 AVG_INIT(10, avx2);
+DMVR_INIT(10);
 MC_LINKS_AVX2(10);
 MC_LINKS_16BPC_AVX2(10);
 OF_INIT(10);
-DMVR_INIT(10);
 SAD_INIT();
+
+// filter
+ALF_INIT(10);
 }
 #endif
 break;
@@ -354,13 +361,16 @@ void ff_vvc_dsp_init_x86(VVCDSPContext *const c, const int bd)
 #endif
 #if HAVE_AVX2_EXTERNAL
 if (EXTERNAL_AVX2_FAST(cpu_flags)) {
-ALF_INIT(12);
+// inter
 AVG_INIT(12, avx2);
+DMVR_INIT(12);
 MC_LINKS_AVX2(12);
 MC_LINKS_16BPC_AVX2(12);
 OF_INIT(12);
-DMVR_INIT(12);
 SAD_INIT();
+
+// filter
+ALF_INIT(12);
 }
 #endif
 break;
-- 
2.34.1
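
For readers following the init-macro reshuffle in the patch above, here is a
minimal standalone sketch of the pattern DMVR_INIT relies on: token pasting on
the bit depth, wrapped in do { ... } while (0) so the macro behaves like a
single statement. Every Demo*/demo_* name below is hypothetical and not part of
FFmpeg; the "kernels" only return a tag so the wiring can be checked. The
[vertical][horizontal] index order mirrors the assignments in the patch.

```c
/* Hypothetical kernel signature; the real DMVR functions take pixel buffers. */
typedef int (*demo_dmvr_fn)(int x);

/* Hypothetical stand-in for the inter-prediction part of a DSP context. */
typedef struct DemoInterDSP {
    demo_dmvr_fn dmvr[2][2];    /* [vertical filter][horizontal filter] */
} DemoInterDSP;

/* Generate four dummy kernels per bit depth; each returns bd*10 + variant. */
#define DEMO_DMVR_FUNCS(bd)                                                   \
static int demo_dmvr_##bd##_ref(int x)    { return x + (bd) * 10 + 0; }       \
static int demo_dmvr_h_##bd##_ref(int x)  { return x + (bd) * 10 + 1; }       \
static int demo_dmvr_v_##bd##_ref(int x)  { return x + (bd) * 10 + 2; }       \
static int demo_dmvr_hv_##bd##_ref(int x) { return x + (bd) * 10 + 3; }

DEMO_DMVR_FUNCS(8)
DEMO_DMVR_FUNCS(10)

/* Same shape as DMVR_INIT: ##bd## selects the per-bit-depth symbol, and
 * do { } while (0) makes the four assignments act as one statement. */
#define DEMO_DMVR_INIT(c, bd) do {                      \
    (c)->dmvr[0][0] = demo_dmvr_##bd##_ref;             \
    (c)->dmvr[0][1] = demo_dmvr_h_##bd##_ref;           \
    (c)->dmvr[1][0] = demo_dmvr_v_##bd##_ref;           \
    (c)->dmvr[1][1] = demo_dmvr_hv_##bd##_ref;          \
} while (0)

static void demo_dsp_init(DemoInterDSP *c, int bd)
{
    /* Unbraced if/else compiles only because the macro is one statement. */
    if (bd == 8)
        DEMO_DMVR_INIT(c, 8);
    else if (bd == 10)
        DEMO_DMVR_INIT(c, 10);
}
```

Without the do { } while (0) wrapper, the unbraced if/else in demo_dsp_init
would either fail to compile or attach only the first assignment to the
condition, which is presumably why the init macros in the patch keep that form.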



Re: [FFmpeg-devel] [PATCH 5/5] checkasm: add vvc_sao

2025-05-02 Thread Martin Storsjö

On Fri, 2 May 2025, Nuo Mi wrote:


From: Shaun Loo 

This is a part of Google Summer of Code 2023

AVX2:
- vvc_sao.sao_band [OK]
- vvc_sao.sao_edge [OK]

Co-authored-by: Nuo Mi 
---
tests/checkasm/Makefile   |   2 +-
tests/checkasm/checkasm.c |   1 +
tests/checkasm/checkasm.h |   1 +
tests/checkasm/vvc_sao.c  | 161 ++
4 files changed, 164 insertions(+), 1 deletion(-)
create mode 100644 tests/checkasm/vvc_sao.c

diff --git a/tests/checkasm/Makefile b/tests/checkasm/Makefile
index 193c1e4633..fabbf595b4 100644
--- a/tests/checkasm/Makefile
+++ b/tests/checkasm/Makefile
@@ -47,7 +47,7 @@ AVCODECOBJS-$(CONFIG_V210_DECODER)  += v210dec.o
AVCODECOBJS-$(CONFIG_V210_ENCODER)  += v210enc.o
AVCODECOBJS-$(CONFIG_VORBIS_DECODER)+= vorbisdsp.o
AVCODECOBJS-$(CONFIG_VP9_DECODER)   += vp9dsp.o
-AVCODECOBJS-$(CONFIG_VVC_DECODER)   += vvc_alf.o vvc_mc.o
+AVCODECOBJS-$(CONFIG_VVC_DECODER)   += vvc_alf.o vvc_mc.o vvc_sao.o

CHECKASMOBJS-$(CONFIG_AVCODEC)  += $(AVCODECOBJS-yes)

diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
index 3bb82ed0e5..0734cd26bf 100644
--- a/tests/checkasm/checkasm.c
+++ b/tests/checkasm/checkasm.c
@@ -256,6 +256,7 @@ static const struct {
#if CONFIG_VVC_DECODER
{ "vvc_alf", checkasm_check_vvc_alf },
{ "vvc_mc",  checkasm_check_vvc_mc  },
+{ "vvc_sao", checkasm_check_vvc_sao },
#endif
#endif
#if CONFIG_AVFILTER
diff --git a/tests/checkasm/checkasm.h b/tests/checkasm/checkasm.h
index a6b5965e02..146bfdec35 100644
--- a/tests/checkasm/checkasm.h
+++ b/tests/checkasm/checkasm.h
@@ -149,6 +149,7 @@ void checkasm_check_videodsp(void);
void checkasm_check_vorbisdsp(void);
void checkasm_check_vvc_alf(void);
void checkasm_check_vvc_mc(void);
+void checkasm_check_vvc_sao(void);

struct CheckasmPerf;

diff --git a/tests/checkasm/vvc_sao.c b/tests/checkasm/vvc_sao.c
new file mode 100644
index 00..026078ff02
--- /dev/null
+++ b/tests/checkasm/vvc_sao.c
@@ -0,0 +1,161 @@
+/*
+ * Copyright (c) 2018 Yingming Fan 
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with FFmpeg; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include 
+
+#include "libavutil/intreadwrite.h"
+#include "libavutil/mem_internal.h"
+
+#include "libavcodec/vvc/dsp.h"
+#include "libavcodec/vvc/ctu.h"
+
+#include "checkasm.h"
+
+static const uint32_t pixel_mask[3] = { 0xffffffff, 0x03ff03ff, 0x0fff0fff };
+static const uint32_t sao_size[] = {8, 16, 32, 48, 64, 80, 96, 112, 128};
+
+#define SIZEOF_PIXEL ((bit_depth + 7) / 8)
+#define PIXEL_STRIDE (2*MAX_PB_SIZE + AV_INPUT_BUFFER_PADDING_SIZE) //same with sao_edge src_stride
+#define BUF_SIZE (PIXEL_STRIDE * (MAX_PB_SIZE+2) * 2) //+2 for top and bottom row, *2 for high bit depth
+#define OFFSET_THRESH (1 << (bit_depth - 5))
+#define OFFSET_LENGTH 5
+
+#define randomize_buffers(buf0, buf1, size) \
+do {\
+uint32_t mask = pixel_mask[(bit_depth - 8) >> 1];   \
+int k;  \
+for (k = 0; k < size; k += 4) { \
+uint32_t r = rnd() & mask;  \
+AV_WN32A(buf0 + k, r);  \
+AV_WN32A(buf1 + k, r);  \
+}   \
+} while (0)
+
+#define randomize_buffers2(buf, size)   \
+do {\
+uint32_t max_offset = OFFSET_THRESH;\
+int k;  \
+if (bit_depth == 8) {   \
+for (k = 0; k < size; k++) {\
+uint8_t r = rnd() % max_offset; \
+buf[k] = r; \
+}   \
+} else {\
+for (k = 0; k < size; k++) {\
+uint16_t r = rnd() % max_offset;\
+buf[k] = r; \
+}   \
+}