Re: [FFmpeg-devel] [IMPORTANT] AI written TLS Code in WHIP patch

2025-07-31 Thread Niklas Haas
On Tue, 29 Jul 2025 21:02:53 +0100 Kieran Kunhya wrote: > Hello, > > It seems there is strong evidence that AI wrote TLS code as part of the > WHIP patch. It goes without saying why this is bad. Further discussion > here: > https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/20053 > > This patch was pushe

Re: [FFmpeg-devel] Revert "avformat/tls_openssl: properly get new BIO index"

2025-07-30 Thread Niklas Haas
On Wed, 30 Jul 2025 23:39:18 +0200 Nicolas George wrote: > Kacper Michajlow (HE12025-07-30): > > Can we please find something more relevant to be upset about? > > Can you please stop behaving with such arrogance when somebody who has > 30 times as much experience as you in the project points you y

Re: [FFmpeg-devel] Revert "avformat/tls_openssl: properly get new BIO index"

2025-07-30 Thread Niklas Haas
On Wed, 30 Jul 2025 19:52:59 +0200 Nicolas George wrote: > Kacper Michajłow (HE12025-07-30): > > > Note that BIO_get_new_index() can only be used 127 times before it > > > returns an error. > > > > We cannot call it repeatedly, because it will fail eventually. > > > > To my understanding the index

Re: [FFmpeg-devel] Again pre-multiplied alpha

2025-07-29 Thread Niklas Haas
On Mon, 28 Jul 2025 16:18:29 +0200 Nicolas George wrote: > Niklas Haas (HE12025-07-24): > > On what component are you missing an error here? > > Recently I wrote: “stacking images with different kind of alpha or > sending this kind of frames to a muxer with uncoded frames

Re: [FFmpeg-devel] [PATCH] web: announce code.ffmpeg.org

2025-07-27 Thread Niklas Haas
On Wed, 23 Jul 2025 01:04:34 +0200 Michael Niedermayer wrote: > The announcement should probably mention that performance, as in number of > submissions / percentage of applied / not reviewed patches will be > monitored compared to the mailing list. https://en.wikipedia.org/wiki/Goodhart%27s_law

Re: [FFmpeg-devel] Again pre-multiplied alpha

2025-07-24 Thread Niklas Haas
On Thu, 24 Jul 2025 16:59:24 +0200 Nicolas George wrote: > Right now, I am not demanding negotiation, I am just requiring > protection: > > if (frame->is_premultiplied && !out->supports_premultiplied) { > av_log(ctx, AV_LOG_ERROR, "Your data is about to be > corrupted bec

Re: [FFmpeg-devel] [PATCH v2 03/18] avfilter/vf_showinfo: print alpha mode when relevant

2025-07-24 Thread Niklas Haas
On Wed, 23 Jul 2025 18:11:23 +0200 Kacper Michajlow wrote: > On Wed, 23 Jul 2025 at 15:57, Niklas Haas wrote: > > > > From: Niklas Haas > > > > --- > > libavfilter/vf_showinfo.c | 8 > > 1 file changed, 8 insertions(+) > > > > d

Re: [FFmpeg-devel] [PATCH v2 11/18] avfilter/vf_scale: don't ignore incoming chroma location

2025-07-24 Thread Niklas Haas
On Wed, 23 Jul 2025 15:47:14 +0200 Niklas Haas wrote: > From: Niklas Haas > > This filter was, for some reason, always ignoring the incoming chroma > location in favor of the user-specified value, even when that value was set > to the default (unspecified). > > This has be

Re: [FFmpeg-devel] [PATCH v2 10/18] avcodec/jpegxl: parse and signal correct alpha mode

2025-07-24 Thread Niklas Haas
On Wed, 23 Jul 2025 18:19:03 +0200 Kacper Michajlow wrote: > On Wed, 23 Jul 2025 at 15:57, Niklas Haas wrote: > > > > From: Niklas Haas > > > > This header bit ("alpha_associated") was incorrectly ignored. > > --- > > libavcodec/jpegxl_parse.c

Re: [FFmpeg-devel] Again pre-multiplied alpha

2025-07-24 Thread Niklas Haas
On Wed, 23 Jul 2025 19:02:06 +0200 Nicolas George wrote: > Niklas Haas (HE12025-07-23): > > [PATCH v2 05/18] avcodec/encode: enforce alpha mode compatibility at encode > > time > > That handles it for encoders, I suppose. But I do not see anything > protecting you f

Re: [FFmpeg-devel] Again pre-multiplied alpha

2025-07-23 Thread Niklas Haas
On Wed, 23 Jul 2025 16:11:45 +0200 Nicolas George wrote: > Niklas Haas (HE12025-07-23): > > Changes since v1: > > - Correctly implement alpha mode tagging for JPEG XL > > - Set correct alpha mode for OpenEXR (which is always premultiplied) > > - Ensure -alpha_mode sp

Re: [FFmpeg-devel] FFmpeg 8.0 Release

2025-07-23 Thread Niklas Haas
On Wed, 23 Jul 2025 16:01:14 +0200 Niklas Haas wrote: > On Wed, 23 Jul 2025 13:43:43 +0200 Michael Niedermayer > wrote: > > Hi everyone > > > > I intend to create the release/8.0 branch in the next 1-2 weeks > > after that I intend to make the 8.0 release in the

Re: [FFmpeg-devel] FFmpeg 8.0 Release

2025-07-23 Thread Niklas Haas
On Wed, 23 Jul 2025 13:43:43 +0200 Michael Niedermayer wrote: > Hi everyone > > I intend to create the release/8.0 branch in the next 1-2 weeks > after that I intend to make the 8.0 release in the following 1-2 weeks > > If there's something you want in it, make sure it's pushed before the branch >

[FFmpeg-devel] [PATCH v2 05/18] avcodec/encode: enforce alpha mode compatibility at encode time

2025-07-23 Thread Niklas Haas
From: Niklas Haas Error out if trying to encode frames with an incompatible alpha mode. --- libavcodec/encode.c | 27 +++ 1 file changed, 27 insertions(+) diff --git a/libavcodec/encode.c b/libavcodec/encode.c index 38833c566c..9012708130 100644 --- a/libavcodec

[FFmpeg-devel] [PATCH v2 13/18] fftools/ffmpeg_enc: forward frame alpha mode to encoder

2025-07-23 Thread Niklas Haas
From: Niklas Haas --- fftools/ffmpeg_enc.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/fftools/ffmpeg_enc.c b/fftools/ffmpeg_enc.c index 4568c15073..babfca6c0a 100644 --- a/fftools/ffmpeg_enc.c +++ b/fftools/ffmpeg_enc.c @@ -287,6 +287,17 @@ int enc_open(void *opaque, const

[FFmpeg-devel] [PATCH v2 18/18] avfilter/vf_libplacebo: add an alpha_mode setting

2025-07-23 Thread Niklas Haas
From: Niklas Haas Chooses the desired output alpha mode. Note that this depends on an upstream version of libplacebo new enough to respect the corresponding AVFrame field in pl_map_avframe_ex. --- doc/filters.texi| 4 libavfilter/vf_libplacebo.c | 13 + 2 files

[FFmpeg-devel] [PATCH v2 14/18] avfilter/vf_premultiply: tag correct alpha mode on result

2025-07-23 Thread Niklas Haas
From: Niklas Haas --- libavfilter/vf_premultiply.c | 8 1 file changed, 8 insertions(+) diff --git a/libavfilter/vf_premultiply.c b/libavfilter/vf_premultiply.c index 322fc39094..1c08cf524a 100644 --- a/libavfilter/vf_premultiply.c +++ b/libavfilter/vf_premultiply.c @@ -512,6 +512,7

[FFmpeg-devel] [PATCH v2 17/18] avfilter/vf_setparams: add alpha_mode parameter

2025-07-23 Thread Niklas Haas
From: Niklas Haas --- doc/filters.texi | 13 + libavfilter/vf_setparams.c | 10 ++ 2 files changed, 23 insertions(+) diff --git a/doc/filters.texi b/doc/filters.texi index fbf8aff382..13ca065f85 100644 --- a/doc/filters.texi +++ b/doc/filters.texi @@ -21602,6

[FFmpeg-devel] [PATCH v2 15/18] avfilter/vf_alphamerge: tag correct alpha mode

2025-07-23 Thread Niklas Haas
From: Niklas Haas --- libavfilter/vf_alphamerge.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/libavfilter/vf_alphamerge.c b/libavfilter/vf_alphamerge.c index f5779484a9..98c61e282d 100644 --- a/libavfilter/vf_alphamerge.c +++ b/libavfilter/vf_alphamerge.c @@ -85,6 +85,8 @@ static int

[FFmpeg-devel] [PATCH v2 16/18] avfilter/vf_overlay: respect alpha mode tagging by default

2025-07-23 Thread Niklas Haas
From: Niklas Haas --- doc/filters.texi | 3 +- libavfilter/vf_overlay.c | 199 -- libavfilter/vf_overlay.h | 4 +- libavfilter/x86/vf_overlay_init.c | 8 +- 4 files changed, 116 insertions(+), 98 deletions(-) diff --git a/doc

[FFmpeg-devel] [PATCH v2 12/18] fftools/ffmpeg_enc: don't ignore user selected chroma location

2025-07-23 Thread Niklas Haas
From: Niklas Haas This code always ignored the user-provided enc_ctx->chroma_sample_location in favor of the location tagged on the frame. This leads to a very (IMHO) unexpected outcome where -chroma_sample_location works differently from the related options like -colorspace and -color_ra

[FFmpeg-devel] [PATCH v2 09/18] avcodec/libjxlenc: also attach extra channel info

2025-07-23 Thread Niklas Haas
From: Niklas Haas Works around a bug where older versions of libjxl don't correctly forward the alpha channel information to the extra channel info. --- libavcodec/libjxlenc.c | 13 + 1 file changed, 13 insertions(+) diff --git a/libavcodec/libjxlenc.c b/libavcodec/libjxl

[FFmpeg-devel] [PATCH v2 06/18] avcodec/png: set correct alpha mode

2025-07-23 Thread Niklas Haas
From: Niklas Haas PNG always uses straight alpha. cf. https://www.w3.org/TR/PNG-Rationale.html > Although each form of alpha storage has its advantages, we did not want to > require all PNG viewers to handle both forms. We standardized on non- > premultiplied alpha as being the los

[FFmpeg-devel] [PATCH v2 07/18] avcodec/exr: set correct alpha mode

2025-07-23 Thread Niklas Haas
From: Niklas Haas OpenEXR always uses premultiplied alpha, as per the spec. cf. https://openexr.com/en/latest/TechnicalIntroduction.html > By convention, all color channels are premultiplied by alpha, so that > `foreground + (1-alpha) x background` performs a correct “over” operation.
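The "over" formula quoted from the OpenEXR documentation can be sketched for a single 8-bit channel as below. This is only an illustration of the arithmetic, not code from the patch; the function names, the rounding, and the clamp are assumptions made for the example.

```c
#include <stdint.h>

/* Premultiplied ("associated") alpha: the foreground colour has already
 * been multiplied by alpha, so
 *     out = fg + (1 - alpha) * bg
 * with alpha scaled to the range [0, 255]. Hypothetical helper. */
uint8_t over_premul8(uint8_t fg, uint8_t bg, uint8_t alpha)
{
    unsigned t = (255u - alpha) * bg;     /* (1 - alpha) * bg, scaled by 255 */
    unsigned r = fg + (t + 127u) / 255u;  /* rounded division by 255 */
    return r > 255u ? 255u : (uint8_t)r;  /* clamp if fg was not truly premultiplied */
}

/* Straight ("independent") alpha, as used by PNG: the multiplication by
 * alpha happens at composite time,
 *     out = alpha * fg + (1 - alpha) * bg. Hypothetical helper. */
uint8_t over_straight8(uint8_t fg, uint8_t bg, uint8_t alpha)
{
    unsigned t = (unsigned)alpha * fg + (255u - alpha) * bg;
    return (uint8_t)((t + 127u) / 255u);
}
```

Feeding premultiplied data to the straight formula (or the reverse) applies alpha twice or not at all, which is the kind of mismatch the alpha_mode tagging in this series is meant to make detectable.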

[FFmpeg-devel] [PATCH v2 08/18] avcodec/libjxl: set correct alpha mode

2025-07-23 Thread Niklas Haas
From: Niklas Haas JPEG XL supports both premultiplied and straight alpha, and the basic info struct contains signalling for this. Forward the correct tagging on decode and encode. --- libavcodec/libjxldec.c | 6 ++ libavcodec/libjxlenc.c | 15 +++ 2 files changed, 21 insertions

[FFmpeg-devel] [PATCH v2 10/18] avcodec/jpegxl: parse and signal correct alpha mode

2025-07-23 Thread Niklas Haas
From: Niklas Haas This header bit ("alpha_associated") was incorrectly ignored. --- libavcodec/jpegxl_parse.c | 7 +-- libavcodec/jpegxl_parse.h | 1 + libavcodec/jpegxl_parser.c | 5 + 3 files changed, 11 insertions(+), 2 deletions(-) diff --git a/libavcodec/jpegxl

[FFmpeg-devel] [PATCH v2 11/18] avfilter/vf_scale: don't ignore incoming chroma location

2025-07-23 Thread Niklas Haas
From: Niklas Haas This filter was, for some reason, always ignoring the incoming chroma location in favor of the user-specified value, even when that value was set to the default (unspecified). This has been the status quo for quite some time, although commit 04ce01df0bb made the situation

[FFmpeg-devel] [PATCH v2 04/18] avcodec/avcodec: add AVCodecContext.alpha_mode

2025-07-23 Thread Niklas Haas
From: Niklas Haas Following in the footsteps of the previous commit, this commit adds the new fields to AVCodecContext so we can start properly setting it on codecs, as well as limiting the list of supported options to detect a format mismatch during encode. This commit also sets up the

[FFmpeg-devel] [PATCH v2 03/18] avfilter/vf_showinfo: print alpha mode when relevant

2025-07-23 Thread Niklas Haas
From: Niklas Haas --- libavfilter/vf_showinfo.c | 8 1 file changed, 8 insertions(+) diff --git a/libavfilter/vf_showinfo.c b/libavfilter/vf_showinfo.c index c706d00c96..b564d03a84 100644 --- a/libavfilter/vf_showinfo.c +++ b/libavfilter/vf_showinfo.c @@ -887,6 +887,14 @@ static int

[FFmpeg-devel] [PATCH v2 01/18] avutil/frame: add AVFrame.alpha_mode

2025-07-23 Thread Niklas Haas
From: Niklas Haas FFmpeg currently handles alpha in a quasi-arbitrary way. Some filters/codecs assume alpha is premultiplied, others assume it is independent. If there is to be any hope for order in this chaos, we need to start by defining an enum for the possible range of values. --- doc

[FFmpeg-devel] (no subject)

2025-07-23 Thread Niklas Haas
Changes since v1: - Correctly implement alpha mode tagging for JPEG XL - Set correct alpha mode for OpenEXR (which is always premultiplied) - Ensure -alpha_mode specified on the command line correctly propagates - Print out a warning when overriding the alpha mode explicitly Includes an unrelated

Re: [FFmpeg-devel] [PATCH] avfilter: add inverse tone mapping

2025-07-23 Thread Niklas Haas
On Wed, 23 Jul 2025 05:30:57 + Sarthak Indurkhya via ffmpeg-devel wrote: > Thank you for the thoughtful feedback. > > Advantages over vf_libplacebo’s inverse tone mapping: > > 1. Algorithmic Differentiation: > My filter is based on a novel local adaptation + inverse tone mapping > strateg

[FFmpeg-devel] [PATCH] avfilter/vf_libplacebo: composite multiple inputs in linear light

2025-07-22 Thread Niklas Haas
From: Niklas Haas This gives vastly better blending results than blending directly in the desired output colorspace. Overridable by the existing "disable_linear" option. This is functionally similar to combining multiple "libplacebo" filters, but does not rely o

[FFmpeg-devel] [PATCH] avfilter/vf_premultiply: use correct premultiplication formula

2025-07-22 Thread Niklas Haas
From: Niklas Haas The previous formula was introduced without justification in 6e713841e8, and the only thing Paul had to say about it over IRC was that it was copied from an unspecified source on the internet. I decided to do some testing and came to the conclusion that this term not only

Re: [FFmpeg-devel] [PATCH v3 1/3] avfilter/vf_colordetect: add new color range detection filter

2025-07-18 Thread Niklas Haas
On Fri, 18 Jul 2025 14:38:04 +0200 Kacper Michajlow wrote: > > +static inline int ff_detect_range_c(const uint8_t *data, ptrdiff_t stride, > > +ptrdiff_t width, ptrdiff_t height, > > +int mpeg_min, int mpeg_max) > > +{ > > +

Re: [FFmpeg-devel] [PATCH v3 1/3] avfilter/vf_colordetect: add new color range detection filter

2025-07-18 Thread Niklas Haas
On Fri, 18 Jul 2025 11:57:14 +0200 Niklas Haas wrote: > From: Niklas Haas > > This filter can detect various properties about the image, including > whether or not there are out-of-range values, or whether the input appears > to use straight or premultiplied alpha. > > Of c

Re: [FFmpeg-devel] [PATCH] avfilter/vf_blackdetect: Fix header guard

2025-07-18 Thread Niklas Haas
On Fri, 18 Jul 2025 19:32:42 +0800 Zhao Zhili wrote: > From: Zhao Zhili > > Fix fate-source failure. > --- > libavfilter/vf_blackdetect.h | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/libavfilter/vf_blackdetect.h b/libavfilter/vf_blackdetect.h > index 361da2c5bc..2

[FFmpeg-devel] [PATCH v3 3/3] avfilter/vf_colordetect: add x86 SIMD implementation

2025-07-18 Thread Niklas Haas
From: Niklas Haas alphadetect8_full_c: 5658.2 ( 1.00x) alphadetect8_full_avx2:215.1 (26.31x) alphadetect8_full_avx512: 133.5 (42.40x) alphadetect8_limited_c: 7391.5 ( 1.00x

[FFmpeg-devel] [PATCH v3 2/3] tests/checkasm: add check for vf_colordetect

2025-07-18 Thread Niklas Haas
From: Niklas Haas --- tests/checkasm/Makefile | 1 + tests/checkasm/checkasm.c | 3 + tests/checkasm/checkasm.h | 1 + tests/checkasm/vf_colordetect.c | 139 tests/fate/checkasm.mak | 1 + 5 files changed, 145 insertions

[FFmpeg-devel] [PATCH v3 1/3] avfilter/vf_colordetect: add new color range detection filter

2025-07-18 Thread Niklas Haas
From: Niklas Haas This filter can detect various properties about the image, including whether or not there are out-of-range values, or whether the input appears to use straight or premultiplied alpha. Of course, these can only be heuristics, with "undetermined" as the base case. Wh

Re: [FFmpeg-devel] [PATCH v2 3/3] avfilter/vf_colordetect: add x86 SIMD implementation

2025-07-18 Thread Niklas Haas
On Thu, 17 Jul 2025 11:41:56 +0200 Niklas Haas wrote: > On Wed, 16 Jul 2025 17:25:12 -0300 James Almer wrote: > > On 7/16/2025 1:24 PM, Niklas Haas wrote: > > > +cglobal detect_alpha%1_%3, 6, 7, 6, color, color_stride, alpha, > > > alpha_stride, width, heigh

[FFmpeg-devel] [PATCH v2 2/2] tests/checkasm: add test for vf_blackdetect

2025-07-17 Thread Niklas Haas
From: Niklas Haas --- tests/checkasm/Makefile | 1 + tests/checkasm/checkasm.c | 3 ++ tests/checkasm/checkasm.h | 1 + tests/checkasm/vf_blackdetect.c | 69 + tests/fate/checkasm.mak | 1 + 5 files changed, 75 insertions

[FFmpeg-devel] [PATCH v2 1/2] avfilter/vf_blackdetect: add AVX2 SIMD version

2025-07-17 Thread Niklas Haas
From: Niklas Haas Requested by a user. Even with autovectorization enabled, the compiler performs a quite poor job of optimizing this function, due to not being able to take advantage of the pmaxub + pcmpeqb trick for counting the number of pixels less than or equal-to a threshold
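The pmaxub + pcmpeqb trick mentioned above rests on the identity max(p, t) == t exactly when p <= t. A scalar model of that identity is sketched below; it is not the AVX2 code from the patch, and the function name is hypothetical. In the SIMD version each step would act on a whole vector of bytes at once.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model of counting bytes that are <= threshold.
 * pmaxub computes an unsigned per-byte maximum, pcmpeqb a per-byte
 * equality test; here both are written out for a single byte. */
unsigned count_le_threshold(const uint8_t *src, size_t n, uint8_t threshold)
{
    unsigned count = 0;
    for (size_t i = 0; i < n; i++) {
        uint8_t m = src[i] > threshold ? src[i] : threshold; /* pmaxub step */
        count += (m == threshold);  /* pcmpeqb step: true iff src[i] <= threshold */
    }
    return count;
}
```

The detour through max is needed because x86 has no unsigned byte "less than or equal" comparison before AVX-512, only signed greater-than and equality.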

Re: [FFmpeg-devel] [PATCH v2 3/3] avfilter/vf_colordetect: add x86 SIMD implementation

2025-07-17 Thread Niklas Haas
On Wed, 16 Jul 2025 17:25:12 -0300 James Almer wrote: > On 7/16/2025 1:24 PM, Niklas Haas wrote: > > +cglobal detect_alpha%1_%3, 6, 7, 6, color, color_stride, alpha, > > alpha_stride, width, height, x > > +pxor m0, m0 > > +add colorq, widthq > > +

Re: [FFmpeg-devel] [PATCH v2 3/3] avfilter/vf_colordetect: add x86 SIMD implementation

2025-07-17 Thread Niklas Haas
On Wed, 16 Jul 2025 22:06:28 +0200 Henrik Gramner via ffmpeg-devel wrote: > On Wed, Jul 16, 2025 at 6:26 PM Niklas Haas wrote: > > +cglobal detect_range%1, 6, 7, 5, data, stride, width, height, mpeg_min, > > mpeg_max, x > > +movd xm0, mpeg_mind > &g

Re: [FFmpeg-devel] [PATCH v2 3/3] avfilter/vf_colordetect: add x86 SIMD implementation

2025-07-16 Thread Niklas Haas
On Wed, 16 Jul 2025 17:25:12 -0300 James Almer wrote: > On 7/16/2025 1:24 PM, Niklas Haas wrote: > > +cglobal detect_alpha%1_%3, 6, 7, 6, color, color_stride, alpha, > > alpha_stride, width, height, x > > +pxor m0, m0 > > +add colorq, widthq > > +

[FFmpeg-devel] [PATCH v2 3/3] avfilter/vf_colordetect: add x86 SIMD implementation

2025-07-16 Thread Niklas Haas
From: Niklas Haas alphadetect8_full_c: 5658.2 ( 1.00x) alphadetect8_full_avx2:215.1 (26.31x) alphadetect8_full_avx512: 133.5 (42.40x) alphadetect8_limited_c: 7391.5 ( 1.00x

[FFmpeg-devel] [PATCH v2 1/3] avfilter/vf_colordetect: add new color range detection filter

2025-07-16 Thread Niklas Haas
From: Niklas Haas This filter can detect various properties about the image, including whether or not there are out-of-range values, or whether the input appears to use straight or premultiplied alpha. Of course, these can only be heuristics, with "undetermined" as the base case. Wh

[FFmpeg-devel] [PATCH v2 2/3] tests/checkasm: add check for vf_colordetect

2025-07-16 Thread Niklas Haas
From: Niklas Haas --- tests/checkasm/Makefile | 1 + tests/checkasm/checkasm.c | 3 + tests/checkasm/checkasm.h | 1 + tests/checkasm/vf_colordetect.c | 137 4 files changed, 142 insertions(+) create mode 100644 tests/checkasm

[FFmpeg-devel] [PATCH v2 0/3] avfilter: add vf_colordetect filter

2025-07-16 Thread Niklas Haas
Changes since v1: - Fix overflow in both C and x86 code for 16-bit depth - Fix SIMD overflow for 8-bit depth - Improve checkasm test to try and force overflow conditions ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/lis

Re: [FFmpeg-devel] [PATCH 1/3] avfilter/vf_colordetect: add new color range detection filter

2025-07-16 Thread Niklas Haas
On Wed, 16 Jul 2025 17:25:49 +0200 Niklas Haas wrote: > From: Niklas Haas > > This filter can detect various properties about the image, including > whether or not there are out-of-range values, or whether the input appears > to use straight or premultiplied alpha. > > Of c

[FFmpeg-devel] [PATCH 3/3] avfilter/vf_colordetect: add x86 SIMD implementation

2025-07-16 Thread Niklas Haas
From: Niklas Haas alphadetect8_full_c: 6334.7 ( 1.00x) alphadetect8_full_avx2:208.1 (30.44x) alphadetect8_full_avx512: 123.3 (51.39x) alphadetect8_limited_c:645.3 ( 1.00x

[FFmpeg-devel] [PATCH 1/3] avfilter/vf_colordetect: add new color range detection filter

2025-07-16 Thread Niklas Haas
From: Niklas Haas This filter can detect various properties about the image, including whether or not there are out-of-range values, or whether the input appears to use straight or premultiplied alpha. Of course, these can only be heuristics, with "undetermined" as the base case. Wh

[FFmpeg-devel] [PATCH 2/3] tests/checkasm: add check for vf_colordetect

2025-07-16 Thread Niklas Haas
From: Niklas Haas --- tests/checkasm/Makefile | 1 + tests/checkasm/checkasm.c | 3 + tests/checkasm/checkasm.h | 1 + tests/checkasm/vf_colordetect.c | 129 4 files changed, 134 insertions(+) create mode 100644 tests/checkasm

Re: [FFmpeg-devel] [PATCH] avfilter/vf_blackdetect: add AVX2 SIMD version

2025-07-16 Thread Niklas Haas
On Thu, 10 Jul 2025 17:10:42 +0200 Niklas Haas wrote: > From: Niklas Haas > > Requested by a user. Even with autovectorization enabled, the compiler > performs a quite poor job of optimizing this function, due to not being > able to take advantage of the pmaxub + pcmpeqb trick f

Re: [FFmpeg-devel] [PATCH] avfilter/vf_blackdetect: add AVX2 SIMD version

2025-07-16 Thread Niklas Haas
On Thu, 10 Jul 2025 17:10:42 +0200 Niklas Haas wrote: > From: Niklas Haas > > Requested by a user. Even with autovectorization enabled, the compiler > performs a quite poor job of optimizing this function, due to not being > able to take advantage of the pmaxub + pcmpeqb trick f

Re: [FFmpeg-devel] [PATCH] avfilter/vf_thumbnail: unroll and use multiple histograms

2025-07-16 Thread Niklas Haas
On Sat, 12 Jul 2025 13:38:37 +0200 Niklas Haas wrote: > From: Niklas Haas > > This naive hist[p[x]]++ loop suffers badly when there are large regions of > identical values in the image, because of store-to-load forwarding delay. > > Splitting up the histogram into four "p

Re: [FFmpeg-devel] [PATCH 1/4] avfilter/scene_sad: pass true depth to ff_scene_sad_get_fn()

2025-07-16 Thread Niklas Haas
On Sat, 12 Jul 2025 11:22:40 +0200 Niklas Haas wrote: > From: Niklas Haas > > I need to be able to distinguish between 10/12/14 and 16 bit depths, for > overflow reasons. Merging this series soon.

[FFmpeg-devel] [PATCH] avutil/hwcontext_vulkan: don't over-map buffers with prior padding

2025-07-15 Thread Niklas Haas
From: Niklas Haas If the image data is not at the start of the buffer allocation, such as when the buffer has padding before the image data, this function maps too much memory, since src_data + src_buf->size exceeds the buffer size. Fix this by subtracting the difference between the buf
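The size arithmetic behind this fix can be sketched as follows. The snippet is cut off before the patch spells out its exact expression, so the helper below is an assumption: it takes the allocation's base pointer and size plus the image data pointer, and returns how many bytes remain from the image data to the end of the allocation. The name and signature are hypothetical and are not part of the Vulkan hwcontext code.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes that can be mapped starting at `data` without running past the
 * end of the allocation [buf_data, buf_data + buf_size).
 * Assumes buf_data <= data < buf_data + buf_size. Hypothetical helper. */
size_t mappable_size(const uint8_t *buf_data, size_t buf_size,
                     const uint8_t *data)
{
    size_t offset = (size_t)(data - buf_data); /* padding before the image data */
    return buf_size - offset;                  /* using buf_size here would over-map by `offset` */
}
```

For a 64-byte allocation with 16 bytes of leading padding, this yields 48 rather than 64.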

Re: [FFmpeg-devel] [PATCH v8 14/18] swscale/ops_memcpy: add 'memcpy' backend for plane->plane copies

2025-07-14 Thread Niklas Haas
On Sun, 13 Jul 2025 19:04:21 +0200 Alexander Strasser via ffmpeg-devel wrote: > On 2025-07-12 12:44 +0200, Niklas Haas wrote: > > From: Niklas Haas > > > > Provides a generic fast path for any operation list that can be decomposed > > into a series of memcpy and me

Re: [FFmpeg-devel] [PATCH v8 13/18] swscale/ops_backend: add reference backend basend on C templates

2025-07-14 Thread Niklas Haas
On Sun, 13 Jul 2025 22:14:44 +0200 Andreas Rheinhardt wrote: > Niklas Haas: > > From: Niklas Haas > > > > This will serve as a reference for the SIMD backends to come. That said, > > with auto-vectorization enabled, the performance of this is not atrocious. > &g

Re: [FFmpeg-devel] [PATCH v8 06/18] swscale: add SWS_UNSTABLE flag

2025-07-14 Thread Niklas Haas
On Sun, 13 Jul 2025 22:05:21 +0200 Andreas Rheinhardt wrote: > This patchset adds 214992B of .text, 6304B of .rodata, 42176B of > .data.rel.ro for an opt-in feature. There should be a configure option > to disable the new additions (i.e. the rest of the patchset) from being > compiled in for user

Re: [FFmpeg-devel] [PATCH v8 07/18] swscale/ops: introduce new low level framework

2025-07-14 Thread Niklas Haas
On Sun, 13 Jul 2025 20:25:18 +0200 Andreas Rheinhardt wrote: > Niklas Haas: > > From: Niklas Haas > > > > See docs/swscale-v2.txt for an in-depth introduction to the new approach. > > > > This commit merely introduces the ops definitions and boilerplate func

Re: [FFmpeg-devel] [PATCH v2 01/13] vf_libplacebo: add support for specifying a LUT for the input

2025-07-13 Thread Niklas Haas
On Sun, 13 Jul 2025 03:51:10 +0900 Lynne wrote: > This makes it possible to apply Adobe .cube files to inputs. > --- > doc/filters.texi| 30 ++ > libavfilter/vf_libplacebo.c | 36 > 2 files changed, 66 insertions(+) > >

[FFmpeg-devel] [PATCH] avfilter/vf_thumbnail: unroll and use multiple histograms

2025-07-12 Thread Niklas Haas
From: Niklas Haas This naive hist[p[x]]++ loop suffers badly when there are large regions of identical values in the image, because of store-to-load forwarding delay. Splitting up the histogram into four "parallel" histograms and processing them one at a time speeds things up sig
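One way to picture the split-histogram idea is the sketch below. It is not the code from the patch: the function name, the unroll factor of four across adjacent pixels, and the final merge step are assumptions made for illustration. The point is that consecutive increments land in different tables, so a run of identical pixel values no longer forms a chain of dependent read-modify-write operations on one counter.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Build a 256-bin histogram of `n` bytes using four partial tables.
 * `hist` is overwritten with the merged result. Hypothetical helper. */
void hist8_split(const uint8_t *p, size_t n, uint32_t hist[256])
{
    uint32_t h[4][256];
    size_t x = 0;

    memset(h, 0, sizeof(h));

    /* Unrolled main loop: neighbouring pixels update different tables. */
    for (; x + 4 <= n; x += 4) {
        h[0][p[x + 0]]++;
        h[1][p[x + 1]]++;
        h[2][p[x + 2]]++;
        h[3][p[x + 3]]++;
    }

    /* Tail: fewer than four pixels left. */
    for (; x < n; x++)
        h[0][p[x]]++;

    /* Merge the partial tables into the caller's histogram. */
    for (int i = 0; i < 256; i++)
        hist[i] = h[0][i] + h[1][i] + h[2][i] + h[3][i];
}
```

The cost is four times the table memory plus the merge pass, which is small next to the per-pixel loop for typical frame sizes.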

[FFmpeg-devel] [PATCH v8 02/18] swscale/format: rename legacy format conversion table

2025-07-12 Thread Niklas Haas
From: Niklas Haas --- libswscale/format.c | 18 +- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/libswscale/format.c b/libswscale/format.c index 53162f8756..ac130a2595 100644 --- a/libswscale/format.c +++ b/libswscale/format.c @@ -24,14 +24,14 @@ #include

[FFmpeg-devel] [PATCH v8 03/18] swscale/format: add ff_fmt_clear()

2025-07-12 Thread Niklas Haas
From: Niklas Haas Reset an SwsFormat to its fully unset/invalid state. --- libswscale/format.h | 14 ++ 1 file changed, 14 insertions(+) diff --git a/libswscale/format.h b/libswscale/format.h index 3b6d745159..be92038f4f 100644 --- a/libswscale/format.h +++ b/libswscale/format.h

[FFmpeg-devel] [PATCH v8 10/18] swscale/ops: add dispatch layer

2025-07-12 Thread Niklas Haas
From: Niklas Haas This handles the low-level execution of an op list, and integration into the SwsGraph infrastructure. To handle frames with insufficient padding in the stride (or a width smaller than one block size), we use a fallback loop that pads the last column of pixels using `memcpy

[FFmpeg-devel] [PATCH v8 17/18] swscale/format: add new format decode/encode logic

2025-07-12 Thread Niklas Haas
From: Niklas Haas This patch adds format handling code for the new operations. This entails fully decoding a format to standardized RGB, and the inverse. Handling it this way means we can always guarantee that a conversion path exists from A to B without having to explicitly cover logic for

[FFmpeg-devel] [PATCH v8 12/18] swscale/ops_chain: add internal abstraction for kernel linking

2025-07-12 Thread Niklas Haas
From: Niklas Haas See doc/swscale-v2.txt for design details. --- libswscale/Makefile| 1 + libswscale/ops_chain.c | 291 + libswscale/ops_chain.h | 134 +++ 3 files changed, 426 insertions(+) create mode 100644 libswscale

[FFmpeg-devel] [PATCH v8 11/18] swscale/optimizer: add packed shuffle solver

2025-07-12 Thread Niklas Haas
From: Niklas Haas This can turn any compatible sequence of operations into a single packed shuffle, including packed swizzling, grayscale->RGB conversion, endianness swapping, RGB bit depth conversions, rgb24->rgb0 alpha clearing and more. --- libswscale/ops_internal.h

[FFmpeg-devel] [PATCH v8 09/18] swscale/ops_internal: add internal ops backend API

2025-07-12 Thread Niklas Haas
From: Niklas Haas This adds an internal API for ops backends, which are responsible for compiling op lists into executable functions. --- libswscale/ops.c | 65 +++ libswscale/ops_internal.h | 108 ++ 2 files changed, 173

[FFmpeg-devel] [PATCH v8 08/18] swscale/optimizer: add high-level ops optimizer

2025-07-12 Thread Niklas Haas
From: Niklas Haas This is responsible for taking a "naive" ops list and optimizing it as much as possible. Also includes a small analyzer that generates component metadata for use by the optimizer. --- libswscale/Makefile| 1 + libswscale/ops.h | 12 +

[FFmpeg-devel] [PATCH v8 07/18] swscale/ops: introduce new low level framework

2025-07-12 Thread Niklas Haas
From: Niklas Haas See docs/swscale-v2.txt for an in-depth introduction to the new approach. This commit merely introduces the ops definitions and boilerplate functions. The subsequent commits will flesh out the underlying implementation. --- libswscale/Makefile | 1 + libswscale/ops.c

[FFmpeg-devel] [PATCH v8 06/18] swscale: add SWS_UNSTABLE flag

2025-07-12 Thread Niklas Haas
From: Niklas Haas Give users and developers a way to opt in to the new format conversion code, and more code from the swscale rewrite in general, even while development is still ongoing. --- doc/APIchanges | 3 +++ doc/scaler.texi | 4 libswscale/options.c | 1 + libswscale

[FFmpeg-devel] [PATCH v8 16/18] tests/checkasm: add checkasm tests for swscale ops

2025-07-12 Thread Niklas Haas
From: Niklas Haas Because of the lack of an external ABI on low-level kernels, we cannot directly test internal functions. Instead, we construct a minimal op chain consisting of a read, the op to be tested, and a write. The bigger complication arises from the fact that the backend may generate

[FFmpeg-devel] [PATCH v8 18/18] swscale/graph: allow experimental use of new format handler

2025-07-12 Thread Niklas Haas
From: Niklas Haas The humor originally contained in this commit message has been redacted to comply with the strict FFmpeg code quality standards. --- libswscale/graph.c | 84 -- 1 file changed, 82 insertions(+), 2 deletions(-) diff --git a

[FFmpeg-devel] [PATCH v8 14/18] swscale/ops_memcpy: add 'memcpy' backend for plane->plane copies

2025-07-12 Thread Niklas Haas
From: Niklas Haas Provides a generic fast path for any operation list that can be decomposed into a series of memcpy and memset operations. 25% faster than the x86 backend for yuv444p -> yuva444p 33% faster than the x86 backend for gray -> yuvj444p --- libswscale/Makefile

[FFmpeg-devel] [PATCH v8 15/18] swscale/x86: add SIMD backend

2025-07-12 Thread Niklas Haas
From: Niklas Haas This covers most 8-bit and 16-bit ops, and some 32-bit ops. It also covers all floating point operations. While this is not yet 100% coverage, it's good enough for the vast majority of formats out there. Of special note is the packed shuffle fast path, which uses pshu

[FFmpeg-devel] [PATCH v8 05/18] tests/checkasm: generalize DEF_CHECKASM_CHECK_FUNC to floats

2025-07-12 Thread Niklas Haas
From: Niklas Haas We split the standard macro into its body (implementation) and declaration, and use a macro argument in place of the raw `memcmp` call, with the major difference that we now take the number of pixels to compare instead of the number of bytes (to match the signature of

[FFmpeg-devel] [PATCH v8 13/18] swscale/ops_backend: add reference backend basend on C templates

2025-07-12 Thread Niklas Haas
From: Niklas Haas This will serve as a reference for the SIMD backends to come. That said, with auto-vectorization enabled, the performance of this is not atrocious. It easily beats the old C code and sometimes even the old SIMD. In theory, we can dramatically speed it up by using GCC vectors

[FFmpeg-devel] [PATCH v8 04/18] tests/checkasm: increase number of runs in between measurements

2025-07-12 Thread Niklas Haas
From: Niklas Haas Sometimes, when measuring very small functions, rdtsc is not accurate enough to get a reliable measurement. This increases the number of runs inside the inner loop from 4 to 32, which should help a lot. Less important when using the more precise linux-perf API, but still useful

[FFmpeg-devel] [PATCH v8 01/18] swscale/graph: pass per-pass image pointers to setup()

2025-07-12 Thread Niklas Haas
From: Niklas Haas This behavior had no real justification and was just incredibly confusing, since the in/out pointers passed to setup() did not match those passed to run(), all for what is arguably an exception anyways (the palette setup). --- libswscale/graph.c | 10 +++--- libswscale

Re: [FFmpeg-devel] [PATCH v7 01/18] swscale/graph: pass per-pass image pointers to setup()

2025-07-12 Thread Niklas Haas
On Fri, 20 Jun 2025 15:17:21 +0200 Niklas Haas wrote: > Changes since v6: > - fix one MSVC build failure > > Will merge this version in ~24H assuming patchwork passes. For the sake of the record, the reason this one was not merged yet was because there was a bug where older versi

[FFmpeg-devel] [PATCH 2/4] tests/checkasm: add scene_sad checkasm test

2025-07-12 Thread Niklas Haas
From: Niklas Haas --- tests/checkasm/Makefile| 1 + tests/checkasm/checkasm.c | 3 ++ tests/checkasm/checkasm.h | 1 + tests/checkasm/scene_sad.c | 73 ++ 4 files changed, 78 insertions(+) create mode 100644 tests/checkasm/scene_sad.c diff --git a

[FFmpeg-devel] [PATCH 4/4] avfilter/x86/scene_sad: add high bit depth AVX2/AVX512 version

2025-07-12 Thread Niklas Haas
From: Niklas Haas Since psadbw only exists for 8-bits, we have to emulate it for 16-bit inputs. The simplest sequence is to use a normal subtraction, which is safe as long as the inputs do not exceed 32767 - so limit this implementation to 15-bit inputs and below. For 16-bit inputs, we could in
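
The overflow argument above can be illustrated with a scalar sketch. This is not the AVX2/AVX512 code from the patch; the function name sad16_15bit is hypothetical, and the absolute-value step is an assumption about how the signed difference is consumed (the snippet is truncated). The point is only that when both inputs are at most 32767, their difference always fits in a signed 16-bit lane.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of a 16-bit SAD built on a plain subtraction.
 * Assumes every sample is <= 32767 (15 bits or fewer), so a[i] - b[i]
 * lies in [-32767, 32767] and is representable as int16_t. */
uint64_t sad16_15bit(const uint16_t *a, const uint16_t *b, size_t len)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        /* Models a 16-bit lane-wise subtraction; in range for 15-bit input. */
        int16_t diff = (int16_t)(a[i] - b[i]);
        /* Absolute value of the signed difference, accumulated in 64 bits. */
        sum += (uint64_t)(diff < 0 ? -diff : diff);
    }
    return sum;
}
```

With full 16-bit samples the difference can reach 65535 in magnitude, which no longer fits in a signed 16-bit lane; that is why the message limits this approach to 15-bit inputs and below.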

[FFmpeg-devel] [PATCH 1/4] avfilter/scene_sad: pass true depth to ff_scene_sad_get_fn()

2025-07-12 Thread Niklas Haas
From: Niklas Haas I need to be able to distinguish between 10/12/14 and 16 bit depths, for overflow reasons. --- libavfilter/f_select.c | 2 +- libavfilter/scene_sad.c | 5 ++--- libavfilter/vf_framerate.c | 2 +- libavfilter/vf_freezedetect.c| 2 +- libavfilter

[FFmpeg-devel] [PATCH 3/4] avfilter/x86/scene_sad: add AVX512 implementation

2025-07-12 Thread Niklas Haas
From: Niklas Haas Trivial to add, but a lot faster (on my machine). scene_sad8_c: 114476.4 ( 1.00x) scene_sad8_sse2: 8644.3 (13.24x) scene_sad8_avx2: 4520.1 (25.33x) scene_sad8_avx512

Re: [FFmpeg-devel] [PATCH 1/7] vf_libplacebo: add support for specifying a LUT for the input

2025-07-11 Thread Niklas Haas
On Fri, 11 Jul 2025 00:13:29 +0900 Lynne wrote: > This makes it possible to apply Adobe .cube files to inputs. > --- > libavfilter/vf_libplacebo.c | 28 > 1 file changed, 28 insertions(+) > > diff --git a/libavfilter/vf_libplacebo.c b/libavfilter/vf_libplacebo.c > ind

[FFmpeg-devel] [PATCH] avfilter/vf_blackdetect: add AVX2 SIMD version

2025-07-10 Thread Niklas Haas
From: Niklas Haas Requested by a user. Even with autovectorization enabled, the compiler does quite a poor job of optimizing this function, because it cannot take advantage of the pmaxub + pcmpeqb trick for counting the number of pixels less than or equal to a threshold
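
A scalar sketch of the pmaxub + pcmpeqb idea may help here. It relies on the identity that an unsigned byte x satisfies x <= t exactly when max(x, t) == t, so an unsigned maximum followed by an equality compare replaces the unsigned less-than-or-equal comparison that the instruction set lacks for bytes. The function name count_le_threshold is hypothetical and this is not the code from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model: count bytes that are <= threshold using
 * only an unsigned max and an equality test, one pixel at a time. */
unsigned count_le_threshold(const uint8_t *src, size_t len, uint8_t threshold)
{
    unsigned count = 0;
    for (size_t i = 0; i < len; i++) {
        /* Step 1 (what pmaxub does per byte): unsigned maximum. */
        uint8_t m = src[i] > threshold ? src[i] : threshold;
        /* Step 2 (what pcmpeqb does per byte): equality with the threshold.
         * m == threshold holds exactly when src[i] <= threshold. */
        count += (m == threshold);
    }
    return count;
}
```

In a SIMD version the compare yields a byte mask per lane, which then has to be reduced to a count; that reduction is omitted here.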

[FFmpeg-devel] [PATCH] avfilter/f_ebur128: properly propagate true peak

2025-06-23 Thread Niklas Haas
From: Niklas Haas After 3b26b782ee, `ebur128->true_peak` was only set to the maximum of the current "true peak per frame" values, when it should report the true peak for the entire stream. Fixes: 3b26b782eeded9b9ab7fac013cd1a83a30d68206 --- libavfilter/f_ebur128.c | 4 +++- 1 fi
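
The distinction between a per-frame peak and a whole-stream peak can be sketched as a running maximum. This is only an illustration of the described behavior, not the patch itself; the function name update_stream_peak and its parameters are hypothetical and do not correspond to fields in f_ebur128.c.

```c
/* Hypothetical sketch: fold the per-channel peaks of one frame into a
 * peak that persists across the entire stream. */
void update_stream_peak(double *stream_peak, const double *frame_peaks,
                        int nb_channels)
{
    double frame_max = 0.0;
    for (int ch = 0; ch < nb_channels; ch++) {
        if (frame_peaks[ch] > frame_max)
            frame_max = frame_peaks[ch];
    }
    /* Only ever raise the stream-wide value; assigning frame_max directly
     * would report the peak of the latest frame instead. */
    if (frame_max > *stream_peak)
        *stream_peak = frame_max;
}
```

Calling this once per frame leaves the stream-wide value at the largest peak seen so far, even if later frames are quieter.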

[FFmpeg-devel] [PATCH v3 1/2] avfilter/vf_thumbnail: support more planar formats

2025-06-22 Thread Niklas Haas
From: Niklas Haas This adds support for high bit depth formats, as well as formats with fewer than 3 planes. The implementation for HBD is the same as for 8 bit formats, just right shifted to 8 bits. It's worth pointing out that this also works for HDR formats (and even DV), becaus
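
The "right shifted to 8 bits" step can be sketched as follows, assuming it means dropping the low-order bits of each sample so that it indexes the same 256-entry range an 8-bit sample would. The function name hbd_to_8bit is hypothetical and not taken from vf_thumbnail.c.

```c
#include <stdint.h>

/* Hypothetical helper: reduce a high bit depth sample to 8 bits by
 * discarding the (depth - 8) least significant bits.
 * Assumes 8 <= depth <= 16 and sample < (1 << depth). */
uint8_t hbd_to_8bit(uint16_t sample, int depth)
{
    return (uint8_t)(sample >> (depth - 8));
}
```

For example, a full-scale 10-bit sample of 1023 maps to 255, and a mid-scale sample of 512 maps to 128.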

[FFmpeg-devel] [PATCH v3 2/2] avfilter/vf_thumbnail: switch to query_func2

2025-06-22 Thread Niklas Haas
From: Niklas Haas Instead of enumerating a static list of planar formats to support, walk through the format list and enable all supported formats. As of writing, this generates the following format list: - gbrap - gbrap10le - gbrap12le - gbrap14le - gbrap16le - gbrp - gbrp10le - gbrp12le

[FFmpeg-devel] [PATCH v5 03/13] avfilter/f_ebur128: use structs for biquad weights

2025-06-20 Thread Niklas Haas
From: Niklas Haas Simplifies the code a bit. In particular, the copy to the stack is marginally faster. --- libavfilter/f_ebur128.c | 52 +++-- 1 file changed, 29 insertions(+), 23 deletions(-) diff --git a/libavfilter/f_ebur128.c b/libavfilter/f_ebur128.c

[FFmpeg-devel] [PATCH v5 02/13] avfilter/f_ebur128: simplify sample cache array

2025-06-20 Thread Niklas Haas
From: Niklas Haas We don't need an X sample cache anymore, and we also can simplify the access macro slightly. --- libavfilter/f_ebur128.c | 29 +++-- 1 file changed, 11 insertions(+), 18 deletions(-) diff --git a/libavfilter/f_ebur128.c b/libavfilter/f_ebur128.c

[FFmpeg-devel] [PATCH] avutil/hwcontext_vulkan: disable host transfers if ReBAR is disabled

2025-06-20 Thread Niklas Haas
From: Niklas Haas This feature fundamentally relies on host-visible VRAM, which restricts the set of available memory types to (typically) host-visible device-local ones. When resizable BAR is disabled, this memory type is usually limited to e.g. 256 MiB in size, which is just plain

[FFmpeg-devel] [PATCH v7 12/18] swscale/ops_chain: add internal abstraction for kernel linking

2025-06-20 Thread Niklas Haas
From: Niklas Haas See doc/swscale-v2.txt for design details. --- libswscale/Makefile| 1 + libswscale/ops_chain.c | 291 + libswscale/ops_chain.h | 134 +++ 3 files changed, 426 insertions(+) create mode 100644 libswscale

[FFmpeg-devel] [PATCH v7 03/18] swscale/format: add ff_fmt_clear()

2025-06-20 Thread Niklas Haas
From: Niklas Haas Reset an SwsFormat to its fully unset/invalid state. --- libswscale/format.h | 14 ++ 1 file changed, 14 insertions(+) diff --git a/libswscale/format.h b/libswscale/format.h index 3b6d745159..be92038f4f 100644 --- a/libswscale/format.h +++ b/libswscale/format.h

[FFmpeg-devel] [PATCH v7 09/18] swscale/ops_internal: add internal ops backend API

2025-06-20 Thread Niklas Haas
From: Niklas Haas This adds an internal API for ops backends, which are responsible for compiling op lists into executable functions. --- libswscale/ops.c | 65 +++ libswscale/ops_internal.h | 108 ++ 2 files changed, 173

[FFmpeg-devel] [PATCH v5 10/13] avfilter/f_ebur128: lift sample peak calculation out of main loop

2025-06-20 Thread Niklas Haas
From: Niklas Haas This is substantially faster (~55%) than the transposed loop, and also avoids an unnecessary macro. --- libavfilter/f_ebur128.c | 38 ++ 1 file changed, 18 insertions(+), 20 deletions(-) diff --git a/libavfilter/f_ebur128.c b/libavfilter
