date:20250521

Re: [FFmpeg-devel] [PATCH] Accept a colon in the path of a URI, instead of stripping preceding characters.

2025-05-21 Thread Timothy Allen via ffmpeg-devel

On Wed, 2025-05-21 at 19:25 +, softworkz . wrote:
> What I mean and what the comment in the ticket is probably
> suggesting, 
> is that the HLS demuxer should URL-encode the URL after combining the
> base url with the segment file name before making a request for the 
> segment.

Ah, sure. That was my original plan, but then I realised that the URL
canonicalisation (or decomposition) step was already there, and I was
concerned about adding an extra step that changed the dynamics of URLs
for this specific case.

Let me take another look and see if I can figure out a reasonable way
of percent-encoding the URL in the specific case of playlists (or
possibly just HLS).

Thanks!

Tim
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] Graphprint Patches Overview

2025-05-21 Thread softworkz .



> -Original Message-
> From: ffmpeg-devel  On Behalf Of Kyle Swanson
> Sent: Wednesday, 21 May 2025 22:11
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] Graphprint Patches Overview
> 
> Hi,
> 
> On Wed, May 21, 2025 at 4:00 AM Kieran Kunhya via ffmpeg-devel
>  wrote:
> > Can we just revert the whole set until it's cleaned up properly?
> >
> > There are more patches to fix issues than the set itself. This is
> > understandable if it's a big architectural change like threading but it's
> > not.
> 
> I agree with Kieran, revert. This was not ready to be pushed IMO.
> 
> Thanks,
> Kyle
> ___

I think the least that can be expected from somebody making such a 
request is that they provide specific reasoning after having taken 
a closer look - which the two of you apparently haven't.

Thanks
sw


[FFmpeg-devel] Graphprint Patches Reminder

2025-05-21 Thread softworkz .

Hi,

I'm aiming to apply the patches below by the weekend. Please let me know if
there are any concerns or objections, or if you just need more time to
look at them.

Thanks
sw



[2/3] ffbuild: correctly silence and tag new css/html steps

[3/3] fftools/resources: add missing extensions to .gitignore

[1/5] fftools/makefile: Remove resources from ffprobe

[2/5] fftools/resources: Use .SECONDARY in Makefile comment

[3/5] fftools/ffmpeg: Free print_graph option variables

[4/5] fftools/graphprint: Fix memory leaks

[5/5] fftools/tf_mermaid: Add missing uninit and fix leaks

[v3]  ffbuild/commonmak: Fix rebuild check with implicit rule chains

Links

https://patchwork.ffmpeg.org/project/ffmpeg/list/?series=14528

https://patchwork.ffmpeg.org/project/ffmpeg/list/?series=14563

https://patchwork.ffmpeg.org/project/ffmpeg/list/?series=14570





Re: [FFmpeg-devel] [PATCH v2 0/3] tests/fate: Improvements for running FATE on Windows/MSYS2

2025-05-21 Thread softworkz .




> -Original Message-
> From: ffmpegagent 
> Sent: Tuesday, 13 May 2025 16:23
> To: ffmpeg-devel@ffmpeg.org
> Cc: softworkz 
> Subject: [PATCH v2 0/3] tests/fate: Improvements for running FATE on
> Windows/MSYS2
> 
> When setting up the new Patchwork builders I noticed some issues when
> running FATE tests on Windows. Initially I had them suppressed on the
> builders, but this patchset should finally fix it.
> 
> Version V2
> 
>  * Clarified commit message in 3/3 regarding the requirement for a relative
>path to the fate samples (thanks, Zhao)
> 
> .
> 
> softworkz (3):
>   tests/fate: Fix subtitle fate tests on Windows
>   tests/source-check: Fix make inclusion-guard check EOL-agnostic
>   tests/hevc: Fix concat input when running in MSYS2 shell



Hi,

would anybody be able to take a quick look at this patchset?

The third one has been confirmed by Zhao already and the first 
two are very short and simple as well.

This would allow me to change the Windows CI runners on Patchwork to
execute the full suite of FATE tests.

Thanks,
sw



Re: [FFmpeg-devel] [PATCH] Accept a colon in the path of a URI, instead of stripping preceding characters.

2025-05-21 Thread softworkz .

From: Timothy Allen 
Sent: Wednesday, 21 May 2025 09:29
To: softworkz . ; FFmpeg development discussions and 
patches 
Subject: Re: [FFmpeg-devel] [PATCH] Accept a colon in the path of a URI, 
instead of stripping preceding characters.

> On Tue, 2025-05-20 at 20:03 +, softworkz . wrote:
> > I was just about to reply and suggest to replace those colons with %3A
> > (url-encoded) when I read the ticket, which already suggests that.
> >
> > Have you tried it? It sounds like a much better way to me.
>
> I think this would be a common-sense solution as long as one controls the
> server/content.
>
> The reason I submitted the patch anyway was because not everyone will control
> the content they're consuming, and because, although I acknowledge it
> technically breaks RFCs, there is an argument that the RFC's behaviour is
> surprising; we can certainly see that other applications (Safari, in the linked
> ticket) break the RFC as well.

Why do you think it would require control over the server side?

Almost every http server in the world will decode the URL as one of its first
steps. When you have a file on disk with a space in it like

“/httproot/my file.html”

then a URL like “http://server/my%20file.html” will retrieve that file without
any change to the http server.
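
To illustrate the decoding step, here is a minimal sketch in C of what a
server roughly does with the request path before it looks for the file. The
helper name percent_decode is made up for this example and is not taken from
FFmpeg or from any particular server; malformed escapes are simply copied
through, which real servers may handle differently.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Convert one hex digit to its value; the caller checks isxdigit() first. */
static int hex_value(unsigned char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    return tolower(c) - 'a' + 10;
}

/*
 * Hypothetical helper: decode %XX escapes from src into dst.
 * dst must hold at least strlen(src) + 1 bytes. Malformed escapes are
 * copied through unchanged. Returns the decoded length.
 */
static size_t percent_decode(char *dst, const char *src)
{
    size_t n = 0;

    assert(dst && src);
    while (*src) {
        if (src[0] == '%' && isxdigit((unsigned char)src[1]) &&
            isxdigit((unsigned char)src[2])) {
            dst[n++] = (char)(hex_value((unsigned char)src[1]) * 16 +
                              hex_value((unsigned char)src[2]));
            src += 3;
        } else {
            dst[n++] = *src++;
        }
    }
    dst[n] = '\0';
    return n;
}
```

With this, "/my%20file.html" decodes to "/my file.html", and a segment name
such as "seg%3A1.ts" decodes to "seg:1.ts", so the file on disk can keep its
colon.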

Best,
sw

Re: [FFmpeg-devel] Graphprint Patches Overview

2025-05-21 Thread softworkz .




> -Original Message-
> From: ffmpeg-devel  On Behalf Of Kieran
> Kunhya via ffmpeg-devel
> Sent: Wednesday, 21 May 2025 13:00
> To: FFmpeg development discussions and patches 
> Cc: Kieran Kunhya 
> Subject: Re: [FFmpeg-devel] Graphprint Patches Overview
> 
> On Wed, 21 May 2025, 01:45 softworkz ., 
> wrote:
> 
> > Hello,
> >
> > thanks again to all for the patches. I figured it might be a bit difficult
> > to
> > keep track of what has already been submitted and fixed and is still
> > pending, and I'm sorry that there has been some duplicate effort to fix the
> > same things - so here's an overview. The ones with X are the ones I would
> > like to apply eventually:
> >
> >
> > Timo Rothenpieler
> > https://patchwork.ffmpeg.org/project/ffmpeg/list/?series=14528
> > (I would favor "ffbuild/commonmak" over 1/3)
> >
> >   [1/3] fftools/resources: fix preservation of intermediary resman build
> > artifacts
> > X [2/3] ffbuild: correctly silence and tag new css/html steps
> > X [3/3] fftools/resources: add missing extensions to .gitignore
> >
> >
> >
> > Mark Thompson (already merged)
> > https://patchwork.ffmpeg.org/project/ffmpeg/list/?series=14537
> >
> > X [1/3] ffmpeg: Don't print graphs if there are no graphs to print
> > X [2/3] fftools/graphprint: Fix leak of graphprint object
> > X [3/3] fftools/graphprint: Fix leak of graph section header string
> >
> >
> > softworkz
> > https://patchwork.ffmpeg.org/project/ffmpeg/list/?series=14563
> >
> > X [1/5] fftools/makefile: Remove resources from ffprobe
> > X [2/5] fftools/resources: Use .SECONDARY in Makefile comment
> > X [3/5] fftools/ffmpeg: Free print_graph option variables
> > X [4/5] fftools/graphprint: Fix memory leaks
> > X [5/5] fftools/tf_mermaid: Add missing uninit and fix leaks
> >
> > https://patchwork.ffmpeg.org/project/ffmpeg/list/?series=14570
> > X [v3] ffbuild/commonmak: Fix rebuild check with implicit rule chains
> >
> >
> > Derek Buitenhuis
> > https://patchwork.ffmpeg.org/project/ffmpeg/list/?series=14569
> > (1/3 and 2/3 correspond to 2/3 from Timo, and 3/3 doesn't fix
> > the rebuild check like "commonmak" above does)
> >
> > [1/3] ffbuild/common: Remove what appears to be a temporary debugging
> > comment
> > [2/3] ffbuild/common: Properly tag/suppress sed command
> > [3/3] fftools/resources: Mark .css.min and .css.min.gz as NOTINTERMEDIATE
> >
> >
> > Thanks again,
> > sw
> >
> 
> Can we just revert the whole set until it's cleaned up properly?

So that it can be ignored for another 15 revisions?
I'm glad and thankful that others have looked at it now, and I don't expect
many more issues to come up.

> There are more patches to fix issues than the set itself. This is
> understandable if it's a big architectural change like threading but it's
> not.

Please note that many of those patches are just single-line changes and
the "more fixes than patches" is not that unusual. It's only less visible
because nobody normally sends an overview like I did. Even though it's
not an architectural change, it's still an entirely new feature with 2k
lines of new code (counting the last 4 commits only).

I really would have preferred this to happen while the patches were
still under review, and I don't know why nobody responded to my messages
asking whether anybody would still like to review the set or needed more
time. I would have happily waited longer, even weeks, if someone had
said anything.

Finally, for a "better picture", I could have excluded the patches from
Mark (already merged) and Derek (duplicate), but I still listed them
to acknowledge their efforts.

Best,
sw


[FFmpeg-devel] [PATCH] avcodec: reorder MP3 decoder declarations to match MP2 pattern

2025-05-21 Thread sohzm

Fix inconsistent sample format reporting between probing and decoding.
Previously, avformat_find_stream_info reported fltp format for MP3
streams but frames were decoded as s16p.

Fixes ticket/11561
---
 libavcodec/allcodecs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavcodec/allcodecs.c b/libavcodec/allcodecs.c
index cd4f6ecd59..329e410aee 100644
--- a/libavcodec/allcodecs.c
+++ b/libavcodec/allcodecs.c
@@ -500,8 +500,8 @@ extern const FFCodec ff_mp2_encoder;
 extern const FFCodec ff_mp2_decoder;
 extern const FFCodec ff_mp2float_decoder;
 extern const FFCodec ff_mp2fixed_encoder;
-extern const FFCodec ff_mp3float_decoder;
 extern const FFCodec ff_mp3_decoder;
+extern const FFCodec ff_mp3float_decoder;
 extern const FFCodec ff_mp3adufloat_decoder;
 extern const FFCodec ff_mp3adu_decoder;
 extern const FFCodec ff_mp3on4float_decoder;
-- 
2.49.0


Re: [FFmpeg-devel] [PATCH] avcodec: reorder MP3 decoder declarations to match MP2 pattern

2025-05-21 Thread Lynne


On 22/05/2025 06:21, sohzm wrote:

Fix inconsistent sample format reporting between probing and decoding.
Previously, avformat_find_stream_info reported fltp format for MP3
streams but frames were decoded as s16p.

Fixes ticket/11561
---
  libavcodec/allcodecs.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavcodec/allcodecs.c b/libavcodec/allcodecs.c
index cd4f6ecd59..329e410aee 100644
--- a/libavcodec/allcodecs.c
+++ b/libavcodec/allcodecs.c
@@ -500,8 +500,8 @@ extern const FFCodec ff_mp2_encoder;
  extern const FFCodec ff_mp2_decoder;
  extern const FFCodec ff_mp2float_decoder;
  extern const FFCodec ff_mp2fixed_encoder;
-extern const FFCodec ff_mp3float_decoder;
  extern const FFCodec ff_mp3_decoder;
+extern const FFCodec ff_mp3float_decoder;
  extern const FFCodec ff_mp3adufloat_decoder;
  extern const FFCodec ff_mp3adu_decoder;
  extern const FFCodec ff_mp3on4float_decoder;


It should be the other way around, the float decoder should be picked first.
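
For illustration, a toy model of why the declaration order matters, assuming
that a lookup by codec ID returns the first matching entry in list order
(which is what both the patch and this reply rely on). The struct, the list
and demo_find_decoder are invented for this sketch; they are not FFmpeg's
FFCodec or avcodec_find_decoder().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a decoder entry; not FFmpeg's FFCodec. */
struct demo_codec {
    const char *name;       /* decoder name, e.g. "mp3float" */
    const char *codec_id;   /* what the stream is identified as, e.g. "mp3" */
    const char *sample_fmt; /* output format named in the patch description */
};

/* Return the first entry whose codec_id matches, or NULL if none does. */
static const struct demo_codec *demo_find_decoder(const struct demo_codec *list,
                                                  size_t count,
                                                  const char *codec_id)
{
    assert(list && codec_id);
    for (size_t i = 0; i < count; i++)
        if (!strcmp(list[i].codec_id, codec_id))
            return &list[i];
    return NULL;
}

/* Order mirrors the declarations before the patch: float decoder first. */
static const struct demo_codec demo_list[] = {
    { "mp3float", "mp3", "fltp" },
    { "mp3",      "mp3", "s16p" },
};
```

With the float entry first, a lookup for "mp3" yields mp3float (fltp);
swapping the two entries, as the patch does, would make the same lookup
yield the fixed-point decoder (s16p).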



Re: [FFmpeg-devel] [PATCH 2/7] avutil/vulkan: automatically enable shader device address usage bit

2025-05-21 Thread Lynne


On 18/05/2025 21:11, Niklas Haas wrote:

From: Niklas Haas 

We require this internally when using descriptor buffers, so it makes sense
to enable it internally, also.
---
  libavutil/vulkan.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/libavutil/vulkan.c b/libavutil/vulkan.c
index 5f2ac6267d..97c008c809 100644
--- a/libavutil/vulkan.c
+++ b/libavutil/vulkan.c
@@ -989,6 +989,9 @@ int ff_vk_create_buf(FFVulkanContext *s, FFVkBuffer *buf, size_t size,
  int use_ded_mem;
  FFVulkanFunctions *vk = &s->vkfn;
  
+if (s->extensions & FF_VK_EXT_DESCRIPTOR_BUFFER)
+usage |= VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT;


You should omit the flag if the usage contains
VK_BUFFER_USAGE_VIDEO_DECODE_SRC_BIT_KHR or 
VK_BUFFER_USAGE_VIDEO_ENCODE_DST_BIT_KHR, since it's not used there.



+
  VkBufferCreateInfo buf_spawn = {
  .sType   = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO,
  .pNext   = pNext,


Apart from that, okay.
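
A minimal sketch of the requested condition, to make the suggestion
concrete. The DEMO_* bit values and the function name are placeholders
chosen for this example so that it compiles without the Vulkan headers; the
real code would use the VK_BUFFER_USAGE_* constants and the
FF_VK_EXT_DESCRIPTOR_BUFFER check from the patch.

```c
#include <stdint.h>

/* Placeholder bits for illustration only; not the real Vulkan values. */
#define DEMO_USAGE_VIDEO_DECODE_SRC      (1u << 0)
#define DEMO_USAGE_VIDEO_ENCODE_DST      (1u << 1)
#define DEMO_USAGE_STORAGE_BUFFER        (1u << 2)
#define DEMO_USAGE_SHADER_DEVICE_ADDRESS (1u << 3)

/*
 * Add the device-address bit only when descriptor buffers are in use and
 * the buffer is not a video decode source or a video encode destination.
 */
static uint32_t demo_buffer_usage(uint32_t usage, int has_descriptor_buffer)
{
    const uint32_t video_bits = DEMO_USAGE_VIDEO_DECODE_SRC |
                                DEMO_USAGE_VIDEO_ENCODE_DST;

    if (has_descriptor_buffer && !(usage & video_bits))
        usage |= DEMO_USAGE_SHADER_DEVICE_ADDRESS;
    return usage;
}
```

A storage buffer gets the extra bit when descriptor buffers are enabled; a
video decode source buffer is returned unchanged.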



Re: [FFmpeg-devel] [PATCH 3/7] avfilter/vf_gblur_vulkan: omit unnecessary buffer usage flag

2025-05-21 Thread Lynne


On 18/05/2025 21:11, Niklas Haas wrote:

From: Niklas Haas 

Implied internally now when needed.
---
  libavfilter/vf_gblur_vulkan.c | 1 -
  1 file changed, 1 deletion(-)

diff --git a/libavfilter/vf_gblur_vulkan.c b/libavfilter/vf_gblur_vulkan.c
index 80b66de735..fb676a7fc9 100644
--- a/libavfilter/vf_gblur_vulkan.c
+++ b/libavfilter/vf_gblur_vulkan.c
@@ -171,7 +171,6 @@ static int init_gblur_pipeline(GBlurVulkanContext *s,
  RET(ff_vk_shader_register_exec(&s->vkctx, &s->e, shd));
  
  RET(ff_vk_create_buf(&s->vkctx, params_buf, sizeof(float) * ksize, NULL, NULL,

- VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT |
   VK_BUFFER_USAGE_STORAGE_BUFFER_BIT,
   VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT));
  RET(ff_vk_map_buffer(&s->vkctx, params_buf, &kernel_mapped, 0));


It's used in a lot more places than here, but it's a start.



[FFmpeg-devel] Posting correctly (was: Accept a colon in the path of a URI, instead of stripping preceding characters.)

2025-05-21 Thread Nicolas George

softworkz . (HE12025-05-21):
> From: Timothy Allen 
> Sent: Wednesday, 21 May 2025 09:29
> To: softworkz . ; FFmpeg development discussions and 
> patches 
> Subject: Re: [FFmpeg-devel] [PATCH] Accept a colon in the path of a URI, 
> instead of stripping preceding characters.
> 
> On Tue, 2025-05-20 at 20:03 +, softworkz . wrote:
> I was just about to reply and suggest to replace those colons with %3A
> (url-encoded) when I read the ticket, which already suggests that.

Your message is absolutely unreadable. It is not my software that makes
it so, it is unreadable on the official archive as well:

https://ffmpeg.org/pipermail/ffmpeg-devel/2025-May/343902.html

Compare with the message you are responding to:

https://ffmpeg.org/pipermail/ffmpeg-devel/2025-May/343841.html

It is impossible to know which parts you wrote and which parts you
quote, let alone distinguish quotes from double quotes.

Before you allow yourself to badmouth the mailing tool again, you need
to make the minimum amount of effort to use it properly. If not for
yourself, at least out of basic respect for the other people who use
it. That means making sure your mails are at least as readable as the
ones from other experienced developers. You can observe them rendered in
the archive, not mangled by your local mail software. Start with having
a real attribution instead of that awful blob of headers.

And if that makes you realize the software you have been using is crap
and change it, so much the better; and if you then realize that with
proper software a mailing-list is an excellent tool, better still. But I
will not hold my breath.

-- 
  Nicolas George

Re: [FFmpeg-devel] [PATCH] Accept a colon in the path of a URI, instead of stripping preceding characters.

2025-05-21 Thread softworkz .




> -Original Message-
> From: ffmpeg-devel  On Behalf Of Timothy
> Allen via ffmpeg-devel
> Sent: Wednesday, 21 May 2025 21:09
> To: ffmpeg-devel@ffmpeg.org
> Cc: Timothy Allen 
> Subject: Re: [FFmpeg-devel] [PATCH] Accept a colon in the path of a URI,
> instead of stripping preceding characters.
> 
> On Wed, 2025-05-21 at 18:56 +, softworkz . wrote:
> > Why do you think it would require control over the server side?
> 
> The original ticket is referring to HLS, and specifically the manifest
> of HLS, which means a remotely-hosted M3U playlist.
> 
> In principle, the user could download the playlist, convert any
> relative URLs to absolute URLs, and percent-encode the URLs.
> 
> In practice, for most users, the URL will simply not play (or will skip
> any segments containing colons in the URL) without any indication of
> why the URL is failing.

What I mean and what the comment in the ticket is probably suggesting, 
is that the HLS demuxer should URL-encode the URL after combining the 
base url with the segment file name before making a request for the 
segment.
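
As a rough sketch of that encoding step in C: the helper below
percent-encodes the path part of an already combined URL. The name
percent_encode_path and its signature are invented for this example and are
not an existing FFmpeg function. It keeps the RFC 3986 unreserved characters
and '/', and encodes everything else, including ':'. A path that is already
encoded would get its '%' encoded again, so real code would have to decide
how to treat such input.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: percent-encode a URL path into dst.
 * Returns the encoded length (without the terminating NUL),
 * or -1 if dst_size is too small.
 */
static int percent_encode_path(char *dst, size_t dst_size, const char *src)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t n = 0;

    assert(dst && src);
    if (dst_size == 0)
        return -1;
    for (; *src; src++) {
        unsigned char c = (unsigned char)*src;
        int keep = isalnum(c) || c == '-' || c == '.' || c == '_' ||
                   c == '~' || c == '/';
        size_t need = keep ? 1 : 3;

        if (n + need + 1 > dst_size)
            return -1; /* no room for this character plus the NUL */
        if (keep) {
            dst[n++] = (char)c;
        } else {
            dst[n++] = '%';
            dst[n++] = hex[c >> 4];
            dst[n++] = hex[c & 15];
        }
    }
    dst[n] = '\0';
    return (int)n;
}
```

For a path like "/live/seg:01 a.ts" this produces "/live/seg%3A01%20a.ts",
which can then be appended to the scheme and host of the base URL.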

PS: Thanks for switching to plain-text.

Best,
sw



Re: [FFmpeg-devel] [PATCH] Accept a colon in the path of a URI, instead of stripping preceding characters.

2025-05-21 Thread Timothy Allen via ffmpeg-devel

On Wed, 2025-05-21 at 18:56 +, softworkz . wrote:
> Why do you think it would require control over the server side?

The original ticket is referring to HLS, and specifically the manifest
of HLS, which means a remotely-hosted M3U playlist.

In principle, the user could download the playlist, convert any
relative URLs to absolute URLs, and percent-encode the URLs.

In practice, for most users, the URL will simply not play (or will skip
any segments containing colons in the URL) without any indication of
why the URL is failing.

Thanks,

Tim

Re: [FFmpeg-devel] [PATCH v2 3/3] tests: Add fate-hevc-color-reserved

2025-05-21 Thread Andreas Rheinhardt

Zhao Zhili:
> From: Zhao Zhili 
> 
> ---
>  tests/fate/hevc.mak| 3 +++
>  tests/ref/fate/hevc-color-reserved | 6 ++
>  2 files changed, 9 insertions(+)
>  create mode 100644 tests/ref/fate/hevc-color-reserved
> 
> diff --git a/tests/fate/hevc.mak b/tests/fate/hevc.mak
> index 390ccf46e2..8113c04300 100644
> --- a/tests/fate/hevc.mak
> +++ b/tests/fate/hevc.mak
> @@ -294,6 +294,9 @@ FATE_HEVC-$(call FRAMECRC, MOV, HEVC) += fate-hevc-mv-position
>  fate-hevc-alpha: CMD = framecrc -i $(TARGET_SAMPLES)/hevc/alpha.mp4
>  FATE_HEVC-$(call FRAMECRC, MOV, HEVC) += fate-hevc-alpha
>  
> +fate-hevc-color-reserved: CMD = framecrc -bsf:v hevc_metadata=colour_primaries=0:transfer_characteristics=0:matrix_coefficients=3 -i $(TARGET_SAMPLES)/hevc-conformance/AMP_A_Samsung_4.bit -vf scale,format=nv12 -frames:v 1
> +FATE_HEVC-$(call FRAMECRC, HEVC, HEVC, HEVC_METADATA_BSF SCALE_FILTER) += fate-hevc-color-reserved
> +
>  FATE_SAMPLES_AVCONV += $(FATE_HEVC-yes)
>  FATE_SAMPLES_FFPROBE += $(FATE_HEVC_FFPROBE-yes)
>  
> diff --git a/tests/ref/fate/hevc-color-reserved b/tests/ref/fate/hevc-color-reserved
> new file mode 100644
> index 00..cba6397aa8
> --- /dev/null
> +++ b/tests/ref/fate/hevc-color-reserved
> @@ -0,0 +1,6 @@
> +#tb 0: 1/25
> +#media_type 0: video
> +#codec_id 0: rawvideo
> +#dimensions 0: 2560x1600
> +#sar 0: 0/1
> +0,  0,  0,1,  6144000, 0x427b9a00

This seems to simply presume that hevc_metadata does what it is supposed
to do, but this is not really tested. If hevc_metadata were a no-op for
this, it would pass the test. Why don't you use something with a hash of
the output of hevc_metadata?

- Andreas


[FFmpeg-devel] [PATCH] qt-faststart: update co64 chunk offsets when converting stco to co64

2025-05-21 Thread CesarATV

Before this patch, when qt-faststart converted stco atoms to co64, it
did so without updating the chunk offsets of pre-existing co64 atoms,
resulting in corrupted tracks. This patch ensures that existing co64
chunk offsets are correctly adjusted when such a conversion occurs.

Signed-off-by: CesarATV 
---
 tools/qt-faststart.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/tools/qt-faststart.c b/tools/qt-faststart.c
index 46950a5cf4..3c3b0890a8 100644
--- a/tools/qt-faststart.c
+++ b/tools/qt-faststart.c
@@ -218,7 +218,7 @@ static int update_stco_offsets(update_chunk_offsets_context_t *context, atom_t *
 return 0;
 }
 
-static int update_co64_offsets(update_chunk_offsets_context_t *context, atom_t *atom)
+static int update_co64_offsets(uint64_t offset_increment, atom_t *atom)
 {
 uint64_t current_offset;
 uint32_t offset_count;
@@ -241,7 +241,7 @@ static int update_co64_offsets(update_chunk_offsets_context_t *context, atom_t *
 pos < end;
 pos += 8) {
 current_offset = BE_64(pos);
-current_offset += context->moov_atom_size;
+current_offset += offset_increment;
 AV_WB64(pos, current_offset);
 }
 
@@ -258,7 +258,7 @@ static int update_chunk_offsets_callback(void *ctx, atom_t *atom)
 return update_stco_offsets(context, atom);
 
 case CO64_ATOM:
-return update_co64_offsets(context, atom);
+return update_co64_offsets(context->moov_atom_size, atom);
 
 case MOOV_ATOM:
 case TRAK_ATOM:
@@ -359,6 +359,10 @@ static int upgrade_stco_callback(void *ctx, atom_t *atom)
 set_atom_size(start_pos, atom->header_size, context->dest - start_pos);
 break;
 
+case CO64_ATOM:
+update_co64_offsets(context->new_moov_size - context->original_moov_size, atom);
+/* fallthrough */
+
 default:
 copy_size = atom->header_size + atom->size;
 memcpy(context->dest, atom->data - atom->header_size, copy_size);
-- 
2.43.0
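
For readers who do not have the file format in mind, a standalone sketch of
the offset adjustment this patch generalizes. The function and helper names
are made up for this example and do not match qt-faststart.c; the layout
assumed here (a version/flags word, a 32-bit entry count, then 64-bit
big-endian offsets) is the usual co64 definition. In the patch the increment
is either the moov atom size or, during the stco upgrade, the difference
between the new and the original moov size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

static uint64_t read_be64(const uint8_t *p)
{
    return ((uint64_t)read_be32(p) << 32) | read_be32(p + 4);
}

static void write_be64(uint8_t *p, uint64_t v)
{
    for (int i = 7; i >= 0; i--) {
        p[i] = (uint8_t)(v & 0xff);
        v >>= 8;
    }
}

/*
 * Hypothetical helper. body points just past the atom header:
 * 4 bytes version/flags, 4 bytes entry count, then one 8-byte big-endian
 * offset per entry. Returns 0 on success, -1 if the data is truncated.
 */
static int demo_shift_co64(uint8_t *body, size_t body_size, uint64_t increment)
{
    uint32_t count;

    assert(body);
    if (body_size < 8)
        return -1;
    count = read_be32(body + 4);
    if ((body_size - 8) / 8 < count)
        return -1;
    for (uint32_t i = 0; i < count; i++) {
        uint8_t *pos = body + 8 + (size_t)i * 8;
        write_be64(pos, read_be64(pos) + increment);
    }
    return 0;
}
```

Passing the increment directly, rather than a context struct, is what lets
the same routine serve both callers in the patch.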


Re: [FFmpeg-devel] [PATCH v2 14/17] swscale/x86: add SIMD backend

2025-05-21 Thread Kieran Kunhya via ffmpeg-devel

On Wed, May 21, 2025 at 2:00 PM Niklas Haas  wrote:
>
> From: Niklas Haas 
>
> This covers most 8-bit and 16-bit ops, and some 32-bit ops. It also covers all
> floating point operations. While this is not yet 100% coverage, it's good
> enough for the vast majority of formats out there.
>
> Of special note is the packed shuffle fast path, which uses pshufb at vector
> sizes up to AVX512.

Can I ask if this has some kind of design documentation? Because it's
not exactly simple to understand what's going on here.
I would not like to repeat the mistakes of swscale.
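
As a small aid for the "packed shuffle" part, here is a scalar model in C of
what a 128-bit pshufb does. This only illustrates the instruction's
semantics; it is not taken from the patch, and demo_shuffle16 is a made-up
name.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Scalar model of a 16-byte shuffle: each mask byte selects one source
 * byte by its low 4 bits, or produces zero if its top bit is set.
 * dst and src must not alias in this simple version.
 */
static void demo_shuffle16(uint8_t dst[16], const uint8_t src[16],
                           const uint8_t mask[16])
{
    assert(dst && src && mask);
    assert(dst != src);
    for (int i = 0; i < 16; i++)
        dst[i] = (mask[i] & 0x80) ? 0 : src[mask[i] & 0x0f];
}
```

A mask of {2, 1, 0, 3, 6, 5, 4, 7, ...} swaps the first and third byte of
every 4-byte group, which is the kind of packed format conversion that a
single shuffle can cover.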

Kieran

Re: [FFmpeg-devel] [PATCH 3/3] tests: Add fate-hevc-color-reserved

2025-05-21 Thread Andreas Rheinhardt

Zhao Zhili:
> From: Zhao Zhili 
> 
> ---
>  tests/fate/hevc.mak| 3 +++
>  tests/ref/fate/hevc-color-reserved | 6 ++
>  2 files changed, 9 insertions(+)
>  create mode 100644 tests/ref/fate/hevc-color-reserved
> 
> diff --git a/tests/fate/hevc.mak b/tests/fate/hevc.mak
> index 390ccf46e2..5e721526d0 100644
> --- a/tests/fate/hevc.mak
> +++ b/tests/fate/hevc.mak
> @@ -294,6 +294,9 @@ FATE_HEVC-$(call FRAMECRC, MOV, HEVC) += fate-hevc-mv-position
>  fate-hevc-alpha: CMD = framecrc -i $(TARGET_SAMPLES)/hevc/alpha.mp4
>  FATE_HEVC-$(call FRAMECRC, MOV, HEVC) += fate-hevc-alpha
>  
> +fate-hevc-color-reserved: CMD = framecrc -i $(TARGET_SAMPLES)/hevc/color_prim_reserved0.hevc -fps_mode passthrough -sws_flags +accurate_rnd+bitexact -vf scale,format=nv12
> +FATE_HEVC-$(call FRAMECRC, HEVC, HEVC, SCALE_FILTER) += fate-hevc-color-reserved

A new sample for this? Why don't you just create one with hevc_metadata?

> +
>  FATE_SAMPLES_AVCONV += $(FATE_HEVC-yes)
>  FATE_SAMPLES_FFPROBE += $(FATE_HEVC_FFPROBE-yes)
>  
> diff --git a/tests/ref/fate/hevc-color-reserved b/tests/ref/fate/hevc-color-reserved
> new file mode 100644
> index 00..3351628209
> --- /dev/null
> +++ b/tests/ref/fate/hevc-color-reserved
> @@ -0,0 +1,6 @@
> +#tb 0: 1/60
> +#media_type 0: video
> +#codec_id 0: rawvideo
> +#dimensions 0: 1920x900
> +#sar 0: 1/1
> +0,  0,  0,1,  2592000, 0xfa6fce1e


Re: [FFmpeg-devel] gcc: Remove auto-vectorization limitation.

2025-05-21 Thread softworkz .



> -Original Message-
> From: ffmpeg-devel  On Behalf Of Martin
> Storsjö
> Sent: Wednesday, 21 May 2025 14:22
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] gcc: Remove auto-vectorization limitation.
> 
> On Wed, 21 May 2025, Andreas Rheinhardt wrote:
> 
> > Martin Storsjö:
> >> On Wed, 21 May 2025, Andreas Rheinhardt wrote:
> >>
> >>> Jiawei:
>  This patch modifies the FFmpeg build system to remove the explicit
>  disabling
>  of GCC's auto-vectorization feature.
> 
>  Modern GCC versions (>= 10.0) have demonstrated stable auto-
>  vectorization
>  capabilities through extensive optimizations in loop analysis and SIMD
>  code generation. The explicit -fno-tree-vectorize flag originally added
>  in commit 973859f (2009) to workaround early GCC vectorization
>  instability
>  is no longer necessary.
> 
>  Key improvements justifying this change:
>  1. Enhanced heuristics for loop vectorization cost models
>  2. Mature handling of alignment and memory access patterns
>  3. Robust fallback mechanisms for unsupported architectures
> 
>  This change allows FFmpeg to benefit from automated SIMD optimizations
>  when built with -O3 optimization level, particularly improving
>  performance on x86_64 (AVX), ARM64 (SVE) and RISC-V(RVV) architectures.
> 
>  [1] https://git.ffmpeg.org/gitweb/ffmpeg.git/
>  commit/973859f5230e77beea7bb59dc081870689d6d191
> 
>  ---
>   configure | 1 -
>   1 file changed, 1 deletion(-)
> 
>  diff --git a/configure b/configure
>  index 3730b0524c..b9e95ce4ec 100755
>  --- a/configure
>  +++ b/configure
>  @@ -7656,7 +7656,6 @@ if enabled icc; then
>   disable aligned_stack
>   fi
>   elif enabled gcc; then
>  -    check_optflags -fno-tree-vectorize
>   check_cflags -Werror=format-security
>   check_cflags -Werror=implicit-function-declaration
>   check_cflags -Werror=missing-prototypes
> >>>
> >>> FYI: The last discussion about auto-vectorization is here:
> >>> https://ffmpeg.org/pipermail/ffmpeg-devel/2022-July/299405.html
> >>> It contains a report about a failing build with vectorization enabled:
> >>> https://ffmpeg.org/pipermail/ffmpeg-devel/2022-July/299421.html
> >>> I don't know whether this is still reproducible with the latest GCC.
> >>
> >> The issue which was reported last time, when compiling for i686 mingw32
> >> with --cpu=haswell, seems to have gone away in
> >> 182663a58a7a099e02e76da3b0f96d63e5c26a6d, where we made the whole
> >> problematic x86 inline cabac assembly noinline on i386. (That whole
> >> inline assembly block has been problematic in a large number of cases
> >> anyway.)
> >>
> >
> > So there are currently no known miscompilations due to vectorization
> > with GCC?
> 
> I'm not aware of any, but I haven't tested widely. It certainly is worth
> evaluating.
> 
> (From dav1d, I can anecdotally add that autovectorization does seem to
> help, somewhat, especially when there's not 100% assembly coverage for the
> use case. For some cases it makes things slower than without
> autovectorization, but generally the net result is positive.)
> 
> // Martin
> ___

Hi,

a few years ago, I spent days on that subject. Intel has some great
tools which allow precise analysis of how the compiler applies those
vectorization and loop optimizations, and they also work on code
compiled with gcc, which is what I had been investigating. The focus was
the code in the vf_tonemap filter; later I briefly confirmed my findings
by looking at some other examples. The platform was x86_64 only.

The outcome was that enabling tree-vectorize is beneficial, but combining
it with -O3 has adverse effects. Since then, we are using -O2 with 
tree-vectorization enabled on all platforms.
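
To try this on something small: the loop below is a generic example of the
kind of code the auto-vectorizer targets. It is not taken from vf_tonemap,
and the function name is made up. Building it with
"gcc -O2 -ftree-vectorize -fopt-info-vec" makes GCC report which loops it
vectorized, so the effect of the flags can be checked per loop; whether this
particular loop is reported depends on the GCC version and target.

```c
#include <assert.h>
#include <stddef.h>

/*
 * restrict tells the compiler that the arrays do not overlap, which
 * removes one common obstacle to vectorizing the loop.
 */
static void demo_scale_add(float *restrict dst, const float *restrict a,
                           const float *restrict b, float gain, size_t n)
{
    assert(dst && a && b);
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] * gain + b[i];
}
```

Comparing the report and the timings between -O2 with -ftree-vectorize and
-O3 on loops like this is a cheap way to reproduce the kind of comparison
described above.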

For CPU tone mapping, I still ended up doing a SIMD implementation using 
Intel intrinsics 😊


Best
sw

[FFmpeg-devel] [PATCH v2 2/4] tests/fate/cbs: Add hevc metadata set color test

2025-05-21 Thread Zhao Zhili

From: Zhao Zhili 

---
 tests/fate/cbs.mak | 5 +
 1 file changed, 5 insertions(+)

diff --git a/tests/fate/cbs.mak b/tests/fate/cbs.mak
index 32207e2ee2..138dab67a9 100644
--- a/tests/fate/cbs.mak
+++ b/tests/fate/cbs.mak
@@ -172,6 +172,11 @@ $(foreach N,$(FATE_CBS_DISCARD_TYPES),$(eval $(call FATE_CBS_DISCARD_TEST,hevc,$
 
 FATE_CBS_HEVC-$(call ALLYES, HEVC_DEMUXER HEVC_MUXER HEVC_PARSER FILTER_UNITS_BSF HEVC_METADATA_BSF FILE_PROTOCOL) += $(FATE_CBS_hevc_DISCARD)
 
+fate-cbs-hevc-metadata-set-color: CMD = md5 -i $(TARGET_SAMPLES)/hevc-conformance/AMP_A_Samsung_4.bit -c:v copy -bsf:v hevc_metadata=colour_primaries=0:transfer_characteristics=0:matrix_coefficients=3 -f hevc
+fate-cbs-hevc-metadata-set-color: CMP = oneline
+fate-cbs-hevc-metadata-set-color: REF = d073124fca9e30a46c173292f948967c
+FATE_CBS_HEVC-$(call ALLYES, HEVC_DEMUXER, HEVC_METADATA_BSF, HEVC_MUXER) += fate-cbs-hevc-metadata-set-color
+
 FATE_SAMPLES_AVCONV += $(FATE_CBS_HEVC-yes)
 fate-cbs-hevc: $(FATE_CBS_HEVC-yes)
 
-- 
2.46.0


[FFmpeg-devel] [PATCH v2 1/4] avcodec/h2645_vui: Ensure color primaries/trc/space isn't reserved value

2025-05-21 Thread Zhao Zhili

From: Zhao Zhili 

Fix error reported by swscaler:
Unsupported input (Operation not supported): fmt:yuv420p csp:unknown 
prim:reserved trc:bt709 -> fmt:yuv420p csp:bt709 prim:reserved trc:bt709
---
 libavcodec/h2645_vui.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/libavcodec/h2645_vui.c b/libavcodec/h2645_vui.c
index e5c7bf46f9..0e576c1563 100644
--- a/libavcodec/h2645_vui.c
+++ b/libavcodec/h2645_vui.c
@@ -67,11 +67,16 @@ void ff_h2645_decode_common_vui_params(GetBitContext *gb, 
H2645VUI *vui, void *l
 vui->matrix_coeffs= get_bits(gb, 8);
 
 // Set invalid values to "unspecified"
-if (!av_color_primaries_name(vui->colour_primaries))
+if (vui->colour_primaries == AVCOL_PRI_RESERVED0 ||
+vui->colour_primaries == AVCOL_PRI_RESERVED ||
+!av_color_primaries_name(vui->colour_primaries))
 vui->colour_primaries = AVCOL_PRI_UNSPECIFIED;
-if (!av_color_transfer_name(vui->transfer_characteristics))
+if (vui->transfer_characteristics == AVCOL_TRC_RESERVED0 ||
+vui->transfer_characteristics == AVCOL_TRC_RESERVED ||
+!av_color_transfer_name(vui->transfer_characteristics))
 vui->transfer_characteristics = AVCOL_TRC_UNSPECIFIED;
-if (!av_color_space_name(vui->matrix_coeffs))
+if (vui->matrix_coeffs == AVCOL_SPC_RESERVED ||
+!av_color_space_name(vui->matrix_coeffs))
 vui->matrix_coeffs = AVCOL_SPC_UNSPECIFIED;
 }
 }
-- 
2.46.0
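
A minimal standalone sketch of why the extra comparisons are needed,
assuming (as the "prim:reserved" in the swscale error suggests) that the
name lookup returns a non-NULL string for reserved code points, so a
NULL check alone does not catch them. The enum names and the truncated
name table below are local to this example and only stand in for
av_color_primaries_name(); the numeric values follow ITU-T H.273.

```c
#include <assert.h>
#include <stddef.h>

/* H.273 colour_primaries code points; names are local to this sketch. */
enum {
    PRI_RESERVED0   = 0,
    PRI_BT709       = 1,
    PRI_UNSPECIFIED = 2,
    PRI_RESERVED    = 3,
};

/* Hypothetical, deliberately truncated stand-in for the name lookup:
 * reserved values still have a printable name, unknown values do not. */
static const char *primaries_name(unsigned v)
{
    static const char *const names[] = {
        "reserved", "bt709", "unknown", "reserved", "bt470m", "bt470bg",
    };
    return v < sizeof(names) / sizeof(names[0]) ? names[v] : NULL;
}

/* Map reserved and unnamed values to "unspecified", keep the rest. */
static unsigned sanitize_primaries(unsigned v)
{
    if (v == PRI_RESERVED0 || v == PRI_RESERVED || !primaries_name(v))
        return PRI_UNSPECIFIED;
    return v;
}

/* Quick self-check of the behaviour described above. */
static void check_sanitize_primaries(void)
{
    assert(primaries_name(PRI_RESERVED0) != NULL); /* NULL check alone misses it */
    assert(sanitize_primaries(PRI_RESERVED0) == PRI_UNSPECIFIED);
    assert(sanitize_primaries(PRI_RESERVED)  == PRI_UNSPECIFIED);
    assert(sanitize_primaries(PRI_BT709)     == PRI_BT709);
    assert(sanitize_primaries(200)           == PRI_UNSPECIFIED);
}
```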


[FFmpeg-devel] [PATCH v2 4/4] tests: Add fate-hevc-color-reserved

2025-05-21 Thread Zhao Zhili

From: Zhao Zhili 

---
 tests/fate/hevc.mak| 3 +++
 tests/ref/fate/hevc-color-reserved | 6 ++
 2 files changed, 9 insertions(+)
 create mode 100644 tests/ref/fate/hevc-color-reserved

diff --git a/tests/fate/hevc.mak b/tests/fate/hevc.mak
index 390ccf46e2..8113c04300 100644
--- a/tests/fate/hevc.mak
+++ b/tests/fate/hevc.mak
@@ -294,6 +294,9 @@ FATE_HEVC-$(call FRAMECRC, MOV, HEVC) += 
fate-hevc-mv-position
 fate-hevc-alpha: CMD = framecrc -i $(TARGET_SAMPLES)/hevc/alpha.mp4
 FATE_HEVC-$(call FRAMECRC, MOV, HEVC) += fate-hevc-alpha
 
+fate-hevc-color-reserved: CMD = framecrc -bsf:v 
hevc_metadata=colour_primaries=0:transfer_characteristics=0:matrix_coefficients=3
 -i $(TARGET_SAMPLES)/hevc-conformance/AMP_A_Samsung_4.bit -vf 
scale,format=nv12 -frames:v 1
+FATE_HEVC-$(call FRAMECRC, HEVC, HEVC, HEVC_METADATA_BSF SCALE_FILTER) += 
fate-hevc-color-reserved
+
 FATE_SAMPLES_AVCONV += $(FATE_HEVC-yes)
 FATE_SAMPLES_FFPROBE += $(FATE_HEVC_FFPROBE-yes)
 
diff --git a/tests/ref/fate/hevc-color-reserved 
b/tests/ref/fate/hevc-color-reserved
new file mode 100644
index 00..cba6397aa8
--- /dev/null
+++ b/tests/ref/fate/hevc-color-reserved
@@ -0,0 +1,6 @@
+#tb 0: 1/25
+#media_type 0: video
+#codec_id 0: rawvideo
+#dimensions 0: 2560x1600
+#sar 0: 0/1
+0,  0,  0,1,  6144000, 0x427b9a00
-- 
2.46.0


[FFmpeg-devel] [PATCH v2 3/4] tests/fate/hevc: Fix dependancy for hevc-alpha

2025-05-21 Thread Zhao Zhili

From: Zhao Zhili 

---
 tests/fate/hevc.mak | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/fate/hevc.mak b/tests/fate/hevc.mak
index e432345ef7..390ccf46e2 100644
--- a/tests/fate/hevc.mak
+++ b/tests/fate/hevc.mak
@@ -292,7 +292,7 @@ fate-hevc-mv-position: CMD = framecrc -i 
$(TARGET_SAMPLES)/hevc/multiview.mov -m
 FATE_HEVC-$(call FRAMECRC, MOV, HEVC) += fate-hevc-mv-position
 
 fate-hevc-alpha: CMD = framecrc -i $(TARGET_SAMPLES)/hevc/alpha.mp4
-FATE_HEVC-$(call FRAMECRC, HEVC, HEVC) += fate-hevc-alpha
+FATE_HEVC-$(call FRAMECRC, MOV, HEVC) += fate-hevc-alpha
 
 FATE_SAMPLES_AVCONV += $(FATE_HEVC-yes)
 FATE_SAMPLES_FFPROBE += $(FATE_HEVC_FFPROBE-yes)
-- 
2.46.0


Re: [FFmpeg-devel] [PATCH v2 3/3] tests: Add fate-hevc-color-reserved

2025-05-21 Thread Zhao Zhili



> On May 21, 2025, at 21:52, Andreas Rheinhardt 
>  wrote:
> 
> Zhao Zhili:
>> From: Zhao Zhili 
>> 
>> ---
>> tests/fate/hevc.mak| 3 +++
>> tests/ref/fate/hevc-color-reserved | 6 ++
>> 2 files changed, 9 insertions(+)
>> create mode 100644 tests/ref/fate/hevc-color-reserved
>> 
>> diff --git a/tests/fate/hevc.mak b/tests/fate/hevc.mak
>> index 390ccf46e2..8113c04300 100644
>> --- a/tests/fate/hevc.mak
>> +++ b/tests/fate/hevc.mak
>> @@ -294,6 +294,9 @@ FATE_HEVC-$(call FRAMECRC, MOV, HEVC) += 
>> fate-hevc-mv-position
>> fate-hevc-alpha: CMD = framecrc -i $(TARGET_SAMPLES)/hevc/alpha.mp4
>> FATE_HEVC-$(call FRAMECRC, MOV, HEVC) += fate-hevc-alpha
>> 
>> +fate-hevc-color-reserved: CMD = framecrc -bsf:v 
>> hevc_metadata=colour_primaries=0:transfer_characteristics=0:matrix_coefficients=3
>>  -i $(TARGET_SAMPLES)/hevc-conformance/AMP_A_Samsung_4.bit -vf 
>> scale,format=nv12 -frames:v 1
>> +FATE_HEVC-$(call FRAMECRC, HEVC, HEVC, HEVC_METADATA_BSF SCALE_FILTER) += 
>> fate-hevc-color-reserved
>> +
>> FATE_SAMPLES_AVCONV += $(FATE_HEVC-yes)
>> FATE_SAMPLES_FFPROBE += $(FATE_HEVC_FFPROBE-yes)
>> 
>> diff --git a/tests/ref/fate/hevc-color-reserved 
>> b/tests/ref/fate/hevc-color-reserved
>> new file mode 100644
>> index 00..cba6397aa8
>> --- /dev/null
>> +++ b/tests/ref/fate/hevc-color-reserved
>> @@ -0,0 +1,6 @@
>> +#tb 0: 1/25
>> +#media_type 0: video
>> +#codec_id 0: rawvideo
>> +#dimensions 0: 2560x1600
>> +#sar 0: 0/1
>> +0,  0,  0,1,  6144000, 0x427b9a00
> 
> This seems to simply presume that hevc_metadata does what it is supposed
> to do, but this is not really tested. If hevc_metadata were a no-op for
> this, it would pass the test. Why don't you use something with a hash of
> the output of hevc_metadata?

I added another test. Forgive me for the "PATCH v2" subject, which should have been "PATCH v3".

https://ffmpeg.org/pipermail/ffmpeg-devel/2025-May/343927.html

> 
> - Andreas
> 



Re: [FFmpeg-devel] [PATCH] swscale: rgb_to_yuv neon optimizations

2025-05-21 Thread Dmitriy Kovalenko


Bumping this one for review.


On 19/05/2025 21:50, Dmitriy Kovalenko wrote:

I've found quite a few ways to optimize FFmpeg's existing RGB to
subsampled YUV conversion. In this patch stack I'll try to
improve the performance.

This particular set of changes is a small improvement to all the
existing functions and macros. The biggest performance gain comes
from post-load increment of the pointer, immediate prefetching of
the memory blocks, and interleaving the multiplication and shifting
operations of different registers for better scheduling.

Also changed a bunch of places where cmp + b.le was used instead
of a single cbnz/tbnz instruction, plus some other small cleanups.

Here are checkasm results on a MacBook Pro with the latest M4 Max:

bgra_to_uv_1080_c:          257.5 ( 1.00x)
bgra_to_uv_1080_neon:       211.9 ( 1.22x)
bgra_to_uv_1920_c:          467.1 ( 1.00x)
bgra_to_uv_1920_neon:       379.3 ( 1.23x)
bgra_to_uv_half_1080_c:     198.9 ( 1.00x)
bgra_to_uv_half_1080_neon:  125.7 ( 1.58x)
bgra_to_uv_half_1920_c:     346.3 ( 1.00x)
bgra_to_uv_half_1920_neon:  223.7 ( 1.55x)

bgra_to_uv_1080_c:          268.3 ( 1.00x)
bgra_to_uv_1080_neon:       176.0 ( 1.53x)
bgra_to_uv_1920_c:          456.6 ( 1.00x)
bgra_to_uv_1920_neon:       307.7 ( 1.48x)
bgra_to_uv_half_1080_c:     193.2 ( 1.00x)
bgra_to_uv_half_1080_neon:   96.8 ( 2.00x)
bgra_to_uv_half_1920_c:     347.2 ( 1.00x)
bgra_to_uv_half_1920_neon:  182.6 ( 1.92x)

In my proprietary test on iOS it gives around a 70% performance
improvement when converting a bgra 1920x1920 image to yuv420p.

On my Linux ARM Cortex-R processor the performance improvement is not that
visible, but it is still consistently 5-10% faster than the current
implementation.
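
For readers who do not read NEON assembly, the interleaving mentioned above
can be modelled in scalar C roughly as follows. This is only a sketch under
assumptions: the function name, the coefficient layout and the accumulator
bias are invented for the example and are not the values used in
libswscale; inputs are assumed to be 8-bit pixel values widened to 16 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Saturate a 32-bit value to the int16_t range, like the narrowing
 * step of SQSHRN. */
static int16_t sat16(int32_t x)
{
    if (x > INT16_MAX) return INT16_MAX;
    if (x < INT16_MIN) return INT16_MIN;
    return (int16_t)x;
}

/*
 * Scalar model of one 8-pixel step. coef_u / coef_v hold the {r, g, b}
 * weights; acc_init is whatever bias the caller pre-loads into the
 * accumulators (an assumption of this sketch).
 */
static void rgb_to_uv_8px(const int16_t r[8], const int16_t g[8],
                          const int16_t b[8],
                          const int16_t coef_u[3], const int16_t coef_v[3],
                          int32_t acc_init, int shift,
                          int16_t u_dst[8], int16_t v_dst[8])
{
    assert(shift >= 1 && shift <= 16); /* SQSHRN range for 32 -> 16 bit */

    for (int i = 0; i < 8; i++) {
        int32_t u = acc_init, v = acc_init;

        /* U and V updates alternate, so two independent dependency
         * chains are in flight instead of one long one. */
        u += coef_u[0] * r[i];
        v += coef_v[0] * r[i];
        u += coef_u[1] * g[i];
        v += coef_v[1] * g[i];
        u += coef_u[2] * b[i];
        v += coef_v[2] * b[i];

        /* Assumes arithmetic right shift for negative values, which is
         * what mainstream compilers implement. */
        u_dst[i] = sat16(u >> shift);
        v_dst[i] = sat16(v >> shift);
    }
}
```

A C compiler is free to reorder these statements anyway; the point is only
to show the order in which the hand-written assembly issues the
multiply-accumulates.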

Signed-off-by: Dmitriy Kovalenko 
---
 libswscale/aarch64/input.S | 166 +
 1 file changed, 112 insertions(+), 54 deletions(-)

diff --git a/libswscale/aarch64/input.S b/libswscale/aarch64/input.S
index c1c0adffc8..ee8eb24c14 100644
--- a/libswscale/aarch64/input.S
+++ b/libswscale/aarch64/input.S
@@ -1,5 +1,4 @@
-/*
- * Copyright (c) 2024 Zhao Zhili 
+/* Copyright (c) 2024 Zhao Zhili 
  *
  * This file is part of FFmpeg.
  *
@@ -57,20 +56,41 @@
 sqshrn2 \dst\().8h, \dst2\().4s, \right_shift // dst_higher_half = dst2 >> right_shift
 .endm

+// interleaved product version of the rgb to yuv gives slightly better performance on non-performant mobile
+.macro rgb_to_uv_interleaved_product r, g, b, u_coef0, u_coef1, u_coef2, v_coef0, v_coef1, v_coef2, u_dst1, u_dst2, v_dst1, v_dst2, u_dst, v_dst, right_shift
+    smlal   \u_dst1\().4s, \u_coef0\().4h, \r\().4h // U += ru * r (first 4)
+    smlal   \v_dst1\().4s, \v_coef0\().4h, \r\().4h // V += rv * r (first 4)
+    smlal2  \u_dst2\().4s, \u_coef0\().8h, \r\().8h // U += ru * r (second 4)
+    smlal2  \v_dst2\().4s, \v_coef0\().8h, \r\().8h // V += rv * r (second 4)
+
+    smlal   \u_dst1\().4s, \u_coef1\().4h, \g\().4h // U += gu * g (first 4)
+    smlal   \v_dst1\().4s, \v_coef1\().4h, \g\().4h // V += gv * g (first 4)
+    smlal2  \u_dst2\().4s, \u_coef1\().8h, \g\().8h // U += gu * g (second 4)
+    smlal2  \v_dst2\().4s, \v_coef1\().8h, \g\().8h // V += gv * g (second 4)
+
+    smlal   \u_dst1\().4s, \u_coef2\().4h, \b\().4h // U += bu * b (first 4)
+    smlal   \v_dst1\().4s, \v_coef2\().4h, \b\().4h // V += bv * b (first 4)
+    smlal2  \u_dst2\().4s, \u_coef2\().8h, \b\().8h // U += bu * b (second 4)
+    smlal2  \v_dst2\().4s, \v_coef2\().8h, \b\().8h // V += bv * b (second 4)
+
+    sqshrn  \u_dst\().4h, \u_dst1\().4s, \right_shift   // U first 4 pixels
+    sqshrn2 \u_dst\().8h, \u_dst2\().4s, \right_shift   // U all 8 pixels
+    sqshrn  \v_dst\().4h, \v_dst1\().4s, \right_shift   // V first 4 pixels
+    sqshrn2 \v_dst\().8h, \v_dst2\().4s, \right_shift   // V all 8 pixels
+.endm
+
 .macro rgbToY_neon fmt_bgr, fmt_rgb, element, alpha_first=0
 function ff_\fmt_bgr\()ToY_neon, export=1
-    cmp w4, #0  // check width > 0
+    cbz w4, 3f  // check width > 0
 ldp w12, w11, [x5]  // w12: ry, w11: gy
 ldr w10, [x5, #8]   // w10: by
-    b.gt    4f
-    ret
+    b   4f
 endfunc

 function ff_\fmt_rgb\()ToY_neon, export=1
-    cmp w4, #0  // check width > 0
+    cbz w4, 3f  // check width > 0

Re: [FFmpeg-devel] gcc: Remove auto-vectorization limitation.

2025-05-21 Thread Jiawei


On 2025/5/22 2:21, Frank Plowman wrote:

On 21/05/2025 11:17, Jiawei wrote:

On 2025/5/21 14:52, Nicolas George wrote:

Jiawei (HE12025-05-21):

  particularly improving
performance on x86_64 (AVX), ARM64 (SVE) and RISC-V(RVV) architectures.

Benchmark needed.

Regards,


Hi Nicolas,


Since I am a GCC developer, I'm not so familiar with the FFmpeg test
flow. Here is my test process; if anything is incorrect, please point
it out:


1. Download the video bbb_sunflower_2160p_30fps_normal.mp4.zip from
https://download.blender.org/demo/movies/BBB/, then run:

```

ffmpeg -i bbb_sunflower_2160p_30fps_normal.mp4 -t 60 -vf
"scale=1920:1080" -c:v libx265 -c:a libmp3lame 1080p_hevc_mp3.mp4
```

This produces the 1080p video used as the benchmark test video.


2. Build two versions of FFmpeg, one with the patch and one without it,
using the GCC 13.3 release, verified on an Intel(R) Core(TM) Ultra 9 285HX.


Using patch:

```
./ffmpeg -benchmark -i ~/mp/1080p_hevc_mp3.mp4 -f null -
ffmpeg version N-119636-g96518c8d8d Copyright (c) 2000-2025 the FFmpeg
developers
    built with gcc 13 (Ubuntu 13.3.0-6ubuntu2~24.04)
    configuration: --prefix=/home/pz9115/ffpo --disable-ffplay --arch=x64
--extra-cflags=-O3 --enable-static --target-os=linux
    libavutil  60.  2.100 / 60.  2.100
    libavcodec 62.  3.101 / 62.  3.101
    libavformat    62.  0.102 / 62.  0.102
    libavdevice    62.  0.100 / 62.  0.100
    libavfilter    11.  0.100 / 11.  0.100
    libswscale  9.  0.100 /  9.  0.100
    libswresample   6.  0.100 /  6.  0.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from
'/home/pz9115/mp/1080p_hevc_mp3.mp4':
    Metadata:
      major_brand : isom
      minor_version   : 512
      compatible_brands: isomiso2mp41
      title   : Big Buck Bunny, Sunflower version
      artist  : Blender Foundation 2008, Janus Bager Kristensen 2013
      composer    : Sacha Goedegebure
      encoder : Lavf60.16.100
      comment : Creative Commons Attribution 3.0 -
http://bbb3d.renderfarming.net
      genre   : Animation
    Duration: 00:01:00.00, start: 0.00, bitrate: 1564 kb/s
    Stream #0:0[0x1](und): Video: hevc (Main) (hev1 / 0x31766568),
yuv420p(tv, progressive), 1920x1080 [SAR 1:1 DAR 16:9], 1429 kb/s, 30
fps, 30 tbr, 15360 tbn (default)
      Metadata:
    handler_name    : GPAC ISO Video Handler
    vendor_id   : [0][0][0][0]
    encoder : Lavc60.31.102 libx265
    Stream #0:1[0x2](und): Audio: mp3 (mp3float) (mp4a / 0x6134706D),
48000 Hz, stereo, fltp, 128 kb/s (default)
      Metadata:
    handler_name    : GPAC ISO Audio Handler
    vendor_id   : [0][0][0][0]
Stream mapping:
    Stream #0:0 -> #0:0 (hevc (native) -> wrapped_avframe (native))
    Stream #0:1 -> #0:1 (mp3 (mp3float) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, null, to 'pipe:':
    Metadata:
      major_brand : isom
      minor_version   : 512
      compatible_brands: isomiso2mp41
      title   : Big Buck Bunny, Sunflower version
      artist  : Blender Foundation 2008, Janus Bager Kristensen 2013
      composer    : Sacha Goedegebure
      genre   : Animation
      comment : Creative Commons Attribution 3.0 -
http://bbb3d.renderfarming.net
      encoder : Lavf62.0.102
    Stream #0:0(und): Video: wrapped_avframe, yuv420p(tv, progressive),
1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 200 kb/s, 30 fps, 30 tbn (default)
      Metadata:
    encoder : Lavc62.3.101 wrapped_avframe
    handler_name    : GPAC ISO Video Handler
    vendor_id   : [0][0][0][0]
    Stream #0:1(und): Audio: pcm_s16le, 48000 Hz, stereo, s16, 1536 kb/s
(default)
      Metadata:
    encoder : Lavc62.3.101 pcm_s16le
    handler_name    : GPAC ISO Audio Handler
    vendor_id   : [0][0][0][0]
[out#0/null @ 0x565233669eb0] video:731KiB audio:11250KiB subtitle:0KiB
other streams:0KiB global headers:0KiB muxing overhead: unknown
frame= 1800 fps=635 q=-0.0 Lsize=N/A time=00:01:00.00 bitrate=N/A
speed=21.2x elapsed=0:00:02.83
bench: utime=11.324s stime=0.290s rtime=2.834s
bench: maxrss=186556KiB
```

Without patch (here I add -fno-tree-vectorize directly):

./ffmpeg -benchmark -i ~/mp/1080p_hevc_mp3.mp4 -f null -
ffmpeg version N-119636-g96518c8d8d Copyright (c) 2000-2025 the FFmpeg
developers
    built with gcc 13 (Ubuntu 13.3.0-6ubuntu2~24.04)
    configuration: --prefix=/home/pz9115/ffpo --disable-ffplay --arch=x64
--extra-cflags='-O3 -fno-tree-vectorize' --enable-static --target-os=linux
    libavutil  60.  2.100 / 60.  2.100
    libavcodec 62.  3.101 / 62.  3.101
    libavformat    62.  0.102 / 62.  0.102
    libavdevice    62.  0.100 / 62.  0.100
    libavfilter    11.  0.100 / 11.  0.100
    libswscale  9.  0.100 /  9.  0.100
    libswresample   6.  0.100 /  6.

Re: [FFmpeg-devel] [PATCH 5/7] avfilter/blackdetect_vulkan: add hw accelerated blackdetect filter

2025-05-21 Thread Lynne


On 18/05/2025 21:11, Niklas Haas wrote:

From: Niklas Haas 

Like vf_blackdetect but better, faster, stronger, harder.

Signed-off-by: Niklas Haas 
Sponsored-by: nxtedition
---
  configure   |   1 +
  doc/filters.texi|   2 +-
  libavfilter/Makefile|   1 +
  libavfilter/allfilters.c|   1 +
  libavfilter/vf_blackdetect_vulkan.c | 431 
  5 files changed, 435 insertions(+), 1 deletion(-)
  create mode 100644 libavfilter/vf_blackdetect_vulkan.c

LGTM



Re: [FFmpeg-devel] [PATCH 7/7] avutil/vf_scdet_vulkan: add new filter

2025-05-21 Thread Lynne


On 18/05/2025 21:11, Niklas Haas wrote:

From: Niklas Haas 

Carbon copy of vf_scdet.

Signed-off-by: Niklas Haas 
Sponsored-by: nxtedition
---
  configure |   1 +
  libavfilter/Makefile  |   1 +
  libavfilter/allfilters.c  |   1 +
  libavfilter/vf_scdet_vulkan.c | 412 ++
  4 files changed, 415 insertions(+)
  create mode 100644 libavfilter/vf_scdet_vulkan.c


LGTM



[FFmpeg-devel] [PATCH v7 0/4] Remove chained ogg stream header packets from the demuxer

2025-05-21 Thread Romain Beauxis

## Changes since last revision:
* Removed invalid return statement in vorbisdec.c

Romain Beauxis (4):
  libavformat/oggdec.{c,h}: Add new_extradata, use it to pass extradata
to the next decoded packet.
  ogg/vorbis: factor out header processing logic.
  ogg/vorbis: implement header packet skip in chained ogg bitstreams.
  libavformat/oggdec.h: Change paket function documentation to return 1
on header packets only.

 libavcodec/vorbis_parser.h |  11 ++
 libavcodec/vorbisdec.c |  75 +
 libavformat/oggdec.c   |  11 ++
 libavformat/oggdec.h   |   6 +-
 libavformat/oggparsevorbis.c   | 167 +++--
 tests/ref/fate/ogg-vorbis-chained-meta.txt |   3 -
 6 files changed, 190 insertions(+), 83 deletions(-)

-- 
2.39.5 (Apple Git-154)


[FFmpeg-devel] [PATCH v7 3/4] ogg/vorbis: implement header packet skip in chained ogg bitstreams.

2025-05-21 Thread Romain Beauxis

---
 libavcodec/vorbis_parser.h | 11 
 libavcodec/vorbisdec.c | 75 +-
 libavformat/oggparsevorbis.c   | 63 +-
 tests/ref/fate/ogg-vorbis-chained-meta.txt |  3 -
 4 files changed, 115 insertions(+), 37 deletions(-)

diff --git a/libavcodec/vorbis_parser.h b/libavcodec/vorbis_parser.h
index 789932ac49..b176fe536c 100644
--- a/libavcodec/vorbis_parser.h
+++ b/libavcodec/vorbis_parser.h
@@ -30,6 +30,17 @@
 
 typedef struct AVVorbisParseContext AVVorbisParseContext;
 
+/**
+ * Used by the vorbis parser to pass new chained stream headers
+ * as extradata.
+ */
+typedef struct vorbis_new_extradata {
+uint8_t *header;
+size_t   header_size;
+uint8_t *setup;
+size_t   setup_size;
+} vorbis_new_extradata;
+
 /**
  * Allocate and initialize the Vorbis parser using headers in the extradata.
  */
diff --git a/libavcodec/vorbisdec.c b/libavcodec/vorbisdec.c
index adbd726183..6c5d3fa50f 100644
--- a/libavcodec/vorbisdec.c
+++ b/libavcodec/vorbisdec.c
@@ -43,6 +43,7 @@
 #include "vorbis.h"
 #include "vorbisdsp.h"
 #include "vorbis_data.h"
+#include "vorbis_parser.h"
 #include "xiph.h"
 
 #define V_NB_BITS 8
@@ -1778,47 +1779,59 @@ static int vorbis_decode_frame(AVCodecContext *avctx, 
AVFrame *frame,
 GetBitContext *gb = &vc->gb;
 float *channel_ptrs[255];
 int i, len, ret;
+size_t new_extradata_size;
+vorbis_new_extradata *new_extradata;
+const uint8_t *header;
+const uint8_t *setup;
 
 ff_dlog(NULL, "packet length %d \n", buf_size);
 
-if (*buf == 1 && buf_size > 7) {
-if ((ret = init_get_bits8(gb, buf + 1, buf_size - 1)) < 0)
-return ret;
+new_extradata = (vorbis_new_extradata *)av_packet_get_side_data(
+avpkt, AV_PKT_DATA_NEW_EXTRADATA, &new_extradata_size);
 
-vorbis_free(vc);
-if ((ret = vorbis_parse_id_hdr(vc))) {
-av_log(avctx, AV_LOG_ERROR, "Id header corrupt.\n");
-vorbis_free(vc);
-return ret;
-}
+if (new_extradata) {
+header = new_extradata->header;
+setup = new_extradata->setup;
 
-av_channel_layout_uninit(&avctx->ch_layout);
-if (vc->audio_channels > 8) {
-avctx->ch_layout.order   = AV_CHANNEL_ORDER_UNSPEC;
-avctx->ch_layout.nb_channels = vc->audio_channels;
-} else {
-av_channel_layout_copy(&avctx->ch_layout, 
&ff_vorbis_ch_layouts[vc->audio_channels - 1]);
-}
+if (header && *header == 1 && new_extradata->header_size > 7) {
+if ((ret = init_get_bits8(
+gb, header + 1,
+new_extradata->header_size - 1)) < 0)
+return ret;
 
-avctx->sample_rate = vc->audio_samplerate;
-return buf_size;
-}
+vorbis_free(vc);
+if ((ret = vorbis_parse_id_hdr(vc))) {
+av_log(avctx, AV_LOG_ERROR, "Id header corrupt.\n");
+vorbis_free(vc);
+return ret;
+}
 
-if (*buf == 3 && buf_size > 7) {
-av_log(avctx, AV_LOG_DEBUG, "Ignoring comment header\n");
-return buf_size;
-}
+av_channel_layout_uninit(&avctx->ch_layout);
+if (vc->audio_channels > 8) {
+avctx->ch_layout.order   = AV_CHANNEL_ORDER_UNSPEC;
+avctx->ch_layout.nb_channels = vc->audio_channels;
+} else {
+av_channel_layout_copy(
+&avctx->ch_layout,
+&ff_vorbis_ch_layouts[vc->audio_channels - 1]);
+}
 
-if (*buf == 5 && buf_size > 7 && vc->channel_residues && !vc->modes) {
-if ((ret = init_get_bits8(gb, buf + 1, buf_size - 1)) < 0)
-return ret;
+avctx->sample_rate = vc->audio_samplerate;
+}
 
-if ((ret = vorbis_parse_setup_hdr(vc))) {
-av_log(avctx, AV_LOG_ERROR, "Setup header corrupt.\n");
-vorbis_free(vc);
-return ret;
+if (setup && *setup == 5 && new_extradata->setup_size > 7 &&
+vc->channel_residues && !vc->modes) {
+if ((ret = init_get_bits8(
+   gb, setup + 1,
+   new_extradata->setup_size - 1)) < 0)
+return ret;
+
+if ((ret = vorbis_parse_setup_hdr(vc))) {
+av_log(avctx, AV_LOG_ERROR, "Setup header corrupt.\n");
+vorbis_free(vc);
+return ret;
+}
 }
-return buf_size;
 }
 
 if (!vc->channel_residues || !vc->modes) {
diff --git a/libavformat/oggparsevorbis.c b/libavformat/oggparsevorbis.c
index 62cc2da6de..ee2e01f468 100644
--- a/libavformat/oggparsevorbis.c
+++ b/libavformat/oggparsevorbis.c
@@ -255,12 +255,19 @@ static void vorbis_cleanup(AVFormatContext *s, int idx)
 struct ogg *ogg = s->priv_data;
 struct ogg_stream *os =

[FFmpeg-devel] [PATCH v7 2/4] ogg/vorbis: factor out header processing logic.

2025-05-21 Thread Romain Beauxis

---
 libavformat/oggparsevorbis.c | 104 ---
 1 file changed, 60 insertions(+), 44 deletions(-)

diff --git a/libavformat/oggparsevorbis.c b/libavformat/oggparsevorbis.c
index 9f50ab9ffc..62cc2da6de 100644
--- a/libavformat/oggparsevorbis.c
+++ b/libavformat/oggparsevorbis.c
@@ -293,6 +293,62 @@ static int vorbis_update_metadata(AVFormatContext *s, int 
idx)
 return ret;
 }
 
+static int vorbis_parse_header(AVFormatContext *s, AVStream *st,
+   const uint8_t *p, unsigned int psize)
+{
+unsigned blocksize, bs0, bs1;
+int srate;
+int channels;
+
+if (psize != 30)
+return AVERROR_INVALIDDATA;
+
+p += 7; /* skip "\001vorbis" tag */
+
+if (bytestream_get_le32(&p) != 0) /* vorbis_version */
+return AVERROR_INVALIDDATA;
+
+channels = bytestream_get_byte(&p);
+if (st->codecpar->ch_layout.nb_channels &&
+channels != st->codecpar->ch_layout.nb_channels) {
+av_log(s, AV_LOG_ERROR, "Channel change is not supported\n");
+return AVERROR_PATCHWELCOME;
+}
+st->codecpar->ch_layout.nb_channels = channels;
+srate   = bytestream_get_le32(&p);
+p += 4; // skip maximum bitrate
+st->codecpar->bit_rate = bytestream_get_le32(&p); // nominal bitrate
+p += 4; // skip minimum bitrate
+
+blocksize = bytestream_get_byte(&p);
+bs0   = blocksize & 15;
+bs1   = blocksize >> 4;
+
+if (bs0 > bs1)
+return AVERROR_INVALIDDATA;
+if (bs0 < 6 || bs1 > 13)
+return AVERROR_INVALIDDATA;
+
+if (bytestream_get_byte(&p) != 1) /* framing_flag */
+return AVERROR_INVALIDDATA;
+
+st->codecpar->codec_type = AVMEDIA_TYPE_AUDIO;
+st->codecpar->codec_id   = AV_CODEC_ID_VORBIS;
+
+if (srate > 0) {
+if (st->codecpar->sample_rate &&
+srate != st->codecpar->sample_rate) {
+av_log(s, AV_LOG_ERROR, "Sample rate change is not supported\n");
+return AVERROR_PATCHWELCOME;
+}
+
+st->codecpar->sample_rate = srate;
+avpriv_set_pts_info(st, 64, 1, srate);
+}
+
+return 1;
+}
+
 static int vorbis_header(AVFormatContext *s, int idx)
 {
 struct ogg *ogg = s->priv_data;
@@ -329,50 +385,10 @@ static int vorbis_header(AVFormatContext *s, int idx)
 priv->packet[pkt_type >> 1] = av_memdup(os->buf + os->pstart, os->psize);
 if (!priv->packet[pkt_type >> 1])
 return AVERROR(ENOMEM);
-if (os->buf[os->pstart] == 1) {
-const uint8_t *p = os->buf + os->pstart + 7; /* skip "\001vorbis" tag 
*/
-unsigned blocksize, bs0, bs1;
-int srate;
-int channels;
-
-if (os->psize != 30)
-return AVERROR_INVALIDDATA;
-
-if (bytestream_get_le32(&p) != 0) /* vorbis_version */
-return AVERROR_INVALIDDATA;
-
-channels = bytestream_get_byte(&p);
-if (st->codecpar->ch_layout.nb_channels &&
-channels != st->codecpar->ch_layout.nb_channels) {
-av_log(s, AV_LOG_ERROR, "Channel change is not supported\n");
-return AVERROR_PATCHWELCOME;
-}
-st->codecpar->ch_layout.nb_channels = channels;
-srate   = bytestream_get_le32(&p);
-p += 4; // skip maximum bitrate
-st->codecpar->bit_rate = bytestream_get_le32(&p); // nominal bitrate
-p += 4; // skip minimum bitrate
-
-blocksize = bytestream_get_byte(&p);
-bs0   = blocksize & 15;
-bs1   = blocksize >> 4;
-
-if (bs0 > bs1)
-return AVERROR_INVALIDDATA;
-if (bs0 < 6 || bs1 > 13)
-return AVERROR_INVALIDDATA;
-
-if (bytestream_get_byte(&p) != 1) /* framing_flag */
-return AVERROR_INVALIDDATA;
-
-st->codecpar->codec_type = AVMEDIA_TYPE_AUDIO;
-st->codecpar->codec_id   = AV_CODEC_ID_VORBIS;
-
-if (srate > 0) {
-st->codecpar->sample_rate = srate;
-avpriv_set_pts_info(st, 64, 1, srate);
-}
-} else if (os->buf[os->pstart] == 3) {
+if (pkt_type == 1)
+return vorbis_parse_header(s, st, os->buf + os->pstart, os->psize);
+
+if (pkt_type == 3) {
 if (vorbis_update_metadata(s, idx) >= 0 && priv->len[1] > 10) {
 unsigned new_len;
 
-- 
2.39.5 (Apple Git-154)
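
For reference, the checks performed by the factored-out function can be
reproduced in a standalone sketch like the one below. It mirrors the field
order and validation in the patch but is not FFmpeg code: the struct,
function names and return convention (0 on success, -1 on invalid data,
where the patch returns 1 and AVERROR codes) are assumptions for
illustration, and the explicit "\001vorbis" tag comparison is an addition
that the patch does not make at this point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fields extracted from a Vorbis identification header (sketch only). */
typedef struct VorbisIdHeader {
    uint8_t  channels;
    uint32_t sample_rate;
    uint32_t nominal_bitrate;
    unsigned blocksize0_log2;
    unsigned blocksize1_log2;
} VorbisIdHeader;

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0]       | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Returns 0 on success, -1 on invalid data. On failure *out may be
 * partially written and must not be used. */
static int parse_vorbis_id_header(const uint8_t *p, size_t size,
                                  VorbisIdHeader *out)
{
    unsigned blocksize, bs0, bs1;

    assert(p && out);

    if (size != 30)                       /* the header has a fixed size */
        return -1;
    if (memcmp(p, "\001vorbis", 7) != 0)  /* extra check, not in the patch */
        return -1;
    p += 7;

    if (read_le32(p) != 0)                /* vorbis_version */
        return -1;
    p += 4;

    out->channels    = *p++;
    out->sample_rate = read_le32(p);
    p += 4;
    p += 4;                               /* skip maximum bitrate */
    out->nominal_bitrate = read_le32(p);
    p += 4;
    p += 4;                               /* skip minimum bitrate */

    blocksize = *p++;
    bs0 = blocksize & 15;                 /* log2 of the short block size */
    bs1 = blocksize >> 4;                 /* log2 of the long block size */
    if (bs0 > bs1 || bs0 < 6 || bs1 > 13)
        return -1;

    if (*p != 1)                          /* framing_flag */
        return -1;

    out->blocksize0_log2 = bs0;
    out->blocksize1_log2 = bs1;
    return 0;
}
```

The fixed size follows from the layout: 7 bytes of tag, 4 of version,
1 of channel count, 4 of sample rate, 12 of bitrates, 1 of block sizes
and 1 of framing flag add up to 30.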


[FFmpeg-devel] [PATCH v7 1/4] libavformat/oggdec.{c, h}: Add new_extradata, use it to pass extradata to the next decoded packet.

2025-05-21 Thread Romain Beauxis

---
 libavformat/oggdec.c | 11 +++
 libavformat/oggdec.h |  2 ++
 2 files changed, 13 insertions(+)

diff --git a/libavformat/oggdec.c b/libavformat/oggdec.c
index 5557eb4a14..cb77cdd994 100644
--- a/libavformat/oggdec.c
+++ b/libavformat/oggdec.c
@@ -77,6 +77,7 @@ static void free_stream(AVFormatContext *s, int i)
 
 av_freep(&stream->private);
 av_freep(&stream->new_metadata);
+av_freep(&stream->new_extradata);
 }
 
 //FIXME We could avoid some structure duplication
@@ -888,6 +889,16 @@ retry:
 os->new_metadata_size = 0;
 }
 
+if (os->new_extradata) {
+ret = av_packet_add_side_data(pkt, AV_PKT_DATA_NEW_EXTRADATA,
+  os->new_extradata, 
os->new_extradata_size);
+if (ret < 0)
+return ret;
+
+os->new_extradata = NULL;
+os->new_extradata_size = 0;
+}
+
 return psize;
 }
 
diff --git a/libavformat/oggdec.h b/libavformat/oggdec.h
index bc670d0f1e..5083de646c 100644
--- a/libavformat/oggdec.h
+++ b/libavformat/oggdec.h
@@ -94,6 +94,8 @@ struct ogg_stream {
 int end_trimming; ///< set the number of packets to drop from the end
 uint8_t *new_metadata;
 size_t new_metadata_size;
+uint8_t *new_extradata;
+size_t new_extradata_size;
 void *private;
 };
 
-- 
2.39.5 (Apple Git-154)
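
The hand-off in this patch follows a common ownership-transfer pattern:
the demuxer keeps a pending buffer, attaches it to the next packet, and
then clears its own pointer so the buffer is not freed twice. In FFmpeg,
av_packet_add_side_data() takes ownership of the buffer on success. The
sketch below models that pattern with hypothetical stand-in types; none
of these names exist in libavformat.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-stream demuxer state. */
typedef struct PendingStream {
    uint8_t *new_extradata;
    size_t   new_extradata_size;
} PendingStream;

/* Hypothetical stand-in for an outgoing packet with one side-data slot. */
typedef struct OutPacket {
    uint8_t *side_data;
    size_t   side_data_size;
} OutPacket;

/* Returns 0 on success or when nothing is pending, -1 if the packet
 * already carries side data (the stream then keeps ownership). */
static int attach_pending_extradata(PendingStream *os, OutPacket *pkt)
{
    assert(os && pkt);

    if (!os->new_extradata)
        return 0;
    if (pkt->side_data)
        return -1;

    pkt->side_data      = os->new_extradata;
    pkt->side_data_size = os->new_extradata_size;

    /* Ownership has moved to the packet: clear the stream's copy so
     * stream cleanup does not free the same buffer again. */
    os->new_extradata      = NULL;
    os->new_extradata_size = 0;
    return 0;
}
```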


[FFmpeg-devel] [PATCH v7 4/4] libavformat/oggdec.h: Change paket function documentation to return 1 on header packets only.

2025-05-21 Thread Romain Beauxis

---
 libavformat/oggdec.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libavformat/oggdec.h b/libavformat/oggdec.h
index 5083de646c..c15fbe738e 100644
--- a/libavformat/oggdec.h
+++ b/libavformat/oggdec.h
@@ -42,8 +42,8 @@ struct ogg_codec {
  * Attempt to process a packet as a data packet
  * @return < 0 (AVERROR) code or -1 on error
  * == 0 if the packet was a regular data packet.
- * == 0 or 1 if the packet was a header from a chained bitstream.
- *   (1 will cause the packet to be skiped in calling code 
(ogg_packet())
+ * == 1 if the packet was a header from a chained bitstream.
+ *This will cause the packet to be skiped in calling code 
(ogg_packet()
  */
 int (*packet)(AVFormatContext *, int);
 /**
-- 
2.39.5 (Apple Git-154)


Re: [FFmpeg-devel] [PATCH] build: remove unused SLIBOBJS variable

2025-05-21 Thread Ramiro Polla

On Sat, May 17, 2025 at 12:36 AM Ramiro Polla  wrote:
>
> The SLIBOBJS variable was introduced in 56572787ae2 but is no longer used.
> Another variable, SHLIBOBJS, was introduced after SLIBOBJS, in 20b0d24c2f7.
> The functionality from SLIBOBJS was effectively migrated to SHLIBOBJS in 
> b77fff47d0d.
>
> No code has used SLIBOBJS since.
>
> This commit removes all remaining references to SLIBOBJS from the build 
> system.
> ---
>  Makefile| 3 +--
>  ffbuild/common.mak  | 7 ++-
>  ffbuild/library.mak | 2 +-
>  3 files changed, 4 insertions(+), 8 deletions(-)
>
> diff --git a/Makefile b/Makefile
> index e2250f6bc6..09509bb930 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -104,8 +104,7 @@ SUBDIR_VARS := CLEANFILES FFLIBS HOSTPROGS TESTPROGS 
> TOOLS   \
> ALTIVEC-OBJS VSX-OBJS MMX-OBJS X86ASM-OBJS\
> MIPSFPU-OBJS MIPSDSPR2-OBJS MIPSDSP-OBJS MSA-OBJS \
> MMI-OBJS LSX-OBJS LASX-OBJS RV-OBJS RVV-OBJS RVVB-OBJS\
> -   OBJS SLIBOBJS SHLIBOBJS STLIBOBJS HOSTOBJS TESTOBJS   \
> -   SIMD128-OBJS
> +   OBJS SHLIBOBJS STLIBOBJS HOSTOBJS TESTOBJS SIMD128-OBJS
>
>  define RESET
>  $(1) :=
> diff --git a/ffbuild/common.mak b/ffbuild/common.mak
> index 0e1eb1f62b..1ac3c31c1e 100644
> --- a/ffbuild/common.mak
> +++ b/ffbuild/common.mak
> @@ -197,7 +197,6 @@ endif
>  include $(SRC_PATH)/ffbuild/arch.mak
>
>  OBJS  += $(OBJS-yes)
> -SLIBOBJS  += $(SLIBOBJS-yes)
>  SHLIBOBJS += $(SHLIBOBJS-yes)
>  STLIBOBJS += $(STLIBOBJS-yes)
>  FFLIBS:= $($(NAME)_FFLIBS) $(FFLIBS-yes) $(FFLIBS)
> @@ -207,7 +206,6 @@ LDLIBS   = $(FFLIBS:%=%$(BUILDSUF))
>  FFEXTRALIBS := $(LDLIBS:%=$(LD_LIB)) $(foreach lib,EXTRALIBS-$(NAME) 
> $(FFLIBS:%=EXTRALIBS-%),$($(lib))) $(EXTRALIBS)
>
>  OBJS  := $(sort $(OBJS:%=$(SUBDIR)%))
> -SLIBOBJS  := $(sort $(SLIBOBJS:%=$(SUBDIR)%))
>  SHLIBOBJS := $(sort $(SHLIBOBJS:%=$(SUBDIR)%))
>  STLIBOBJS := $(sort $(STLIBOBJS:%=$(SUBDIR)%))
>  TESTOBJS  := $(TESTOBJS:%=$(SUBDIR)tests/%) $(TESTPROGS:%=$(SUBDIR)tests/%.o)
> @@ -245,13 +243,12 @@ $(HOSTPROGS): %$(HOSTEXESUF): %.o
>  $(OBJS): | $(sort $(dir $(OBJS)))
>  $(HOBJS):| $(sort $(dir $(HOBJS)))
>  $(HOSTOBJS): | $(sort $(dir $(HOSTOBJS)))
> -$(SLIBOBJS): | $(sort $(dir $(SLIBOBJS)))
>  $(SHLIBOBJS): | $(sort $(dir $(SHLIBOBJS)))
>  $(STLIBOBJS): | $(sort $(dir $(STLIBOBJS)))
>  $(TESTOBJS): | $(sort $(dir $(TESTOBJS)))
>  $(TOOLOBJS): | tools
>
> -OUTDIRS := $(OUTDIRS) $(dir $(OBJS) $(HOBJS) $(HOSTOBJS) $(SLIBOBJS) $(SHLIBOBJS) $(STLIBOBJS) $(TESTOBJS))
> +OUTDIRS := $(OUTDIRS) $(dir $(OBJS) $(HOBJS) $(HOSTOBJS) $(SHLIBOBJS) $(STLIBOBJS) $(TESTOBJS))
>
>  CLEANSUFFIXES = *.d *.gcda *.gcno *.h.c *.ho *.map *.o *.objs *.pc *.ptx 
> *.ptx.gz *.ptx.c *.ver *.version *.html.gz *.html.c *.css.gz *.css.c  
> *$(DEFAULT_X86ASMD).asm *~ *.ilk *.pdb
>  LIBSUFFIXES   = *.a *.lib *.so *.so.* *.dylib *.dll *.def *.dll.a
> @@ -263,4 +260,4 @@ endef
>
>  $(eval $(RULES))
>
> --include $(wildcard $(OBJS:.o=.d) $(HOSTOBJS:.o=.d) $(TESTOBJS:.o=.d) $(HOBJS:.o=.d) $(SHLIBOBJS:.o=.d) $(STLIBOBJS:.o=.d) $(SLIBOBJS:.o=.d)) $(OBJS:.o=$(DEFAULT_X86ASMD).d)
> +-include $(wildcard $(OBJS:.o=.d) $(HOSTOBJS:.o=.d) $(TESTOBJS:.o=.d) $(HOBJS:.o=.d) $(SHLIBOBJS:.o=.d) $(STLIBOBJS:.o=.d)) $(OBJS:.o=$(DEFAULT_X86ASMD).d)
> diff --git a/ffbuild/library.mak b/ffbuild/library.mak
> index 288c82a177..569708c73b 100644
> --- a/ffbuild/library.mak
> +++ b/ffbuild/library.mak
> @@ -70,7 +70,7 @@ $(SUBDIR)lib$(NAME).ver: $(SUBDIR)lib$(NAME).v $(OBJS)
>  $(SUBDIR)$(SLIBNAME): $(SUBDIR)$(SLIBNAME_WITH_MAJOR)
> $(Q)cd ./$(SUBDIR) && $(LN_S) $(SLIBNAME_WITH_MAJOR) $(SLIBNAME)
>
> -$(SUBDIR)$(SLIBNAME_WITH_MAJOR): $(OBJS) $(SHLIBOBJS) $(SLIBOBJS) $(SUBDIR)lib$(NAME).ver
> +$(SUBDIR)$(SLIBNAME_WITH_MAJOR): $(OBJS) $(SHLIBOBJS) $(SUBDIR)lib$(NAME).ver
> $(SLIB_CREATE_DEF_CMD)
>  ifeq ($(RESPONSE_FILES),yes)
> $(Q)echo $$(filter %.o,$$^) > $$@.objs
> --
> 2.39.5

I'll apply tomorrow if there are no objections.

Ramiro

Re: [FFmpeg-devel] [PATCH 3/3] swscale/swscale_unscaled: fix packed16togbra16() for formats with bpc between 9-14 bits

2025-05-21 Thread Ramiro Polla

On Mon, May 19, 2025 at 12:02 AM Ramiro Polla  wrote:
> On Sun, May 18, 2025 at 11:17 PM James Almer  wrote:
> > On 5/18/2025 6:14 PM, James Almer wrote:
> > > On 5/18/2025 5:52 PM, Ramiro Polla wrote:
> > >> Currently, packed16togbra16() always sets the alpha value to 0x,
> > >> without taking the bit depth into consideration.
> > >>
> > >> This commit restricts the alpha value to the bit depth.
> > >
> > > packed16togbra16() seems to only be called for BGR48 and BGRA64, both of
> > > which are 16bits, so this change is superfluous.
> >
> > Ah, never mind, I was looking at the src formats, not dst.
> >
> > Are there no tests that cover these paths? I added a bunch a couple
> > months ago, so maybe it could be extended.
>
> It can be reproduced with:
> ./libswscale/tests/swscale -unscaled 1 -src xyz12le -dst gbrap12be
>
> A little bit more information: this bug only happens on x86. The
> problem arises from the optimized conversion that comes afterwards,
> from gbrap12be to yuva444p, in ff_hscale14to15_4_ssse3(). It has
> something to do with pmaddwd not working on unsigned values IIRC. We
> could fix ff_hscale14to15_4_ssse3() to also work correctly with 0x
> on bit depths < 16, or we could just not write 0x there in the
> first place, which is what this patch does.
>
> I thought about adding libswscale/tests/swscale to FATE, but the tests
> take way too long.

I'll apply this patchset and the other similar patch about
planarRgbToplanarRgbWrapper() tomorrow if there are no more comments.
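
For anyone following along, here is a minimal sketch of the problem described in the quoted message. It is not the swscale code. It assumes the explanation above is right, i.e. that a signed 16-bit multiply such as pmaddwd reinterprets an all-ones 16-bit sample as -1. The helper names `opaque_alpha()` and `signed_lane_mul()` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Opaque alpha for a format with `bpc` bits per component.
 * Hypothetical helper: restricts the value to the bit depth. */
static uint16_t opaque_alpha(int bpc)
{
    assert(bpc >= 1 && bpc <= 16);
    return (uint16_t)((1u << bpc) - 1);
}

/* Models a single 16-bit lane of a signed multiply: the sample is
 * reinterpreted as int16_t before being multiplied by the coefficient. */
static int32_t signed_lane_mul(uint16_t sample, int16_t coeff)
{
    return (int32_t)(int16_t)sample * coeff;
}
```

With a 12-bit format, `opaque_alpha(12)` stays positive when fed through `signed_lane_mul()`, whereas an all-ones 16-bit sample turns negative, which would match the symptom seen in the optimized horizontal scaler.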

Ramiro

Re: [FFmpeg-devel] Graphprint Patches Overview

2025-05-21 Thread softworkz .



> -Original Message-
> From: ffmpeg-devel  On Behalf Of softworkz .
> Sent: Wednesday, May 21, 2025 22:20
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] Graphprint Patches Overview
> 
> 
> 
> > -Original Message-
> > From: ffmpeg-devel  On Behalf Of Kyle Swanson
> > Sent: Wednesday, May 21, 2025 22:11
> > To: FFmpeg development discussions and patches 
> > Subject: Re: [FFmpeg-devel] Graphprint Patches Overview
> >
> > Hi,
> >
> > On Wed, May 21, 2025 at 4:00 AM Kieran Kunhya via ffmpeg-devel
> >  wrote:
> > > Can we just revert the whole set until it's cleaned up properly?
> > >
> > > There are more patches to fix issues than the set itself. This is
> > > > understandable if it's a big architectural change like threading but it's
> > > not.
> >
> > I agree with Kieran, revert. This was not ready to be pushed IMO.
> >
> > Thanks,
> > Kyle
> > ___
> 
> I think the least that can be expected from somebody making such a
> request is that they provide specific reasoning after having taken
> a closer look - which the two of you apparently haven't.
> 
> Thanks
> sw
> 
> ___

Here's a branch with the pending fixes included:

https://github.com/softworkz/FFmpeg/tree/submit_graphprint_allfixes

Please explain specifically what you want to have reverted and why.

Thanks
sw

Re: [FFmpeg-devel] Graphprint Patches Overview

2025-05-21 Thread Kyle Swanson

Hi,

On Wed, May 21, 2025 at 4:00 AM Kieran Kunhya via ffmpeg-devel
 wrote:
> Can we just revert the whole set until it's cleaned up properly?
>
> There are more patches to fix issues than the set itself. This is
> understandable if it's a big architectural change like threading but it's
> not.

I agree with Kieran, revert. This was not ready to be pushed IMO.

Thanks,
Kyle

[FFmpeg-devel] [PATCH 1/4] avcodec/mpegvideo_enc: Use av_unreachable() for unreachable code

2025-05-21 Thread Andreas Rheinhardt

Patches attached.

- Andreas
From 8be9ae98dd8c880dd459cddb3192c67294d25186 Mon Sep 17 00:00:00 2001
From: Andreas Rheinhardt 
Date: Fri, 16 May 2025 16:43:34 +0200
Subject: [PATCH 1/4] avcodec/mpegvideo_enc: Use av_unreachable() for
 unreachable code

Signed-off-by: Andreas Rheinhardt 
---
 libavcodec/mpegvideo_enc.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/libavcodec/mpegvideo_enc.c b/libavcodec/mpegvideo_enc.c
index 6e9533ebc9..0023e88dc1 100644
--- a/libavcodec/mpegvideo_enc.c
+++ b/libavcodec/mpegvideo_enc.c
@@ -559,9 +559,10 @@ av_cold int ff_mpv_encode_init(AVCodecContext *avctx)
 case AV_PIX_FMT_YUV422P:
 s->c.chroma_format = CHROMA_422;
 break;
+default:
+av_unreachable("Already checked via CODEC_PIXFMTS");
 case AV_PIX_FMT_YUVJ420P:
 case AV_PIX_FMT_YUV420P:
-default:
 s->c.chroma_format = CHROMA_420;
 break;
 }
@@ -992,7 +993,7 @@ av_cold int ff_mpv_encode_init(AVCodecContext *avctx)
 s->c.low_delay = 1;
 break;
 default:
-return AVERROR(EINVAL);
+av_unreachable("List contains all codecs using ff_mpv_encode_init()");
 }
 
 avctx->has_b_frames = !s->c.low_delay;
@@ -3541,7 +3542,10 @@ static int encode_thread(AVCodecContext *c, void *arg){
 }
 break;
 default:
-av_log(s->c.avctx, AV_LOG_ERROR, "illegal MB type\n");
+av_unreachable("There is a case for every CANDIDATE_MB_TYPE_* "
+   "except CANDIDATE_MB_TYPE_SKIPPED which is never "
+   "the only candidate (always coupled with INTER) "
+   "so that it never reaches this switch");
 }
 
 encode_mb(s, motion_x, motion_y);
-- 
2.45.2

From c3bbb56ba29fa89bcb5732f263f127d15124fba2 Mon Sep 17 00:00:00 2001
From: Andreas Rheinhardt 
Date: Fri, 16 May 2025 18:24:38 +0200
Subject: [PATCH 2/4] avcodec/msmpeg4dec: Use av_unreachable() for unreachable
 code

Signed-off-by: Andreas Rheinhardt 
---
 libavcodec/msmpeg4dec.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/libavcodec/msmpeg4dec.c b/libavcodec/msmpeg4dec.c
index df67d43542..ddb990b1a0 100644
--- a/libavcodec/msmpeg4dec.c
+++ b/libavcodec/msmpeg4dec.c
@@ -379,6 +379,8 @@ av_cold int ff_msmpeg4_decode_init(AVCodecContext *avctx)
 break;
 case MSMP4_WMV2:
 break;
+default:
+av_unreachable("List contains all cases using ff_msmpeg4_decode_init()");
 }
 
 s->slice_height= s->mb_height; //to avoid 1/0 if the first frame is not a keyframe
@@ -472,6 +474,8 @@ int ff_msmpeg4_decode_picture_header(MpegEncContext * s)
 ms->dc_table_index = get_bits1(&s->gb);
 s->inter_intra_pred= 0;
 break;
+default:
+av_unreachable("ff_msmpeg4_decode_picture_header() only used by MSMP4V1-3, WMV1");
 }
 s->no_rounding = 1;
 if(s->avctx->debug&FF_DEBUG_PICT_INFO)
@@ -523,6 +527,8 @@ int ff_msmpeg4_decode_picture_header(MpegEncContext * s)
 s->inter_intra_pred = s->width*s->height < 320*240 &&
   ms->bit_rate <= II_BITRATE;
 break;
+default:
+av_unreachable("ff_msmpeg4_decode_picture_header() only used by MSMP4V1-3, WMV1");
 }
 
 if(s->avctx->debug&FF_DEBUG_PICT_INFO)
-- 
2.45.2

From 3619f0f736bf91c45fea89dee81bbf97663d0a4d Mon Sep 17 00:00:00 2001
From: Andreas Rheinhardt 
Date: Fri, 16 May 2025 18:38:41 +0200
Subject: [PATCH 3/4] avcodec/h263dec: Use av_unreachable() for unreachable
 code

Signed-off-by: Andreas Rheinhardt 
---
 libavcodec/h263dec.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/libavcodec/h263dec.c b/libavcodec/h263dec.c
index c36070e23c..eb4c48f68c 100644
--- a/libavcodec/h263dec.c
+++ b/libavcodec/h263dec.c
@@ -150,9 +150,7 @@ av_cold int ff_h263_decode_init(AVCodecContext *avctx)
 s->h263_flv = 1;
 break;
 default:
-av_log(avctx, AV_LOG_ERROR, "Unsupported codec %d\n",
-   avctx->codec->id);
-return AVERROR(ENOSYS);
+av_unreachable("Switch contains a case for every codec using ff_h263_decode_init()");
 }
 
 if (avctx->codec_tag == AV_RL32("L263") || avctx->codec_tag == AV_RL32("S263"))
-- 
2.45.2

From 4ca5d0fa3587e1de7d698e16d7eea7c98d432fdf Mon Sep 17 00:00:00 2001
From: Andreas Rheinhardt 
Date: Wed, 21 May 2025 12:55:48 +0200
Subject: [PATCH 4/4] avcodec/pcm: Use av_unreachable() for unreachable code

Signed-off-by: Andreas Rheinhardt 
---
 libavcodec/pcm.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/libavcodec/pcm.c b/libavcodec/pcm.c
index bff61f2195..68b1945194 100644
--- a/libavcodec/pcm.c
+++ b/libavcodec/pcm.c
@@ -327,6 +327,8 @@ static av_cold av_unused int pcm_lut_decode_init(AVCodecContext *avctx)
 PCMLU

[FFmpeg-devel] [PATCH v2 10/17] swscale/optimizer: add packed shuffle solver

2025-05-21 Thread Niklas Haas

From: Niklas Haas 

This can turn any compatible sequence of operations into a single packed
shuffle, including packed swizzling, grayscale->RGB conversion, endianness
swapping, RGB bit depth conversions, rgb24->rgb0 alpha clearing and more.
---
 libswscale/ops_internal.h  | 17 +++
 libswscale/ops_optimizer.c | 96 ++
 2 files changed, 113 insertions(+)

diff --git a/libswscale/ops_internal.h b/libswscale/ops_internal.h
index 9fd866430b..ab957b0837 100644
--- a/libswscale/ops_internal.h
+++ b/libswscale/ops_internal.h
@@ -105,4 +105,21 @@ int ff_sws_ops_compile_backend(SwsContext *ctx, const SwsOpBackend *backend,
  */
 int ff_sws_ops_compile(SwsContext *ctx, const SwsOpList *ops, SwsCompiledOp *out);
 
+/**
+ * "Solve" an op list into a fixed shuffle mask, with an optional ability to
+ * also directly clear the output value (for e.g. rgb24 -> rgb0).
+ *
+ * @param ops         The operation list to decompose.
+ * @param shuffle     The output shuffle mask.
+ * @param size        The size (in bytes) of the output shuffle mask.
+ * @param clear_val   If nonzero, this index will be used to clear the output.
+ * @param read_bytes  Returns the number of bytes read per shuffle iteration.
+ * @param write_bytes Returns the number of bytes written per shuffle iteration.
+ *
+ * @return  The number of pixels processed per iteration, or a negative error
+code; in particular AVERROR(ENOTSUP) for unsupported operations.
+ */
+int ff_sws_solve_shuffle(const SwsOpList *ops, uint8_t shuffle[], int size,
+ uint8_t clear_val, int *read_bytes, int *write_bytes);
+
 #endif
diff --git a/libswscale/ops_optimizer.c b/libswscale/ops_optimizer.c
index d503bf7bf3..9cde60ed58 100644
--- a/libswscale/ops_optimizer.c
+++ b/libswscale/ops_optimizer.c
@@ -19,9 +19,11 @@
  */
 
 #include "libavutil/avassert.h"
+#include 
 #include "libavutil/rational.h"
 
 #include "ops.h"
+#include "ops_internal.h"
 
 #define Q(N) ((AVRational) { N, 1 })
 
@@ -781,3 +783,97 @@ retry:
 
 return 0;
 }
+
+int ff_sws_solve_shuffle(const SwsOpList *const ops, uint8_t shuffle[],
+ int shuffle_size, uint8_t clear_val,
+ int *out_read_bytes, int *out_write_bytes)
+{
+const SwsOp read = ops->ops[0];
+const int read_size = ff_sws_pixel_type_size(read.type);
+uint32_t mask[4] = {0};
+
+if (!ops->num_ops || read.op != SWS_OP_READ)
+return AVERROR(EINVAL);
+if (read.rw.frac || (!read.rw.packed && read.rw.elems > 1))
+return AVERROR(ENOTSUP);
+
+for (int i = 0; i < read.rw.elems; i++)
+mask[i] = 0x01010101 * i * read_size + 0x03020100;
+
+for (int opidx = 1; opidx < ops->num_ops; opidx++) {
+const SwsOp *op = &ops->ops[opidx];
+switch (op->op) {
+case SWS_OP_SWIZZLE: {
+uint32_t orig[4] = { mask[0], mask[1], mask[2], mask[3] };
+for (int i = 0; i < 4; i++)
+mask[i] = orig[op->swizzle.in[i]];
+break;
+}
+
+case SWS_OP_SWAP_BYTES:
+for (int i = 0; i < 4; i++) {
+switch (ff_sws_pixel_type_size(op->type)) {
+case 2: mask[i] = av_bswap16(mask[i]); break;
+case 4: mask[i] = av_bswap32(mask[i]); break;
+}
+}
+break;
+
+case SWS_OP_CLEAR:
+for (int i = 0; i < 4; i++) {
+if (!op->c.q4[i].den)
+continue;
+if (op->c.q4[i].num != 0 || !clear_val)
+return AVERROR(ENOTSUP);
+mask[i] = 0x1010101ul * clear_val;
+}
+break;
+
+case SWS_OP_CONVERT: {
+if (!op->convert.expand)
+return AVERROR(ENOTSUP);
+for (int i = 0; i < 4; i++) {
+switch (ff_sws_pixel_type_size(op->type)) {
+case 1: mask[i] = 0x01010101 * (mask[i] & 0xFF);   break;
+case 2: mask[i] = 0x00010001 * (mask[i] & 0x); break;
+}
+}
+break;
+}
+
+case SWS_OP_WRITE: {
+if (op->rw.frac || !op->rw.packed)
+return AVERROR(ENOTSUP);
+
+/* Initialize to no-op */
+memset(shuffle, clear_val, shuffle_size);
+
+const int write_size  = ff_sws_pixel_type_size(op->type);
+const int read_chunk  = read.rw.elems * read_size;
+const int write_chunk = op->rw.elems * write_size;
+const int num_groups  = shuffle_size / FFMAX(read_chunk, write_chunk);
+for (int n = 0; n < num_groups; n++) {
+const int base_in  = n * read_chunk;
+const int base_out = n * write_chunk;
+for (int i = 0; i < op->rw.elems; i++) {
+const int offset = base_out + i * write_size;
+for (int b = 0; b < write_size; b++) {
+

Re: [FFmpeg-devel] [PATCH 3/3] tests: Add fate-hevc-color-reserved

2025-05-21 Thread Zhao Zhili




> On May 21, 2025, at 21:11, Andreas Rheinhardt wrote:
> 
> Zhao Zhili:
>> From: Zhao Zhili 
>> 
>> ---
>> tests/fate/hevc.mak| 3 +++
>> tests/ref/fate/hevc-color-reserved | 6 ++
>> 2 files changed, 9 insertions(+)
>> create mode 100644 tests/ref/fate/hevc-color-reserved
>> 
>> diff --git a/tests/fate/hevc.mak b/tests/fate/hevc.mak
>> index 390ccf46e2..5e721526d0 100644
>> --- a/tests/fate/hevc.mak
>> +++ b/tests/fate/hevc.mak
>> @@ -294,6 +294,9 @@ FATE_HEVC-$(call FRAMECRC, MOV, HEVC) += fate-hevc-mv-position
>> fate-hevc-alpha: CMD = framecrc -i $(TARGET_SAMPLES)/hevc/alpha.mp4
>> FATE_HEVC-$(call FRAMECRC, MOV, HEVC) += fate-hevc-alpha
>> 
>> +fate-hevc-color-reserved: CMD = framecrc -i $(TARGET_SAMPLES)/hevc/color_prim_reserved0.hevc -fps_mode passthrough -sws_flags +accurate_rnd+bitexact -vf scale,format=nv12
>> +FATE_HEVC-$(call FRAMECRC, HEVC, HEVC, SCALE_FILTER) += fate-hevc-color-reserved
> 
> A new sample for this? Why don't you just create one with hevc_metadata?

Great idea. See patch v2 3/3.

https://ffmpeg.org/pipermail/ffmpeg-devel/2025-May/343884.html

> 
>> +
>> FATE_SAMPLES_AVCONV += $(FATE_HEVC-yes)
>> FATE_SAMPLES_FFPROBE += $(FATE_HEVC_FFPROBE-yes)
>> 
>> diff --git a/tests/ref/fate/hevc-color-reserved b/tests/ref/fate/hevc-color-reserved
>> new file mode 100644
>> index 00..3351628209
>> --- /dev/null
>> +++ b/tests/ref/fate/hevc-color-reserved
>> @@ -0,0 +1,6 @@
>> +#tb 0: 1/60
>> +#media_type 0: video
>> +#codec_id 0: rawvideo
>> +#dimensions 0: 1920x900
>> +#sar 0: 1/1
>> +0,  0,  0,1,  2592000, 0xfa6fce1e
> 
> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


[FFmpeg-devel] [PATCH] avfilter/vf_interlace_vulkan: fix FPS and PTS calculation

2025-05-21 Thread Niklas Haas

From: Niklas Haas 

ol->frame_rate is 0/0, so we need to calculate the correct value based on
the il->frame_rate instead. Also adjust the time base, PTS and frame_duration
values accordingly. (Logic taken from vf_tinterlace.c)
---
 libavfilter/vf_interlace_vulkan.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/libavfilter/vf_interlace_vulkan.c b/libavfilter/vf_interlace_vulkan.c
index b5cd321fef..7afb30c2d7 100644
--- a/libavfilter/vf_interlace_vulkan.c
+++ b/libavfilter/vf_interlace_vulkan.c
@@ -189,7 +189,9 @@ static int interlace_vulkan_filter_frame(AVFilterLink *link, AVFrame *in)
 AVFrame *out = NULL, *input_top, *input_bot;
 AVFilterContext *ctx = link->dst;
 InterlaceVulkanContext *s = ctx->priv;
+const AVFilterLink *inlink = ctx->inputs[0];
 AVFilterLink *outlink = ctx->outputs[0];
+FilterLink *l = ff_filter_link(outlink);
 
 if (!s->initialized)
 RET(init_filter(ctx));
@@ -226,6 +228,9 @@ static int interlace_vulkan_filter_frame(AVFilterLink *link, AVFrame *in)
 if (s->mode == MODE_TFF)
 out->flags |= AV_FRAME_FLAG_TOP_FIELD_FIRST;
 
+out->pts = av_rescale_q(out->pts, inlink->time_base, outlink->time_base);
+out->duration = av_rescale_q(1, av_inv_q(l->frame_rate), outlink->time_base);
+
 av_frame_free(&s->cur);
 av_frame_free(&in);
 
@@ -260,9 +265,12 @@ static void interlace_vulkan_uninit(AVFilterContext *avctx)
 
 static int config_out_props(AVFilterLink *outlink)
 {
+AVFilterLink *inlink = outlink->src->inputs[0];
+const FilterLink *il = ff_filter_link(inlink);
 FilterLink *ol = ff_filter_link(outlink);
 
-ol->frame_rate = av_mul_q(ol->frame_rate, av_make_q(1, 2));
+ol->frame_rate = av_mul_q(il->frame_rate, av_make_q(1, 2));
+outlink->time_base = av_mul_q(inlink->time_base, av_make_q(2, 1));
 return ff_vk_filter_config_output(outlink);
 }
 
-- 
2.49.0
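
To make the arithmetic in the commit message concrete, here is a small self-contained sketch. It deliberately does not use libavutil: `Rational`, `rat_mul()` and `rescale()` are hypothetical stand-ins for the rational multiply and timestamp rescaling used in the patch, and `rescale()` only handles non-negative values with round-to-nearest. It shows why multiplying an unset 0/0 rate by 1/2 cannot produce a usable output rate, and how halving the input rate goes together with doubling the time base.

```c
#include <assert.h>
#include <stdint.h>

typedef struct Rational { int64_t num, den; } Rational;

static int64_t gcd64(int64_t a, int64_t b)
{
    while (b) {
        int64_t t = a % b;
        a = b;
        b = t;
    }
    return a < 0 ? -a : a;
}

/* Multiply two rationals and reduce; 0/0 stays 0/0 */
static Rational rat_mul(Rational a, Rational b)
{
    Rational r = { a.num * b.num, a.den * b.den };
    const int64_t g = gcd64(r.num, r.den);
    if (g) {
        r.num /= g;
        r.den /= g;
    }
    return r;
}

/* Rescale v from time base `from` to time base `to` (round to nearest).
 * Sketch only: assumes non-negative inputs and no overflow. */
static int64_t rescale(int64_t v, Rational from, Rational to)
{
    const int64_t num = from.num * to.den;
    const int64_t den = from.den * to.num;
    assert(v >= 0 && num >= 0 && den > 0);
    return (v * num + den / 2) / den;
}
```

With a 50 fps input and a 1/50 time base, the output becomes 25 fps with a 1/25 time base, an input PTS of 4 maps to 2, and one output frame lasts exactly one tick.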


Re: [FFmpeg-devel] [PATCH v6 3/4] ogg/vorbis: implement header packet skip in chained ogg bitstreams.

2025-05-21 Thread Romain Beauxis

On Wed, May 21, 2025 at 02:53, Michael Niedermayer wrote:
>
> Hi Romain
>
> On Tue, May 20, 2025 at 05:45:01PM -0500, Romain Beauxis wrote:
> > On Tue, May 20, 2025 at 16:46, Michael Niedermayer wrote:
> > >
> > > On Mon, May 19, 2025 at 09:46:38AM -0500, Romain Beauxis wrote:
> > > > ---
> > > >  libavcodec/vorbis_parser.h | 11 
> > > >  libavcodec/vorbisdec.c | 76 +-
> > > >  libavformat/oggparsevorbis.c   | 63 +-
> > > >  tests/ref/fate/ogg-vorbis-chained-meta.txt |  3 -
> > > >  4 files changed, 116 insertions(+), 37 deletions(-)
> > >
> > > breaks fate here (normal x86-64 ubuntu)
> > >
> > > --- ./tests/ref/fate/ogg-vorbis-chained-meta.txt 2025-05-20 23:42:32.043927021 +0200
> > > +++ tests/data/fate/ogg-vorbis-chained-meta 2025-05-20 23:43:07.908216645 +0200
> > > @@ -7,8 +7,4 @@
> > >  Stream ID: 0, frame PTS: 704, metadata: N/A
> > >  Stream ID: 0, packet PTS: 0, packet DTS: 0
> > >  Stream ID: 0, new metadata: encoder=Lavc61.19.100 libvorbis:title=Second Stream
> > > -Stream ID: 0, frame PTS: 0, metadata: N/A
> > >  Stream ID: 0, packet PTS: 128, packet DTS: 128
> > > -Stream ID: 0, frame PTS: 128, metadata: N/A
> > > -Stream ID: 0, packet PTS: 704, packet DTS: 704
> > > -Stream ID: 0, frame PTS: 704, metadata: N/A
> > > Test ogg-vorbis-chained-meta failed. Look at tests/data/fate/ogg-vorbis-chained-meta.err for details.
> > > make: *** [tests/Makefile:316: fate-ogg-vorbis-chained-meta] Error 1
> >
> > I'm not sure what I'm looking at. Is that the output of running the FATE tests?
>
> yes
> probably was make fate-ogg-vorbis-chained-meta

Woof, thanks for catching that. A stray `return` statement had snuck into
my refactoring. About to push a fixed series.

>
> >
> > This diff is already included in the patch:
> >
> > ```
> > % git show 37370e99451cf0750d5304764ba9031b80e5b3e0 tests/
> > commit 37370e99451cf0750d5304764ba9031b80e5b3e0 (HEAD)
> > Author: Romain Beauxis 
> > Date:   Sat May 17 12:59:40 2025 -0500
> >
> > ogg/vorbis: implement header packet skip in chained ogg bitstreams.
> >
> > diff --git a/tests/ref/fate/ogg-vorbis-chained-meta.txt b/tests/ref/fate/ogg-vorbis-chained-meta.txt
> > index b7a97c90e2..1206f86c1f 100644
> > --- a/tests/ref/fate/ogg-vorbis-chained-meta.txt
> > +++ b/tests/ref/fate/ogg-vorbis-chained-meta.txt
> > @@ -6,10 +6,7 @@ Stream ID: 0, frame PTS: 128, metadata: N/A
> >  Stream ID: 0, packet PTS: 704, packet DTS: 704
> >  Stream ID: 0, frame PTS: 704, metadata: N/A
> >  Stream ID: 0, packet PTS: 0, packet DTS: 0
> > -Stream ID: 0, packet PTS: 0, packet DTS: 0
> >  Stream ID: 0, new metadata: encoder=Lavc61.19.100 libvorbis:title=Second Stream
> > -Stream ID: 0, packet PTS: 0, packet DTS: 0
> > -Stream ID: 0, packet PTS: 0, packet DTS: 0
> >  Stream ID: 0, frame PTS: 0, metadata: N/A
> >  Stream ID: 0, packet PTS: 128, packet DTS: 128
> >  Stream ID: 0, frame PTS: 128, metadata: N/A
>
> These 2 diffs are not the same
>
> thx
>
> [...]
> --
> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> You can kill me, but you cannot change the truth.
> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

[FFmpeg-devel] [PATCH v2 09/17] swscale/ops: add dispatch layer

2025-05-21 Thread Niklas Haas

From: Niklas Haas 

This handles the low-level execution of an op list, and integration into
the SwsGraph infrastructure. To handle frames with insufficient padding in
the stride (or a width smaller than one block size), we use a fallback loop
that pads the last column of pixels using `memcpy` into an appropriately
sized buffer.
---
 libswscale/ops.c | 256 +++
 libswscale/ops.h |  14 +++
 2 files changed, 270 insertions(+)

diff --git a/libswscale/ops.c b/libswscale/ops.c
index 8491bd9cad..d466f5e45c 100644
--- a/libswscale/ops.c
+++ b/libswscale/ops.c
@@ -582,3 +582,259 @@ int ff_sws_ops_compile(SwsContext *ctx, const SwsOpList *ops, SwsCompiledOp *out
 ff_sws_op_list_print(ctx, AV_LOG_WARNING, ops);
 return AVERROR(ENOTSUP);
 }
+
+typedef struct SwsOpPass {
+SwsCompiledOp comp;
+SwsOpExec exec_base;
+int num_blocks;
+int tail_off_in;
+int tail_off_out;
+int tail_size_in;
+int tail_size_out;
+bool memcpy_in;
+bool memcpy_out;
+} SwsOpPass;
+
+static void op_pass_free(void *ptr)
+{
+SwsOpPass *p = ptr;
+if (!p)
+return;
+
+if (p->comp.free)
+p->comp.free(p->comp.priv);
+
+av_free(p);
+}
+
+static void op_pass_setup(const SwsImg *out, const SwsImg *in, const SwsPass *pass)
+{
+const AVPixFmtDescriptor *indesc  = av_pix_fmt_desc_get(in->fmt);
+const AVPixFmtDescriptor *outdesc = av_pix_fmt_desc_get(out->fmt);
+
+SwsOpPass *p = pass->priv;
+SwsOpExec *exec = &p->exec_base;
+const SwsCompiledOp *comp = &p->comp;
+const int block_size = comp->block_size;
+p->num_blocks = (pass->width + block_size - 1) / block_size;
+
+/* Set up main loop parameters */
+const int aligned_w  = p->num_blocks * block_size;
+const int safe_width = (p->num_blocks - 1) * block_size;
+const int tail_size  = pass->width - safe_width;
+p->tail_off_in   = safe_width * exec->pixel_bits_in  >> 3;
+p->tail_off_out  = safe_width * exec->pixel_bits_out >> 3;
+p->tail_size_in  = tail_size  * exec->pixel_bits_in  >> 3;
+p->tail_size_out = tail_size  * exec->pixel_bits_out >> 3;
+p->memcpy_in = false;
+p->memcpy_out= false;
+
+for (int i = 0; i < 4 && in->data[i]; i++) {
+const int sub_x  = (i == 1 || i == 2) ? indesc->log2_chroma_w : 0;
+const int plane_w= (aligned_w + sub_x) >> sub_x;
+const int plane_pad  = (comp->over_read + sub_x) >> sub_x;
+const int plane_size = plane_w * exec->pixel_bits_in >> 3;
+p->memcpy_in |= plane_size + plane_pad > in->linesize[i];
+exec->in_stride[i] = in->linesize[i];
+}
+
+for (int i = 0; i < 4 && out->data[i]; i++) {
+const int sub_x  = (i == 1 || i == 2) ? outdesc->log2_chroma_w : 0;
+const int plane_w= (aligned_w + sub_x) >> sub_x;
+const int plane_pad  = (comp->over_write + sub_x) >> sub_x;
+const int plane_size = plane_w * exec->pixel_bits_out >> 3;
+p->memcpy_out |= plane_size + plane_pad > out->linesize[i];
+exec->out_stride[i] = out->linesize[i];
+}
+}
+
+/* Dispatch kernel over the last column of the image using memcpy */
+static av_always_inline void
+handle_tail(const SwsOpPass *p, SwsOpExec *exec,
+const SwsImg *out_base, const bool copy_out,
+const SwsImg *in_base, const bool copy_in,
+int y, const int h)
+{
+DECLARE_ALIGNED_64(uint8_t, tmp)[2][4][sizeof(uint32_t[128])];
+
+const SwsCompiledOp *comp = &p->comp;
+const int tail_size_in  = p->tail_size_in;
+const int tail_size_out = p->tail_size_out;
+const int bx = p->num_blocks - 1;
+
+SwsImg in  = ff_sws_img_shift(in_base,  y);
+SwsImg out = ff_sws_img_shift(out_base, y);
+for (int i = 0; i < 4 && in.data[i]; i++) {
+in.data[i]  += p->tail_off_in;
+if (copy_in) {
+exec->in[i] = (void *) tmp[0][i];
+exec->in_stride[i] = sizeof(tmp[0][i]);
+} else {
+exec->in[i] = in.data[i];
+}
+}
+
+for (int i = 0; i < 4 && out.data[i]; i++) {
+out.data[i] += p->tail_off_out;
+if (copy_out) {
+exec->out[i] = (void *) tmp[1][i];
+exec->out_stride[i] = sizeof(tmp[1][i]);
+} else {
+exec->out[i] = out.data[i];
+}
+}
+
+for (int y_end = y + h; y < y_end; y++) {
+if (copy_in) {
+for (int i = 0; i < 4 && in.data[i]; i++) {
+av_assert2(tmp[0][i] + tail_size_in < (uint8_t *) tmp[1]);
+memcpy(tmp[0][i], in.data[i], tail_size_in);
+in.data[i] += in.linesize[i];
+}
+}
+
+comp->func(exec, comp->priv, bx, y, p->num_blocks, y + 1);
+
+if (copy_out) {
+for (int i = 0; i < 4 && out.data[i]; i++) {
+av_assert2(tmp[1][i] + tail_size_out < (uint8_t *) tmp[2]);
+memcpy(out.data[i], tmp[1][i], tail_size_ou
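
(The patch text is cut off here in the archive.) The block/tail split described in the commit message can be illustrated with a much smaller sketch. This is not the dispatch code from the patch: `BLOCK`, `split_row()`, `kernel_invert()` and `process_row()` are hypothetical, the sketch handles a single 8-bit plane, and it always routes the last block through the temporary buffers, whereas the patch only does so when the stride padding is insufficient.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { BLOCK = 8 }; /* hypothetical block size, in pixels */

typedef struct Tail {
    int num_blocks; /* blocks needed to cover the row, rounded up */
    int safe_width; /* pixels that whole blocks may touch directly */
    int tail_size;  /* pixels left over for the padded last block */
} Tail;

static Tail split_row(int width)
{
    Tail t;
    assert(width > 0);
    t.num_blocks = (width + BLOCK - 1) / BLOCK;
    t.safe_width = (t.num_blocks - 1) * BLOCK;
    t.tail_size  = width - t.safe_width;
    return t;
}

/* Stand-in kernel: always reads and writes one full block */
static void kernel_invert(uint8_t *dst, const uint8_t *src)
{
    for (int i = 0; i < BLOCK; i++)
        dst[i] = (uint8_t)(255 - src[i]);
}

static void process_row(uint8_t *dst, const uint8_t *src, int width)
{
    const Tail t = split_row(width);
    uint8_t tmp_in[BLOCK] = {0}, tmp_out[BLOCK];

    /* Main loop: whole blocks that never run past the row */
    for (int x = 0; x < t.safe_width; x += BLOCK)
        kernel_invert(dst + x, src + x);

    /* Tail: copy in, run one padded block, copy the valid part out */
    memcpy(tmp_in, src + t.safe_width, (size_t)t.tail_size);
    kernel_invert(tmp_out, tmp_in);
    memcpy(dst + t.safe_width, tmp_out, (size_t)t.tail_size);
}
```

The point of the detour through `tmp_in` / `tmp_out` is that the kernel can keep assuming full blocks while no byte outside the row is ever read or written.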

Re: [FFmpeg-devel] [PATCH v3] avformat/dhav: fix backward scanning for get_duration and optimize seeking

2025-05-21 Thread Derek Buitenhuis

On 5/21/2025 2:23 PM, Derek Buitenhuis wrote:
> This changes the scanning to check for the end tag 1 byte at a time
> and buffers the last 1 MiB using ffio_ensure_seekback to avoid additional
> seek operations.

I removed the part about ffio_ensure_seekback locally.

- Derek

Re: [FFmpeg-devel] [PATCH 0/3] Clean up build spam from graph css builder

2025-05-21 Thread Derek Buitenhuis

On 5/20/2025 7:44 PM, softworkz . wrote:
> Hi Derek,
> 
> thanks a lot for the patch. This partially duplicates what Timo had
> already submitted:
> 
> https://patchwork.ffmpeg.org/project/ffmpeg/patch/20250516230202.355445-1-t...@rothenpieler.org/

Dropping this set, then.

> Regarding patch 3/3, would you mind taking a look at the patch that I
> have submitted for this? I believe it is the "most correct" way:
> 
> https://patchwork.ffmpeg.org/project/ffmpeg/patch/pull.80.v2.ffstaging.ffmpeg.1747549830700.ffmpegag...@gmail.com/

I am actually unclear on which is "most correct", myself.

[...]

- Derek

[FFmpeg-devel] [PATCH v2 00/17] swscale: new ops framework

2025-05-21 Thread Niklas Haas

Changes since v1:
- keep track of `packed` status even for single-element bit-packed formats
- fix memory leak of dither matrix
- fix AVRational printing of infinities
- fix value range tracking for big endian formats
- fix some overflow bugs on 32-bit
- remove unneeded internal helper
- add optimization for convert->swizzle->convert
- clean up the generated shuffle mask when clearing multiple bytes
- slightly tune the x86 asm loops
- add an `unsigned max_ulp` to checkasm_check_float()


[FFmpeg-devel] [PATCH v2 12/17] swscale/ops_backend: add reference backend based on C templates

2025-05-21 Thread Niklas Haas

From: Niklas Haas 

This will serve as a reference for the SIMD backends to come. That said,
with auto-vectorization enabled, the performance of this is not atrocious.
It easily beats the old C code and sometimes even the old SIMD.

In theory, we can dramatically speed it up by using GCC vectors instead of
arrays, but the performance gains from this are too dependent on exact GCC
versions and flags, so in practice it's not a substitute for a SIMD
implementation.
---
 libswscale/Makefile  |   6 +
 libswscale/ops.c |   3 +
 libswscale/ops_backend.c | 105 ++
 libswscale/ops_backend.h | 167 ++
 libswscale/ops_tmpl_common.c | 176 ++
 libswscale/ops_tmpl_float.c  | 257 +++
 libswscale/ops_tmpl_int.c| 608 +++
 7 files changed, 1322 insertions(+)
 create mode 100644 libswscale/ops_backend.c
 create mode 100644 libswscale/ops_backend.h
 create mode 100644 libswscale/ops_tmpl_common.c
 create mode 100644 libswscale/ops_tmpl_float.c
 create mode 100644 libswscale/ops_tmpl_int.c

diff --git a/libswscale/Makefile b/libswscale/Makefile
index c9dfa78c89..6e5696c5a6 100644
--- a/libswscale/Makefile
+++ b/libswscale/Makefile
@@ -16,6 +16,7 @@ OBJS = alphablend.o \
input.o  \
lut3d.o  \
ops.o\
+   ops_backend.o\
ops_chain.o  \
ops_optimizer.o  \
options.o\
@@ -29,6 +30,11 @@ OBJS = alphablend.o \
yuv2rgb.o\
vscale.o \
 
+OPS-CFLAGS = -Wno-uninitialized \
+ -ffinite-math-only
+
+$(SUBDIR)ops_backend.o: CFLAGS += $(OPS-CFLAGS)
+
 # Objects duplicated from other libraries for shared builds
 SHLIBOBJS+= log2_tab.o half2float.o
 
diff --git a/libswscale/ops.c b/libswscale/ops.c
index d466f5e45c..3b9c2844f8 100644
--- a/libswscale/ops.c
+++ b/libswscale/ops.c
@@ -27,7 +27,10 @@
 #include "ops.h"
 #include "ops_internal.h"
 
+extern SwsOpBackend backend_c;
+
 const SwsOpBackend * const ff_sws_op_backends[] = {
+&backend_c,
 NULL
 };
 
diff --git a/libswscale/ops_backend.c b/libswscale/ops_backend.c
new file mode 100644
index 00..47ce992bb3
--- /dev/null
+++ b/libswscale/ops_backend.c
@@ -0,0 +1,105 @@
+/**
+ * Copyright (C) 2025 Niklas Haas
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "ops_backend.h"
+
+/* Array-based reference implementation */
+
+#ifndef SWS_BLOCK_SIZE
+#  define SWS_BLOCK_SIZE 32
+#endif
+
+typedef  uint8_t  u8block_t[SWS_BLOCK_SIZE];
+typedef uint16_t u16block_t[SWS_BLOCK_SIZE];
+typedef uint32_t u32block_t[SWS_BLOCK_SIZE];
+typedef    float f32block_t[SWS_BLOCK_SIZE];
+
+#define BIT_DEPTH 8
+# include "ops_tmpl_int.c"
+#undef BIT_DEPTH
+
+#define BIT_DEPTH 16
+# include "ops_tmpl_int.c"
+#undef BIT_DEPTH
+
+#define BIT_DEPTH 32
+# include "ops_tmpl_int.c"
+# include "ops_tmpl_float.c"
+#undef BIT_DEPTH
+
+static void process(const SwsOpExec *exec, const void *priv,
+const int bx_start, const int y_start, int bx_end, int y_end)
+{
+const SwsOpChain *chain = priv;
+const SwsOpImpl *impl = chain->impl;
+SwsOpIter iter;
+
+for (iter.y = y_start; iter.y < y_end; iter.y++) {
+for (int i = 0; i < 4; i++) {
+iter.in[i]  = exec->in[i]  + (iter.y - y_start) * exec->in_stride[i];
+iter.out[i] = exec->out[i] + (iter.y - y_start) * exec->out_stride[i];
+}
+
+for (int block = bx_start; block < bx_end; block++) {
+iter.x = block * SWS_BLOCK_SIZE;
+((void (*)(SwsOpIter *, const SwsOpImpl *)) impl->cont)
+(&iter, &impl[1]);
+}
+}
+}
+
+static int compile(SwsContext *ctx, SwsOpList *ops, SwsCompiledOp *out)
+{
+int ret;
+
+SwsOpChain *chain = ff_sws_op_chain_alloc();
+if (!chain)
+return AVERROR(ENOMEM);
+
+static

[FFmpeg-devel] [PATCH v2 11/17] swscale/ops_chain: add internal abstraction for kernel linking

2025-05-21 Thread Niklas Haas

From: Niklas Haas 

See doc/swscale-v2.txt for design details.
---
 libswscale/Makefile|   1 +
 libswscale/ops_chain.c | 293 +
 libswscale/ops_chain.h | 109 +++
 3 files changed, 403 insertions(+)
 create mode 100644 libswscale/ops_chain.c
 create mode 100644 libswscale/ops_chain.h

diff --git a/libswscale/Makefile b/libswscale/Makefile
index 810c9dee78..c9dfa78c89 100644
--- a/libswscale/Makefile
+++ b/libswscale/Makefile
@@ -16,6 +16,7 @@ OBJS = alphablend.o \
input.o  \
lut3d.o  \
ops.o\
+   ops_chain.o  \
ops_optimizer.o  \
options.o\
output.o \
diff --git a/libswscale/ops_chain.c b/libswscale/ops_chain.c
new file mode 100644
index 00..cba825ee41
--- /dev/null
+++ b/libswscale/ops_chain.c
@@ -0,0 +1,293 @@
+/**
+ * Copyright (C) 2025 Niklas Haas
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/avassert.h"
+#include "libavutil/mem.h"
+#include "libavutil/rational.h"
+
+#include "ops_chain.h"
+
+SwsOpChain *ff_sws_op_chain_alloc(void)
+{
+return av_mallocz(sizeof(SwsOpChain));
+}
+
+void ff_sws_op_chain_free(SwsOpChain *chain)
+{
+if (!chain)
+return;
+
+for (int i = 0; i < chain->num_impl + 1; i++) {
+if (chain->free[i])
+chain->free[i](chain->impl[i].priv.ptr);
+}
+
+av_free(chain);
+}
+
+int ff_sws_op_chain_append(SwsOpChain *chain, SwsFuncPtr func,
+   void (*free)(void *), SwsOpPriv priv)
+{
+const int idx = chain->num_impl;
+if (idx == SWS_MAX_OPS)
+return AVERROR(EINVAL);
+
+av_assert1(func);
+chain->impl[idx].cont = func;
+chain->impl[idx + 1].priv = priv;
+chain->free[idx + 1] = free;
+chain->num_impl++;
+return 0;
+}
+
+/**
+ * Match an operation against a reference operation. Returns a score for how
+ * well the reference matches the operation, or 0 if there is no match.
+ *
+ * If `ref->comps` has any flags set, they must be set in `op` as well.
+ * Likewise, if `ref->comps` has any components marked as unused, they must be
+ * marked as unused in `op` as well.
+ *
+ * For SWS_OP_LINEAR, `ref->linear.mask` must be a strict superset of
+ * `op->linear.mask`, but may not contain any columns explicitly ignored by
+ * `op->comps.unused`.
+ *
+ * For SWS_OP_READ, SWS_OP_WRITE, SWS_OP_SWAP_BYTES and SWS_OP_SWIZZLE, the
+ * exact type is not checked, just the size.
+ *
+ * Components set in `next.unused` are ignored when matching. If `flexible`
+ * is true, the op body is ignored - only the operation, pixel type, and
+ * component masks are checked.
+ */
+static int op_match(const SwsOp *op, const SwsOpEntry *entry, const SwsComps next)
+{
+const SwsOp *ref = &entry->op;
+int score = 10;
+if (op->op != ref->op)
+return 0;
+
+switch (op->op) {
+case SWS_OP_READ:
+case SWS_OP_WRITE:
+case SWS_OP_SWAP_BYTES:
+case SWS_OP_SWIZZLE:
+/* Only the size matters for these operations */
+if (ff_sws_pixel_type_size(op->type) != ff_sws_pixel_type_size(ref->type))
+return 0;
+break;
+default:
+if (op->type != ref->type)
+return 0;
+break;
+}
+
+for (int i = 0; i < 4; i++) {
+if (ref->comps.unused[i]) {
+if (op->comps.unused[i])
+score += 1; /* Operating on fewer components is better .. */
+else
+return false; /* .. but not too few! */
+}
+
+if (ref->comps.flags[i]) {
+if (ref->comps.flags[i] & ~op->comps.flags[i]) {
+return false; /* Missing required output assumptions */
+} else {
+/* Implementation is more specialized */
+score += av_popcount(ref->comps.flags[i]);
+}
+}
+}
+
+/* Flexible variants always match, but lower the score to prioritize more
+

[FFmpeg-devel] [PATCH v2 04/17] tests/checkasm: generalize DEF_CHECKASM_CHECK_FUNC to floats

2025-05-21 Thread Niklas Haas

From: Niklas Haas 

We split the standard macro into its body (implementation) and declaration,
and use a macro argument in place of the raw `memcmp` call, with the major
difference that we now take the number of pixels to compare instead of the
number of bytes (to match the signature of float_near_ulp_array).
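
As a rough illustration of the pattern described above (not the checkasm code
itself), the sketch below writes a row-comparison body once as a macro that
takes the comparator as an argument and a length in elements, then instantiates
it for an exact integer compare and for a tolerance-based float compare. The
names `CHECK_ROWS_BODY`, `cmp_exact`, `near_ulp_array` and `ulp_distance` are
hypothetical stand-ins, strides are counted in elements rather than bytes, and
the ULP helper assumes finite inputs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: distance in units in the last place between two
 * finite floats, computed on their bit patterns. */
static uint32_t ulp_distance(float a, float b)
{
    int32_t ia, ib;
    memcpy(&ia, &a, sizeof(ia));
    memcpy(&ib, &b, sizeof(ib));
    /* Map the sign-magnitude float ordering onto a monotonic integer scale */
    if (ia < 0) ia = INT32_MIN - ia;
    if (ib < 0) ib = INT32_MIN - ib;
    int64_t d = (int64_t) ia - ib;
    return (uint32_t) (d < 0 ? -d : d);
}

/* Comparators return nonzero on match and take a length in elements. */
#define cmp_exact(a, b, len) (!memcmp(a, b, (len) * sizeof(*(a))))

static int near_ulp_array(const float *a, const float *b,
                          unsigned max_ulp, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (ulp_distance(a[i], b[i]) > max_ulp)
            return 0;
    return 1;
}

/* Shared body: returns the first mismatching row, or h if all rows match. */
#define CHECK_ROWS_BODY(compare)                                   \
    do {                                                           \
        int y;                                                     \
        for (y = 0; y < h; y++)                                    \
            if (!compare(&buf1[y * stride], &buf2[y * stride], w)) \
                break;                                             \
        return y;                                                  \
    } while (0)

static int check_rows_u8(const uint8_t *buf1, const uint8_t *buf2,
                         ptrdiff_t stride, int w, int h)
{
    CHECK_ROWS_BODY(cmp_exact);
}

static int check_rows_float_ulp(const float *buf1, const float *buf2,
                                ptrdiff_t stride, int w, int h,
                                unsigned max_ulp)
{
    /* The comparator macro captures max_ulp from the enclosing function */
#define cmp_float(a, b, len) near_ulp_array(a, b, max_ulp, len)
    CHECK_ROWS_BODY(cmp_float);
#undef cmp_float
}
```

Because the comparator is substituted textually, the float variant can close
over `max_ulp` without changing the shared body's signature, which appears to
be the reason the patch defines `cmp_float` inside the function.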
---
 tests/checkasm/checkasm.c | 52 ++-
 tests/checkasm/checkasm.h |  7 ++
 2 files changed, 42 insertions(+), 17 deletions(-)

diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
index 71d1e5766c..f393a0cb96 100644
--- a/tests/checkasm/checkasm.c
+++ b/tests/checkasm/checkasm.c
@@ -1187,14 +1187,8 @@ static int check_err(const char *file, int line,
 return 0;
 }
 
-#define DEF_CHECKASM_CHECK_FUNC(type, fmt) \
-int checkasm_check_##type(const char *file, int line, \
-  const type *buf1, ptrdiff_t stride1, \
-  const type *buf2, ptrdiff_t stride2, \
-  int w, int h, const char *name, \
-  int align_w, int align_h, \
-  int padding) \
-{ \
+#define DEF_CHECKASM_CHECK_BODY(compare, type, fmt) \
+do { \
 int64_t aligned_w = (w - 1LL + align_w) & ~(align_w - 1); \
 int64_t aligned_h = (h - 1LL + align_h) & ~(align_h - 1); \
 int err = 0; \
@@ -1204,7 +1198,7 @@ int checkasm_check_##type(const char *file, int line, \
 stride1 /= sizeof(*buf1); \
 stride2 /= sizeof(*buf2); \
 for (y = 0; y < h; y++) \
-if (memcmp(&buf1[y*stride1], &buf2[y*stride2], w*sizeof(*buf1))) \
+if (!compare(&buf1[y*stride1], &buf2[y*stride2], w)) \
 break; \
 if (y != h) { \
 if (check_err(file, line, name, w, h, &err)) \
@@ -1226,38 +1220,50 @@ int checkasm_check_##type(const char *file, int line, \
 buf2 -= h*stride2; \
 } \
 for (y = -padding; y < 0; y++) \
-if (memcmp(&buf1[y*stride1 - padding], &buf2[y*stride2 - padding], \
-   (w + 2*padding)*sizeof(*buf1))) { \
+if (!compare(&buf1[y*stride1 - padding], &buf2[y*stride2 - padding], \
+ w + 2*padding)) { \
 if (check_err(file, line, name, w, h, &err)) \
 return 1; \
 fprintf(stderr, " overwrite above\n"); \
 break; \
 } \
 for (y = aligned_h; y < aligned_h + padding; y++) \
-if (memcmp(&buf1[y*stride1 - padding], &buf2[y*stride2 - padding], \
-   (w + 2*padding)*sizeof(*buf1))) { \
+if (!compare(&buf1[y*stride1 - padding], &buf2[y*stride2 - padding], \
+ w + 2*padding)) { \
 if (check_err(file, line, name, w, h, &err)) \
 return 1; \
 fprintf(stderr, " overwrite below\n"); \
 break; \
 } \
 for (y = 0; y < h; y++) \
-if (memcmp(&buf1[y*stride1 - padding], &buf2[y*stride2 - padding], \
-   padding*sizeof(*buf1))) { \
+if (!compare(&buf1[y*stride1 - padding], &buf2[y*stride2 - padding], \
+ padding)) { \
 if (check_err(file, line, name, w, h, &err)) \
 return 1; \
 fprintf(stderr, " overwrite left\n"); \
 break; \
 } \
 for (y = 0; y < h; y++) \
-if (memcmp(&buf1[y*stride1 + aligned_w], &buf2[y*stride2 + aligned_w], \
-   padding*sizeof(*buf1))) { \
+if (!compare(&buf1[y*stride1 + aligned_w], &buf2[y*stride2 + aligned_w], \
+ padding)) { \
 if (check_err(file, line, name, w, h, &err)) \
 return 1; \
 fprintf(stderr, " overwrite right\n"); \
 break; \
 } \
 return err; \
+} while (0)
+
+#define cmp_int(a, b, len) (!memcmp(a, b, (len) * sizeof(*(a))))
+#define DEF_CHECKASM_CHECK_FUNC(type, fmt) \
+int checkasm_check_##type(const char *file, int line, \
+  const type *buf1, ptrdiff_t stride1, \
+  const type *buf2, ptrdiff_t stride2, \
+  int w, int h, const char *name, \
+  int align_w, int align_h, \
+  int padding) \
+{ \
+DEF_CHECKASM_CHECK_BODY(cmp_int, type, fmt); \
 }
 
 DEF_CHECKASM_CHECK_FUNC(uint8_t,  "%02x")
@@ -1265,3 +1271,15 @@ DEF_CHECKASM_CHECK_FUNC(uint16_t, "%04x")
 DEF_CHECKASM_CHECK_FUNC(uint32_t, "%08x")
 DEF_CHECKASM_CHECK_FUNC(int16_t,  "%6d")
 DEF_CHECKASM_CHECK_FUNC(int32_t,  "%9d")
+
+int checkasm_check_float_ulp(const char *file, int line,
+ const float *buf1, ptrdiff_t stride1,
+ const float *buf2, ptrdiff_t stride2,
+ int w, int h, const char *name,
+ unsigned max_ulp, int align_w, int align_h,
+ int padding)
+{
+#define cmp_float(a, b, len) float_near_ulp_array(a, b, max_ulp, len)

[FFmpeg-devel] [PATCH v2 02/17] swscale/format: add ff_fmt_clear()

2025-05-21 Thread Niklas Haas

From: Niklas Haas 

Reset an SwsFormat to its fully unset/invalid state.
---
 libswscale/format.h | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/libswscale/format.h b/libswscale/format.h
index 3b6d745159..be92038f4f 100644
--- a/libswscale/format.h
+++ b/libswscale/format.h
@@ -85,6 +85,20 @@ typedef struct SwsFormat {
 SwsColor color;
 } SwsFormat;
 
+static inline void ff_fmt_clear(SwsFormat *fmt)
+{
+*fmt = (SwsFormat) {
+.format = AV_PIX_FMT_NONE,
+.range  = AVCOL_RANGE_UNSPECIFIED,
+.csp= AVCOL_SPC_UNSPECIFIED,
+.loc= AVCHROMA_LOC_UNSPECIFIED,
+.color = {
+.prim = AVCOL_PRI_UNSPECIFIED,
+.trc  = AVCOL_TRC_UNSPECIFIED,
+},
+};
+}
+
 /**
  * This function also sanitizes and strips the input data, removing irrelevant
  * fields for certain formats.
-- 
2.49.0

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

[FFmpeg-devel] [PATCH v2 01/17] swscale/format: rename legacy format conversion table

2025-05-21 Thread Niklas Haas

From: Niklas Haas 

---
 libswscale/format.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/libswscale/format.c b/libswscale/format.c
index e4c1348b90..b77081dd7a 100644
--- a/libswscale/format.c
+++ b/libswscale/format.c
@@ -24,14 +24,14 @@
 
 #include "format.h"
 
-typedef struct FormatEntry {
+typedef struct LegacyFormatEntry {
 uint8_t is_supported_in :1;
 uint8_t is_supported_out:1;
 uint8_t is_supported_endianness :1;
-} FormatEntry;
+} LegacyFormatEntry;
 
 /* Format support table for legacy swscale */
-static const FormatEntry format_entries[] = {
+static const LegacyFormatEntry legacy_format_entries[] = {
 [AV_PIX_FMT_YUV420P]= { 1, 1 },
 [AV_PIX_FMT_YUYV422]= { 1, 1 },
 [AV_PIX_FMT_RGB24]  = { 1, 1 },
@@ -262,20 +262,20 @@ static const FormatEntry format_entries[] = {
 
 int sws_isSupportedInput(enum AVPixelFormat pix_fmt)
 {
-return (unsigned)pix_fmt < FF_ARRAY_ELEMS(format_entries) ?
-   format_entries[pix_fmt].is_supported_in : 0;
+return (unsigned)pix_fmt < FF_ARRAY_ELEMS(legacy_format_entries) ?
+legacy_format_entries[pix_fmt].is_supported_in : 0;
 }
 
 int sws_isSupportedOutput(enum AVPixelFormat pix_fmt)
 {
-return (unsigned)pix_fmt < FF_ARRAY_ELEMS(format_entries) ?
-   format_entries[pix_fmt].is_supported_out : 0;
+return (unsigned)pix_fmt < FF_ARRAY_ELEMS(legacy_format_entries) ?
+legacy_format_entries[pix_fmt].is_supported_out : 0;
 }
 
 int sws_isSupportedEndiannessConversion(enum AVPixelFormat pix_fmt)
 {
-return (unsigned)pix_fmt < FF_ARRAY_ELEMS(format_entries) ?
-   format_entries[pix_fmt].is_supported_endianness : 0;
+return (unsigned)pix_fmt < FF_ARRAY_ELEMS(legacy_format_entries) ?
+legacy_format_entries[pix_fmt].is_supported_endianness : 0;
 }
 
 /**
-- 
2.49.0

[FFmpeg-devel] [PATCH v2 03/17] tests/checkasm: increase number of runs in between measurements

2025-05-21 Thread Niklas Haas

From: Niklas Haas 

Sometimes, when measuring very small functions, rdtsc is not accurate enough
to get a reliable measurement. This increases the number of runs inside the
inner loop from 4 to 32, which should help a lot. This is less important when
using the more precise linux-perf API, but still useful.

There should be no user-visible change since the number of runs is adjusted
to keep the total time spent measuring the same.
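
A quick way to sanity-check the claim above is to count the timed calls before
and after the change. The sketch below only models the arithmetic visible in
the diff (4 calls per iteration before, 32 calls per iteration with
`FFMAX(bench_runs >> 3, 1)` iterations after); the function names are
hypothetical and the value 1024 used in the usage note is just an example, not
necessarily the default.

```c
#include <assert.h>
#include <stdint.h>

/* Total timed calls before the change: 4 calls per loop iteration. */
static uint64_t total_calls_old(uint64_t bench_runs)
{
    return bench_runs * 4;
}

/* After the change: 32 calls per iteration, with the iteration count
 * divided by 8 and clamped to at least 1. */
static uint64_t total_calls_new(uint64_t bench_runs)
{
    uint64_t truns = bench_runs >> 3;
    if (truns < 1)
        truns = 1;
    return truns * 32;
}
```

For a run count of 1024 both functions give 4096 calls. The totals only differ
when the run count is not a multiple of 8, or is smaller than 8, where the
clamp to one iteration takes over.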
---
 tests/checkasm/checkasm.c |  2 +-
 tests/checkasm/checkasm.h | 24 +++-
 2 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
index 0734cd26bf..71d1e5766c 100644
--- a/tests/checkasm/checkasm.c
+++ b/tests/checkasm/checkasm.c
@@ -628,7 +628,7 @@ static inline double avg_cycles_per_call(const CheckasmPerf *const p)
 if (p->iterations) {
 const double cycles = (double)(10 * p->cycles) / p->iterations - state.nop_time;
 if (cycles > 0.0)
-return cycles / 4.0; /* 4 calls per iteration */
+return cycles / 32.0; /* 32 calls per iteration */
 }
 return 0.0;
 }
diff --git a/tests/checkasm/checkasm.h b/tests/checkasm/checkasm.h
index 146bfdec35..ad7ed10613 100644
--- a/tests/checkasm/checkasm.h
+++ b/tests/checkasm/checkasm.h
@@ -342,6 +342,22 @@ typedef struct CheckasmPerf {
 #define PERF_STOP(t)  t = AV_READ_TIME() - t
 #endif
 
+#define CALL4(...)\
+do {\
+tfunc(__VA_ARGS__); \
+tfunc(__VA_ARGS__); \
+tfunc(__VA_ARGS__); \
+tfunc(__VA_ARGS__); \
+} while (0)
+
+#define CALL16(...)\
+do {\
+CALL4(__VA_ARGS__); \
+CALL4(__VA_ARGS__); \
+CALL4(__VA_ARGS__); \
+CALL4(__VA_ARGS__); \
+} while (0)
+
 /* Benchmark the function */
 #define bench_new(...)\
 do {\
@@ -352,14 +368,12 @@ typedef struct CheckasmPerf {
 uint64_t tsum = 0;\
 uint64_t ti, tcount = 0;\
 uint64_t t = 0; \
-const uint64_t truns = bench_runs;\
+const uint64_t truns = FFMAX(bench_runs >> 3, 1);\
 checkasm_set_signal_handler_state(1);\
 for (ti = 0; ti < truns; ti++) {\
 PERF_START(t);\
-tfunc(__VA_ARGS__);\
-tfunc(__VA_ARGS__);\
-tfunc(__VA_ARGS__);\
-tfunc(__VA_ARGS__);\
+CALL16(__VA_ARGS__);\
+CALL16(__VA_ARGS__);\
 PERF_STOP(t);\
 if (t*tcount <= tsum*4 && ti > 0) {\
 tsum += t;\
-- 
2.49.0

[FFmpeg-devel] [PATCH v2 14/17] swscale/x86: add SIMD backend

2025-05-21 Thread Niklas Haas

From: Niklas Haas 

This covers most 8-bit and 16-bit ops, and some 32-bit ops. It also covers all
floating point operations. While this is not yet 100% coverage, it's good
enough for the vast majority of formats out there.

Of special note is the packed shuffle fast path, which uses pshufb at vector
sizes up to AVX512.
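
For readers unfamiliar with pshufb, the following is a scalar model of what a
single 16-byte shuffle does and how one mask can swizzle several packed pixels
at once. It is only a sketch of the instruction's semantics, not the assembly
from this patch; `shuffle16` and `make_swizzle_mask` are hypothetical names,
and the example assumes four 8-bit components per pixel.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one 16-byte pshufb: each output byte selects an input
 * byte by the low 4 bits of the mask byte, or becomes zero if the mask
 * byte has its high bit set. dst and src must not alias. */
static void shuffle16(uint8_t dst[16], const uint8_t src[16],
                      const uint8_t mask[16])
{
    for (int i = 0; i < 16; i++)
        dst[i] = (mask[i] & 0x80) ? 0 : src[mask[i] & 0x0F];
}

/* Hypothetical helper: build a mask that applies the same 4-component
 * swizzle to four packed 8-bit pixels. order[c] must be in 0..3. */
static void make_swizzle_mask(uint8_t mask[16], const uint8_t order[4])
{
    for (int px = 0; px < 4; px++) {
        for (int c = 0; c < 4; c++) {
            assert(order[c] < 4);
            mask[4 * px + c] = (uint8_t) (4 * px + order[c]);
        }
    }
}
```

With `order = { 2, 1, 0, 3 }` the mask turns four RGBA pixels into BGRA in one
shuffle, which is the kind of packed reordering the fast path targets.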
---
 libswscale/ops.c  |4 +
 libswscale/x86/Makefile   |3 +
 libswscale/x86/ops.c  |  706 ++
 libswscale/x86/ops_common.asm |  187 ++
 libswscale/x86/ops_float.asm  |  386 
 libswscale/x86/ops_int.asm| 1050 +
 6 files changed, 2336 insertions(+)
 create mode 100644 libswscale/x86/ops.c
 create mode 100644 libswscale/x86/ops_common.asm
 create mode 100644 libswscale/x86/ops_float.asm
 create mode 100644 libswscale/x86/ops_int.asm

diff --git a/libswscale/ops.c b/libswscale/ops.c
index 6403eff324..8a27e70ef9 100644
--- a/libswscale/ops.c
+++ b/libswscale/ops.c
@@ -29,9 +29,13 @@
 
 extern SwsOpBackend backend_c;
 extern SwsOpBackend backend_murder;
+extern SwsOpBackend backend_x86;
 
 const SwsOpBackend * const ff_sws_op_backends[] = {
 &backend_murder,
+#if ARCH_X86
+&backend_x86,
+#endif
 &backend_c,
 NULL
 };
diff --git a/libswscale/x86/Makefile b/libswscale/x86/Makefile
index f00154941d..a04bc8336f 100644
--- a/libswscale/x86/Makefile
+++ b/libswscale/x86/Makefile
@@ -10,6 +10,9 @@ OBJS-$(CONFIG_XMM_CLOBBER_TEST) += x86/w64xmmtest.o
 
 X86ASM-OBJS += x86/input.o  \
x86/output.o \
+   x86/ops_int.o\
+   x86/ops_float.o  \
+   x86/ops.o\
x86/scale.o  \
x86/scale_avx2.o  \
x86/range_convert.o  \
diff --git a/libswscale/x86/ops.c b/libswscale/x86/ops.c
new file mode 100644
index 00..d5fd046d64
--- /dev/null
+++ b/libswscale/x86/ops.c
@@ -0,0 +1,706 @@
+/**
+ * Copyright (C) 2025 Niklas Haas
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include 
+
+#include 
+#include 
+
+#include "../ops_chain.h"
+
+#define DECL_ENTRY(TYPE, NAME, ...) \
+static const SwsOpEntry op_##NAME = { \
+.op.type = SWS_PIXEL_##TYPE, \
+__VA_ARGS__ \
+}
+
+#define DECL_ASM(TYPE, NAME, ...) \
+void ff_##NAME(void); \
+DECL_ENTRY(TYPE, NAME, \
+.func = ff_##NAME, \
+__VA_ARGS__)
+
+#define DECL_PATTERN(TYPE, NAME, X, Y, Z, W, ...) \
+DECL_ASM(TYPE, p##X##Y##Z##W##_##NAME, \
+.op.comps.unused = { !X, !Y, !Z, !W }, \
+__VA_ARGS__ \
+)
+
+#define REF_PATTERN(NAME, X, Y, Z, W) \
+&op_p##X##Y##Z##W##_##NAME
+
+#define DECL_COMMON_PATTERNS(TYPE, NAME, ...) \
+DECL_PATTERN(TYPE, NAME, 1, 0, 0, 0, __VA_ARGS__); \
+DECL_PATTERN(TYPE, NAME, 1, 0, 0, 1, __VA_ARGS__); \
+DECL_PATTERN(TYPE, NAME, 1, 1, 1, 0, __VA_ARGS__); \
+DECL_PATTERN(TYPE, NAME, 1, 1, 1, 1, __VA_ARGS__) \
+
+#define REF_COMMON_PATTERNS(NAME) \
+REF_PATTERN(NAME, 1, 0, 0, 0), \
+REF_PATTERN(NAME, 1, 0, 0, 1), \
+

[FFmpeg-devel] [PATCH v2 16/17] swscale/format: add new format decode/encode logic

2025-05-21 Thread Niklas Haas

From: Niklas Haas 

This patch adds format handling code for the new operations. This entails
fully decoding a format to standardized RGB, and the inverse.

Handling it this way means we can always guarantee that a conversion path
exists from A to B without having to explicitly cover logic for each path;
and choosing RGB instead of YUV as the intermediate (as was done in swscale
v1) is more flexible with regard to enabling further operations such as
primaries conversions, linear scaling, etc.

In the case of YUV->YUV transform, the redundant matrix multiplication will
be canceled out anyways.
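
To see why the detour through RGB costs nothing for YUV->YUV, note that the
decode step applies the inverse of the matrix the encode step applies, so
their product is the identity. The sketch below checks this numerically for a
full-range matrix built from luma coefficients; the `Mat3` type and helper
names are hypothetical, the coefficients 0.299 and 0.114 are the BT.601 values
chosen purely as an example, and how the optimizer in this series actually
detects and drops the pair is not shown here.

```c
#include <assert.h>

typedef struct Mat3 { double m[3][3]; } Mat3;

static Mat3 mat3_mul(Mat3 a, Mat3 b)
{
    Mat3 r = {{{0}}};
    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++)
            for (int k = 0; k < 3; k++)
                r.m[i][j] += a.m[i][k] * b.m[k][j];
    return r;
}

/* Inverse via the adjugate; cyclic indices give signed cofactors directly. */
static Mat3 mat3_inv(Mat3 a)
{
    Mat3 cof, inv;
    double det = 0.0;
    for (int i = 0; i < 3; i++) {
        for (int j = 0; j < 3; j++) {
            const int i1 = (i + 1) % 3, i2 = (i + 2) % 3;
            const int j1 = (j + 1) % 3, j2 = (j + 2) % 3;
            cof.m[i][j] = a.m[i1][j1] * a.m[i2][j2] -
                          a.m[i1][j2] * a.m[i2][j1];
        }
    }
    for (int j = 0; j < 3; j++)
        det += a.m[0][j] * cof.m[0][j];
    assert(det != 0.0);
    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++)
            inv.m[j][i] = cof.m[i][j] / det;
    return inv;
}

/* Full-range RGB -> YCbCr matrix from the luma coefficients kr and kb. */
static Mat3 rgb_to_yuv(double kr, double kb)
{
    const double kg = 1.0 - kr - kb;
    const Mat3 m = {{
        { kr,                   kg,                   kb                   },
        { -kr / (2 * (1 - kb)), -kg / (2 * (1 - kb)), 0.5                  },
        { 0.5,                  -kg / (2 * (1 - kr)), -kb / (2 * (1 - kr)) },
    }};
    return m;
}

static int mat3_is_identity(Mat3 a, double eps)
{
    for (int i = 0; i < 3; i++) {
        for (int j = 0; j < 3; j++) {
            double d = a.m[i][j] - (i == j ? 1.0 : 0.0);
            if (d < 0)
                d = -d;
            if (d > eps)
                return 0;
        }
    }
    return 1;
}
```

Composing `rgb_to_yuv(kr, kb)` with its inverse yields the identity up to
rounding, so an optimizer that merges adjacent linear operations can drop the
pair when source and destination share the same colorspace.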
---
 libswscale/format.c | 926 
 libswscale/format.h |  23 ++
 2 files changed, 949 insertions(+)

diff --git a/libswscale/format.c b/libswscale/format.c
index b77081dd7a..7cbc5b37db 100644
--- a/libswscale/format.c
+++ b/libswscale/format.c
@@ -21,8 +21,22 @@
 #include "libavutil/avassert.h"
 #include "libavutil/hdr_dynamic_metadata.h"
 #include "libavutil/mastering_display_metadata.h"
+#include "libavutil/refstruct.h"
 
 #include "format.h"
+#include "csputils.h"
+#include "ops_internal.h"
+
+#define Q(N) ((AVRational) { N, 1 })
+#define Q0   Q(0)
+#define Q1   Q(1)
+
+#define RET(x) \
+do { \
+int __ret = (x); \
+if (__ret  < 0) \
+return __ret; \
+} while (0)
 
 typedef struct LegacyFormatEntry {
 uint8_t is_supported_in :1;
@@ -582,3 +596,915 @@ int sws_is_noop(const AVFrame *dst, const AVFrame *src)
 
 return 1;
 }
+
+/* Returns the type suitable for a pixel after fully decoding/unpacking it */
+static SwsPixelType fmt_pixel_type(enum AVPixelFormat fmt)
+{
+const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(fmt);
+const int bits = FFALIGN(desc->comp[0].depth, 8);
+if (desc->flags & AV_PIX_FMT_FLAG_FLOAT) {
+switch (bits) {
+case 32: return SWS_PIXEL_F32;
+}
+} else {
+switch (bits) {
+case  8: return SWS_PIXEL_U8;
+case 16: return SWS_PIXEL_U16;
+case 32: return SWS_PIXEL_U32;
+}
+}
+
+return SWS_PIXEL_NONE;
+}
+
+static SwsSwizzleOp fmt_swizzle(enum AVPixelFormat fmt)
+{
+switch (fmt) {
+case AV_PIX_FMT_ARGB:
+case AV_PIX_FMT_0RGB:
+case AV_PIX_FMT_AYUV64LE:
+case AV_PIX_FMT_AYUV64BE:
+case AV_PIX_FMT_AYUV:
+case AV_PIX_FMT_X2RGB10LE:
+case AV_PIX_FMT_X2RGB10BE:
+return (SwsSwizzleOp) {{ .x = 3, 0, 1, 2 }};
+case AV_PIX_FMT_BGR24:
+case AV_PIX_FMT_BGR8:
+case AV_PIX_FMT_BGR4:
+case AV_PIX_FMT_BGR4_BYTE:
+case AV_PIX_FMT_BGRA:
+case AV_PIX_FMT_BGR565BE:
+case AV_PIX_FMT_BGR565LE:
+case AV_PIX_FMT_BGR555BE:
+case AV_PIX_FMT_BGR555LE:
+case AV_PIX_FMT_BGR444BE:
+case AV_PIX_FMT_BGR444LE:
+case AV_PIX_FMT_BGR48BE:
+case AV_PIX_FMT_BGR48LE:
+case AV_PIX_FMT_BGRA64BE:
+case AV_PIX_FMT_BGRA64LE:
+case AV_PIX_FMT_BGR0:
+case AV_PIX_FMT_VUYA:
+case AV_PIX_FMT_VUYX:
+return (SwsSwizzleOp) {{ .x = 2, 1, 0, 3 }};
+case AV_PIX_FMT_ABGR:
+case AV_PIX_FMT_0BGR:
+case AV_PIX_FMT_X2BGR10LE:
+case AV_PIX_FMT_X2BGR10BE:
+return (SwsSwizzleOp) {{ .x = 3, 2, 1, 0 }};
+case AV_PIX_FMT_YA8:
+case AV_PIX_FMT_YA16BE:
+case AV_PIX_FMT_YA16LE:
+return (SwsSwizzleOp) {{ .x = 0, 3, 1, 2 }};
+case AV_PIX_FMT_XV30BE:
+case AV_PIX_FMT_XV30LE:
+return (SwsSwizzleOp) {{ .x = 3, 2, 0, 1 }};
+case AV_PIX_FMT_VYU444:
+case AV_PIX_FMT_V30XBE:
+case AV_PIX_FMT_V30XLE:
+return (SwsSwizzleOp) {{ .x = 2, 0, 1, 3 }};
+case AV_PIX_FMT_XV36BE:
+case AV_PIX_FMT_XV36LE:
+case AV_PIX_FMT_XV48BE:
+case AV_PIX_FMT_XV48LE:
+case AV_PIX_FMT_UYVA:
+return (SwsSwizzleOp) {{ .x = 1, 0, 2, 3 }};
+case AV_PIX_FMT_GBRP:
+case AV_PIX_FMT_GBRP9BE:
+case AV_PIX_FMT_GBRP9LE:
+case AV_PIX_FMT_GBRP10BE:
+case AV_PIX_FMT_GBRP10LE:
+case AV_PIX_FMT_GBRP12BE:
+case AV_PIX_FMT_GBRP12LE:
+case AV_PIX_FMT_GBRP14BE:
+case AV_PIX_FMT_GBRP14LE:
+case AV_PIX_FMT_GBRP16BE:
+case AV_PIX_FMT_GBRP16LE:
+case AV_PIX_FMT_GBRPF16BE:
+case AV_PIX_FMT_GBRPF16LE:
+case AV_PIX_FMT_GBRAP:
+case AV_PIX_FMT_GBRAP10LE:
+case AV_PIX_FMT_GBRAP10BE:
+case AV_PIX_FMT_GBRAP12LE:
+case AV_PIX_FMT_GBRAP12BE:
+case AV_PIX_FMT_GBRAP14LE:
+case AV_PIX_FMT_GBRAP14BE:
+case AV_PIX_FMT_GBRAP16LE:
+case AV_PIX_FMT_GBRAP16BE:
+case AV_PIX_FMT_GBRPF32BE:
+case AV_PIX_FMT_GBRPF32LE:
+case AV_PIX_FMT_GBRAPF16BE:
+case AV_PIX_FMT_GBRAPF16LE:
+case AV_PIX_FMT_GBRAPF32BE:
+case AV_PIX_FMT_GBRAPF32LE:
+

[FFmpeg-devel] [PATCH v2 15/17] tests/checkasm: add checkasm tests for swscale ops

2025-05-21 Thread Niklas Haas

From: Niklas Haas 

Because of the lack of an external ABI on low-level kernels, we cannot
directly test internal functions. Instead, we construct a minimal op chain
consisting of a read, the op to be tested, and a write.

The bigger complication arises from the fact that the backend may generate
arbitrary internal state that needs to be passed back to the implementation,
which means we cannot directly call `func_ref` on the generated chain. To get
around this, always compile the op chain twice - once using the backend to be
tested, and once using the reference C backend.

The actual entry point may also just be a shared wrapper, so we need to
be very careful to run checkasm_check_func() on a pseudo-pointer that will
actually be unique for each combination of backend and active CPU flags.
---
 tests/checkasm/Makefile   |   8 +-
 tests/checkasm/checkasm.c |   1 +
 tests/checkasm/checkasm.h |   1 +
 tests/checkasm/sw_ops.c   | 770 ++
 4 files changed, 779 insertions(+), 1 deletion(-)
 create mode 100644 tests/checkasm/sw_ops.c

diff --git a/tests/checkasm/Makefile b/tests/checkasm/Makefile
index fabbf595b4..d38ec371df 100644
--- a/tests/checkasm/Makefile
+++ b/tests/checkasm/Makefile
@@ -66,7 +66,13 @@ AVFILTEROBJS-$(CONFIG_SOBEL_FILTER)  += vf_convolution.o
 CHECKASMOBJS-$(CONFIG_AVFILTER) += $(AVFILTEROBJS-yes)
 
 # swscale tests
-SWSCALEOBJS += sw_gbrp.o sw_range_convert.o sw_rgb.o sw_scale.o sw_yuv2rgb.o sw_yuv2yuv.o
+SWSCALEOBJS += sw_gbrp.o\
+   sw_ops.o \
+   sw_range_convert.o   \
+   sw_rgb.o \
+   sw_scale.o   \
+   sw_yuv2rgb.o \
+   sw_yuv2yuv.o
 
 CHECKASMOBJS-$(CONFIG_SWSCALE)  += $(SWSCALEOBJS)
 
diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
index f393a0cb96..11bd5668cf 100644
--- a/tests/checkasm/checkasm.c
+++ b/tests/checkasm/checkasm.c
@@ -298,6 +298,7 @@ static const struct {
 { "sw_scale", checkasm_check_sw_scale },
 { "sw_yuv2rgb", checkasm_check_sw_yuv2rgb },
 { "sw_yuv2yuv", checkasm_check_sw_yuv2yuv },
+{ "sw_ops", checkasm_check_sw_ops },
 #endif
 #if CONFIG_AVUTIL
 { "aes",   checkasm_check_aes },
diff --git a/tests/checkasm/checkasm.h b/tests/checkasm/checkasm.h
index ec01bd6207..d69f4cb835 100644
--- a/tests/checkasm/checkasm.h
+++ b/tests/checkasm/checkasm.h
@@ -132,6 +132,7 @@ void checkasm_check_sw_rgb(void);
 void checkasm_check_sw_scale(void);
 void checkasm_check_sw_yuv2rgb(void);
 void checkasm_check_sw_yuv2yuv(void);
+void checkasm_check_sw_ops(void);
 void checkasm_check_takdsp(void);
 void checkasm_check_utvideodsp(void);
 void checkasm_check_v210dec(void);
diff --git a/tests/checkasm/sw_ops.c b/tests/checkasm/sw_ops.c
new file mode 100644
index 00..e82f028fd9
--- /dev/null
+++ b/tests/checkasm/sw_ops.c
@@ -0,0 +1,770 @@
+/**
+ * Copyright (C) 2025 Niklas Haas
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with FFmpeg; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include 
+
+#include "libavutil/avassert.h"
+#include "libavutil/mem_internal.h"
+#include "libavutil/refstruct.h"
+
+#include "libswscale/ops.h"
+#include "libswscale/ops_internal.h"
+
+#include "checkasm.h"
+
+enum {
+LINES  = 2,
+PLANES = 4,
+PIXELS = 64,
+};
+
+enum {
+U8  = SWS_PIXEL_U8,
+U16 = SWS_PIXEL_U16,
+U32 = SWS_PIXEL_U32,
+F32 = SWS_PIXEL_F32,
+};
+
+#define FMT(fmt, ...) tprintf((char[256]) {0}, 256, fmt, __VA_ARGS__)
+static const char *tprintf(char buf[], size_t size, const char *fmt, ...)
+{
+va_list ap;
+va_start(ap, fmt);
+vsnprintf(buf, size, fmt, ap);
+va_end(ap);
+return buf;
+}
+
+static int rw_pixel_bits(const SwsOp *op)
+{
+const int elems = op->rw.packed ? op->rw.elems : 1;
+const int size  = ff_sws_pixel_type_size(op->type);
+const int bits  = 8 >> op->rw.frac;
+av_assert1(bits >= 1);
+return elems * size * bits;
+}
+
+static float rndf(void)
+{
+union { uint32_t u; float f; } x;
+do {
+x.u = rnd();
+} while (!isnor

[FFmpeg-devel] [PATCH v2 17/17] swscale/graph: allow experimental use of new format handler

2025-05-21 Thread Niklas Haas

From: Niklas Haas 

\o/
---
 libswscale/graph.c | 84 --
 1 file changed, 82 insertions(+), 2 deletions(-)

diff --git a/libswscale/graph.c b/libswscale/graph.c
index dc7784aa49..24930e7627 100644
--- a/libswscale/graph.c
+++ b/libswscale/graph.c
@@ -34,6 +34,7 @@
 #include "lut3d.h"
 #include "swscale_internal.h"
 #include "graph.h"
+#include "ops.h"
 
 static int pass_alloc_output(SwsPass *pass)
 {
@@ -453,6 +454,85 @@ static int add_legacy_sws_pass(SwsGraph *graph, SwsFormat src, SwsFormat dst,
 return 0;
 }
 
+/*
+ * Format conversion *
+ */
+
+static int add_convert_pass(SwsGraph *graph, SwsFormat src, SwsFormat dst,
+SwsPass *input, SwsPass **output)
+{
+const SwsPixelType type = SWS_PIXEL_F32;
+
+SwsContext *ctx = graph->ctx;
+SwsOpList *ops = NULL;
+int ret = AVERROR(ENOTSUP);
+
+/* Mark the entire new ops infrastructure as experimental for now */
+if (!(ctx->flags & SWS_UNSTABLE))
+goto fail;
+
+/* The new format conversion layer cannot scale for now */
+if (src.width != dst.width || src.height != dst.height ||
+src.desc->log2_chroma_h || src.desc->log2_chroma_w ||
+dst.desc->log2_chroma_h || dst.desc->log2_chroma_w)
+goto fail;
+
+/* The new code does not yet support alpha blending */
+if (src.desc->flags & AV_PIX_FMT_FLAG_ALPHA &&
+ctx->alpha_blend != SWS_ALPHA_BLEND_NONE)
+goto fail;
+
+ops = ff_sws_op_list_alloc();
+if (!ops)
+return AVERROR(ENOMEM);
+ops->src = src;
+ops->dst = dst;
+
+ret = ff_sws_decode_pixfmt(ops, src.format);
+if (ret < 0)
+goto fail;
+ret = ff_sws_decode_colors(ctx, type, ops, src, &graph->incomplete);
+if (ret < 0)
+goto fail;
+ret = ff_sws_encode_colors(ctx, type, ops, dst, &graph->incomplete);
+if (ret < 0)
+goto fail;
+ret = ff_sws_encode_pixfmt(ops, dst.format);
+if (ret < 0)
+goto fail;
+
+av_log(ctx, AV_LOG_VERBOSE, "Conversion pass for %s -> %s:\n",
+   av_get_pix_fmt_name(src.format), av_get_pix_fmt_name(dst.format));
+
+av_log(ctx, AV_LOG_DEBUG, "Unoptimized operation list:\n");
+ff_sws_op_list_print(ctx, AV_LOG_DEBUG, ops);
+av_log(ctx, AV_LOG_DEBUG, "Optimized operation list:\n");
+
+ff_sws_op_list_optimize(ops);
+if (ops->num_ops == 0) {
+av_log(ctx, AV_LOG_VERBOSE, "  optimized into memcpy\n");
+ff_sws_op_list_free(&ops);
+*output = input;
+return 0;
+}
+
+ff_sws_op_list_print(ctx, AV_LOG_VERBOSE, ops);
+
+ret = ff_sws_compile_pass(graph, ops, 0, dst, input, output);
+if (ret < 0)
+goto fail;
+
+ret = 0;
+/* fall through */
+
+fail:
+ff_sws_op_list_free(&ops);
+if (ret == AVERROR(ENOTSUP))
+return add_legacy_sws_pass(graph, src, dst, input, output);
+return ret;
+}
+
+
 /**
  * Gamut and tone mapping *
  **/
@@ -522,7 +602,7 @@ static int adapt_colors(SwsGraph *graph, SwsFormat src, SwsFormat dst,
 if (fmt_in != src.format) {
 SwsFormat tmp = src;
 tmp.format = fmt_in;
-ret = add_legacy_sws_pass(graph, src, tmp, input, &input);
+ret = add_convert_pass(graph, src, tmp, input, &input);
 if (ret < 0)
 return ret;
 }
@@ -564,7 +644,7 @@ static int init_passes(SwsGraph *graph)
 src.color  = dst.color;
 
 if (!ff_fmt_equal(&src, &dst)) {
-ret = add_legacy_sws_pass(graph, src, dst, pass, &pass);
+ret = add_convert_pass(graph, src, dst, pass, &pass);
 if (ret < 0)
 return ret;
 }
-- 
2.49.0

[FFmpeg-devel] [PATCH v2 13/17] swscale/ops_memcpy: add 'memcpy' backend for plane->plane copies

2025-05-21 Thread Niklas Haas

From: Niklas Haas 

Provides a generic fast path for any operation list that can be decomposed
into a series of memcpy and memset operations.

25% faster than the x86 backend for yuv444p -> yuva444p
33% faster than the x86 backend for gray -> yuvj444p
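
For readers skimming the series, the copy strategy can be sketched in isolation. This is only an illustration, not the code from the patch: copy_plane() and clear_plane() are hypothetical helpers, and the assumption is that a plane whose input and output strides match can be moved with one bulk memcpy, while differing strides need a row-by-row copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of the patch: copy `lines` rows of
 * `row_bytes` payload bytes each from `src` to `dst`. */
static void copy_plane(uint8_t *dst, ptrdiff_t dst_stride,
                       const uint8_t *src, ptrdiff_t src_stride,
                       size_t row_bytes, int lines)
{
    assert(lines >= 0 && dst_stride > 0 && src_stride > 0);

    if (dst_stride == src_stride) {
        /* Identical layout: a single bulk copy, padding included */
        memcpy(dst, src, (size_t) dst_stride * lines);
        return;
    }

    /* Different strides: copy only the payload of each row */
    for (int y = 0; y < lines; y++) {
        memcpy(dst, src, row_bytes);
        dst += dst_stride;
        src += src_stride;
    }
}

/* Hypothetical helper: fill a plane that has no source with one byte value */
static void clear_plane(uint8_t *dst, ptrdiff_t stride, uint8_t value, int lines)
{
    assert(lines >= 0 && stride > 0);
    memset(dst, value, (size_t) stride * lines);
}
```

A conversion such as yuv444p -> yuva444p would then amount to three plane copies plus one clear for the new alpha plane.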
---
 libswscale/Makefile |   1 +
 libswscale/ops.c|   2 +
 libswscale/ops_memcpy.c | 132 
 3 files changed, 135 insertions(+)
 create mode 100644 libswscale/ops_memcpy.c

diff --git a/libswscale/Makefile b/libswscale/Makefile
index 6e5696c5a6..136d33f6bc 100644
--- a/libswscale/Makefile
+++ b/libswscale/Makefile
@@ -18,6 +18,7 @@ OBJS = alphablend.o \
ops.o\
ops_backend.o\
ops_chain.o  \
+   ops_memcpy.o \
ops_optimizer.o  \
options.o\
output.o \
diff --git a/libswscale/ops.c b/libswscale/ops.c
index 3b9c2844f8..6403eff324 100644
--- a/libswscale/ops.c
+++ b/libswscale/ops.c
@@ -28,8 +28,10 @@
 #include "ops_internal.h"
 
 extern SwsOpBackend backend_c;
+extern SwsOpBackend backend_murder;
 
 const SwsOpBackend * const ff_sws_op_backends[] = {
+&backend_murder,
 &backend_c,
 NULL
 };
diff --git a/libswscale/ops_memcpy.c b/libswscale/ops_memcpy.c
new file mode 100644
index 00..1fcb58d452
--- /dev/null
+++ b/libswscale/ops_memcpy.c
@@ -0,0 +1,132 @@
+/**
+ * Copyright (C) 2025 Niklas Haas
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/avassert.h"
+
+#include "ops_backend.h"
+
+typedef struct MemcpyPriv {
+int num_planes;
+int index[4]; /* or -1 to clear plane */
+uint8_t clear_value[4];
+} MemcpyPriv;
+
+/* Memcpy backend for trivial cases */
+
+static void process(const SwsOpExec *exec, const void *priv,
+int x_start, int y_start, int x_end, int y_end)
+{
+const MemcpyPriv *p = priv;
+const int lines = y_end - y_start;
+av_assert1(x_start == 0 && x_end == exec->width);
+
+for (int i = 0; i < p->num_planes; i++) {
+uint8_t *out = exec->out[i];
+const int idx = p->index[i];
+if (idx < 0) {
+memset(out, p->clear_value[i], exec->out_stride[i] * lines);
+} else if (exec->out_stride[i] == exec->in_stride[idx]) {
+memcpy(out, exec->in[idx], exec->out_stride[i] * lines);
+} else {
+const int bytes = x_end * exec->pixel_bits_out >> 3;
+const uint8_t *in = exec->in[idx];
+for (int y = y_start; y < y_end; y++) {
+memcpy(out, in, bytes);
+out += exec->out_stride[i];
+in  += exec->in_stride[idx];
+}
+}
+}
+}
+
+static int compile(SwsContext *ctx, SwsOpList *ops, SwsCompiledOp *out)
+{
+MemcpyPriv p = {0};
+
+for (int n = 0; n < ops->num_ops; n++) {
+const SwsOp *op = &ops->ops[n];
+switch (op->op) {
+case SWS_OP_READ:
+if ((op->rw.packed && op->rw.elems != 1) || op->rw.frac)
+return AVERROR(ENOTSUP);
+for (int i = 0; i < op->rw.elems; i++)
+p.index[i] = i;
+break;
+
+case SWS_OP_SWIZZLE: {
+const MemcpyPriv orig = p;
+for (int i = 0; i < 4; i++) {
+/* Explicitly exclude swizzle masks that contain duplicates,
+ * because these are wasteful to implement as a memcpy */
+for (int j = 0; j < i; j++) {
+if (op->swizzle.in[i] == op->swizzle.in[j])
+return AVERROR(ENOTSUP);
+}
+p.index[i] = orig.index[op->swizzle.in[i]];
+}
+break;
+}
+
+case SWS_OP_CLEAR:
+for (int i = 0; i < 4; i++) {
+if (!op->c.q4[i].den)
+continue;
+if (op->c.q4[i].den != 1)
+return AVERROR(ENOTSUP);
+
+/* Ensure all bytes to be cleared are the sam

[FFmpeg-devel] [PATCH v2 07/17] swscale/optimizer: add high-level ops optimizer

2025-05-21 Thread Niklas Haas

From: Niklas Haas 

This is responsible for taking a "naive" ops list and optimizing it
as much as possible. Also includes a small analyzer that generates component
metadata for use by the optimizer.
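
As a rough illustration of the component metadata, the flag merge rule used by the analyzer can be tried standalone. This is a sketch under assumptions: the COMP_* bit values and the one-line flag descriptions are made up for the example (the real SWS_COMP_* definitions live in ops.h), and merge_flags() only mirrors the shape of merge_comp_flags() further down in this patch.

```c
#include <assert.h>

/* Bit values are assumptions for this sketch only */
enum {
    COMP_GARBAGE = 1 << 0, /* contents are undefined */
    COMP_EXACT   = 1 << 1, /* value is known to be exact */
    COMP_ZERO    = 1 << 2, /* value is known to be zero */
};

/* Merging with this value leaves the other operand unchanged */
static const unsigned comp_identity = COMP_ZERO | COMP_EXACT;

/* "Good" properties survive only if both inputs have them (AND);
 * garbage is contagious (OR). */
static unsigned merge_flags(unsigned a, unsigned b)
{
    const unsigned flags_or  = COMP_GARBAGE;
    const unsigned flags_and = COMP_ZERO | COMP_EXACT;
    return ((a & b) & flags_and) | ((a | b) & flags_or);
}

/* Verify the identity property over every combination of the three bits */
static void check_identity(void)
{
    for (unsigned x = 0; x < 8; x++) {
        assert(merge_flags(comp_identity, x) == x);
        assert(merge_flags(x, comp_identity) == x);
    }
}
```

The identity value matters because a fold over several inputs can start from it without biasing the result.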
---
 libswscale/Makefile|   1 +
 libswscale/ops.h   |  12 +
 libswscale/ops_optimizer.c | 783 +
 3 files changed, 796 insertions(+)
 create mode 100644 libswscale/ops_optimizer.c

diff --git a/libswscale/Makefile b/libswscale/Makefile
index e0beef4e69..810c9dee78 100644
--- a/libswscale/Makefile
+++ b/libswscale/Makefile
@@ -16,6 +16,7 @@ OBJS = alphablend.o \
input.o  \
lut3d.o  \
ops.o\
+   ops_optimizer.o  \
options.o\
output.o \
rgb2rgb.o\
diff --git a/libswscale/ops.h b/libswscale/ops.h
index 85462ae337..ae65d578b3 100644
--- a/libswscale/ops.h
+++ b/libswscale/ops.h
@@ -237,4 +237,16 @@ void ff_sws_op_list_remove_at(SwsOpList *ops, int index, int count);
  */
 void ff_sws_op_list_print(void *log_ctx, int log_level, const SwsOpList *ops);
 
+/**
+ * Infer + propagate known information about components. Called automatically
+ * when needed by the optimizer and compiler.
+ */
+void ff_sws_op_list_update_comps(SwsOpList *ops);
+
+/**
+ * Fuse compatible and eliminate redundant operations, as well as replacing
+ * some operations with more efficient alternatives.
+ */
+int ff_sws_op_list_optimize(SwsOpList *ops);
+
 #endif
diff --git a/libswscale/ops_optimizer.c b/libswscale/ops_optimizer.c
new file mode 100644
index 00..d503bf7bf3
--- /dev/null
+++ b/libswscale/ops_optimizer.c
@@ -0,0 +1,783 @@
+/**
+ * Copyright (C) 2025 Niklas Haas
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/avassert.h"
+#include "libavutil/rational.h"
+
+#include "ops.h"
+
+#define Q(N) ((AVRational) { N, 1 })
+
+#define RET(x) \
+do { \
+if ((ret = (x)) < 0) \
+return ret; \
+} while (0)
+
+/* Returns true for operations that are independent per channel. These can
+ * usually be commuted freely with other such operations. */
+static bool op_type_is_independent(SwsOpType op)
+{
+switch (op) {
+case SWS_OP_SWAP_BYTES:
+case SWS_OP_LSHIFT:
+case SWS_OP_RSHIFT:
+case SWS_OP_CONVERT:
+case SWS_OP_DITHER:
+case SWS_OP_MIN:
+case SWS_OP_MAX:
+case SWS_OP_SCALE:
+return true;
+case SWS_OP_INVALID:
+case SWS_OP_READ:
+case SWS_OP_WRITE:
+case SWS_OP_SWIZZLE:
+case SWS_OP_CLEAR:
+case SWS_OP_LINEAR:
+case SWS_OP_PACK:
+case SWS_OP_UNPACK:
+return false;
+case SWS_OP_TYPE_NB:
+break;
+}
+
+av_assert0(!"Invalid operation type!");
+return false;
+}
+
+static AVRational expand_factor(SwsPixelType from, SwsPixelType to)
+{
+const int src = ff_sws_pixel_type_size(from);
+const int dst = ff_sws_pixel_type_size(to);
+int scale = 0;
+for (int i = 0; i < dst / src; i++)
+scale = scale << src * 8 | 1;
+return Q(scale);
+}
+
+/* merge_comp_flags() forms a monoid with flags_identity as the null element */
+static const unsigned flags_identity = SWS_COMP_ZERO | SWS_COMP_EXACT;
+static unsigned merge_comp_flags(unsigned a, unsigned b)
+{
+const unsigned flags_or  = SWS_COMP_GARBAGE;
+const unsigned flags_and = SWS_COMP_ZERO | SWS_COMP_EXACT;
+return ((a & b) & flags_and) | ((a | b) & flags_or);
+}
+
+/* Infer + propagate known information about components */
+void ff_sws_op_list_update_comps(SwsOpList *ops)
+{
+SwsComps next = { .unused = {true, true, true, true} };
+SwsComps prev = { .flags = {
+SWS_COMP_GARBAGE, SWS_COMP_GARBAGE, SWS_COMP_GARBAGE, SWS_COMP_GARBAGE,
+}};
+
+

[FFmpeg-devel] [PATCH v2 06/17] swscale/ops: introduce new low level framework

2025-05-21 Thread Niklas Haas

From: Niklas Haas 

See docs/swscale-v2.txt for an in-depth introduction to the new approach.

This commit merely introduces the ops definitions and boilerplate functions.
The subsequent commits will flesh out the underlying implementation.
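
To give a feel for what one of these op definitions means, here is a standalone sketch of the bit unpacking that the SWS_OP_UNPACK case of ff_sws_apply_op_q() performs on integer values. The function name unpack_bits() and the explicit type_bits parameter are assumptions made for the example; the patch derives the width from the op's pixel type instead.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical standalone version: split `val`, stored in a word of
 * `type_bits` bits, into up to four fields whose widths (MSB first)
 * are given by `pattern`. A width of 0 yields 0 for that component. */
static void unpack_bits(unsigned val, int type_bits,
                        const uint8_t pattern[4], unsigned out[4])
{
    int shift = type_bits;
    assert(type_bits > 0 && type_bits <= 32);

    for (int i = 0; i < 4; i++) {
        const unsigned mask = (1u << pattern[i]) - 1;
        shift -= pattern[i];
        assert(shift >= 0); /* field widths must fit into the word */
        out[i] = (val >> shift) & mask;
    }
}
```

With a 16-bit word and the pattern {5, 6, 5, 0}, this splits an RGB565-style value into its three components and leaves the fourth at zero.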
---
 libswscale/Makefile |   1 +
 libswscale/ops.c| 522 
 libswscale/ops.h| 240 
 3 files changed, 763 insertions(+)
 create mode 100644 libswscale/ops.c
 create mode 100644 libswscale/ops.h

diff --git a/libswscale/Makefile b/libswscale/Makefile
index d5e10d17dc..e0beef4e69 100644
--- a/libswscale/Makefile
+++ b/libswscale/Makefile
@@ -15,6 +15,7 @@ OBJS = alphablend.o \
graph.o  \
input.o  \
lut3d.o  \
+   ops.o\
options.o\
output.o \
rgb2rgb.o\
diff --git a/libswscale/ops.c b/libswscale/ops.c
new file mode 100644
index 00..004686147d
--- /dev/null
+++ b/libswscale/ops.c
@@ -0,0 +1,522 @@
+/**
+ * Copyright (C) 2025 Niklas Haas
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/avassert.h"
+#include "libavutil/bswap.h"
+#include "libavutil/mem.h"
+#include "libavutil/rational.h"
+#include "libavutil/refstruct.h"
+
+#include "ops.h"
+
+#define Q(N) ((AVRational) { N, 1 })
+
+const char *ff_sws_pixel_type_name(SwsPixelType type)
+{
+switch (type) {
+case SWS_PIXEL_U8:   return "u8";
+case SWS_PIXEL_U16:  return "u16";
+case SWS_PIXEL_U32:  return "u32";
+case SWS_PIXEL_F32:  return "f32";
+case SWS_PIXEL_NONE: return "none";
+case SWS_PIXEL_TYPE_NB: break;
+}
+
+av_assert0(!"Invalid pixel type!");
+return "ERR";
+}
+
+int ff_sws_pixel_type_size(SwsPixelType type)
+{
+switch (type) {
+case SWS_PIXEL_U8:  return sizeof(uint8_t);
+case SWS_PIXEL_U16: return sizeof(uint16_t);
+case SWS_PIXEL_U32: return sizeof(uint32_t);
+case SWS_PIXEL_F32: return sizeof(float);
+case SWS_PIXEL_NONE: break;
+case SWS_PIXEL_TYPE_NB: break;
+}
+
+av_assert0(!"Invalid pixel type!");
+return 0;
+}
+
+bool ff_sws_pixel_type_is_int(SwsPixelType type)
+{
+switch (type) {
+case SWS_PIXEL_U8:
+case SWS_PIXEL_U16:
+case SWS_PIXEL_U32:
+return true;
+case SWS_PIXEL_F32:
+return false;
+case SWS_PIXEL_NONE:
+case SWS_PIXEL_TYPE_NB: break;
+}
+
+av_assert0(!"Invalid pixel type!");
+return false;
+}
+
+SwsPixelType ff_sws_pixel_type_to_uint(SwsPixelType type)
+{
+if (!type)
+return type;
+
+switch (ff_sws_pixel_type_size(type)) {
+case 1: return SWS_PIXEL_U8;  /* ff_sws_pixel_type_size() returns bytes */
+case 2: return SWS_PIXEL_U16;
+case 4: return SWS_PIXEL_U32;
+}
+
+av_assert0(!"Invalid pixel type!");
+return SWS_PIXEL_NONE;
+}
+
+/* biased towards `a` */
+static AVRational av_min_q(AVRational a, AVRational b)
+{
+return av_cmp_q(a, b) == 1 ? b : a;
+}
+
+static AVRational av_max_q(AVRational a, AVRational b)
+{
+return av_cmp_q(a, b) == -1 ? b : a;
+}
+
+static AVRational expand_factor(SwsPixelType from, SwsPixelType to)
+{
+const int src = ff_sws_pixel_type_size(from);
+const int dst = ff_sws_pixel_type_size(to);
+int scale = 0;
+for (int i = 0; i < dst / src; i++)
+scale = scale << src * 8 | 1;
+return Q(scale);
+}
+
+void ff_sws_apply_op_q(const SwsOp *op, AVRational x[4])
+{
+switch (op->op) {
+case SWS_OP_READ:
+case SWS_OP_WRITE:
+return;
+case SWS_OP_UNPACK: {
+unsigned val = x[0].num;
+int shift = ff_sws_pixel_type_size(op->type) * 8;
+for (int i = 0; i < 4; i++) {
+const unsigned mask = (1 << op->pack.pattern[i]) - 1;
+shift -= op->pack.pattern[i];
+x[i] = Q((val >> shift) & mask);
+}
+return;
+}
+case SWS_OP_PACK: {
+unsigned val = 0;
+int shift = ff_sws_pixel_type_size(op->type) * 8;
+for (int i = 0; i < 4; i++

[FFmpeg-devel] [PATCH v2 08/17] swscale/ops_internal: add internal ops backend API

2025-05-21 Thread Niklas Haas

From: Niklas Haas 

This adds an internal API for ops backends, which are responsible for
compiling op lists into executable functions.
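
The selection logic can be pictured with a small standalone sketch. Everything below is hypothetical and much simpler than the real API: Backend stands in for SwsOpBackend, the "op list" is reduced to an operation count, and the assumption being illustrated is only that backends are tried in table order and the first one that does not fail wins.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for SwsOpBackend */
typedef struct Backend {
    const char *name;
    int (*compile)(int num_ops, int *out); /* 0 on success, negative on error */
} Backend;

/* Specialized backend: only handles very short lists */
static int compile_trivial(int num_ops, int *out)
{
    if (num_ops > 1)
        return -ENOTSUP;
    *out = 1;
    return 0;
}

/* Generic fallback: accepts anything */
static int compile_generic(int num_ops, int *out)
{
    (void) num_ops;
    *out = 2;
    return 0;
}

static const Backend backend_trivial = { "trivial", compile_trivial };
static const Backend backend_generic = { "generic", compile_generic };

/* NULL-terminated, ordered from most to least specialized */
static const Backend *const backends[] = {
    &backend_trivial,
    &backend_generic,
    NULL
};

/* Return the first backend that compiles the request, or NULL if none can */
static const Backend *compile_first(int num_ops, int *out)
{
    assert(out);
    for (int n = 0; backends[n]; n++) {
        if (backends[n]->compile(num_ops, out) < 0)
            continue;
        return backends[n];
    }
    return NULL;
}
```

Ordering the table from specialized to generic means a fast path is preferred whenever it applies, with the generic backend as the catch-all.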
---
 libswscale/ops.c  |  62 ++
 libswscale/ops_internal.h | 108 ++
 2 files changed, 170 insertions(+)
 create mode 100644 libswscale/ops_internal.h

diff --git a/libswscale/ops.c b/libswscale/ops.c
index 004686147d..8491bd9cad 100644
--- a/libswscale/ops.c
+++ b/libswscale/ops.c
@@ -25,9 +25,22 @@
 #include "libavutil/refstruct.h"
 
 #include "ops.h"
+#include "ops_internal.h"
+
+const SwsOpBackend * const ff_sws_op_backends[] = {
+NULL
+};
+
+const int ff_sws_num_op_backends = FF_ARRAY_ELEMS(ff_sws_op_backends) - 1;
 
 #define Q(N) ((AVRational) { N, 1 })
 
+#define RET(x) \
+do { \
+if ((ret = (x)) < 0) \
+return ret; \
+} while (0)
+
 const char *ff_sws_pixel_type_name(SwsPixelType type)
 {
 switch (type) {
@@ -520,3 +533,52 @@ void ff_sws_op_list_print(void *log, int lev, const SwsOpList *ops)
 
 av_log(log, lev, "(X = unused, + = exact, 0 = zero)\n");
 }
+
+int ff_sws_ops_compile_backend(SwsContext *ctx, const SwsOpBackend *backend,
+   const SwsOpList *ops, SwsCompiledOp *out)
+{
+SwsOpList *copy, rest;
+int ret = 0;
+
+copy = ff_sws_op_list_duplicate(ops);
+if (!copy)
+return AVERROR(ENOMEM);
+
+/* Ensure these are always set during compilation */
+ff_sws_op_list_update_comps(copy);
+
+/* Make an on-stack copy of `ops` to ensure we can still properly clean up
+ * the copy afterwards */
+rest = *copy;
+
+ret = backend->compile(ctx, &rest, out);
+if (ret == AVERROR(ENOTSUP)) {
+av_log(ctx, AV_LOG_DEBUG, "Backend '%s' does not support operations:\n", backend->name);
+ff_sws_op_list_print(ctx, AV_LOG_DEBUG, &rest);
+} else if (ret < 0) {
+av_log(ctx, AV_LOG_ERROR, "Failed to compile operations: %s\n", av_err2str(ret));
+ff_sws_op_list_print(ctx, AV_LOG_ERROR, &rest);
+}
+
+ff_sws_op_list_free(&copy);
+return ret;
+}
+
+int ff_sws_ops_compile(SwsContext *ctx, const SwsOpList *ops, SwsCompiledOp *out)
+{
+for (int n = 0; ff_sws_op_backends[n]; n++) {
+const SwsOpBackend *backend = ff_sws_op_backends[n];
+if (ff_sws_ops_compile_backend(ctx, backend, ops, out) < 0)
+continue;
+
+av_log(ctx, AV_LOG_VERBOSE, "Compiled using backend '%s': "
+   "block size = %d, over-read = %d, over-write = %d, cpu flags = 0x%x\n",
+   backend->name, out->block_size, out->over_read, out->over_write,
+   out->cpu_flags);
+return 0;
+}
+
+av_log(ctx, AV_LOG_WARNING, "No backend found for operations:\n");
+ff_sws_op_list_print(ctx, AV_LOG_WARNING, ops);
+return AVERROR(ENOTSUP);
+}
diff --git a/libswscale/ops_internal.h b/libswscale/ops_internal.h
new file mode 100644
index 00..9fd866430b
--- /dev/null
+++ b/libswscale/ops_internal.h
@@ -0,0 +1,108 @@
+/**
+ * Copyright (C) 2025 Niklas Haas
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef SWSCALE_OPS_INTERNAL_H
+#define SWSCALE_OPS_INTERNAL_H
+
+#include "libavutil/mem_internal.h"
+
+#include "ops.h"
+
+/**
+ * Global execution context for all compiled functions.
+ *
+ * Note: This struct is hard-coded in assembly, so do not change the layout
+ * without updating the corresponding assembly definitions.
+ */
+typedef struct SwsOpExec {
+/* The data pointers point to the first pixel to process */
+DECLARE_ALIGNED_32(const uint8_t, *in[4]);
+DECLARE_ALIGNED_32(uint8_t, *out[4]);
+
+/* Separation between lines in bytes */
+DECLARE_ALIGNED_32(ptrdiff_t, in_stride[4]);
+DECLARE_ALIGNED_32(ptrdiff_t, out_stride[4]);
+
+/* Extra metadata, may or may not be useful */
+int32_t width, height;  /* Overall image dimensions */
+int32_t slice_y, slice_h;   /* Start and height of current slice */
+int32_t pixel_b

[FFmpeg-devel] [PATCH v2 05/17] swscale: add SWS_UNSTABLE flag

2025-05-21 Thread Niklas Haas

From: Niklas Haas 

Give users and developers a way to opt in to the new format conversion code,
and more code from the swscale rewrite in general, even while development is
still ongoing.
---
 doc/APIchanges   | 3 +++
 doc/scaler.texi  | 4 
 libswscale/options.c | 1 +
 libswscale/swscale.h | 7 +++
 libswscale/version.h | 2 +-
 5 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/doc/APIchanges b/doc/APIchanges
index d0869561f3..fb202c7908 100644
--- a/doc/APIchanges
+++ b/doc/APIchanges
@@ -2,6 +2,9 @@ The last version increases of all libraries were on 2025-03-28
 
 API changes, most recent first:
 
+2025-04-xx - xx - lsws 9.1.100 - swscale.h
+  Add SWS_UNSTABLE flag.
+
 2025-02-xx - xx - lavfi 10.10.100 - avfilter.h
   Add avfilter_link_get_hw_frames_ctx().
 
diff --git a/doc/scaler.texi b/doc/scaler.texi
index eb045de6b7..42b2377761 100644
--- a/doc/scaler.texi
+++ b/doc/scaler.texi
@@ -68,6 +68,10 @@ Select full chroma input.
 
 @item bitexact
 Enable bitexact output.
+
+@item unstable
+Allow the use of experimental new code. May subtly affect the output or even
+produce wrong results. For testing only.
 @end table
 
 @item srcw @var{(API only)}
diff --git a/libswscale/options.c b/libswscale/options.c
index feecae8c89..06e51dcfe9 100644
--- a/libswscale/options.c
+++ b/libswscale/options.c
@@ -50,6 +50,7 @@ static const AVOption swscale_options[] = {
 { "full_chroma_inp", "full chroma input", 0, AV_OPT_TYPE_CONST, { .i64 = SWS_FULL_CHR_H_INP }, .flags = VE, .unit = "sws_flags" },
 { "bitexact", "bit-exact mode", 0, AV_OPT_TYPE_CONST, { .i64 = SWS_BITEXACT }, .flags = VE, .unit = "sws_flags" },
 { "error_diffusion", "error diffusion dither", 0, AV_OPT_TYPE_CONST, { .i64 = SWS_ERROR_DIFFUSION }, .flags = VE, .unit = "sws_flags" },
+{ "unstable", "allow experimental new code", 0, AV_OPT_TYPE_CONST, { .i64 = SWS_UNSTABLE }, .flags = VE, .unit = "sws_flags" },
 
 { "param0", "scaler param 0", OFFSET(scaler_params[0]), AV_OPT_TYPE_DOUBLE, { .dbl = SWS_PARAM_DEFAULT }, INT_MIN, INT_MAX, VE },
 { "param1", "scaler param 1", OFFSET(scaler_params[1]), AV_OPT_TYPE_DOUBLE, { .dbl = SWS_PARAM_DEFAULT }, INT_MIN, INT_MAX, VE },
diff --git a/libswscale/swscale.h b/libswscale/swscale.h
index b04aa182d2..4aa072009c 100644
--- a/libswscale/swscale.h
+++ b/libswscale/swscale.h
@@ -155,6 +155,13 @@ typedef enum SwsFlags {
 SWS_ACCURATE_RND   = 1 << 18,
 SWS_BITEXACT   = 1 << 19,
 
+/**
+ * Allow using experimental new code paths. This may be faster, slower,
+ * or produce different output, with semantics subject to change at any
+ * point in time. For testing and debugging purposes only.
+ */
+SWS_UNSTABLE = 1 << 20,
+
 /**
  * Deprecated flags.
  */
diff --git a/libswscale/version.h b/libswscale/version.h
index 148efd83eb..4e54701aba 100644
--- a/libswscale/version.h
+++ b/libswscale/version.h
@@ -28,7 +28,7 @@
 
 #include "version_major.h"
 
-#define LIBSWSCALE_VERSION_MINOR   0
+#define LIBSWSCALE_VERSION_MINOR   1
 #define LIBSWSCALE_VERSION_MICRO 100
 
 #define LIBSWSCALE_VERSION_INT  AV_VERSION_INT(LIBSWSCALE_VERSION_MAJOR, \
-- 
2.49.0


Re: [FFmpeg-devel] [PATCH] avfilter/vf_interlace_vulkan: fix FPS and PTS calculation

2025-05-21 Thread Lynne


LGTM

On 21/05/2025 23:12, Niklas Haas wrote:

From: Niklas Haas 

ol->frame_rate is 0/0, so we need to calculate the correct value based on
the il->frame_rate instead. Also adjust the time base, PTS and frame_duration
values accordingly. (Logic taken from vf_tinterlace.c)
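
The arithmetic behind this can be checked with a small standalone sketch. It does not use libavutil: Rational, mul_q(), inv_q() and rescale_q() are simplified stand-ins written for this example (no reduction, non-negative values only), and the 50 fps input with a 1/50 time base is an assumed test case.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for AVRational */
typedef struct Rational { int num, den; } Rational;

/* Multiply two rationals; unlike av_mul_q() the result is not reduced */
static Rational mul_q(Rational a, Rational b)
{
    return (Rational) { a.num * b.num, a.den * b.den };
}

static Rational inv_q(Rational a)
{
    return (Rational) { a.den, a.num };
}

/* Convert `v` from units of `from` to units of `to`, rounding to nearest.
 * Sketch only: handles non-negative values and positive rationals. */
static int64_t rescale_q(int64_t v, Rational from, Rational to)
{
    const int64_t num = (int64_t) from.num * to.den;
    const int64_t den = (int64_t) from.den * to.num;
    assert(v >= 0 && num > 0 && den > 0);
    return (v * num + den / 2) / den;
}
```

With a 50/1 input rate and a 1/50 input time base, halving the rate gives 50/2 and doubling the time base gives 2/50. An input PTS of 10 then maps to 5, and one output frame lasts exactly 1 tick of the new time base.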
---
  libavfilter/vf_interlace_vulkan.c | 10 +-
  1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/libavfilter/vf_interlace_vulkan.c b/libavfilter/vf_interlace_vulkan.c
index b5cd321fef..7afb30c2d7 100644
--- a/libavfilter/vf_interlace_vulkan.c
+++ b/libavfilter/vf_interlace_vulkan.c
@@ -189,7 +189,9 @@ static int interlace_vulkan_filter_frame(AVFilterLink *link, AVFrame *in)
  AVFrame *out = NULL, *input_top, *input_bot;
  AVFilterContext *ctx = link->dst;
  InterlaceVulkanContext *s = ctx->priv;
+const AVFilterLink *inlink = ctx->inputs[0];
  AVFilterLink *outlink = ctx->outputs[0];
+FilterLink *l = ff_filter_link(outlink);
  
  if (!s->initialized)
  RET(init_filter(ctx));
@@ -226,6 +228,9 @@ static int interlace_vulkan_filter_frame(AVFilterLink *link, AVFrame *in)
  if (s->mode == MODE_TFF)
  out->flags |= AV_FRAME_FLAG_TOP_FIELD_FIRST;
  
+out->pts = av_rescale_q(out->pts, inlink->time_base, outlink->time_base);
+out->duration = av_rescale_q(1, av_inv_q(l->frame_rate), outlink->time_base);
+
  av_frame_free(&s->cur);
  av_frame_free(&in);
  
@@ -260,9 +265,12 @@ static void interlace_vulkan_uninit(AVFilterContext *avctx)
  
  static int config_out_props(AVFilterLink *outlink)
  {
+AVFilterLink *inlink = outlink->src->inputs[0];
+const FilterLink *il = ff_filter_link(inlink);
  FilterLink *ol = ff_filter_link(outlink);
  
-ol->frame_rate = av_mul_q(ol->frame_rate, av_make_q(1, 2));
+ol->frame_rate = av_mul_q(il->frame_rate, av_make_q(1, 2));
+outlink->time_base = av_mul_q(inlink->time_base, av_make_q(2, 1));
  return ff_vk_filter_config_output(outlink);
  }
  





[FFmpeg-devel] [FEATURE PROPOSAL] Extracting codec-level data to binary files

2025-05-21 Thread Timothée

Hello,

I am interested in expanding ffmpeg's capabilities to extract
low-level data from video codecs. Specifically, I'd like to implement
functionality that would allow exporting frame data, macroblock
information, quantization tables, and similar codec-specific elements
to binary files for further analysis.

After searching through the documentation and existing features, I
haven't found similar functionality, though I may have missed
something. Has this been implemented before, or are there related
features I should examine?

I'd appreciate your feedback on whether this feature would be
considered valuable for the project and if it aligns with ffmpeg's
development goals. If there's interest, I can provide a more detailed
technical proposal.

I'm new to both ffmpeg development and open source contributions in
general, so I welcome any guidance or information that might help me
contribute effectively.

Thank you for your time and consideration.

Timothée


[FFmpeg-devel] [PATCH v3] avformat/dhav: fix backward scanning for get_duration and optimize seeking

2025-05-21 Thread Derek Buitenhuis

From: Justin Ruggles 

The backwards scanning done for incomplete final packets should not
assume a specific alignment at the end of the file. Truncated files
result in hundreds of thousands of seeks if the final packet does not
fall on a specific byte boundary, which can be extremely slow.
For example, with HTTP, each backwards seek results in a separate
HTTP request.

This changes the scanning to check for the end tag 1 byte at a time
and buffers the last 1 MiB using ffio_ensure_seekback to avoid additional
seek operations.
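
The byte-wise backward scan can be sketched on its own, independent of avio. This is an illustration rather than the patch code: find_last_end_tag() and read_le32() are hypothetical helpers, and the assumed layout is the 4-byte "dhav" end tag followed by a 4-byte little-endian field, searched for in an in-memory copy of the file tail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit little-endian value from an unaligned position */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t) p[0]       | (uint32_t) p[1] << 8 |
           (uint32_t) p[2] << 16 | (uint32_t) p[3] << 24;
}

/* Hypothetical helper: return the offset of the last "dhav" tag that still
 * has 4 bytes following it, or -1 if there is none. Unlike the loop in the
 * patch, this sketch also checks offset 0. */
static ptrdiff_t find_last_end_tag(const uint8_t *buf, size_t size)
{
    const uint32_t tag = (uint32_t) 'd'       | (uint32_t) 'h' << 8 |
                         (uint32_t) 'a' << 16 | (uint32_t) 'v' << 24;
    assert(buf || !size);

    if (size < 8)
        return -1;

    /* Step back one byte at a time, so no alignment is assumed */
    for (size_t off = size - 8; ; off--) {
        if (read_le32(buf + off) == tag)
            return (ptrdiff_t) off;
        if (off == 0)
            break;
    }
    return -1;
}
```

Because the scan runs over memory, a truncated file costs at most one pass over the buffer instead of one seek (or HTTP request) per step.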

Co-authored-by: Derek Buitenhuis 
Signed-off-by: Justin Ruggles 
Signed-off-by: Derek Buitenhuis 
---
 libavformat/dhav.c | 54 +-
 1 file changed, 39 insertions(+), 15 deletions(-)

diff --git a/libavformat/dhav.c b/libavformat/dhav.c
index b2ead99609..d9db775802 100644
--- a/libavformat/dhav.c
+++ b/libavformat/dhav.c
@@ -22,6 +22,7 @@
 
 #include 
 
+#include "libavutil/intreadwrite.h"
 #include "libavutil/mem.h"
 #include "libavutil/parseutils.h"
 #include "avio_internal.h"
@@ -232,37 +233,60 @@ static void get_timeinfo(unsigned date, struct tm *timeinfo)
 timeinfo->tm_sec  = sec;
 }
 
+#define MAX_DURATION_BUFFER_SIZE (1024*1024)
+
 static int64_t get_duration(AVFormatContext *s)
 {
-DHAVContext *dhav = s->priv_data;
 int64_t start_pos = avio_tell(s->pb);
+int64_t end_pos = -1;
 int64_t start = 0, end = 0;
 struct tm timeinfo;
-int max_interations = 10;
+uint8_t *end_buffer;
+int64_t end_buffer_size;
+int64_t end_buffer_pos;
+int64_t offset;
+unsigned date;
 
 if (!s->pb->seekable)
 return 0;
 
-avio_seek(s->pb, avio_size(s->pb) - 8, SEEK_SET);
-while (avio_tell(s->pb) > 12 && max_interations--) {
-if (avio_rl32(s->pb) == MKTAG('d','h','a','v')) {
-int64_t seek_back = avio_rl32(s->pb);
+if (start_pos + 16 > avio_size(s->pb))
+return 0;
 
-avio_seek(s->pb, -seek_back, SEEK_CUR);
-read_chunk(s);
-get_timeinfo(dhav->date, &timeinfo);
-end = av_timegm(&timeinfo) * 1000LL;
+avio_skip(s->pb, 16);
+date = avio_rl32(s->pb);
+get_timeinfo(date, &timeinfo);
+start = av_timegm(&timeinfo) * 1000LL;
+
+end_buffer_size = FFMIN(MAX_DURATION_BUFFER_SIZE, avio_size(s->pb));
+end_buffer = av_malloc(end_buffer_size);
+if (!end_buffer) {
+avio_seek(s->pb, start_pos, SEEK_SET);
+return 0;
+}
+end_buffer_pos = avio_size(s->pb) - end_buffer_size;
+avio_seek(s->pb, end_buffer_pos, SEEK_SET);
+avio_read(s->pb, end_buffer, end_buffer_size);
+
+offset = end_buffer_size - 8;
+while (offset > 0) {
+if (AV_RL32(end_buffer + offset) == MKTAG('d','h','a','v')) {
+int64_t seek_back = AV_RL32(end_buffer + offset + 4);
+end_pos = end_buffer_pos + offset - seek_back + 8;
 break;
 } else {
-avio_seek(s->pb, -12, SEEK_CUR);
+offset--;
 }
 }
 
-avio_seek(s->pb, start_pos, SEEK_SET);
+if (end_pos < 0 || end_pos + 16 > end_buffer_pos + end_buffer_size) {
+avio_seek(s->pb, start_pos, SEEK_SET);
+return 0;
+}
 
-read_chunk(s);
-get_timeinfo(dhav->date, &timeinfo);
-start = av_timegm(&timeinfo) * 1000LL;
+date = AV_RL32(end_buffer + (end_pos - end_buffer_pos) + 16);
+get_timeinfo(date, &timeinfo);
+end = av_timegm(&timeinfo) * 1000LL;
 
 avio_seek(s->pb, start_pos, SEEK_SET);
 
-- 
2.49.0


Re: [FFmpeg-devel] [PATCH v2] avformat/dhav: fix backward scanning for get_duration and optimize seeking

2025-05-21 Thread Derek Buitenhuis

On 5/20/2025 8:18 PM, Andreas Rheinhardt wrote:
> The only thing that you want from parsing the end is the end timestamp
> and this is given by data alone. date is a 32bit le number at offset 16
> from the start of the dhav tag; it can be read directly from the buffer,
> so you do not need to seek again and use read_chunk to get the end.
> It also means that your ffio_ensure_seekback() is unnecessary (unless
> you wanted to ensure that the seek back to start_pos works (which is
> currently not guaranteed at all and not ensured by your code and would
> probably also be bad in general given that this would cache the whole
> file in memory).

Thanks for the review, v3 sent.

> (The above is based on the presumption that we are not really interested
> in what parse_ext() may parse in the chunk at the end; I don't know
> whether this is true or not.)

AFAICT, we are not.

Cheers,
- Derek

[FFmpeg-devel] [PATCH v2 3/3] tests: Add fate-hevc-color-reserved

2025-05-21 Thread Zhao Zhili

From: Zhao Zhili 

---
 tests/fate/hevc.mak| 3 +++
 tests/ref/fate/hevc-color-reserved | 6 ++
 2 files changed, 9 insertions(+)
 create mode 100644 tests/ref/fate/hevc-color-reserved

diff --git a/tests/fate/hevc.mak b/tests/fate/hevc.mak
index 390ccf46e2..8113c04300 100644
--- a/tests/fate/hevc.mak
+++ b/tests/fate/hevc.mak
@@ -294,6 +294,9 @@ FATE_HEVC-$(call FRAMECRC, MOV, HEVC) += fate-hevc-mv-position
 fate-hevc-alpha: CMD = framecrc -i $(TARGET_SAMPLES)/hevc/alpha.mp4
 FATE_HEVC-$(call FRAMECRC, MOV, HEVC) += fate-hevc-alpha
 
+fate-hevc-color-reserved: CMD = framecrc -bsf:v hevc_metadata=colour_primaries=0:transfer_characteristics=0:matrix_coefficients=3 -i $(TARGET_SAMPLES)/hevc-conformance/AMP_A_Samsung_4.bit -vf scale,format=nv12 -frames:v 1
+FATE_HEVC-$(call FRAMECRC, HEVC, HEVC, HEVC_METADATA_BSF SCALE_FILTER) += fate-hevc-color-reserved
+
 FATE_SAMPLES_AVCONV += $(FATE_HEVC-yes)
 FATE_SAMPLES_FFPROBE += $(FATE_HEVC_FFPROBE-yes)
 
diff --git a/tests/ref/fate/hevc-color-reserved b/tests/ref/fate/hevc-color-reserved
new file mode 100644
index 00..cba6397aa8
--- /dev/null
+++ b/tests/ref/fate/hevc-color-reserved
@@ -0,0 +1,6 @@
+#tb 0: 1/25
+#media_type 0: video
+#codec_id 0: rawvideo
+#dimensions 0: 2560x1600
+#sar 0: 0/1
+0,  0,  0,1,  6144000, 0x427b9a00
-- 
2.46.0


Re: [FFmpeg-devel] gcc: Remove auto-vectorization limitation.

2025-05-21 Thread Martin Storsjö


On Wed, 21 May 2025, Andreas Rheinhardt wrote:


Jiawei:

This patch modifies the FFmpeg build system to remove the explicit disabling
of GCC's auto-vectorization feature.

Modern GCC versions (>= 10.0) have demonstrated stable auto-vectorization
capabilities through extensive optimizations in loop analysis and SIMD
code generation. The explicit -fno-tree-vectorize flag originally added
in commit 973859f (2009) to work around early GCC vectorization instability
is no longer necessary.

Key improvements justifying this change:
1. Enhanced heuristics for loop vectorization cost models
2. Mature handling of alignment and memory access patterns
3. Robust fallback mechanisms for unsupported architectures

This change allows FFmpeg to benefit from automated SIMD optimizations
when built with -O3 optimization level, particularly improving
performance on x86_64 (AVX), ARM64 (SVE) and RISC-V (RVV) architectures.

[1] https://git.ffmpeg.org/gitweb/ffmpeg.git/commit/973859f5230e77beea7bb59dc081870689d6d191

---
 configure | 1 -
 1 file changed, 1 deletion(-)

diff --git a/configure b/configure
index 3730b0524c..b9e95ce4ec 100755
--- a/configure
+++ b/configure
@@ -7656,7 +7656,6 @@ if enabled icc; then
 disable aligned_stack
 fi
 elif enabled gcc; then
-check_optflags -fno-tree-vectorize
 check_cflags -Werror=format-security
 check_cflags -Werror=implicit-function-declaration
 check_cflags -Werror=missing-prototypes


FYI: The last discussion about auto-vectorization is here:
https://ffmpeg.org/pipermail/ffmpeg-devel/2022-July/299405.html
It contains a report about a failing build with vectorization enabled:
https://ffmpeg.org/pipermail/ffmpeg-devel/2022-July/299421.html
I don't know whether this is still reproducible with the latest GCC.


The issue which was reported last time, when compiling for i686 mingw32 
with --cpu=haswell, seems to have gone away in 
182663a58a7a099e02e76da3b0f96d63e5c26a6d, where we made the whole 
problematic x86 inline cabac assembly noinline on i386. (That whole inline 
assembly block has been problematic in a large number of cases anyway.)


// Martin


Re: [FFmpeg-devel] [PATCH] avfilter/vf_libplacebo: implement rotation option

2025-05-21 Thread Niklas Haas

On Fri, 16 May 2025 13:07:05 +0200 Niklas Haas  wrote:
> From: Niklas Haas 
>
> Flipping can already be accomplished by setting the crop_w/h expressions to
> their negative values, so together these options can implement any of the
> common frame orientations.

Will apply if there are no objections.

Re: [FFmpeg-devel] [PATCH v2] avfilter/vf_libplacebo: add shader_cache option

2025-05-21 Thread Niklas Haas

On Fri, 16 May 2025 13:15:24 +0200 Niklas Haas  wrote:
> From: Niklas Haas 
>
> Useful to speed up shader compilation. May significantly lower startup
> times, in particular with large or complex shaders.
>
> Sponsored-by: nxtedition

Will apply if there are no further comments.

[FFmpeg-devel] [PATCH 01/19] avcodec/mpegvideo_enc: Set gob_index once during init

2025-05-21 Thread Andreas Rheinhardt

Patches attached; mostly about MPEG-4.

- Andreas
From 1a73463508298d6683658e86a5e0e5453c75e0d7 Mon Sep 17 00:00:00 2001
From: Andreas Rheinhardt 
Date: Sat, 17 May 2025 19:30:02 +0200
Subject: [PATCH 01/19] avcodec/mpegvideo_enc: Set gob_index once during init

Signed-off-by: Andreas Rheinhardt 
---
 libavcodec/ituh263enc.c    |  3 +++
 libavcodec/mpegvideo_enc.c | 16 +++-------------
 2 files changed, 6 insertions(+), 13 deletions(-)

diff --git a/libavcodec/ituh263enc.c b/libavcodec/ituh263enc.c
index b9d903a220..9a6d5dc201 100644
--- a/libavcodec/ituh263enc.c
+++ b/libavcodec/ituh263enc.c
@@ -842,6 +842,9 @@ av_cold void ff_h263_encode_init(MPVMainEncContext *const m)
 if (s->c.modified_quant)
 s->c.chroma_qscale_table = ff_h263_chroma_qscale_table;
 
+// Only used for H.263 and H.263+
+s->c.gob_index = H263_GOB_HEIGHT(s->c.height);
+
 // use fcodes >1 only for MPEG-4 & H.263 & H.263+ FIXME
 switch(s->c.codec_id){
 case AV_CODEC_ID_H263P:
diff --git a/libavcodec/mpegvideo_enc.c b/libavcodec/mpegvideo_enc.c
index 6e9533ebc9..62a3a82ff3 100644
--- a/libavcodec/mpegvideo_enc.c
+++ b/libavcodec/mpegvideo_enc.c
@@ -3006,25 +3006,15 @@ static int encode_thread(AVCodecContext *c, void *arg){
 s->c.last_dc[0] = 128 * 8 / 13;
 s->c.last_dc[1] = 128 * 8 / 14;
 s->c.last_dc[2] = 128 * 8 / 14;
+} else if (CONFIG_MPEG4_ENCODER && s->c.codec_id == AV_CODEC_ID_MPEG4 &&
+   s->c.partitioned_frame) {
+ff_mpeg4_init_partitions(s);
 }
 s->c.mb_skip_run = 0;
 memset(s->c.last_mv, 0, sizeof(s->c.last_mv));
 
 s->last_mv_dir = 0;
 
-switch (s->c.codec_id) {
-case AV_CODEC_ID_H263:
-case AV_CODEC_ID_H263P:
-case AV_CODEC_ID_FLV1:
-if (CONFIG_H263_ENCODER)
-s->c.gob_index = H263_GOB_HEIGHT(s->c.height);
-break;
-case AV_CODEC_ID_MPEG4:
-if (CONFIG_MPEG4_ENCODER && s->c.partitioned_frame)
-ff_mpeg4_init_partitions(s);
-break;
-}
-
 s->c.resync_mb_x = 0;
 s->c.resync_mb_y = 0;
 s->c.first_slice_line = 1;
-- 
2.45.2

From 4527ee66d9fc5b13b6946ed3912de88b4d738a3f Mon Sep 17 00:00:00 2001
From: Andreas Rheinhardt 
Date: Sat, 17 May 2025 20:20:05 +0200
Subject: [PATCH 02/19] avcodec/h263dec: Move calculating gob_index to
 {intel,itu}h263dec.c

This avoids checks for whether it should be calculated at all.

Signed-off-by: Andreas Rheinhardt 
---
 libavcodec/h263dec.c  | 5 -
 libavcodec/intelh263dec.c | 3 +++
 libavcodec/ituh263dec.c   | 2 ++
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/libavcodec/h263dec.c b/libavcodec/h263dec.c
index c36070e23c..501b8b44ff 100644
--- a/libavcodec/h263dec.c
+++ b/libavcodec/h263dec.c
@@ -537,11 +537,6 @@ int ff_h263_decode_frame(AVCodecContext *avctx, AVFrame *pict,
 }
 }
 
-if (s->codec_id == AV_CODEC_ID_H263  ||
-s->codec_id == AV_CODEC_ID_H263P ||
-s->codec_id == AV_CODEC_ID_H263I)
-s->gob_index = H263_GOB_HEIGHT(s->height);
-
 /* skip B-frames if we don't have reference frames */
 if (!s->last_pic.ptr &&
 (s->pict_type == AV_PICTURE_TYPE_B || s->droppable))
diff --git a/libavcodec/intelh263dec.c b/libavcodec/intelh263dec.c
index 374dfdc0de..b2e7fa6c54 100644
--- a/libavcodec/intelh263dec.c
+++ b/libavcodec/intelh263dec.c
@@ -19,6 +19,7 @@
  */
 
 #include "codec_internal.h"
+#include "h263.h"
 #include "mpegvideo.h"
 #include "mpegvideodec.h"
 #include "h263data.h"
@@ -119,6 +120,8 @@ int ff_intel_h263_decode_picture_header(MpegEncContext *s)
 if (skip_1stop_8data_bits(&s->gb) < 0)
 return AVERROR_INVALIDDATA;
 
+s->gob_index = H263_GOB_HEIGHT(s->height);
+
 ff_h263_show_pict_info(s);
 
 return 0;
diff --git a/libavcodec/ituh263dec.c b/libavcodec/ituh263dec.c
index d19bdc4dab..7965b77ff3 100644
--- a/libavcodec/ituh263dec.c
+++ b/libavcodec/ituh263dec.c
@@ -1314,6 +1314,8 @@ int ff_h263_decode_picture_header(MpegEncContext *s)
 s->mb_height = (s->height  + 15) / 16;
 s->mb_num = s->mb_width * s->mb_height;
 
+s->gob_index = H263_GOB_HEIGHT(s->height);
+
 if (s->pb_frame) {
 skip_bits(&s->gb, 3); /* Temporal reference for B-pictures */
 if (s->custom_pcf)
-- 
2.45.2

From 5bc20ffb8881543dd3b22e713735bfca8a84368e Mon Sep 17 00:00:00 2001
From: Andreas Rheinhardt 
Date: Sun, 18 May 2025 01:01:28 +0200
Subject: [PATCH 03/19] avcodec/mpeg4videodec: Don't initialize unused parts of
 RLTables

The reversible VLC tables use a simpler escaping method
than the ordinary VLCs: It does not use max_run, max_level etc.
and therefore one does not need to initialize these at all.

Signed-off-by: Andreas Rheinhardt 
---
 libavcodec/mpeg4videodec.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/libavcodec/mpeg4videodec.c b/libavcodec/mpeg4videodec.c
index b6bb21174e..4d09a58ffb 100644
--- a/libavcodec/mpeg4videodec.c
+++ b/libavcodec/mpeg4videodec.c
@@ -3901,7 +3901,6 @@ static int mpeg

Re: [FFmpeg-devel] gcc: Remove auto-vectorization limitation.

2025-05-21 Thread Andreas Rheinhardt

Martin Storsjö:
> On Wed, 21 May 2025, Andreas Rheinhardt wrote:
> 
>> Jiawei:
>>> This patch modifies the FFmpeg build system to remove the explicit
>>> disabling
>>> of GCC's auto-vectorization feature.
>>>
>>> Modern GCC versions (>= 10.0) have demonstrated stable auto-
>>> vectorization
>>> capabilities through extensive optimizations in loop analysis and SIMD
>>> code generation. The explicit -fno-tree-vectorize flag originally added
>>> in commit 973859f (2009) to workaround early GCC vectorization
>>> instability
>>> is no longer necessary.
>>>
>>> Key improvements justifying this change:
>>> 1. Enhanced heuristics for loop vectorization cost models
>>> 2. Mature handling of alignment and memory access patterns
>>> 3. Robust fallback mechanisms for unsupported architectures
>>>
>>> This change allows FFmpeg to benefit from automated SIMD optimizations
>>> when built with -O3 optimization level, particularly improving
>>> performance on x86_64 (AVX), ARM64 (SVE) and RISC-V(RVV) architectures.
>>>
>>> [1] https://git.ffmpeg.org/gitweb/ffmpeg.git/
>>> commit/973859f5230e77beea7bb59dc081870689d6d191
>>>
>>> ---
>>>  configure | 1 -
>>>  1 file changed, 1 deletion(-)
>>>
>>> diff --git a/configure b/configure
>>> index 3730b0524c..b9e95ce4ec 100755
>>> --- a/configure
>>> +++ b/configure
>>> @@ -7656,7 +7656,6 @@ if enabled icc; then
>>>  disable aligned_stack
>>>  fi
>>>  elif enabled gcc; then
>>> -    check_optflags -fno-tree-vectorize
>>>  check_cflags -Werror=format-security
>>>  check_cflags -Werror=implicit-function-declaration
>>>  check_cflags -Werror=missing-prototypes
>>
>> FYI: The last discussion about auto-vectorization is here:
>> https://ffmpeg.org/pipermail/ffmpeg-devel/2022-July/299405.html
>> It contains a report about a failing build with vectorization enabled:
>> https://ffmpeg.org/pipermail/ffmpeg-devel/2022-July/299421.html
>> I don't know whether this is still reproducible with the latest GCC.
> 
> The issue which was reported last time, when compiling for i686 mingw32
> with --cpu=haswell, seems to have gone away in
> 182663a58a7a099e02e76da3b0f96d63e5c26a6d, where we made the whole
> problematic x86 inline cabac assembly noinline on i386. (That whole
> inline assembly block has been problematic in a large number of cases
> anyway.)
> 

So there are currently no known miscompilations due to vectorization
with GCC?

- Andreas


Re: [FFmpeg-devel] gcc: Remove auto-vectorization limitation.

2025-05-21 Thread Martin Storsjö


On Wed, 21 May 2025, Andreas Rheinhardt wrote:

> Martin Storsjö:
>> On Wed, 21 May 2025, Andreas Rheinhardt wrote:
>>
>>> Jiawei:
>>>> This patch modifies the FFmpeg build system to remove the explicit disabling
>>>> of GCC's auto-vectorization feature.
>>>>
>>>> Modern GCC versions (>= 10.0) have demonstrated stable auto-vectorization
>>>> capabilities through extensive optimizations in loop analysis and SIMD
>>>> code generation. The explicit -fno-tree-vectorize flag originally added
>>>> in commit 973859f (2009) to workaround early GCC vectorization instability
>>>> is no longer necessary.
>>>>
>>>> Key improvements justifying this change:
>>>> 1. Enhanced heuristics for loop vectorization cost models
>>>> 2. Mature handling of alignment and memory access patterns
>>>> 3. Robust fallback mechanisms for unsupported architectures
>>>>
>>>> This change allows FFmpeg to benefit from automated SIMD optimizations
>>>> when built with -O3 optimization level, particularly improving
>>>> performance on x86_64 (AVX), ARM64 (SVE) and RISC-V(RVV) architectures.
>>>>
>>>> [1] https://git.ffmpeg.org/gitweb/ffmpeg.git/commit/973859f5230e77beea7bb59dc081870689d6d191
>>>>
>>>> ---
>>>>  configure | 1 -
>>>>  1 file changed, 1 deletion(-)
>>>>
>>>> diff --git a/configure b/configure
>>>> index 3730b0524c..b9e95ce4ec 100755
>>>> --- a/configure
>>>> +++ b/configure
>>>> @@ -7656,7 +7656,6 @@ if enabled icc; then
>>>>  disable aligned_stack
>>>>  fi
>>>>  elif enabled gcc; then
>>>> -    check_optflags -fno-tree-vectorize
>>>>  check_cflags -Werror=format-security
>>>>  check_cflags -Werror=implicit-function-declaration
>>>>  check_cflags -Werror=missing-prototypes
>>>
>>> FYI: The last discussion about auto-vectorization is here:
>>> https://ffmpeg.org/pipermail/ffmpeg-devel/2022-July/299405.html
>>> It contains a report about a failing build with vectorization enabled:
>>> https://ffmpeg.org/pipermail/ffmpeg-devel/2022-July/299421.html
>>> I don't know whether this is still reproducible with the latest GCC.
>>
>> The issue which was reported last time, when compiling for i686 mingw32
>> with --cpu=haswell, seems to have gone away in
>> 182663a58a7a099e02e76da3b0f96d63e5c26a6d, where we made the whole
>> problematic x86 inline cabac assembly noinline on i386. (That whole
>> inline assembly block has been problematic in a large number of cases
>> anyway.)
>
> So there are currently no known miscompilations due to vectorization
> with GCC?


I'm not aware of any, but I haven't tested widely. It certainly is worth 
evaluating.


(From dav1d, I can anecdotally add that autovectorization does seem to 
help, somewhat, especially when there's not 100% assembly coverage for the 
use case. For some cases it makes things slower than without 
autovectorization, but generally the net result is positive.)
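
As a rough illustration of what the flag under discussion controls, here is a minimal sketch of the kind of plain C loop that GCC may auto-vectorize at -O3 unless -fno-tree-vectorize is passed. It is not code from FFmpeg or dav1d, and the function name `scale_add` is made up for this example.

```c
#include <stddef.h>

/* Hypothetical example kernel: dst[i] += src[i] * gain.
 * `restrict` tells the compiler the two buffers do not overlap, which
 * makes the loop a straightforward candidate for auto-vectorization. */
void scale_add(float *restrict dst, const float *restrict src,
               float gain, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] += src[i] * gain;
}
```

Compiling this once with `gcc -O3 -fopt-info-vec` and once with `gcc -O3 -fno-tree-vectorize`, then comparing the generated assembly, is one way to see whether the vectorizer acted on the loop.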


// Martin

[FFmpeg-devel] [PATCH 1/3] avcodec/h2645_vui: Ensure color primaries/trc/space isn't reserved value

2025-05-21 Thread Zhao Zhili

From: Zhao Zhili 

Fix error reported by swscaler:
Unsupported input (Operation not supported): fmt:yuv420p csp:unknown 
prim:reserved trc:bt709 -> fmt:yuv420p csp:bt709 prim:reserved trc:bt709
---
 libavcodec/h2645_vui.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/libavcodec/h2645_vui.c b/libavcodec/h2645_vui.c
index e5c7bf46f9..0e576c1563 100644
--- a/libavcodec/h2645_vui.c
+++ b/libavcodec/h2645_vui.c
@@ -67,11 +67,16 @@ void ff_h2645_decode_common_vui_params(GetBitContext *gb, H2645VUI *vui, void *l
 vui->matrix_coeffs= get_bits(gb, 8);
 
 // Set invalid values to "unspecified"
-if (!av_color_primaries_name(vui->colour_primaries))
+if (vui->colour_primaries == AVCOL_PRI_RESERVED0 ||
+vui->colour_primaries == AVCOL_PRI_RESERVED ||
+!av_color_primaries_name(vui->colour_primaries))
 vui->colour_primaries = AVCOL_PRI_UNSPECIFIED;
-if (!av_color_transfer_name(vui->transfer_characteristics))
+if (vui->transfer_characteristics == AVCOL_TRC_RESERVED0 ||
+vui->transfer_characteristics == AVCOL_TRC_RESERVED ||
+!av_color_transfer_name(vui->transfer_characteristics))
 vui->transfer_characteristics = AVCOL_TRC_UNSPECIFIED;
-if (!av_color_space_name(vui->matrix_coeffs))
+if (vui->matrix_coeffs == AVCOL_SPC_RESERVED ||
+!av_color_space_name(vui->matrix_coeffs))
 vui->matrix_coeffs = AVCOL_SPC_UNSPECIFIED;
 }
 }
-- 
2.46.0
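
The check added by this patch can be sketched in isolation as follows. This is a simplified, self-contained illustration rather than the FFmpeg code itself: the enum values are local stand-ins assumed to mirror libavutil/pixfmt.h (which follows ITU-T H.273), and `primaries_has_name` is a hypothetical substitute for testing whether av_color_primaries_name() returns non-NULL.

```c
/* Local stand-ins for a few AVColorPrimaries values. Assumption: the
 * numbers mirror libavutil/pixfmt.h, which follows ITU-T H.273. */
enum {
    PRI_RESERVED0   = 0,
    PRI_BT709       = 1,
    PRI_UNSPECIFIED = 2,
    PRI_RESERVED    = 3
};

/* Hypothetical stand-in for "av_color_primaries_name(v) != NULL". The
 * exact set of named values is an assumption made for this sketch. */
static int primaries_has_name(int v)
{
    return (v >= 0 && v <= 12) || v == 22;
}

/* Map reserved or unknown values to "unspecified", as the patch does:
 * reserved values have a name, so the name check alone lets them through. */
int sanitize_primaries(int v)
{
    if (v == PRI_RESERVED0 || v == PRI_RESERVED || !primaries_has_name(v))
        return PRI_UNSPECIFIED;
    return v;
}
```

With this sketch, a reserved value such as 0 or 3 and an out-of-range value such as 200 all come back as "unspecified", while a regular value such as BT.709 passes through unchanged.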


[FFmpeg-devel] [PATCH 2/3] tests/fate/hevc: Fix dependancy for hevc-alpha

2025-05-21 Thread Zhao Zhili

From: Zhao Zhili 

---
 tests/fate/hevc.mak | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/fate/hevc.mak b/tests/fate/hevc.mak
index e432345ef7..390ccf46e2 100644
--- a/tests/fate/hevc.mak
+++ b/tests/fate/hevc.mak
@@ -292,7 +292,7 @@ fate-hevc-mv-position: CMD = framecrc -i $(TARGET_SAMPLES)/hevc/multiview.mov -m
 FATE_HEVC-$(call FRAMECRC, MOV, HEVC) += fate-hevc-mv-position
 
 fate-hevc-alpha: CMD = framecrc -i $(TARGET_SAMPLES)/hevc/alpha.mp4
-FATE_HEVC-$(call FRAMECRC, HEVC, HEVC) += fate-hevc-alpha
+FATE_HEVC-$(call FRAMECRC, MOV, HEVC) += fate-hevc-alpha
 
 FATE_SAMPLES_AVCONV += $(FATE_HEVC-yes)
 FATE_SAMPLES_FFPROBE += $(FATE_HEVC_FFPROBE-yes)
-- 
2.46.0


[FFmpeg-devel] [PATCH 3/3] tests: Add fate-hevc-color-reserved

2025-05-21 Thread Zhao Zhili

From: Zhao Zhili 

---
 tests/fate/hevc.mak                | 3 +++
 tests/ref/fate/hevc-color-reserved | 6 ++++++
 2 files changed, 9 insertions(+)
 create mode 100644 tests/ref/fate/hevc-color-reserved

diff --git a/tests/fate/hevc.mak b/tests/fate/hevc.mak
index 390ccf46e2..5e721526d0 100644
--- a/tests/fate/hevc.mak
+++ b/tests/fate/hevc.mak
@@ -294,6 +294,9 @@ FATE_HEVC-$(call FRAMECRC, MOV, HEVC) += fate-hevc-mv-position
 fate-hevc-alpha: CMD = framecrc -i $(TARGET_SAMPLES)/hevc/alpha.mp4
 FATE_HEVC-$(call FRAMECRC, MOV, HEVC) += fate-hevc-alpha
 
+fate-hevc-color-reserved: CMD = framecrc -i $(TARGET_SAMPLES)/hevc/color_prim_reserved0.hevc -fps_mode passthrough -sws_flags +accurate_rnd+bitexact -vf scale,format=nv12
+FATE_HEVC-$(call FRAMECRC, HEVC, HEVC, SCALE_FILTER) += fate-hevc-color-reserved
+
 FATE_SAMPLES_AVCONV += $(FATE_HEVC-yes)
 FATE_SAMPLES_FFPROBE += $(FATE_HEVC_FFPROBE-yes)
 
diff --git a/tests/ref/fate/hevc-color-reserved b/tests/ref/fate/hevc-color-reserved
new file mode 100644
index 0000000000..3351628209
--- /dev/null
+++ b/tests/ref/fate/hevc-color-reserved
@@ -0,0 +1,6 @@
+#tb 0: 1/60
+#media_type 0: video
+#codec_id 0: rawvideo
+#dimensions 0: 1920x900
+#sar 0: 1/1
+0,  0,  0,1,  2592000, 0xfa6fce1e
-- 
2.46.0


Re: [FFmpeg-devel] [PATCH 3/3] tests: Add fate-hevc-color-reserved

2025-05-21 Thread Zhao Zhili




> On May 21, 2025, at 21:28, Zhao Zhili  
> wrote:
> 
>> On May 21, 2025, at 21:11, Andreas Rheinhardt 
>>  wrote:
>> 
>> Zhao Zhili:
>>> From: Zhao Zhili 
>>> 
>>> ---
>>> tests/fate/hevc.mak| 3 +++
>>> tests/ref/fate/hevc-color-reserved | 6 ++
>>> 2 files changed, 9 insertions(+)
>>> create mode 100644 tests/ref/fate/hevc-color-reserved
>>> 
>>> diff --git a/tests/fate/hevc.mak b/tests/fate/hevc.mak
>>> index 390ccf46e2..5e721526d0 100644
>>> --- a/tests/fate/hevc.mak
>>> +++ b/tests/fate/hevc.mak
>>> @@ -294,6 +294,9 @@ FATE_HEVC-$(call FRAMECRC, MOV, HEVC) += 
>>> fate-hevc-mv-position
>>> fate-hevc-alpha: CMD = framecrc -i $(TARGET_SAMPLES)/hevc/alpha.mp4
>>> FATE_HEVC-$(call FRAMECRC, MOV, HEVC) += fate-hevc-alpha
>>> 
>>> +fate-hevc-color-reserved: CMD = framecrc -i 
>>> $(TARGET_SAMPLES)/hevc/color_prim_reserved0.hevc -fps_mode passthrough 
>>> -sws_flags +accurate_rnd+bitexact -vf scale,format=nv12
>>> +FATE_HEVC-$(call FRAMECRC, HEVC, HEVC, SCALE_FILTER) += 
>>> fate-hevc-color-reserved
>> 
>> A new sample for this? Why don't you just create one with hevc_metadata?

On the other hand, we can create an invalid sample with hevc_metadata for now.

Does it make sense to check primaries/trc/matrix in hevc_metadata?

> 
> Great idea. See patch v2 3/3.
> 
> https://ffmpeg.org/pipermail/ffmpeg-devel/2025-May/343884.html
> 
>> 
>>> +
>>> FATE_SAMPLES_AVCONV += $(FATE_HEVC-yes)
>>> FATE_SAMPLES_FFPROBE += $(FATE_HEVC_FFPROBE-yes)
>>> 
>>> diff --git a/tests/ref/fate/hevc-color-reserved 
>>> b/tests/ref/fate/hevc-color-reserved
>>> new file mode 100644
>>> index 00..3351628209
>>> --- /dev/null
>>> +++ b/tests/ref/fate/hevc-color-reserved
>>> @@ -0,0 +1,6 @@
>>> +#tb 0: 1/60
>>> +#media_type 0: video
>>> +#codec_id 0: rawvideo
>>> +#dimensions 0: 1920x900
>>> +#sar 0: 1/1
>>> +0,  0,  0,1,  2592000, 0xfa6fce1e

Re: [FFmpeg-devel] [PATCH 3/3] tests: Add fate-hevc-color-reserved

2025-05-21 Thread Andreas Rheinhardt

Zhao Zhili:
> 
> 
>> On May 21, 2025, at 21:28, Zhao Zhili  
>> wrote:
>>
>>> On May 21, 2025, at 21:11, Andreas Rheinhardt 
>>>  wrote:
>>>
>>> Zhao Zhili:
>>>> From: Zhao Zhili 
>>>>
>>>> ---
>>>> tests/fate/hevc.mak| 3 +++
>>>> tests/ref/fate/hevc-color-reserved | 6 ++
>>>> 2 files changed, 9 insertions(+)
>>>> create mode 100644 tests/ref/fate/hevc-color-reserved
>>>>
>>>> diff --git a/tests/fate/hevc.mak b/tests/fate/hevc.mak
>>>> index 390ccf46e2..5e721526d0 100644
>>>> --- a/tests/fate/hevc.mak
>>>> +++ b/tests/fate/hevc.mak
>>>> @@ -294,6 +294,9 @@ FATE_HEVC-$(call FRAMECRC, MOV, HEVC) += 
>>>> fate-hevc-mv-position
>>>> fate-hevc-alpha: CMD = framecrc -i $(TARGET_SAMPLES)/hevc/alpha.mp4
>>>> FATE_HEVC-$(call FRAMECRC, MOV, HEVC) += fate-hevc-alpha
>>>>
>>>> +fate-hevc-color-reserved: CMD = framecrc -i 
>>>> $(TARGET_SAMPLES)/hevc/color_prim_reserved0.hevc -fps_mode passthrough 
>>>> -sws_flags +accurate_rnd+bitexact -vf scale,format=nv12
>>>> +FATE_HEVC-$(call FRAMECRC, HEVC, HEVC, SCALE_FILTER) += 
>>>> fate-hevc-color-reserved
>>>
>>> A new sample for this? Why don't you just create one with hevc_metadata?
> 
> On the other hand, we can create invalid sample with hevc_metadata for now.
> 
> Does it make sense to check primaries/trc/matrix in hevc_metadata?
> 

IMO reserved != invalid

- Andreas


[FFmpeg-devel] [PATCH] avcodec/x86/vp9: Add AVX-512ICL for 16x16 and 32x32 10bpc inverse transforms

2025-05-21 Thread Henrik Gramner via ffmpeg-devel

Tested to pass FATE on Linux and Windows.

Checkasm numbers vs the existing SSE2 code on Zen 5 (Strix Halo):
vp9_inv_adst_adst_16x16_sub16_add_10_sse2:       1041.8 ( 1.92x)
vp9_inv_adst_adst_16x16_sub16_add_10_avx512icl:   132.5 (15.06x)

vp9_inv_dct_adst_16x16_sub16_add_10_sse2:         901.0 ( 1.98x)
vp9_inv_dct_adst_16x16_sub16_add_10_avx512icl:    120.8 (14.79x)

vp9_inv_dct_dct_16x16_sub16_add_10_sse2:          750.6 ( 2.10x)
vp9_inv_dct_dct_16x16_sub16_add_10_avx512icl:     110.9 (14.18x)

vp9_inv_dct_dct_32x32_sub32_add_10_sse2:         3922.6 ( 2.24x)
vp9_inv_dct_dct_32x32_sub32_add_10_avx512icl:     506.6 (17.37x)
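
Going by the timings listed above, the gain of the AVX-512ICL code over the SSE2 code can be worked out directly. A minimal sketch follows; the helper name `speedup` is hypothetical, and it assumes the first numeric column is a timing where lower is faster.

```c
/* Relative speedup of a new implementation over an old one, given
 * checkasm-style timings where a lower number means faster code. */
double speedup(double old_time, double new_time)
{
    return old_time / new_time;
}
```

For example, `speedup(1041.8, 132.5)` comes out at roughly 7.9 for the 16x16 adst_adst case, and `speedup(3922.6, 506.6)` at roughly 7.7 for the 32x32 dct_dct case.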


vp9_itx_10_avx512.patch
Description: Binary data

Re: [FFmpeg-devel] gcc: Remove auto-vectorization limitation.

2025-05-21 Thread Frank Plowman

On 21/05/2025 11:17, Jiawei wrote:
> 
> On 2025/5/21 14:52, Nicolas George wrote:
>> Jiawei (HE12025-05-21):
>>>   particularly improving
>>> performance on x86_64 (AVX), ARM64 (SVE) and RISC-V(RVV) architectures.
>> Benchmark needed.
>>
>> Regards,
> 
> 
> Hi Nicolas,
> 
> 
> Since I am a GCC developer, I'm not very familiar with the FFmpeg test 
> flow. Here is my test process;
> if anything is incorrect, please point it out:
> 
> 
> 1. Download the video bbb_sunflower_2160p_30fps_normal.mp4.zip 
> from https://download.blender.org/demo/movies/BBB/,
> 
> ```
> 
> ffmpeg -i bbb_sunflower_2160p_30fps_normal.mp4 -t 60 -vf 
> "scale=1920:1080" -c:v libx265 -c:a libmp3lame 1080p_hevc_mp3.mp4
> ```
> 
> This gives the 1080p video used as the benchmark test video.
> 
> 
> 2. Build two versions of FFmpeg, one with the patch and one without 
> it, using the GCC 13.3 release,
> 
> verified on an Intel(R) Core(TM) Ultra 9 285HX
> 
> 
> Using patch:
> 
> ```
> ./ffmpeg -benchmark -i ~/mp/1080p_hevc_mp3.mp4 -f null -
> ffmpeg version N-119636-g96518c8d8d Copyright (c) 2000-2025 the FFmpeg 
> developers
>    built with gcc 13 (Ubuntu 13.3.0-6ubuntu2~24.04)
>    configuration: --prefix=/home/pz9115/ffpo --disable-ffplay --arch=x64 
> --extra-cflags=-O3 --enable-static --target-os=linux
>    libavutil  60.  2.100 / 60.  2.100
>    libavcodec 62.  3.101 / 62.  3.101
>    libavformat    62.  0.102 / 62.  0.102
>    libavdevice    62.  0.100 / 62.  0.100
>    libavfilter    11.  0.100 / 11.  0.100
>    libswscale  9.  0.100 /  9.  0.100
>    libswresample   6.  0.100 /  6.  0.100
> Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 
> '/home/pz9115/mp/1080p_hevc_mp3.mp4':
>    Metadata:
>      major_brand : isom
>      minor_version   : 512
>      compatible_brands: isomiso2mp41
>      title   : Big Buck Bunny, Sunflower version
>      artist  : Blender Foundation 2008, Janus Bager Kristensen 2013
>      composer    : Sacha Goedegebure
>      encoder : Lavf60.16.100
>      comment : Creative Commons Attribution 3.0 - 
> http://bbb3d.renderfarming.net
>      genre   : Animation
>    Duration: 00:01:00.00, start: 0.00, bitrate: 1564 kb/s
>    Stream #0:0[0x1](und): Video: hevc (Main) (hev1 / 0x31766568), 
> yuv420p(tv, progressive), 1920x1080 [SAR 1:1 DAR 16:9], 1429 kb/s, 30 
> fps, 30 tbr, 15360 tbn (default)
>      Metadata:
>    handler_name    : GPAC ISO Video Handler
>    vendor_id   : [0][0][0][0]
>    encoder : Lavc60.31.102 libx265
>    Stream #0:1[0x2](und): Audio: mp3 (mp3float) (mp4a / 0x6134706D), 
> 48000 Hz, stereo, fltp, 128 kb/s (default)
>      Metadata:
>    handler_name    : GPAC ISO Audio Handler
>    vendor_id   : [0][0][0][0]
> Stream mapping:
>    Stream #0:0 -> #0:0 (hevc (native) -> wrapped_avframe (native))
>    Stream #0:1 -> #0:1 (mp3 (mp3float) -> pcm_s16le (native))
> Press [q] to stop, [?] for help
> Output #0, null, to 'pipe:':
>    Metadata:
>      major_brand : isom
>      minor_version   : 512
>      compatible_brands: isomiso2mp41
>      title   : Big Buck Bunny, Sunflower version
>      artist  : Blender Foundation 2008, Janus Bager Kristensen 2013
>      composer    : Sacha Goedegebure
>      genre   : Animation
>      comment : Creative Commons Attribution 3.0 - 
> http://bbb3d.renderfarming.net
>      encoder : Lavf62.0.102
>    Stream #0:0(und): Video: wrapped_avframe, yuv420p(tv, progressive), 
> 1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 200 kb/s, 30 fps, 30 tbn (default)
>      Metadata:
>    encoder : Lavc62.3.101 wrapped_avframe
>    handler_name    : GPAC ISO Video Handler
>    vendor_id   : [0][0][0][0]
>    Stream #0:1(und): Audio: pcm_s16le, 48000 Hz, stereo, s16, 1536 kb/s 
> (default)
>      Metadata:
>    encoder : Lavc62.3.101 pcm_s16le
>    handler_name    : GPAC ISO Audio Handler
>    vendor_id   : [0][0][0][0]
> [out#0/null @ 0x565233669eb0] video:731KiB audio:11250KiB subtitle:0KiB 
> other streams:0KiB global headers:0KiB muxing overhead: unknown
> frame= 1800 fps=635 q=-0.0 Lsize=N/A time=00:01:00.00 bitrate=N/A 
> speed=21.2x elapsed=0:00:02.83
> bench: utime=11.324s stime=0.290s rtime=2.834s
> bench: maxrss=186556KiB
> ```
> 
> Without patch(here I add the fno-tree-vectorize directly):
> 
> ./ffmpeg -benchmark -i ~/mp/1080p_hevc_mp3.mp4 -f null -
> ffmpeg version N-119636-g96518c8d8d Copyright (c) 2000-2025 the FFmpeg 
> developers
>    built with gcc 13 (Ubuntu 13.3.0-6ubuntu2~24.04)
>    configuration: --prefix=/home/pz9115/ffpo --disable-ffplay --arch=x64 
> --extra-cflags='-O3 -fno-tree-vectorize' --enable-static --target-os=linux
>    libavutil  60.  2.100 / 60.  2.100
>    libavcodec 62.  3.101 / 62.  3.101
>    libavformat    62.  0.10

Re: [FFmpeg-devel] [PATCH 0/3] Clean up build spam from graph css builder

2025-05-21 Thread softworkz .

> -Original Message-
> From: ffmpeg-devel  On Behalf Of Derek
> Buitenhuis
> Sent: Mittwoch, 21. Mai 2025 15:27
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH 0/3] Clean up build spam from graph css
> builder
> 
> On 5/20/2025 7:44 PM, softworkz . wrote:
> > Hi Derek,
> >
> > thanks a lot for the patch. This partially duplicates what Timo had
> > already submitted:
> >
> > https://patchwork.ffmpeg.org/project/ffmpeg/patch/20250516230202.355445-1-
> t...@rothenpieler.org/
> 
> Dropping this set, then.
> 
> > Regarding patch 3/3, would you mind taking a look at the patch that I
> > have submitted in this regard, from which I believe that it's the "most
> > correct" way:
> >
> >
> https://patchwork.ffmpeg.org/project/ffmpeg/patch/pull.80.v2.ffstaging.FFmpeg.
> 1747549830700.ffmpegag...@gmail.com/
> 
> I am actually unclear on which is "most correct", myself.
> 
> [...]
> 
> - Derek

I can't say that I have great expertise with Makefile development. I read the 
documentation and then I created a super-simple test to verify my understanding.
The Makefile content is this:

# Makefile
%.t: %.r.tmp
	./cmd1 $< $@

%.r.tmp: %.q
	./cmd2 $< $@

%.q: %.r
	./cmd3 $< $@

and every cmd1, cmd2 and cmd3 does nothing else but copy the file. The "source"
file is test.t and no other files exist at the beginning.

When running make, it copies test.t to test.tmp and then test.tmp to test.r and
finally it deletes test.tmp.

When you run make again, it says everything is up to date. When you touch
test.t, it rebuilds the result.

That's why I said I believe it's the "most correct" way, and that's the behavior
that my patch introduces. At the same time, it makes it possible to get rid of a few
lines (the filtering of files and adding them to the SECONDARY special target).

Thanks
sw


Re: [FFmpeg-devel] [PATCH] Accept a colon in the path of a URI, instead of stripping preceding characters.

2025-05-21 Thread Timothy Allen via ffmpeg-devel

On Tue, 2025-05-20 at 20:03 +, softworkz . wrote:
> I was just about to reply and suggest to replace those colons with
> %3A
> (url-encoded) when I read the ticket, which already suggests that.
> 
> Have you tried it? It sounds like a much better way to me.

I think this would be a common-sense solution as long as one controls
the server/content.
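
For reference, the %3A substitution suggested above can be sketched as a small C helper. This is only an illustration: `encode_colons` is a hypothetical function, not part of FFmpeg, and it assumes the caller passes only the path component (not the scheme or a "host:port" authority).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not an FFmpeg function): copy `path` into `out`,
 * replacing every ':' with "%3A". Returns 0 on success, or -1 if `out`
 * (of size `out_size`, including the terminating NUL) is too small. */
int encode_colons(const char *path, char *out, size_t out_size)
{
    size_t o = 0;

    for (; *path; path++) {
        size_t need = (*path == ':') ? 3 : 1;

        if (o + need >= out_size)   /* always keep room for the NUL */
            return -1;
        if (*path == ':')
            memcpy(out + o, "%3A", 3);
        else
            out[o] = *path;
        o += need;
    }
    if (o >= out_size)
        return -1;
    out[o] = '\0';
    return 0;
}
```

A segment name such as "seg:01.ts" would come out as "seg%3A01.ts", which no longer contains a character that can be mistaken for a scheme or port separator.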

The reason I submitted the patch anyway was because not everyone will
control the content they're consuming, and because, although I
acknowledge it technically breaks RFCs, there is an argument that the
RFC's behaviour is surprising; we can certainly see that other
applications (Safari, in the linked ticket) break the RFC as well.

I can certainly understand if the patch is rejected -- even if it
didn't break the RFC, Postel's Law is not in vogue any longer. However,
I think the workflows that it might break ("host:port", with no scheme
or path) are much less obvious and intuitive than the workflows that it
rescues.

Regardless of the decision, thanks for reviewing the patch!

Best,

Tim

Re: [FFmpeg-devel] gcc: Remove auto-vectorization limitation.

2025-05-21 Thread Zhao Zhili



> On May 21, 2025, at 14:17, Jiawei  wrote:
> 
> This patch modifies the FFmpeg build system to remove the explicit disabling
> of GCC's auto-vectorization feature.
> 
> Modern GCC versions (>= 10.0) have demonstrated stable auto-vectorization
> capabilities through extensive optimizations in loop analysis and SIMD
> code generation. The explicit -fno-tree-vectorize flag originally added
> in commit 973859f (2009) to workaround early GCC vectorization instability
> is no longer necessary.

This isn’t the whole story.

The flag was added by 973859f in 2009.
Then it was reverted by cb8646af in 2016.
Shortly after that, the revert was reverted again by fd6dbc5 in 2016.

> 
> Key improvements justifying this change:
> 1. Enhanced heuristics for loop vectorization cost models
> 2. Mature handling of alignment and memory access patterns
> 3. Robust fallback mechanisms for unsupported architectures
> 
> This change allows FFmpeg to benefit from automated SIMD optimizations
> when built with -O3 optimization level, particularly improving
> performance on x86_64 (AVX), ARM64 (SVE) and RISC-V(RVV) architectures.

Those flags can only be enabled in tightly controlled environments (e.g., built 
and run on the same
machine), while FFmpeg has hand written assembly, runtime cpu probe and dynamic 
binding/dispatch.

Those auto-vectorization and ARCH flags can be enabled manually, but be careful.
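
The runtime probe and dispatch approach mentioned above can be sketched generically as follows. This is not FFmpeg's actual mechanism; all names are made up for illustration, the "AVX2" function is only a placeholder that reuses the scalar code, and the probe relies on the GCC/Clang x86 builtins `__builtin_cpu_init` and `__builtin_cpu_supports`.

```c
#include <stddef.h>

typedef void (*add_fn)(int *dst, const int *src, size_t n);

/* Portable fallback implementation. */
static void add_c(int *dst, const int *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] += src[i];
}

/* Placeholder for a hand-written AVX2 version; it simply reuses the
 * scalar code so that the sketch stays runnable on any machine. */
static void add_avx2_placeholder(int *dst, const int *src, size_t n)
{
    add_c(dst, src, n);
}

/* Pick an implementation once, at run time, based on the CPU the binary
 * is actually running on rather than the one it was built on. */
add_fn select_add(void)
{
    int have_avx2 = 0;

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    have_avx2 = __builtin_cpu_supports("avx2");
#endif
    return have_avx2 ? add_avx2_placeholder : add_c;
}
```

Because the choice is made at run time, the binary itself can be built for a baseline target and still use faster code paths where available, whereas building everything with a specific -march assumes the build and run machines match.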

> 
> [1] 
> https://git.ffmpeg.org/gitweb/ffmpeg.git/commit/973859f5230e77beea7bb59dc081870689d6d191
> 
> ---
> configure | 1 -
> 1 file changed, 1 deletion(-)
> 
> diff --git a/configure b/configure
> index 3730b0524c..b9e95ce4ec 100755
> --- a/configure
> +++ b/configure
> @@ -7656,7 +7656,6 @@ if enabled icc; then
> disable aligned_stack
> fi
> elif enabled gcc; then
> -check_optflags -fno-tree-vectorize
> check_cflags -Werror=format-security
> check_cflags -Werror=implicit-function-declaration
> check_cflags -Werror=missing-prototypes
> -- 
> 2.43.0

Re: [FFmpeg-devel] gcc: Remove auto-vectorization limitation.

2025-05-21 Thread Jiawei


> -Original Message-
> From: "Nicolas George"
> Sent: 2025-05-21 14:52:12 (Wednesday)
> To: "FFmpeg development discussions and patches"
> Cc:
> Subject: Re: [FFmpeg-devel] gcc: Remove auto-vectorization limitation.
>
> Jiawei (HE12025-05-21):
> >  particularly improving
> > performance on x86_64 (AVX), ARM64 (SVE) and RISC-V(RVV) architectures.
>
> Benchmark needed.
>
> Regards,
>
> --
>   Nicolas George


Hi Nicolas,


As a GCC developer, I'm not very familiar with the FFmpeg test flow. Here
is my test process; if anything is incorrect, please point it out:

1. Download the video bbb_sunflower_2160p_30fps_normal.mp4.zip from
https://download.blender.org/demo/movies/BBB/, then transcode it to get a
1080p video to use as the benchmark input:

```
ffmpeg -i bbb_sunflower_2160p_30fps_normal.mp4 -t 60 -vf "scale=1920:1080" -c:v libx265 -c:a libmp3lame 1080p_hevc_mp3.mp4
```

2. Build two versions of FFmpeg, one with the patch and one without it,
using the GCC 13.3 release, and test both on an Intel(R) Core(TM) Ultra 9 285HX.


Using patch:

```
./ffmpeg -benchmark -i ~/mp/1080p_hevc_mp3.mp4 -f null -
ffmpeg version N-119636-g96518c8d8d Copyright (c) 2000-2025 the FFmpeg developers
  built with gcc 13 (Ubuntu 13.3.0-6ubuntu2~24.04)
  configuration: --prefix=/home/pz9115/ffpo --disable-ffplay --arch=x64 --extra-cflags=-O3 --enable-static --target-os=linux
  libavutil      60.  2.100 / 60.  2.100
  libavcodec     62.  3.101 / 62.  3.101
  libavformat    62.  0.102 / 62.  0.102
  libavdevice    62.  0.100 / 62.  0.100
  libavfilter    11.  0.100 / 11.  0.100
  libswscale      9.  0.100 /  9.  0.100
  libswresample   6.  0.100 /  6.  0.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/home/pz9115/mp/1080p_hevc_mp3.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2mp41
    title           : Big Buck Bunny, Sunflower version
    artist          : Blender Foundation 2008, Janus Bager Kristensen 2013
    composer        : Sacha Goedegebure
    encoder         : Lavf60.16.100
    comment         : Creative Commons Attribution 3.0 - http://bbb3d.renderfarming.net
    genre           : Animation
  Duration: 00:01:00.00, start: 0.00, bitrate: 1564 kb/s
  Stream #0:0[0x1](und): Video: hevc (Main) (hev1 / 0x31766568), yuv420p(tv, progressive), 1920x1080 [SAR 1:1 DAR 16:9], 1429 kb/s, 30 fps, 30 tbr, 15360 tbn (default)
    Metadata:
      handler_name    : GPAC ISO Video Handler
      vendor_id       : [0][0][0][0]
      encoder         : Lavc60.31.102 libx265
  Stream #0:1[0x2](und): Audio: mp3 (mp3float) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 128 kb/s (default)
    Metadata:
      handler_name    : GPAC ISO Audio Handler
      vendor_id       : [0][0][0][0]
Stream mapping:
  Stream #0:0 -> #0:0 (hevc (native) -> wrapped_avframe (native))
  Stream #0:1 -> #0:1 (mp3 (mp3float) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, null, to 'pipe:':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2mp41
    title           : Big Buck Bunny, Sunflower version
    artist          : Blender Foundation 2008, Janus Bager Kristensen 2013
    composer        : Sacha Goedegebure
    genre           : Animation
    comment         : Creative Commons Attribution 3.0 - http://bbb3d.renderfarming.net
    encoder         : Lavf62.0.102
  Stream #0:0(und): Video: wrapped_avframe, yuv420p(tv, progressive), 1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 200 kb/s, 30 fps, 30 tbn (default)
    Metadata:
      encoder         : Lavc62.3.101 wrapped_avframe
      handler_name    : GPAC ISO Video Handler
      vendor_id       : [0][0][0][0]
  Stream #0:1(und): Audio: pcm_s16le, 48000 Hz, stereo, s16, 1536 kb/s (default)
    Metadata:
      encoder         : Lavc62.3.101 pcm_s16le
      handler_name    : GPAC ISO Audio Handler
      vendor_id       : [0][0][0][0]
[out#0/null @ 0x565233669eb0] video:731KiB audio:11250KiB subtitle:0KiB other streams:0KiB global headers:0KiB muxing overhead: unknown
frame= 1800 fps=635 q=-0.0 Lsize=N/A time=00:01:00.00 bitrate=N/A speed=21.2x elapsed=0:00:02.83
bench: utime=11.324s stime=0.290s rtime=2.834s
bench: maxrss=186556KiB
```

Without the patch (here I added -fno-tree-vectorize directly):

./ffmpeg -benchmark -i ~/mp/1080p_hevc_mp3.mp4 -f null -
ffmpeg version N-119636-g96518c8d8d Copyright (c) 2000-2025 the FFmpeg developers
  built with gcc 13 (Ubuntu 13.3.0-6ubuntu2~24.04)
  configuration: --prefix=/home/pz9115/ffpo --disable-ffplay --arch=x64 --extra-cflags='-O3 -fno-tree-vectorize' --enable-static --target-os=linux
  libavutil      60.  2.100 / 60.  2.100
  libavcodec     62.  3.101 / 62.  3.101
  libavformat    62.  0.102 / 62.  0.102
  libavdevice    62.  0.100 / 62.  0.100
  libavfilter    11.  0.100 / 11.  0.100
  libswscale      9.  0.100 /  9.  0.100

Re: [FFmpeg-devel] gcc: Remove auto-vectorization limitation.

2025-05-21 Thread Michael Niedermayer

On Wed, May 21, 2025 at 02:17:50PM +0800, Jiawei wrote:
> This patch modifies the FFmpeg build system to remove the explicit disabling
> of GCC's auto-vectorization feature.
> 
> Modern GCC versions (>= 10.0) have demonstrated stable auto-vectorization
> capabilities through extensive optimizations in loop analysis and SIMD
> code generation. The explicit -fno-tree-vectorize flag originally added
> in commit 973859f (2009) to workaround early GCC vectorization instability
> is no longer necessary.
> 
> Key improvements justifying this change:
> 1. Enhanced heuristics for loop vectorization cost models
> 2. Mature handling of alignment and memory access patterns
> 3. Robust fallback mechanisms for unsupported architectures
> 
> This change allows FFmpeg to benefit from automated SIMD optimizations
> when built with -O3 optimization level, particularly improving
> performance on x86_64 (AVX), ARM64 (SVE) and RISC-V(RVV) architectures.
> 
> [1] 
> https://git.ffmpeg.org/gitweb/ffmpeg.git/commit/973859f5230e77beea7bb59dc081870689d6d191
> 
> ---
>  configure | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/configure b/configure
> index 3730b0524c..b9e95ce4ec 100755
> --- a/configure
> +++ b/configure
> @@ -7656,7 +7656,6 @@ if enabled icc; then
>  disable aligned_stack
>  fi
>  elif enabled gcc; then
> -check_optflags -fno-tree-vectorize
>  check_cflags -Werror=format-security
>  check_cflags -Werror=implicit-function-declaration
>  check_cflags -Werror=missing-prototypes

Your text speaks about this change being ok in a gcc version dependent
way

Your patch has no gcc version dependency

If you claim that all issues were solved, please show the issues happening
in version v and no longer happening in w>v. Then it makes sense to
change the flags for version w

Thx
[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

When the tyrant has disposed of foreign enemies by conquest or treaty, and
there is nothing more to fear from them, then he is always stirring up
some war or other, in order that the people may require a leader. -- Plato



Re: [FFmpeg-devel] [PATCH 1/1] [ffmpeg-deve] avcodec/mpegaudiodec optimizing code size

2025-05-21 Thread Michael Niedermayer

On Wed, May 21, 2025 at 02:46:42AM +0200, Michael Niedermayer wrote:
> On Mon, May 19, 2025 at 08:15:37PM +0800, chenyu202...@gmail.com wrote:
> > From: chenyu 
> > 
> > Optimizing 160k code size by converting static array to dynamic malloc 
> > memory.
> > 
> > Signed-off-by: chenyu 
> > ---
> >  libavcodec/mpegaudiodata.h|  4 ++--
> >  libavcodec/mpegaudiodec_common_tablegen.h | 10 --
> >  2 files changed, 10 insertions(+), 4 deletions(-)
> 
> This segfaults:
> 
> ./ffmpeg_g -max_error_rate 2 -max_alloc 10 -i ~/tickets/2950/mpeg2_fuzz.mpg -max_muxing_queue_size 8000 -f null -

==3638361== Invalid write of size 4
==3638361==    at 0x2DFB01: mpegaudiodec_common_init_static (in ffmpeg/ffmpeg_g)
==3638361==    by 0x4A114DE: __pthread_once_slow (pthread_once.c:116)
==3638361==    by 0x2DFBB1: ff_mpegaudiodec_common_init_static (in ffmpeg/ffmpeg_g)
==3638361==    by 0x4A114DE: __pthread_once_slow (pthread_once.c:116)
==3638361==    by 0x2A6FFD: decode_init (in ffmpeg/ffmpeg_g)
==3638361==    by 0x7E4BF1: avcodec_open2 (in ffmpeg/ffmpeg_g)
==3638361==    by 0x62C444: try_decode_frame (in ffmpeg/ffmpeg_g)
==3638361==    by 0x631575: avformat_find_stream_info (in ffmpeg/ffmpeg_g)
==3638361==    by 0x306596: ifile_open (in ffmpeg/ffmpeg_g)
==3638361==    by 0x31CA17: open_files.isra.0 (in ffmpeg/ffmpeg_g)
==3638361==    by 0x31E9F5: ffmpeg_parse_options (in ffmpeg/ffmpeg_g)
==3638361==    by 0x2FD297: main (in ffmpeg/ffmpeg_g)
==3638361==  Address 0x4 is not stack'd, malloc'd or (recently) free'd
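
A 4-byte write to address 0x4 under -max_alloc 10 is what one would expect if a table pointer is NULL and element 1 of a 4-byte-element array is written, i.e. an allocation result used without a check. That is an inference from the trace, not a confirmed diagnosis. A minimal sketch of the guard, using plain malloc and hypothetical names (table, TABLE_SIZE, init_table) rather than the patch's actual code:

```c
#include <stdint.h>
#include <stdlib.h>

#define TABLE_SIZE 1024   /* hypothetical size, for illustration only */

static uint32_t *table;   /* hypothetical stand-in for a formerly static array */

/*
 * Returns 0 on success, -1 if the allocation failed.
 * The allocator is passed in so that a failure can be simulated.
 */
static int init_table(void *(*alloc)(size_t))
{
    table = alloc(TABLE_SIZE * sizeof(*table));
    if (!table)
        return -1;        /* without this check, table[1] = ... writes to address 0x4 */
    for (size_t i = 0; i < TABLE_SIZE; i++)
        table[i] = (uint32_t)(i * i);
    return 0;
}

/* Allocator that always fails, mimicking a very small allocation limit. */
static void *failing_alloc(size_t size)
{
    (void)size;
    return NULL;
}
```

Since a pthread_once-style initializer returns void, the failure would presumably have to be recorded somewhere and checked by the caller, which is part of the cost of replacing static tables with heap allocations.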

[...]

-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

For a strong democracy, genuine criticism is necessary, allegations benefit
noone, they just cause unnecessary conflicts. - Narendra Modi



Re: [FFmpeg-devel] [PATCH v6 3/4] ogg/vorbis: implement header packet skip in chained ogg bitstreams.

2025-05-21 Thread Michael Niedermayer

Hi Romain

On Tue, May 20, 2025 at 05:45:01PM -0500, Romain Beauxis wrote:
> Le mar. 20 mai 2025 à 16:46, Michael Niedermayer
>  a écrit :
> >
> > On Mon, May 19, 2025 at 09:46:38AM -0500, Romain Beauxis wrote:
> > > ---
> > >  libavcodec/vorbis_parser.h | 11 
> > >  libavcodec/vorbisdec.c | 76 +-
> > >  libavformat/oggparsevorbis.c   | 63 +-
> > >  tests/ref/fate/ogg-vorbis-chained-meta.txt |  3 -
> > >  4 files changed, 116 insertions(+), 37 deletions(-)
> >
> > breaks fate here (normal x86-64 ubuntu)
> >
> > --- ./tests/ref/fate/ogg-vorbis-chained-meta.txt 2025-05-20 23:42:32.043927021 +0200
> > +++ tests/data/fate/ogg-vorbis-chained-meta 2025-05-20 23:43:07.908216645 +0200
> > @@ -7,8 +7,4 @@
> >  Stream ID: 0, frame PTS: 704, metadata: N/A
> >  Stream ID: 0, packet PTS: 0, packet DTS: 0
> >  Stream ID: 0, new metadata: encoder=Lavc61.19.100 libvorbis:title=Second Stream
> > -Stream ID: 0, frame PTS: 0, metadata: N/A
> >  Stream ID: 0, packet PTS: 128, packet DTS: 128
> > -Stream ID: 0, frame PTS: 128, metadata: N/A
> > -Stream ID: 0, packet PTS: 704, packet DTS: 704
> > -Stream ID: 0, frame PTS: 704, metadata: N/A
> > Test ogg-vorbis-chained-meta failed. Look at tests/data/fate/ogg-vorbis-chained-meta.err for details.
> > make: *** [tests/Makefile:316: fate-ogg-vorbis-chained-meta] Error 1
> 
> I'm not sure what I'm looking at. Is that the output of running the FATE 
> tests?

Yes, probably from running make fate-ogg-vorbis-chained-meta


> 
> This diff is already included in the patch:
> 
> ```
> % git show 37370e99451cf0750d5304764ba9031b80e5b3e0 tests/
> commit 37370e99451cf0750d5304764ba9031b80e5b3e0 (HEAD)
> Author: Romain Beauxis 
> Date:   Sat May 17 12:59:40 2025 -0500
> 
> ogg/vorbis: implement header packet skip in chained ogg bitstreams.
> 
> diff --git a/tests/ref/fate/ogg-vorbis-chained-meta.txt b/tests/ref/fate/ogg-vorbis-chained-meta.txt
> index b7a97c90e2..1206f86c1f 100644
> --- a/tests/ref/fate/ogg-vorbis-chained-meta.txt
> +++ b/tests/ref/fate/ogg-vorbis-chained-meta.txt
> @@ -6,10 +6,7 @@ Stream ID: 0, frame PTS: 128, metadata: N/A
>  Stream ID: 0, packet PTS: 704, packet DTS: 704
>  Stream ID: 0, frame PTS: 704, metadata: N/A
>  Stream ID: 0, packet PTS: 0, packet DTS: 0
> -Stream ID: 0, packet PTS: 0, packet DTS: 0
>  Stream ID: 0, new metadata: encoder=Lavc61.19.100 libvorbis:title=Second Stream
> -Stream ID: 0, packet PTS: 0, packet DTS: 0
> -Stream ID: 0, packet PTS: 0, packet DTS: 0
>  Stream ID: 0, frame PTS: 0, metadata: N/A
>  Stream ID: 0, packet PTS: 128, packet DTS: 128
>  Stream ID: 0, frame PTS: 128, metadata: N/A

These 2 diffs are not the same

thx

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

You can kill me, but you cannot change the truth.



Re: [FFmpeg-devel] [PATCH v1] fftools/ffplay: Resolve input file path before processing

2025-05-21 Thread Appaji Chintimi

Hello, can I get some eyes on this, please?

On Fri, 16 May 2025 at 01:24, Appaji Chintimi  wrote:
>
> Can you elaborate a bit more on why this requires more changes? My
> understanding is that we should check for the different cases and handle
> them differently:
>
> 1. "-" gets replaced with "fd:" and is passed directly to "input_filename"
> without resolving.
> 2. If the path contains "://", it is also passed on without resolving it
> further (this handles http://, https://, ftp://, etc.).
> 3. Only after these two checks is "filename" resolved and passed on to
> "input_filename".
>
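
The three cases listed above can be sketched as follows. This is only an illustration under stated assumptions, not the submitted patch: resolve_input is a hypothetical helper, and the PATH_MAX fallback value is arbitrary.

```c
#define _XOPEN_SOURCE 700   /* for realpath() under strict -std= modes */
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#ifndef PATH_MAX
#define PATH_MAX 4096       /* fallback for platforms that do not define it */
#endif

/*
 * Hypothetical helper: writes the name to open into out (PATH_MAX bytes).
 * Returns 0 on success, -1 on failure.
 */
static int resolve_input(const char *filename, char *out)
{
    /* Case 1: "-" means standard input and must not be resolved. */
    if (!strcmp(filename, "-")) {
        snprintf(out, PATH_MAX, "fd:");
        return 0;
    }
    /* Case 2: anything containing "://" is treated as a URL and passed through. */
    if (strstr(filename, "://")) {
        if (strlen(filename) >= PATH_MAX)
            return -1;
        strcpy(out, filename);
        return 0;
    }
    /* Case 3: everything else is resolved as a local path. */
    return realpath(filename, out) ? 0 : -1;
}
```

One thing the sketch makes visible: an input such as pipe:0 contains a colon but no "://", so it would fall through to realpath() and fail. A check along these lines would probably need to cover such protocol prefixes as well.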
> On Wed, 14 May 2025 at 23:40, Marton Balint  wrote:
>>
>>
>>
>> On Wed, 14 May 2025, Nicolas George wrote:
>>
>> > Appaji (HE12025-05-14):
>> >> Fixes ticket: https://trac.ffmpeg.org/ticket/11574
>> >>
>> >> Signed-off-by: Appaji 
>> >> ---
>> >>  fftools/ffplay.c | 13 +++--
>> >>  1 file changed, 11 insertions(+), 2 deletions(-)
>> >>
>> >> diff --git a/fftools/ffplay.c b/fftools/ffplay.c
>> >> index 2a572fc3aa..42f0584b55 100644
>> >> --- a/fftools/ffplay.c
>> >> +++ b/fftools/ffplay.c
>> >> @@ -27,6 +27,7 @@
>> >>  #include "config_components.h"
>> >>  #include 
>> >>  #include 
>> >> +#include 
>> >>  #include 
>> >>  #include 
>> >>
>> >> @@ -3623,9 +3624,17 @@ static int opt_input_file(void *optctx, const char *filename)
>> >>  filename, input_filename);
>> >>  return AVERROR(EINVAL);
>> >>  }
>> >> -if (!strcmp(filename, "-"))
>> >> +
>> >> +char resolved_path[PATH_MAX];
>> >> +
>> >> +if (!realpath(filename, resolved_path)) {
>> >> +av_log(NULL, AV_LOG_FATAL, "Failed to resolve path for '%s': %s\n", filename, strerror(errno));
>> >> +return AVERROR(errno);
>> >> +}
>> >> +
>> >
>> > Hi. Thanks for the patch. Did you test it with non-filenames arguments,
>> > for example http://…?
>> >
>> >> +if (!strcmp(resolved_path, "-"))
>> >>  filename = "fd:";
>> >
>> > This should happen before resolution.
>> >
>> >> -input_filename = av_strdup(filename);
>> >> +input_filename = av_strdup(resolved_path);
>> >>  if (!input_filename)
>> >>  return AVERROR(ENOMEM);
>> >>
>> >
>> > On the whole, I think you are going at it wrong: you are only fixing
>> > this for ffplay, not for ffprobe, ffmpeg and other applications built on
>> > the libraries, and resolving the path can have side effects, for example
>> > if you do not have permission on a parent of the current working
>> > directory.
>> >
>> > IMO, the correct way would be to add a stat() early in the opening of
>> > the file and test the device number. But that requires changing quite a
>> > lot of things.
>>
>> Agreed. You should improve the probing function to fix the ticket, you can
>> do a stat in v4l2_read_probe() in libavdevice/v4l2.c, check if it is a
>> char device and try a V4L2 IOCTL on it to make sure it is a V4L2 device.
>>
>> Regards,
>> Marton
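
The first half of that suggestion, checking whether a path is a character device, could look roughly like this. It is a sketch only: is_char_device is a hypothetical helper, not an existing function in libavdevice, and the follow-up V4L2 ioctl (presumably something like VIDIOC_QUERYCAP) is left out because it needs a real device to exercise.

```c
#include <sys/stat.h>

/*
 * Hypothetical helper: returns 1 if path names a character device,
 * 0 if it names something else, and -1 if stat() fails.
 */
static int is_char_device(const char *path)
{
    struct stat st;

    if (stat(path, &st) < 0)
        return -1;
    return S_ISCHR(st.st_mode) ? 1 : 0;
}
```

Because stat() follows symbolic links, a symlink pointing at a device node would be classified by its target without the tool having to rewrite the file name.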
>>
>> >
>> > Regards,
>> >
>> > --
>> >  Nicolas George

Re: [FFmpeg-devel] gcc: Remove auto-vectorization limitation.

2025-05-21 Thread Jiawei



On 2025/5/21 17:04, Zhao Zhili wrote:

>> On May 21, 2025, at 14:17, Jiawei wrote:
>>
>> This patch modifies the FFmpeg build system to remove the explicit disabling
>> of GCC's auto-vectorization feature.
>>
>> Modern GCC versions (>= 10.0) have demonstrated stable auto-vectorization
>> capabilities through extensive optimizations in loop analysis and SIMD
>> code generation. The explicit -fno-tree-vectorize flag originally added
>> in commit 973859f (2009) to workaround early GCC vectorization instability
>> is no longer necessary.
>
> This isn’t the whole story.
>
> The flag was added by 973859f in 2009.
> Then it was reverted by cb8646af in 2016.
> Shortly after that, the revert was reverted again by fd6dbc5 in 2016.
>
>> Key improvements justifying this change:
>> 1. Enhanced heuristics for loop vectorization cost models
>> 2. Mature handling of alignment and memory access patterns
>> 3. Robust fallback mechanisms for unsupported architectures
>>
>> This change allows FFmpeg to benefit from automated SIMD optimizations
>> when built with -O3 optimization level, particularly improving
>> performance on x86_64 (AVX), ARM64 (SVE) and RISC-V(RVV) architectures.
>
> Those flags can only be enabled in tightly controlled environments (e.g., built
> and run on the same machine), while FFmpeg has hand written assembly, runtime
> cpu probe and dynamic binding/dispatch.
>
> Those auto-vectorization and ARCH flags can be enabled manually, but be careful.

Thank you for pointing this out. I am using x64 AVX2 and RISC-V RVV, and I
enable the vector features with -O3 -mavx (-march=rv64gcv for RISC-V).
configure then adds the `-fno-tree-vectorize` option automatically.

The resulting code still contains vector load/store instructions, but no
vector arithmetic operations.

GCC provides this explicit option to control whether vectorized instructions
are generated, so it is fine to use -O3 without auto-vectorization.
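
For anyone who wants to see the effect of the flag in isolation, a small loop like the one below can be compiled both ways. This is a generic example, not taken from FFmpeg; the function name saxpy and the file name used in the note are arbitrary.

```c
#include <stddef.h>

/*
 * A simple loop that GCC can typically auto-vectorize at -O3.
 * The restrict qualifiers tell the compiler that x and y do not overlap.
 */
void saxpy(float *restrict y, const float *restrict x, float a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```

Compiling with gcc -O3 -fopt-info-vec -c saxpy.c should report the loop as vectorized, while adding -fno-tree-vectorize should make that report disappear. Comparing the two assembly outputs (gcc -S) shows which instructions actually change.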




>> [1]
>> https://git.ffmpeg.org/gitweb/ffmpeg.git/commit/973859f5230e77beea7bb59dc081870689d6d191
>>
>> ---
>> configure | 1 -
>> 1 file changed, 1 deletion(-)
>>
>> diff --git a/configure b/configure
>> index 3730b0524c..b9e95ce4ec 100755
>> --- a/configure
>> +++ b/configure
>> @@ -7656,7 +7656,6 @@ if enabled icc; then
>>  disable aligned_stack
>>  fi
>>  elif enabled gcc; then
>> -check_optflags -fno-tree-vectorize
>>  check_cflags -Werror=format-security
>>  check_cflags -Werror=implicit-function-declaration
>>  check_cflags -Werror=missing-prototypes
>> --
>> 2.43.0

Re: [FFmpeg-devel] gcc: Remove auto-vectorization limitation.

2025-05-21 Thread Jiawei



On 2025/5/21 15:46, Michael Niedermayer wrote:

> On Wed, May 21, 2025 at 02:17:50PM +0800, Jiawei wrote:
>> This patch modifies the FFmpeg build system to remove the explicit disabling
>> of GCC's auto-vectorization feature.
>>
>> Modern GCC versions (>= 10.0) have demonstrated stable auto-vectorization
>> capabilities through extensive optimizations in loop analysis and SIMD
>> code generation. The explicit -fno-tree-vectorize flag originally added
>> in commit 973859f (2009) to workaround early GCC vectorization instability
>> is no longer necessary.
>>
>> Key improvements justifying this change:
>> 1. Enhanced heuristics for loop vectorization cost models
>> 2. Mature handling of alignment and memory access patterns
>> 3. Robust fallback mechanisms for unsupported architectures
>>
>> This change allows FFmpeg to benefit from automated SIMD optimizations
>> when built with -O3 optimization level, particularly improving
>> performance on x86_64 (AVX), ARM64 (SVE) and RISC-V(RVV) architectures.
>>
>> [1]
>> https://git.ffmpeg.org/gitweb/ffmpeg.git/commit/973859f5230e77beea7bb59dc081870689d6d191
>>
>> ---
>>  configure | 1 -
>>  1 file changed, 1 deletion(-)
>>
>> diff --git a/configure b/configure
>> index 3730b0524c..b9e95ce4ec 100755
>> --- a/configure
>> +++ b/configure
>> @@ -7656,7 +7656,6 @@ if enabled icc; then
>>  disable aligned_stack
>>  fi
>>  elif enabled gcc; then
>> -check_optflags -fno-tree-vectorize
>>  check_cflags -Werror=format-security
>>  check_cflags -Werror=implicit-function-declaration
>>  check_cflags -Werror=missing-prototypes
>
> Your text speaks about this change being ok in a gcc version dependant
> way
>
> Your patch has no gcc version dependancy
>
> If you claim that all issues where solved, please show the issues happening
> in version v and no longer happening in w>v . Then it make sense to
> change the flags for version w
>
> Thx
> [...]



Sorry, I forgot about that; thanks for reminding me. There are still many
users of older GCC versions, and I am not sure how this would affect them.

Maybe a check for recent GCC versions, such as GCC 13 to 15, would be good.
What do you think?


  



Re: [FFmpeg-devel] gcc: Remove auto-vectorization limitation.

2025-05-21 Thread Andreas Rheinhardt

Jiawei:
> This patch modifies the FFmpeg build system to remove the explicit disabling
> of GCC's auto-vectorization feature.
> 
> Modern GCC versions (>= 10.0) have demonstrated stable auto-vectorization
> capabilities through extensive optimizations in loop analysis and SIMD
> code generation. The explicit -fno-tree-vectorize flag originally added
> in commit 973859f (2009) to workaround early GCC vectorization instability
> is no longer necessary.
> 
> Key improvements justifying this change:
> 1. Enhanced heuristics for loop vectorization cost models
> 2. Mature handling of alignment and memory access patterns
> 3. Robust fallback mechanisms for unsupported architectures
> 
> This change allows FFmpeg to benefit from automated SIMD optimizations
> when built with -O3 optimization level, particularly improving
> performance on x86_64 (AVX), ARM64 (SVE) and RISC-V(RVV) architectures.
> 
> [1] 
> https://git.ffmpeg.org/gitweb/ffmpeg.git/commit/973859f5230e77beea7bb59dc081870689d6d191
> 
> ---
>  configure | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/configure b/configure
> index 3730b0524c..b9e95ce4ec 100755
> --- a/configure
> +++ b/configure
> @@ -7656,7 +7656,6 @@ if enabled icc; then
>  disable aligned_stack
>  fi
>  elif enabled gcc; then
> -check_optflags -fno-tree-vectorize
>  check_cflags -Werror=format-security
>  check_cflags -Werror=implicit-function-declaration
>  check_cflags -Werror=missing-prototypes

FYI: The last discussion about auto-vectorization is here:
https://ffmpeg.org/pipermail/ffmpeg-devel/2022-July/299405.html
It contains a report about a failing build with vectorization enabled:
https://ffmpeg.org/pipermail/ffmpeg-devel/2022-July/299421.html
I don't know whether this is still reproducible with the latest GCC.

- Andreas


Re: [FFmpeg-devel] Graphprint Patches Overview

2025-05-21 Thread Kieran Kunhya via ffmpeg-devel

On Wed, 21 May 2025, 01:45 softworkz ., 
wrote:

> Hello,
>
> thanks again to all for the patches. I figured it might be a bit difficult
> to keep track of what has already been submitted and fixed and is still
> pending, and I'm sorry that there has been some duplicate effort to fix the
> same things - so here's an overview. The ones with X are the ones I would
> like to apply eventually:
>
>
> Timo Rothenpieler
> https://patchwork.ffmpeg.org/project/ffmpeg/list/?series=14528
> (I would favor "ffbuild/commonmak" over 1/3)
>
>   [1/3] fftools/resources: fix preservation of intermediary resman build
> artifacts
> X [2/3] ffbuild: correctly silence and tag new css/html steps
> X [3/3] fftools/resources: add missing extensions to .gitignore
>
>
>
> Mark Thompson (already merged)
> https://patchwork.ffmpeg.org/project/ffmpeg/list/?series=14537
>
> X [1/3] ffmpeg: Don't print graphs if there are no graphs to print
> X [2/3] fftools/graphprint: Fix leak of graphprint object
> X [3/3] fftools/graphprint: Fix leak of graph section header string
>
>
> softworkz
> https://patchwork.ffmpeg.org/project/ffmpeg/list/?series=14563
>
> X [1/5] fftools/makefile: Remove resources from ffprobe
> X [2/5] fftools/resources: Use .SECONDARY in Makefile comment
> X [3/5] fftools/ffmpeg: Free print_graph option variables
> X [4/5] fftools/graphprint: Fix memory leaks
> X [5/5] fftools/tf_mermaid: Add missing uninit and fix leaks
>
> https://patchwork.ffmpeg.org/project/ffmpeg/list/?series=14570
> X [v3] ffbuild/commonmak: Fix rebuild check with implicit rule chains
>
>
> Derek Buitenhuis
> https://patchwork.ffmpeg.org/project/ffmpeg/list/?series=14569
> (1/3 and 2/3 correspond to 2/3 from Timo, and 3/3 doesn't fix
> the rebuild check like "commonmak" above does)
>
> [1/3] ffbuild/common: Remove what appears to be a temporary debugging
> comment
> [2/3] ffbuild/common: Properly tag/suppress sed command
> [3/3] fftools/resoirces: Mark .css.min and .css.min.gz as NOTINTERMEDIATE
>
>
> Thanks again,
> sw
>

Can we just revert the whole set until it's cleaned up properly?

There are more patches to fix issues than the set itself. This is
understandable if it's a big architectural change like threading, but it's
not.

Kieran


Re: [FFmpeg-devel] gcc: Remove auto-vectorization limitation.

2025-05-21 Thread Michael Niedermayer

On Wed, May 21, 2025 at 06:32:49PM +0800, Jiawei wrote:
> 
> On 2025/5/21 15:46, Michael Niedermayer wrote:
> > On Wed, May 21, 2025 at 02:17:50PM +0800, Jiawei wrote:
> > > > This patch modifies the FFmpeg build system to remove the explicit
> > > > disabling of GCC's auto-vectorization feature.
> > > 
> > > Modern GCC versions (>= 10.0) have demonstrated stable auto-vectorization
> > > capabilities through extensive optimizations in loop analysis and SIMD
> > > code generation. The explicit -fno-tree-vectorize flag originally added
> > > in commit 973859f (2009) to workaround early GCC vectorization instability
> > > is no longer necessary.
> > > 
> > > Key improvements justifying this change:
> > > 1. Enhanced heuristics for loop vectorization cost models
> > > 2. Mature handling of alignment and memory access patterns
> > > 3. Robust fallback mechanisms for unsupported architectures
> > > 
> > > This change allows FFmpeg to benefit from automated SIMD optimizations
> > > when built with -O3 optimization level, particularly improving
> > > performance on x86_64 (AVX), ARM64 (SVE) and RISC-V(RVV) architectures.
> > > 
> > > [1] 
> > > https://git.ffmpeg.org/gitweb/ffmpeg.git/commit/973859f5230e77beea7bb59dc081870689d6d191
> > > 
> > > ---
> > >   configure | 1 -
> > >   1 file changed, 1 deletion(-)
> > > 
> > > diff --git a/configure b/configure
> > > index 3730b0524c..b9e95ce4ec 100755
> > > --- a/configure
> > > +++ b/configure
> > > @@ -7656,7 +7656,6 @@ if enabled icc; then
> > >   disable aligned_stack
> > >   fi
> > >   elif enabled gcc; then
> > > -check_optflags -fno-tree-vectorize
> > >   check_cflags -Werror=format-security
> > >   check_cflags -Werror=implicit-function-declaration
> > >   check_cflags -Werror=missing-prototypes
> > > Your text speaks about this change being ok in a gcc version dependent
> > > way
> > > 
> > > Your patch has no gcc version dependency
> > > 
> > > If you claim that all issues were solved, please show the issues happening
> > > in version v and no longer happening in w>v. Then it makes sense to
> > > change the flags for version w
> > 
> > Thx
> > [...]
> 
> 
> Sorry, I forgot about that; thanks for reminding me. There are still many
> users of old GCC versions, and I am not sure how this will affect them.
> 
> Maybe checking for a later GCC version would be good, like GCC 13-15. What do
> you think?

I cannot speak about gcc versions; I know little more about them than I know
about numbers from a dice throw.

But if we can turn on optimizations and make the code faster without breaking
anything, I am in favor of that. It's just that I cannot answer the question of
what checks, what exact version, or what other special limitation may be needed.
You would have to verify that the issues people encountered previously
no longer affect version XY and then put an XY check in the patch.
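
As a rough illustration of what such a version check could look like, here is
a minimal shell sketch. It is not taken from any posted patch: the helper name
needs_no_tree_vectorize and the threshold MIN_VECTORIZE_MAJOR=13 are
hypothetical placeholders, and the real cutoff would have to come from the
verification described above. Only check_optflags and -fno-tree-vectorize
appear in the actual diff.

```shell
#!/bin/sh
# Hypothetical sketch, not part of FFmpeg's configure.
# MIN_VECTORIZE_MAJOR is an assumed placeholder; the real value would have
# to be established by testing the previously reported issues.
MIN_VECTORIZE_MAJOR=13

# Succeeds (returns 0) when auto-vectorization should stay disabled for the
# given GCC version string, e.g. "9.4.0" or "13".
needs_no_tree_vectorize() {
    major=${1%%.*}                      # keep only the part before the first dot
    case "$major" in
        ''|*[!0-9]*) return 0 ;;        # unknown version: stay conservative
    esac
    [ "$major" -lt "$MIN_VECTORIZE_MAJOR" ]
}

# Small demonstration with fixed version strings.
for v in 9.4.0 13.2.0; do
    if needs_no_tree_vectorize "$v"; then
        echo "gcc $v: keep -fno-tree-vectorize"
    else
        echo "gcc $v: allow auto-vectorization"
    fi
done

# Inside configure, the gcc branch might then read something like this
# (assuming a compiler variable such as $cc is available at that point):
#   needs_no_tree_vectorize "$($cc -dumpversion)" &&
#       check_optflags -fno-tree-vectorize
```

Treating an unparsable version as "old" keeps the current behaviour for any
compiler the check does not understand, which seems like the safer default.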

thx

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The greatest way to live with honor in this world is to be what we pretend
to be. -- Socrates


