date:20250519

[FFmpeg-devel] Follow-Up on ML Submissions via GitHub/FFstaging

2025-05-19 Thread softworkz .

Hello everybody,


this is a follow-up to the recent discussion on IRC 
(https://libera.catirclogs.org/ffmpeg-devel/2025-05-16) about patch submission 
via https://github.com/ffstaging/FFmpeg. 


It involved preposterous statements such as:

"you've been refusing to use proper submission procedures, tried to introduce 
your own"

and raised some other questions that I would like to address:


What is it?
===

The implementation is adapted from GitGitGadget (https://gitgitgadget.github.io)
for FFmpeg without functional changes. GitGitGadget is developed by Git 
developers.

Git development goes through a mailing list, just like FFmpeg. Instead of using
git send-email directly, developers can send patches via GGG on GitHub, which is
also advertised in their documentation
(https://git-scm.com/docs/SubmittingPatches
https://git-scm.com/docs/MyFirstContribution).

So, for anybody who's trying to depict the ffstaging submissions as something
different or bad, we can say for sure:

- If there were anything wrong with it, then the Git developers themselves
  would be the very first people in the world to know about it

- The way it works is the way the Git developers consider to be correct



Long E-Mail Threads
===

GGG runs format-patch with --thread=shallow (and sends the e-mails itself),
which gives the same result as using git send-email (where the default
is --thread=shallow).
As a result, each patch e-mail shows up as a reply to the cover letter (0/X).

But additionally, it specifies the --in-reply-to parameter, pointing to
the cover letter of one of the previous revisions.
This chaining is essentially what causes the threads to become huge.
It ties all revisions together. When looking at the Git docs for developers
(https://git-scm.com/docs/MyFirstContribution#v2-git-send-email), we can
see that this is intentional: According to the docs for manual submissions,
it's a requirement to do this when sending new revisions of a patchset.


As that is not a requirement for FFmpeg, I have removed this behavior now!



=

Other comments were:

"kludgy github submission on your end where you dump huge patches"

"it highly encourages huge patchsets"



Does it cause large Patchsets?
==

It is not quite clear how one might come to think that. Maybe the
idea is that it requires always pushing additional commits to
GitHub and that this makes the patchset longer with each revision?

=> Of course not. 
The patchset grew from 10 to 15 because of review comments
like "this should be in a separate commit" - so I separated them.
The last commit of the patchset has always been the last one, since v1.


Or is the idea perhaps that it would make it easier to deal with 
larger patchsets?

=> First of all, that would be great and nothing bad - but no!
When using git send-email, you are running a single command. What's
tedious is writing that command, but it doesn't matter whether
it's 3 or 30 commits. The same goes for the GitHub path, you just don't
need to write the command.


Finally: The way I'm working doesn't depend on the submission
procedure. Even if I had to print out everything on
paper and send it by postal mail, that wouldn't make any
difference. No idea how someone could think it would be
any different.



And for this comment:

"Even if it posted it as a new thread each iteration, the general
problem is that it causes too many full iterations still remains"


=> I'm not sure how it could "cause too many full iterations"..

If it's about "many":

The number of iterations I send is not dependent on which tooling
I use. Just because I find submissions via CLI tedious doesn't mean
that I would do it differently.

If it's about "full":

I've never sent new versions of only individual patches from a set,
and I don't intend to. Here's why:


I rarely have patchsets which are just a collection of independent
patches. But if it turns out that there is one patch in such a
set that needs more discussion or review, then I'd rather remove it
from the set and resubmit it separately.

In all other cases, where patches depend on each other, I expect
each revision I send to be the final one
(except the first few in a complex set). If I didn't think
that, I normally wouldn't send it (or I'd mark it as RFC). The final
revision should be all-green on Patchwork and you can't get that
with just partial patches sent. That's why I don't see much point
in this, and I also see it done by others only very rarely.

Yet I'll try to send fewer revisions in the future.

Thanks
sw








___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH] avcodec/rv60dec: Avoid branch when decoding cbp16

2025-05-19 Thread Peter Ross

On Mon, May 19, 2025 at 12:06:02AM +0200, Andreas Rheinhardt wrote:
> Patch attached.
> 
> - Andreas

> From 02724d5792348bea618c049034dc0febf24a46ac Mon Sep 17 00:00:00 2001
> From: Andreas Rheinhardt 
> Date: Sun, 18 May 2025 23:12:03 +0200
> Subject: [PATCH] avcodec/rv60dec: Avoid branch when decoding cbp16
> 
> Signed-off-by: Andreas Rheinhardt 
> ---
>  libavcodec/rv60dec.c | 11 ---
>  1 file changed, 4 insertions(+), 7 deletions(-)
> 
> diff --git a/libavcodec/rv60dec.c b/libavcodec/rv60dec.c
> index d704ae512c..2bbcb1d620 100644
> --- a/libavcodec/rv60dec.c
> +++ b/libavcodec/rv60dec.c
> @@ -82,7 +82,7 @@ enum {
>  };
>  
>  static const VLCElem * cbp8_vlc[7][4];
> -static const VLCElem * cbp16_vlc[7][3][4];
> +static const VLCElem * cbp16_vlc[7][4][4];
>  
>  typedef struct {
>  const VLCElem * l0[2];
> @@ -137,12 +137,12 @@ static av_cold void rv60_init_static_data(void)
>  
>  for (int i = 0; i < 7; i++)
>  for (int j = 0; j < 4; j++)
> -cbp8_vlc[i][j] = gen_vlc(rv60_cbp8_lens[i][j], 64, &state);
> +cbp16_vlc[i][0][j] = cbp8_vlc[i][j] = 
> gen_vlc(rv60_cbp8_lens[i][j], 64, &state);
>  
>  for (int i = 0; i < 7; i++)
>  for (int j = 0; j < 3; j++)
>  for (int k = 0; k < 4; k++)
> -cbp16_vlc[i][j][k] = gen_vlc(rv60_cbp16_lens[i][j][k], 64, 
> &state);
> +cbp16_vlc[i][j + 1][k] = gen_vlc(rv60_cbp16_lens[i][j][k], 
> 64, &state);
>  
>  build_coeff_vlc(rv60_intra_lens, intra_coeff_vlc, 5, &state);
>  build_coeff_vlc(rv60_inter_lens, inter_coeff_vlc, 7, &state);
> @@ -1650,10 +1650,7 @@ static int decode_super_cbp(GetBitContext * gb, const 
> VLCElem * vlc[4])
>  static int decode_cbp16(GetBitContext * gb, int subset, int qp)
>  {
>  int cb_set = rv60_qp_to_idx[qp];
> -if (!subset)
> -return decode_super_cbp(gb, cbp8_vlc[cb_set]);
> -else
> -return decode_super_cbp(gb, cbp16_vlc[cb_set][subset - 1]);
> +return decode_super_cbp(gb, cbp16_vlc[cb_set][subset]);
>  }
>  
>  static int decode_cu_r(RV60Context * s, AVFrame * frame, ThreadContext * 
> thread, GetBitContext * gb, int xpos, int ypos, int log_size, int qp, int 
> sel_qp)
> -- 
> 2.45.2

Looks okay. What was the motivation for this change? A speed-up? Any numbers?

-- Peter
(A907 E02F A6E5 0CD2 34CD 20D2 6760 79C5 AC40 DD6B)



[FFmpeg-devel] [PATCH 1/1] [ffmpeg-deve] avcodec/mpegaudiodec optimizing code size

2025-05-19 Thread chenyu202107

From: chenyu 

Optimizing 160k code size by converting static array to dynamic malloc memory.

Signed-off-by: chenyu 
---
 libavcodec/mpegaudiodata.h|  4 ++--
 libavcodec/mpegaudiodec_common_tablegen.h | 10 --
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/libavcodec/mpegaudiodata.h b/libavcodec/mpegaudiodata.h
index 720c4bee64..6dfb74cd01 100644
--- a/libavcodec/mpegaudiodata.h
+++ b/libavcodec/mpegaudiodata.h
@@ -50,8 +50,8 @@ extern const unsigned char * const ff_mpa_alloc_tables[5];
 extern const int8_t   ff_table_4_3_exp  [TABLE_4_3_SIZE];
 extern const uint32_t ff_table_4_3_value[TABLE_4_3_SIZE];
 #else
-extern int8_t   ff_table_4_3_exp  [TABLE_4_3_SIZE];
-extern uint32_t ff_table_4_3_value[TABLE_4_3_SIZE];
+extern int8_t   *ff_table_4_3_exp;
+extern uint32_t *ff_table_4_3_value;
 #endif
 
 /* VLCs for decoding layer 3 huffman tables */
diff --git a/libavcodec/mpegaudiodec_common_tablegen.h 
b/libavcodec/mpegaudiodec_common_tablegen.h
index bf402c9d84..66e93df27f 100644
--- a/libavcodec/mpegaudiodec_common_tablegen.h
+++ b/libavcodec/mpegaudiodec_common_tablegen.h
@@ -34,9 +34,10 @@
 #else
 #include 
 #include "libavutil/attributes.h"
+#include "libavutil/mem.h"
 
-int8_t   ff_table_4_3_exp  [TABLE_4_3_SIZE];
-uint32_t ff_table_4_3_value[TABLE_4_3_SIZE];
+int8_t   *ff_table_4_3_exp;
+uint32_t *ff_table_4_3_value;
 
 #define FRAC_BITS 23
 #define IMDCT_SCALAR 1.759
@@ -51,6 +52,11 @@ static av_cold void mpegaudiodec_common_tableinit(void)
 };
 double pow43_val = 0;
 
+#if !CONFIG_HARDCODED_TABLES
+ff_table_4_3_exp = (int8_t *)av_calloc(TABLE_4_3_SIZE, sizeof(int8_t));
+ff_table_4_3_value = (uint32_t *)av_calloc(TABLE_4_3_SIZE, 
sizeof(uint32_t));
+#endif
+
 for (int i = 1; i < TABLE_4_3_SIZE; i++) {
 double f, fm;
 int e, m;
-- 
2.34.1


Re: [FFmpeg-devel] [PATCH 1/1] [ffmpeg-deve] avcodec/mpegaudiodec optimizing code size

2025-05-19 Thread Andreas Rheinhardt

chenyu202...@gmail.com:
> From: chenyu 
> 
> Optimizing 160k code size by converting static array to dynamic malloc memory.
> 
> Signed-off-by: chenyu 
> ---
>  libavcodec/mpegaudiodata.h|  4 ++--
>  libavcodec/mpegaudiodec_common_tablegen.h | 10 --
>  2 files changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/libavcodec/mpegaudiodata.h b/libavcodec/mpegaudiodata.h
> index 720c4bee64..6dfb74cd01 100644
> --- a/libavcodec/mpegaudiodata.h
> +++ b/libavcodec/mpegaudiodata.h
> @@ -50,8 +50,8 @@ extern const unsigned char * const ff_mpa_alloc_tables[5];
>  extern const int8_t   ff_table_4_3_exp  [TABLE_4_3_SIZE];
>  extern const uint32_t ff_table_4_3_value[TABLE_4_3_SIZE];
>  #else
> -extern int8_t   ff_table_4_3_exp  [TABLE_4_3_SIZE];
> -extern uint32_t ff_table_4_3_value[TABLE_4_3_SIZE];
> +extern int8_t   *ff_table_4_3_exp;
> +extern uint32_t *ff_table_4_3_value;
>  #endif
>  
>  /* VLCs for decoding layer 3 huffman tables */
> diff --git a/libavcodec/mpegaudiodec_common_tablegen.h 
> b/libavcodec/mpegaudiodec_common_tablegen.h
> index bf402c9d84..66e93df27f 100644
> --- a/libavcodec/mpegaudiodec_common_tablegen.h
> +++ b/libavcodec/mpegaudiodec_common_tablegen.h
> @@ -34,9 +34,10 @@
>  #else
>  #include 
>  #include "libavutil/attributes.h"
> +#include "libavutil/mem.h"
>  
> -int8_t   ff_table_4_3_exp  [TABLE_4_3_SIZE];
> -uint32_t ff_table_4_3_value[TABLE_4_3_SIZE];
> +int8_t   *ff_table_4_3_exp;
> +uint32_t *ff_table_4_3_value;
>  
>  #define FRAC_BITS 23
>  #define IMDCT_SCALAR 1.759
> @@ -51,6 +52,11 @@ static av_cold void mpegaudiodec_common_tableinit(void)
>  };
>  double pow43_val = 0;
>  
> +#if !CONFIG_HARDCODED_TABLES
> +ff_table_4_3_exp = (int8_t *)av_calloc(TABLE_4_3_SIZE, sizeof(int8_t));
> +ff_table_4_3_value = (uint32_t *)av_calloc(TABLE_4_3_SIZE, 
> sizeof(uint32_t));
> +#endif
> +
>  for (int i = 1; i < TABLE_4_3_SIZE; i++) {
>  double f, fm;
>  int e, m;

This does not improve "code size" at all; after all you are actually
adding code (and an indirection) with your allocation. If one interprets
"code size" as "size of the binary" (instead of just .text), then this
patch still doesn't make sense, because the tables are currently in .bss
which does not increase the size of the binaries (that's at least the
case for common systems, is it different for you?).
And of course you are missing the error check (in fact, you have no way
to return an error from init functions like this).

- Andreas


Re: [FFmpeg-devel] [PATCH v1] lavc/vvc: Validate num_signalled_palette_entries

2025-05-19 Thread Nuo Mi

On Sun, May 18, 2025 at 2:51 PM Frank Plowman  wrote:

> On 18/05/2025 02:42, Nuo Mi wrote:
> > Hi Frank,
> > 👍, your fuzzing infrastructure caught this issue as well.
> > How about this:
> >
> https://patchwork.ffmpeg.org/project/ffmpeg/patch/20250517055150.807683-1-nuomi2...@gmail.com/
>
> Sorry, I missed this.  Your patch looks good to me: probably preferable
> in that it also validates the predicted size by the looks of it
> (although I don't have a stream which exercises this aspect).
>
Hi Frank,
Not needed — we get predictor_size from lc->ep->pp[c].size.
lc->ep->pp[i].size is protected by max_predictor, which is smaller than
VVC_MAX_NUM_PALETTE_PREDICTOR_SIZE.

>
> >
> > On Sun, May 18, 2025 at 5:05 AM Frank Plowman 
> wrote:
> >
> >> "The value of CurrentPaletteSize[ startComp ] shall be in the range of 0
> >> to maxNumPaletteEntries, inclusive."
> >>
> >> Signed-off-by: Frank Plowman 
> >> ---
> >>  libavcodec/vvc/ctu.c | 14 +++---
> >>  1 file changed, 11 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/libavcodec/vvc/ctu.c b/libavcodec/vvc/ctu.c
> >> index 62c9d4f5c0..70800ba5fa 100644
> >> --- a/libavcodec/vvc/ctu.c
> >> +++ b/libavcodec/vvc/ctu.c
> >> @@ -20,6 +20,7 @@
> >>   * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
> >> 02110-1301 USA
> >>   */
> >>
> >> +#include "libavutil/error.h"
> >>  #include "libavutil/refstruct.h"
> >>
> >>  #include "cabac.h"
> >> @@ -1873,7 +1874,7 @@ static void palette_predicted(VVCLocalContext *lc,
> >> const bool local_dual_tree, i
> >>  cu->plt[c].size = nb_predicted;
> >>  }
> >>
> >> -static void palette_signaled(VVCLocalContext *lc, const bool
> >> local_dual_tree,
> >> +static int palette_signaled(VVCLocalContext *lc, const bool
> >> local_dual_tree,
> >>  const int start, const int end, const int max_entries)
> >>  {
> >>  const VVCSPS *sps = lc->fc->ps.sps;
> >> @@ -1883,6 +1884,9 @@ static void palette_signaled(VVCLocalContext *lc,
> >> const bool local_dual_tree,
> >>  const int size= nb_predicted + nb_signaled;
> >>  const bool dual_tree_luma = local_dual_tree && cu->tree_type ==
> >> DUAL_TREE_LUMA;
> >>
> >> +if (size > max_entries)
> >> +return AVERROR_INVALIDDATA;
> >> +
> >>  for (int c = start; c < end; c++) {
> >>  Palette *plt = cu->plt + c;
> >>  for (int i = nb_predicted; i < size; i++) {
> >> @@ -1894,6 +1898,8 @@ static void palette_signaled(VVCLocalContext *lc,
> >> const bool local_dual_tree,
> >>  }
> >>  plt->size = size;
> >>  }
> >> +
> >> +return 0;
> >>  }
> >>
> >>  static void palette_update_predictor(VVCLocalContext *lc, const bool
> >> local_dual_tree, int start, int end,
> >> @@ -2070,7 +2076,7 @@ static int hls_palette_coding(VVCLocalContext *lc,
> >> const VVCTreeType tree_type)
> >>  int max_index = 0;
> >>  int prev_run_pos  = 0;
> >>
> >> -int predictor_size, start, end;
> >> +int predictor_size, start, end, ret;
> >>  bool reused[VVC_MAX_NUM_PALETTE_PREDICTOR_SIZE];
> >>  uint8_t run_type[MAX_PALETTE_CU_SIZE * MAX_PALETTE_CU_SIZE];
> >>  uint8_t index[MAX_PALETTE_CU_SIZE * MAX_PALETTE_CU_SIZE];
> >> @@ -2083,7 +2089,9 @@ static int hls_palette_coding(VVCLocalContext *lc,
> >> const VVCTreeType tree_type)
> >>  predictor_size = pp[start].size;
> >>  memset(reused, 0, sizeof(reused[0]) * predictor_size);
> >>  palette_predicted(lc, local_dual_tree, start, end, reused,
> >> predictor_size, max_entries);
> >> -palette_signaled(lc, local_dual_tree, start, end, max_entries);
> >> +ret = palette_signaled(lc, local_dual_tree, start, end,
> max_entries);
> >> +if (ret < 0)
> >> +return ret;
> >>  palette_update_predictor(lc, local_dual_tree, start, end, reused,
> >> predictor_size);
> >>
> >>  if (cu->plt[start].size > 0)
> >> --
> >> 2.47.0
> >>
> >>


Re: [FFmpeg-devel] [PATCH] configure: correct liboapv feature support

2025-05-19 Thread Gyan Doshi





On 2025-05-18 02:17 pm, Gyan Doshi wrote:

Only encoding support has been added for liboapv


Pushed as c55d65ac0a789c23192aa555d8c1da399cee71aa

Regards,
Gyan



---
  configure | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/configure b/configure
index 0609dac4ab..30063e6b34 100755
--- a/configure
+++ b/configure
@@ -249,7 +249,7 @@ External library support:
--enable-liblensfun  enable lensfun lens correction [no]
--enable-libmodplug  enable ModPlug via libmodplug [no]
--enable-libmp3lame  enable MP3 encoding via libmp3lame [no]
-  --enable-liboapv enable APV encoding/decoding via liboapv [no]
+  --enable-liboapv enable APV encoding via liboapv [no]
--enable-libopencore-amrnb enable AMR-NB de/encoding via libopencore-amrnb 
[no]
--enable-libopencore-amrwb enable AMR-WB decoding via libopencore-amrwb [no]
--enable-libopencv   enable video filtering via libopencv [no]



[FFmpeg-devel] [PATCH 1/1] avcodec/pcm: reduce code size

2025-05-19 Thread chenyu202107

From: chenyu 

add depends to pcm.c for reducing size when ALAW/MULAW/VIDC not defined

Signed-off-by: chenyu 
---
 libavcodec/pcm.c  | 36 +++-
 libavcodec/pcm_tablegen.h | 22 ++
 2 files changed, 53 insertions(+), 5 deletions(-)

diff --git a/libavcodec/pcm.c b/libavcodec/pcm.c
index bff61f2195..60a2c544a8 100644
--- a/libavcodec/pcm.c
+++ b/libavcodec/pcm.c
@@ -44,15 +44,20 @@ static av_cold int pcm_encode_init(AVCodecContext *avctx)
 #if !CONFIG_HARDCODED_TABLES
 switch (avctx->codec->id) {
 #define INIT_ONCE(id, name) \
-case AV_CODEC_ID_PCM_ ## id:\
-if (CONFIG_PCM_ ## id ## _ENCODER) {\
+case AV_CODEC_ID_PCM_ ## id: {  \
 static AVOnce init_static_once = AV_ONCE_INIT;  \
 ff_thread_once(&init_static_once, pcm_ ## name ## _tableinit);  \
-}   \
-break
+break;  \
+}
+#if CONFIG_PCM_ALAW_DECODER || CONFIG_PCM_ALAW_ENCODER
 INIT_ONCE(ALAW,  alaw);
+#endif
+#if CONFIG_PCM_MULAW_DECODER || CONFIG_PCM_MULAW_ENCODER
 INIT_ONCE(MULAW, ulaw);
+#endif
+#if CONFIG_PCM_VIDC_DECODER || CONFIG_PCM_VIDC_ENCODER
 INIT_ONCE(VIDC,  vidc);
+#endif
 default:
 break;
 }
@@ -216,24 +221,30 @@ static int pcm_encode_frame(AVCodecContext *avctx, 
AVPacket *avpkt,
 bytestream_put_buffer(&dst, src, n * sample_size);
 }
 break;
+#if CONFIG_PCM_ALAW_DECODER || CONFIG_PCM_ALAW_ENCODER
 case AV_CODEC_ID_PCM_ALAW:
 for (; n > 0; n--) {
 v  = *samples++;
 *dst++ = linear_to_alaw[(v + 32768) >> 2];
 }
 break;
+#endif
+#if CONFIG_PCM_MULAW_DECODER || CONFIG_PCM_MULAW_ENCODER
 case AV_CODEC_ID_PCM_MULAW:
 for (; n > 0; n--) {
 v  = *samples++;
 *dst++ = linear_to_ulaw[(v + 32768) >> 2];
 }
 break;
+#endif
+#if CONFIG_PCM_VIDC_DECODER || CONFIG_PCM_VIDC_ENCODER
 case AV_CODEC_ID_PCM_VIDC:
 for (; n > 0; n--) {
 v  = *samples++;
 *dst++ = linear_to_vidc[(v + 32768) >> 2];
 }
 break;
+#endif
 default:
 return -1;
 }
@@ -327,20 +338,25 @@ static av_cold av_unused int 
pcm_lut_decode_init(AVCodecContext *avctx)
 PCMLUTDecode *s = avctx->priv_data;
 
 switch (avctx->codec_id) {
+#if CONFIG_PCM_ALAW_DECODER || CONFIG_PCM_ALAW_ENCODER
 case AV_CODEC_ID_PCM_ALAW:
 for (int i = 0; i < 256; i++)
 s->table[i] = alaw2linear(i);
 break;
+#endif
+#if CONFIG_PCM_MULAW_DECODER || CONFIG_PCM_MULAW_ENCODER
 case AV_CODEC_ID_PCM_MULAW:
 for (int i = 0; i < 256; i++)
 s->table[i] = ulaw2linear(i);
 break;
+#endif
+#if CONFIG_PCM_VIDC_DECODER || CONFIG_PCM_VIDC_ENCODER
 case AV_CODEC_ID_PCM_VIDC:
 for (int i = 0; i < 256; i++)
 s->table[i] = vidc2linear(i);
 break;
 }
-
+#endif
 avctx->sample_fmt = AV_SAMPLE_FMT_S16;
 s->base.sample_size = 1;
 
@@ -545,6 +561,9 @@ static int pcm_decode_frame(AVCodecContext *avctx, AVFrame 
*frame,
 bytestream_get_buffer(&src, samples, n * sample_size);
 }
 break;
+#if CONFIG_PCM_ALAW_DECODER || CONFIG_PCM_ALAW_ENCODER || \
+CONFIG_PCM_MULAW_DECODER || CONFIG_PCM_MULAW_ENCODER || \
+CONFIG_PCM_VIDC_DECODER || CONFIG_PCM_VIDC_ENCODER
 case AV_CODEC_ID_PCM_ALAW:
 case AV_CODEC_ID_PCM_MULAW:
 case AV_CODEC_ID_PCM_VIDC: {
@@ -555,6 +574,7 @@ static int pcm_decode_frame(AVCodecContext *avctx, AVFrame 
*frame,
 *samples_16++ = lut[*src++];
 break;
 }
+#endif
 case AV_CODEC_ID_PCM_LXF:
 {
 int i;
@@ -655,7 +675,9 @@ const FFCodec ff_ ## name_ ## _decoder = {  
\
  *   to the table in pcm_decode_init() as well. */
 //AV_CODEC_ID_*  pcm_* name
 //  AV_SAMPLE_FMT_*long name   
 DecodeContext   decode init func
+#if CONFIG_PCM_ALAW_DECODER || CONFIG_PCM_ALAW_ENCODER
 PCM_CODEC_EXT(ALAW, S16, alaw, "PCM A-law / G.711 A-law",  
 PCMLUTDecode,   pcm_lut_decode_init);
+#endif
 PCM_DEC_EXT  (F16LE,FLT, f16le,"PCM 16.8 floating point 
little-endian", PCMScaleDecode, pcm_scale_decode_init);
 PCM_DEC_EXT  (F24LE,FLT, f24le,"PCM 24.0 floating point 
little-endian", PCMScaleDecode, pcm_scale_decode_init);
 PCM_CODEC(F32BE,FLT, f32be,"PCM 32-bit floating point 
big-endian");
@@ -663,7 +685,9 @@ PCM_CODEC(F32LE,FLT, f32le,"PCM 32-bit 
floating point little
 PCM_CODE

Re: [FFmpeg-devel] [PATCH 1/1] avcodec/pcm: reduce code size

2025-05-19 Thread Andreas Rheinhardt

chenyu202...@gmail.com:
> From: chenyu 
> 
> add depends to pcm.c for reducing size when ALAW/MULAW/VIDC not defined
> 
> Signed-off-by: chenyu 
> ---
>  libavcodec/pcm.c  | 36 +++-
>  libavcodec/pcm_tablegen.h | 22 ++
>  2 files changed, 53 insertions(+), 5 deletions(-)
> 
> diff --git a/libavcodec/pcm.c b/libavcodec/pcm.c
> index bff61f2195..60a2c544a8 100644
> --- a/libavcodec/pcm.c
> +++ b/libavcodec/pcm.c
> @@ -44,15 +44,20 @@ static av_cold int pcm_encode_init(AVCodecContext *avctx)
>  #if !CONFIG_HARDCODED_TABLES
>  switch (avctx->codec->id) {
>  #define INIT_ONCE(id, name) \
> -case AV_CODEC_ID_PCM_ ## id:\
> -if (CONFIG_PCM_ ## id ## _ENCODER) {\
> +case AV_CODEC_ID_PCM_ ## id: {  \
>  static AVOnce init_static_once = AV_ONCE_INIT;  \
>  ff_thread_once(&init_static_once, pcm_ ## name ## _tableinit);  \
> -}   \
> -break
> +break;  \
> +}
> +#if CONFIG_PCM_ALAW_DECODER || CONFIG_PCM_ALAW_ENCODER
>  INIT_ONCE(ALAW,  alaw);
> +#endif
> +#if CONFIG_PCM_MULAW_DECODER || CONFIG_PCM_MULAW_ENCODER
>  INIT_ONCE(MULAW, ulaw);
> +#endif
> +#if CONFIG_PCM_VIDC_DECODER || CONFIG_PCM_VIDC_ENCODER
>  INIT_ONCE(VIDC,  vidc);
> +#endif

The macro already checks whether the relevant encoder is enabled; your
code meanwhile enables the code when the encoder or the decoder is
enabled, although only the encoders will ever use this.

>  default:
>  break;
>  }
> @@ -216,24 +221,30 @@ static int pcm_encode_frame(AVCodecContext *avctx, 
> AVPacket *avpkt,
>  bytestream_put_buffer(&dst, src, n * sample_size);
>  }
>  break;
> +#if CONFIG_PCM_ALAW_DECODER || CONFIG_PCM_ALAW_ENCODER
>  case AV_CODEC_ID_PCM_ALAW:
>  for (; n > 0; n--) {
>  v  = *samples++;
>  *dst++ = linear_to_alaw[(v + 32768) >> 2];
>  }
>  break;
> +#endif
> +#if CONFIG_PCM_MULAW_DECODER || CONFIG_PCM_MULAW_ENCODER
>  case AV_CODEC_ID_PCM_MULAW:
>  for (; n > 0; n--) {
>  v  = *samples++;
>  *dst++ = linear_to_ulaw[(v + 32768) >> 2];
>  }
>  break;
> +#endif
> +#if CONFIG_PCM_VIDC_DECODER || CONFIG_PCM_VIDC_ENCODER
>  case AV_CODEC_ID_PCM_VIDC:
>  for (; n > 0; n--) {
>  v  = *samples++;
>  *dst++ = linear_to_vidc[(v + 32768) >> 2];
>  }
>  break;
> +#endif
>  default:
>  return -1;
>  }
> @@ -327,20 +338,25 @@ static av_cold av_unused int 
> pcm_lut_decode_init(AVCodecContext *avctx)
>  PCMLUTDecode *s = avctx->priv_data;
>  
>  switch (avctx->codec_id) {
> +#if CONFIG_PCM_ALAW_DECODER || CONFIG_PCM_ALAW_ENCODER
>  case AV_CODEC_ID_PCM_ALAW:
>  for (int i = 0; i < 256; i++)
>  s->table[i] = alaw2linear(i);
>  break;
> +#endif
> +#if CONFIG_PCM_MULAW_DECODER || CONFIG_PCM_MULAW_ENCODER
>  case AV_CODEC_ID_PCM_MULAW:
>  for (int i = 0; i < 256; i++)
>  s->table[i] = ulaw2linear(i);
>  break;
> +#endif
> +#if CONFIG_PCM_VIDC_DECODER || CONFIG_PCM_VIDC_ENCODER
>  case AV_CODEC_ID_PCM_VIDC:
>  for (int i = 0; i < 256; i++)
>  s->table[i] = vidc2linear(i);
>  break;
>  }
> -
> +#endif

The encoder checks here are wrong; and the placement of the last endif
is wrong, too.

>  avctx->sample_fmt = AV_SAMPLE_FMT_S16;
>  s->base.sample_size = 1;
>  
> @@ -545,6 +561,9 @@ static int pcm_decode_frame(AVCodecContext *avctx, 
> AVFrame *frame,
>  bytestream_get_buffer(&src, samples, n * sample_size);
>  }
>  break;
> +#if CONFIG_PCM_ALAW_DECODER || CONFIG_PCM_ALAW_ENCODER || \
> +CONFIG_PCM_MULAW_DECODER || CONFIG_PCM_MULAW_ENCODER || \
> +CONFIG_PCM_VIDC_DECODER || CONFIG_PCM_VIDC_ENCODER

This should not check for the encoders at all, because this is in the
decoder.

>  case AV_CODEC_ID_PCM_ALAW:
>  case AV_CODEC_ID_PCM_MULAW:
>  case AV_CODEC_ID_PCM_VIDC: {
> @@ -555,6 +574,7 @@ static int pcm_decode_frame(AVCodecContext *avctx, 
> AVFrame *frame,
>  *samples_16++ = lut[*src++];
>  break;
>  }
> +#endif
>  case AV_CODEC_ID_PCM_LXF:
>  {
>  int i;
> @@ -655,7 +675,9 @@ const FFCodec ff_ ## name_ ## _decoder = {
>   \
>   *   to the table in pcm_decode_init() as well. */
>  //AV_CODEC_ID_*  pcm_* name
>  //  AV_SAMPLE_FMT_*long name 
>DecodeContext   decode init func
> +#if CONFIG_

Re: [FFmpeg-devel] [PATCH] aarch64: increase default alignment for functions and constants

2025-05-19 Thread Ramiro Polla

On Fri, May 16, 2025 at 8:03 AM Martin Storsjö  wrote:
> On Fri, 16 May 2025, Ramiro Polla wrote:
> > Use 16-byte alignment (align=4) instead of 4-byte (align=2) in the function 
> > and
> > const macros. This improves instruction fetch and NEON load performance on
> > modern AArch64 CPUs.
> > ---
> > libavutil/aarch64/asm.S | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
>
> Ok

Thanks. Pushed.

Re: [FFmpeg-devel] [PATCH v5 6/7] ogg/vorbis: implement header packet skip in chained ogg bitstreams.

2025-05-19 Thread Romain Beauxis

On Sat, May 17, 2025 at 17:07, Michael Niedermayer wrote:
>
> On Sat, May 17, 2025 at 01:10:26PM -0500, Romain Beauxis wrote:
> > On Tue, May 13, 2025 at 14:23, Michael Niedermayer wrote:
> > >
> > > On Fri, May 09, 2025 at 06:43:26PM -0500, Romain Beauxis wrote:
> > > > ---
> > > >  libavcodec/vorbisdec.c |  37 +
> > > >  libavformat/oggparsevorbis.c   | 174 +
> > > >  tests/ref/fate/ogg-vorbis-chained-meta.txt |   3 -
> > > >  3 files changed, 117 insertions(+), 97 deletions(-)
> > > >
> > > > diff --git a/libavcodec/vorbisdec.c b/libavcodec/vorbisdec.c
> > > > index a778dc6b58..f069ac6ab3 100644
> > > > --- a/libavcodec/vorbisdec.c
> > > > +++ b/libavcodec/vorbisdec.c
> > > > @@ -1776,39 +1776,17 @@ static int vorbis_decode_frame(AVCodecContext
> > *avctx, AVFrame *frame,
> > > >  GetBitContext *gb = &vc->gb;
> > > >  float *channel_ptrs[255];
> > > >  int i, len, ret;
> > > > +const int8_t *new_extradata;
> > > > +size_t new_extradata_size;
> > > >
> > > >  ff_dlog(NULL, "packet length %d \n", buf_size);
> > > >
> > > > -if (*buf == 1 && buf_size > 7) {
> > > > -if ((ret = init_get_bits8(gb, buf + 1, buf_size - 1)) < 0)
> > > > -return ret;
> > > > -
> > > > -vorbis_free(vc);
> > > > -if ((ret = vorbis_parse_id_hdr(vc))) {
> > > > -av_log(avctx, AV_LOG_ERROR, "Id header corrupt.\n");
> > > > -vorbis_free(vc);
> > > > -return ret;
> > > > -}
> > > > -
> > > > -av_channel_layout_uninit(&avctx->ch_layout);
> > > > -if (vc->audio_channels > 8) {
> > > > -avctx->ch_layout.order   = AV_CHANNEL_ORDER_UNSPEC;
> > > > -avctx->ch_layout.nb_channels = vc->audio_channels;
> > > > -} else {
> > > > -av_channel_layout_copy(&avctx->ch_layout,
> > &ff_vorbis_ch_layouts[vc->audio_channels - 1]);
> > > > -}
> > > > -
> > > > -avctx->sample_rate = vc->audio_samplerate;
> > > > -return buf_size;
> > > > -}
> > > > -
> > > > -if (*buf == 3 && buf_size > 7) {
> > > > -av_log(avctx, AV_LOG_DEBUG, "Ignoring comment header\n");
> > > > -return buf_size;
> > > > -}
> > > > +new_extradata = av_packet_get_side_data(avpkt,
> > AV_PKT_DATA_NEW_EXTRADATA,
> > > > +&new_extradata_size);
> > > >
> > > > -if (*buf == 5 && buf_size > 7 && vc->channel_residues &&
> > !vc->modes) {
> > > > -if ((ret = init_get_bits8(gb, buf + 1, buf_size - 1)) < 0)
> > > > +if (new_extradata && *new_extradata == 5 && new_extradata_size > 7
> > &&
> > > > +vc->channel_residues && !vc->modes) {
> > > > +if ((ret = init_get_bits8(gb, new_extradata + 1,
> > new_extradata_size - 1)) < 0)
> > > >  return ret;
> > > >
> > > >  if ((ret = vorbis_parse_setup_hdr(vc))) {
> > > > @@ -1816,7 +1794,6 @@ static int vorbis_decode_frame(AVCodecContext
> > *avctx, AVFrame *frame,
> > > >  vorbis_free(vc);
> > > >  return ret;
> > > >  }
> > > > -return buf_size;
> > > >  }
> > > >
> > > >  if (!vc->channel_residues || !vc->modes) {
> > > > diff --git a/libavformat/oggparsevorbis.c b/libavformat/oggparsevorbis.c
> > > > index 9f50ab9ffc..452728b54d 100644
> > > > --- a/libavformat/oggparsevorbis.c
> > > > +++ b/libavformat/oggparsevorbis.c
> > > > @@ -293,6 +293,62 @@ static int vorbis_update_metadata(AVFormatContext
> > *s, int idx)
> > > >  return ret;
> > > >  }
> > > >
> > > > +static int vorbis_parse_header(AVFormatContext *s, AVStream *st,
> > > > +   const uint8_t *p, unsigned int psize)
> > > > +{
> > > > +unsigned blocksize, bs0, bs1;
> > > > +int srate;
> > > > +int channels;
> > > > +
> > > > +if (psize != 30)
> > > > +return AVERROR_INVALIDDATA;
> > > > +
> > > > +p += 7; /* skip "\001vorbis" tag */
> > > > +
> > > > +if (bytestream_get_le32(&p) != 0) /* vorbis_version */
> > > > +return AVERROR_INVALIDDATA;
> > > > +
> > > > +channels = bytestream_get_byte(&p);
> > > > +if (st->codecpar->ch_layout.nb_channels &&
> > > > +channels != st->codecpar->ch_layout.nb_channels) {
> > > > +av_log(s, AV_LOG_ERROR, "Channel change is not supported\n");
> > > > +return AVERROR_PATCHWELCOME;
> > > > +}
> > > > +st->codecpar->ch_layout.nb_channels = channels;
> > > > +srate   = bytestream_get_le32(&p);
> > > > +p += 4; // skip maximum bitrate
> > > > +st->codecpar->bit_rate = bytestream_get_le32(&p); // nominal
> > bitrate
> > > > +p += 4; // skip minimum bitrate
> > > > +
> > > > +blocksize = bytestream_get_byte(&p);
> > > > +bs0   = blocksize & 15;
> > > > +bs1   = blocksize >> 4;
> > > > +
> > > > +if (bs0 > bs1)
> > > > +return AVERROR_INVALIDDATA;
> > > > +if (bs0 < 6 |

Re: [FFmpeg-devel] [PATCH] avcodec/x86/vp9: Add AVX-512ICL for 16x16 and 32x32 8bpc inverse transforms

2025-05-19 Thread Henrik Gramner via ffmpeg-devel

On Sat, May 17, 2025 at 12:59 AM Henrik Gramner  wrote:
>
> Placed in a new separate file as the existing combined MMX/SSE/AVX
> file is humongous and takes forever to assemble as is.
>
> This adds ~16 KiB of .text. The existing 8bpc asm is ~240 KiB, of which
> the corresponding AVX2 functions make up ~42 KiB.
>
> Tested to pass FATE on Linux and Windows.

Pushed.
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

[FFmpeg-devel] [PATCH v6 0/4] Remove chained ogg stream header packets from the demuxer

2025-05-19 Thread Romain Beauxis

## Changes since last revision:
* Patches for opus and flac have been committed.
* Split up code refactoring and new extradata mechanism.
* Added full header+setup as extradata passed down to the vorbis
  decoder.

Romain Beauxis (4):
  libavformat/oggdec.{c,h}: Add new_extradata, use it to pass extradata
to the next decoded packet.
  ogg/vorbis: factor out header processing logic.
  ogg/vorbis: implement header packet skip in chained ogg bitstreams.
  libavformat/oggdec.h: Change paket function documentation to return 1
on header packets only.

 libavcodec/vorbis_parser.h |  11 ++
 libavcodec/vorbisdec.c |  76 ++
 libavformat/oggdec.c   |  11 ++
 libavformat/oggdec.h   |   6 +-
 libavformat/oggparsevorbis.c   | 167 +++--
 tests/ref/fate/ogg-vorbis-chained-meta.txt |   3 -
 6 files changed, 191 insertions(+), 83 deletions(-)
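
For readers who want to try the identification-header checks outside of
FFmpeg: below is a minimal standalone C sketch of the validation that patch
2/4 factors out into vorbis_parse_header(). It is not the patch code.
VorbisIdInfo, rd_le32() and parse_vorbis_id_header() are hypothetical names,
and the check of the "\001vorbis" tag is an addition of this sketch (the
patch only skips those 7 bytes).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical result type; the patch writes into AVCodecParameters instead. */
typedef struct VorbisIdInfo {
    int      channels;
    uint32_t sample_rate;
    uint32_t nominal_bitrate;
    unsigned bs0, bs1;          /* log2 of the short/long block sizes */
} VorbisIdInfo;

/* Reads a 32-bit little-endian value. */
static uint32_t rd_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Returns 0 on success, -1 if the packet is not a valid identification header. */
static int parse_vorbis_id_header(const uint8_t *p, size_t size, VorbisIdInfo *out)
{
    assert(p && out);

    /* Tag check is an addition of this sketch. */
    if (size != 30 || memcmp(p, "\001vorbis", 7))
        return -1;
    p += 7;

    if (rd_le32(p) != 0)                    /* vorbis_version */
        return -1;

    out->channels        = p[4];
    out->sample_rate     = rd_le32(p + 5);
    /* p + 9: maximum bitrate, skipped */
    out->nominal_bitrate = rd_le32(p + 13);
    /* p + 17: minimum bitrate, skipped */

    out->bs0 = p[21] & 15;
    out->bs1 = p[21] >> 4;
    if (out->bs0 > out->bs1 || out->bs0 < 6 || out->bs1 > 13)
        return -1;

    if (p[22] != 1)                         /* framing_flag */
        return -1;

    return 0;
}
```

The 30-byte size follows from the layout: 7 (tag) + 4 (version) + 1
(channels) + 4 (sample rate) + 12 (three bitrates) + 1 (block sizes) + 1
(framing flag).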

-- 
2.39.5 (Apple Git-154)

[FFmpeg-devel] [PATCH v6 1/4] libavformat/oggdec.{c, h}: Add new_extradata, use it to pass extradata to the next decoded packet.

2025-05-19 Thread Romain Beauxis

---
 libavformat/oggdec.c | 11 +++
 libavformat/oggdec.h |  2 ++
 2 files changed, 13 insertions(+)
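
Note on the hand-off in the hunk below: av_packet_add_side_data() takes
ownership of the buffer on success, which is why the stream's pointer is
reset to NULL afterwards instead of being freed. A minimal sketch of that
pattern with mock types follows. SketchPacket, SketchStream,
sketch_add_side_data() and attach_pending_extradata() are hypothetical names
for illustration, not FFmpeg API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for AVPacket and struct ogg_stream. */
typedef struct SketchPacket {
    uint8_t *side_data;
    size_t   side_data_size;
} SketchPacket;

typedef struct SketchStream {
    uint8_t *new_extradata;
    size_t   new_extradata_size;
} SketchStream;

/* Mock "add side data" call: takes ownership of data on success only. */
static int sketch_add_side_data(SketchPacket *pkt, uint8_t *data, size_t size)
{
    if (pkt->side_data)
        return -1;              /* this mock holds a single entry */
    pkt->side_data      = data;
    pkt->side_data_size = size;
    return 0;
}

/* Moves pending extradata from the stream to the packet, if there is any. */
static int attach_pending_extradata(SketchStream *os, SketchPacket *pkt)
{
    int ret;

    assert(os && pkt);
    if (!os->new_extradata)
        return 0;

    ret = sketch_add_side_data(pkt, os->new_extradata, os->new_extradata_size);
    if (ret < 0)
        return ret;             /* the stream still owns the buffer */

    /* Ownership moved to the packet; clear the stream's copy of the pointer. */
    os->new_extradata      = NULL;
    os->new_extradata_size = 0;
    return 0;
}
```

On failure the stream keeps the buffer, so the existing
av_freep(&stream->new_extradata) in free_stream() still releases it.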

diff --git a/libavformat/oggdec.c b/libavformat/oggdec.c
index 5557eb4a14..cb77cdd994 100644
--- a/libavformat/oggdec.c
+++ b/libavformat/oggdec.c
@@ -77,6 +77,7 @@ static void free_stream(AVFormatContext *s, int i)
 
 av_freep(&stream->private);
 av_freep(&stream->new_metadata);
+av_freep(&stream->new_extradata);
 }
 
 //FIXME We could avoid some structure duplication
@@ -888,6 +889,16 @@ retry:
 os->new_metadata_size = 0;
 }
 
+if (os->new_extradata) {
+ret = av_packet_add_side_data(pkt, AV_PKT_DATA_NEW_EXTRADATA,
+  os->new_extradata, os->new_extradata_size);
+if (ret < 0)
+return ret;
+
+os->new_extradata = NULL;
+os->new_extradata_size = 0;
+}
+
 return psize;
 }
 
diff --git a/libavformat/oggdec.h b/libavformat/oggdec.h
index bc670d0f1e..5083de646c 100644
--- a/libavformat/oggdec.h
+++ b/libavformat/oggdec.h
@@ -94,6 +94,8 @@ struct ogg_stream {
 int end_trimming; ///< set the number of packets to drop from the end
 uint8_t *new_metadata;
 size_t new_metadata_size;
+uint8_t *new_extradata;
+size_t new_extradata_size;
 void *private;
 };
 
-- 
2.39.5 (Apple Git-154)

[FFmpeg-devel] [PATCH v6 3/4] ogg/vorbis: implement header packet skip in chained ogg bitstreams.

2025-05-19 Thread Romain Beauxis

---
 libavcodec/vorbis_parser.h | 11 
 libavcodec/vorbisdec.c | 76 +-
 libavformat/oggparsevorbis.c   | 63 +-
 tests/ref/fate/ogg-vorbis-chained-meta.txt |  3 -
 4 files changed, 116 insertions(+), 37 deletions(-)

diff --git a/libavcodec/vorbis_parser.h b/libavcodec/vorbis_parser.h
index 789932ac49..b176fe536c 100644
--- a/libavcodec/vorbis_parser.h
+++ b/libavcodec/vorbis_parser.h
@@ -30,6 +30,17 @@
 
 typedef struct AVVorbisParseContext AVVorbisParseContext;
 
+/**
+ * Used by the vorbis parser to pass new chained stream headers
+ * as extradata.
+ */
+typedef struct vorbis_new_extradata {
+uint8_t *header;
+size_t   header_size;
+uint8_t *setup;
+size_t   setup_size;
+} vorbis_new_extradata;
+
 /**
  * Allocate and initialize the Vorbis parser using headers in the extradata.
  */
diff --git a/libavcodec/vorbisdec.c b/libavcodec/vorbisdec.c
index adbd726183..c9bbc60b49 100644
--- a/libavcodec/vorbisdec.c
+++ b/libavcodec/vorbisdec.c
@@ -43,6 +43,7 @@
 #include "vorbis.h"
 #include "vorbisdsp.h"
 #include "vorbis_data.h"
+#include "vorbis_parser.h"
 #include "xiph.h"
 
 #define V_NB_BITS 8
@@ -1778,47 +1779,60 @@ static int vorbis_decode_frame(AVCodecContext *avctx, AVFrame *frame,
 GetBitContext *gb = &vc->gb;
 float *channel_ptrs[255];
 int i, len, ret;
+size_t new_extradata_size;
+vorbis_new_extradata *new_extradata;
+const uint8_t *header;
+const uint8_t *setup;
 
 ff_dlog(NULL, "packet length %d \n", buf_size);
 
-if (*buf == 1 && buf_size > 7) {
-if ((ret = init_get_bits8(gb, buf + 1, buf_size - 1)) < 0)
-return ret;
+new_extradata = (vorbis_new_extradata *)av_packet_get_side_data(
+avpkt, AV_PKT_DATA_NEW_EXTRADATA, &new_extradata_size);
 
-vorbis_free(vc);
-if ((ret = vorbis_parse_id_hdr(vc))) {
-av_log(avctx, AV_LOG_ERROR, "Id header corrupt.\n");
-vorbis_free(vc);
-return ret;
-}
+if (new_extradata) {
+header = new_extradata->header;
+setup = new_extradata->setup;
 
-av_channel_layout_uninit(&avctx->ch_layout);
-if (vc->audio_channels > 8) {
-avctx->ch_layout.order   = AV_CHANNEL_ORDER_UNSPEC;
-avctx->ch_layout.nb_channels = vc->audio_channels;
-} else {
-av_channel_layout_copy(&avctx->ch_layout, &ff_vorbis_ch_layouts[vc->audio_channels - 1]);
-}
+if (header && *header == 1 && new_extradata->header_size > 7) {
+if ((ret = init_get_bits8(
+gb, header + 1,
+new_extradata->header_size - 1)) < 0)
+return ret;
 
-avctx->sample_rate = vc->audio_samplerate;
-return buf_size;
-}
+vorbis_free(vc);
+if ((ret = vorbis_parse_id_hdr(vc))) {
+av_log(avctx, AV_LOG_ERROR, "Id header corrupt.\n");
+vorbis_free(vc);
+return ret;
+}
 
-if (*buf == 3 && buf_size > 7) {
-av_log(avctx, AV_LOG_DEBUG, "Ignoring comment header\n");
-return buf_size;
-}
+av_channel_layout_uninit(&avctx->ch_layout);
+if (vc->audio_channels > 8) {
+avctx->ch_layout.order   = AV_CHANNEL_ORDER_UNSPEC;
+avctx->ch_layout.nb_channels = vc->audio_channels;
+} else {
+av_channel_layout_copy(
+&avctx->ch_layout,
+&ff_vorbis_ch_layouts[vc->audio_channels - 1]);
+}
 
-if (*buf == 5 && buf_size > 7 && vc->channel_residues && !vc->modes) {
-if ((ret = init_get_bits8(gb, buf + 1, buf_size - 1)) < 0)
-return ret;
+avctx->sample_rate = vc->audio_samplerate;
+return buf_size;
+}
 
-if ((ret = vorbis_parse_setup_hdr(vc))) {
-av_log(avctx, AV_LOG_ERROR, "Setup header corrupt.\n");
-vorbis_free(vc);
-return ret;
+if (setup && *setup == 5 && new_extradata->setup_size > 7 &&
+vc->channel_residues && !vc->modes) {
+if ((ret = init_get_bits8(
+   gb, setup + 1,
+   new_extradata->setup_size - 1)) < 0)
+return ret;
+
+if ((ret = vorbis_parse_setup_hdr(vc))) {
+av_log(avctx, AV_LOG_ERROR, "Setup header corrupt.\n");
+vorbis_free(vc);
+return ret;
+}
 }
-return buf_size;
 }
 
 if (!vc->channel_residues || !vc->modes) {
diff --git a/libavformat/oggparsevorbis.c b/libavformat/oggparsevorbis.c
index 62cc2da6de..ee2e01f468 100644
--- a/libavformat/oggparsevorbis.c
+++ b/libavformat/oggparsevorbis.c
@@ -255,12 +255,19 @@ static void vorbis_cleanup(AVFormatContext *s, int idx)
 struct ogg *ogg = s->priv_data

[FFmpeg-devel] [PATCH v6 4/4] libavformat/oggdec.h: Change paket function documentation to return 1 on header packets only.

2025-05-19 Thread Romain Beauxis

---
 libavformat/oggdec.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libavformat/oggdec.h b/libavformat/oggdec.h
index 5083de646c..c15fbe738e 100644
--- a/libavformat/oggdec.h
+++ b/libavformat/oggdec.h
@@ -42,8 +42,8 @@ struct ogg_codec {
  * Attempt to process a packet as a data packet
  * @return < 0 (AVERROR) code or -1 on error
  * == 0 if the packet was a regular data packet.
- * == 0 or 1 if the packet was a header from a chained bitstream.
- *   (1 will cause the packet to be skiped in calling code (ogg_packet())
+ * == 1 if the packet was a header from a chained bitstream.
+ *This will cause the packet to be skiped in calling code (ogg_packet()
  */
 int (*packet)(AVFormatContext *, int);
 /**
-- 
2.39.5 (Apple Git-154)

[FFmpeg-devel] [PATCH v6 2/4] ogg/vorbis: factor out header processing logic.

2025-05-19 Thread Romain Beauxis

---
 libavformat/oggparsevorbis.c | 104 ---
 1 file changed, 60 insertions(+), 44 deletions(-)

diff --git a/libavformat/oggparsevorbis.c b/libavformat/oggparsevorbis.c
index 9f50ab9ffc..62cc2da6de 100644
--- a/libavformat/oggparsevorbis.c
+++ b/libavformat/oggparsevorbis.c
@@ -293,6 +293,62 @@ static int vorbis_update_metadata(AVFormatContext *s, int idx)
 return ret;
 }
 
+static int vorbis_parse_header(AVFormatContext *s, AVStream *st,
+   const uint8_t *p, unsigned int psize)
+{
+unsigned blocksize, bs0, bs1;
+int srate;
+int channels;
+
+if (psize != 30)
+return AVERROR_INVALIDDATA;
+
+p += 7; /* skip "\001vorbis" tag */
+
+if (bytestream_get_le32(&p) != 0) /* vorbis_version */
+return AVERROR_INVALIDDATA;
+
+channels = bytestream_get_byte(&p);
+if (st->codecpar->ch_layout.nb_channels &&
+channels != st->codecpar->ch_layout.nb_channels) {
+av_log(s, AV_LOG_ERROR, "Channel change is not supported\n");
+return AVERROR_PATCHWELCOME;
+}
+st->codecpar->ch_layout.nb_channels = channels;
+srate   = bytestream_get_le32(&p);
+p += 4; // skip maximum bitrate
+st->codecpar->bit_rate = bytestream_get_le32(&p); // nominal bitrate
+p += 4; // skip minimum bitrate
+
+blocksize = bytestream_get_byte(&p);
+bs0   = blocksize & 15;
+bs1   = blocksize >> 4;
+
+if (bs0 > bs1)
+return AVERROR_INVALIDDATA;
+if (bs0 < 6 || bs1 > 13)
+return AVERROR_INVALIDDATA;
+
+if (bytestream_get_byte(&p) != 1) /* framing_flag */
+return AVERROR_INVALIDDATA;
+
+st->codecpar->codec_type = AVMEDIA_TYPE_AUDIO;
+st->codecpar->codec_id   = AV_CODEC_ID_VORBIS;
+
+if (srate > 0) {
+if (st->codecpar->sample_rate &&
+srate != st->codecpar->sample_rate) {
+av_log(s, AV_LOG_ERROR, "Sample rate change is not supported\n");
+return AVERROR_PATCHWELCOME;
+}
+
+st->codecpar->sample_rate = srate;
+avpriv_set_pts_info(st, 64, 1, srate);
+}
+
+return 1;
+}
+
 static int vorbis_header(AVFormatContext *s, int idx)
 {
 struct ogg *ogg = s->priv_data;
@@ -329,50 +385,10 @@ static int vorbis_header(AVFormatContext *s, int idx)
 priv->packet[pkt_type >> 1] = av_memdup(os->buf + os->pstart, os->psize);
 if (!priv->packet[pkt_type >> 1])
 return AVERROR(ENOMEM);
-if (os->buf[os->pstart] == 1) {
-const uint8_t *p = os->buf + os->pstart + 7; /* skip "\001vorbis" tag */
-unsigned blocksize, bs0, bs1;
-int srate;
-int channels;
-
-if (os->psize != 30)
-return AVERROR_INVALIDDATA;
-
-if (bytestream_get_le32(&p) != 0) /* vorbis_version */
-return AVERROR_INVALIDDATA;
-
-channels = bytestream_get_byte(&p);
-if (st->codecpar->ch_layout.nb_channels &&
-channels != st->codecpar->ch_layout.nb_channels) {
-av_log(s, AV_LOG_ERROR, "Channel change is not supported\n");
-return AVERROR_PATCHWELCOME;
-}
-st->codecpar->ch_layout.nb_channels = channels;
-srate   = bytestream_get_le32(&p);
-p += 4; // skip maximum bitrate
-st->codecpar->bit_rate = bytestream_get_le32(&p); // nominal bitrate
-p += 4; // skip minimum bitrate
-
-blocksize = bytestream_get_byte(&p);
-bs0   = blocksize & 15;
-bs1   = blocksize >> 4;
-
-if (bs0 > bs1)
-return AVERROR_INVALIDDATA;
-if (bs0 < 6 || bs1 > 13)
-return AVERROR_INVALIDDATA;
-
-if (bytestream_get_byte(&p) != 1) /* framing_flag */
-return AVERROR_INVALIDDATA;
-
-st->codecpar->codec_type = AVMEDIA_TYPE_AUDIO;
-st->codecpar->codec_id   = AV_CODEC_ID_VORBIS;
-
-if (srate > 0) {
-st->codecpar->sample_rate = srate;
-avpriv_set_pts_info(st, 64, 1, srate);
-}
-} else if (os->buf[os->pstart] == 3) {
+if (pkt_type == 1)
+return vorbis_parse_header(s, st, os->buf + os->pstart, os->psize);
+
+if (pkt_type == 3) {
 if (vorbis_update_metadata(s, idx) >= 0 && priv->len[1] > 10) {
 unsigned new_len;
 
-- 
2.39.5 (Apple Git-154)

Re: [FFmpeg-devel] [PATCH v4 1/4] libavcodec/vc2enc: Split out common functions between software and hardware encoders

2025-05-19 Thread Andreas Rheinhardt

IndecisiveTurtle:
> From: IndecisiveTurtle 
> 
> ---
>  libavcodec/Makefile|   2 +-
>  libavcodec/vc2enc.c| 679 ++---
>  libavcodec/vc2enc_common.c | 571 +++
>  libavcodec/vc2enc_common.h | 178 ++
>  4 files changed, 772 insertions(+), 658 deletions(-)
>  create mode 100644 libavcodec/vc2enc_common.c
>  create mode 100644 libavcodec/vc2enc_common.h
> 
> diff --git a/libavcodec/Makefile b/libavcodec/Makefile
> index 77734dff24..bdf0d6742e 100644
> --- a/libavcodec/Makefile
> +++ b/libavcodec/Makefile
> @@ -771,7 +771,7 @@ OBJS-$(CONFIG_VC1_CUVID_DECODER)   += cuviddec.o
>  OBJS-$(CONFIG_VC1_MMAL_DECODER)+= mmaldec.o
>  OBJS-$(CONFIG_VC1_QSV_DECODER) += qsvdec.o
>  OBJS-$(CONFIG_VC1_V4L2M2M_DECODER) += v4l2_m2m_dec.o
> -OBJS-$(CONFIG_VC2_ENCODER) += vc2enc.o vc2enc_dwt.o diractab.o
> +OBJS-$(CONFIG_VC2_ENCODER) += vc2enc.o vc2enc_dwt.o 
> vc2enc_common.o diractab.o

Seems like this should be split into two lines

>  OBJS-$(CONFIG_VCR1_DECODER)+= vcr1.o
>  OBJS-$(CONFIG_VMDAUDIO_DECODER)+= vmdaudio.o
>  OBJS-$(CONFIG_VMDVIDEO_DECODER)+= vmdvideo.o
> diff --git a/libavcodec/vc2enc.c b/libavcodec/vc2enc.c
> index 99ca95c40a..939bafa195 100644
> --- a/libavcodec/vc2enc.c
> +++ b/libavcodec/vc2enc.c
> @@ -30,505 +30,11 @@
>  #include "put_bits.h"
>  #include "version.h"
>  
> -#include "vc2enc_dwt.h"
> -#include "diractab.h"
> -
> -/* The limited size resolution of each slice forces us to do this */
> -#define SSIZE_ROUND(b) (FFALIGN((b), s->size_scaler) + 4 + s->prefix_bytes)
> +#include "vc2enc_common.h"
>  
>  /* Decides the cutoff point in # of slices to distribute the leftover bytes 
> */
>  #define SLICE_REDIST_TOTAL 150
>  
> -typedef struct VC2BaseVideoFormat {
> -enum AVPixelFormat pix_fmt;
> -AVRational time_base;
> -int width, height;
> -uint8_t interlaced, level;
> -char name[13];
> -} VC2BaseVideoFormat;
> -
> -static const VC2BaseVideoFormat base_video_fmts[] = {
> -{ 0 }, /* Custom format, here just to make indexing equal to base_vf */
> -{ AV_PIX_FMT_YUV420P,   { 1001, 15000 },  176,  120, 0, 1, "QSIF525" 
> },
> -{ AV_PIX_FMT_YUV420P,   {2,25 },  176,  144, 0, 1, "QCIF"
> },
> -{ AV_PIX_FMT_YUV420P,   { 1001, 15000 },  352,  240, 0, 1, "SIF525"  
> },
> -{ AV_PIX_FMT_YUV420P,   {2,25 },  352,  288, 0, 1, "CIF" 
> },
> -{ AV_PIX_FMT_YUV420P,   { 1001, 15000 },  704,  480, 0, 1, "4SIF525" 
> },
> -{ AV_PIX_FMT_YUV420P,   {2,25 },  704,  576, 0, 1, "4CIF"
> },
> -
> -{ AV_PIX_FMT_YUV422P10, { 1001, 3 },  720,  480, 1, 2,   "SD480I-60" 
> },
> -{ AV_PIX_FMT_YUV422P10, {1,25 },  720,  576, 1, 2,   "SD576I-50" 
> },
> -
> -{ AV_PIX_FMT_YUV422P10, { 1001, 6 }, 1280,  720, 0, 3,  "HD720P-60"  
> },
> -{ AV_PIX_FMT_YUV422P10, {1,50 }, 1280,  720, 0, 3,  "HD720P-50"  
> },
> -{ AV_PIX_FMT_YUV422P10, { 1001, 3 }, 1920, 1080, 1, 3,  "HD1080I-60" 
> },
> -{ AV_PIX_FMT_YUV422P10, {1,25 }, 1920, 1080, 1, 3,  "HD1080I-50" 
> },
> -{ AV_PIX_FMT_YUV422P10, { 1001, 6 }, 1920, 1080, 0, 3,  "HD1080P-60" 
> },
> -{ AV_PIX_FMT_YUV422P10, {1,50 }, 1920, 1080, 0, 3,  "HD1080P-50" 
> },
> -
> -{ AV_PIX_FMT_YUV444P12, {1,24 }, 2048, 1080, 0, 4,"DC2K" 
> },
> -{ AV_PIX_FMT_YUV444P12, {1,24 }, 4096, 2160, 0, 5,"DC4K" 
> },
> -
> -{ AV_PIX_FMT_YUV422P10, { 1001, 6 }, 3840, 2160, 0, 6, "UHDTV 4K-60" 
> },
> -{ AV_PIX_FMT_YUV422P10, {1,50 }, 3840, 2160, 0, 6, "UHDTV 4K-50" 
> },
> -
> -{ AV_PIX_FMT_YUV422P10, { 1001, 6 }, 7680, 4320, 0, 7, "UHDTV 8K-60" 
> },
> -{ AV_PIX_FMT_YUV422P10, {1,50 }, 7680, 4320, 0, 7, "UHDTV 8K-50" 
> },
> -
> -{ AV_PIX_FMT_YUV422P10, { 1001, 24000 }, 1920, 1080, 0, 3,  "HD1080P-24" 
> },
> -{ AV_PIX_FMT_YUV422P10, { 1001, 3 },  720,  486, 1, 2,  "SD Pro486"  
> },
> -};
> -static const int base_video_fmts_len = FF_ARRAY_ELEMS(base_video_fmts);
> -
> -enum VC2_QM {
> -VC2_QM_DEF = 0,
> -VC2_QM_COL,
> -VC2_QM_FLAT,
> -
> -VC2_QM_NB
> -};
> -
> -typedef struct SubBand {
> -dwtcoef *buf;
> -ptrdiff_t stride;
> -int width;
> -int height;
> -} SubBand;
> -
> -typedef struct Plane {
> -SubBand band[MAX_DWT_LEVELS][4];
> -dwtcoef *coef_buf;
> -int width;
> -int height;
> -int dwt_width;
> -int dwt_height;
> -ptrdiff_t coef_stride;
> -} Plane;
> -
> -typedef struct SliceArgs {
> -const struct VC2EncContext *ctx;
> -union {
> -int cache[DIRAC_MAX_QUANT_INDEX];
> -uint8_t *buf;
> -};
> -int x;
> -int y;
> -int quant_idx;
> -int bits_ceil;
> -int bits_floor;
> -int bytes;
> -} SliceArgs;
> -
> -typedef struct TransformArgs {
> -const struct VC2EncContext *ctx;
> -

Re: [FFmpeg-devel] [PATCH v4 3/4] libavcodec/vulkan: Add modifications to common shader for VC2 vulkan encoder

2025-05-19 Thread Andreas Rheinhardt

IndecisiveTurtle:
> From: IndecisiveTurtle 
> 
> ---
>  libavcodec/vulkan/common.comp | 54 ---
>  1 file changed, 44 insertions(+), 10 deletions(-)
> 
> diff --git a/libavcodec/vulkan/common.comp b/libavcodec/vulkan/common.comp
> index 10af9c0623..db216a2ac6 100644
> --- a/libavcodec/vulkan/common.comp
> +++ b/libavcodec/vulkan/common.comp
> @@ -18,6 +18,9 @@
>   * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 
> USA
>   */
>  
> +#extension GL_EXT_buffer_reference : require
> +#extension GL_EXT_buffer_reference2 : require
> +
>  layout(buffer_reference, buffer_reference_align = 1) buffer u8buf {
>  uint8_t v;
>  };
> @@ -61,22 +64,20 @@ layout(buffer_reference, buffer_reference_align = 8) 
> buffer u64buf {
>  #define mid_pred(a, b, c) \
>  max(min((a), (b)), min(max((a), (b)), (c)))
>  
> -/* TODO: optimize */
> +
>  uint align(uint src, uint a)
>  {
> -uint res = src % a;
> -if (res == 0)
> -return src;
> -return src + a - res;
> +return (src + a - 1) & ~(a - 1);
> +}
> +
> +int align(int src, int a)
> +{
> +return (src + a - 1) & ~(a - 1);
>  }
>  
> -/* TODO: optimize */
>  uint64_t align64(uint64_t src, uint64_t a)
>  {
> -uint64_t res = src % a;
> -if (res == 0)
> -return src;
> -return src + a - res;
> +return (src + a - 1) & ~(a - 1);
>  }
>  
>  #define reverse4(src) \
> @@ -167,6 +168,39 @@ uint32_t flush_put_bits(inout PutBitContext pb)
>  return uint32_t(pb.buf - pb.buf_start);
>  }
>  
> +void skip_put_bytes(inout PutBitContext pb, int n)
> +{
> +int bytes_left = pb.bit_left >> 3;
> +if (n < bytes_left)
> +{
> +int n_bits = n << 3;
> +int mask = (1 << n_bits) - 1;
> +pb.bit_buf <<= n_bits;
> +pb.bit_buf |= mask;
> +pb.bit_left -= uint8_t(n_bits);
> +return;
> +}
> +if (pb.bit_left < BUF_BITS)
> +{
> +int mask = (1 << pb.bit_left) - 1;
> +pb.bit_buf <<= pb.bit_left;
> +pb.bit_buf |= mask;
> +u32vec2buf(pb.buf).v = BUF_REVERSE(pb.bit_buf);
> +pb.buf += BUF_BYTES;
> +n -= pb.bit_left >> 3;
> +}
> +int skip_dwords = n >> 2;
> +while (skip_dwords > 0)
> +{
> +u8vec4buf(pb.buf).v = u8vec4(0xFF);
> +pb.buf += 4;
> +skip_dwords--;
> +}
> +int skip_bits = (n & 3) << 3;
> +pb.bit_buf = (1 << skip_bits) - 1;
> +pb.bit_left = uint8_t(BUF_BITS - skip_bits);
> +}

This differs quite a lot from the software implementation: It does not
presume that the PutBitContext is flushed and instead of simply skipping
over the buffer it actually fills the buffer with n 0xFF bytes,
effectively adding the memset used in the VC2 slice writing code to
skip_put_bytes(). But this file is (if I am not mistaken) supposed to be
generic, not vc2 specific, so this feels very wrong.
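
To make the distinction concrete, here is a minimal C sketch of the two
behaviours. It is not FFmpeg's put_bits code: ByteWriter, skip_bytes() and
fill_bytes_ff() are hypothetical names, and the writer is assumed to be
byte-aligned (already flushed), as the software path presumes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical byte-aligned writer; stands in for a flushed PutBitContext. */
typedef struct ByteWriter {
    uint8_t *buf;       /* start of the output buffer */
    size_t   size;      /* capacity in bytes */
    size_t   pos;       /* current write position in bytes */
} ByteWriter;

/* Generic skip: only advances the position; skipped bytes stay untouched. */
static void skip_bytes(ByteWriter *w, size_t n)
{
    assert(n <= w->size - w->pos);
    w->pos += n;
}

/* Codec-specific padding: writes n 0xFF bytes, then advances. */
static void fill_bytes_ff(ByteWriter *w, size_t n)
{
    assert(n <= w->size - w->pos);
    memset(w->buf + w->pos, 0xFF, n);
    w->pos += n;
}
```

A generic file would arguably only carry something like the first helper;
the second one bakes in the padding value that the VC2 slice writer wants.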

> +
>  void init_put_bits(out PutBitContext pb, u8buf data, uint64_t len)
>  {
>  pb.buf_start = uint64_t(data);

Re: [FFmpeg-devel] [PATCH v4 3/4] libavcodec/vulkan: Add modifications to common shader for VC2 vulkan encoder

2025-05-19 Thread IndecisiveTurtle

> This differs quite a lot from the software implementation: It does not
> presume that the PutBitContext is flushed and instead of simply skipping
> over the buffer it actually fills the buffer with n 0xFF bytes,
> effectively adding the memset used in the VC2 slice writing code to
> skip_put_bytes(). But this file is (if I am not mistaken) supposed to be
> generic, not vc2 specific, so this feels very wrong.

Would it be enough to move it to vc2_encode.comp or should I also
rename the function?
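
A side note on the align() hunk of the same patch: the mask form
(src + a - 1) & ~(a - 1) gives the same result as the removed modulo form
only when a is a power of two. A small C sketch of both variants follows;
align_mod() and align_mask() are illustrative names, not taken from the
patch.

```c
#include <assert.h>
#include <stdint.h>

/* Rounds src up to a multiple of a; valid for any a > 0 (the removed form). */
static uint32_t align_mod(uint32_t src, uint32_t a)
{
    uint32_t res = src % a;
    return res ? src + a - res : src;
}

/* Rounds src up to a multiple of a; valid only if a is a power of two. */
static uint32_t align_mask(uint32_t src, uint32_t a)
{
    assert(a && (a & (a - 1)) == 0);
    return (src + a - 1) & ~(a - 1);
}
```

For src = 13 and a = 8 both return 16. For a = 6 and src = 7 the modulo form
returns 12, while the bare mask expression evaluates to 8, so the new version
is only a drop-in replacement if every caller passes a power-of-two
alignment.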

On Mon, May 19, 2025 at 7:46 PM, Andreas Rheinhardt wrote:
>
> IndecisiveTurtle:
> > From: IndecisiveTurtle 
> >
> > ---
> >  libavcodec/vulkan/common.comp | 54 ---
> >  1 file changed, 44 insertions(+), 10 deletions(-)
> >
> > diff --git a/libavcodec/vulkan/common.comp b/libavcodec/vulkan/common.comp
> > index 10af9c0623..db216a2ac6 100644
> > --- a/libavcodec/vulkan/common.comp
> > +++ b/libavcodec/vulkan/common.comp
> > @@ -18,6 +18,9 @@
> >   * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 
> > 02110-1301 USA
> >   */
> >
> > +#extension GL_EXT_buffer_reference : require
> > +#extension GL_EXT_buffer_reference2 : require
> > +
> >  layout(buffer_reference, buffer_reference_align = 1) buffer u8buf {
> >  uint8_t v;
> >  };
> > @@ -61,22 +64,20 @@ layout(buffer_reference, buffer_reference_align = 8) 
> > buffer u64buf {
> >  #define mid_pred(a, b, c) \
> >  max(min((a), (b)), min(max((a), (b)), (c)))
> >
> > -/* TODO: optimize */
> > +
> >  uint align(uint src, uint a)
> >  {
> > -uint res = src % a;
> > -if (res == 0)
> > -return src;
> > -return src + a - res;
> > +return (src + a - 1) & ~(a - 1);
> > +}
> > +
> > +int align(int src, int a)
> > +{
> > +return (src + a - 1) & ~(a - 1);
> >  }
> >
> > -/* TODO: optimize */
> >  uint64_t align64(uint64_t src, uint64_t a)
> >  {
> > -uint64_t res = src % a;
> > -if (res == 0)
> > -return src;
> > -return src + a - res;
> > +return (src + a - 1) & ~(a - 1);
> >  }
> >
> >  #define reverse4(src) \
> > @@ -167,6 +168,39 @@ uint32_t flush_put_bits(inout PutBitContext pb)
> >  return uint32_t(pb.buf - pb.buf_start);
> >  }
> >
> > +void skip_put_bytes(inout PutBitContext pb, int n)
> > +{
> > +int bytes_left = pb.bit_left >> 3;
> > +if (n < bytes_left)
> > +{
> > +int n_bits = n << 3;
> > +int mask = (1 << n_bits) - 1;
> > +pb.bit_buf <<= n_bits;
> > +pb.bit_buf |= mask;
> > +pb.bit_left -= uint8_t(n_bits);
> > +return;
> > +}
> > +if (pb.bit_left < BUF_BITS)
> > +{
> > +int mask = (1 << pb.bit_left) - 1;
> > +pb.bit_buf <<= pb.bit_left;
> > +pb.bit_buf |= mask;
> > +u32vec2buf(pb.buf).v = BUF_REVERSE(pb.bit_buf);
> > +pb.buf += BUF_BYTES;
> > +n -= pb.bit_left >> 3;
> > +}
> > +int skip_dwords = n >> 2;
> > +while (skip_dwords > 0)
> > +{
> > +u8vec4buf(pb.buf).v = u8vec4(0xFF);
> > +pb.buf += 4;
> > +skip_dwords--;
> > +}
> > +int skip_bits = (n & 3) << 3;
> > +pb.bit_buf = (1 << skip_bits) - 1;
> > +pb.bit_left = uint8_t(BUF_BITS - skip_bits);
> > +}
>
> This differs quite a lot from the software implementation: It does not
> presume that the PutBitContext is flushed and instead of simply skipping
> over the buffer it actually fills the buffer with n 0xFF bytes,
> effectively adding the memset used in the VC2 slice writing code to
> skip_put_bytes(). But this file is (if I am not mistaken) supposed to be
> generic, not vc2 specific, so this feels very wrong.
>
> > +
> >  void init_put_bits(out PutBitContext pb, u8buf data, uint64_t len)
> >  {
> >  pb.buf_start = uint64_t(data);
>

Re: [FFmpeg-devel] [PATCH v4 4/4] lavc: implement a Vulkan-based VC-2 encoder Implements a Vulkan based dirac encoder. Supports Haar and Legall wavelets and should work with all wavelet depths.

2025-05-19 Thread Andreas Rheinhardt

IndecisiveTurtle:
> From: IndecisiveTurtle 
> 
> Performance wise, encoding a 3440x1440 1-minute video is performed in about 
> 2.4 minutes with the cpu encoder running on my Ryzen 5 4600H, while it takes 
> about 1.3 minutes on my NVIDIA GTX 1650

The last iteration of this patchset claimed 2.5m for the software
encoder vs 30s hardware. The software performance improvement seems
small compared to what I expected, yet I am surprised about the hardware
slowdown (presuming it was the same file). Was the switch to the lut
based writing of codes not beneficial?

> 
> Haar shader has a subgroup optimized variant that applies when configured 
> wavelet depth allows it
> ---
>  configure|   1 +
>  libavcodec/Makefile  |   3 +
>  libavcodec/allcodecs.c   |   1 +
>  libavcodec/vc2enc_vulkan.c   | 775 +++
>  libavcodec/vulkan/vc2_dwt_haar.comp  |  82 ++
>  libavcodec/vulkan/vc2_dwt_haar_subgroup.comp |  75 ++
>  libavcodec/vulkan/vc2_dwt_hor_legall.comp|  82 ++
>  libavcodec/vulkan/vc2_dwt_upload.comp|  96 +++
>  libavcodec/vulkan/vc2_dwt_ver_legall.comp|  78 ++
>  libavcodec/vulkan/vc2_encode.comp| 159 
>  libavcodec/vulkan/vc2_slice_sizes.comp   | 170 
>  11 files changed, 1522 insertions(+)
>  create mode 100644 libavcodec/vc2enc_vulkan.c
>  create mode 100644 libavcodec/vulkan/vc2_dwt_haar.comp
>  create mode 100644 libavcodec/vulkan/vc2_dwt_haar_subgroup.comp
>  create mode 100644 libavcodec/vulkan/vc2_dwt_hor_legall.comp
>  create mode 100644 libavcodec/vulkan/vc2_dwt_upload.comp
>  create mode 100644 libavcodec/vulkan/vc2_dwt_ver_legall.comp
>  create mode 100644 libavcodec/vulkan/vc2_encode.comp
>  create mode 100644 libavcodec/vulkan/vc2_slice_sizes.comp
> 


> +#define VC2ENC_FLAGS (AV_OPT_FLAG_ENCODING_PARAM | AV_OPT_FLAG_VIDEO_PARAM)
> +static const AVOption vc2enc_options[] = {
> +{"tolerance", "Max undershoot in percent", offsetof(VC2EncContext, 
> tolerance), AV_OPT_TYPE_DOUBLE, {.dbl = 5.0f}, 0.0f, 45.0f, VC2ENC_FLAGS, 
> .unit = "tolerance"},
> +{"slice_width",   "Slice width",  offsetof(VC2EncContext, slice_width), 
> AV_OPT_TYPE_INT, {.i64 = 32}, 32, 1024, VC2ENC_FLAGS, .unit = "slice_width"},
> +{"slice_height",  "Slice height", offsetof(VC2EncContext, slice_height), 
> AV_OPT_TYPE_INT, {.i64 = 16}, 8, 1024, VC2ENC_FLAGS, .unit = "slice_height"},
> +{"wavelet_depth", "Transform depth", offsetof(VC2EncContext, 
> wavelet_depth), AV_OPT_TYPE_INT, {.i64 = 4}, 1, 5, VC2ENC_FLAGS, .unit = 
> "wavelet_depth"},
> +{"wavelet_type",  "Transform type",  offsetof(VC2EncContext, 
> wavelet_idx), AV_OPT_TYPE_INT, {.i64 = VC2_TRANSFORM_5_3}, 0, 
> VC2_TRANSFORMS_NB, VC2ENC_FLAGS, .unit = "wavelet_idx"},

You don't allow the 9_7 wavelet here (intentionally?), but then you
should restrict the range to disallow the value 0 (== VC2_TRANSFORM_9_7).
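
One way to express such a restriction, sketched in plain C instead of the
AVOption table: only "9_7 == 0" is taken from this thread, the other enum
values are assumptions for illustration, and vulkan_wavelet_supported() is a
hypothetical helper.

```c
#include <assert.h>

/* Only "9_7 == 0" comes from the thread; the other values are assumed. */
enum SketchTransform {
    SKETCH_TRANSFORM_9_7    = 0,
    SKETCH_TRANSFORM_5_3    = 1,
    SKETCH_TRANSFORM_HAAR   = 3,
    SKETCH_TRANSFORM_HAAR_S = 4,
    SKETCH_TRANSFORMS_NB
};

static_assert(SKETCH_TRANSFORM_9_7 == 0, "9_7 is expected to be index 0");

/* Returns 1 for the wavelets the option table exposes, 0 otherwise. */
static int vulkan_wavelet_supported(int idx)
{
    switch (idx) {
    case SKETCH_TRANSFORM_5_3:
    case SKETCH_TRANSFORM_HAAR:
    case SKETCH_TRANSFORM_HAAR_S:
        return 1;
    default:
        return 0;
    }
}
```

Raising the option minimum from 0 to 1 would already reject 9_7. An explicit
check like the one above would additionally reject any other index inside the
range that has no named constant.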

> +{"5_3",  "LeGall (5,3)",0, AV_OPT_TYPE_CONST, 
> {.i64 = VC2_TRANSFORM_5_3},INT_MIN, INT_MAX, VC2ENC_FLAGS, .unit = 
> "wavelet_idx"},
> +{"haar", "Haar (with shift)",   0, AV_OPT_TYPE_CONST, 
> {.i64 = VC2_TRANSFORM_HAAR_S}, INT_MIN, INT_MAX, VC2ENC_FLAGS, .unit = 
> "wavelet_idx"},
> +{"haar_noshift", "Haar (without shift)",0, AV_OPT_TYPE_CONST, 
> {.i64 = VC2_TRANSFORM_HAAR},   INT_MIN, INT_MAX, VC2ENC_FLAGS, .unit = 
> "wavelet_idx"},
> +{"qm", "Custom quantization matrix", offsetof(VC2EncContext, 
> quant_matrix), AV_OPT_TYPE_INT, {.i64 = VC2_QM_DEF}, 0, VC2_QM_NB, 
> VC2ENC_FLAGS, .unit = "quant_matrix"},
> +{"default",   "Default from the specifications", 0, 
> AV_OPT_TYPE_CONST, {.i64 = VC2_QM_DEF}, INT_MIN, INT_MAX, VC2ENC_FLAGS, .unit 
> = "quant_matrix"},
> +{"color", "Prevents low bitrate discoloration", 0, 
> AV_OPT_TYPE_CONST, {.i64 = VC2_QM_COL}, INT_MIN, INT_MAX, VC2ENC_FLAGS, .unit 
> = "quant_matrix"},
> +{"flat",  "Optimize for PSNR", 0, AV_OPT_TYPE_CONST, {.i64 = 
> VC2_QM_FLAT}, INT_MIN, INT_MAX, VC2ENC_FLAGS, .unit = "quant_matrix"},
> +{NULL}
> +};
> +
> +static const AVClass vc2enc_class = {
> +.class_name = "vc2_vulkan_encoder",
> +.category = AV_CLASS_CATEGORY_ENCODER,
> +.option = vc2enc_options,
> +.item_name = av_default_item_name,
> +.version = LIBAVUTIL_VERSION_INT
> +};
> +
> +static const FFCodecDefault vc2enc_defaults[] = {
> +{ "b",  "6"   },
> +{ NULL },
> +};
> +
> +static const AVCodecHWConfigInternal *const ff_vc2_hw_configs[] = {

Should not use ff_ prefix.


> +HW_CONFIG_ENCODER_FRAMES(VULKAN, VULKAN),
> +HW_CONFIG_ENCODER_DEVICE(NONE,  VULKAN),
> +NULL,
> +};
> +
> +const FFCodec ff_vc2_vulkan_encoder = {
> +.p.name = "vc2_vulkan",
> +CODEC_LONG_NAME("SMPTE VC-2"),
> +.p.type = AVMEDIA_TYPE_VIDEO,
> +.p.id   = AV_CODEC_ID_DIRAC,
> +.p.capabi

Re: [FFmpeg-devel] [PATCH v4 4/4] lavc: implement a Vulkan-based VC-2 encoder Implements a Vulkan based dirac encoder. Supports Haar and Legall wavelets and should work with all wavelet depths.

2025-05-19 Thread IndecisiveTurtle

> The last iteration of this patchset claimed 2.5m for the software
> encoder vs 30s hardware. The software performance improvement seems
> small compared to what I expected, yet I am surprised about the hardware
> slowdown (presuming it was the same file). Was the switch to the lut
> based writing of codes not beneficial?

It is not the same video file. The last description was for a 1080p
video, this one is between 1440p and 4K. I wanted to put more stress
on the encoder to test new performance gains.
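
As a rough plausibility check of the two timings, here is a small C sketch
comparing per-frame pixel counts. It assumes "1080p" means 1920x1080 and that
both clips have the same duration, frame rate and encoder settings, none of
which is stated in this thread. The helper names are made up for this
example.

```c
#include <assert.h>
#include <stdint.h>

/* Pixels per frame for a given resolution. */
static uint64_t pixels_per_frame(uint32_t width, uint32_t height)
{
    return (uint64_t)width * height;
}

/* Ratio of per-frame pixel counts between two resolutions. */
static double pixel_ratio(uint32_t w1, uint32_t h1, uint32_t w0, uint32_t h0)
{
    assert(w0 && h0);
    return (double)pixels_per_frame(w1, h1) /
           (double)pixels_per_frame(w0, h0);
}
```

pixel_ratio(3440, 1440, 1920, 1080) is about 2.39 (4953600 / 2073600). Under
the assumptions above, 30 s scaled by that factor is about 72 s, i.e. roughly
1.2 minutes, which is in the same range as the 1.3 minutes reported for the
GPU run.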

> You don't allow the 9_7 wavelet here (intentionally?)
Yes, it is not implemented in the Vulkan encoder. This is also why I
couldn't unify this array as you mentioned before.

On Mon, May 19, 2025 at 8:09 PM, Andreas Rheinhardt wrote:
>
> IndecisiveTurtle:
> > From: IndecisiveTurtle 
> >
> > Performance wise, encoding a 3440x1440 1-minute video is performed in about 
> > 2.4 minutes with the cpu encoder running on my Ryzen 5 4600H, while it 
> > takes about 1.3 minutes on my NVIDIA GTX 1650
>
> The last iteration of this patchset claimed 2.5m for the software
> encoder vs 30s hardware. The software performance improvement seems
> small compared to what I expected, yet I am surprised about the hardware
> slowdown (presuming it was the same file). Was the switch to the lut
> based writing of codes not beneficial?
>
> >
> > Haar shader has a subgroup optimized variant that applies when configured 
> > wavelet depth allows it
> > ---
> >  configure|   1 +
> >  libavcodec/Makefile  |   3 +
> >  libavcodec/allcodecs.c   |   1 +
> >  libavcodec/vc2enc_vulkan.c   | 775 +++
> >  libavcodec/vulkan/vc2_dwt_haar.comp  |  82 ++
> >  libavcodec/vulkan/vc2_dwt_haar_subgroup.comp |  75 ++
> >  libavcodec/vulkan/vc2_dwt_hor_legall.comp|  82 ++
> >  libavcodec/vulkan/vc2_dwt_upload.comp|  96 +++
> >  libavcodec/vulkan/vc2_dwt_ver_legall.comp|  78 ++
> >  libavcodec/vulkan/vc2_encode.comp| 159 
> >  libavcodec/vulkan/vc2_slice_sizes.comp   | 170 
> >  11 files changed, 1522 insertions(+)
> >  create mode 100644 libavcodec/vc2enc_vulkan.c
> >  create mode 100644 libavcodec/vulkan/vc2_dwt_haar.comp
> >  create mode 100644 libavcodec/vulkan/vc2_dwt_haar_subgroup.comp
> >  create mode 100644 libavcodec/vulkan/vc2_dwt_hor_legall.comp
> >  create mode 100644 libavcodec/vulkan/vc2_dwt_upload.comp
> >  create mode 100644 libavcodec/vulkan/vc2_dwt_ver_legall.comp
> >  create mode 100644 libavcodec/vulkan/vc2_encode.comp
> >  create mode 100644 libavcodec/vulkan/vc2_slice_sizes.comp
> >
>
>
> > +#define VC2ENC_FLAGS (AV_OPT_FLAG_ENCODING_PARAM | AV_OPT_FLAG_VIDEO_PARAM)
> > +static const AVOption vc2enc_options[] = {
> > +{"tolerance", "Max undershoot in percent", offsetof(VC2EncContext, 
> > tolerance), AV_OPT_TYPE_DOUBLE, {.dbl = 5.0f}, 0.0f, 45.0f, VC2ENC_FLAGS, 
> > .unit = "tolerance"},
> > +{"slice_width",   "Slice width",  offsetof(VC2EncContext, 
> > slice_width), AV_OPT_TYPE_INT, {.i64 = 32}, 32, 1024, VC2ENC_FLAGS, .unit = 
> > "slice_width"},
> > +{"slice_height",  "Slice height", offsetof(VC2EncContext, 
> > slice_height), AV_OPT_TYPE_INT, {.i64 = 16}, 8, 1024, VC2ENC_FLAGS, .unit = 
> > "slice_height"},
> > +{"wavelet_depth", "Transform depth", offsetof(VC2EncContext, 
> > wavelet_depth), AV_OPT_TYPE_INT, {.i64 = 4}, 1, 5, VC2ENC_FLAGS, .unit = 
> > "wavelet_depth"},
> > +{"wavelet_type",  "Transform type",  offsetof(VC2EncContext, 
> > wavelet_idx), AV_OPT_TYPE_INT, {.i64 = VC2_TRANSFORM_5_3}, 0, 
> > VC2_TRANSFORMS_NB, VC2ENC_FLAGS, .unit = "wavelet_idx"},
>
> You don't allow the 9_7 wavelet here (intentionally?), but then you
> should restrict the range to disallow the value 0 (== VC2_TRANSFORM_9_7).
>
> > +{"5_3",  "LeGall (5,3)",0, AV_OPT_TYPE_CONST, 
> > {.i64 = VC2_TRANSFORM_5_3},INT_MIN, INT_MAX, VC2ENC_FLAGS, .unit = 
> > "wavelet_idx"},
> > +{"haar", "Haar (with shift)",   0, AV_OPT_TYPE_CONST, 
> > {.i64 = VC2_TRANSFORM_HAAR_S}, INT_MIN, INT_MAX, VC2ENC_FLAGS, .unit = 
> > "wavelet_idx"},
> > +{"haar_noshift", "Haar (without shift)",0, AV_OPT_TYPE_CONST, 
> > {.i64 = VC2_TRANSFORM_HAAR},   INT_MIN, INT_MAX, VC2ENC_FLAGS, .unit = 
> > "wavelet_idx"},
> > +{"qm", "Custom quantization matrix", offsetof(VC2EncContext, 
> > quant_matrix), AV_OPT_TYPE_INT, {.i64 = VC2_QM_DEF}, 0, VC2_QM_NB, 
> > VC2ENC_FLAGS, .unit = "quant_matrix"},
> > +{"default",   "Default from the specifications", 0, 
> > AV_OPT_TYPE_CONST, {.i64 = VC2_QM_DEF}, INT_MIN, INT_MAX, VC2ENC_FLAGS, 
> > .unit = "quant_matrix"},
> > +{"color", "Prevents low bitrate discoloration", 0, 
> > AV_OPT_TYPE_CONST, {.i64 = VC2_QM_COL}, INT_MIN, INT_MAX, VC2ENC_FLAGS, 
> > .unit = "quant_matrix"},
> > +{"flat",  "Optimize for PSNR", 0, AV_OPT_TYPE_CONST, {.i64 = 
> > VC2_QM_F

[FFmpeg-devel] [PATCH] lavfi: add noop multimedia filter

2025-05-19 Thread Marvin Scholz

This filter does nothing, it is mainly useful during
development/debugging and demonstrates a simple case
of a mixed-input filter.
---
 doc/filters.texi |  20 
 libavfilter/Makefile |   1 +
 libavfilter/allfilters.c |   1 +
 libavfilter/avf_noop.c   | 247 +++
 4 files changed, 269 insertions(+)
 create mode 100644 libavfilter/avf_noop.c

diff --git a/doc/filters.texi b/doc/filters.texi
index 679b71f2906..492e5bb5c94 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -31171,10 +31171,30 @@ Direct all metadata to a pipe with file descriptor 4.
 @example
 metadata=mode=print:file='pipe\:4'
 @end example
 @end itemize
 
+@section noop
+Pass the inputs unchanged to the outputs.
+
+This filter is equivalent to the null/anull filters but works with
+multiple video/audio inputs. The respective number of inputs must
+be given via options. By default the filter accepts one video input
+followed by one audio input.
+
+The video inputs always come first, followed by the audio inputs.
+
+The following options are supported:
+
+@table @option
+@item v
+The number of video inputs. Default value is @var{1}.
+
+@item a
+The number of audio inputs. Default value is @var{1}.
+@end table
+
 @section perms, aperms
 
 Set read/write permissions for the output frames.
 
 These filters are mainly aimed at developers to test direct path in the
diff --git a/libavfilter/Makefile b/libavfilter/Makefile
index 0effe4127ff..cc716e27996 100644
--- a/libavfilter/Makefile
+++ b/libavfilter/Makefile
@@ -631,10 +631,11 @@ OBJS-$(CONFIG_ABITSCOPE_FILTER)  += 
avf_abitscope.o
 OBJS-$(CONFIG_ADRAWGRAPH_FILTER) += f_drawgraph.o
 OBJS-$(CONFIG_AGRAPHMONITOR_FILTER)  += f_graphmonitor.o
 OBJS-$(CONFIG_AHISTOGRAM_FILTER) += avf_ahistogram.o
 OBJS-$(CONFIG_APHASEMETER_FILTER)+= avf_aphasemeter.o
 OBJS-$(CONFIG_AVECTORSCOPE_FILTER)   += avf_avectorscope.o
+OBJS-$(CONFIG_NOOP_FILTER)   += avf_noop.o
 OBJS-$(CONFIG_CONCAT_FILTER) += avf_concat.o
 OBJS-$(CONFIG_SHOWCQT_FILTER)+= avf_showcqt.o lswsutils.o 
lavfutils.o
 OBJS-$(CONFIG_SHOWCWT_FILTER)+= avf_showcwt.o
 OBJS-$(CONFIG_SHOWFREQS_FILTER)  += avf_showfreqs.o
 OBJS-$(CONFIG_SHOWSPATIAL_FILTER)+= avf_showspatial.o
diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
index 5ea33cdf01b..960aa545385 100644
--- a/libavfilter/allfilters.c
+++ b/libavfilter/allfilters.c
@@ -593,10 +593,11 @@ extern const FFFilter ff_avf_adrawgraph;
 extern const FFFilter ff_avf_agraphmonitor;
 extern const FFFilter ff_avf_ahistogram;
 extern const FFFilter ff_avf_aphasemeter;
 extern const FFFilter ff_avf_avectorscope;
 extern const FFFilter ff_avf_concat;
+extern const FFFilter ff_avf_noop;
 extern const FFFilter ff_avf_showcqt;
 extern const FFFilter ff_avf_showcwt;
 extern const FFFilter ff_avf_showfreqs;
 extern const FFFilter ff_avf_showspatial;
 extern const FFFilter ff_avf_showspectrum;
diff --git a/libavfilter/avf_noop.c b/libavfilter/avf_noop.c
new file mode 100644
index 000..0b4c850e94e
--- /dev/null
+++ b/libavfilter/avf_noop.c
@@ -0,0 +1,247 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+/**
+ * @file
+ * No-op video filter
+ */
+
+#include 
+
+#include "libavutil/avassert.h"
+#include "libavutil/avstring.h"
+#include "libavutil/avutil.h"
+#include "libavutil/channel_layout.h"
+#include "libavutil/error.h"
+#include "libavutil/internal.h"
+#include "libavutil/mem.h"
+#include "libavutil/opt.h"
+
+#include "avfilter.h"
+#include "filters.h"
+
+typedef struct NoopContext {
+const AVClass *class;
+unsigned nb_video; // Number of video inputs
+unsigned nb_audio; // Number of audio inputs
+unsigned *in_status; // Array of inputs status
+} NoopContext;
+
+static char get_media_type_char(enum AVMediaType media_type)
+{
+switch (media_type) {
+case AVMEDIA_TYPE_VIDEO:  return 'v';
+case AVMEDIA_TYPE_AUDIO:  return 'a';
+default:  return 'u';
+}
+}
+
+static int create_pads(AVFilterContext *ctx, int count, AVFilterPad pad, bool 
inpad)
+{
+av_assert0(pad.name == NULL);
+
+for (int i = 0; i < count;

[FFmpeg-devel] [PATCH] swscale: rgb_to_yuv neon optimizations

2025-05-19 Thread Dmitriy Kovalenko


I've found quite a few ways to optimize FFmpeg's existing RGB-to-YUV
subsampled conversion. In this patch stack I'll try to
improve the performance.

This particular set of changes is a small improvement to all the
existing functions and macros. The biggest performance gain comes
from post-load increment of the pointer, immediate prefetching of
the memory blocks, and interleaving the multiply and shift operations
of different registers for better scheduling.

Also changed a bunch of places where cmp + b.le was used instead
of a single cbnz/tbnz instruction, plus some other small cleanups.

Here are checkasm results on a MacBook Pro with the latest M4 Max:



bgra_to_uv_1080_c: 257.5 ( 1.00x)
bgra_to_uv_1080_neon:  211.9 ( 1.22x)
bgra_to_uv_1920_c: 467.1 ( 1.00x)
bgra_to_uv_1920_neon:  379.3 ( 1.23x)
bgra_to_uv_half_1080_c:198.9 ( 1.00x)
bgra_to_uv_half_1080_neon: 125.7 ( 1.58x)
bgra_to_uv_half_1920_c:346.3 ( 1.00x)
bgra_to_uv_half_1920_neon: 223.7 ( 1.55x)



bgra_to_uv_1080_c: 268.3 ( 1.00x)
bgra_to_uv_1080_neon:  176.0 ( 1.53x)
bgra_to_uv_1920_c: 456.6 ( 1.00x)
bgra_to_uv_1920_neon:  307.7 ( 1.48x)
bgra_to_uv_half_1080_c:193.2 ( 1.00x)
bgra_to_uv_half_1080_neon:  96.8 ( 2.00x)
bgra_to_uv_half_1920_c:347.2 ( 1.00x)
bgra_to_uv_half_1920_neon: 182.6 ( 1.92x)

In my proprietary test on iOS it gives around a 70% performance
improvement when converting a bgra 1920x1920 image to yuv420p.

On my Linux ARM Cortex-R system the performance improvement is not that
visible, but it is still consistently 5-10% faster than the current
implementation.

Signed-off-by: Dmitriy Kovalenko 
---
 libswscale/aarch64/input.S | 166 +
 1 file changed, 112 insertions(+), 54 deletions(-)

diff --git a/libswscale/aarch64/input.S b/libswscale/aarch64/input.S
index c1c0adffc8..ee8eb24c14 100644
--- a/libswscale/aarch64/input.S
+++ b/libswscale/aarch64/input.S
@@ -1,5 +1,4 @@
-/*
- * Copyright (c) 2024 Zhao Zhili 
+/* Copyright (c) 2024 Zhao Zhili 
  *
  * This file is part of FFmpeg.
  *
@@ -57,20 +56,41 @@
 sqshrn2 \dst\().8h, \dst2\().4s, \right_shift   // dst_higher_half = dst2 >> right_shift
 .endm

+// interleaved product version of the rgb to yuv gives slightly better performance on non-performant mobile
+.macro rgb_to_uv_interleaved_product r, g, b, u_coef0, u_coef1, u_coef2, v_coef0, v_coef1, v_coef2, u_dst1, u_dst2, v_dst1, v_dst2, u_dst, v_dst, right_shift
+smlal   \u_dst1\().4s, \u_coef0\().4h, \r\().4h // U += ru * r (first 4)
+smlal   \v_dst1\().4s, \v_coef0\().4h, \r\().4h // V += rv * r (first 4)
+smlal2  \u_dst2\().4s, \u_coef0\().8h, \r\().8h // U += ru * r (second 4)
+smlal2  \v_dst2\().4s, \v_coef0\().8h, \r\().8h // V += rv * r (second 4)
+
+smlal   \u_dst1\().4s, \u_coef1\().4h, \g\().4h // U += gu * g (first 4)
+smlal   \v_dst1\().4s, \v_coef1\().4h, \g\().4h // V += gv * g (first 4)
+smlal2  \u_dst2\().4s, \u_coef1\().8h, \g\().8h // U += gu * g (second 4)
+smlal2  \v_dst2\().4s, \v_coef1\().8h, \g\().8h // V += gv * g (second 4)
+
+smlal   \u_dst1\().4s, \u_coef2\().4h, \b\().4h // U += bu * b (first 4)
+smlal   \v_dst1\().4s, \v_coef2\().4h, \b\().4h // V += bv * b (first 4)
+smlal2  \u_dst2\().4s, \u_coef2\().8h, \b\().8h // U += bu * b (second 4)
+smlal2  \v_dst2\().4s, \v_coef2\().8h, \b\().8h // V += bv * b (second 4)
+
+sqshrn  \u_dst\().4h, \u_dst1\().4s, \right_shift   // U first 4 pixels
+sqshrn2 \u_dst\().8h, \u_dst2\().4s, \right_shift   // U all 8 pixels
+sqshrn  \v_dst\().4h, \v_dst1\().4s, \right_shift   // V first 4 pixels
+sqshrn2 \v_dst\().8h, \v_dst2\().4s, \right_shift   // V all 8 pixels
+.endm
+
 .macro rgbToY_neon fmt_bgr, fmt_rgb, element, alpha_first=0
 function ff_\fmt_bgr\()ToY_neon, export=1
-cmp w4, #0  // check width > 0
+cbz w4, 3f  // check width > 0
 ldp w12, w11, [x5]  // w12: ry, w11: gy
 ldr w10, [x5, #8]   // w10: by
-b.gt4f
-ret
+b   4f
 endfunc
  function ff_\fmt_rgb\()ToY_neon, export=1
-cmp w4, #0  // check width > 0
+cbz w4, 3f  // check width > 0
 ldp w10, w11, [x5]  // w10: ry, w11: gy
 ldr

Re: [FFmpeg-devel] [PATCH v2] ffmpeg: Don't print graphs if there are no outputs yet

2025-05-19 Thread Mark Thompson

On 18/05/2025 15:57, softworkz . wrote:
>> -Original Message-
>> From: ffmpeg-devel  On Behalf Of Mark
>> Thompson
>> Sent: Sonntag, 18. Mai 2025 16:22
>> To: ffmpeg-devel@ffmpeg.org
>> Subject: Re: [FFmpeg-devel] [PATCH v2] ffmpeg: Don't print graphs if there 
>> are
>> no outputs yet
>>
>> ...
>>
>> Suggest doing any non-performance-critical development (like this) with asan
>> enabled in future; it doesn't slow things down very much and makes it easier
>> to catch and fix leaks as you go along.
> 
> 
> It's a good idea - I didn't have it on the record anymore after the pause.
> 
> In the past, it had often caused trouble with MSVC (/fsanitize=address), so 
> we had it only in a Linux CI - which this work didn't go through 😊
> I'll check it out, maybe MS have made some progress with it.
> 
> Thanks for the suggestion and the patches,
> sw

I've run with the mermaidhtml output a bit as well.  The output looks nice but 
there are many asan errors, mostly from lost string allocations - see below.

I had a look at fixing these, but the object lifetime model appears more 
complex than I could straightforwardly divine - it's not obvious when any given 
object can be freed.  I suggest that you with your greater understanding would 
be better placed to fix these.

Thanks,

- Mark


$ ./ffmpeg_g -print_graphs_file out.html -print_graphs_format mermaidhtml -i 
in.mp4 -frames:v 1 -f null -
...
=
==3654286==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 37 byte(s) in 2 object(s) allocated from:
#0 0x7f42e6af3b58 in realloc 
../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:85
#1 0x55e86407e4ca in av_realloc src/libavutil/mem.c:164
#2 0x55e86407f195 in av_strdup src/libavutil/mem.c:277
#3 0x55e85befffdb in mermaid_print_value 
src/fftools/textformat/tf_mermaid.c:538
#4 0x55e85bf01465 in mermaid_print_str 
src/fftools/textformat/tf_mermaid.c:629
#5 0x55e85bee71b8 in avtext_print_string 
src/fftools/textformat/avtextformat.c:483
#6 0x55e85bec2660 in print_sanizied_id src/fftools/graph/graphprint.c:347
#7 0x55e85bec4a6f in print_filter src/fftools/graph/graphprint.c:460
#8 0x55e85bec7ba5 in print_filtergraph_single 
src/fftools/graph/graphprint.c:577
#9 0x55e85becf7e2 in print_filtergraph src/fftools/graph/graphprint.c:1003
#10 0x55e85be3d1f5 in filter_thread src/fftools/ffmpeg_filter.c:2989
#11 0x55e85bebcf30 in task_wrapper src/fftools/ffmpeg_sched.c:2534
#12 0x7f42e6a5b1d5 in asan_thread_start 
../../../../src/libsanitizer/asan/asan_interceptors.cpp:234

Direct leak of 37 byte(s) in 2 object(s) allocated from:
#0 0x7f42e6af3b58 in realloc 
../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:85
#1 0x55e86407e4ca in av_realloc src/libavutil/mem.c:164
#2 0x55e86407f195 in av_strdup src/libavutil/mem.c:277
#3 0x55e85befbf79 in mermaid_print_section_header 
src/fftools/textformat/tf_mermaid.c:329
#4 0x55e85bee3501 in avtext_print_section_header 
src/fftools/textformat/avtextformat.c:270
#5 0x55e85bec2aff in print_section_header_id 
src/fftools/graph/graphprint.c:372
#6 0x55e85bec2df9 in print_filter src/fftools/graph/graphprint.c:387
#7 0x55e85bec7ba5 in print_filtergraph_single 
src/fftools/graph/graphprint.c:577
#8 0x55e85becf7e2 in print_filtergraph src/fftools/graph/graphprint.c:1003
#9 0x55e85be3d1f5 in filter_thread src/fftools/ffmpeg_filter.c:2989
#10 0x55e85bebcf30 in task_wrapper src/fftools/ffmpeg_sched.c:2534
#11 0x7f42e6a5b1d5 in asan_thread_start 
../../../../src/libsanitizer/asan/asan_interceptors.cpp:234

Direct leak of 37 byte(s) in 2 object(s) allocated from:
#0 0x7f42e6af3b58 in realloc 
../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:85
#1 0x55e86407e4ca in av_realloc src/libavutil/mem.c:164
#2 0x55e86407f195 in av_strdup src/libavutil/mem.c:277
#3 0x55e85bf005d1 in mermaid_print_value 
src/fftools/textformat/tf_mermaid.c:548
#4 0x55e85bf01465 in mermaid_print_str 
src/fftools/textformat/tf_mermaid.c:629
#5 0x55e85bee71b8 in avtext_print_string 
src/fftools/textformat/avtextformat.c:483
#6 0x55e85bec2660 in print_sanizied_id src/fftools/graph/graphprint.c:347
#7 0x55e85bec4a6f in print_filter src/fftools/graph/graphprint.c:460
#8 0x55e85bec7ba5 in print_filtergraph_single 
src/fftools/graph/graphprint.c:577
#9 0x55e85becf7e2 in print_filtergraph src/fftools/graph/graphprint.c:1003
#10 0x55e85be3d1f5 in filter_thread src/fftools/ffmpeg_filter.c:2989
#11 0x55e85bebcf30 in task_wrapper src/fftools/ffmpeg_sched.c:2534
#12 0x7f42e6a5b1d5 in asan_thread_start 
../../../../src/libsanitizer/asan/asan_interceptors.cpp:234

Direct leak of 32 byte(s) in 2 object(s) allocated from:
#0 0x7f42e6af424d in posix_memalign 
../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
#1 0x55e86407e295 in av_malloc src/libavutil/mem.c:107
#2 0x55e86407efeb in av

Re: [FFmpeg-devel] [PATCH v2] ffmpeg: Don't print graphs if there are no outputs yet

2025-05-19 Thread softworkz .



> -Original Message-
> From: ffmpeg-devel  On Behalf Of Mark
> Thompson
> Sent: Montag, 19. Mai 2025 22:08
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH v2] ffmpeg: Don't print graphs if there are
> no outputs yet
> 
> On 18/05/2025 15:57, softworkz . wrote:
> >> -Original Message-
> >> From: ffmpeg-devel  On Behalf Of Mark
> >> Thompson
> >> Sent: Sonntag, 18. Mai 2025 16:22
> >> To: ffmpeg-devel@ffmpeg.org
> >> Subject: Re: [FFmpeg-devel] [PATCH v2] ffmpeg: Don't print graphs if there
> are
> >> no outputs yet
> >>
> >> ...
> >>
> >> Suggest doing any non-performance-critical development (like this) with
> asan
> >> enabled in future; it doesn't slow things down very much and makes it
> easier
> >> to catch and fix leaks as you go along.
> >
> >
> > It's a good idea - I didn't have it on the record anymore after the pause.
> >
> > In the past, it had often caused trouble with MSVC (/fsanitize=address), so
> > we had it only in a Linux CI - which this work didn't go through 😊
> > I'll check it out, maybe MS have made some progress with it.
> >
> > Thanks for the suggestion and the patches,
> > sw
> 
> I've run with the mermaidhtml output a bit as well.  The output looks nice but
> there are many asan errors, mostly from lost string allocations - see below.
> 
> I had a look at fixing these, but the object lifetime model appears more
> complex than I could straightforwardly divine - it's not obvious when any
> given object can be freed.  I suggest that you with your greater understanding
> would be better placed to fix these.

Hi Mark,

of course that's on me to fix, and yes - the string handling is really
a bit tricky. At some point I had introduced av_strdup() everywhere after I 
realized that it's too hard to track which strings might stem from literals 
and which are not, so it's very plausible that I missed some cleanup.
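
The "copy everything, free before overwriting" pattern described here can be sketched roughly as follows. This is a minimal illustration under assumptions, not the set_str() from the patch series: the helper name is hypothetical and it uses plain malloc()/free(), where FFmpeg code would presumably use av_strdup() and av_freep().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: make *dst own a private copy of src and release
 * whatever *dst owned before. Passing src == NULL just clears *dst. */
static int sketch_set_str(char **dst, const char *src)
{
    char *copy = NULL;

    assert(dst);
    if (src) {
        size_t len = strlen(src) + 1;
        copy = malloc(len);
        if (!copy)
            return -1;          /* the old value is kept on failure */
        memcpy(copy, src, len);
    }
    free(*dst);                 /* free(NULL) is a no-op */
    *dst = copy;
    return 0;
}
```

Copying before freeing keeps the helper safe when src aliases the old *dst. Since every stored string is then heap-owned, the uninit path can free all of them without knowing whether a value originally came from a literal.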

I enabled ASAN just an hour ago and am currently digging into other errors
it is showing, like this one:

==11668==ERROR: AddressSanitizer: new-delete-type-mismatch on 0x122d130b6580 in 
thread T0:
  object passed to delete has wrong type:
  size of the allocated type:   1424 bytes;
  size of the deallocated type: 1400 bytes.
==11668==WARNING: Failed to use and restart external symbolizer!
#0 0x7ff77cd812e3 in operator delete 
D:\a\_work\1\s\src\vctools\asan\llvm\compiler-rt\lib\asan\asan_win_delete_scalar_size_thunk.cpp:41
#1 0x7ff779326a70 in MFX_DISP_HANDLE::`scalar deleting destructor'+0x40 
(V:\ffbuild\msvc\bin\x64\ffmpegd.exe+0x141af6a70)
#2 0x7ff779328485 in MFXClose 
V:\ffbuild\source\mfx_dispatch\src\main.cpp:634
#3 0x7ff77cc6f38a in qsv_create_mfx_session 
V:\ffbuild\source\ffmpeg\libavutil\hwcontext_qsv.c:1266
#4 0x7ff77cc789ec in qsv_device_derive_from_child 
V:\ffbuild\source\ffmpeg\libavutil\hwcontext_qsv.c:2456
#5 0x7ff77cc78f7a in qsv_device_derive 
V:\ffbuild\source\ffmpeg\libavutil\hwcontext_qsv.c:2498


Maybe a mismatch between lib and include.


Anyway, I'm on it, thanks a lot for the output,

sw





Re: [FFmpeg-devel] [PATCH] libavformat/rtpdec_opus: Set duration field on Opus AVPacket

2025-05-19 Thread Jonathan Baudanza

Does anyone have feedback on this?

Here are some steps to reproduce the current issue:

# Start broadcasting
ffmpeg -re -i input.opus -c:a copy -f rtp -sdp_file stream.sdp  
rtp://127.0.0.1:9000

# Start recording (in another terminal)
ffmpeg -protocol_whitelist file,udp,rtp -i stream.sdp -y -c:a copy output.opus

# Observe that output.opus pts starts at -960
ffprobe -show_entries packet=pts,duration output.opus | head
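
As a hedged aside on the number above: Opus over RTP uses a 48 kHz clock, so a 20 ms frame spans 960 ticks, which happens to match the magnitude of the offset. Whether the -960 really is exactly one missing frame duration is an assumption here, not something the reproduction proves. The helper name below is hypothetical.

```c
#include <stdint.h>

#define SKETCH_OPUS_RTP_CLOCK_HZ 48000

/* Hypothetical helper: convert an Opus frame duration given in
 * microseconds into ticks of the 48 kHz RTP clock. */
static int64_t sketch_opus_frame_ticks(int64_t duration_us)
{
    return duration_us * SKETCH_OPUS_RTP_CLOCK_HZ / 1000000;
}
```

With this, a 20 ms frame (20000 us) gives 960 ticks and a 2.5 ms frame gives 120.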
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH] configure: identify loong64 for loongarch.

2025-05-19 Thread yinshiyou-hf

> -Original Message-
> From: "Brad Smith" 
> Sent: 2025-05-14 09:06:31 (Wednesday)
> To: "FFmpeg development discussions and patches" , 
> "Shiyou Yin" 
> Subject: Re: [FFmpeg-devel] [PATCH] configure: identify loong64 for loongarch.
> 
> On 2025-05-12 9:11 p.m., Shiyou Yin wrote:
> > dpkg-architecture set DEB_HOST_ARCH_CPU as loong64 on loongarch.
> > ---
> >   configure | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/configure b/configure
> > index 2e69b3c56c..5f39050374 100755
> > --- a/configure
> > +++ b/configure
> > @@ -5290,7 +5290,7 @@ case "$arch" in
> >   arm*|iPad*|iPhone*)
> >   arch="arm"
> >   ;;
> > -loongarch*)
> > +loongarch*|loong64)
> >   arch="loongarch"
> >   ;;
> >   mips*|IP*)
> 
> It would be better to change this to just loong*
> 

Thank you for your suggestion.
loong32 has not been thoroughly tested, so I only added loong64 here.
Once loong32 has been tested sufficiently, it can be switched to loong*.
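
The difference between the two patterns can be illustrated with POSIX fnmatch(), which implements the same style of glob matching that the shell's case statement uses. This is only a sketch: the helper name is hypothetical, and the arch strings other than loong64 are examples rather than values taken from configure.

```c
#include <fnmatch.h>
#include <stdbool.h>

/* Hypothetical helper: true if the arch string matches the glob. */
static bool sketch_arch_matches(const char *pattern, const char *arch)
{
    return fnmatch(pattern, arch, 0) == 0;
}
```

Here "loongarch*" matches "loongarch64" but not "loong64", which is why the patch adds loong64 as a separate alternative. "loong*" would match both, and would also accept "loong32".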

This email and its attachments contain confidential information from Loongson 
Technology , which is intended only for the person or entity whose address is 
listed above. Any use of the information contained herein in any way 
(including, but not limited to, total or partial disclosure, reproduction or 
dissemination) by persons other than the intended recipient(s) is prohibited. 
If you receive this email in error, please notify the sender by phone or email 
immediately and delete it. 


Re: [FFmpeg-devel] [PATCH] configure: identify loong64 for loongarch.

2025-05-19 Thread yinshiyou-hf




> -Original Message-
> From: "Zhao Zhili" 
> Sent: 2025-05-18 20:53:05 (Sunday)
> To: "FFmpeg development discussions and patches" 
> Subject: Re: [FFmpeg-devel] [PATCH] configure: identify loong64 for loongarch.
> 
> 
> 
> > On May 18, 2025, at 15:43, yinshiyou...@loongson.cn wrote:
> > 
> >> -Original Message-
> >> From: 陈昊 <chen...@loongson.cn>
> >> Sent: 2025-05-14 08:55:14 (Wednesday)
> >> To: "FFmpeg development discussions and patches"
> >> Subject: Re: [FFmpeg-devel] [PATCH] configure: identify loong64 for loongarch.
> >> 
> >>> -Original Message-
> >>> From: "Shiyou Yin" 
> >>> Sent: 2025-05-13 09:11:34 (Tuesday)
> >>> To: ffmpeg-devel@ffmpeg.org
> >>> Subject: [FFmpeg-devel] [PATCH] configure: identify loong64 for loongarch.
> >>> 
> >>> dpkg-architecture set DEB_HOST_ARCH_CPU as loong64 on loongarch.
> >>> ---
> >>> configure | 2 +-
> >>> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>> 
> >>> diff --git a/configure b/configure
> >>> index 2e69b3c56c..5f39050374 100755
> >>> --- a/configure
> >>> +++ b/configure
> >>> @@ -5290,7 +5290,7 @@ case "$arch" in
> >>> arm*|iPad*|iPhone*)
> >>> arch="arm"
> >>> ;;
> >>> -loongarch*)
> >>> +loongarch*|loong64)
> >>> arch="loongarch"
> >>> ;;
> >>> mips*|IP*)
> >>> -- 
> >>> 2.20.1
> >>> 
> >> 
> >> 
> >> LGTM
> >> 
> > 
> > ping.
> 
> I think you missed Brad’s comments.
> 

Thank you for your reminder.





[FFmpeg-devel] [PATCH 0/5] Graphprint and Mermaid Fixes

2025-05-19 Thread ffmpegagent

This patchset includes

 * two minor fixes for resource compilation
 * fixes for memory leaks in graphprint and tf_mermaid

softworkz (5):
  fftools/makefile: Remove resources from ffprobe
  fftools/resources: Use .SECONDARY in Makefile comment
  fftools/ffmpeg: Free print_graph option variables
  fftools/graphprint: Fix memory leaks
  fftools/tf_mermaid: Add missing uninit and fix leaks

 fftools/Makefile|  1 -
 fftools/ffmpeg.c|  3 +++
 fftools/graph/graphprint.c  |  5 +++-
 fftools/resources/Makefile  |  2 +-
 fftools/textformat/tf_mermaid.c | 45 +++--
 5 files changed, 45 insertions(+), 11 deletions(-)


base-commit: fd18ae88ae736b5aabff34e17394fcd103f9e5ad
Published-As: 
https://github.com/ffstaging/FFmpeg/releases/tag/pr-ffstaging-84%2Fsoftworkz%2Fsubmit_graphprint_fixes-v1
Fetch-It-Via: git fetch https://github.com/ffstaging/FFmpeg 
pr-ffstaging-84/softworkz/submit_graphprint_fixes-v1
Pull-Request: https://github.com/ffstaging/FFmpeg/pull/84
-- 
ffmpeg-codebot

[FFmpeg-devel] [PATCH 1/5] fftools/makefile: Remove resources from ffprobe

2025-05-19 Thread softworkz

From: softworkz 

Even though it doesn't have any effect, that line is not needed (yet).

Signed-off-by: softworkz 
---
 fftools/Makefile | 1 -
 1 file changed, 1 deletion(-)

diff --git a/fftools/Makefile b/fftools/Makefile
index 361a4fd574..c1eba733da 100644
--- a/fftools/Makefile
+++ b/fftools/Makefile
@@ -49,7 +49,6 @@ OBJS-ffprobe +=   \
 fftools/textformat/tw_avio.o  \
 fftools/textformat/tw_buffer.o\
 fftools/textformat/tw_stdout.o\
-$(OBJS-resman)\
 
 OBJS-ffplay += fftools/ffplay_renderer.o
 
-- 
ffmpeg-codebot


[FFmpeg-devel] [PATCH 2/5] fftools/resources: Use .SECONDARY in Makefile comment

2025-05-19 Thread softworkz

From: softworkz 

---
 fftools/resources/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fftools/resources/Makefile b/fftools/resources/Makefile
index 8579a52678..c655c9a431 100644
--- a/fftools/resources/Makefile
+++ b/fftools/resources/Makefile
@@ -5,7 +5,7 @@ vpath %.html $(SRC_PATH)
 vpath %.css  $(SRC_PATH)
 
 # Uncomment to prevent deletion during build
-#.PRECIOUS: %.css.c %.css.min %.css.gz %.css.min.gz %.html.gz %.html.c
+#.SECONDARY: %.css.c %.css.min %.css.gz %.css.min.gz %.html.gz %.html.c
 
 OBJS-resman += \
 fftools/resources/resman.o \
-- 
ffmpeg-codebot


[FFmpeg-devel] [PATCH 3/5] fftools/ffmpeg: Free print_graph option variables

2025-05-19 Thread softworkz

From: softworkz 

---
 fftools/ffmpeg.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fftools/ffmpeg.c b/fftools/ffmpeg.c
index bd6f22e421..de607cac93 100644
--- a/fftools/ffmpeg.c
+++ b/fftools/ffmpeg.c
@@ -344,6 +344,9 @@ static void ffmpeg_cleanup(int ret)
 
 av_freep(&filter_nbthreads);
 
+av_freep(&print_graphs_file);
+av_freep(&print_graphs_format);
+
 av_freep(&input_files);
 av_freep(&output_files);
 
-- 
ffmpeg-codebot


[FFmpeg-devel] [PATCH 4/5] fftools/graphprint: Fix memory leaks

2025-05-19 Thread softworkz

From: softworkz 

- uninit resource manager
- free strings before overwriting
- unref hw_frames_context

Signed-off-by: softworkz 
---
 fftools/graph/graphprint.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fftools/graph/graphprint.c b/fftools/graph/graphprint.c
index 50f1a2ecdc..852a8f6c0c 100644
--- a/fftools/graph/graphprint.c
+++ b/fftools/graph/graphprint.c
@@ -318,6 +318,7 @@ static void print_link(GraphPrintContext *gpc, AVFilterLink *link)
 
 if (hw_frames_ctx && hw_frames_ctx->data)
 print_hwframescontext(gpc, (AVHWFramesContext *)hw_frames_ctx->data);
+av_buffer_unref(&hw_frames_ctx);
 }
 
 static char sanitize_char(const char c)
@@ -1107,5 +1108,7 @@ cleanup:
 
 int print_filtergraphs(FilterGraph **graphs, int nb_graphs, InputFile **ifiles, int nb_ifiles, OutputFile **ofiles, int nb_ofiles)
 {
-return print_filtergraphs_priv(graphs, nb_graphs, ifiles, nb_ifiles, ofiles, nb_ofiles);
+int ret = print_filtergraphs_priv(graphs, nb_graphs, ifiles, nb_ifiles, ofiles, nb_ofiles);
+ff_resman_uninit();
+return ret;
 }
-- 
ffmpeg-codebot


[FFmpeg-devel] [PATCH 5/5] fftools/tf_mermaid: Add missing uninit and fix leaks

2025-05-19 Thread softworkz

From: softworkz 

- merge forgotten uninit from work branch
- add set_str() function to free before overwriting
- fix some other leaks

Signed-off-by: softworkz 
---
 fftools/textformat/tf_mermaid.c | 45 +++--
 1 file changed, 37 insertions(+), 8 deletions(-)

diff --git a/fftools/textformat/tf_mermaid.c b/fftools/textformat/tf_mermaid.c
index 6147cf6eea..d3b9131ada 100644
--- a/fftools/textformat/tf_mermaid.c
+++ b/fftools/textformat/tf_mermaid.c
@@ -153,7 +153,6 @@ typedef struct MermaidContext {
 }  section_data[SECTION_MAX_NB_LEVELS];
 
 unsigned nb_link_captions[SECTION_MAX_NB_LEVELS]; ///< generic print buffer dedicated to each section,
-AVBPrint section_pbuf[SECTION_MAX_NB_LEVELS]; ///< generic print buffer dedicated to each section,
 AVBPrint link_buf; ///< print buffer for writing diagram links
 AVDictionary *link_dict;
 } MermaidContext;
@@ -216,6 +215,32 @@ static av_cold int mermaid_init_html(AVTextFormatContext *tfc)
 return 0;
 }
 
+static av_cold int mermaid_uninit(AVTextFormatContext *tfc)
+{
+MermaidContext *mmc = tfc->priv;
+
+av_bprint_finalize(&mmc->link_buf, NULL);
+av_dict_free(&mmc->link_dict);
+
+for (unsigned i = 0; i < SECTION_MAX_NB_LEVELS; i++) {
+av_freep(&mmc->section_data[i].dest_id);
+av_freep(&mmc->section_data[i].section_id);
+av_freep(&mmc->section_data[i].src_id);
+av_freep(&mmc->section_data[i].section_type);
+}
+
+return 0;
+}
+
+static void set_str(const char **dst, const char *src)
+{
+if (*dst)
+av_freep(dst);
+
+if (src)
+*dst = av_strdup(src);
+}
+
 #define MM_INDENT() writer_printf(tfc, "%*c", mmc->indent_level * 2, ' ')
 
 static void mermaid_print_section_header(AVTextFormatContext *tfc, const void *data)
@@ -266,6 +291,8 @@ static void mermaid_print_section_header(AVTextFormatContext *tfc, const void *d
 break;
 }
 
+av_bprint_finalize(&css_buf, NULL);
+av_freep(&directive);
 return;
 }
 
@@ -310,7 +337,7 @@ static void mermaid_print_section_header(AVTextFormatContext *tfc, const void *d
 }
 
 mmc->section_data[tfc->level].subgraph_start_incomplete = 1;
-mmc->section_data[tfc->level].section_id = av_strdup(sec_ctx->context_id);
+set_str(&mmc->section_data[tfc->level].section_id, sec_ctx->context_id);
 }
 
 if (section->flags & AV_TEXTFORMAT_SECTION_FLAG_IS_SHAPE) {
@@ -322,7 +349,7 @@ static void mermaid_print_section_header(AVTextFormatContext *tfc, const void *d
 
 if (sec_ctx->context_id) {
 
-mmc->section_data[tfc->level].section_id = av_strdup(sec_ctx->context_id);
+set_str(&mmc->section_data[tfc->level].section_id, sec_ctx->context_id);
 
 switch (mmc->diagram_config->diagram_type) {
 case AV_DIAGRAMTYPE_GRAPH:
@@ -352,7 +379,7 @@ static void mermaid_print_section_header(AVTextFormatContext *tfc, const void *d
 av_log(tfc, AV_LOG_ERROR, "Unable to write shape start. Missing id field. Section: %s", section->name);
 }
 
-mmc->section_data[tfc->level].section_id = av_strdup(sec_ctx->context_id);
+set_str(&mmc->section_data[tfc->level].section_id, sec_ctx->context_id);
 }
 
 
@@ -371,7 +398,7 @@ static void mermaid_print_section_header(AVTextFormatContext *tfc, const void *d
 mmc->nb_link_captions[tfc->level] = 0;
 
 if (sec_ctx && sec_ctx->context_type)
-mmc->section_data[tfc->level].section_type = av_strdup(sec_ctx->context_type);
+set_str(&mmc->section_data[tfc->level].section_type, sec_ctx->context_type);
 
 if (section->flags & AV_TEXTFORMAT_SECTION_FLAG_HAS_TYPE) {
 AVBPrint buf;
@@ -533,17 +560,17 @@ static void mermaid_print_value(AVTextFormatContext *tfc, const char *key,
 int exit = 0;
 
 if (section->id_key && !strcmp(section->id_key, key)) {
-mmc->section_data[tfc->level].section_id = av_strdup(str);
+set_str(&mmc->section_data[tfc->level].section_id, str);
 exit = 1;
 }
 
 if (section->dest_id_key && !strcmp(section->dest_id_key, key)) {
-mmc->section_data[tfc->level].dest_id = av_strdup(str);
+set_str(&mmc->section_data[tfc->level].dest_id, str);
 exit = 1;
 }
 
 if (section->src_id_key && !strcmp(section->src_id_key, key)) {
-mmc->section_data[tfc->level].src_id = av_strdup(str);
+set_str(&mmc->section_data[tfc->level].src_id, str);
 exit = 1;
 }
 
@@ -636,6 +663,7 @@ const AVTextFormatter avtextformatter_mermaid = {
 .name = "mermaid",
 .priv_size= sizeof(MermaidContext),
 .init = mermaid_init,
+.uninit   = mermaid_uninit,
 .print_section_header = mermaid_print_section_header,
 .print_section_footer = mermaid_print_section_footer,
 .print_integer= mermaid_pri
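
The "free before overwriting" item in the commit message can be illustrated with a small standalone sketch. This is not the patch's set_str(): the function name replace_str(), the int return value and the use of malloc()/free() instead of the av_* allocators are all assumptions made so the example compiles on its own. It also takes the copy before releasing the old value, which is a deliberate variation and not something the patch does.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: replace the string owned by *dst with a private copy
 * of src. Passing src == NULL just clears the slot.
 * Returns 0 on success, -1 if the copy could not be allocated (in which
 * case *dst is left unchanged). */
static int replace_str(char **dst, const char *src)
{
    char *copy = NULL;

    assert(dst != NULL);

    if (src) {
        size_t len = strlen(src) + 1;

        copy = malloc(len);
        if (!copy)
            return -1;
        memcpy(copy, src, len);
    }

    /* Release the previous value only after the copy exists, so that
     * replace_str(&s, s) does not read freed memory. */
    free(*dst);
    *dst = copy;
    return 0;
}
```

A slot that is assigned repeatedly, as a per-level id would be when sections are visited one after another, then holds at most one allocation at a time, and a single free() at teardown releases the last one.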

[FFmpeg-devel] [PATCH] avcodec/lcevcdec: don't try to write to output frames directly

2025-05-19 Thread James Almer

The buffer references may not be writable at this point, as the decoder
calls get_buffer2() with the AV_GET_BUFFER_FLAG_REF flag.

Fixes races as reported by tsan, producing correct output regardless of
threading choices.

Signed-off-by: James Almer 
---
 libavcodec/decode.c   | 39 
 libavcodec/lcevcdec.c | 69 ---
 libavcodec/lcevcdec.h |  5 
 3 files changed, 78 insertions(+), 35 deletions(-)

diff --git a/libavcodec/decode.c b/libavcodec/decode.c
index c2b2dd6e3b..ef09568381 100644
--- a/libavcodec/decode.c
+++ b/libavcodec/decode.c
@@ -1590,22 +1590,49 @@ static void update_frame_props(AVCodecContext *avctx, AVFrame *frame)
 }
 }
 
-static void attach_post_process_data(AVCodecContext *avctx, AVFrame *frame)
+static int attach_post_process_data(AVCodecContext *avctx, AVFrame *frame)
 {
 AVCodecInternal*avci = avctx->internal;
 DecodeContext*dc = decode_ctx(avci);
 
 if (dc->lcevc_frame) {
 FrameDecodeData *fdd = frame->private_ref;
+FFLCEVCFrame *frame_ctx;
+int ret;
 
-fdd->post_process_opaque = av_refstruct_ref(dc->lcevc);
-fdd->post_process_opaque_free = ff_lcevc_unref;
-fdd->post_process = ff_lcevc_process;
+frame_ctx = av_mallocz(sizeof(*frame_ctx));
+if (!frame_ctx)
+return AVERROR(ENOMEM);
+
+frame_ctx->frame = av_frame_alloc();
+if (!frame_ctx->frame) {
+av_free(frame_ctx);
+return AVERROR(ENOMEM);
+}
+
+frame_ctx->lcevc = av_refstruct_ref(dc->lcevc);
+frame_ctx->frame->width  = frame->width;
+frame_ctx->frame->height = frame->height;
+frame_ctx->frame->format = frame->format;
 
 frame->width  = dc->width;
 frame->height = dc->height;
+
+ret = avctx->get_buffer2(avctx, frame_ctx->frame, 0);
+if (ret < 0) {
+ff_lcevc_unref(frame_ctx);
+return ret;
+}
+
+validate_avframe_allocation(avctx, frame_ctx->frame);
+
+fdd->post_process_opaque = frame_ctx;
+fdd->post_process_opaque_free = ff_lcevc_unref;
+fdd->post_process = ff_lcevc_process;
 }
 dc->lcevc_frame = 0;
+
+return 0;
 }
 
 int ff_get_buffer(AVCodecContext *avctx, AVFrame *frame, int flags)
@@ -1666,7 +1693,9 @@ int ff_get_buffer(AVCodecContext *avctx, AVFrame *frame, int flags)
 if (ret < 0)
 goto fail;
 
-attach_post_process_data(avctx, frame);
+ret = attach_post_process_data(avctx, frame);
+if (ret < 0)
+goto fail;
 
 end:
 if (avctx->codec_type == AVMEDIA_TYPE_VIDEO && !override_dimensions &&
diff --git a/libavcodec/lcevcdec.c b/libavcodec/lcevcdec.c
index 2fe06b8800..102f6f32e9 100644
--- a/libavcodec/lcevcdec.c
+++ b/libavcodec/lcevcdec.c
@@ -47,7 +47,7 @@ static LCEVC_ColorFormat map_format(int format)
 return LCEVC_ColorFormat_Unknown;
 }
 
-static int alloc_base_frame(void *logctx, LCEVC_DecoderHandle decoder,
+static int alloc_base_frame(void *logctx, FFLCEVCContext *lcevc,
 const AVFrame *frame, LCEVC_PictureHandle *picture)
 {
 LCEVC_PictureDesc desc;
@@ -70,22 +70,22 @@ static int alloc_base_frame(void *logctx, LCEVC_DecoderHandle decoder,
 desc.sampleAspectRatioDen  = frame->sample_aspect_ratio.den;
 
 /* Allocate LCEVC Picture */
-res = LCEVC_AllocPicture(decoder, &desc, picture);
+res = LCEVC_AllocPicture(lcevc->decoder, &desc, picture);
 if (res != LCEVC_Success) {
 return AVERROR_EXTERNAL;
 }
-res = LCEVC_LockPicture(decoder, *picture, LCEVC_Access_Write, &lock);
+res = LCEVC_LockPicture(lcevc->decoder, *picture, LCEVC_Access_Write, &lock);
 if (res != LCEVC_Success)
 return AVERROR_EXTERNAL;
 
-res = LCEVC_GetPicturePlaneCount(decoder, *picture, &planes);
+res = LCEVC_GetPicturePlaneCount(lcevc->decoder, *picture, &planes);
 if (res != LCEVC_Success)
 return AVERROR_EXTERNAL;
 
 for (unsigned i = 0; i < planes; i++) {
 LCEVC_PicturePlaneDesc plane;
 
-res = LCEVC_GetPictureLockPlaneDesc(decoder, lock, i, &plane);
+res = LCEVC_GetPictureLockPlaneDesc(lcevc->decoder, lock, i, &plane);
 if (res != LCEVC_Success)
 return AVERROR_EXTERNAL;
 
@@ -96,43 +96,43 @@ static int alloc_base_frame(void *logctx, LCEVC_DecoderHandle decoder,
 av_image_copy2(data, linesizes, frame->data, frame->linesize,
frame->format, frame->width, frame->height);
 
-res = LCEVC_UnlockPicture(decoder, lock);
+res = LCEVC_UnlockPicture(lcevc->decoder, lock);
 if (res != LCEVC_Success)
 return AVERROR_EXTERNAL;
 
 return 0;
 }
 
-static int alloc_enhanced_frame(void *logctx, LCEVC_DecoderHandle decoder,
-const AVFrame *frame, LCEVC_PictureHandle *picture)
+static int alloc_enhanced_frame(void *logctx, FFLCEVCFrame *frame_ctx,
+
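
The commit message's point, that a buffer handed out with extra references must not be written in place, can be sketched with a toy reference-counted buffer. Everything below (ToyBuffer and the toy_* functions) is hypothetical and single-threaded; it is not the libavutil AVBuffer API, which uses atomic reference counts. The sketch only shows the general shape: the processing step writes into a separately allocated output and leaves the possibly shared input untouched.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, single-threaded toy model of a reference-counted buffer. */
typedef struct ToyBuffer {
    unsigned char *data;
    size_t         size;
    int            refcount;
} ToyBuffer;

/* Allocate a zero-filled buffer with one reference. */
static ToyBuffer *toy_alloc(size_t size)
{
    ToyBuffer *b = calloc(1, sizeof(*b));

    if (!b)
        return NULL;
    b->data = calloc(1, size ? size : 1);
    if (!b->data) {
        free(b);
        return NULL;
    }
    b->size     = size;
    b->refcount = 1;
    return b;
}

/* Take an additional reference to the same underlying data. */
static ToyBuffer *toy_ref(ToyBuffer *b)
{
    assert(b != NULL && b->refcount > 0);
    b->refcount++;
    return b;
}

/* Drop one reference and reset the caller's pointer to NULL. */
static void toy_unref(ToyBuffer **pb)
{
    ToyBuffer *b = *pb;

    *pb = NULL;
    if (b && --b->refcount == 0) {
        free(b->data);
        free(b);
    }
}

/* Only a buffer with exactly one reference may be modified in place. */
static int toy_is_writable(const ToyBuffer *b)
{
    return b->refcount == 1;
}

/* Post-processing step: never touches `in`; the result goes into a freshly
 * allocated buffer that the caller owns exclusively. */
static ToyBuffer *toy_process(const ToyBuffer *in, unsigned char add)
{
    ToyBuffer *out = toy_alloc(in->size);

    if (!out)
        return NULL;
    for (size_t i = 0; i < in->size; i++)
        out->data[i] = (unsigned char)(in->data[i] + add);
    return out;
}
```

With this shape, another holder of a reference to the input never observes a half-written frame, which is the kind of race a thread sanitizer would flag when the input is modified in place.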

Re: [FFmpeg-devel] [PATCH v2] ffmpeg: Don't print graphs if there are no outputs yet

2025-05-19 Thread softworkz .



> -Original Message-
> From: ffmpeg-devel On Behalf Of Mark Thompson
> Sent: Monday, 19 May 2025 22:08
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH v2] ffmpeg: Don't print graphs if there are no outputs yet
> 
> On 18/05/2025 15:57, softworkz . wrote:
> >> -Original Message-
> >> From: ffmpeg-devel On Behalf Of Mark Thompson
> >> Sent: Sunday, 18 May 2025 16:22
> >> To: ffmpeg-devel@ffmpeg.org
> >> Subject: Re: [FFmpeg-devel] [PATCH v2] ffmpeg: Don't print graphs if there are no outputs yet
> >>
> >> ...
> >>
> >> Suggest doing any non-performance-critical development (like this) with asan
> >> enabled in future; it doesn't slow things down very much and makes it easier
> >> to catch and fix leaks as you go along.
> >
> >
> > It's a good idea - I didn't have it on the record anymore after the pause.
> >
> > In the past, it had often caused trouble with MSVC (/fsanitize=address), so
> > we had it only in a Linux CI - which this work didn't go through 😊
> > I'll check it out, maybe MS have made some progress with it.
> >
> > Thanks for the suggestion and the patches,
> > sw
> 
> I've run with the mermaidhtml output a bit as well.  The output looks nice but
> there are many asan errors, mostly from lost string allocations - see below.
> 
> I had a look at fixing these, but the object lifetime model appears more
> complex than I could straightforwardly divine - it's not obvious when any
> given object can be freed.  I suggest that you with your greater understanding
> would be better placed to fix these.
> 
> Thanks,
> 
> - Mark


Hi Mark,


I think I've got all of them covered now; patches sent.

While hunting those leaks, I noticed a lot of leaks around QSV hardware 
acceleration. Is that known, and do you see these as well? I hope it doesn't
go back to my original code for QSV D3D11.

Thanks
sw



Command was something like:

-init_hw_device d3d11va=d1:2 -init_hw_device qsv@d1 -hwaccel qsv -c:v h264_qsv 
-i in.mkv  -filter_complex "[0:0]scale_qsv=w=512:h=256[f1_out0]" -map [f1_out0] 
-c:v hevc_qsv out1.mkv



-- Block 11879 at 0x5A25B810: 103 bytes --
  Leak Hash: 0xBD2EC68A, Count: 1, Total 103 bytes
  Call Stack (TID 51116):
ucrtbased.dll!aligned_malloc()
V:\ffbuild\source\ffmpeg\libavutil\mem.c (110): ffmpegd.exe!av_malloc() + 0x12 bytes
V:\ffbuild\source\ffmpeg\libavutil\mem.c (258): ffmpegd.exe!av_mallocz() + 0xC bytes
V:\ffbuild\source\ffmpeg\libavutil\buffer.c (44): ffmpegd.exe!buffer_create() + 0xA bytes
V:\ffbuild\source\ffmpeg\libavutil\buffer.c (64): ffmpegd.exe!av_buffer_create() + 0x34 bytes
V:\ffbuild\source\ffmpeg\libavutil\hwcontext.c (282): ffmpegd.exe!av_hwframe_ctx_alloc() + 0x20 bytes
V:\ffbuild\source\ffmpeg\libavutil\hwcontext_qsv.c (576): ffmpegd.exe!qsv_init_child_ctx() + 0x9 bytes
V:\ffbuild\source\ffmpeg\libavutil\hwcontext_qsv.c (754): ffmpegd.exe!qsv_init_pool() + 0xC bytes
V:\ffbuild\source\ffmpeg\libavutil\hwcontext_qsv.c (1428): ffmpegd.exe!qsv_frames_init() + 0xF bytes
V:\ffbuild\source\ffmpeg\libavutil\hwcontext.c (364): ffmpegd.exe!av_hwframe_ctx_init() + 0xF bytes
V:\ffbuild\source\ffmpeg\libavcodec\qsvdec.c (349): ffmpegd.exe!qsv_decode_preinit() + 0x13 bytes
V:\ffbuild\source\ffmpeg\libavcodec\qsvdec.c (445): ffmpegd.exe!qsv_decode_header() + 0x21 bytes


-- Block 14884 at 0x5A2548B0: 103 bytes --
  Leak Hash: 0x48BB8D37, Count: 1, Total 103 bytes
  Call Stack (TID 51116):
ucrtbased.dll!aligned_malloc()
V:\ffbuild\source\ffmpeg\libavutil\mem.c (110): ffmpegd.exe!av_malloc() + 0x12 bytes
V:\ffbuild\source\ffmpeg\libavutil\mem.c (258): ffmpegd.exe!av_mallocz() + 0xC bytes
V:\ffbuild\source\ffmpeg\libavutil\buffer.c (105): ffmpegd.exe!av_buffer_ref() + 0xA bytes
V:\ffbuild\source\ffmpeg\libavcodec\qsv.c (764): ffmpegd.exe!qsv_create_mids() + 0xC bytes
V:\ffbuild\source\ffmpeg\libavcodec\qsv.c (1143): ffmpegd.exe!ff_qsv_init_session_frames() + 0xF bytes
V:\ffbuild\source\ffmpeg\libavcodec\qsvdec.c (211): ffmpegd.exe!qsv_init_session() + 0x77 bytes
V:\ffbuild\source\ffmpeg\libavcodec\qsvdec.c (381): ffmpegd.exe!qsv_decode_preinit() + 0x38 bytes
V:\ffbuild\source\ffmpeg\libavcodec\qsvdec.c (445): ffmpegd.exe!qsv_decode_header() + 0x21 bytes
V:\ffbuild\source\ffmpeg\libavcodec\qsvdec.c (1026): ffmpegd.exe!qsv_process_data() + 0x2A bytes
V:\ffbuild\source\ffmpeg\libavcodec\qsvdec.c (1186): ffmpegd.exe!qsv_decode_frame() + 0x34 bytes
V:\ffbuild\source\ffmpeg\libavcodec\decode.c (442): ffmpegd.exe!decode_simple_internal() + 0x23 bytes
V:\ffbuild\source\ffmpeg\libavcodec\decode.c (600): ffmpegd.exe!decode_simple_receive_frame() + 0x17 bytes
V:\ffbuild\source\ffmpeg\libavcodec\decode.c (636): ffmpegd.exe!ff_decode_receive_frame_internal() + 0x13 bytes
V:\ffbuild\source\ffmpeg\libavcodec\decode.c (653): ffmpegd.exe!decode_receive_frame_internal()
