Re: [FFmpeg-devel] [EXTERNAL] Re: [PATCH] Boost FPS and performance: Optimize vertical loop for cache-friendly access [libavcodec/jpeg2000dwt.c:dwt_decode97_float]

2025-05-14 Thread Chitra Dey Sarkar via ffmpeg-devel


-Original Message-
From: Michael Niedermayer  
Sent: Wednesday, May 14, 2025 9:40 AM
To: FFmpeg development discussions and patches 
Cc: Chitra Dey Sarkar 
Subject: [EXTERNAL] Re: [FFmpeg-devel] [PATCH] Boost FPS and performance: 
Optimize vertical loop for cache-friendly access 
[libavcodec/jpeg2000dwt.c:dwt_decode97_float]

Hi Chitra

On Wed, May 14, 2025 at 03:55:59AM +, Chitra Dey Sarkar via ffmpeg-devel 
wrote:
> Original Implementation:
> -
> In the original implementation, the "VER_SD" section processes image data 
> stored in *data using strided memory access in a vertical fashion. This leads 
> to inefficient memory access patterns and cache thrashing due to 
> non-sequential data access across multiple inner loops.
> 
> Proposed Refactor:
> -
> The proposed refactor replaces this by allocating a cache-friendly 2D array 
> buffer. This change eliminates strided memory access across the three inner 
> loops, significantly improving cache locality and reducing cache thrashing.
> 
> Additionally, the data is transposed outside the lp loop, which allows for 
> efficient per-line access and write-back to the l buffer, further optimizing 
> performance.
> 
> Performance improvements
> ---
> This change results in a substantial performance improvement. Sharing 
> the FPS data benchmarked on our end for the file 'Tears of Steel' 
> using HandBrake:
> 
> Device / CPU Model                               Official FPS   Optimized FPS   % Improvement
> Surface Laptop 11 (10-core X1P64100, L2: 36MB)   3.18           6.15            +93%
> Surface Laptop 11 (10-core X1P64100, L2: 36MB)   5.16           7.31            +41%
> Surface Laptop 11 (10-core X1P64100, L2: 36MB)   5.57           9.21            +65%
> AMD Ryzen + NVIDIA RTX 4060 Laptop (12C/24T)     9.97           11.22           +12%
> Mac Mini Apple M4 Chip                           9.00           12.00           +30%
> 
> ---
>  libavcodec/jpeg2000dwt.c | 72 
> +++-
>  1 file changed, 57 insertions(+), 15 deletions(-)
> 
> diff --git a/libavcodec/jpeg2000dwt.c b/libavcodec/jpeg2000dwt.c index 
> 9ee8122658..45d7897893 100644
> --- a/libavcodec/jpeg2000dwt.c
> +++ b/libavcodec/jpeg2000dwt.c
> @@ -409,6 +409,15 @@ static void dwt_decode97_float(DWTContext *s, float *t)
>  /* position at index O of line range [0-5,w+5] cf. extend function */
>  line += 5;
> 

> +/* Find the largest lv and lv to allocate a 2D Array*/

lv and lv ?
you mean lv and lh ?


> +int max_dim = 0;
> +for (lev = 0; lev < s->ndeclevels; lev++) {
> +if (s->linelen[lev][0]  > max_dim) max_dim = s->linelen[lev][0];
> +if (s->linelen[lev][1] > max_dim) max_dim = 
> + s->linelen[lev][1];

FFMAX()


> +}
> +float *array2DBlock = av_malloc(max_dim * max_dim * sizeof(float));
> +int useFallback = !array2DBlock;

also is this supposed to be max_dim_h * max_dim_v ?



> +
>  for (lev = 0; lev < s->ndeclevels; lev++) {
>  int lh = s->linelen[lev][0],
>  lv = s->linelen[lev][1],
> @@ -431,23 +440,56 @@ static void dwt_decode97_float(DWTContext *s, float *t)
>  for (i = 0; i < lh; i++)
>  data[w * lp + i] = l[i];
>  }
> -
> -// VER_SD
> -l = line + mv;
> -for (lp = 0; lp < lh; lp++) {
> -int i, j = 0;
> -// copy with interleaving
> -for (i = mv; i < lv; i += 2, j++)
> -l[i] = data[w * j + lp];
> -for (i = 1 - mv; i < lv; i += 2, j++)
> -l[i] = data[w * j + lp];
> -

> -sr_1d97_float(line, mv, mv + lv);

this should be run linewise, not columnwise. If you don't understand what I mean 
here, please say so and I'll elaborate.

But basically both vertical and horizontal transforms should be done with 
row-based implementations.

The code before loads and saves each column (which is bad); your code adds an 
efficient transpose and then copies each row.

There's a ton of unneeded copying here; I think the data in your implementation 
is now copied 4 times for each vertical transform pass.

But I am very happy to see a patch submission from Microsoft! :)

thx

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

"I am not trying to be anyone's saviour, I'm trying to think about the  future 
and not be sad" - Elon Musk




Hi Michael,
Thanks so much for getting back! I'll quickly implement the first 3 comments.

For the last comment, is there a way for me to reach you on regular email to 
elaborate on the proposed change with a better explanation?
T

[FFmpeg-devel] [PATCH v1 08/23] avcodec/vvc/ctu: refact out intra_data

2025-05-14 Thread toqsxw
From: Wu Jianhua 

Signed-off-by: Wu Jianhua 
---
 libavcodec/vvc/ctu.c | 64 +++-
 1 file changed, 40 insertions(+), 24 deletions(-)

diff --git a/libavcodec/vvc/ctu.c b/libavcodec/vvc/ctu.c
index f77697af08..c5df898f7b 100644
--- a/libavcodec/vvc/ctu.c
+++ b/libavcodec/vvc/ctu.c
@@ -1805,6 +1805,37 @@ static int inter_data(VVCLocalContext *lc)
 return ret;
 }
 
+static int intra_data(VVCLocalContext *lc)
+{
+const VVCFrameContext *fc  = lc->fc;
+const VVCSPS *sps  = lc->fc->ps.sps;
+const CodingUnit *cu   = lc->cu;
+const VVCTreeType tree_type= cu->tree_type;
+const bool  pred_mode_plt_flag = cu->pred_mode == MODE_PLT;
+int ret= 0;
+
+if (tree_type == SINGLE_TREE || tree_type == DUAL_TREE_LUMA) {
+if (pred_mode_plt_flag) {
+avpriv_report_missing_feature(fc->log_ctx, "Palette");
+return AVERROR_PATCHWELCOME;
+} else {
+intra_luma_pred_modes(lc);
+ff_vvc_set_intra_mvf(lc, false, PF_INTRA, cu->ciip_flag);
+}
+}
+if ((tree_type == SINGLE_TREE || tree_type == DUAL_TREE_CHROMA) && 
sps->r->sps_chroma_format_idc) {
+if (pred_mode_plt_flag && tree_type == DUAL_TREE_CHROMA) {
+avpriv_report_missing_feature(fc->log_ctx, "Palette");
+return AVERROR_PATCHWELCOME;
+} else if (!pred_mode_plt_flag) {
+if (!cu->act_enabled_flag)
+intra_chroma_pred_modes(lc);
+}
+}
+
+return ret;
+}
+
 static int hls_coding_unit(VVCLocalContext *lc, int x0, int y0, int cb_width, 
int cb_height,
 int cqt_depth, const VVCTreeType tree_type, VVCModeType mode_type)
 {
@@ -1815,7 +1846,7 @@ static int hls_coding_unit(VVCLocalContext *lc, int x0, 
int y0, int cb_width, in
 const int vs= sps->vshift[CHROMA];
 const int is_128= cb_width > 64 || cb_height > 64;
 int pred_mode_plt_flag  = 0;
-int ret;
+int ret = 0;
 
 CodingUnit *cu = add_cu(lc, x0, y0, cb_width, cb_height, cqt_depth, 
tree_type);
 
@@ -1842,29 +1873,14 @@ static int hls_coding_unit(VVCLocalContext *lc, int x0, 
int y0, int cb_width, in
 avpriv_report_missing_feature(fc->log_ctx, "Adaptive Color Transform");
 return AVERROR_PATCHWELCOME;
 }
-if (cu->pred_mode == MODE_INTRA || cu->pred_mode == MODE_PLT) {
-if (tree_type == SINGLE_TREE || tree_type == DUAL_TREE_LUMA) {
-if (pred_mode_plt_flag) {
-avpriv_report_missing_feature(fc->log_ctx, "Palette");
-return AVERROR_PATCHWELCOME;
-} else {
-intra_luma_pred_modes(lc);
-ff_vvc_set_intra_mvf(lc, false, PF_INTRA, cu->ciip_flag);
-}
-}
-if ((tree_type == SINGLE_TREE || tree_type == DUAL_TREE_CHROMA) && 
sps->r->sps_chroma_format_idc) {
-if (pred_mode_plt_flag && tree_type == DUAL_TREE_CHROMA) {
-avpriv_report_missing_feature(fc->log_ctx, "Palette");
-return AVERROR_PATCHWELCOME;
-} else if (!pred_mode_plt_flag) {
-if (!cu->act_enabled_flag)
-intra_chroma_pred_modes(lc);
-}
-}
-} else if (tree_type != DUAL_TREE_CHROMA) { /* MODE_INTER or MODE_IBC */
-if ((ret = inter_data(lc)) < 0)
-return ret;
-}
+if (cu->pred_mode == MODE_INTRA || cu->pred_mode == MODE_PLT)
+ret = intra_data(lc);
+else if (tree_type != DUAL_TREE_CHROMA) /* MODE_INTER or MODE_IBC */
+ret = inter_data(lc);
+
+if (ret < 0)
+return ret;
+
 if (cu->pred_mode != MODE_INTRA && !pred_mode_plt_flag && 
!lc->cu->pu.general_merge_flag)
 cu->coded_flag = ff_vvc_cu_coded_flag(lc);
 else
-- 
2.44.0.windows.1

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH v1] fftools/ffplay: Resolve input file path before processing

2025-05-14 Thread Marton Balint



On Wed, 14 May 2025, Nicolas George wrote:


Appaji (HE12025-05-14):

Fixes ticket: https://trac.ffmpeg.org/ticket/11574

Signed-off-by: Appaji 
---
 fftools/ffplay.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/fftools/ffplay.c b/fftools/ffplay.c
index 2a572fc3aa..42f0584b55 100644
--- a/fftools/ffplay.c
+++ b/fftools/ffplay.c
@@ -27,6 +27,7 @@
 #include "config_components.h"
 #include 
 #include 
+#include 
 #include 
 #include 

@@ -3623,9 +3624,17 @@ static int opt_input_file(void *optctx, const char 
*filename)
 filename, input_filename);
 return AVERROR(EINVAL);
 }
-if (!strcmp(filename, "-"))
+
+char resolved_path[PATH_MAX];
+
+if (!realpath(filename, resolved_path)) {
+av_log(NULL, AV_LOG_FATAL, "Failed to resolve path for '%s': %s\n", 
filename, strerror(errno));
+return AVERROR(errno);
+}
+


Hi. Thanks for the patch. Did you test it with non-filenames arguments,
for example http://…?


+if (!strcmp(resolved_path, "-"))
 filename = "fd:";


This should happen before resolution.


-input_filename = av_strdup(filename);
+input_filename = av_strdup(resolved_path);
 if (!input_filename)
 return AVERROR(ENOMEM);



On the whole, I think you are going at it wrong: you are only fixing
this for ffplay, not for ffprobe, ffmpeg and other applications built on
the libraries, and resolving the path can have side effects, for example
if you do not have permission on a parent of the current working
directory.

IMO, the correct way would be to add a stat() early in the opening of
the file and test the device number. But that requires changing quite a
lot of things.


Agreed. You should improve the probing function to fix the ticket, you can 
do a stat in v4l2_read_probe() in libavdevice/v4l2.c, check if it is a 
char device and try a V4L2 IOCTL on it to make sure it is a V4L2 device.


Regards,
Marton



Regards,

--
 Nicolas George


Re: [FFmpeg-devel] [PATCH] fix(configure): fix detection on windows arm64

2025-05-14 Thread Coia Prant
Or do we detect the MSYSTEM environment variable?

Martin Storsjö wrote on Wed, May 14, 2025 at 03:31:

> On Mon, 12 May 2025, Coia Prant wrote:
>
> > On Windows Arm64
> > `uname -m` returned `x86_64` instead of `aarch64`
> > Link: https://github.com/msys2/msys2-runtime/issues/171
> >
> > But `uname -s` contains `ARM64` suffix
> > So check suffix on windows arm64 (for clangarm64)
> >
> > This problem also in VideoLAN/x264
> > Link: https://code.videolan.org/videolan/x264/-/merge_requests/177
> >
> > Signed-off-by: Coia Prant 
> > ---
> > configure | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/configure b/configure
> > index 2e69b3c..d8c1e09 100755
> > --- a/configure
> > +++ b/configure
> > @@ -4157,6 +4157,8 @@ if test "$target_os_default" = aix; then
> > arch_default=$(uname -p)
> > strip_default="strip -X32_64"
> > nm_default="nm -g -X32_64"
> > +elif [[ "$target_os_default" == "mingw"*"arm64" ]] || [[
> "$target_os_default" == "msys"*"arm64" ]]; then
> > +arch_default="aarch64"
> > else
>
> I don't think we should be detecting this for the msys*arm64 cases here.
> If we're in the msys environment, as opposed to the mingw ones, then the
> x86_64 that "uname -m" returns really is correct (even if running emulated
> on aarch64, the msys environment itself is x86_64, so that's the target
> architecture of the compilation).
>
> For the mingw*arm64 case, I haven't thought about all the potential
> consequences of the patch; it may be acceptable. But the script is a
> strict POSIX sh script, it can't use bash constructs (which is what
> Michael observed in testing the patch).
>
> // Martin
>
>


Re: [FFmpeg-devel] [PATCH 7/8] avcodec/svq3: Check that for 8 byte space before subtracting

2025-05-14 Thread Michael Niedermayer
On Wed, May 14, 2025 at 06:34:25PM +0200, Andreas Rheinhardt wrote:
> Michael Niedermayer:
> > No testcase
> > 
> > Signed-off-by: Michael Niedermayer 
> > ---
> >  libavcodec/svq3.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/libavcodec/svq3.c b/libavcodec/svq3.c
> > index f730358e2f9..30bc9334af7 100644
> > --- a/libavcodec/svq3.c
> > +++ b/libavcodec/svq3.c
> > @@ -1173,7 +1173,7 @@ static av_cold int svq3_decode_init(AVCodecContext 
> > *avctx)
> >  int w,h;
> >  
> >  size = AV_RB32(&extradata[4]);
> > -if (size > extradata_end - extradata - 8)
> > +if (extradata_end - extradata < 8 || size > extradata_end - 
> > extradata - 8)
> >  return AVERROR_INVALIDDATA;
> >  init_get_bits(&gb, extradata + 8, size * 8);
> >  
> 
> Can't be triggered: This code is only executed iff marker_found is 1;
> and given the "m + 8 < avctx->extradata_size" check in the loop it is
> guaranteed that there are at least eight bytes of extradata available.

True

Have we ever had someone miss such distributed checks and
produce buggy code through a change?
If not, then I think you are correct here and let's skip adding an
explicit check; it's ugly to have such redundant checks.

thx

-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The day soldiers stop bringing you their problems is the day you have stopped 
leading them. They have either lost confidence that you can help or concluded 
you do not care. Either case is a failure of leadership. - Colin Powell




Re: [FFmpeg-devel] [PATCH 1/5] tools/target_dec_fuzzer: Adjust threshold for WEBP

2025-05-14 Thread Michael Niedermayer
On Tue, May 13, 2025 at 01:58:28AM +0200, Michael Niedermayer wrote:
> Fixes: Timeout
> Fixes: 
> 403345121/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WEBP_fuzzer-6408323910139904
> 
> Found-by: continuous fuzzing process 
> https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
> Signed-off-by: Michael Niedermayer 
> ---
>  tools/target_dec_fuzzer.c | 1 +
>  1 file changed, 1 insertion(+)

will apply

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

If you drop bombs on a foreign country and kill a hundred thousand
innocent people, expect your government to call the consequence
"unprovoked inhuman terrorist attacks" and use it to justify dropping
more bombs and killing more people. The technology changed, the idea is old.




Re: [FFmpeg-devel] [PATCH 7/8] avcodec/svq3: Check that for 8 byte space before subtracting

2025-05-14 Thread Andreas Rheinhardt
Michael Niedermayer:
> On Wed, May 14, 2025 at 06:34:25PM +0200, Andreas Rheinhardt wrote:
>> Michael Niedermayer:
>>> No testcase
>>>
>>> Signed-off-by: Michael Niedermayer 
>>> ---
>>>  libavcodec/svq3.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/libavcodec/svq3.c b/libavcodec/svq3.c
>>> index f730358e2f9..30bc9334af7 100644
>>> --- a/libavcodec/svq3.c
>>> +++ b/libavcodec/svq3.c
>>> @@ -1173,7 +1173,7 @@ static av_cold int svq3_decode_init(AVCodecContext 
>>> *avctx)
>>>  int w,h;
>>>  
>>>  size = AV_RB32(&extradata[4]);
>>> -if (size > extradata_end - extradata - 8)
>>> +if (extradata_end - extradata < 8 || size > extradata_end - 
>>> extradata - 8)
>>>  return AVERROR_INVALIDDATA;
>>>  init_get_bits(&gb, extradata + 8, size * 8);
>>>  
>>
>> Can't be triggered: This code is only executed iff marker_found is 1;
>> and given the "m + 8 < avctx->extradata_size" check in the loop it is
>> guaranteed that there are at least eight bytes of extradata available.
> 
> True
> 
> Have we ever had someone miss such distributed checks and
> produce buggy code through a change?
> If not, then I think you are correct here and let's skip adding an
> explicit check; it's ugly to have such redundant checks.
> 

We could avoid the whole marker_found branch (and the variable) by
moving the whole if (marker_found) block into a function of its own that
is called where currently marker_found is set to one. I'll send a patch
for this.

- Andreas



Re: [FFmpeg-devel] [PATCH 1/8] avcodec/dnxuc_parser: Use ff_parse_close()

2025-05-14 Thread Michael Niedermayer
On Sun, May 11, 2025 at 02:32:38AM +0200, Michael Niedermayer wrote:
> Fixes: buffer leak
> Fixes: 
> 398894512/clusterfuzz-testcase-minimized-ffmpeg_DEMUXER_fuzzer-6716597473705984
> 
> Found-by: continuous fuzzing process 
> https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
> Signed-off-by: Michael Niedermayer 
> ---
>  libavcodec/dnxuc_parser.c | 1 +
>  1 file changed, 1 insertion(+)

will apply patchset except 5 and 7


[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

z(9) = an object that transcends all computable functions describable
in finite terms. - ChatGPT in 2024




[FFmpeg-devel] [PATCH 1/7] avcodec/svq3: Factor out decoding extradata

2025-05-14 Thread Andreas Rheinhardt
Patches attached.

- Andreas
From 19d9f3cfc278a2b442923f1bea505ffb079fe3c1 Mon Sep 17 00:00:00 2001
From: Andreas Rheinhardt 
Date: Thu, 15 May 2025 03:09:45 +0200
Subject: [PATCH 1/7] avcodec/svq3: Factor out decoding extradata

Reduces indentation and avoids an extra variable for whether
a sequence header has been found.
It also fixes potential undefined behaviour:
NULL + 0 is undefined and happens when no extradata is available.

Signed-off-by: Andreas Rheinhardt 
---
 libavcodec/svq3.c | 259 +++---
 1 file changed, 132 insertions(+), 127 deletions(-)

diff --git a/libavcodec/svq3.c b/libavcodec/svq3.c
index 6319e9b021..f264140dbc 100644
--- a/libavcodec/svq3.c
+++ b/libavcodec/svq3.c
@@ -1114,14 +1114,139 @@ static void init_dequant4_coeff_table(SVQ3Context *s)
 }
 }
 
+static av_cold int svq3_decode_extradata(AVCodecContext *avctx, SVQ3Context *s,
+ int seqh_offset)
+{
+const uint8_t *extradata = avctx->extradata + seqh_offset;
+unsigned int size = AV_RB32(extradata + 4);
+GetBitContext gb;
+int ret;
+
+if (size > avctx->extradata_size - seqh_offset - 8)
+return AVERROR_INVALIDDATA;
+extradata += 8;
+init_get_bits(&gb, extradata, size * 8);
+
+/* 'frame size code' and optional 'width, height' */
+int frame_size_code = get_bits(&gb, 3);
+int w, h;
+switch (frame_size_code) {
+case 0:
+w = 160;
+h = 120;
+break;
+case 1:
+w = 128;
+h =  96;
+break;
+case 2:
+w = 176;
+h = 144;
+break;
+case 3:
+w = 352;
+h = 288;
+break;
+case 4:
+w = 704;
+h = 576;
+break;
+case 5:
+w = 240;
+h = 180;
+break;
+case 6:
+w = 320;
+h = 240;
+break;
+case 7:
+w = get_bits(&gb, 12);
+h = get_bits(&gb, 12);
+break;
+}
+ret = ff_set_dimensions(avctx, w, h);
+if (ret < 0)
+return ret;
+
+s->halfpel_flag  = get_bits1(&gb);
+s->thirdpel_flag = get_bits1(&gb);
+
+/* unknown fields */
+int unk0 = get_bits1(&gb);
+int unk1 = get_bits1(&gb);
+int unk2 = get_bits1(&gb);
+int unk3 = get_bits1(&gb);
+
+s->low_delay = get_bits1(&gb);
+avctx->has_b_frames = !s->low_delay;
+
+/* unknown field */
+int unk4 = get_bits1(&gb);
+
+av_log(avctx, AV_LOG_DEBUG, "Unknown fields %d %d %d %d %d\n",
+   unk0, unk1, unk2, unk3, unk4);
+
+if (skip_1stop_8data_bits(&gb) < 0)
+return AVERROR_INVALIDDATA;
+
+s->has_watermark = get_bits1(&gb);
+
+if (!s->has_watermark)
+return 0;
+
+#if CONFIG_ZLIB
+unsigned watermark_width  = get_interleaved_ue_golomb(&gb);
+unsigned watermark_height = get_interleaved_ue_golomb(&gb);
+int u1= get_interleaved_ue_golomb(&gb);
+int u2= get_bits(&gb, 8);
+int u3= get_bits(&gb, 2);
+int u4= get_interleaved_ue_golomb(&gb);
+unsigned long buf_len = watermark_width *
+watermark_height * 4;
+int offset= get_bits_count(&gb) + 7 >> 3;
+
+if (watermark_height <= 0 ||
+get_bits_left(&gb) <= 0 ||
+(uint64_t)watermark_width * 4 > UINT_MAX / watermark_height)
+return AVERROR_INVALIDDATA;
+
+av_log(avctx, AV_LOG_DEBUG, "watermark size: %ux%u\n",
+   watermark_width, watermark_height);
+av_log(avctx, AV_LOG_DEBUG,
+   "u1: %x u2: %x u3: %x compressed data size: %d offset: %d\n",
+   u1, u2, u3, u4, offset);
+
+uint8_t *buf = av_malloc(buf_len);
+if (!buf)
+return AVERROR(ENOMEM);
+
+if (uncompress(buf, &buf_len, extradata + offset,
+   size - offset) != Z_OK) {
+av_log(avctx, AV_LOG_ERROR,
+   "could not uncompress watermark logo\n");
+av_free(buf);
+return -1;
+}
+s->watermark_key = av_bswap16(av_crc(av_crc_get_table(AV_CRC_16_CCITT), 0, buf, buf_len));
+
+s->watermark_key = s->watermark_key << 16 | s->watermark_key;
+av_log(avctx, AV_LOG_DEBUG,
+   "watermark key %#"PRIx32"\n", s->watermark_key);
+av_free(buf);
+
+return 0;
+#else
+av_log(avctx, AV_LOG_ERROR,
+   "this svq3 file contains watermark which need zlib support compiled in\n");
+return AVERROR(ENOSYS);
+#endif
+}
+
 static av_cold int svq3_decode_init(AVCodecContext *avctx)
 {
 SVQ3Context *s = avctx->priv_data;
 int m, x, y;
 unsigned char *extradata;
-unsigned char *extradata_end;
-unsigned int size;
-int marker_found = 0;
 int ret;
 
 s->cur_pic  = &s->frames[0];
@@ -1154,139 +1279,19 @@ static av_cold int svq3_decode_init(AVCodecContext *avctx)
 
 /* prowl for the "SEQH" marker in the extradata */
 extradata = (unsigned char *)avctx->extr

Re: [FFmpeg-devel] [PATCH 2/3] avformat/avidec: Ignore duplicate GAB2

2025-05-14 Thread Michael Niedermayer
On Sat, May 10, 2025 at 04:36:07PM +0200, Michael Niedermayer wrote:
> Fixes: memleak
> Fixes: 
> 398401912/clusterfuzz-testcase-minimized-ffmpeg_dem_AVI_fuzzer-4669849976766464
> 
> Found-by: continuous fuzzing process 
> https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
> Signed-off-by: Michael Niedermayer 
> ---
>  libavformat/avidec.c | 4 
>  1 file changed, 4 insertions(+)

will apply

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

What is kyc? Its a tool that makes you give out your real ID, while criminals
give out a forged ID card.




Re: [FFmpeg-devel] [PATCH] avformat/mpegts: update stream info when PMT ES stream_type changes

2025-05-14 Thread Pavel Koshevoy
If there are no further comments I'll commit and push this some time during
the weekend.

Pavel.


On Fri, May 9, 2025, 6:01 PM Pavel Koshevoy  wrote:

> I have a several .ts captures where video and audio codec changes
> even though the PMT version does not change and the PIDs stay the same.
> This happens during transition to/from slate (mpeg2 video and audio)
> to network broadcast (hevc video and eac3 audio in private PES).
>
> I've updated fate ts-demux expected results.
> ---
>  libavformat/mpegts.c| 4 +++-
>  tests/ref/fate/ts-demux | 4 ++--
>  2 files changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/libavformat/mpegts.c b/libavformat/mpegts.c
> index 54594b3a11..deb69a0548 100644
> --- a/libavformat/mpegts.c
> +++ b/libavformat/mpegts.c
> @@ -940,6 +940,8 @@ static int mpegts_set_stream_info(AVStream *st,
> PESContext *pes,
>  mpegts_find_stream_type(st, pes->stream_type, ISO_types);
>  if (pes->stream_type == STREAM_TYPE_AUDIO_MPEG2 || pes->stream_type
> == STREAM_TYPE_AUDIO_AAC)
>  sti->request_probe = 50;
> +if (pes->stream_type == STREAM_TYPE_PRIVATE_DATA)
> +sti->request_probe = AVPROBE_SCORE_STREAM_RETRY;
>  if ((prog_reg_desc == AV_RL32("HDMV") ||
>   prog_reg_desc == AV_RL32("HDPR")) &&
>  st->codecpar->codec_id == AV_CODEC_ID_NONE) {
> @@ -2508,7 +2510,7 @@ static void pmt_cb(MpegTSFilter *filter, const
> uint8_t *section, int section_len
>  if (!st)
>  goto out;
>
> -if (pes && !pes->stream_type)
> +if (pes && pes->stream_type != stream_type)
>  mpegts_set_stream_info(st, pes, stream_type, prog_reg_desc);
>
>  add_pid_to_program(prg, pid);
> diff --git a/tests/ref/fate/ts-demux b/tests/ref/fate/ts-demux
> index 6a830d0d99..d56cc27937 100644
> --- a/tests/ref/fate/ts-demux
> +++ b/tests/ref/fate/ts-demux
> @@ -24,6 +24,6 @@
> packet|codec_type=video|stream_index=0|pts=3912686363|pts_time=43474.292922|dts=
>
>  
> packet|codec_type=audio|stream_index=1|pts=3912644825|pts_time=43473.831389|dts=3912644825|dts_time=43473.831389|duration=2880|duration_time=0.032000|size=906|pos=474888|flags=K__|data_hash=CRC32:0893d398
>
>  
> packet|codec_type=audio|stream_index=2|pts=3912645580|pts_time=43473.839778|dts=3912645580|dts_time=43473.839778|duration=2880|duration_time=0.032000|size=354|pos=491808|flags=K__|data_hash=CRC32:f5963fa6
>  
> stream|index=0|codec_name=mpeg2video|profile=4|codec_type=video|codec_tag_string=[2][0][0][0]|codec_tag=0x0002|width=1280|height=720|coded_width=0|coded_height=0|has_b_frames=1|sample_aspect_ratio=1:1|display_aspect_ratio=16:9|pix_fmt=yuv420p|level=4|color_range=tv|color_space=unknown|color_transfer=unknown|color_primaries=unknown|chroma_location=left|field_order=progressive|refs=1|ts_id=32776|ts_packetsize=188|id=0x31|r_frame_rate=6/1001|avg_frame_rate=6/1001|time_base=1/9|start_pts=3912669846|start_time=43474.109400|duration_ts=19519|duration=0.216878|bit_rate=1500|max_bit_rate=N/A|bits_per_raw_sample=N/A|nb_frames=N/A|nb_read_frames=N/A|nb_read_packets=15|extradata_size=150|extradata_hash=CRC32:53134fa8|disposition:default=0|disposition:dub=0|disposition:original=0|disposition:comment=0|disposition:lyrics=0|disposition:karaoke=0|disposition:forced=0|disposition:hearing_impaired=0|disposition:visual_impaired=0|disposition:clean_effects=0|disposition:attached_pic=
 
0|disposition:timed_thumbnails=0|disposition:non_diegetic=0|disposition:captions=0|disposition:descriptions=0|disposition:metadata=0|disposition:dependent=0|disposition:still_image=0|disposition:multilayer=0|side_datum/cpb_properties:side_data_type=CPB
> properties|side_datum/cpb_properties:max_bitrate=1500|side_datum/cpb_properties:min_bitrate=0|side_datum/cpb_properties:avg_bitrate=0|side_datum/cpb_properties:buffer_size=9781248|side_datum/cpb_properties:vbv_delay=-1
>
> -stream|index=1|codec_name=ac3|profile=unknown|codec_type=audio|codec_tag_string=[4][0][0][0]|codec_tag=0x0004|sample_fmt=fltp|sample_rate=48000|channels=6|channel_layout=5.1(side)|bits_per_sample=0|initial_padding=0|dmix_mode=0|ltrt_cmixlev=0.00|ltrt_surmixlev=0.00|loro_cmixlev=0.00|loro_surmixlev=0.00|ts_id=32776|ts_packetsize=188|id=0x34|r_frame_rate=0/0|avg_frame_rate=0/0|time_base=1/9|start_pts=3912633305|start_time=43473.703389|duration_ts=14400|duration=0.16|bit_rate=384000|max_bit_rate=N/A|bits_per_raw_sample=N/A|nb_frames=N/A|nb_read_frames=N/A|nb_read_packets=5|disposition:default=0|disposition:dub=0|disposition:original=0|disposition:comment=0|disposition:lyrics=0|disposition:karaoke=0|disposition:forced=0|disposition:hearing_impaired=0|disposition:visual_impaired=0|disposition:clean_effects=0|disposition:attached_pic=0|disposition:timed_thumbnails=0|disposition:non_diegetic=0|disposition:captions=0|disposition:descriptions=0|disposition:metadata=
 
0|disposition:dependent=0|disposition:still_image=0|disposition:multilayer=0|tag:language=eng
>
> -stream|index=2|cod

Re: [FFmpeg-devel] [PATCH v5 1/7] libavformat/oggdec.h: Document packet function return value.

2025-05-14 Thread Michael Niedermayer
On Fri, May 09, 2025 at 06:43:21PM -0500, Romain Beauxis wrote:
> ---
>  libavformat/oggdec.h | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/libavformat/oggdec.h b/libavformat/oggdec.h
> index 43df23f4cb..5225b77a07 100644
> --- a/libavformat/oggdec.h
> +++ b/libavformat/oggdec.h
> @@ -38,6 +38,12 @@ struct ogg_codec {
>   * -1 if an error occurred or for unsupported stream
>   */
>  int (*header)(AVFormatContext *, int);
> +/**
> + * Attempt to process a packet as a data packet
> + * @return < 0 (AVERROR) code or -1 on error
> + * == 0 if the packet was a regular data packet.
> + * == 0 or 1 if the packet was a header from a chained bitstream.
> + */
>  int (*packet)(AVFormatContext *, int);
>  /**
>   * Translate a granule into a timestamp.
> -- 

will apply, so the patchset becomes smaller

thx


[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Good people do not need laws to tell them to act responsibly, while bad
people will find a way around the laws. -- Plato




Re: [FFmpeg-devel] [PATCH] MAINTAINERS: Add entry for samples-request

2025-05-14 Thread Michael Niedermayer
On Sun, May 11, 2025 at 10:13:36PM +0200, Michael Niedermayer wrote:
> This is based on discussion with the GA and its simply the people
> who have done or tried to do some uploads recently.
> 
> Everyone who has a shell account on ffmpeg.org should have powers to
> upload samples.
> 
> CC: compn 
> CC: Thilo Borgmann 
> Signed-off-by: Michael Niedermayer 
> ---
>  MAINTAINERS | 1 +
>  1 file changed, 1 insertion(+)

will apply

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Democracy is the form of government in which you can choose your dictator




Re: [FFmpeg-devel] [PATCH v5 2/7] libavformat/oggdec.{c, h}: Implement packet skip on packet return value of 1

2025-05-14 Thread Michael Niedermayer
On Fri, May 09, 2025 at 06:43:22PM -0500, Romain Beauxis wrote:
> ---
>  libavformat/oggdec.c | 22 ++
>  libavformat/oggdec.h |  1 +
>  2 files changed, 15 insertions(+), 8 deletions(-)

will apply

thx

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Republics decline into democracies and democracies degenerate into
despotisms. -- Aristotle




Re: [FFmpeg-devel] [PATCH 7/8] avcodec/svq3: Check that for 8 byte space before subtracting

2025-05-14 Thread Andreas Rheinhardt
Michael Niedermayer:
> No testcase
> 
> Signed-off-by: Michael Niedermayer 
> ---
>  libavcodec/svq3.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/libavcodec/svq3.c b/libavcodec/svq3.c
> index f730358e2f9..30bc9334af7 100644
> --- a/libavcodec/svq3.c
> +++ b/libavcodec/svq3.c
> @@ -1173,7 +1173,7 @@ static av_cold int svq3_decode_init(AVCodecContext 
> *avctx)
>  int w,h;
>  
>  size = AV_RB32(&extradata[4]);
> -if (size > extradata_end - extradata - 8)
> +if (extradata_end - extradata < 8 || size > extradata_end - 
> extradata - 8)
>  return AVERROR_INVALIDDATA;
>  init_get_bits(&gb, extradata + 8, size * 8);
>  

Can't be triggered: This code is executed only if marker_found is 1;
and given the "m + 8 < avctx->extradata_size" check in the loop it is
guaranteed that there are at least eight bytes of extradata available.
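
The argument above can be sketched in C. This is a simplified stand-in, not the actual svq3.c code: find_marker and bytes_after_marker are hypothetical helpers, and only the loop bound "m + 8 < size" is taken from the mail. Because the loop can only stop on a match while that bound holds, more than eight bytes remain from the marker to the end of the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: offset of the "SEQH" marker, or -1 if not found.
 * The bound m + 8 < size mirrors the loop check quoted in the mail. */
ptrdiff_t find_marker(const uint8_t *buf, size_t size)
{
    assert(buf != NULL);
    for (size_t m = 0; m + 8 < size; m++)
        if (!memcmp(buf + m, "SEQH", 4))
            return (ptrdiff_t)m;
    return -1;
}

/* Bytes from the marker to the end of the buffer, or 0 without a marker.
 * When a marker is found, m + 8 < size held, so the result exceeds 8. */
size_t bytes_after_marker(const uint8_t *buf, size_t size)
{
    ptrdiff_t m = find_marker(buf, size);
    return m < 0 ? 0 : size - (size_t)m;
}
```

A marker that sits too close to the end of the buffer is never reported as found, so the later subtraction of 8 cannot go negative in this sketch.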

- Andreas



Re: [FFmpeg-devel] [PATCH] Boost FPS and performance: Optimize vertical loop for cache-friendly access [libavcodec/jpeg2000dwt.c:dwt_decode97_float]

2025-05-14 Thread Michael Niedermayer
Hi Chitra

On Wed, May 14, 2025 at 03:55:59AM +, Chitra Dey Sarkar via ffmpeg-devel 
wrote:
> Original Implementation:
> -
> In the original implementation, the "VER_SD" section processes image data 
> stored in *data using strided memory access in a vertical fashion. This leads 
> to inefficient memory access patterns and cache thrashing due to 
> non-sequential data access across multiple inner loops.
> 
> Proposed Refactor:
> -
> The proposed refactor replaces this by allocating a cache-friendly 2D array 
> buffer. This change eliminates strided memory access across the three inner 
> loops, significantly improving cache locality and reducing cache thrashing.
> 
> Additionally, the data is transposed outside the lp loop, which allows for 
> efficient per-line access and write-back to the l buffer, further optimizing 
> performance.
> 
> Performance improvements
> ---
> This change results in a substantial performance improvement. Below is the FPS 
> data we benchmarked for the file 'Tears of Steel' using HandBrake:
> 
> Device / CPU Model                               Official FPS   Optimized FPS   % Improvement
> Surface Laptop 11 (10-core X1P64100, L2: 36MB)   3.18           6.15            +93%
> Surface Laptop 11 (10-core X1P64100, L2: 36MB)   5.16           7.31            +41%
> Surface Laptop 11 (10-core X1P64100, L2: 36MB)   5.57           9.21            +65%
> AMD Ryzen + NVIDIA RTX 4060 Laptop (12C/24T)     9.97           11.22           +12%
> Mac Mini Apple M4 Chip                           9.00           12.00           +30%
> 
> ---
> ---
>  libavcodec/jpeg2000dwt.c | 72 +++-
>  1 file changed, 57 insertions(+), 15 deletions(-)
> 
> diff --git a/libavcodec/jpeg2000dwt.c b/libavcodec/jpeg2000dwt.c
> index 9ee8122658..45d7897893 100644
> --- a/libavcodec/jpeg2000dwt.c
> +++ b/libavcodec/jpeg2000dwt.c
> @@ -409,6 +409,15 @@ static void dwt_decode97_float(DWTContext *s, float *t)
>  /* position at index O of line range [0-5,w+5] cf. extend function */
>  line += 5;
> 

> +/* Find the largest lv and lv to allocate a 2D Array*/

lv and lv?
Do you mean lv and lh?


> +int max_dim = 0;
> +for (lev = 0; lev < s->ndeclevels; lev++) {
> +if (s->linelen[lev][0]  > max_dim) max_dim = s->linelen[lev][0];
> +if (s->linelen[lev][1] > max_dim) max_dim = s->linelen[lev][1];

FFMAX()


> +}
> +float *array2DBlock = av_malloc(max_dim * max_dim * sizeof(float));
> +int useFallback = !array2DBlock;

Also, is this supposed to be max_dim_h * max_dim_v?
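
If the answer to that question is yes, the two review comments could be combined roughly as follows. This is only a sketch under assumptions: SKETCH_MAX stands in for FFMAX so the snippet compiles on its own, scratch_elems is a hypothetical helper, and linelen[lev][0] / linelen[lev][1] are read as the horizontal and vertical lengths, as in the quoted loop.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for FFMAX so that the sketch is self-contained. */
#define SKETCH_MAX(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical helper: track the largest horizontal and vertical line
 * lengths separately and return the element count max_h * max_v,
 * instead of squaring the single largest dimension. */
size_t scratch_elems(const int (*linelen)[2], int ndeclevels,
                     int *max_h, int *max_v)
{
    assert(ndeclevels >= 0);
    *max_h = 0;
    *max_v = 0;
    for (int lev = 0; lev < ndeclevels; lev++) {
        *max_h = SKETCH_MAX(*max_h, linelen[lev][0]);
        *max_v = SKETCH_MAX(*max_v, linelen[lev][1]);
    }
    return (size_t)*max_h * (size_t)*max_v;
}
```

For a 1920x1080 level this asks for 1920 * 1080 floats rather than 1920 * 1920, so non-square images need a smaller scratch buffer.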



> +
>  for (lev = 0; lev < s->ndeclevels; lev++) {
>  int lh = s->linelen[lev][0],
>  lv = s->linelen[lev][1],
> @@ -431,23 +440,56 @@ static void dwt_decode97_float(DWTContext *s, float *t)
>  for (i = 0; i < lh; i++)
>  data[w * lp + i] = l[i];
>  }
> -
> -// VER_SD
> -l = line + mv;
> -for (lp = 0; lp < lh; lp++) {
> -int i, j = 0;
> -// copy with interleaving
> -for (i = mv; i < lv; i += 2, j++)
> -l[i] = data[w * j + lp];
> -for (i = 1 - mv; i < lv; i += 2, j++)
> -l[i] = data[w * j + lp];
> -

> -sr_1d97_float(line, mv, mv + lv);

This should be run linewise, not columnwise.
If you don't understand what I mean here, please say so and I'll elaborate.

But basically both the vertical and horizontal transforms should be done with
row-based implementations.

The code before loads and saves each column (which is bad);
your code adds an efficient transpose and then copies each row.

There's a ton of unneeded copying here. I think the data in your
implementation is now copied 4 times for each vertical transform
pass.
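
One possible reading of "row based" for the vertical direction is sketched below. It is not the 9/7 filter from jpeg2000dwt.c: it applies a single lifting step with a placeholder coefficient and a simple mirror at the bottom edge, and both function names are hypothetical. The point is only that a vertical step can iterate over whole rows in place, with no transpose and no per-column copy.

```c
#include <assert.h>
#include <stddef.h>

/* Row-wise: the inner loop walks x, so every access is contiguous. */
void vlift_rows(float *data, int w, int h, float coef)
{
    assert(w > 0 && h > 1);
    for (int y = 1; y < h; y += 2) {
        int below = (y + 1 < h) ? y + 1 : y - 1; /* mirror at the bottom edge */
        const float *up   = data + (size_t)(y - 1) * w;
        const float *down = data + (size_t)below * w;
        float *cur        = data + (size_t)y * w;
        for (int x = 0; x < w; x++)
            cur[x] += coef * (up[x] + down[x]);
    }
}

/* Column-wise reference: same arithmetic, but the inner loop strides by w. */
void vlift_cols(float *data, int w, int h, float coef)
{
    assert(w > 0 && h > 1);
    for (int x = 0; x < w; x++)
        for (int y = 1; y < h; y += 2) {
            int below = (y + 1 < h) ? y + 1 : y - 1;
            data[(size_t)y * w + x] += coef * (data[(size_t)(y - 1) * w + x] +
                                               data[(size_t)below * w + x]);
        }
}
```

Both functions update odd rows from their even neighbours and produce the same values; only the traversal order, and therefore the memory access pattern, differs.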

But I am very happy to see a patch submission from Microsoft! :)

thx

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

"I am not trying to be anyone's saviour, I'm trying to think about the
 future and not be sad" - Elon Musk





[FFmpeg-devel] [PATCH 01/16] ffv1enc_vulkan: merge all encoder variants into one file

2025-05-14 Thread Lynne
Makes it easier to work with, despite the heavy ifdeffery.
---
 libavcodec/ffv1enc_vulkan.c|  21 +--
 libavcodec/vulkan/Makefile |   4 +-
 libavcodec/vulkan/ffv1_enc.comp| 240 -
 libavcodec/vulkan/ffv1_enc_ac.comp |  83 -
 libavcodec/vulkan/ffv1_enc_common.comp | 101 ---
 libavcodec/vulkan/ffv1_enc_rgb.comp|  83 -
 libavcodec/vulkan/ffv1_enc_vlc.comp| 112 
 7 files changed, 244 insertions(+), 400 deletions(-)
 delete mode 100644 libavcodec/vulkan/ffv1_enc_ac.comp
 delete mode 100644 libavcodec/vulkan/ffv1_enc_common.comp
 delete mode 100644 libavcodec/vulkan/ffv1_enc_rgb.comp
 delete mode 100644 libavcodec/vulkan/ffv1_enc_vlc.comp

diff --git a/libavcodec/ffv1enc_vulkan.c b/libavcodec/ffv1enc_vulkan.c
index 42a98a5efa..f4b54b8375 100644
--- a/libavcodec/ffv1enc_vulkan.c
+++ b/libavcodec/ffv1enc_vulkan.c
@@ -114,13 +114,9 @@ extern const char *ff_source_rangecoder_comp;
 extern const char *ff_source_ffv1_vlc_comp;
 extern const char *ff_source_ffv1_common_comp;
 extern const char *ff_source_ffv1_reset_comp;
-extern const char *ff_source_ffv1_enc_common_comp;
 extern const char *ff_source_ffv1_enc_rct_comp;
-extern const char *ff_source_ffv1_enc_vlc_comp;
-extern const char *ff_source_ffv1_enc_ac_comp;
 extern const char *ff_source_ffv1_enc_setup_comp;
 extern const char *ff_source_ffv1_enc_comp;
-extern const char *ff_source_ffv1_enc_rgb_comp;
 
 typedef struct FFv1VkParameters {
 VkDeviceAddress slice_state;
@@ -961,6 +957,9 @@ static void define_shared_code(AVCodecContext *avctx, 
FFVulkanShader *shd)
 av_bprintf(&shd->src, "#define GOLOMB\n" );
 }
 
+if (fv->is_rgb)
+av_bprintf(&shd->src, "#define RGB\n");
+
 GLSLF(0, #define TYPE int%i_t
,smp_bits);
 GLSLF(0, #define VTYPE2 i%ivec2  
,smp_bits);
 GLSLF(0, #define VTYPE3 i%ivec3  
,smp_bits);
@@ -1260,7 +1259,6 @@ static int init_encode_shader(AVCodecContext *avctx, 
FFVkSPIRVCompiler *spv)
 {
 int err;
 VulkanEncodeFFv1Context *fv = avctx->priv_data;
-FFV1Context *f = &fv->ctx;
 FFVulkanShader *shd = &fv->enc;
 FFVulkanDescriptorSetBinding *desc_set;
 
@@ -1344,18 +1342,7 @@ static int init_encode_shader(AVCodecContext *avctx, 
FFVkSPIRVCompiler *spv)
 };
 RET(ff_vk_shader_add_descriptor_set(&fv->s, shd, desc_set, 3, 0, 0));
 
-/* Assemble the shader body */
-GLSLD(ff_source_ffv1_enc_common_comp);
-
-if (f->ac == AC_GOLOMB_RICE)
-GLSLD(ff_source_ffv1_enc_vlc_comp);
-else
-GLSLD(ff_source_ffv1_enc_ac_comp);
-
-if (fv->is_rgb)
-GLSLD(ff_source_ffv1_enc_rgb_comp);
-else
-GLSLD(ff_source_ffv1_enc_comp);
+GLSLD(ff_source_ffv1_enc_comp);
 
 RET(spv->compile_shader(&fv->s, spv, shd, &spv_data, &spv_len, "main",
 &spv_opaque));
diff --git a/libavcodec/vulkan/Makefile b/libavcodec/vulkan/Makefile
index feb5d2ea51..4bbcb38c6a 100644
--- a/libavcodec/vulkan/Makefile
+++ b/libavcodec/vulkan/Makefile
@@ -6,10 +6,8 @@ clean::
 OBJS-$(CONFIG_FFV1_VULKAN_ENCODER)  +=  vulkan/common.o \
vulkan/rangecoder.o vulkan/ffv1_vlc.o \
vulkan/ffv1_common.o 
vulkan/ffv1_reset.o \
-   vulkan/ffv1_enc_common.o \
vulkan/ffv1_enc_rct.o 
vulkan/ffv1_enc_setup.o \
-   vulkan/ffv1_enc_vlc.o 
vulkan/ffv1_enc_ac.o \
-   vulkan/ffv1_enc.o vulkan/ffv1_enc_rgb.o
+   vulkan/ffv1_enc.o
 
 OBJS-$(CONFIG_FFV1_VULKAN_HWACCEL)  +=  vulkan/common.o \
vulkan/rangecoder.o vulkan/ffv1_vlc.o \
diff --git a/libavcodec/vulkan/ffv1_enc.comp b/libavcodec/vulkan/ffv1_enc.comp
index 4b851fd711..9854ecad51 100644
--- a/libavcodec/vulkan/ffv1_enc.comp
+++ b/libavcodec/vulkan/ffv1_enc.comp
@@ -20,12 +20,186 @@
  * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
  */
 
+ivec2 get_diff(ivec2 pos, ivec2 off, int p, int comp, int sw, int bits)
+{
+const ivec2 yoff_border1 = off.x == 0 ? ivec2(1, -1) : ivec2(0, 0);
+const ivec2 yoff_border2 = off.x == 1 ? ivec2(1, -1) : ivec2(0, 0);
+
+TYPE top2 = TYPE(0);
+if (off.y > 1)
+top2 = TYPE(imageLoad(src[p], pos + ivec2(0, -2))[comp]);
+
+VTYPE3 top  = VTYPE3(TYPE(0),
+ TYPE(0),
+ TYPE(0));
+if (off.y > 0 && off != ivec2(0, 1))
+top[0] = TYPE(imageLoad(src[p], pos + ivec2(-1, -1) + 
yoff_border1)[comp]);
+if (off.y > 0) {
+top[1] = TYPE(imageLoad(src[p], pos + ivec2(0, -1))[comp]);
+top[2] = TYPE(imageLoad(src[p], pos + ivec2(min(1, sw - off.x - 1)

[FFmpeg-devel] [PATCH 02/16] vulkan/ffv1: synchronize get_pred implementations between encoder and decoder

2025-05-14 Thread Lynne
---
 libavcodec/vulkan/ffv1_dec.comp | 32 ++---
 libavcodec/vulkan/ffv1_enc.comp | 85 -
 2 files changed, 68 insertions(+), 49 deletions(-)

diff --git a/libavcodec/vulkan/ffv1_dec.comp b/libavcodec/vulkan/ffv1_dec.comp
index fc0175c715..1c313b3168 100644
--- a/libavcodec/vulkan/ffv1_dec.comp
+++ b/libavcodec/vulkan/ffv1_dec.comp
@@ -29,19 +29,19 @@
 #endif
 
 #ifdef RGB
-ivec2 get_pred(ivec2 sp, ivec2 off, int p, int sw, uint8_t quant_table_idx)
+ivec2 get_pred(readonly uimage2D pred, ivec2 sp, ivec2 off, int comp, int sw, 
uint8_t quant_table_idx)
 {
 const ivec2 yoff_border1 = expectEXT(off.x == 0, false) ? ivec2(1, -1) : 
ivec2(0, 0);
 
 /* Thanks to the same coincidence as below, we can skip checking if off == 
0, 1 */
-VTYPE3 top  = VTYPE3(TYPE(imageLoad(dec[p], sp + LADDR(off + ivec2(-1, -1) 
+ yoff_border1))[0]),
- TYPE(imageLoad(dec[p], sp + LADDR(off + ivec2(0, 
-1)))[0]),
- TYPE(imageLoad(dec[p], sp + LADDR(off + ivec2(min(1, 
sw - off.x - 1), -1)))[0]));
+VTYPE3 top  = VTYPE3(TYPE(imageLoad(pred, sp + LADDR(off + ivec2(-1, -1) + 
yoff_border1))[comp]),
+ TYPE(imageLoad(pred, sp + LADDR(off + ivec2(0, 
-1)))[comp]),
+ TYPE(imageLoad(pred, sp + LADDR(off + ivec2(min(1, sw 
- off.x - 1), -1)))[comp]));
 
 /* Normally, we'd need to check if off != ivec2(0, 0) here, since 
otherwise, we must
  * return zero. However, ivec2(-1,  0) + ivec2(1, -1) == ivec2(0, -1), 
e.g. previous
  * row, 0 offset, same slice, which is zero since we zero out the buffer 
for RGB */
-TYPE cur = TYPE(imageLoad(dec[p], sp + LADDR(off + ivec2(-1,  0) + 
yoff_border1))[0]);
+TYPE cur = TYPE(imageLoad(pred, sp + LADDR(off + ivec2(-1,  0) + 
yoff_border1))[comp]);
 
 int base = quant_table[quant_table_idx][0][(cur- top[0]) & 
MAX_QUANT_TABLE_MASK] +
quant_table[quant_table_idx][1][(top[0] - top[1]) & 
MAX_QUANT_TABLE_MASK] +
@@ -51,12 +51,12 @@ ivec2 get_pred(ivec2 sp, ivec2 off, int p, int sw, uint8_t 
quant_table_idx)
 TYPE cur2 = TYPE(0);
 if (expectEXT(off.x > 0, true)) {
 const ivec2 yoff_border2 = expectEXT(off.x == 1, false) ? 
ivec2(-1, -1) : ivec2(-2, 0);
-cur2 = TYPE(imageLoad(dec[p], sp + LADDR(off + yoff_border2))[0]);
+cur2 = TYPE(imageLoad(pred, sp + LADDR(off + yoff_border2))[comp]);
 }
 base += quant_table[quant_table_idx][3][(cur2 - cur) & 
MAX_QUANT_TABLE_MASK];
 
 /* top-2 became current upon swap */
-TYPE top2 = TYPE(imageLoad(dec[p], sp + LADDR(off))[0]);
+TYPE top2 = TYPE(imageLoad(pred, sp + LADDR(off))[comp]);
 base += quant_table[quant_table_idx][4][(top2 - top[1]) & 
MAX_QUANT_TABLE_MASK];
 }
 
@@ -64,7 +64,7 @@ ivec2 get_pred(ivec2 sp, ivec2 off, int p, int sw, uint8_t 
quant_table_idx)
 return ivec2(base, predict(cur, VTYPE2(top)));
 }
 #else
-ivec2 get_pred(ivec2 sp, ivec2 off, int p, int sw, uint8_t quant_table_idx)
+ivec2 get_pred(readonly uimage2D pred, ivec2 sp, ivec2 off, int comp, int sw, 
uint8_t quant_table_idx)
 {
 const ivec2 yoff_border1 = off.x == 0 ? ivec2(1, -1) : ivec2(0, 0);
 sp += off;
@@ -73,15 +73,15 @@ ivec2 get_pred(ivec2 sp, ivec2 off, int p, int sw, uint8_t 
quant_table_idx)
  TYPE(0),
  TYPE(0));
 if (off.y > 0 && off != ivec2(0, 1))
-top[0] = TYPE(imageLoad(dec[p], sp + ivec2(-1, -1) + yoff_border1)[0]);
+top[0] = TYPE(imageLoad(pred, sp + ivec2(-1, -1) + 
yoff_border1)[comp]);
 if (off.y > 0) {
-top[1] = TYPE(imageLoad(dec[p], sp + ivec2(0, -1))[0]);
-top[2] = TYPE(imageLoad(dec[p], sp + ivec2(min(1, sw - off.x - 1), 
-1))[0]);
+top[1] = TYPE(imageLoad(pred, sp + ivec2(0, -1))[comp]);
+top[2] = TYPE(imageLoad(pred, sp + ivec2(min(1, sw - off.x - 1), 
-1))[comp]);
 }
 
 TYPE cur = TYPE(0);
 if (off != ivec2(0, 0))
-cur = TYPE(imageLoad(dec[p], sp + ivec2(-1,  0) + yoff_border1)[0]);
+cur = TYPE(imageLoad(pred, sp + ivec2(-1,  0) + yoff_border1)[comp]);
 
 int base = quant_table[quant_table_idx][0][(cur - top[0]) & 
MAX_QUANT_TABLE_MASK] +
quant_table[quant_table_idx][1][(top[0] - top[1]) & 
MAX_QUANT_TABLE_MASK] +
@@ -92,13 +92,13 @@ ivec2 get_pred(ivec2 sp, ivec2 off, int p, int sw, uint8_t 
quant_table_idx)
 TYPE cur2 = TYPE(0);
 if (off.x > 0 && off != ivec2(1, 0)) {
 const ivec2 yoff_border2 = off.x == 1 ? ivec2(1, -1) : ivec2(0, 0);
-cur2 = TYPE(imageLoad(dec[p], sp + ivec2(-2,  0) + 
yoff_border2)[0]);
+cur2 = TYPE(imageLoad(pred, sp + ivec2(-2,  0) + 
yoff_border2)[comp]);
 }
 base += quant_table[quant_table_idx][3][(cur2 - cur) & 
MAX_QUANT_TABLE_MASK];
 
 TYPE top2 = TYPE(0);
 if (off.y > 1)
-top2 = TYPE(imageLoad(d

[FFmpeg-devel] [PATCH 04/16] ffv1enc_vulkan: unify EC code between setup and encode

2025-05-14 Thread Lynne
---
 libavcodec/ffv1enc_vulkan.c   |  1 +
 libavcodec/vulkan/ffv1_enc.comp   |  7 ---
 libavcodec/vulkan/ffv1_enc_setup.comp | 10 +-
 libavcodec/vulkan/rangecoder.comp | 23 +++
 4 files changed, 17 insertions(+), 24 deletions(-)

diff --git a/libavcodec/ffv1enc_vulkan.c b/libavcodec/ffv1enc_vulkan.c
index d78ba3aca8..956463e932 100644
--- a/libavcodec/ffv1enc_vulkan.c
+++ b/libavcodec/ffv1enc_vulkan.c
@@ -976,6 +976,7 @@ static int init_setup_shader(AVCodecContext *avctx, 
FFVkSPIRVCompiler *spv)
 av_bprintf(&shd->src, "#define MAX_QUANT_TABLES %i\n", MAX_QUANT_TABLES);
 av_bprintf(&shd->src, "#define MAX_CONTEXT_INPUTS %i\n", 
MAX_CONTEXT_INPUTS);
 av_bprintf(&shd->src, "#define MAX_QUANT_TABLE_SIZE %i\n", 
MAX_QUANT_TABLE_SIZE);
+av_bprintf(&shd->src, "#define FULL_RENORM\n");
 
 desc_set = (FFVulkanDescriptorSetBinding []) {
 {
diff --git a/libavcodec/vulkan/ffv1_enc.comp b/libavcodec/vulkan/ffv1_enc.comp
index 7f8c831efa..a3c22f7459 100644
--- a/libavcodec/vulkan/ffv1_enc.comp
+++ b/libavcodec/vulkan/ffv1_enc.comp
@@ -63,13 +63,6 @@ ivec2 get_pred(readonly uimage2D pred, ivec2 sp, ivec2 off, 
int comp, int sw, ui
 }
 
 #ifndef GOLOMB
-void put_rac(inout RangeCoder c, uint64_t state, bool bit)
-{
-put_rac_norenorm(c, state, bit);
-if (c.range < 0x100)
-renorm_encoder(c);
-}
-
 /* Note - only handles signed values */
 void put_symbol(inout RangeCoder c, uint64_t state, int v)
 {
diff --git a/libavcodec/vulkan/ffv1_enc_setup.comp 
b/libavcodec/vulkan/ffv1_enc_setup.comp
index d395770ba8..6f21e47523 100644
--- a/libavcodec/vulkan/ffv1_enc_setup.comp
+++ b/libavcodec/vulkan/ffv1_enc_setup.comp
@@ -50,18 +50,18 @@ void init_slice(out SliceContext sc, const uint slice_idx)
 void put_usymbol(inout RangeCoder c, uint v)
 {
 bool is_nil = (v == 0);
-put_rac(c, state[0], is_nil);
+put_rac_direct(c, state[0], is_nil);
 if (is_nil)
 return;
 
 const int e = findMSB(v);
 
 for (int i = 0; i < e; i++)
-put_rac(c, state[1 + min(i, 9)], true);
-put_rac(c, state[1 + min(e, 9)], false);
+put_rac_direct(c, state[1 + min(i, 9)], true);
+put_rac_direct(c, state[1 + min(e, 9)], false);
 
 for (int i = e - 1; i >= 0; i--)
-put_rac(c, state[22 + min(i, 9)], bool(bitfieldExtract(v, i, 1)));
+put_rac_direct(c, state[22 + min(i, 9)], bool(bitfieldExtract(v, i, 
1)));
 }
 
 void write_slice_header(inout SliceContext sc)
@@ -83,7 +83,7 @@ void write_slice_header(inout SliceContext sc)
 put_usymbol(sc.c, sar.y);
 
 if (version >= 4) {
-put_rac(sc.c, state[0], sc.slice_reset_contexts);
+put_rac_direct(sc.c, state[0], sc.slice_reset_contexts);
 put_usymbol(sc.c, sc.slice_coding_mode);
 if (sc.slice_coding_mode != 1 && colorspace == 1) {
 put_usymbol(sc.c, sc.slice_rct_coef.y);
diff --git a/libavcodec/vulkan/rangecoder.comp 
b/libavcodec/vulkan/rangecoder.comp
index 1db42e1dc9..badc65293f 100644
--- a/libavcodec/vulkan/rangecoder.comp
+++ b/libavcodec/vulkan/rangecoder.comp
@@ -31,8 +31,9 @@ struct RangeCoder {
 uint8_t outstanding_byte;
 };
 
+#ifdef FULL_RENORM
 /* Full renorm version that can handle outstanding_byte == 0xFF */
-void renorm_encoder_full(inout RangeCoder c)
+void renorm_encoder(inout RangeCoder c)
 {
 int bs_cnt = 0;
 u8buf bytestream = u8buf(c.bytestream);
@@ -62,6 +63,8 @@ void renorm_encoder_full(inout RangeCoder c)
 c.low = bitfieldInsert(0, c.low, 8, 8);
 }
 
+#else
+
 /* Cannot deal with outstanding_byte == -1 in the name of speed */
 void renorm_encoder(inout RangeCoder c)
 {
@@ -90,8 +93,9 @@ void renorm_encoder(inout RangeCoder c)
 for (int i = 1; i < oc; i++)
 bs[i].v = fill;
 }
+#endif
 
-void put_rac_direct(inout RangeCoder c, uint8_t state, bool bit)
+void put_rac_direct(inout RangeCoder c, inout uint8_t state, bool bit)
 {
 int range1 = uint16_t((c.range * state) >> 8);
 
@@ -111,21 +115,16 @@ void put_rac_direct(inout RangeCoder c, uint8_t state, 
bool bit)
 } else {
 c.range  = diff;
 }
-}
 
-void put_rac_norenorm(inout RangeCoder c, uint64_t state, bool bit)
-{
-put_rac_direct(c, u8buf(state).v, bit);
+if (c.range < 0x100)
+renorm_encoder(c);
 
-u8buf(state).v = zero_one_state[(uint(bit) << 8) + u8buf(state).v];
+state = zero_one_state[(uint(bit) << 8) + state];
 }
 
-void put_rac(inout RangeCoder c, inout uint8_t state, bool bit)
+void put_rac(inout RangeCoder c, uint64_t state, bool bit)
 {
-put_rac_direct(c, state, bit);
-if (c.range < 0x100)
-renorm_encoder_full(c);
-state = zero_one_state[(uint(bit) << 8) + state];
+put_rac_direct(c, u8buf(state).v, bit);
 }
 
 /* Equiprobable bit */
-- 
2.49.0.395.g12beb8f557c

[FFmpeg-devel] [PATCH 05/16] ffv1enc_vulkan: minor EC optimizations

2025-05-14 Thread Lynne
---
 libavcodec/vulkan/rangecoder.comp | 19 ++-
 1 file changed, 6 insertions(+), 13 deletions(-)

diff --git a/libavcodec/vulkan/rangecoder.comp 
b/libavcodec/vulkan/rangecoder.comp
index badc65293f..9e2c5fbecf 100644
--- a/libavcodec/vulkan/rangecoder.comp
+++ b/libavcodec/vulkan/rangecoder.comp
@@ -109,14 +109,10 @@ void put_rac_direct(inout RangeCoder c, inout uint8_t 
state, bool bit)
 #endif
 
 int diff = c.range - range1;
-if (bit) {
-c.low   += diff;
-c.range  = range1;
-} else {
-c.range  = diff;
-}
+c.low += bit ? diff : 0;
+c.range = bit ? range1 : diff;
 
-if (c.range < 0x100)
+if (expectEXT(c.range < 0x100, false))
 renorm_encoder(c);
 
 state = zero_one_state[(uint(bit) << 8) + state];
@@ -139,12 +135,9 @@ void put_rac_equi(inout RangeCoder c, bool bit)
 debugPrintfEXT("Error: range1 <= 0");
 #endif
 
-if (bit) {
-c.low   += c.range - range1;
-c.range  = range1;
-} else {
-c.range -= range1;
-}
+int diff = c.range - range1;
+c.low += bit ? diff : 0;
+c.range = bit ? range1 : diff;
 
 if (expectEXT(c.range < 0x100, false))
 renorm_encoder(c);
-- 
2.49.0.395.g12beb8f557c
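
The patch above replaces an if/else in the range coder update with two conditional selects. The following C sketch (the shader itself is GLSL) illustrates why the two forms give the same result; RangeState, update_branchy and update_select are hypothetical names, and the asserts mirror the range1 checks in the shader's DEBUG block.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the low/range fields of the shader's RangeCoder. */
typedef struct RangeState {
    int low;
    int range;
} RangeState;

/* The form removed by the patch: explicit branch on the coded bit. */
void update_branchy(RangeState *c, int range1, bool bit)
{
    assert(range1 > 0 && range1 < c->range);
    int diff = c->range - range1;
    if (bit) {
        c->low  += diff;
        c->range = range1;
    } else {
        c->range = diff;
    }
}

/* The form added by the patch: two selects, no control flow. */
void update_select(RangeState *c, int range1, bool bit)
{
    assert(range1 > 0 && range1 < c->range);
    int diff = c->range - range1;
    c->low  += bit ? diff : 0;
    c->range = bit ? range1 : diff;
}
```

Whether the select form is actually faster depends on the shader compiler and the GPU; the sketch only shows that the arithmetic is unchanged.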


[FFmpeg-devel] [PATCH 11/16] ffv1enc_vulkan: implement RCT search for level >= 4

2025-05-14 Thread Lynne
---
 libavcodec/ffv1enc_vulkan.c| 204 -
 libavcodec/vulkan/Makefile |   2 +-
 libavcodec/vulkan/ffv1_enc_setup.comp  |   6 +-
 libavcodec/vulkan/ffv1_rct_search.comp | 139 +
 4 files changed, 346 insertions(+), 5 deletions(-)
 create mode 100644 libavcodec/vulkan/ffv1_rct_search.comp

diff --git a/libavcodec/ffv1enc_vulkan.c b/libavcodec/ffv1enc_vulkan.c
index 5de16d5b02..d9e12f5fae 100644
--- a/libavcodec/ffv1enc_vulkan.c
+++ b/libavcodec/ffv1enc_vulkan.c
@@ -74,6 +74,7 @@ typedef struct VulkanEncodeFFv1Context {
 size_t max_heap_size;
 
 FFVulkanShader setup;
+FFVulkanShader rct_search;
 FFVulkanShader reset;
 FFVulkanShader enc;
 
@@ -101,6 +102,7 @@ typedef struct VulkanEncodeFFv1Context {
 int num_h_slices;
 int num_v_slices;
 int force_pcm;
+int optimize_rct;
 
 int is_rgb;
 int ppi;
@@ -112,6 +114,7 @@ extern const char *ff_source_rangecoder_comp;
 extern const char *ff_source_ffv1_vlc_comp;
 extern const char *ff_source_ffv1_common_comp;
 extern const char *ff_source_ffv1_reset_comp;
+extern const char *ff_source_ffv1_rct_search_comp;
 extern const char *ff_source_ffv1_enc_setup_comp;
 extern const char *ff_source_ffv1_enc_comp;
 
@@ -147,7 +150,8 @@ typedef struct FFv1VkParameters {
 uint8_t ec;
 uint8_t ppi;
 uint8_t chunks;
-uint8_t padding[4];
+uint8_t rct_search;
+uint8_t padding[3];
 } FFv1VkParameters;
 
 static void add_push_data(FFVulkanShader *shd)
@@ -184,12 +188,76 @@ static void add_push_data(FFVulkanShader *shd)
 GLSLC(1,uint8_t ec;   
);
 GLSLC(1,uint8_t ppi;  
);
 GLSLC(1,uint8_t chunks;   
);
-GLSLC(1,uint8_t padding[4];   
);
+GLSLC(1,uint8_t rct_search;   
);
+GLSLC(1,uint8_t padding[3];   
);
 GLSLC(0, };   
);
 ff_vk_shader_add_push_const(shd, 0, sizeof(FFv1VkParameters),
 VK_SHADER_STAGE_COMPUTE_BIT);
 }
 
+typedef struct FFv1VkRCTSearchParameters {
+int fmt_lut[4];
+int rct_offset;
+uint8_t planar_rgb;
+uint8_t transparency;
+uint8_t key_frame;
+uint8_t force_pcm;
+uint8_t version;
+uint8_t micro_version;
+uint8_t padding[2];
+} FFv1VkRCTSearchParameters;
+
+static int run_rct_search(AVCodecContext *avctx, FFVkExecContext *exec,
+  AVFrame *enc_in, VkImageView *enc_in_views,
+  FFVkBuffer *slice_data_buf, uint32_t slice_data_size)
+{
+VulkanEncodeFFv1Context *fv = avctx->priv_data;
+FFV1Context *f = &fv->ctx;
+FFVulkanFunctions *vk = &fv->s.vkfn;
+AVHWFramesContext *src_hwfc = (AVHWFramesContext 
*)enc_in->hw_frames_ctx->data;
+FFv1VkRCTSearchParameters pd;
+
+/* Update descriptors */
+ff_vk_shader_update_desc_buffer(&fv->s, exec, &fv->rct_search,
+0, 0, 0,
+slice_data_buf,
+0, slice_data_size*f->slice_count,
+VK_FORMAT_UNDEFINED);
+ff_vk_shader_update_img_array(&fv->s, exec, &fv->rct_search,
+  enc_in, enc_in_views,
+  0, 1,
+  VK_IMAGE_LAYOUT_GENERAL,
+  VK_NULL_HANDLE);
+
+ff_vk_exec_bind_shader(&fv->s, exec, &fv->rct_search);
+
+pd = (FFv1VkRCTSearchParameters) {
+.rct_offset = 1 << f->bits_per_raw_sample,
+.planar_rgb = ff_vk_mt_is_np_rgb(src_hwfc->sw_format) &&
+  (ff_vk_count_images((AVVkFrame *)enc_in->data[0]) > 1),
+.transparency = f->transparency,
+.key_frame = f->key_frame,
+.force_pcm = fv->force_pcm,
+.version = f->version,
+.micro_version = f->micro_version,
+};
+
+if (avctx->sw_pix_fmt == AV_PIX_FMT_GBRP10 ||
+avctx->sw_pix_fmt == AV_PIX_FMT_GBRP12 ||
+avctx->sw_pix_fmt == AV_PIX_FMT_GBRP14)
+memcpy(pd.fmt_lut, (int [4]) { 2, 1, 0, 3 }, 4*sizeof(int));
+else
+ff_vk_set_perm(avctx->sw_pix_fmt, pd.fmt_lut, 1);
+
+ff_vk_shader_update_push_const(&fv->s, exec, &fv->rct_search,
+   VK_SHADER_STAGE_COMPUTE_BIT,
+   0, sizeof(pd), &pd);
+
+vk->CmdDispatch(exec->buf, fv->ctx.num_h_slices, fv->ctx.num_v_slices, 1);
+
+return 0;
+}
+
 static int vulkan_encode_ffv1_submit_frame(AVCodecContext *avctx,
FFVkExecContext *exec,
const AVFrame *pict)
@@ -366,6 +434,25 @@ static int 

[FFmpeg-devel] [PATCH 13/16] vulkan_ffv1: pipe through slice decoding status

2025-05-14 Thread Lynne
---
 libavcodec/vulkan/ffv1_dec.comp   |  4 ++
 libavcodec/vulkan/ffv1_dec_setup.comp |  4 +-
 libavcodec/vulkan_decode.c|  1 +
 libavcodec/vulkan_decode.h|  1 +
 libavcodec/vulkan_ffv1.c  | 60 +++
 5 files changed, 52 insertions(+), 18 deletions(-)

diff --git a/libavcodec/vulkan/ffv1_dec.comp b/libavcodec/vulkan/ffv1_dec.comp
index e73b3f1dc0..1d33b32c6b 100644
--- a/libavcodec/vulkan/ffv1_dec.comp
+++ b/libavcodec/vulkan/ffv1_dec.comp
@@ -291,4 +291,8 @@ void main(void)
 {
 const uint slice_idx = gl_WorkGroupID.y*gl_NumWorkGroups.x + 
gl_WorkGroupID.x;
 decode_slice(slice_ctx[slice_idx], slice_idx);
+
+uint32_t status = corrupt ? uint32_t(corrupt) : overread;
+if (status != 0)
+slice_status[2*slice_idx + 1] = status;
 }
diff --git a/libavcodec/vulkan/ffv1_dec_setup.comp 
b/libavcodec/vulkan/ffv1_dec_setup.comp
index a27a878927..671f28e7e7 100644
--- a/libavcodec/vulkan/ffv1_dec_setup.comp
+++ b/libavcodec/vulkan/ffv1_dec_setup.comp
@@ -133,6 +133,8 @@ void main(void)
 for (int i = 0; i < slice_size; i++)
 crc = crc_ieee[(crc & 0xFF) ^ uint32_t(bs[i].v)] ^ (crc >> 8);
 
-slice_crc_mismatch[slice_idx] = crc;
+slice_status[2*slice_idx + 0] = crc;
 }
+
+slice_status[2*slice_idx + 1] = corrupt ? uint32_t(corrupt) : overread;
 }
diff --git a/libavcodec/vulkan_decode.c b/libavcodec/vulkan_decode.c
index f1313c8409..7310ba1547 100644
--- a/libavcodec/vulkan_decode.c
+++ b/libavcodec/vulkan_decode.c
@@ -142,6 +142,7 @@ static void init_frame(FFVulkanDecodeContext *dec, 
FFVulkanDecodePicture *vkpic)
 
 vkpic->destroy_image_view = vk->DestroyImageView;
 vkpic->wait_semaphores = vk->WaitSemaphores;
+vkpic->invalidate_memory_ranges = vk->InvalidateMappedMemoryRanges;
 }
 
 int ff_vk_decode_prepare_frame(FFVulkanDecodeContext *dec, AVFrame *pic,
diff --git a/libavcodec/vulkan_decode.h b/libavcodec/vulkan_decode.h
index cbd22b3591..bf6506f280 100644
--- a/libavcodec/vulkan_decode.h
+++ b/libavcodec/vulkan_decode.h
@@ -114,6 +114,7 @@ typedef struct FFVulkanDecodePicture {
 /* Vulkan functions needed for destruction, as no other context is 
guaranteed to exist */
 PFN_vkWaitSemaphoreswait_semaphores;
 PFN_vkDestroyImageView  destroy_image_view;
+PFN_vkInvalidateMappedMemoryRanges invalidate_memory_ranges;
 } FFVulkanDecodePicture;
 
 /**
diff --git a/libavcodec/vulkan_ffv1.c b/libavcodec/vulkan_ffv1.c
index efbf5fa953..c839f4c387 100644
--- a/libavcodec/vulkan_ffv1.c
+++ b/libavcodec/vulkan_ffv1.c
@@ -221,7 +221,7 @@ static int vk_ffv1_start_frame(AVCodecContext  
*avctx,
   &fp->slice_status_buf,
   VK_BUFFER_USAGE_STORAGE_BUFFER_BIT |
   VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT,
-  NULL, f->slice_count*sizeof(uint32_t),
+  NULL, 2*f->slice_count*sizeof(uint32_t),
   VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT |
   VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT);
 if (err < 0)
@@ -408,7 +408,7 @@ static int vk_ffv1_end_frame(AVCodecContext *avctx)
 ff_vk_shader_update_desc_buffer(&ctx->s, exec, &fv->setup,
 1, 2, 0,
 slice_status,
-0, f->slice_count*sizeof(uint32_t),
+0, 2*f->slice_count*sizeof(uint32_t),
 VK_FORMAT_UNDEFINED);
 
 ff_vk_exec_bind_shader(&ctx->s, exec, &fv->setup);
@@ -538,10 +538,15 @@ static int vk_ffv1_end_frame(AVCodecContext *avctx)
   1, 1,
   VK_IMAGE_LAYOUT_GENERAL,
   VK_NULL_HANDLE);
+ff_vk_shader_update_desc_buffer(&ctx->s, exec, decode_shader,
+1, 2, 0,
+slice_status,
+0, 2*f->slice_count*sizeof(uint32_t),
+VK_FORMAT_UNDEFINED);
 if (is_rgb)
 ff_vk_shader_update_img_array(&ctx->s, exec, decode_shader,
   f->picture.f, vp->view.out,
-  1, 2,
+  1, 3,
   VK_IMAGE_LAYOUT_GENERAL,
   VK_NULL_HANDLE);
 
@@ -700,8 +705,8 @@ static int init_setup_shader(FFV1Context *f, 
FFVulkanContext *s,
 .type= VK_DESCRIPTOR_TYPE_STORAGE_BUFFER,
 .stages  = VK_SHADER_STAGE_COMPUTE_BIT,
 .mem_quali   = "writeonly",
-.buf_content = "uint32_t slice_crc_mismatch",
-.buf_elems   = f->max_slice_count,
+.buf_content = "uint32_t slic

[FFmpeg-devel] [PATCH 14/16] vulkan: enable VK_KHR_shader_subgroup_rotate

2025-05-14 Thread Lynne
Yet another thing that should've been always present.
---
 libavutil/hwcontext_vulkan.c | 5 +
 libavutil/vulkan_functions.h | 1 +
 libavutil/vulkan_loader.h| 1 +
 3 files changed, 7 insertions(+)

diff --git a/libavutil/hwcontext_vulkan.c b/libavutil/hwcontext_vulkan.c
index 978d7e29d3..eded36bc01 100644
--- a/libavutil/hwcontext_vulkan.c
+++ b/libavutil/hwcontext_vulkan.c
@@ -79,6 +79,7 @@ typedef struct VulkanDeviceFeatures {
 VkPhysicalDeviceVulkan12Features vulkan_1_2;
 VkPhysicalDeviceVulkan13Features vulkan_1_3;
 VkPhysicalDeviceTimelineSemaphoreFeatures timeline_semaphore;
+VkPhysicalDeviceShaderSubgroupRotateFeaturesKHR subgroup_rotate;
 
 #ifdef VK_KHR_shader_expect_assume
 VkPhysicalDeviceShaderExpectAssumeFeaturesKHR expect_assume;
@@ -205,6 +206,8 @@ static void device_features_init(AVHWDeviceContext *ctx, 
VulkanDeviceFeatures *f
 
 FF_VK_STRUCT_EXT(s, &feats->device, &feats->timeline_semaphore, 
FF_VK_EXT_PORTABILITY_SUBSET,
  
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_TIMELINE_SEMAPHORE_FEATURES);
+FF_VK_STRUCT_EXT(s, &feats->device, &feats->subgroup_rotate, 
FF_VK_EXT_SUBGROUP_ROTATE,
+ 
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SHADER_SUBGROUP_ROTATE_FEATURES);
 
 #ifdef VK_KHR_shader_expect_assume
 FF_VK_STRUCT_EXT(s, &feats->device, &feats->expect_assume, 
FF_VK_EXT_EXPECT_ASSUME,
@@ -283,6 +286,7 @@ static void 
device_features_copy_needed(VulkanDeviceFeatures *dst, VulkanDeviceF
 COPY_VAL(vulkan_1_3.dynamicRendering);
 
 COPY_VAL(timeline_semaphore.timelineSemaphore);
+COPY_VAL(subgroup_rotate.shaderSubgroupRotate);
 
 COPY_VAL(video_maintenance_1.videoMaintenance1);
 #ifdef VK_KHR_video_maintenance2
@@ -588,6 +592,7 @@ static const VulkanOptExtension optional_device_exts[] = {
 { VK_KHR_COOPERATIVE_MATRIX_EXTENSION_NAME,   
FF_VK_EXT_COOP_MATRIX},
 { VK_NV_OPTICAL_FLOW_EXTENSION_NAME,  
FF_VK_EXT_OPTICAL_FLOW   },
 { VK_EXT_SHADER_OBJECT_EXTENSION_NAME,
FF_VK_EXT_SHADER_OBJECT  },
+{ VK_KHR_SHADER_SUBGROUP_ROTATE_EXTENSION_NAME,   
FF_VK_EXT_SUBGROUP_ROTATE},
 #ifdef VK_KHR_shader_expect_assume
 { VK_KHR_SHADER_EXPECT_ASSUME_EXTENSION_NAME, 
FF_VK_EXT_EXPECT_ASSUME  },
 #endif
diff --git a/libavutil/vulkan_functions.h b/libavutil/vulkan_functions.h
index cd61d71577..8b413013e6 100644
--- a/libavutil/vulkan_functions.h
+++ b/libavutil/vulkan_functions.h
@@ -48,6 +48,7 @@ typedef uint64_t FFVulkanExtensions;
 #define FF_VK_EXT_PUSH_DESCRIPTOR(1ULL << 14) /* 
VK_KHR_push_descriptor */
 #define FF_VK_EXT_RELAXED_EXTENDED_INSTR (1ULL << 15) /* 
VK_KHR_shader_relaxed_extended_instruction */
 #define FF_VK_EXT_EXPECT_ASSUME  (1ULL << 16) /* 
VK_KHR_shader_expect_assume */
+#define FF_VK_EXT_SUBGROUP_ROTATE(1ULL << 17) /* 
VK_KHR_shader_subgroup_rotate */
 
 /* Video extensions */
 #define FF_VK_EXT_VIDEO_QUEUE(1ULL << 36) /* VK_KHR_video_queue */
diff --git a/libavutil/vulkan_loader.h b/libavutil/vulkan_loader.h
index eaf6e2e6bb..a7976fe560 100644
--- a/libavutil/vulkan_loader.h
+++ b/libavutil/vulkan_loader.h
@@ -58,6 +58,7 @@ static inline uint64_t ff_vk_extensions_to_mask(const char * 
const *extensions,
 { VK_KHR_COOPERATIVE_MATRIX_EXTENSION_NAME,
FF_VK_EXT_COOP_MATRIX},
 { VK_NV_OPTICAL_FLOW_EXTENSION_NAME,   
FF_VK_EXT_OPTICAL_FLOW   },
 { VK_EXT_SHADER_OBJECT_EXTENSION_NAME, 
FF_VK_EXT_SHADER_OBJECT  },
+{ VK_KHR_SHADER_SUBGROUP_ROTATE_EXTENSION_NAME,
FF_VK_EXT_SUBGROUP_ROTATE},
 { VK_KHR_VIDEO_MAINTENANCE_1_EXTENSION_NAME,   
FF_VK_EXT_VIDEO_MAINTENANCE_1},
 #ifdef VK_KHR_video_maintenance2
 { VK_KHR_VIDEO_MAINTENANCE_2_EXTENSION_NAME,   
FF_VK_EXT_VIDEO_MAINTENANCE_2},
-- 
2.49.0.395.g12beb8f557c


[FFmpeg-devel] [PATCH 12/16] vulkan/ffv1: unify encode and decode get/put primitives

2025-05-14 Thread Lynne
This simply makes a get_rac/put_rac_internal variant that can be
reused.
---
 libavcodec/vulkan/rangecoder.comp | 57 +--
 1 file changed, 17 insertions(+), 40 deletions(-)

diff --git a/libavcodec/vulkan/rangecoder.comp 
b/libavcodec/vulkan/rangecoder.comp
index 9e2c5fbecf..8687b8bc3c 100644
--- a/libavcodec/vulkan/rangecoder.comp
+++ b/libavcodec/vulkan/rangecoder.comp
@@ -95,26 +95,26 @@ void renorm_encoder(inout RangeCoder c)
 }
 #endif
 
-void put_rac_direct(inout RangeCoder c, inout uint8_t state, bool bit)
+void put_rac_internal(inout RangeCoder c, const int range1, bool bit)
 {
-int range1 = uint16_t((c.range * state) >> 8);
-
 #ifdef DEBUG
-if (state == 0)
-debugPrintfEXT("Error: state is zero");
 if (range1 >= c.range)
 debugPrintfEXT("Error: range1 >= c.range");
 if (range1 <= 0)
 debugPrintfEXT("Error: range1 <= 0");
 #endif
 
-int diff = c.range - range1;
-c.low += bit ? diff : 0;
-c.range = bit ? range1 : diff;
+int ranged = c.range - range1;
+c.low += bit ? ranged : 0;
+c.range = bit ? range1 : ranged;
 
 if (expectEXT(c.range < 0x100, false))
 renorm_encoder(c);
+}
 
+void put_rac_direct(inout RangeCoder c, inout uint8_t state, bool bit)
+{
+put_rac_internal(c, (c.range * state) >> 8, bit);
 state = zero_one_state[(uint(bit) << 8) + state];
 }
 
@@ -126,21 +126,7 @@ void put_rac(inout RangeCoder c, uint64_t state, bool bit)
 /* Equiprobable bit */
 void put_rac_equi(inout RangeCoder c, bool bit)
 {
-int range1 = c.range >> 1;
-
-#ifdef DEBUG
-if (range1 >= c.range)
-debugPrintfEXT("Error: range1 >= c.range");
-if (range1 <= 0)
-debugPrintfEXT("Error: range1 <= 0");
-#endif
-
-int diff = c.range - range1;
-c.low += bit ? diff : 0;
-c.range = bit ? range1 : diff;
-
-if (expectEXT(c.range < 0x100, false))
-renorm_encoder(c);
+put_rac_internal(c, c.range >> 1, bit);
 }
 
 void put_rac_terminate(inout RangeCoder c)
@@ -224,11 +210,9 @@ void refill(inout RangeCoder c)
 }
 }
 
-bool get_rac_direct(inout RangeCoder c, inout uint8_t state)
+bool get_rac_internal(inout RangeCoder c, const int range1)
 {
-int range1 = c.range * state >> 8;
 int ranged = c.range - range1;
-
 bool bit = c.low >= ranged;
 c.low -= bit ? ranged : 0;
 c.range = (bit ? 0 : ranged) + (bit ? range1 : 0);
@@ -236,6 +220,12 @@ bool get_rac_direct(inout RangeCoder c, inout uint8_t 
state)
 if (expectEXT(c.range < 0x100, false))
 refill(c);
 
+return bit;
+}
+
+bool get_rac_direct(inout RangeCoder c, inout uint8_t state)
+{
+bool bit = get_rac_internal(c, c.range * state >> 8);
 state = zero_one_state[state + (bit ? 256 : 0)];
 return bit;
 }
@@ -247,18 +237,5 @@ bool get_rac(inout RangeCoder c, uint64_t state)
 
 bool get_rac_equi(inout RangeCoder c)
 {
-int range1 = c.range >> 1;
-
-c.range -= range1;
-
-bool bit = c.low >= c.range;
-if (bit) {
-c.low -= c.range;
-c.range = range1;
-}
-
-if (expectEXT(c.range < 0x100, false))
-refill(c);
-
-return bit;
+return get_rac_internal(c, c.range >> 1);
 }
-- 
2.49.0.395.g12beb8f557c


[FFmpeg-devel] [PATCH 03/16] ffv1enc_vulkan: get rid of temporary data for the setup shader

2025-05-14 Thread Lynne
---
 libavcodec/ffv1enc_vulkan.c   | 21 -
 libavcodec/vulkan/ffv1_enc_setup.comp | 65 +++
 libavcodec/vulkan/rangecoder.comp | 28 +++-
 3 files changed, 42 insertions(+), 72 deletions(-)

diff --git a/libavcodec/ffv1enc_vulkan.c b/libavcodec/ffv1enc_vulkan.c
index f4b54b8375..d78ba3aca8 100644
--- a/libavcodec/ffv1enc_vulkan.c
+++ b/libavcodec/ffv1enc_vulkan.c
@@ -88,9 +88,6 @@ typedef struct VulkanEncodeFFv1Context {
 AVBufferPool *out_data_pool;
 AVBufferPool *pkt_data_pool;
 
-/* Temporary data buffer */
-AVBufferPool *tmp_data_pool;
-
 /* Slice results buffer */
 AVBufferPool *results_data_pool;
 
@@ -303,11 +300,6 @@ static int vulkan_encode_ffv1_submit_frame(AVCodecContext 
*avctx,
 
 AVFrame *intermediate_frame = NULL;
 
-/* Temporary data */
-size_t tmp_data_size;
-AVBufferRef *tmp_data_ref;
-FFVkBuffer *tmp_data_buf;
-
 /* Slice data */
 AVBufferRef *slice_data_ref;
 FFVkBuffer *slice_data_buf;
@@ -352,17 +344,6 @@ static int vulkan_encode_ffv1_submit_frame(AVCodecContext 
*avctx,
 
 f->slice_count = f->max_slice_count;
 
-/* Allocate temporary data buffer */
-tmp_data_size = f->slice_count*CONTEXT_SIZE;
-RET(ff_vk_get_pooled_buffer(&fv->s, &fv->tmp_data_pool,
-&tmp_data_ref,
-VK_BUFFER_USAGE_STORAGE_BUFFER_BIT |
-VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT,
-NULL, tmp_data_size,
-VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT));
-tmp_data_buf = (FFVkBuffer *)tmp_data_ref->data;
-ff_vk_exec_add_dep_buf(&fv->s, exec, &tmp_data_ref, 1, 0);
-
 /* Allocate slice buffer data */
 if (f->ac == AC_GOLOMB_RICE)
 plane_state_size = 8;
@@ -481,7 +462,6 @@ static int vulkan_encode_ffv1_submit_frame(AVCodecContext 
*avctx,
 ff_vk_exec_bind_shader(&fv->s, exec, &fv->setup);
 pd = (FFv1VkParameters) {
 .slice_state = slice_data_buf->address + f->slice_count*256,
-.scratch_data = tmp_data_buf->address,
 .out_data = out_data_buf->address,
 .bits_per_raw_sample = f->bits_per_raw_sample,
 .sar[0] = pict->sample_aspect_ratio.num,
@@ -1698,7 +1678,6 @@ static av_cold int 
vulkan_encode_ffv1_close(AVCodecContext *avctx)
 
 av_buffer_pool_uninit(&fv->out_data_pool);
 av_buffer_pool_uninit(&fv->pkt_data_pool);
-av_buffer_pool_uninit(&fv->tmp_data_pool);
 
 av_buffer_unref(&fv->keyframe_slice_data_ref);
 av_buffer_pool_uninit(&fv->slice_data_pool);
diff --git a/libavcodec/vulkan/ffv1_enc_setup.comp 
b/libavcodec/vulkan/ffv1_enc_setup.comp
index 44c13404d8..d395770ba8 100644
--- a/libavcodec/vulkan/ffv1_enc_setup.comp
+++ b/libavcodec/vulkan/ffv1_enc_setup.comp
@@ -20,6 +20,8 @@
  * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
  */
 
+uint8_t state[CONTEXT_SIZE];
+
 void init_slice(out SliceContext sc, const uint slice_idx)
 {
 /* Set coordinates */
@@ -45,67 +47,54 @@ void init_slice(out SliceContext sc, const uint slice_idx)
  slice_size_max);
 }
 
-void put_rac_full(inout RangeCoder c, uint64_t state, bool bit)
-{
-put_rac_norenorm(c, state, bit);
-if (c.range < 0x100)
-renorm_encoder_full(c);
-}
-
-void put_symbol_unsigned(inout RangeCoder c, uint64_t state, uint v)
+void put_usymbol(inout RangeCoder c, uint v)
 {
 bool is_nil = (v == 0);
-put_rac_full(c, state, is_nil);
+put_rac(c, state[0], is_nil);
 if (is_nil)
 return;
 
 const int e = findMSB(v);
 
-state += 1;
 for (int i = 0; i < e; i++)
-put_rac_full(c, state + min(i, 9), true);
-put_rac_full(c, state + min(e, 9), false);
+put_rac(c, state[1 + min(i, 9)], true);
+put_rac(c, state[1 + min(e, 9)], false);
 
-state += 21;
 for (int i = e - 1; i >= 0; i--)
-put_rac_full(c, state + min(i, 9), bool(bitfieldExtract(v, i, 1)));
+put_rac(c, state[22 + min(i, 9)], bool(bitfieldExtract(v, i, 1)));
 }
 
-void write_slice_header(inout SliceContext sc, uint64_t state)
+void write_slice_header(inout SliceContext sc)
 {
-u8buf sb = u8buf(state);
-
 [[unroll]]
 for (int i = 0; i < CONTEXT_SIZE; i++)
-sb[i].v = uint8_t(128);
+state[i] = uint8_t(128);
 
-put_symbol_unsigned(sc.c, state, gl_WorkGroupID.x);
-put_symbol_unsigned(sc.c, state, gl_WorkGroupID.y);
-put_symbol_unsigned(sc.c, state, 0);
-put_symbol_unsigned(sc.c, state, 0);
+put_usymbol(sc.c, gl_WorkGroupID.x);
+put_usymbol(sc.c, gl_WorkGroupID.y);
+put_usymbol(sc.c, 0);
+put_usymbol(sc.c, 0);
 
 for (int i = 0; i < codec_planes; i++)
-put_symbol_unsigned(sc.c, state, sc.quant_table_idx[i]);
+put_usymbol(sc.c, sc.quant_table_idx[i]);
 
-put_symbol_unsigned(sc.c, state, pic_mode);
-put_symbol_unsigned(sc.c, state, sar.x);
-put_symb

[FFmpeg-devel] [PATCH 06/16] ffv1enc_vulkan: switch to 2-line cache, unify prediction code

2025-05-14 Thread Lynne
---
 libavcodec/ffv1enc_vulkan.c| 379 +
 libavcodec/vulkan/ffv1_common.comp |  87 +++
 libavcodec/vulkan/ffv1_dec.comp|  91 +--
 libavcodec/vulkan/ffv1_enc.comp| 155 ++--
 libavcodec/vulkan_ffv1.c   |   5 +-
 5 files changed, 288 insertions(+), 429 deletions(-)

diff --git a/libavcodec/ffv1enc_vulkan.c b/libavcodec/ffv1enc_vulkan.c
index 956463e932..bab9bb640b 100644
--- a/libavcodec/ffv1enc_vulkan.c
+++ b/libavcodec/ffv1enc_vulkan.c
@@ -37,6 +37,9 @@
 #define LG_ALIGN_W 32
 #define LG_ALIGN_H 32
 
+/* Unlike the decoder, we need 4 lines (but really only 3) */
+#define RGB_LINECACHE 4
+
 typedef struct VulkanEncodeFFv1FrameData {
 /* Output data */
 AVBufferRef *out_data_ref;
@@ -72,7 +75,6 @@ typedef struct VulkanEncodeFFv1Context {
 
 FFVulkanShader setup;
 FFVulkanShader reset;
-FFVulkanShader rct;
 FFVulkanShader enc;
 
 /* Constant read-only buffers */
@@ -111,7 +113,6 @@ extern const char *ff_source_rangecoder_comp;
 extern const char *ff_source_ffv1_vlc_comp;
 extern const char *ff_source_ffv1_common_comp;
 extern const char *ff_source_ffv1_reset_comp;
-extern const char *ff_source_ffv1_enc_rct_comp;
 extern const char *ff_source_ffv1_enc_setup_comp;
 extern const char *ff_source_ffv1_enc_comp;
 
@@ -120,6 +121,7 @@ typedef struct FFv1VkParameters {
 VkDeviceAddress scratch_data;
 VkDeviceAddress out_data;
 
+int32_t fmt_lut[4];
 int32_t sar[2];
 uint32_t chroma_shift[2];
 
@@ -127,7 +129,9 @@ typedef struct FFv1VkParameters {
 uint32_t context_count;
 uint32_t crcref;
 uint32_t slice_size_max;
+int  rct_offset;
 
+uint8_t extend_lookup[8];
 uint8_t bits_per_raw_sample;
 uint8_t context_model;
 uint8_t version;
@@ -137,13 +141,14 @@ typedef struct FFv1VkParameters {
 uint8_t components;
 uint8_t planes;
 uint8_t codec_planes;
+uint8_t planar_rgb;
 uint8_t transparency;
 uint8_t colorspace;
 uint8_t pic_mode;
 uint8_t ec;
 uint8_t ppi;
 uint8_t chunks;
-uint8_t padding[1];
+uint8_t padding[4];
 } FFv1VkParameters;
 
 static void add_push_data(FFVulkanShader *shd)
@@ -153,6 +158,7 @@ static void add_push_data(FFVulkanShader *shd)
 GLSLC(1,u8buf scratch_data;   
);
 GLSLC(1,u8buf out_data;   
);
 GLSLC(0,  
);
+GLSLC(1,ivec4 fmt_lut;
);
 GLSLC(1,ivec2 sar;
);
 GLSLC(1,uvec2 chroma_shift;   
);
 GLSLC(0,  
);
@@ -160,7 +166,9 @@ static void add_push_data(FFVulkanShader *shd)
 GLSLC(1,uint context_count;   
);
 GLSLC(1,uint32_t crcref;  
);
 GLSLC(1,uint32_t slice_size_max;  
);
+GLSLC(1,int rct_offset;   
);
 GLSLC(0,  
);
+GLSLC(1,uint8_t extend_lookup[8]; 
);
 GLSLC(1,uint8_t bits_per_raw_sample;  
);
 GLSLC(1,uint8_t context_model;
);
 GLSLC(1,uint8_t version;  
);
@@ -170,122 +178,19 @@ static void add_push_data(FFVulkanShader *shd)
 GLSLC(1,uint8_t components;   
);
 GLSLC(1,uint8_t planes;   
);
 GLSLC(1,uint8_t codec_planes; 
);
+GLSLC(1,uint8_t planar_rgb;   
);
 GLSLC(1,uint8_t transparency; 
);
 GLSLC(1,uint8_t colorspace;   
);
 GLSLC(1,uint8_t pic_mode; 
);
 GLSLC(1,uint8_t ec;   
);
 GLSLC(1,uint8_t ppi;  
);
 GLSLC(1,uint8_t chunks;   
);
-GLSLC(1,uint8_t padding[1];   
);
+GLSLC(1,uint8_t padding[4];   
);
 GLSLC(0, };   
);
 ff_vk_shader_add_push_const(shd, 0, sizeof(FFv1VkParameters),
 VK_SHADER_STAGE_COMPUTE_BIT);
 }
 
-static int run_rct(AVCodecContext *avctx, FFVkEx

[FFmpeg-devel] [PATCH 10/16] ffv1enc_vulkan: implement the cached EC writer from the decoder

2025-05-14 Thread Lynne
This gives a 35% speedup on AMD and 50% on Nvidia.
---
 libavcodec/ffv1enc_vulkan.c |  6 ++-
 libavcodec/vulkan/ffv1_enc.comp | 68 ++---
 2 files changed, 50 insertions(+), 24 deletions(-)

diff --git a/libavcodec/ffv1enc_vulkan.c b/libavcodec/ffv1enc_vulkan.c
index c2eb73ca53..5de16d5b02 100644
--- a/libavcodec/ffv1enc_vulkan.c
+++ b/libavcodec/ffv1enc_vulkan.c
@@ -1099,12 +1099,13 @@ static int init_encode_shader(AVCodecContext *avctx, 
FFVkSPIRVCompiler *spv)
 uint8_t *spv_data;
 size_t spv_len;
 void *spv_opaque = NULL;
+int use_cached_reader = fv->ctx.ac != AC_GOLOMB_RICE;
 
 RET(ff_vk_shader_init(&fv->s, shd, "ffv1_enc",
   VK_SHADER_STAGE_COMPUTE_BIT,
   (const char *[]) { "GL_EXT_buffer_reference",
  "GL_EXT_buffer_reference2" }, 2,
-  1, 1, 1,
+  use_cached_reader ? CONTEXT_SIZE : 1, 1, 1,
   0));
 
 /* Common codec header */
@@ -1116,6 +1117,9 @@ static int init_encode_shader(AVCodecContext *avctx, 
FFVkSPIRVCompiler *spv)
 av_bprintf(&shd->src, "#define MAX_CONTEXT_INPUTS %i\n", 
MAX_CONTEXT_INPUTS);
 av_bprintf(&shd->src, "#define MAX_QUANT_TABLE_SIZE %i\n", 
MAX_QUANT_TABLE_SIZE);
 
+if (use_cached_reader)
+av_bprintf(&shd->src, "#define CACHED_SYMBOL_READER 1\n");
+
 desc_set = (FFVulkanDescriptorSetBinding []) {
 {
 .name= "rangecoder_static_buf",
diff --git a/libavcodec/vulkan/ffv1_enc.comp b/libavcodec/vulkan/ffv1_enc.comp
index db33c414e1..65a7df1359 100644
--- a/libavcodec/vulkan/ffv1_enc.comp
+++ b/libavcodec/vulkan/ffv1_enc.comp
@@ -21,27 +21,32 @@
  */
 
 #ifndef GOLOMB
+#ifdef CACHED_SYMBOL_READER
+shared uint8_t state[CONTEXT_SIZE];
+#define WRITE(c, off, val) put_rac_direct(c, state[off], val)
+#else
+#define WRITE(c, off, val) put_rac(c, uint64_t(slice_state) + (state_off + 
off), val)
+#endif
+
 /* Note - only handles signed values */
-void put_symbol(inout RangeCoder c, uint64_t state, int v)
+void put_symbol(inout RangeCoder c, uint state_off, int v)
 {
 bool is_nil = (v == 0);
-put_rac(c, state, is_nil);
+WRITE(c, 0, is_nil);
 if (is_nil)
 return;
 
 const int a = abs(v);
 const int e = findMSB(a);
 
-state += 1;
 for (int i = 0; i < e; i++)
-put_rac(c, state + min(i, 9), true);
-put_rac(c, state + min(e, 9), false);
+WRITE(c, 1 + min(i, 9), true);
+WRITE(c, 1 + min(e, 9), false);
 
-state += 21;
 for (int i = e - 1; i >= 0; i--)
-put_rac(c, state + min(i, 9), bool(bitfieldExtract(a, i, 1)));
+WRITE(c, 22 + min(i, 9), bool(bitfieldExtract(a, i, 1)));
 
-put_rac(c, state - 11 + min(e, 10), v < 0);
+WRITE(c, 22 - 11 + min(e, 10), v < 0);
 }
 
 void encode_line_pcm(inout SliceContext sc, readonly uimage2D img,
@@ -49,6 +54,11 @@ void encode_line_pcm(inout SliceContext sc, readonly 
uimage2D img,
 {
 int w = sc.slice_dim.x;
 
+#ifdef CACHED_SYMBOL_READER
+if (gl_LocalInvocationID.x > 0)
+return;
+#endif
+
 #ifndef RGB
 if (p > 0 && p < 3) {
 w >>= chroma_shift.x;
@@ -63,7 +73,7 @@ void encode_line_pcm(inout SliceContext sc, readonly uimage2D 
img,
 }
 }
 
-void encode_line(inout SliceContext sc, readonly uimage2D img, uint64_t state,
+void encode_line(inout SliceContext sc, readonly uimage2D img, uint state_off,
  ivec2 sp, int y, int p, int comp, int bits,
  uint8_t quant_table_idx, const int run_index)
 {
@@ -86,13 +96,25 @@ void encode_line(inout SliceContext sc, readonly uimage2D 
img, uint64_t state,
 
 d[1] = fold(d[1], bits);
 
-put_symbol(sc.c, state + CONTEXT_SIZE*d[0], d[1]);
+uint context_off = state_off + CONTEXT_SIZE*d[0];
+#ifdef CACHED_SYMBOL_READER
+u8buf sb = u8buf(uint64_t(slice_state) + context_off + 
gl_LocalInvocationID.x);
+state[gl_LocalInvocationID.x] = sb.v;
+barrier();
+if (gl_LocalInvocationID.x == 0)
+#endif
+
+put_symbol(sc.c, context_off, d[1]);
+
+#ifdef CACHED_SYMBOL_READER
+sb.v = state[gl_LocalInvocationID.x];
+#endif
 }
 }
 
 #else /* GOLOMB */
 
-void encode_line(inout SliceContext sc, readonly uimage2D img, uint64_t state,
+void encode_line(inout SliceContext sc, readonly uimage2D img, uint state_off,
  ivec2 sp, int y, int p, int comp, int bits,
  uint8_t quant_table_idx, inout int run_index)
 {
@@ -143,7 +165,7 @@ void encode_line(inout SliceContext sc, readonly uimage2D 
img, uint64_t state,
 }
 
 if (!run_mode) {
-VlcState sb = VlcState(state + VLC_STATE_SIZE*d[0]);
+VlcState sb = VlcState(uint64_t(slice_state) + state_off + 
VLC_STATE_SIZE*d[0]);
 Symbol sym = get_vlc_symbol(sb, d[1], bits);
 put_bits(sc.pb, sym.bits, sym.val);
 }
@@ -

[FFmpeg-devel] [PATCH 08/16] ffv1enc_vulkan: use ff_get_encode_buffer

2025-05-14 Thread Lynne
We used to create our own buffer, but still used the DR1 flag,
which is not how it's supposed to work.

Instead, use ff_get_encode_buffer, and either host-map the buffer
before copying each slice via GPU transfers, or just copy each
slice manually if that fails or is unavailable.
---
 libavcodec/ffv1enc_vulkan.c | 98 +
 1 file changed, 57 insertions(+), 41 deletions(-)

diff --git a/libavcodec/ffv1enc_vulkan.c b/libavcodec/ffv1enc_vulkan.c
index bab9bb640b..c2eb73ca53 100644
--- a/libavcodec/ffv1enc_vulkan.c
+++ b/libavcodec/ffv1enc_vulkan.c
@@ -88,7 +88,6 @@ typedef struct VulkanEncodeFFv1Context {
 
 /* Output data buffer */
 AVBufferPool *out_data_pool;
-AVBufferPool *pkt_data_pool;
 
 /* Slice results buffer */
 AVBufferPool *results_data_pool;
@@ -299,8 +298,11 @@ static int vulkan_encode_ffv1_submit_frame(AVCodecContext 
*avctx,
 VK_BUFFER_USAGE_STORAGE_BUFFER_BIT |
 VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT,
 NULL, maxsize,
-maxsize < fv->max_heap_size ?
-VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT : 0x0));
+VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT |
+(maxsize < fv->max_heap_size ?
+ VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT : 0x0) |
+(!(fv->s.extensions & 
FF_VK_EXT_EXTERNAL_HOST_MEMORY) ?
+ VK_MEMORY_PROPERTY_HOST_CACHED_BIT : 0x0)));
 out_data_buf = (FFVkBuffer *)fd->out_data_ref->data;
 ff_vk_exec_add_dep_buf(&fv->s, exec, &fd->out_data_ref, 1, 1);
 
@@ -583,10 +585,10 @@ fail:
 return err;
 }
 
-static int download_slices(AVCodecContext *avctx,
+static int transfer_slices(AVCodecContext *avctx,
VkBufferCopy *buf_regions, int nb_regions,
VulkanEncodeFFv1FrameData *fd,
-   AVBufferRef *pkt_data_ref)
+   uint8_t *dst, AVBufferRef *dst_ref)
 {
 int err;
 VulkanEncodeFFv1Context *fv = avctx->priv_data;
@@ -594,11 +596,20 @@ static int download_slices(AVCodecContext *avctx,
 FFVkExecContext *exec;
 
 FFVkBuffer *out_data_buf = (FFVkBuffer *)fd->out_data_ref->data;
-FFVkBuffer *pkt_data_buf = (FFVkBuffer *)pkt_data_ref->data;
+
+AVBufferRef *mapped_ref;
+FFVkBuffer *mapped_buf;
 
 VkBufferMemoryBarrier2 buf_bar[8];
 int nb_buf_bar = 0;
 
+err = ff_vk_host_map_buffer(&fv->s, &mapped_ref, dst, dst_ref,
+VK_BUFFER_USAGE_TRANSFER_DST_BIT);
+if (err < 0)
+return err;
+
+mapped_buf = (FFVkBuffer *)mapped_ref->data;
+
 /* Transfer the slices */
 exec = ff_vk_exec_get(&fv->s, &fv->transfer_exec_pool);
 ff_vk_exec_start(&fv->s, exec);
@@ -606,7 +617,8 @@ static int download_slices(AVCodecContext *avctx,
 ff_vk_exec_add_dep_buf(&fv->s, exec, &fd->out_data_ref, 1, 0);
 fd->out_data_ref = NULL; /* Ownership passed */
 
-ff_vk_exec_add_dep_buf(&fv->s, exec, &pkt_data_ref, 1, 1);
+ff_vk_exec_add_dep_buf(&fv->s, exec, &mapped_ref, 1, 0);
+mapped_ref = NULL; /* Ownership passed */
 
 /* Ensure the output buffer is finished */
 buf_bar[nb_buf_bar++] = (VkBufferMemoryBarrier2) {
@@ -630,8 +642,11 @@ static int download_slices(AVCodecContext *avctx,
 out_data_buf->access = buf_bar[0].dstAccessMask;
 nb_buf_bar = 0;
 
+for (int i = 0; i < nb_regions; i++)
+buf_regions[i].dstOffset += mapped_buf->virtual_offset;
+
 vk->CmdCopyBuffer(exec->buf,
-  out_data_buf->buf, pkt_data_buf->buf,
+  out_data_buf->buf, mapped_buf->buf,
   nb_regions, buf_regions);
 
 /* Submit */
@@ -642,18 +657,6 @@ static int download_slices(AVCodecContext *avctx,
 /* We need the encoded data immediately */
 ff_vk_exec_wait(&fv->s, exec);
 
-/* Invalidate slice/output data if needed */
-if (!(pkt_data_buf->flags & VK_MEMORY_PROPERTY_HOST_COHERENT_BIT)) {
-VkMappedMemoryRange invalidate_data = {
-.sType = VK_STRUCTURE_TYPE_MAPPED_MEMORY_RANGE,
-.memory = pkt_data_buf->mem,
-.offset = 0,
-.size = VK_WHOLE_SIZE,
-};
-vk->InvalidateMappedMemoryRanges(fv->s.hwctx->act_dev,
- 1, &invalidate_data);
-}
-
 return 0;
 }
 
@@ -664,13 +667,9 @@ static int get_packet(AVCodecContext *avctx, 
FFVkExecContext *exec,
 VulkanEncodeFFv1Context *fv = avctx->priv_data;
 FFV1Context *f = &fv->ctx;
 FFVulkanFunctions *vk = &fv->s.vkfn;
-
-/* Packet data */
-AVBufferRef *pkt_data_ref;
-FFVkBuffer *pkt_data_buf;
-
 VulkanEncodeFFv1FrameData *fd = exec->opaque;
 
+FFVkBuffer *out_data_buf = (FFVkBuffer *)fd->out_data_ref->data;
 FF

[FFmpeg-devel] [PATCH 07/16] ffv1_common: minor RGB optimization

2025-05-14 Thread Lynne
---
 libavcodec/vulkan/ffv1_common.comp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/libavcodec/vulkan/ffv1_common.comp 
b/libavcodec/vulkan/ffv1_common.comp
index 1f222bdc42..3d40592739 100644
--- a/libavcodec/vulkan/ffv1_common.comp
+++ b/libavcodec/vulkan/ffv1_common.comp
@@ -100,17 +100,17 @@ uint slice_coord(uint width, uint sx, uint num_h_slices, 
uint chroma_shift)
 ivec2 get_pred(readonly uimage2D pred, ivec2 sp, ivec2 off,
int comp, int sw, uint8_t quant_table_idx, bool extend_lookup)
 {
-const ivec2 yoff_border1 = expectEXT(off.x == 0, false) ? ivec2(1, -1) : 
ivec2(0, 0);
+const ivec2 yoff_border1 = expectEXT(off.x == 0, false) ? off + ivec2(1, 
-1) : off;
 
 /* Thanks to the same coincidence as below, we can skip checking if off == 
0, 1 */
-VTYPE3 top  = VTYPE3(TYPE(imageLoad(pred, sp + LADDR(off + ivec2(-1, -1) + 
yoff_border1))[comp]),
+VTYPE3 top  = VTYPE3(TYPE(imageLoad(pred, sp + LADDR(yoff_border1 + 
ivec2(-1, -1)))[comp]),
  TYPE(imageLoad(pred, sp + LADDR(off + ivec2(0, 
-1)))[comp]),
  TYPE(imageLoad(pred, sp + LADDR(off + ivec2(min(1, sw 
- off.x - 1), -1)))[comp]));
 
 /* Normally, we'd need to check if off != ivec2(0, 0) here, since 
otherwise, we must
  * return zero. However, ivec2(-1,  0) + ivec2(1, -1) == ivec2(0, -1), 
e.g. previous
  * row, 0 offset, same slice, which is zero since we zero out the buffer 
for RGB */
-TYPE cur = TYPE(imageLoad(pred, sp + LADDR(off + ivec2(-1,  0) + 
yoff_border1))[comp]);
+TYPE cur = TYPE(imageLoad(pred, sp + LADDR(yoff_border1 + ivec2(-1,  
0)))[comp]);
 
 int base = quant_table[quant_table_idx][0][(cur- top[0]) & 
MAX_QUANT_TABLE_MASK] +
quant_table[quant_table_idx][1][(top[0] - top[1]) & 
MAX_QUANT_TABLE_MASK] +
-- 
2.49.0.395.g12beb8f557c


[FFmpeg-devel] [PATCH 09/16] vulkan_ffv1: fix PCM + cached symbol reader

2025-05-14 Thread Lynne
writeout_rgb requires that all subgroups are active.
---
 libavcodec/vulkan/ffv1_dec.comp | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/libavcodec/vulkan/ffv1_dec.comp b/libavcodec/vulkan/ffv1_dec.comp
index c74af4bf6a..e73b3f1dc0 100644
--- a/libavcodec/vulkan/ffv1_dec.comp
+++ b/libavcodec/vulkan/ffv1_dec.comp
@@ -56,6 +56,11 @@ int get_isymbol(inout RangeCoder c, uint state_off)
 
 void decode_line_pcm(inout SliceContext sc, ivec2 sp, int w, int y, int p, int 
bits)
 {
+#ifdef CACHED_SYMBOL_READER
+if (gl_LocalInvocationID.x > 0)
+return;
+#endif
+
 #ifndef RGB
 if (p > 0 && p < 3) {
 w >>= chroma_shift.x;
@@ -235,8 +240,6 @@ void decode_slice(inout SliceContext sc, const uint 
slice_idx)
 /* PCM coding */
 #ifndef GOLOMB
 if (sc.slice_coding_mode == 1) {
-if (gl_LocalInvocationID.x > 0)
-return;
 #ifndef RGB
 for (int p = 0; p < planes; p++) {
 int h = sc.slice_dim.y;
-- 
2.49.0.395.g12beb8f557c


[FFmpeg-devel] [PATCH 15/16] hwcontext_vulkan: correct image transfer usage flags

2025-05-14 Thread Lynne
By pure coincidence, BUFFER and IMAGE flags were equal for those
two usage types.
---
 libavutil/hwcontext_vulkan.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libavutil/hwcontext_vulkan.c b/libavutil/hwcontext_vulkan.c
index eded36bc01..9f9df91e5d 100644
--- a/libavutil/hwcontext_vulkan.c
+++ b/libavutil/hwcontext_vulkan.c
@@ -2784,8 +2784,8 @@ static int vulkan_frames_init(AVHWFramesContext *hwfc)
 
 /* Image usage flags */
 if (!hwctx->usage) {
-hwctx->usage = supported_usage & (VK_BUFFER_USAGE_TRANSFER_DST_BIT |
-  VK_BUFFER_USAGE_TRANSFER_SRC_BIT |
+hwctx->usage = supported_usage & (VK_IMAGE_USAGE_TRANSFER_DST_BIT |
+  VK_IMAGE_USAGE_TRANSFER_SRC_BIT |
   VK_IMAGE_USAGE_STORAGE_BIT   |
   VK_IMAGE_USAGE_SAMPLED_BIT);
 
-- 
2.49.0.395.g12beb8f557c


[FFmpeg-devel] [PATCH 16/16] hwcontext_vulkan: only try exporting DMABUF memory on !WIN32 and only for DMABUF tiling

2025-05-14 Thread Lynne
---
 libavutil/hwcontext_vulkan.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/libavutil/hwcontext_vulkan.c b/libavutil/hwcontext_vulkan.c
index 9f9df91e5d..4f205137eb 100644
--- a/libavutil/hwcontext_vulkan.c
+++ b/libavutil/hwcontext_vulkan.c
@@ -2643,11 +2643,12 @@ static AVBufferRef *vulkan_pool_alloc(void *opaque, 
size_t size)
 if (p->vkctx.extensions & FF_VK_EXT_EXTERNAL_FD_MEMORY)
 try_export_flags(hwfc, &eiinfo.handleTypes, &e,
  VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT);
-#endif
 
-if (p->vkctx.extensions & FF_VK_EXT_EXTERNAL_DMABUF_MEMORY)
+if (p->vkctx.extensions & FF_VK_EXT_EXTERNAL_DMABUF_MEMORY &&
+hwctx->tiling == VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT)
 try_export_flags(hwfc, &eiinfo.handleTypes, &e,
  VK_EXTERNAL_MEMORY_HANDLE_TYPE_DMA_BUF_BIT_EXT);
+#endif
 
 for (int i = 0; i < av_pix_fmt_count_planes(hwfc->sw_format); i++) {
 eminfo[i].sType   = VK_STRUCTURE_TYPE_EXPORT_MEMORY_ALLOCATE_INFO;
-- 
2.49.0.395.g12beb8f557c


[FFmpeg-devel] [PATCH] Boost FPS and performance: Optimize vertical loop for cache-friendly access [libavcodec/jpeg2000dwt.c:dwt_decode97_float]

2025-05-14 Thread Chitra Dey Sarkar via ffmpeg-devel
From d074ea81c12132e3a92211679adbe2d2cb4d5a69 Mon Sep 17 00:00:00 2001
From: ChitraDeySarkar 
Date: Wed, 14 May 2025 11:51:35 -0700
Subject: [PATCH] Boost FPS and performance: Optimize vertical loop for
 cache-friendly access [libavcodec/jpeg2000dwt.c:dwt_decode97_float]
From: chdey...@microsoft.com
X-Unsent: 1
To: ffmpeg-devel@ffmpeg.org

From earlier

Hi Michael,
Thanks so much for getting back! I'll quickly implement the first 3 comments.

For the last comment, is there a way for me to reach you by regular email so
that I can explain the proposed change in more detail?
'git send-email' was not a good way for me to give a detailed explanation of
what I was trying to achieve.
Additionally, I can add more people from my group.

Comment from Michael
-
this should be run linewise not columnwise
if you dont understand what i mean here, please say so and ill elaborate
But basically both vertical and horizontal transforms should be done with row 
based implementations
The code before loads and safes each column (which is bad)

- Yes, we would like to learn more. I am always happy to understand the
details behind what is going on here, and I appreciate your explanations.

Issue
-
for (i = mv; i < lv; i += 2, j++)
  l[i] = data[w * j + lp];
- VER_SD currently runs vertically, with j incremented in the innermost loop.
With w=4096 we access data[4096], data[8192], data[12288], ..., so every
iteration of the inner loop touches a new cache line and the next iteration
does not reuse the previous one. This causes cache thrashing.
In our profiling on the newest Surface 11 devices with ~36 MB of L2 cache, we
observed this loop to be a bottleneck costing ~4-5 FPS on these devices. We
observed this on Mac M2 and M4 devices too.

Chitra's comments

The proposed fix stores each column in a 2D array in reverse. The inner loops
become sequential, but part of the performance benefit also comes from the
small size of the 2D array.

In my profiling, these are the values observed at run time:
LV: 108, 215, 429, 857, 1714
LH: 256, 512, 1024, 2048, 4096
W:  4096

In the original code, the size of *data = 1714 * 4096 * sizeof(float) = ~26 MB.

With cache blocking, I intentionally transpose *data into a 2D array, but the
2D array is much smaller at most levels, so it fits in the CPU cache and DRAM
does not need to be accessed.
Here are the sizes of the 2D array:
LV     LH     Memory for Array2DBlock
108    256    ~0.1 MB
215    512    ~0.4 MB
429    1024   ~1.6 MB
857    2048   ~6.7 MB
1714   4096   ~26 MB
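
The figures above follow from LV * LH * sizeof(float). A minimal sketch of that
arithmetic, assuming 4-byte floats and 1 MB = 1024 * 1024 bytes (`block_bytes`
and `block_mib` are hypothetical helper names, not part of the patch):

```c
#include <stddef.h>

/* Bytes needed for an lv x lh block of floats. */
size_t block_bytes(size_t lv, size_t lh)
{
    return lv * lh * sizeof(float);
}

/* The same figure expressed in units of 1024 * 1024 bytes. */
double block_mib(size_t lv, size_t lh)
{
    return (double)block_bytes(lv, lh) / (1024.0 * 1024.0);
}
```

For example, block_bytes(108, 256) is 110592 bytes (about 0.1 MB) and
block_bytes(1714, 4096) is 28082176 bytes (about 26 MB).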

Overall logic
--
The overall logic is not affected. I do not change the contents of l[i], even
though it is now populated from the 2D array, so sr_1d97_float, which operates
on *line, should not be affected.

I have validated the CRC of the output file when transcoding the first 1500
frames of Tears of Steel with and without this change, and I am also happy to
do a demo if that is an option.

2 extra copies explanation
---
Previously, *data was malloc'd outside of the function without knowing LV and
LH, so it was sized for the largest LV*LH, which is much larger than the 2D
array.

Previously there were 3 loops accessing *data vertically (columnwise); now
there are 5 loops, I agree. But the 5 loops are cache friendly.

In the current implementation:

  1. All the loops access *data row-wise but copy to the 2D array column-wise.
  2. It might be OK to copy to the 2D array column-wise, as it is smaller (it
     fits well in the CPU cache in 4 out of 5 cases).
  3. All the inner loops are sequential, which makes prefetching easier and
     makes it easier for the compiler to apply vectorization and other
     optimizations.


I can potentially reduce the extra copies by adding a condition check that
uses the fallback path when the function is invoked with LV and LH large
enough that the extra copies are not beneficial.

Overall this has shown us a lot of improvement.
Please let us know if I can provide any more details. Thanks for reviewing our
code!
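
The condition check mentioned above could look roughly like the sketch below.
This is only an illustration: `use_blocked_path` and `BLOCK_BUDGET_BYTES` are
hypothetical names, and the 8 MiB budget is an assumed placeholder, not a
measured threshold.

```c
#include <stddef.h>

/* Assumed budget for the transposed block; the right value would have to be
 * chosen from profiling and is not taken from the patch. */
#define BLOCK_BUDGET_BYTES ((size_t)8 * 1024 * 1024)

/* Returns 1 if an lv x lh block of floats fits in the budget, 0 otherwise
 * (meaning: fall back to the original strided path). The comparison is
 * written as a division so that lv * lh cannot overflow. */
int use_blocked_path(size_t lv, size_t lh)
{
    if (lv == 0 || lh == 0)
        return 0;
    return lv <= BLOCK_BUDGET_BYTES / sizeof(float) / lh;
}
```

With this placeholder budget, the 857 x 2048 level (~6.7 MB) would take the
blocked path and the 1714 x 4096 level (~26 MB) would fall back.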

Regards
Chitra

Signed-off-by: ChitraDeySarkar 
---
 libavcodec/jpeg2000dwt.c | 26 +-
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/libavcodec/jpeg2000dwt.c b/libavcodec/jpeg2000dwt.c
index 45d7897893..4de95a3bea 100644
--- a/libavcodec/jpeg2000dwt.c
+++ b/libavcodec/jpeg2000dwt.c
@@ -409,14 +409,14 @@ static void dwt_decode97_float(DWTContext *s, float *t)
 /* position at index O of line range [0-5,w+5] cf. extend function */
 line += 5;

-/* Find the largest lv and lv to allocate a 2D Array*/
-int max_dim = 0;
+/* Find the largest lv and lh to allocate a 2D Array*/
+int max_dim_lv = 0 , max_dim_lh = 0;
 for (lev = 0; lev < s->ndeclevels; lev++) {
-if (s->linelen[lev][0]  > max_dim) max_dim = s->linelen[lev][0];
-if (s->linelen[lev][1] > max_dim) max_dim = s->linelen[lev][1];
-}
-float *array2DBlock = av_malloc(max_dim * max_dim * sizeof(float));
-int us

[FFmpeg-devel] [PATCH] cbs_apv: Fix memory leak on metadata parse failure

2025-05-14 Thread Mark Thompson
Buffers are allocated inside some metadata types, so we must ensure
that the object is visible to the free function before a parse failure.

Found by libFuzzer.
---
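
Note for readers: a minimal C sketch of the general pattern behind this fix,
using hypothetical types and names (`Item`, `List`, `parse_item`, `list_parse`,
`list_free`); it is not the CBS code. The point is that the count is published
before the parse step that allocates, so a failure part-way through still
leaves the allocated buffer reachable by the free function.

```c
#include <stdlib.h>

#define MAX_ITEMS 4

typedef struct Item {
    unsigned char *buf;
} Item;

typedef struct List {
    Item items[MAX_ITEMS];
    int  count;
} List;

/* Hypothetical parser: allocates a buffer first, then may report failure. */
static int parse_item(Item *it, int fail)
{
    it->buf = malloc(16);
    if (!it->buf)
        return -1;
    return fail ? -1 : 0;
}

/* Frees the buffers of items [0, count) and returns how many were freed. */
int list_free(List *l)
{
    int freed = 0;
    for (int i = 0; i < l->count; i++) {
        if (l->items[i].buf) {
            free(l->items[i].buf);
            l->items[i].buf = NULL;
            freed++;
        }
    }
    l->count = 0;
    return freed;
}

/* Parses up to n items; item number fail_at (if any) fails after allocating. */
int list_parse(List *l, int n, int fail_at)
{
    for (int i = 0; i < n && i < MAX_ITEMS; i++) {
        /* Publish the item before parsing so list_free() can see it. */
        l->count = i + 1;
        if (parse_item(&l->items[i], i == fail_at) < 0)
            return -1;
    }
    return 0;
}
```

If the count were updated only after a successful parse, the buffer allocated
by the failing item would sit outside [0, count) and never be freed.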
 libavcodec/cbs_apv_syntax_template.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libavcodec/cbs_apv_syntax_template.c 
b/libavcodec/cbs_apv_syntax_template.c
index ca66349141..fc8a08ff31 100644
--- a/libavcodec/cbs_apv_syntax_template.c
+++ b/libavcodec/cbs_apv_syntax_template.c
@@ -543,11 +543,11 @@ static int FUNC(metadata)(CodedBitstreamContext *ctx, 
RWContext *rw,
 return AVERROR_INVALIDDATA;
 }
 
+current->metadata_count = p + 1;
+
 CHECK(FUNC(metadata_payload)(ctx, rw, pl));
 
 metadata_bytes_left -= pl->payload_size;
-
-current->metadata_count = p + 1;
 if (metadata_bytes_left == 0)
 break;
 }
-- 
2.47.2



[FFmpeg-devel] [PATCH] doc: add htmlxref.cnf

2025-05-14 Thread James Almer
Silences warnings like

filters.texi:256: warning: no htmlxref.cnf entry found for `ffmpeg-utils'

Signed-off-by: James Almer 
---
Seen with current Gentoo and MSYS2 environments. May be something that started
with texi2info 7.2

 doc/htmlxref.cnf | 6 ++
 1 file changed, 6 insertions(+)
 create mode 100644 doc/htmlxref.cnf

diff --git a/doc/htmlxref.cnf b/doc/htmlxref.cnf
new file mode 100644
index 00..823c1cb99a
--- /dev/null
+++ b/doc/htmlxref.cnf
@@ -0,0 +1,6 @@
+ffmpeg mono ./ffmpeg.html
+ffmpeg-filters mono ./ffmpeg-filters.html
+ffmpeg-formats mono ./ffmpeg-formats.html
+ffmpeg-scaler mono ./ffmpeg-scaler.html
+ffmpeg-resampler mono ./ffmpeg-resampler.html
+ffmpeg-utils mono ./ffmpeg-utils.html
-- 
2.49.0



[FFmpeg-devel] [PATCH] avfilter/vf_libplacebo: add shader_cache_dir option

2025-05-14 Thread Niklas Haas
From: Niklas Haas 

Useful to speed up shader compilation. May significantly lower startup
times, in particular with large or complex shaders.

Sponsored-by: nxtedition
---
 doc/filters.texi|  5 +
 libavfilter/vf_libplacebo.c | 29 +
 2 files changed, 34 insertions(+)

diff --git a/doc/filters.texi b/doc/filters.texi
index 679b71f290..66cb3e6c20 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -16307,6 +16307,11 @@ as a list of @var{key}=@var{value} pairs separated by 
':'. The following example
 shows how to configure a custom filter kernel ("EWA LanczosSharp") and use it
 to double the input image resolution:
 
+@item shader_cache_dir
+If set to the path of a directory that exists, libplacebo will store and use
+cached shader objects in this directory. This cache is not cleaned up
+automatically.
+
 @example
 -vf 
"libplacebo=w=iw*2:h=ih*2:extra_opts='upscaler=custom\:upscaler_preset=ewa_lanczos\:upscaler_blur=0.9812505644269356'"
 @end example
diff --git a/libavfilter/vf_libplacebo.c b/libavfilter/vf_libplacebo.c
index 86e1f43dea..ca7d9e253a 100644
--- a/libavfilter/vf_libplacebo.c
+++ b/libavfilter/vf_libplacebo.c
@@ -195,6 +195,11 @@ typedef struct LibplaceboContext {
 int color_trc;
 AVDictionary *extra_opts;
 
+#if PL_API_VER >= 351
+pl_cache cache;
+char *shader_cache_dir;
+#endif
+
 int have_hwdevice;
 
 /* pl_render_params */
@@ -522,6 +527,21 @@ static int libplacebo_init(AVFilterContext *avctx)
 return AVERROR(ENOMEM);
 }
 
+#if PL_API_VER >= 351
+if (s->shader_cache_dir && s->shader_cache_dir[0]) {
+s->cache = pl_cache_create(pl_cache_params(
+.log  = s->log,
+.get  = pl_cache_get_dir,
+.set  = pl_cache_set_dir,
+.priv = s->shader_cache_dir,
+));
+if (!s->cache) {
+libplacebo_uninit(avctx);
+return AVERROR(ENOMEM);
+}
+}
+#endif
+
 if (s->out_format_string) {
 s->out_format = av_get_pix_fmt(s->out_format_string);
 if (s->out_format == AV_PIX_FMT_NONE) {
@@ -676,6 +696,9 @@ static int init_vulkan(AVFilterContext *avctx, const 
AVVulkanDeviceContext *hwct
 }
 
 s->gpu = s->vulkan->gpu;
+#if PL_API_VER >= 351
+pl_gpu_set_cache(s->gpu, s->cache);
+#endif
 
 /* Parse the user shaders, if requested */
 if (s->shader_bin_len)
@@ -714,6 +737,9 @@ static void libplacebo_uninit(AVFilterContext *avctx)
 av_freep(&s->inputs);
 }
 
+#if PL_API_VER >= 351
+pl_cache_destroy(&s->cache);
+#endif
 pl_options_free(&s->opts);
 pl_vulkan_destroy(&s->vulkan);
 pl_log_destroy(&s->log);
@@ -1328,6 +1354,9 @@ static const AVOption libplacebo_options[] = {
 { "fillcolor", "Background fill color", OFFSET(fillcolor), 
AV_OPT_TYPE_COLOR, {.str = "black@0"}, .flags = DYNAMIC },
 { "corner_rounding", "Corner rounding radius", OFFSET(corner_rounding), 
AV_OPT_TYPE_FLOAT, {.dbl = 0.0}, 0.0, 1.0, .flags = DYNAMIC },
 { "extra_opts", "Pass extra libplacebo-specific options using a 
:-separated list of key=value pairs", OFFSET(extra_opts), AV_OPT_TYPE_DICT, 
.flags = DYNAMIC },
+#if PL_API_VER >= 351
+{ "shader_cache_dir",  "Set shader cache directory", 
OFFSET(shader_cache_dir), AV_OPT_TYPE_STRING, {.str=""}, .flags = STATIC },
+#endif
 
 {"colorspace", "select colorspace", OFFSET(colorspace), AV_OPT_TYPE_INT, 
{.i64=-1}, -1, AVCOL_SPC_NB-1, DYNAMIC, .unit = "colorspace"},
 {"auto", "keep the same colorspace",  0, AV_OPT_TYPE_CONST, {.i64=-1}, 
 INT_MIN, INT_MAX, STATIC, .unit = "colorspace"},
-- 
2.49.0



Re: [FFmpeg-devel] [PATCH v3] libavformat/dashdec: Fix buffer overflow in segment URL resolution

2025-05-14 Thread jing yan
Hi, just a gentle ping on this patch.

Let me know if anything else is needed.


Thanks!

On Wed, Apr 16, 2025 at 14:56, the following was written:

> From: xiaohuanshu 
>
> Problem:
> The max_url_size calculation for DASH segment URLs only considered the
> base URL length, leading to buffer overflow when the segment's sourceURL
> exceeded the pre-allocated buffer. This triggered the log error:
> "DASH request for url 'invalid:truncated'".
>
> Reproduce:
> 1. A test sample "long-sourceurl-sample.mpd" (deliberately designed with a
>    long sourceURL) was uploaded to VideoLAN's repository.
> 2. Reproduce with short base path:
>    ffmpeg -i /tmp/short_path/long-sourceurl-sample.mpd
>    -> Triggers "invalid:truncated" error
> 3. With artificially lengthened base path (e.g. /aaa/../bbb/../...):
>    ffmpeg -i /long/../path/../with/../many/../segments/long-sourceurl-sample.mpd
>    -> URL resolves correctly (though HTTP fetch fails due to fake URL)
>
> Fix:
> Recalculate max_url_size by considering both base URL and sourceURL
> lengths, ensuring sufficient buffer allocation during URL concatenation.
>
> V2:
> 1. no need to determine whether initialization_val is null.
> 2. fix the incorrect variable name.
>
> V3:
> 1. change `max_url_size` scope into `Initialization` and `Media` blocks.
>
> Signed-off-by: xiaohuanshu 
> ---
>  libavformat/dashdec.c | 12 +++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/libavformat/dashdec.c b/libavformat/dashdec.c
> index c3f3d7f3f8..31a84bd184 100644
> --- a/libavformat/dashdec.c
> +++ b/libavformat/dashdec.c
> @@ -606,7 +606,6 @@ static int
> parse_manifest_segmenturlnode(AVFormatContext *s, struct representati
>  char *initialization_val = NULL;
>  char *media_val = NULL;
>  char *range_val = NULL;
> -int max_url_size = c ? c->max_url_size: MAX_URL_SIZE;
>  int err;
>
>  if (!av_strcasecmp(fragmenturl_node->name, "Initialization")) {
> @@ -620,6 +619,12 @@ static int
> parse_manifest_segmenturlnode(AVFormatContext *s, struct representati
>  xmlFree(initialization_val);
>  return AVERROR(ENOMEM);
>  }
> +int max_url_size = FFMAX(
> +c ? c->max_url_size : 0,
> +aligned(strlen(initialization_val) +
> +(rep_id_val ? strlen(rep_id_val) : 0) +
> +(rep_bandwidth_val ? strlen(rep_bandwidth_val) :
> 0)));
> +max_url_size = max_url_size ? max_url_size : MAX_URL_SIZE;
>  rep->init_section->url = get_content_url(baseurl_nodes, 4,
>   max_url_size,
>   rep_id_val,
> @@ -641,6 +646,11 @@ static int
> parse_manifest_segmenturlnode(AVFormatContext *s, struct representati
>  xmlFree(media_val);
>  return AVERROR(ENOMEM);
>  }
> +int max_url_size = FFMAX(
> +c ? c->max_url_size : 0,
> +aligned(strlen(media_val) + (rep_id_val ?
> strlen(rep_id_val) : 0) +
> +(rep_bandwidth_val ? strlen(rep_bandwidth_val) :
> 0)));
> +max_url_size = max_url_size ? max_url_size : MAX_URL_SIZE;
>  seg->url = get_content_url(baseurl_nodes, 4,
> max_url_size,
> rep_id_val,
> --
> 2.39.5 (Apple Git-154)
>
>
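
The sizing rule described in the quoted patch can be sketched on its own. This is an illustration under assumptions, not the dashdec code: `url_buffer_size` and `align_up_64` are hypothetical helpers, the 64-byte rounding is assumed, and `FALLBACK_URL_SIZE` is a stand-in for `MAX_URL_SIZE`.

```c
#include <stddef.h>
#include <string.h>

#define FALLBACK_URL_SIZE 4096  /* stand-in for MAX_URL_SIZE (assumed value) */

/* Round n up to a multiple of 64 (assumed rounding granularity). */
static size_t align_up_64(size_t n)
{
    return (n + 63) / 64 * 64;
}

/*
 * Pick a buffer size that covers both the size already recorded in the
 * context (ctx_max, 0 if unknown) and the strings that will be substituted
 * into the segment URL.
 */
static size_t url_buffer_size(size_t ctx_max, const char *source_url,
                              const char *rep_id, const char *rep_bandwidth)
{
    size_t need = strlen(source_url) +
                  (rep_id        ? strlen(rep_id)        : 0) +
                  (rep_bandwidth ? strlen(rep_bandwidth) : 0);
    size_t size = align_up_64(need);

    if (ctx_max > size)
        size = ctx_max;

    /* Fall back to a default when nothing contributed a size. */
    return size ? size : FALLBACK_URL_SIZE;
}
```

With a short base URL, `ctx_max` alone can be smaller than the sourceURL; taking the maximum of both terms is what avoids the truncated result.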


Re: [FFmpeg-devel] [PATCH v1] fftools/ffplay: Resolve input file path before processing

2025-05-14 Thread Nicolas George
Appaji (HE12025-05-14):
> Fixes ticket: https://trac.ffmpeg.org/ticket/11574
> 
> Signed-off-by: Appaji 
> ---
>  fftools/ffplay.c | 13 +++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/fftools/ffplay.c b/fftools/ffplay.c
> index 2a572fc3aa..42f0584b55 100644
> --- a/fftools/ffplay.c
> +++ b/fftools/ffplay.c
> @@ -27,6 +27,7 @@
>  #include "config_components.h"
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  
> @@ -3623,9 +3624,17 @@ static int opt_input_file(void *optctx, const char 
> *filename)
>  filename, input_filename);
>  return AVERROR(EINVAL);
>  }
> -if (!strcmp(filename, "-"))
> +
> +char resolved_path[PATH_MAX];
> +
> +if (!realpath(filename, resolved_path)) {
> +av_log(NULL, AV_LOG_FATAL, "Failed to resolve path for '%s': %s\n", 
> filename, strerror(errno));
> +return AVERROR(errno);
> +}
> +

Hi. Thanks for the patch. Did you test it with non-filename arguments,
for example http://…?

> +if (!strcmp(resolved_path, "-"))
>  filename = "fd:";

This should happen before resolution.

> -input_filename = av_strdup(filename);
> +input_filename = av_strdup(resolved_path);
>  if (!input_filename)
>  return AVERROR(ENOMEM);
>  

On the whole, I think you are going at it wrong: you are only fixing
this for ffplay, not for ffprobe, ffmpeg and other applications built on
the libraries, and resolving the path can have side effects, for example
if you do not have permission on a parent of the current working
directory.

IMO, the correct way would be to add a stat() early in the opening of
the file and test the device number. But that requires changing quite a
lot of things.

Regards,

-- 
  Nicolas George
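
The two ordering remarks in this review (handle "-" before resolution, and do not resolve non-filename arguments) can be sketched as follows. This is only an illustration, not a proposed ffplay change: `resolve_input_name` and `looks_like_url` are hypothetical helpers, and testing for "://" is an assumed, simplistic way to recognise a URL. It does not address the broader objections about other tools and side effects of path resolution.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* for realpath() on strict compilers */
#endif
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#ifndef PATH_MAX
#define PATH_MAX 4096
#endif

/* Assumed heuristic: anything containing "://" is treated as a URL. */
static int looks_like_url(const char *s)
{
    return strstr(s, "://") != NULL;
}

/*
 * Fill out (PATH_MAX bytes) with the name to open.
 * Returns 0 on success, -1 on failure.
 */
static int resolve_input_name(const char *filename, char out[PATH_MAX])
{
    /* 1. The "-" alias is checked before any path resolution. */
    if (!strcmp(filename, "-")) {
        strcpy(out, "fd:");
        return 0;
    }

    /* 2. URLs are passed through untouched; realpath() would reject them. */
    if (looks_like_url(filename)) {
        if (strlen(filename) >= PATH_MAX)
            return -1;
        strcpy(out, filename);
        return 0;
    }

    /* 3. Only plain file names are resolved. */
    return realpath(filename, out) ? 0 : -1;
}
```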


Re: [FFmpeg-devel] [PATCH 1/2] Remove libpostproc

2025-05-14 Thread Michael Niedermayer
Hi

On Wed, May 14, 2025 at 11:41:35AM +0100, Kieran Kunhya via ffmpeg-devel wrote:
> On Wed, May 14, 2025 at 11:21 AM Michael Niedermayer
>  wrote:
> >
> > Hi Andrew
> >
> > On Wed, May 14, 2025 at 05:54:54AM +0300, Andrew Randrianasulu wrote:
> > > > Wed, May 14, 2025, 03:55 Andrew Randrianasulu:
> > >
> > > >
> > > >
> > > > > Tue, May 6, 2025, 02:27 Michael Niedermayer:
> > > >
> > > >> This will be available in https://github.com/michaelni/libpostproc
> > > >> either as a separate library or a ffmpeg source plugin whatever turns
> > > >> out more convenient to maintain
> > > >>
> > > >
> > > >
> > > >
> > > > > Congratulations, you broke building cinelerra-gg with ffmpeg.git despite
> > > > > our best efforts :/
> > > >
> > > > Why all this code movement?!
> > > >
> > > > > For whom is it "simple"?
> > > >
> > >
> > >
> > > > For some reason this mail did not arrive in my inbox (spam filter ate it?)
> > >
> > > https://ffmpeg.org/pipermail/ffmpeg-devel/2025-May/343192.html
> > >
> > > =
> > >
> > > The idea of course here is to expand this to filters and other
> > > things. Which again is trivial, nothing really is needed except
> > > people simply following this style of a source plugin
> > >
> > >
> > > =
> > >
> > > > I found this concerning. Does this mean ffmpeg will be fragmented
> > > > like Python or Rust into a million pieces that users are supposed to hold together?
> 
> libpostproc never really fit in FFmpeg,

libpostproc implements part of ISO/IEC 14496-2 (MPEG-4)


> has a lot of out of date code
> and that's why it was removed.
>
> >
> > simple answer, no
> >
> > There is an increasing number of filters which do not fit into FFmpeg.
> > For a wide range of reasons. ATM these are simply inaccessible and
> > invisible to users.
> > With plugins you will be able to use filters that have ugly dependencies,
> > or cannot be in main FFmpeg for other reasons.
> > Or you can also choose not to touch them.
> >
> > If there is interest we can make releases with and without all plugins
> > (in fact I intend to include libpostproc in the next release)
> 

[...]

> We should encourage users to upstream patches.

we already do and will always continue to do that
The problem is that many things "Don't fit" in someone's view.

to quote yourself from this very same email

according to you "libpostproc never really fit in FFmpeg,"
And while libpostproc is old and out of date, this view is not
limited to "old and out of date" code. We had encoders, and modern AI
filters rejected. Some quite recently

And when things cannot be developed in the main repository
they will be developed elsewhere. That's how free software works

I always stood for innovation and for supporting a full set of features
And when that cannot be done in the main repository, sometimes for
good (technical) reasons and sometimes for bad reasons, plugins are
the answer, where every developer can continue to work on what they
are passionate about and love doing

Though this is sliding a bit off topic

Thx

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

If you think the mosad wants you dead since a long time then you are either
wrong or dead since a long time.




Re: [FFmpeg-devel] [PATCH] avformat/iamf_parse: increase PutBytes buffer when writing AAC extradata

2025-05-14 Thread Michael Niedermayer
On Tue, May 13, 2025 at 08:45:00PM -0300, James Almer wrote:
> We may write up to 43 bits, so 5 bytes is not enough.
> 
> Fixes: Assertion n>=0 && n<=32 failed at ./libavcodec/get_bits.h:406
> Fixes: 
> 398527871/clusterfuzz-testcase-minimized-ffmpeg_dem_IAMF_fuzzer-6602025714647040
> 
> Found-by: continuous fuzzing process 
> https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
> Signed-off-by: James Almer 
> ---
>  libavformat/iamf_parse.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/libavformat/iamf_parse.c b/libavformat/iamf_parse.c
> index abedfdb066..11c27ebe98 100644
> --- a/libavformat/iamf_parse.c
> +++ b/libavformat/iamf_parse.c
> @@ -285,7 +285,7 @@ static int update_extradata(AVCodecParameters *codecpar)
>  AV_WL16A(codecpar->extradata + 16, AV_RB16A(codecpar->extradata + 
> 16)); // Byte swap Output Gain
>  break;
>  case AV_CODEC_ID_AAC: {
> -uint8_t buf[5];
> +uint8_t buf[6];
>  
>  init_put_bits(&pb, buf, sizeof(buf));
>  ret = init_get_bits8(&gb, codecpar->extradata, 
> codecpar->extradata_size);
> @@ -304,6 +304,10 @@ static int update_extradata(AVCodecParameters *codecpar)
>  skip_bits(&gb, 4);
>  put_bits(&pb, 4, codecpar->ch_layout.nb_channels); // set channel 
> config
>  ret = put_bits_left(&pb);
> +while (ret >= 32) {
> +   put_bits32(&pb, get_bits_long(&gb, 32));
> +   ret -= 32;
> +}
>  put_bits(&pb, ret, get_bits_long(&gb, ret));
>  flush_put_bits(&pb);

bit_copy() from libavcodec/dvdec.c seems to do the same;
maybe this can be factored out somewhere before or after this patch

thx
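
The arithmetic behind the patch can be checked with a short sketch. It assumes nothing about the FFmpeg bit I/O helpers beyond the 32-bit limit named in the assertion message above; `bytes_for_bits` and `split_into_chunks` are hypothetical helpers for illustration.

```c
#include <stddef.h>

/* Smallest number of whole bytes that can hold the given number of bits. */
static size_t bytes_for_bits(size_t bits)
{
    return (bits + 7) / 8;
}

/*
 * Split a copy of nbits into pieces of at most 32 bits, mirroring the
 * "while (ret >= 32)" loop followed by one final partial copy.
 * Returns the number of pieces written to chunks[].
 */
static int split_into_chunks(int nbits, int chunks[], int max_chunks)
{
    int n = 0;

    while (nbits >= 32 && n < max_chunks) {
        chunks[n++] = 32;
        nbits -= 32;
    }
    if (nbits > 0 && n < max_chunks)
        chunks[n++] = nbits;

    return n;
}
```

For 43 bits, `bytes_for_bits` gives 6, which matches the change from `buf[5]` to `buf[6]`, and a 43-bit copy splits into one 32-bit piece and one 11-bit piece.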

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Into a blind darkness they enter who follow after the Ignorance,
they as if into a greater darkness enter who devote themselves
to the Knowledge alone. -- Isha Upanishad




Re: [FFmpeg-devel] [PATCH 5/5] checkasm: add vvc_sao

2025-05-14 Thread Nuo Mi
On Sat, May 10, 2025 at 8:45 PM Nuo Mi  wrote:

>
>
> On Wed, May 7, 2025 at 5:26 AM Martin Storsjö  wrote:
>
>> On Sat, 3 May 2025, Nuo Mi wrote:
>>
>> > Hi Martin, Great, it works!
>> > HEVC is included in v2.
>>
>> Thanks great, thanks for looking into it! The checkasm aspects of patches
>> 5-7/7 look good to me.
>>
> Thank you, Martin.
> I’ll merge if there are no other objections
>
Applied.

>
>> // Martin
>>
>


[FFmpeg-devel] [PATCH v1 01/23] avcodec/vvc/cabac: add 9.3.3.5 k-th order Exp - Golomb binarization process

2025-05-14 Thread toqsxw
From: Wu Jianhua 

Signed-off-by: Wu Jianhua 
---
 libavcodec/vvc/cabac.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/libavcodec/vvc/cabac.c b/libavcodec/vvc/cabac.c
index 5510144893..54055ed736 100644
--- a/libavcodec/vvc/cabac.c
+++ b/libavcodec/vvc/cabac.c
@@ -928,6 +928,27 @@ static int truncated_binary_decode(VVCLocalContext *lc, 
const int c_max)
 return v;
 }
 
+// 9.3.3.5 k-th order Exp - Golomb binarization process
+static int kth_order_egk_decode(CABACContext *c, int k)
+{
+int bit= 1;
+int value  = 0;
+int symbol = 0;
+
+while (bit) {
+bit = get_cabac_bypass(c);
+value += bit << k++;
+}
+
+if (--k) {
+for (int i = 0; i < k; i++)
+symbol = (symbol << 1) | get_cabac_bypass(c);
+value += symbol;
+}
+
+return value;
+}
+
 // 9.3.3.6 Limited k-th order Exp-Golomb binarization process
 static int limited_kth_order_egk_decode(CABACContext *c, const int k, const 
int max_pre_ext_len, const int trunc_suffix_len)
 {
-- 
2.44.0.windows.1
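
To make the control flow of the k-th order Exp-Golomb decoder easier to follow, here is a standalone sketch of the same loop structure. It is an illustration only: the CABAC bypass decoder is replaced by a hypothetical `BitReader` over a string of '0'/'1' characters, and `read_bit` and `kth_order_egk_decode_sketch` are names invented for this example.

```c
#include <stddef.h>

/* Hypothetical bit source: a NUL-terminated string of '0' and '1'. */
typedef struct BitReader {
    const char *bits;
    size_t      pos;
} BitReader;

/* Return the next bit; reads past the end yield 0 so the loops terminate. */
static int read_bit(BitReader *br)
{
    char c = br->bits[br->pos];

    if (c == '\0')
        return 0;
    br->pos++;
    return c == '1';
}

static int kth_order_egk_decode_sketch(BitReader *br, int k)
{
    int value  = 0;
    int symbol = 0;
    int bit;

    /* Prefix: every leading 1 adds 2^k and raises k; a 0 ends the prefix. */
    do {
        bit    = read_bit(br);
        value += bit << k;
        k++;
    } while (bit);

    /* The terminating 0 also incremented k, so undo that one step. */
    k--;

    /* Suffix: k bits, most significant bit first. */
    for (int i = 0; i < k; i++)
        symbol = (symbol << 1) | read_bit(br);

    return value + symbol;
}
```

With k = 0 the bit strings "0", "100" and "101" decode to 0, 1 and 2; with k = 1 the string "1011" decodes to 5.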



[FFmpeg-devel] [PATCH v1 02/23] avcodec/vvc/cabac: add 9.3.3.7 Fixed-length binarization process

2025-05-14 Thread toqsxw
From: Wu Jianhua 

Signed-off-by: Wu Jianhua 
---
 libavcodec/vvc/cabac.c | 28 ++--
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/libavcodec/vvc/cabac.c b/libavcodec/vvc/cabac.c
index 54055ed736..9290ecd90f 100644
--- a/libavcodec/vvc/cabac.c
+++ b/libavcodec/vvc/cabac.c
@@ -968,6 +968,17 @@ static int limited_kth_order_egk_decode(CABACContext *c, 
const int k, const int
 return val;
 }
 
+// 9.3.3.7 Fixed-length binarization process
+static int fixed_length_decode(CABACContext* c, const int len)
+{
+int value = 0;
+
+for (int i = 0; i < len; i++)
+value = (value << 1) | get_cabac_bypass(c);
+
+return value;
+}
+
 static av_always_inline
 void get_left_top(const VVCLocalContext *lc, uint8_t *left, uint8_t *top,
 const int x0, const int y0, const uint8_t *left_ctx, const uint8_t 
*top_ctx)
@@ -1011,11 +1022,7 @@ int ff_vvc_sao_type_idx_decode(VVCLocalContext *lc)
 
 int ff_vvc_sao_band_position_decode(VVCLocalContext *lc)
 {
-int value = get_cabac_bypass(&lc->ep->cc);
-
-for (int i = 0; i < 4; i++)
-value = (value << 1) | get_cabac_bypass(&lc->ep->cc);
-return value;
+return fixed_length_decode(&lc->ep->cc, 5);
 }
 
 int ff_vvc_sao_offset_abs_decode(VVCLocalContext *lc)
@@ -1035,9 +1042,7 @@ int ff_vvc_sao_offset_sign_decode(VVCLocalContext *lc)
 
 int ff_vvc_sao_eo_class_decode(VVCLocalContext *lc)
 {
-int ret = get_cabac_bypass(&lc->ep->cc) << 1;
-ret|= get_cabac_bypass(&lc->ep->cc);
-return ret;
+return (get_cabac_bypass(&lc->ep->cc) << 1) | 
get_cabac_bypass(&lc->ep->cc);
 }
 
 int ff_vvc_alf_ctb_flag(VVCLocalContext *lc, const int rx, const int ry, const 
int c_idx)
@@ -1479,12 +1484,7 @@ int ff_vvc_merge_idx(VVCLocalContext *lc)
 
 int ff_vvc_merge_gpm_partition_idx(VVCLocalContext *lc)
 {
-int i = 0;
-
-for (int j = 0; j < 6; j++)
-i = (i << 1) | get_cabac_bypass(&lc->ep->cc);
-
-return i;
+return fixed_length_decode(&lc->ep->cc, 6);
 }
 
 int ff_vvc_merge_gpm_idx(VVCLocalContext *lc, const int idx)
-- 
2.44.0.windows.1
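
The refactor replaces hand-written bit loops with one helper. As a sketch of why `ff_vvc_sao_band_position_decode` passes 5 (the old code read one bit and then four more), the following standalone example compares both forms. The string-based `BitReader`, `read_bit`, and the two function names are hypothetical stand-ins for the CABAC bypass decoder.

```c
#include <stddef.h>

/* Hypothetical bit source: a NUL-terminated string of '0' and '1'. */
typedef struct BitReader {
    const char *bits;
    size_t      pos;
} BitReader;

/* Return the next bit; reads past the end yield 0. */
static int read_bit(BitReader *br)
{
    char c = br->bits[br->pos];

    if (c == '\0')
        return 0;
    br->pos++;
    return c == '1';
}

/* Fixed-length binarization: len bits, most significant bit first. */
static int fixed_length_decode_sketch(BitReader *br, int len)
{
    int value = 0;

    for (int i = 0; i < len; i++)
        value = (value << 1) | read_bit(br);

    return value;
}

/* Shape of the code that was replaced: first bit, then four more. */
static int band_position_old_style(BitReader *br)
{
    int value = read_bit(br);

    for (int i = 0; i < 4; i++)
        value = (value << 1) | read_bit(br);

    return value;
}
```

Both functions consume five bits and return the same value for the same input, for example 22 for "10110".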



[FFmpeg-devel] [PATCH v1 06/23] avcodec/vvc: refact out ep_init and ep_init_wpp

2025-05-14 Thread toqsxw
From: Wu Jianhua 

Signed-off-by: Wu Jianhua 
---
 libavcodec/vvc/dec.c| 48 ++---
 libavcodec/vvc/thread.c | 12 +++
 2 files changed, 39 insertions(+), 21 deletions(-)

diff --git a/libavcodec/vvc/dec.c b/libavcodec/vvc/dec.c
index f860e116ab..09b0053703 100644
--- a/libavcodec/vvc/dec.c
+++ b/libavcodec/vvc/dec.c
@@ -506,23 +506,18 @@ static int slices_realloc(VVCFrameContext *fc)
 return 0;
 }
 
-static int ep_init_cabac_decoder(SliceContext *sc, const int index,
-const H2645NAL *nal, GetBitContext *gb, const CodedBitstreamUnit *unit)
+static int get_ep_size(const H266RawSliceHeader *rsh, GetBitContext *gb, const 
H2645NAL *nal, const int header_size, const int ep_index)
 {
-const H266RawSlice *slice = unit->content_ref;
-const H266RawSliceHeader *rsh = sc->sh.r;
-EntryPoint *ep= sc->eps + index;
 int size;
-int ret;
 
-if (index < rsh->num_entry_points) {
+if (ep_index < rsh->num_entry_points) {
 int skipped = 0;
 int64_t start =  (gb->index >> 3);
-int64_t end = start + rsh->sh_entry_point_offset_minus1[index] + 1;
-while (skipped < nal->skipped_bytes && nal->skipped_bytes_pos[skipped] 
<= start + slice->header_size) {
+int64_t end = start + rsh->sh_entry_point_offset_minus1[ep_index] + 1;
+while (skipped < nal->skipped_bytes && nal->skipped_bytes_pos[skipped] 
<= start + header_size) {
 skipped++;
 }
-while (skipped < nal->skipped_bytes && nal->skipped_bytes_pos[skipped] 
<= end + slice->header_size) {
+while (skipped < nal->skipped_bytes && nal->skipped_bytes_pos[skipped] 
<= end + header_size) {
 end--;
 skipped++;
 }
@@ -531,6 +526,13 @@ static int ep_init_cabac_decoder(SliceContext *sc, const 
int index,
 } else {
 size = get_bits_left(gb) / 8;
 }
+return size;
+}
+
+static int ep_init_cabac_decoder(EntryPoint *ep, GetBitContext *gb, const int 
size)
+{
+int ret;
+
 av_assert0(gb->buffer + get_bits_count(gb) / 8 + size <= gb->buffer_end);
 ret = ff_init_cabac_decoder (&ep->cc, gb->buffer + get_bits_count(gb) / 8, 
size);
 if (ret < 0)
@@ -539,6 +541,19 @@ static int ep_init_cabac_decoder(SliceContext *sc, const 
int index,
 return 0;
 }
 
+static int ep_init(EntryPoint *ep, const int ctu_addr, const int ctu_end, 
GetBitContext *gb, const int size)
+{
+const int ret = ep_init_cabac_decoder(ep, gb, size);
+
+if (ret < 0)
+return ret;
+
+ep->ctu_start = ctu_addr;
+ep->ctu_end   = ctu_end;
+
+return 0;
+}
+
 static int slice_init_entry_points(SliceContext *sc,
 VVCFrameContext *fc, const H2645NAL *nal, const CodedBitstreamUnit *unit)
 {
@@ -562,20 +577,19 @@ static int slice_init_entry_points(SliceContext *sc,
 return ret;
 for (int i = 0; i < sc->nb_eps; i++)
 {
-EntryPoint *ep = sc->eps + i;
+const int size= get_ep_size(sc->sh.r, &gb, nal, 
slice->header_size, i);
+const int ctu_end = (i + 1 == sc->nb_eps ? sh->num_ctus_in_curr_slice 
: sh->entry_point_start_ctu[i]);
+EntryPoint *ep= sc->eps + i;
 
-ep->ctu_start = ctu_addr;
-ep->ctu_end   = (i + 1 == sc->nb_eps ? sh->num_ctus_in_curr_slice : 
sh->entry_point_start_ctu[i]);
+ret = ep_init(ep, ctu_addr, ctu_end, &gb, size);
+if (ret < 0)
+return ret;
 
 for (int j = ep->ctu_start; j < ep->ctu_end; j++) {
 const int rs = sc->sh.ctb_addr_in_curr_slice[j];
 fc->tab.slice_idx[rs] = sc->slice_idx;
 }
 
-ret = ep_init_cabac_decoder(sc, i, nal, &gb, unit);
-if (ret < 0)
-return ret;
-
 if (i + 1 < sc->nb_eps)
 ctu_addr = sh->entry_point_start_ctu[i];
 }
diff --git a/libavcodec/vvc/thread.c b/libavcodec/vvc/thread.c
index 6194416e14..e1d64bd3d2 100644
--- a/libavcodec/vvc/thread.c
+++ b/libavcodec/vvc/thread.c
@@ -283,6 +283,12 @@ static void add_progress_listener(VVCFrame *ref, 
ProgressListener *l,
 ff_vvc_add_progress_listener(ref, (VVCProgressListener*)l);
 }
 
+static void ep_init_wpp(EntryPoint *next, const EntryPoint *ep, const VVCSPS 
*sps)
+{
+memcpy(next->cabac_state, ep->cabac_state, sizeof(next->cabac_state));
+ff_vvc_ep_init_stat_coeff(next, sps->bit_depth, 
sps->r->sps_persistent_rice_adaptation_enabled_flag);
+}
+
 static void schedule_next_parse(VVCContext *s, VVCFrameContext *fc, const 
SliceContext *sc, const VVCTask *t)
 {
 VVCFrameThread *ft = fc->ft;
@@ -292,10 +298,8 @@ static void schedule_next_parse(VVCContext *s, 
VVCFrameContext *fc, const SliceC
 if (sps->r->sps_entropy_coding_sync_enabled_flag) {
 if (t->rx == fc->ps.pps->ctb_to_col_bd[t->rx]) {
 EntryPoint *next = ep + 1;
-if (next < sc->eps + sc->nb_eps && !is_first_row(fc, t->rx, t->ry 
+ 1)) {
-memcpy(next->cabac_state, ep->cabac_state

[FFmpeg-devel] [PATCH v1 05/23] avcodec/vvc/ctu: refact out ff_vvc_channel_range

2025-05-14 Thread toqsxw
From: Wu Jianhua 

Signed-off-by: Wu Jianhua 
---
 libavcodec/vvc/ctu.c   | 16 
 libavcodec/vvc/ctu.h   |  1 +
 libavcodec/vvc/intra.c |  8 
 3 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/libavcodec/vvc/ctu.c b/libavcodec/vvc/ctu.c
index 080b740cc6..c621b6d5d6 100644
--- a/libavcodec/vvc/ctu.c
+++ b/libavcodec/vvc/ctu.c
@@ -501,13 +501,12 @@ static int skipped_transform_tree(VVCLocalContext *lc, 
int x0, int y0,int tu_wid
 SKIPPED_TRANSFORM_TREE(x0, y0 + trafo_height);
 } else {
 TransformUnit *tu= add_tu(fc, lc->cu, x0, y0, tu_width, tu_height);
-const int has_chroma = sps->r->sps_chroma_format_idc && cu->tree_type 
!= DUAL_TREE_LUMA;
-const int c_start= cu->tree_type == DUAL_TREE_CHROMA ? CB : LUMA;
-const int c_end  = has_chroma ? VVC_MAX_SAMPLE_ARRAYS : CB;
+int start, end;
 
 if (!tu)
 return AVERROR_INVALIDDATA;
-for (int i = c_start; i < c_end; i++) {
+ff_vvc_channel_range(&start, &end, cu->tree_type, 
sps->r->sps_chroma_format_idc);
+for (int i = start; i < end; i++) {
 TransformBlock *tb = add_tb(tu, lc, x0, y0, tu_width >> 
sps->hshift[i], tu_height >> sps->vshift[i], i);
 if (i != CR)
 set_tb_size(fc, tb);
@@ -2580,3 +2579,12 @@ void ff_vvc_ep_init_stat_coeff(EntryPoint *ep,
 persistent_rice_adaptation_enabled_flag ? 2 * (av_log2(bit_depth - 
10)) : 0;
 }
 }
+
+void ff_vvc_channel_range(int *start, int *end, const VVCTreeType tree_type, 
const uint8_t chroma_format_idc)
+{
+const bool has_chroma = chroma_format_idc && tree_type != DUAL_TREE_LUMA;
+const bool has_luma   = tree_type != DUAL_TREE_CHROMA;
+
+*start = has_luma   ? LUMA : CB;
+*end   = has_chroma ? VVC_MAX_SAMPLE_ARRAYS : CB;
+}
diff --git a/libavcodec/vvc/ctu.h b/libavcodec/vvc/ctu.h
index c5533c1ad0..dab6f453f1 100644
--- a/libavcodec/vvc/ctu.h
+++ b/libavcodec/vvc/ctu.h
@@ -489,5 +489,6 @@ void ff_vvc_decode_neighbour(VVCLocalContext *lc, int 
x_ctb, int y_ctb, int rx,
 void ff_vvc_ctu_free_cus(CodingUnit **cus);
 int ff_vvc_get_qPy(const VVCFrameContext *fc, int xc, int yc);
 void ff_vvc_ep_init_stat_coeff(EntryPoint *ep, int bit_depth, int 
persistent_rice_adaptation_enabled_flag);
+void ff_vvc_channel_range(int *start, int *end, VVCTreeType tree_type, uint8_t 
chroma_format_idc);
 
 #endif // AVCODEC_VVC_CTU_H
diff --git a/libavcodec/vvc/intra.c b/libavcodec/vvc/intra.c
index 41ed89c946..2e6cb8f09e 100644
--- a/libavcodec/vvc/intra.c
+++ b/libavcodec/vvc/intra.c
@@ -639,11 +639,11 @@ static void ibc_fill_vir_buf(const VVCLocalContext *lc, 
const CodingUnit *cu)
 {
 const VVCFrameContext *fc = lc->fc;
 const VVCSPS *sps = fc->ps.sps;
-const int has_chroma  = sps->r->sps_chroma_format_idc && cu->tree_type 
!= DUAL_TREE_LUMA;
-const int start   = cu->tree_type == DUAL_TREE_CHROMA;
-const int end = has_chroma ? CR : LUMA;
+int start, end;
 
-for (int c_idx = start; c_idx <= end; c_idx++) {
+ff_vvc_channel_range(&start, &end, cu->tree_type, 
sps->r->sps_chroma_format_idc);
+
+for (int c_idx = start; c_idx < end; c_idx++) {
 const int hs = sps->hshift[c_idx];
 const int vs = sps->vshift[c_idx];
 const int ps = sps->pixel_shift;
-- 
2.44.0.windows.1
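
The half-open range returned by the new helper can be tabulated with a small standalone sketch. The enum values below are assumptions made for this example (LUMA = 0, CB = 1, CR = 2, three sample arrays), and `channel_range` and `TreeType` are local stand-ins rather than the FFmpeg declarations.

```c
#include <stdbool.h>

/* Assumed component indices for this sketch. */
enum { LUMA = 0, CB = 1, CR = 2, MAX_SAMPLE_ARRAYS = 3 };

typedef enum TreeType {
    SINGLE_TREE,
    DUAL_TREE_LUMA,
    DUAL_TREE_CHROMA,
} TreeType;

/* Compute the half-open component range [start, end) for a coding unit. */
static void channel_range(int *start, int *end, TreeType tree_type,
                          int chroma_format_idc)
{
    const bool has_chroma = chroma_format_idc && tree_type != DUAL_TREE_LUMA;
    const bool has_luma   = tree_type != DUAL_TREE_CHROMA;

    *start = has_luma   ? LUMA : CB;
    *end   = has_chroma ? MAX_SAMPLE_ARRAYS : CB;
}
```

Under these assumptions a single tree with chroma covers [0, 3), a dual luma tree covers [0, 1), and a dual chroma tree covers [1, 3). Because the end is exclusive, the loop in intra.c changes from `c_idx <= end` to `c_idx < end`.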



[FFmpeg-devel] [PATCH v1 03/23] avcodec/vvc/cabac: add palette support

2025-05-14 Thread toqsxw
From: Wu Jianhua 

Signed-off-by: Wu Jianhua 
---
 libavcodec/vvc/cabac.c | 52 ++
 libavcodec/vvc/cabac.h |  9 
 2 files changed, 61 insertions(+)

diff --git a/libavcodec/vvc/cabac.c b/libavcodec/vvc/cabac.c
index 9290ecd90f..700b719b7c 100644
--- a/libavcodec/vvc/cabac.c
+++ b/libavcodec/vvc/cabac.c
@@ -1377,6 +1377,58 @@ int ff_vvc_intra_chroma_pred_mode(VVCLocalContext *lc)
 return (get_cabac_bypass(&lc->ep->cc) << 1) | 
get_cabac_bypass(&lc->ep->cc);
 }
 
+int ff_vvc_palette_predictor_run(VVCLocalContext *lc)
+{
+return kth_order_egk_decode(&lc->ep->cc, 0);
+}
+
+int ff_vvc_num_signalled_palette_entries(VVCLocalContext *lc)
+{
+return kth_order_egk_decode(&lc->ep->cc, 0);
+}
+
+int ff_vvc_new_palette_entries(VVCLocalContext *lc, const int bit_depth)
+{
+return fixed_length_decode(&lc->ep->cc, bit_depth);
+}
+
+bool ff_vvc_palette_escape_val_present_flag(VVCLocalContext *lc)
+{
+return get_cabac_bypass(&lc->ep->cc);
+}
+
+bool ff_vvc_palette_transpose_flag(VVCLocalContext *lc)
+{
+return GET_CABAC(PALETTE_TRANSPOSE_FLAG);
+}
+
+bool ff_vvc_run_copy_flag(VVCLocalContext *lc, const int prev_run_type, const 
int prev_run_position, const int cur_pos)
+{
+uint8_t run_left_lut[] = { 0, 1, 2, 3, 4 };
+uint8_t run_top_lut[] = { 5, 6, 6, 7, 7 };
+
+int bin_dist = cur_pos - prev_run_position - 1;
+uint8_t *run_lut = prev_run_type == 1 ? run_top_lut : run_left_lut;
+uint8_t ctx_inc = bin_dist <= 4 ? run_lut[bin_dist] : run_lut[4];
+
+return GET_CABAC(RUN_COPY_FLAG + ctx_inc);
+}
+
+bool ff_vvc_copy_above_palette_indices_flag(VVCLocalContext *lc)
+{
+return GET_CABAC(COPY_ABOVE_PALETTE_INDICES_FLAG);
+}
+
+int ff_vvc_palette_idx_idc(VVCLocalContext *lc, const int max_palette_index, 
const bool adjust)
+{
+return truncated_binary_decode(lc, max_palette_index - adjust);
+}
+
+int ff_vvc_palette_escape_val(VVCLocalContext *lc)
+{
+return kth_order_egk_decode(&lc->ep->cc, 5);
+}
+
 int ff_vvc_general_merge_flag(VVCLocalContext *lc)
 {
 return GET_CABAC(GENERAL_MERGE_FLAG);
diff --git a/libavcodec/vvc/cabac.h b/libavcodec/vvc/cabac.h
index e9bc98e23a..92f0163c85 100644
--- a/libavcodec/vvc/cabac.h
+++ b/libavcodec/vvc/cabac.h
@@ -81,6 +81,15 @@ int ff_vvc_intra_luma_mpm_remainder(VVCLocalContext *lc);
 int ff_vvc_cclm_mode_flag(VVCLocalContext *lc);
 int ff_vvc_cclm_mode_idx(VVCLocalContext *lc);
 int ff_vvc_intra_chroma_pred_mode(VVCLocalContext *lc);
+int ff_vvc_palette_predictor_run(VVCLocalContext *lc);
+int ff_vvc_num_signalled_palette_entries(VVCLocalContext *lc);
+int ff_vvc_new_palette_entries(VVCLocalContext *lc, int bit_dpeth);
+bool ff_vvc_palette_escape_val_present_flag(VVCLocalContext *lc);
+bool ff_vvc_palette_transpose_flag(VVCLocalContext *lc);
+bool ff_vvc_run_copy_flag(VVCLocalContext *lc, int prev_run_type, int 
prev_run_position, int cur_pos);
+bool ff_vvc_copy_above_palette_indices_flag(VVCLocalContext *lc);
+int ff_vvc_palette_idx_idc(VVCLocalContext *lc, int max_palette_index, bool 
adjust);
+int ff_vvc_palette_escape_val(VVCLocalContext *lc);
 
 //inter
 int ff_vvc_general_merge_flag(VVCLocalContext *lc);
-- 
2.44.0.windows.1
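
The context selection in `ff_vvc_run_copy_flag` is a table lookup on the distance to the previous run. A standalone sketch of just that lookup, with a hypothetical name `run_copy_ctx_inc`, is shown below. The clamp of negative distances to 0 is a defensive assumption added for this example and is not part of the patch.

```c
#include <stdint.h>

/* Return the context increment for run_copy_flag. */
static int run_copy_ctx_inc(int prev_run_type, int prev_run_position,
                            int cur_pos)
{
    static const uint8_t run_left_lut[5] = { 0, 1, 2, 3, 4 };
    static const uint8_t run_top_lut[5]  = { 5, 6, 6, 7, 7 };

    const uint8_t *lut = prev_run_type == 1 ? run_top_lut : run_left_lut;
    int bin_dist       = cur_pos - prev_run_position - 1;

    if (bin_dist < 0)   /* defensive clamp, an assumption of this sketch */
        bin_dist = 0;
    if (bin_dist > 4)   /* distances above 4 share the last table entry */
        bin_dist = 4;

    return lut[bin_dist];
}
```

For example, a previous run of type 1 at position 0 and a current position of 3 gives a distance of 2 and a context increment of 6.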



[FFmpeg-devel] [PATCH v1 09/23] avcodec/vvc/intra: add ff_vvc_palette_derive_scale

2025-05-14 Thread toqsxw
From: Wu Jianhua 

Signed-off-by: Wu Jianhua 
---
 libavcodec/vvc/intra.c | 52 ++
 libavcodec/vvc/intra.h |  1 +
 2 files changed, 33 insertions(+), 20 deletions(-)

diff --git a/libavcodec/vvc/intra.c b/libavcodec/vvc/intra.c
index 2e6cb8f09e..2e1703e234 100644
--- a/libavcodec/vvc/intra.c
+++ b/libavcodec/vvc/intra.c
@@ -336,29 +336,30 @@ static void derive_qp(const VVCLocalContext *lc, const 
TransformUnit *tu, Transf
 tb->bd_offset = (1 << tb->bd_shift) >> 1;
 }
 
+static const uint8_t rem6[63 + 8 * 6 + 1] = {
+0,  1,  2,  3,  4,  5,  0,  1,  2,  3,  4,  5,  0,  1,  2,  3,  4,  5,  0, 
 1,  2,  3,  4,  5,
+0,  1,  2,  3,  4,  5,  0,  1,  2,  3,  4,  5,  0,  1,  2,  3,  4,  5,  0, 
 1,  2,  3,  4,  5,
+0,  1,  2,  3,  4,  5,  0,  1,  2,  3,  4,  5,  0,  1,  2,  3,  4,  5,  0, 
 1,  2,  3,  4,  5,
+0,  1,  2,  3,  4,  5,  0,  1,  2,  3,  4,  5,  0,  1,  2,  3,  4,  5,  0, 
 1,  2,  3,  4,  5,
+0,  1,  2,  3,  4,  5,  0,  1,  2,  3,  4,  5,  0,  1,  2,  3,
+};
+
+static const uint8_t div6[63 + 8 * 6 + 1] = {
+0,  0,  0,  0,  0,  0,  1,  1,  1,  1,  1,  1,  2,  2,  2,  2,  2,  2,  3,  3,  3,  3,  3,  3,
+4,  4,  4,  4,  4,  4,  5,  5,  5,  5,  5,  5,  6,  6,  6,  6,  6,  6,  7,  7,  7,  7,  7,  7,
+8,  8,  8,  8,  8,  8,  9,  9,  9,  9,  9,  9, 10, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11, 11,
+   12, 12, 12, 12, 12, 12, 13, 13, 13, 13, 13, 13, 14, 14, 14, 14, 14, 14, 15, 15, 15, 15, 15, 15,
+   16, 16, 16, 16, 16, 16, 17, 17, 17, 17, 17, 17, 18, 18, 18, 18,
+};
+
+const static int level_scale[2][6] = {
+   { 40, 45, 51, 57, 64, 72 },
+   { 57, 64, 72, 80, 90, 102 }
+};
+
 //8.7.3 Scaling process for transform coefficients
 static av_always_inline int derive_scale(const TransformBlock *tb, const int sh_dep_quant_used_flag)
 {
-static const uint8_t rem6[63 + 8 * 6 + 1] = {
- 0,  1,  2,  3,  4,  5,  0,  1,  2,  3,  4,  5,  0,  1,  2,  3,  4,  5,  0,  1,  2,  3,  4,  5,
- 0,  1,  2,  3,  4,  5,  0,  1,  2,  3,  4,  5,  0,  1,  2,  3,  4,  5,  0,  1,  2,  3,  4,  5,
- 0,  1,  2,  3,  4,  5,  0,  1,  2,  3,  4,  5,  0,  1,  2,  3,  4,  5,  0,  1,  2,  3,  4,  5,
- 0,  1,  2,  3,  4,  5,  0,  1,  2,  3,  4,  5,  0,  1,  2,  3,  4,  5,  0,  1,  2,  3,  4,  5,
- 0,  1,  2,  3,  4,  5,  0,  1,  2,  3,  4,  5,  0,  1,  2,  3,
-};
-
-static const uint8_t div6[63 + 8 * 6 + 1] = {
- 0,  0,  0,  0,  0,  0,  1,  1,  1,  1,  1,  1,  2,  2,  2,  2,  2,  2,  3,  3,  3,  3,  3,  3,
- 4,  4,  4,  4,  4,  4,  5,  5,  5,  5,  5,  5,  6,  6,  6,  6,  6,  6,  7,  7,  7,  7,  7,  7,
- 8,  8,  8,  8,  8,  8,  9,  9,  9,  9,  9,  9, 10, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11, 11,
-12, 12, 12, 12, 12, 12, 13, 13, 13, 13, 13, 13, 14, 14, 14, 14, 14, 14, 15, 15, 15, 15, 15, 15,
-16, 16, 16, 16, 16, 16, 17, 17, 17, 17, 17, 17, 18, 18, 18, 18,
-};
-
-const static int level_scale[2][6] = {
-{ 40, 45, 51, 57, 64, 72 },
-{ 57, 64, 72, 80, 90, 102 }
-};
 const int addin = sh_dep_quant_used_flag && !tb->ts;
 const int qp= tb->qp + addin;
 
@@ -658,6 +659,17 @@ static void ibc_fill_vir_buf(const VVCLocalContext *lc, const CodingUnit *cu)
 }
 }
 
+int ff_vvc_palette_derive_scale(VVCLocalContext *lc, const TransformUnit *tu, TransformBlock *tb)
+{
+const VVCSPS *sps = lc->fc->ps.sps;
+const int qp_prime_ts_min  = 4 + 6 * sps->r->sps_min_qp_prime_ts;
+int qp;
+
+derive_qp(lc, tu, tb);
+qp = FFMAX(qp_prime_ts_min, tb->qp);
+return level_scale[0][rem6[qp]] << div6[qp];
+}
+
 int ff_vvc_reconstruct(VVCLocalContext *lc, const int rs, const int rx, const int ry)
 {
 const VVCFrameContext *fc   = lc->fc;
diff --git a/libavcodec/vvc/intra.h b/libavcodec/vvc/intra.h
index 8a02699135..1201c70836 100644
--- a/libavcodec/vvc/intra.h
+++ b/libavcodec/vvc/intra.h
@@ -45,5 +45,6 @@ int ff_vvc_intra_pred_angle_derive(int pred_mode);
 int ff_vvc_intra_inv_angle_derive(int pred_mode);
 int ff_vvc_wide_angle_mode_mapping(const CodingUnit *cu,
 int tb_width, int tb_height, int c_idx, int pred_mode_intra);
+int ff_vvc_palette_derive_scale(VVCLocalContext *lc, const TransformUnit *tu, TransformBlock *tb);
 
 #endif // AVCODEC_VVC_INTRA_H
-- 
2.44.0.windows.1
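
Note on the arithmetic in ff_vvc_palette_derive_scale(): rem6[] and div6[] tabulate qp % 6 and qp / 6, so the returned value can be reproduced without the tables. The sketch below is a standalone illustration and not part of the patch. palette_scale_sketch() and its plain int parameters are hypothetical stand-ins for the decoder structures, and a non-negative qp is assumed.

```c
/* Row 0 of level_scale[][] as it appears in the patch. */
static const int level_scale0[6] = { 40, 45, 51, 57, 64, 72 };

/*
 * Hypothetical helper: mirrors the clamp-then-scale step of the patch,
 * using % and / in place of the rem6[] and div6[] lookup tables.
 * Assumes tb_qp and sps_min_qp_prime_ts are non-negative.
 */
static int palette_scale_sketch(int tb_qp, int sps_min_qp_prime_ts)
{
    const int qp_prime_ts_min = 4 + 6 * sps_min_qp_prime_ts;
    const int qp = tb_qp > qp_prime_ts_min ? tb_qp : qp_prime_ts_min;

    return level_scale0[qp % 6] << (qp / 6);
}
```

With sps_min_qp_prime_ts = 0, a block QP of 0 is clamped to 4 and gives 64; a QP of 22 gives 64 << 3 = 512.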



[FFmpeg-devel] [PATCH v1 04/23] avcodec/vvc: add VVC_MAX_NUM_PALETTE_PREDICTOR_SIZE

2025-05-14 Thread toqsxw
From: Wu Jianhua 

Signed-off-by: Wu Jianhua 
---
 libavcodec/vvc.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/libavcodec/vvc.h b/libavcodec/vvc.h
index 92639779c1..5490ddb4c8 100644
--- a/libavcodec/vvc.h
+++ b/libavcodec/vvc.h
@@ -154,6 +154,9 @@ enum {
 
 // {sps, ph}_num_{ver, hor}_virtual_boundaries should in [0, 3]
 VVC_MAX_VBS = 3,
+
+// 8.4.5.3 Decoding process for palette mode - maxNumPalettePredictorSize
+VVC_MAX_NUM_PALETTE_PREDICTOR_SIZE = 63
 };
 
 #endif /* AVCODEC_VVC_H */
-- 
2.44.0.windows.1



[FFmpeg-devel] [PATCH v1 12/23] avcodec/vvc/intra: add palette coding decoder

2025-05-14 Thread toqsxw
From: Wu Jianhua 

Introduction at https://ieeexplore.ieee.org/document/9408666

passed files:
10b422_G_Sony_5.bit
10b422_H_Sony_5.bit
10b422_I_Sony_5.bit
10b422_J_Sony_5.bit
10b422_K_Sony_5.bit
10b422_L_Sony_5.bit
8b422_G_Sony_5.bit
8b422_H_Sony_5.bit
8b422_I_Sony_5.bit
8b422_J_Sony_5.bit
8b422_K_Sony_5.bit
8b422_L_Sony_5.bit
8b444_A_Kwai_2.bit
8b444_B_Kwai_2.bit
PALETTE_A_Alibaba_2.bit
PALETTE_B_Alibaba_2.bit
PALETTE_C_Alibaba_2.bit
PALETTE_D_Alibaba_2.bit
PALETTE_E_Alibaba_2.bit

Signed-off-by: Wu Jianhua 
---
 libavcodec/vvc/intra.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/libavcodec/vvc/intra.c b/libavcodec/vvc/intra.c
index 2e1703e234..7f772fa4ae 100644
--- a/libavcodec/vvc/intra.c
+++ b/libavcodec/vvc/intra.c
@@ -670,6 +670,27 @@ int ff_vvc_palette_derive_scale(VVCLocalContext *lc, const TransformUnit *tu, Tr
 return level_scale[0][rem6[qp]] << div6[qp];
 }
 
+// 8.4.5.3 Decoding process for palette mode
+static void vvc_predict_palette(VVCLocalContext *lc)
+{
+const VVCFrameContext *fc = lc->fc;
+const CodingUnit *cu  = lc->cu;
+TransformUnit *tu = cu->tus.head;
+const VVCSPS *sps = fc->ps.sps;
+const int ps  = sps->pixel_shift;
+
+for (int i = 0; i < tu->nb_tbs; i++) {
+TransformBlock *tb = &tu->tbs[i];
+const int c_idx= tb->c_idx;
+const int w= tb->tb_width;
+const int h= tb->tb_height;
+const ptrdiff_t stride = fc->frame->linesize[c_idx];
+uint8_t *dst   = POS(c_idx, cu->x0, cu->y0);
+
+av_image_copy_plane(dst, stride, (uint8_t*)tb->coeffs, w << ps, w << ps, h);
+}
+}
+
 int ff_vvc_reconstruct(VVCLocalContext *lc, const int rs, const int rx, const int ry)
 {
 const VVCFrameContext *fc   = lc->fc;
@@ -690,6 +711,8 @@ int ff_vvc_reconstruct(VVCLocalContext *lc, const int rs, const int rx, const in
 ff_vvc_predict_ciip(lc);
 else if (cu->pred_mode == MODE_IBC)
 vvc_predict_ibc(lc);
+else if (cu->pred_mode == MODE_PLT)
+vvc_predict_palette(lc);
 if (cu->coded_flag) {
 ret = reconstruct(lc);
 } else {
-- 
2.44.0.windows.1
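
Note on the copy in vvc_predict_palette(): the samples held in tb->coeffs are packed at w << ps bytes per row, while the destination plane advances by the frame linesize, so the copy has to go row by row. The sketch below shows that pattern in isolation. copy_plane_sketch() is a hypothetical stand-in written for illustration; it is not the libavutil implementation of av_image_copy_plane().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for a plane copy: copies `height` rows of
 * `bytewidth` bytes, stepping source and destination by their own strides.
 */
static void copy_plane_sketch(uint8_t *dst, ptrdiff_t dst_linesize,
                              const uint8_t *src, ptrdiff_t src_linesize,
                              size_t bytewidth, int height)
{
    for (int y = 0; y < height; y++) {
        memcpy(dst, src, bytewidth);
        dst += dst_linesize;
        src += src_linesize;
    }
}
```

In the patch the source stride and the byte width are both w << ps, which matches a tightly packed source block.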



[FFmpeg-devel] [PATCH v1 07/23] avcodec/vvc: refact, save pf and ciip_flag in ff_vvc_set_intra_mvf

2025-05-14 Thread toqsxw
From: Wu Jianhua 

Signed-off-by: Wu Jianhua 
---
 libavcodec/vvc/ctu.c |  4 ++--
 libavcodec/vvc/mvs.c | 24 
 libavcodec/vvc/mvs.h |  2 +-
 3 files changed, 23 insertions(+), 7 deletions(-)

diff --git a/libavcodec/vvc/ctu.c b/libavcodec/vvc/ctu.c
index c621b6d5d6..f77697af08 100644
--- a/libavcodec/vvc/ctu.c
+++ b/libavcodec/vvc/ctu.c
@@ -1756,7 +1756,7 @@ static void fill_dmvr_info(const VVCLocalContext *lc)
 const CodingUnit *cu  = lc->cu;
 
 if (cu->pred_mode == MODE_IBC) {
-ff_vvc_set_intra_mvf(lc, 1);
+ff_vvc_set_intra_mvf(lc, true, PF_IBC, false);
 } else {
 const VVCPPS *pps = fc->ps.pps;
 const int w   = cu->cb_width >> MIN_PU_LOG2;
@@ -1849,8 +1849,8 @@ static int hls_coding_unit(VVCLocalContext *lc, int x0, int y0, int cb_width, in
 return AVERROR_PATCHWELCOME;
 } else {
 intra_luma_pred_modes(lc);
+ff_vvc_set_intra_mvf(lc, false, PF_INTRA, cu->ciip_flag);
 }
-ff_vvc_set_intra_mvf(lc, 0);
 }
 if ((tree_type == SINGLE_TREE || tree_type == DUAL_TREE_CHROMA) && sps->r->sps_chroma_format_idc) {
 if (pred_mode_plt_flag && tree_type == DUAL_TREE_CHROMA) {
diff --git a/libavcodec/vvc/mvs.c b/libavcodec/vvc/mvs.c
index 566df158a8..8946b00b5b 100644
--- a/libavcodec/vvc/mvs.c
+++ b/libavcodec/vvc/mvs.c
@@ -144,7 +144,8 @@ static int derive_temporal_colocated_mvs(const VVCLocalContext *lc, MvField temp
 const SliceContext *sc  = lc->sc;
 RefPicList* refPicList  = sc->rpl;
 
-if (temp_col.pred_flag == PF_INTRA)
+if (temp_col.pred_flag == PF_INTRA ||
+temp_col.pred_flag == PF_IBC)
 return 0;
 
 if (sb_flag){
@@ -266,7 +267,7 @@ void ff_vvc_set_mvf(const VVCLocalContext *lc, const int x0, const int y0, const
 }
 }
 
-void ff_vvc_set_intra_mvf(const VVCLocalContext *lc, const int dmvr)
+void ff_vvc_set_intra_mvf(const VVCLocalContext *lc, const bool dmvr, const PredFlag pf, const bool ciip_flag)
 {
 const VVCFrameContext *fc   = lc->fc;
 const CodingUnit *cu= lc->cu;
@@ -277,7 +278,10 @@ void ff_vvc_set_intra_mvf(const VVCLocalContext *lc, const int dmvr)
 for (int dx = 0; dx < cu->cb_width; dx += min_pu_size) {
 const int x = cu->x0 + dx;
 const int y = cu->y0 + dy;
-TAB_MVF(x, y).pred_flag = PF_INTRA;
+MvField *mv = &TAB_MVF(x, y);
+
+mv->pred_flag = pf;
+mv->ciip_flag = ciip_flag;
 }
 }
 }
@@ -599,7 +603,19 @@ static void init_neighbour_context(NeighbourContext *ctx, const VVCLocalContext
 
 static av_always_inline PredMode pred_flag_to_mode(PredFlag pred)
 {
-return pred == PF_IBC ? MODE_IBC : (pred == PF_INTRA ? MODE_INTRA : MODE_INTER);
+static const PredMode lut[] = {
+MODE_INTRA, // PF_INTRA
+MODE_INTER, // PF_L0
+MODE_INTER, // PF_L1
+MODE_INTER, // PF_BI
+0,  // invalid
+MODE_IBC,   // PF_IBC
+0,  // invalid
+0,  // invalid
+MODE_PLT,   // PF_PLT
+};
+
+return lut[pred];
 }
 
 static int check_available(Neighbour *n, const VVCLocalContext *lc, const int check_mer)
diff --git a/libavcodec/vvc/mvs.h b/libavcodec/vvc/mvs.h
index b2242b2a4d..7150c0b8cf 100644
--- a/libavcodec/vvc/mvs.h
+++ b/libavcodec/vvc/mvs.h
@@ -43,6 +43,6 @@ void ff_vvc_update_hmvp(VVCLocalContext *lc, const MotionInfo *mi);
 int ff_vvc_no_backward_pred_flag(const VVCLocalContext *lc);
 MvField* ff_vvc_get_mvf(const VVCFrameContext *fc, const int x0, const int y0);
 void ff_vvc_set_mvf(const VVCLocalContext *lc, const int x0, const int y0, const int w, const int h, const MvField *mvf);
-void ff_vvc_set_intra_mvf(const VVCLocalContext *lc, int dmvr);
+void ff_vvc_set_intra_mvf(const VVCLocalContext *lc, bool dmvr, PredFlag pf, bool ciip_flag);
 
 #endif //AVCODEC_VVC_MVS_H
-- 
2.44.0.windows.1
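
Note on the table in pred_flag_to_mode(): the lookup replaces a chain of comparisons with a direct index, which only works if every flag value has a slot. The sketch below illustrates the idea with hypothetical names (the SK_ prefix marks everything as illustrative). The numeric flag values are inferred from the positions of the comments in the patch's table, not taken from the headers, and the bounds check is an addition for the sketch; the patch indexes the table directly.

```c
#include <stddef.h>

/* Hypothetical enums; flag values inferred from the table layout in the patch. */
typedef enum { SK_MODE_INTER, SK_MODE_INTRA, SK_MODE_IBC, SK_MODE_PLT } SkPredMode;
enum { SK_PF_INTRA = 0, SK_PF_L0 = 1, SK_PF_L1 = 2, SK_PF_BI = 3,
       SK_PF_IBC = 5, SK_PF_PLT = 8 };

static SkPredMode sk_pred_flag_to_mode(unsigned pred)
{
    static const SkPredMode lut[] = {
        SK_MODE_INTRA, /* SK_PF_INTRA */
        SK_MODE_INTER, /* SK_PF_L0 */
        SK_MODE_INTER, /* SK_PF_L1 */
        SK_MODE_INTER, /* SK_PF_BI */
        SK_MODE_INTER, /* unused slot */
        SK_MODE_IBC,   /* SK_PF_IBC */
        SK_MODE_INTER, /* unused slot */
        SK_MODE_INTER, /* unused slot */
        SK_MODE_PLT,   /* SK_PF_PLT */
    };

    /* Guard added for the sketch only: fall back for out-of-range input. */
    if (pred >= sizeof(lut) / sizeof(lut[0]))
        return SK_MODE_INTER;
    return lut[pred];
}
```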



[FFmpeg-devel] [PATCH v1 10/23] avcodec/vvc/ctu: add palette support

2025-05-14 Thread toqsxw
From: Wu Jianhua 

Signed-off-by: Wu Jianhua 
---
 libavcodec/vvc/ctu.c| 339 
 libavcodec/vvc/ctu.h|  11 ++
 libavcodec/vvc/dec.c|   3 +
 libavcodec/vvc/mvs.c|   3 +-
 libavcodec/vvc/thread.c |   1 +
 5 files changed, 327 insertions(+), 30 deletions(-)

diff --git a/libavcodec/vvc/ctu.c b/libavcodec/vvc/ctu.c
index c5df898f7b..979a27c6ad 100644
--- a/libavcodec/vvc/ctu.c
+++ b/libavcodec/vvc/ctu.c
@@ -25,6 +25,7 @@
 #include "cabac.h"
 #include "ctu.h"
 #include "inter.h"
+#include "intra.h"
 #include "mvs.h"
 
 #define PROF_TEMP_SIZE (PROF_BLOCK_SIZE) * sizeof(int16_t)
@@ -1046,13 +1047,15 @@ static PredMode pred_mode_decode(VVCLocalContext *lc,
 const H266RawSliceHeader *rsh   = lc->sc->sh.r;
 const int ch_type   = tree_type == DUAL_TREE_CHROMA ? 1 : 0;
 const int is_4x4= cu->cb_width == 4 && cu->cb_height == 4;
+const int is_128= cu->cb_width == 128 || cu->cb_height == 128;
+const int hs= sps->hshift[CHROMA];
+const int vs= sps->vshift[CHROMA];
 int pred_mode_flag;
 int pred_mode_ibc_flag;
 PredMode pred_mode;
 
 cu->skip_flag = 0;
 if (!IS_I(rsh) || sps->r->sps_ibc_enabled_flag) {
-const int is_128 = cu->cb_width == 128 || cu->cb_height == 128;
 if (tree_type != DUAL_TREE_CHROMA &&
 ((!is_4x4 && mode_type != MODE_TYPE_INTRA) ||
 (sps->r->sps_ibc_enabled_flag && !is_128))) {
@@ -1087,6 +1090,14 @@ static PredMode pred_mode_decode(VVCLocalContext *lc,
 pred_mode = MODE_INTRA;
 }
 
+if (pred_mode == MODE_INTRA && sps->r->sps_palette_enabled_flag && !is_128 && !cu->skip_flag &&
+mode_type != MODE_TYPE_INTER && ((cu->cb_width * cu->cb_height) >
+(tree_type != DUAL_TREE_CHROMA ? 16 : (16 << hs << vs))) &&
+(mode_type != MODE_TYPE_INTRA || tree_type != DUAL_TREE_CHROMA)) {
+if (ff_vvc_pred_mode_plt_flag(lc))
+pred_mode = MODE_PLT;
+}
+
 set_cb_tab(lc, fc->tab.cpm[cu->ch_type], pred_mode);
 if (tree_type == SINGLE_TREE)
 set_cb_tab(lc, fc->tab.cpm[CHROMA], pred_mode);
@@ -1755,8 +1766,8 @@ static void fill_dmvr_info(const VVCLocalContext *lc)
 const VVCFrameContext *fc = lc->fc;
 const CodingUnit *cu  = lc->cu;
 
-if (cu->pred_mode == MODE_IBC) {
-ff_vvc_set_intra_mvf(lc, true, PF_IBC, false);
+if (cu->pred_mode == MODE_IBC || cu->pred_mode == MODE_PLT) {
+ff_vvc_set_intra_mvf(lc, true, cu->pred_mode == MODE_IBC ? PF_IBC : PF_PLT, false);
 } else {
 const VVCPPS *pps = fc->ps.pps;
 const int w   = cu->cb_width >> MIN_PU_LOG2;
@@ -1805,9 +1816,291 @@ static int inter_data(VVCLocalContext *lc)
 return ret;
 }
 
+static TransformUnit* palette_add_tu(VVCLocalContext *lc, const int start, const int end, const VVCTreeType tree_type)
+{
+CodingUnit   *cu  = lc->cu;
+const VVCSPS *sps = lc->fc->ps.sps;
+TransformUnit *tu = add_tu(lc->fc, cu, cu->x0, cu->y0, cu->cb_width, cu->cb_height);
+
+if (!tu)
+return NULL;
+
+for (int c = start; c < end; c++) {
+const int w = tu->width >> sps->hshift[c];
+const int h = tu->height >> sps->vshift[c];
+TransformBlock *tb = add_tb(tu, lc, tu->x0, tu->y0, w, h, c);
+if (c != CR)
+set_tb_size(lc->fc, tb);
+}
+
+for (int i = 0; i < FF_ARRAY_ELEMS(cu->plt); i++)
+cu->plt[i].size = 0;
+
+return tu;
+}
+
+static void palette_predicted(VVCLocalContext *lc, const bool local_dual_tree, int start, int end,
+bool *predictor_reused, const int predictor_size, const int max_entries)
+{
+CodingUnit  *cu  = lc->cu;
+int nb_predicted = 0;
+
+if (local_dual_tree) {
+start = LUMA;
+end = VVC_MAX_SAMPLE_ARRAYS;
+}
+
+for (int i = 0; i < predictor_size && nb_predicted < max_entries; i++) {
+const int run = ff_vvc_palette_predictor_run(lc);
+if (run == 1)
+break;
+
+if (run > 1)
+i += run - 1;
+predictor_reused[i] = true;
+for (int c = start; c < end; c++)
+cu->plt[c].entries[nb_predicted] = lc->ep->pp[c].entries[i];
+nb_predicted++;
+}
+
+for (int c = start; c < end; c++)
+cu->plt[c].size = nb_predicted;
+}
+
+static void palette_signaled(VVCLocalContext *lc, const bool local_dual_tree,
+const int start, const int end, const int max_entries)
+{
+const VVCSPS *sps = lc->fc->ps.sps;
+CodingUnit  *cu   = lc->cu;
+const int nb_predicted= cu->plt[start].size;
+const int nb_signaled = nb_predicted < max_entries ? ff_vvc_num_signalled_palette_entries(lc) : 0;
+const int size= nb_predicted + nb_signaled;
+const bool dual_tree_luma = local_dual_tree && cu->tree_type == DUAL_TREE_LUMA;
+
+for (int c = start; c < end; c++) {
+Palette *plt 

[FFmpeg-devel] [PATCH v1 14/23] avcodec/vvc/ctu: read act_enabled_flag for adaptive color transform

2025-05-14 Thread toqsxw
From: Wu Jianhua 

Signed-off-by: Wu Jianhua 
---
 libavcodec/vvc/ctu.c | 13 +
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/libavcodec/vvc/ctu.c b/libavcodec/vvc/ctu.c
index 979a27c6ad..a83c59f27c 100644
--- a/libavcodec/vvc/ctu.c
+++ b/libavcodec/vvc/ctu.c
@@ -2150,10 +2150,9 @@ static int hls_coding_unit(VVCLocalContext *lc, int x0, int y0, int cb_width, in
 mode_type = MODE_TYPE_INTRA;
 cu->pred_mode = pred_mode_decode(lc, tree_type, mode_type);
 
-if (cu->pred_mode == MODE_INTRA && sps->r->sps_act_enabled_flag && tree_type == SINGLE_TREE) {
-avpriv_report_missing_feature(fc->log_ctx, "Adaptive Color Transform");
-return AVERROR_PATCHWELCOME;
-}
+if (cu->pred_mode == MODE_INTRA && sps->r->sps_act_enabled_flag && tree_type == SINGLE_TREE)
+cu->act_enabled_flag = ff_vvc_cu_act_enabled_flag(lc);
+
 if (cu->pred_mode == MODE_INTRA || cu->pred_mode == MODE_PLT)
 ret = intra_data(lc);
 else if (tree_type != DUAL_TREE_CHROMA) /* MODE_INTER or MODE_IBC */
@@ -2169,10 +2168,8 @@ static int hls_coding_unit(VVCLocalContext *lc, int x0, int y0, int cb_width, in
 
 if (cu->coded_flag) {
 sbt_info(lc, sps);
-if (sps->r->sps_act_enabled_flag && cu->pred_mode != MODE_INTRA && tree_type == SINGLE_TREE) {
-avpriv_report_missing_feature(fc->log_ctx, "Adaptive Color Transform");
-return AVERROR_PATCHWELCOME;
-}
+if (sps->r->sps_act_enabled_flag && cu->pred_mode != MODE_INTRA && tree_type == SINGLE_TREE)
+cu->act_enabled_flag = ff_vvc_cu_act_enabled_flag(lc);
 lc->parse.lfnst_dc_only = 1;
 lc->parse.lfnst_zero_out_sig_coeff_flag = 1;
 lc->parse.mts_dc_only = 1;
-- 
2.44.0.windows.1



[FFmpeg-devel] [PATCH v1 11/23] avcodec/vvc/filter: skip deblocking filter for palette

2025-05-14 Thread toqsxw
From: Wu Jianhua 

Signed-off-by: Wu Jianhua 
---
 libavcodec/vvc/filter.c | 29 ++---
 1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/libavcodec/vvc/filter.c b/libavcodec/vvc/filter.c
index a7f102bc64..e3886d008e 100644
--- a/libavcodec/vvc/filter.c
+++ b/libavcodec/vvc/filter.c
@@ -772,17 +772,15 @@ static int get_qp(const VVCFrameContext *fc, const uint8_t *src, const int x, co
 
 static void vvc_deblock(const VVCLocalContext *lc, int x0, int y0, const int rs, const int vertical)
 {
-VVCFrameContext *fc= lc->fc;
-const VVCSPS *sps  = fc->ps.sps;
-const int c_end= sps->r->sps_chroma_format_idc ? VVC_MAX_SAMPLE_ARRAYS : 1;
-const int ctb_size = fc->ps.sps->ctb_size_y;
-const DBParams *params = fc->tab.deblock + rs;
-int x_end  = FFMIN(x0 + ctb_size, fc->ps.pps->width);
-int y_end  = FFMIN(y0 + ctb_size, fc->ps.pps->height);
-
-//not use this yet, may needed by plt.
-const uint8_t no_p[4]  = { 0 };
-const uint8_t no_q[4]  = { 0 } ;
+VVCFrameContext *fc= lc->fc;
+const VVCSPS *sps  = fc->ps.sps;
+const int c_end= sps->r->sps_chroma_format_idc ? VVC_MAX_SAMPLE_ARRAYS : 1;
+const int ctb_size = fc->ps.sps->ctb_size_y;
+const DBParams *params = fc->tab.deblock + rs;
+int x_end  = FFMIN(x0 + ctb_size, fc->ps.pps->width);
+int y_end  = FFMIN(y0 + ctb_size, fc->ps.pps->height);
+const int log2_min_cb_size = fc->ps.sps->min_cb_log2_size_y;
+const int min_cb_width = fc->ps.pps->min_cb_width;
 
 if (!vertical) {
 FFSWAP(int, x_end, y_end);
@@ -802,6 +800,8 @@ static void vvc_deblock(const VVCLocalContext *lc, int x0, int y0, const int rs,
 const uint8_t horizontal_ctu_edge = !vertical && !(x % ctb_size);
 int32_t bs[4], beta[4], tc[4] = { 0 }, all_zero_bs = 1;
 uint8_t max_len_p[4], max_len_q[4];
+uint8_t no_p[4] = { 0 };
+uint8_t no_q[4] = { 0 };
 
 for (int i = 0; i < DEBLOCK_STEP >> (2 - vs); i++) {
 int tx = x;
@@ -818,6 +818,13 @@ static void vvc_deblock(const VVCLocalContext *lc, int x0, int y0, const int rs,
 tc[i] = TC_CALC(qp, bs[i]) ;
 max_filter_length(fc, tx, ty, c_idx, vertical, horizontal_ctu_edge, bs[i], &max_len_p[i], &max_len_q[i]);
 all_zero_bs = 0;
+
+if (sps->r->sps_palette_enabled_flag) {
+const int cu_q = (ty >> log2_min_cb_size) * min_cb_width + (tx >> log2_min_cb_size);
+const int cu_p = (ty - !vertical >> log2_min_cb_size) * min_cb_width + (tx - vertical >> log2_min_cb_size);
+no_q[i] = fc->tab.cpm[!!c_idx][cu_q] == MODE_PLT;
+no_p[i] = cu_p >= 0 && fc->tab.cpm[!!c_idx][cu_p] == MODE_PLT;
+}
 }
 }
 
-- 
2.44.0.windows.1
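
Note on the cu_q / cu_p index arithmetic: in C, binary minus binds tighter than >>, so ty - !vertical >> n parses as (ty - !vertical) >> n. The Q side is the minimum coding block covering (tx, ty); the P side is the block one sample to the left for a vertical edge or one sample above for a horizontal edge. The sketch below spells this out with hypothetical helper names. Unlike the patch, which tests cu_p >= 0 after computing the index, the sketch rejects negative coordinates before shifting; that guard is an assumption made for the illustration.

```c
/* Index of the minimum coding block that covers sample (x, y). */
static int min_cb_index_sketch(int x, int y, int log2_min_cb_size, int min_cb_width)
{
    return (y >> log2_min_cb_size) * min_cb_width + (x >> log2_min_cb_size);
}

/*
 * Hypothetical helper: index of the P-side block for an edge at (x, y).
 * Returns -1 when the P-side sample would fall outside the picture.
 */
static int p_side_index_sketch(int x, int y, int vertical,
                               int log2_min_cb_size, int min_cb_width)
{
    const int px = x - (vertical ? 1 : 0);
    const int py = y - (vertical ? 0 : 1);

    if (px < 0 || py < 0)
        return -1;
    return min_cb_index_sketch(px, py, log2_min_cb_size, min_cb_width);
}
```

For example, with 4x4 minimum blocks (log2 size 2) and 10 blocks per row, the sample (8, 4) lies in block 12; its P side is block 11 for a vertical edge and block 2 for a horizontal edge.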



[FFmpeg-devel] [PATCH v1 13/23] avcodec/vvc/cabac: add ff_vvc_cu_act_enabled_flag

2025-05-14 Thread toqsxw
From: Wu Jianhua 

Signed-off-by: Wu Jianhua 
---
 libavcodec/vvc/cabac.c | 5 +
 libavcodec/vvc/cabac.h | 1 +
 2 files changed, 6 insertions(+)

diff --git a/libavcodec/vvc/cabac.c b/libavcodec/vvc/cabac.c
index 700b719b7c..6847ce59af 100644
--- a/libavcodec/vvc/cabac.c
+++ b/libavcodec/vvc/cabac.c
@@ -1703,6 +1703,11 @@ int ff_vvc_tu_y_coded_flag(VVCLocalContext *lc)
 return lc->parse.prev_tu_cbf_y;
 }
 
+int ff_vvc_cu_act_enabled_flag(VVCLocalContext *lc)
+{
+return GET_CABAC(CU_ACT_ENABLED_FLAG);
+}
+
 int ff_vvc_cu_qp_delta_abs(VVCLocalContext *lc)
 {
 int v, i, k;
diff --git a/libavcodec/vvc/cabac.h b/libavcodec/vvc/cabac.h
index 92f0163c85..972890317e 100644
--- a/libavcodec/vvc/cabac.h
+++ b/libavcodec/vvc/cabac.h
@@ -120,6 +120,7 @@ int ff_vvc_bcw_idx(VVCLocalContext *lc, int no_backward_pred_flag);
 int ff_vvc_tu_cb_coded_flag(VVCLocalContext *lc);
 int ff_vvc_tu_cr_coded_flag(VVCLocalContext *lc, int tu_cb_coded_flag);
 int ff_vvc_tu_y_coded_flag(VVCLocalContext *lc);
+int ff_vvc_cu_act_enabled_flag(VVCLocalContext *lc);
 int ff_vvc_cu_chroma_qp_offset_flag(VVCLocalContext *lc);
 int ff_vvc_cu_chroma_qp_offset_idx(VVCLocalContext *lc);
 int ff_vvc_tu_joint_cbcr_residual_flag(VVCLocalContext *lc, int tu_cb_coded_flag, int tu_cr_coded_flag);
-- 
2.44.0.windows.1



[FFmpeg-devel] [PATCH v1 20/23] avcodec/vvc/intra: make lmcs_scale_chroma inplace

2025-05-14 Thread toqsxw
From: Wu Jianhua 

prepare for adaptive color transform

Signed-off-by: Wu Jianhua 
---
 libavcodec/vvc/dsp.h| 2 +-
 libavcodec/vvc/intra.c  | 5 ++---
 libavcodec/vvc/intra_template.c | 7 +++
 3 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/libavcodec/vvc/dsp.h b/libavcodec/vvc/dsp.h
index fa1387aadd..ae22900931 100644
--- a/libavcodec/vvc/dsp.h
+++ b/libavcodec/vvc/dsp.h
@@ -106,7 +106,7 @@ struct VVCLocalContext;
 
 typedef struct VVCIntraDSPContext {
 void (*intra_cclm_pred)(const struct VVCLocalContext *lc, int x0, int y0, int w, int h);
-void (*lmcs_scale_chroma)(struct VVCLocalContext *lc, int *dst, const int *coeff, int w, int h, int x0_cu, int y0_cu);
+void (*lmcs_scale_chroma)(struct VVCLocalContext *lc, int *coeff, int w, int h, int x0_cu, int y0_cu);
 void (*intra_pred)(const struct VVCLocalContext *lc, int x0, int y0, int w, int h, int c_idx);
 void (*pred_planar)(uint8_t *src, const uint8_t *top, const uint8_t *left, int w, int h, ptrdiff_t stride);
 void (*pred_mip)(uint8_t *src, const uint8_t *top, const uint8_t *left, int w, int h, ptrdiff_t stride,
diff --git a/libavcodec/vvc/intra.c b/libavcodec/vvc/intra.c
index 3db3347d8c..b5842a93d1 100644
--- a/libavcodec/vvc/intra.c
+++ b/libavcodec/vvc/intra.c
@@ -495,7 +495,6 @@ static void itransform(VVCLocalContext *lc, TransformUnit *tu, const int tu_idx,
 const VVCSH *sh = &lc->sc->sh;
 const CodingUnit *cu= lc->cu;
 const int ps= fc->ps.sps->pixel_shift;
-DECLARE_ALIGNED(32, int, temp)[MAX_TB_SIZE * MAX_TB_SIZE];
 
 for (int i = 0; i < tu->nb_tbs; i++) {
 TransformBlock *tb  = &tu->tbs[i];
@@ -540,10 +539,10 @@ static void itransform(VVCLocalContext *lc, TransformUnit *tu, const int tu_idx,
 fc->vvcdsp.itx.pred_residual_joint(jcbcr->coeffs, tb->coeffs, tb->tb_width, tb->tb_height, c_sign, shift);
 }
 if (chroma_scale)
-fc->vvcdsp.intra.lmcs_scale_chroma(lc, temp, coeffs, w, h, cu->x0, cu->y0);
+fc->vvcdsp.intra.lmcs_scale_chroma(lc, coeffs, w, h, cu->x0, cu->y0);
 // TODO: Address performance issue here by combining transform, lmcs_scale_chroma, and add_residual into one function.
 // Complete this task before implementing ASM code.
-fc->vvcdsp.itx.add_residual(dst, chroma_scale ? temp : coeffs, w, h, stride);
+fc->vvcdsp.itx.add_residual(dst, coeffs, w, h, stride);
 }
 }
 }
diff --git a/libavcodec/vvc/intra_template.c b/libavcodec/vvc/intra_template.c
index 440ac5b6cc..3ec6c72213 100644
--- a/libavcodec/vvc/intra_template.c
+++ b/libavcodec/vvc/intra_template.c
@@ -428,7 +428,7 @@ static int FUNC(lmcs_derive_chroma_scale)(VVCLocalContext *lc, const int x0, con
 }
 
 // 8.7.5.3 Picture reconstruction with luma dependent chroma residual scaling process for chroma samples
-static void FUNC(lmcs_scale_chroma)(VVCLocalContext *lc, int *dst, const int *coeff,
+static void FUNC(lmcs_scale_chroma)(VVCLocalContext *lc, int *coeff,
 const int width, const int height, const int x0_cu, const int y0_cu)
 {
 const int chroma_scale = FUNC(lmcs_derive_chroma_scale)(lc, x0_cu, y0_cu);
@@ -438,11 +438,10 @@ static void FUNC(lmcs_scale_chroma)(VVCLocalContext *lc, int *dst, const int *co
 const int c = av_clip_intp2(*coeff, BIT_DEPTH);
 
 if (c > 0)
-*dst = (c * chroma_scale + (1 << 10)) >> 11;
+*coeff = (c * chroma_scale + (1 << 10)) >> 11;
 else
-*dst = -((-c * chroma_scale + (1 << 10)) >> 11);
+*coeff = -((-c * chroma_scale + (1 << 10)) >> 11);
 coeff++;
-dst++;
 }
 }
 }
-- 
2.44.0.windows.1



[FFmpeg-devel] [PATCH v1 21/23] avcodec/vvc/intra: refact out lmcs_scale_chroma and add_residual

2025-05-14 Thread toqsxw
From: Wu Jianhua 

prepare for adaptive color transform

Signed-off-by: Wu Jianhua 
---
 libavcodec/vvc/intra.c | 107 -
 1 file changed, 63 insertions(+), 44 deletions(-)

diff --git a/libavcodec/vvc/intra.c b/libavcodec/vvc/intra.c
index b5842a93d1..0ea33e1e73 100644
--- a/libavcodec/vvc/intra.c
+++ b/libavcodec/vvc/intra.c
@@ -27,6 +27,10 @@
 #include "intra.h"
 #include "itx_1d.h"
 
+#define POS(c_idx, x, y)\
+&fc->frame->data[c_idx][((y) >> fc->ps.sps->vshift[c_idx]) * fc->frame->linesize[c_idx] +   \
+(((x) >> fc->ps.sps->hshift[c_idx]) << fc->ps.sps->pixel_shift)]
+
 static int is_cclm(enum IntraPredMode mode)
 {
 return mode == INTRA_LT_CCLM || mode == INTRA_L_CCLM || mode == INTRA_T_CCLM;
@@ -488,28 +492,65 @@ static void transform_bdpcm(TransformBlock *tb, const VVCLocalContext *lc, const
 tb->max_scan_x = tb->tb_width - 1;
 }
 
-static void itransform(VVCLocalContext *lc, TransformUnit *tu, const int tu_idx, const int target_ch_type)
+static void lmcs_scale_chroma(VVCLocalContext *lc, TransformUnit *tu, TransformBlock *tb, const int target_ch_type)
 {
-const VVCFrameContext *fc   = lc->fc;
-const VVCSPS *sps   = fc->ps.sps;
-const VVCSH *sh = &lc->sc->sh;
-const CodingUnit *cu= lc->cu;
-const int ps= fc->ps.sps->pixel_shift;
+const VVCFrameContext *fc = lc->fc;
+const VVCSH *sh   = &lc->sc->sh;
+const CodingUnit *cu  = lc->cu;
+const int c_idx   = tb->c_idx;
+const int ch_type = c_idx > 0;
+const int w   = tb->tb_width;
+const int h   = tb->tb_height;
+const int chroma_scale= ch_type && sh->r->sh_lmcs_used_flag && fc->ps.ph.r->ph_chroma_residual_scale_flag && (w * h > 4);
+const int has_jcbcr   = tu->joint_cbcr_residual_flag && c_idx;
+
+for (int j = 0; j < 1 + has_jcbcr; j++) {
+const bool is_jcbcr   = j > 0;
+const int jcbcr_idx   = CB + tu->coded_flag[CB];
+TransformBlock *jcbcr = &tu->tbs[jcbcr_idx - tu->tbs[0].c_idx];
+int *coeffs   = is_jcbcr ? jcbcr->coeffs : tb->coeffs;
+
+if (!j && has_jcbcr) {
+const int c_sign = 1 - 2 * fc->ps.ph.r->ph_joint_cbcr_sign_flag;
+const int shift  = tu->coded_flag[CB] ^ tu->coded_flag[CR];
+fc->vvcdsp.itx.pred_residual_joint(jcbcr->coeffs, tb->coeffs, w, h, c_sign, shift);
+}
+if (chroma_scale)
+fc->vvcdsp.intra.lmcs_scale_chroma(lc, coeffs, w, h, cu->x0, cu->y0);
+}
+}
+
+static void add_residual(const VVCLocalContext *lc, TransformUnit *tu, const int target_ch_type)
+{
+const VVCFrameContext *fc = lc->fc;
 
 for (int i = 0; i < tu->nb_tbs; i++) {
-TransformBlock *tb  = &tu->tbs[i];
-const int c_idx = tb->c_idx;
-const int ch_type   = c_idx > 0;
-
-if (ch_type == target_ch_type && tb->has_coeffs) {
-const int w = tb->tb_width;
-const int h = tb->tb_height;
-const int chroma_scale  = ch_type && sh->r->sh_lmcs_used_flag && fc->ps.ph.r->ph_chroma_residual_scale_flag && (w * h > 4);
-const ptrdiff_t stride  = fc->frame->linesize[c_idx];
-const int hs= sps->hshift[c_idx];
-const int vs= sps->vshift[c_idx];
-const int has_jcbcr = tu->joint_cbcr_residual_flag && c_idx;
+TransformBlock *tb  = tu->tbs + i;
+const int c_idx = tb->c_idx;
+const int ch_type   = c_idx > 0;
+const ptrdiff_t stride  = fc->frame->linesize[c_idx];
+const bool has_residual = tb->has_coeffs ||
+  (c_idx && tu->joint_cbcr_residual_flag);
+uint8_t *dst= POS(c_idx, tb->x0, tb->y0);
+
+if (ch_type == target_ch_type && has_residual)
+ fc->vvcdsp.itx.add_residual(dst, tb->coeffs, tb->tb_width, tb->tb_height, stride);
+}
+}
+
+static void itransform(VVCLocalContext *lc, TransformUnit *tu, const int target_ch_type)
+{
+const VVCFrameContext *fc = lc->fc;
+const CodingUnit *cu  = lc->cu;
+TransformBlock *tbs   = tu->tbs;
+
+for (int i = 0; i < tu->nb_tbs; i++) {
+TransformBlock *tb = tbs + i;
+const int c_idx= tb->c_idx;
+const int ch_type  = c_idx > 0;
+const bool do_itx  = ch_type == target_ch_type;
 
+if (tb->has_coeffs && do_itx) {
 if (cu->bdpcm_flag[tb->c_idx])
 transform_bdpcm(tb, lc, cu);
 dequant(lc, tu, tb);
@@ -519,33 +560,15 @@ static void itransform(VVCLocalContext *lc, TransformUnit *tu, const int tu_idx,
 if (cu->apply_lfnst_flag[c_idx])
 ilfnst_transform(lc, tb);
 derive_transform_type(fc, lc, tb, &trh, &trv);
-if (w > 1 && h > 1)
+if (t

[FFmpeg-devel] [PATCH v1 19/23] avcodec/vvc/intra: refact, predict jcbcr to tb->coeffs

2025-05-14 Thread toqsxw
From: Wu Jianhua 

prepare for adaptive color transform

Signed-off-by: Wu Jianhua 
---
 libavcodec/vvc/dsp.h  |  1 -
 libavcodec/vvc/dsp_template.c | 18 -
 libavcodec/vvc/intra.c| 51 ++-
 3 files changed, 20 insertions(+), 50 deletions(-)

diff --git a/libavcodec/vvc/dsp.h b/libavcodec/vvc/dsp.h
index e9ef9f5b25..fa1387aadd 100644
--- a/libavcodec/vvc/dsp.h
+++ b/libavcodec/vvc/dsp.h
@@ -122,7 +122,6 @@ typedef struct VVCIntraDSPContext {
 
 typedef struct VVCItxDSPContext {
 void (*add_residual)(uint8_t *dst, const int *res, int width, int height, ptrdiff_t stride);
-void (*add_residual_joint)(uint8_t *dst, const int *res, int width, int height, ptrdiff_t stride, int c_sign, int shift);
 void (*pred_residual_joint)(int *dst, const int *src, int width, int height, int c_sign, int shift);
 
 void (*itx[VVC_N_TX_TYPE][VVC_N_TX_SIZE])(int *coeffs, ptrdiff_t step, size_t nz);
diff --git a/libavcodec/vvc/dsp_template.c b/libavcodec/vvc/dsp_template.c
index 218a600cce..13bd8cd4a1 100644
--- a/libavcodec/vvc/dsp_template.c
+++ b/libavcodec/vvc/dsp_template.c
@@ -45,23 +45,6 @@ static void FUNC(add_residual)(uint8_t *_dst, const int *res,
 }
 }
 
-static void FUNC(add_residual_joint)(uint8_t *_dst, const int *res,
-const int w, const int h, const ptrdiff_t _stride, const int c_sign, const int shift)
-{
-pixel *dst = (pixel *)_dst;
-
-const int stride = _stride / sizeof(pixel);
-
-for (int y = 0; y < h; y++) {
-for (int x = 0; x < w; x++) {
-const int r = ((*res) * c_sign) >> shift;
-dst[x] = av_clip_pixel(dst[x] + r);
-res++;
-}
-dst += stride;
-}
-}
-
 static void FUNC(pred_residual_joint)(int *dst, const int *src, const int w, const int h,
 const int c_sign, const int shift)
 {
@@ -121,7 +104,6 @@ static void FUNC(ff_vvc_itx_dsp_init)(VVCItxDSPContext *const itx)
 VVC_ITX(TYPE, type, 32);
 
 itx->add_residual= FUNC(add_residual);
-itx->add_residual_joint  = FUNC(add_residual_joint);
 itx->pred_residual_joint = FUNC(pred_residual_joint);
 itx->transform_bdpcm = FUNC(transform_bdpcm);
 VVC_ITX(DCT2, dct2, 2)
diff --git a/libavcodec/vvc/intra.c b/libavcodec/vvc/intra.c
index 5f9bbea3d1..3db3347d8c 100644
--- a/libavcodec/vvc/intra.c
+++ b/libavcodec/vvc/intra.c
@@ -164,28 +164,6 @@ static void derive_transform_type(const VVCFrameContext *fc, const VVCLocalConte
 *trv = mts_to_trv[cu->mts_idx];
 }
 
-static void add_residual_for_joint_coding_chroma(VVCLocalContext *lc,
-const TransformUnit *tu, TransformBlock *tb, const int chroma_scale)
-{
-const VVCFrameContext *fc  = lc->fc;
-const CodingUnit *cu = lc->cu;
-const int c_sign = 1 - 2 * fc->ps.ph.r->ph_joint_cbcr_sign_flag;
-const int shift  = tu->coded_flag[1] ^ tu->coded_flag[2];
-const int c_idx  = 1 + tu->coded_flag[1];
-const ptrdiff_t stride = fc->frame->linesize[c_idx];
-const int hs = fc->ps.sps->hshift[c_idx];
-const int vs = fc->ps.sps->vshift[c_idx];
-uint8_t *dst = &fc->frame->data[c_idx][(tb->y0 >> vs) * stride +
-  ((tb->x0 >> hs) << fc->ps.sps->pixel_shift)];
-if (chroma_scale) {
-fc->vvcdsp.itx.pred_residual_joint(tb->coeffs, tb->coeffs, tb->tb_width, tb->tb_height, c_sign, shift);
-fc->vvcdsp.intra.lmcs_scale_chroma(lc, tb->coeffs, tb->coeffs, tb->tb_width, tb->tb_height, cu->x0, cu->y0);
-fc->vvcdsp.itx.add_residual(dst, tb->coeffs, tb->tb_width, tb->tb_height, stride);
-} else {
-fc->vvcdsp.itx.add_residual_joint(dst, tb->coeffs, tb->tb_width, tb->tb_height, stride, c_sign, shift);
-}
-}
-
 static int add_reconstructed_area(VVCLocalContext *lc, const int ch_type, 
const int x0, const int y0, const int w, const int h)
 {
 const VVCSPS *sps   = lc->fc->ps.sps;
@@ -531,7 +509,7 @@ static void itransform(VVCLocalContext *lc, TransformUnit 
*tu, const int tu_idx,
 const ptrdiff_t stride  = fc->frame->linesize[c_idx];
 const int hs= sps->hshift[c_idx];
 const int vs= sps->vshift[c_idx];
-uint8_t *dst= &fc->frame->data[c_idx][(tb->y0 >> vs) * 
stride + ((tb->x0 >> hs) << ps)];
+const int has_jcbcr = tu->joint_cbcr_residual_flag && c_idx;
 
 if (cu->bdpcm_flag[tb->c_idx])
 transform_bdpcm(tb, lc, cu);
@@ -548,14 +526,25 @@ static void itransform(VVCLocalContext *lc, TransformUnit 
*tu, const int tu_idx,
 itx_1d(fc, tb, trh, trv);
 }
 
-if (chroma_scale)
-fc->vvcdsp.intra.lmcs_scale_chroma(lc, temp, tb->coeffs, w, h, 
cu->x0, cu->y0);
-// TODO: Address performance issue here by combining transform, 
lmcs_scale_chroma, and add_residual into one function.
-// Complete thi

[FFmpeg-devel] [PATCH v1 18/23] avcodec/vvc/intra: fix scaling process for transform coefficients

2025-05-14 Thread toqsxw
From: Wu Jianhua 

See 8.7.3 Scaling process for transform coefficients

Signed-off-by: Wu Jianhua 
---
 libavcodec/vvc/intra.c | 22 ++++++++--------------
 1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/libavcodec/vvc/intra.c b/libavcodec/vvc/intra.c
index bdcb193077..5f9bbea3d1 100644
--- a/libavcodec/vvc/intra.c
+++ b/libavcodec/vvc/intra.c
@@ -303,21 +303,15 @@ static void scale(int *out, const int *in, const int w, const int h, const int s
 // part of 8.7.3 Scaling process for transform coefficients
 static void derive_qp(const VVCLocalContext *lc, const TransformUnit *tu, TransformBlock *tb)
 {
-    const VVCSPS *sps               = lc->fc->ps.sps;
-    const H266RawSliceHeader *rsh   = lc->sc->sh.r;
-    const CodingUnit *cu            = lc->cu;
-    int qp, qp_act_offset;
+    const VVCSPS *sps             = lc->fc->ps.sps;
+    const H266RawSliceHeader *rsh = lc->sc->sh.r;
+    const CodingUnit *cu          = lc->cu;
+    const bool is_jcbcr           = tb->c_idx && tu->joint_cbcr_residual_flag && tu->coded_flag[CB] && tu->coded_flag[CR];
+    const int idx                 = is_jcbcr ? JCBCR : tb->c_idx;
+    const int qp                  = cu->qp[idx] + (idx ? 0 : sps->qp_bd_offset);
+    const int act_offset[]        = { -5, 1, 3, 1 };
+    const int qp_act_offset       = cu->act_enabled_flag ? act_offset[idx] : 0;
 
-    if (tb->c_idx == 0) {
-        //fix me
-        qp = cu->qp[LUMA] + sps->qp_bd_offset;
-        qp_act_offset = cu->act_enabled_flag ? -5 : 0;
-    } else {
-        const int is_jcbcr = tu->joint_cbcr_residual_flag && tu->coded_flag[CB] && tu->coded_flag[CR];
-        const int idx = is_jcbcr ? JCBCR : tb->c_idx;
-        qp = cu->qp[idx];
-        qp_act_offset = cu->act_enabled_flag ? 1 : 0;
-    }
     if (tb->ts) {
         const int qp_prime_ts_min = 4 + 6 * sps->r->sps_min_qp_prime_ts;
 
-- 
2.44.0.windows.1
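
The table-driven lookup in this patch can be tried in isolation. The following is a minimal sketch, not the decoder's code: the enum values for LUMA, CB, CR and JCBCR and the helper names select_qp_idx() and qp_act_offset() are assumptions made for illustration. It shows that Cr gets an ACT offset of 3 from the table, where the removed branch used 1 for every chroma component.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed component indices; the real definitions live in the VVC decoder headers. */
enum { LUMA = 0, CB = 1, CR = 2, JCBCR = 3 };

/* Hypothetical helper: pick the QP index the same way the patched derive_qp() does. */
static int select_qp_idx(int c_idx, bool joint_cbcr_residual_flag,
                         bool coded_cb, bool coded_cr)
{
    const bool is_jcbcr = c_idx && joint_cbcr_residual_flag && coded_cb && coded_cr;
    return is_jcbcr ? JCBCR : c_idx;
}

/* Hypothetical helper: per-component ACT QP offset, using the table from the patch. */
static int qp_act_offset(int idx, bool act_enabled_flag)
{
    static const int act_offset[] = { -5, 1, 3, 1 };
    return act_enabled_flag ? act_offset[idx] : 0;
}
```

For example, qp_act_offset(CR, true) yields 3 and qp_act_offset(LUMA, true) yields -5, while any component yields 0 when ACT is disabled.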

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


[FFmpeg-devel] [PATCH v1 17/23] avcodec/vvc/dsp: add adaptive_color_transform

2025-05-14 Thread toqsxw
From: Wu Jianhua 

See 8.7.4.6 Residual modification process for blocks using colour space 
conversion

Signed-off-by: Wu Jianhua 
---
 libavcodec/vvc/dsp.h          |  2 ++
 libavcodec/vvc/dsp_template.c | 20 ++++++++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/libavcodec/vvc/dsp.h b/libavcodec/vvc/dsp.h
index 25b7755109..e9ef9f5b25 100644
--- a/libavcodec/vvc/dsp.h
+++ b/libavcodec/vvc/dsp.h
@@ -127,6 +127,8 @@ typedef struct VVCItxDSPContext {
 
 void (*itx[VVC_N_TX_TYPE][VVC_N_TX_SIZE])(int *coeffs, ptrdiff_t step, 
size_t nz);
 void (*transform_bdpcm)(int *coeffs, int width, int height, int vertical, 
int log2_transform_range);
+
+void (*adaptive_color_transform)(int *y, int *u, int *v, int width, int 
height);
 } VVCItxDSPContext;
 
 typedef struct VVCLMCSDSPContext {
diff --git a/libavcodec/vvc/dsp_template.c b/libavcodec/vvc/dsp_template.c
index c6dc6e22a7..218a600cce 100644
--- a/libavcodec/vvc/dsp_template.c
+++ b/libavcodec/vvc/dsp_template.c
@@ -91,6 +91,24 @@ static void FUNC(transform_bdpcm)(int *coeffs, const int width, const int height
     }
 }
 
+// 8.7.4.6 Residual modification process for blocks using colour space conversion
+static void FUNC(adaptive_color_transform)(int *y, int *u, int *v, const int width, const int height)
+{
+    const int size = width * height;
+    const int bits = BIT_DEPTH + 1;
+
+    for (int i = 0; i < size; i++) {
+        const int y0 = av_clip_intp2(y[i], bits);
+        const int cg = av_clip_intp2(u[i], bits);
+        const int co = av_clip_intp2(v[i], bits);
+        const int t  = y0 - (cg >> 1);
+
+        y[i] = cg + t;
+        u[i] = t - (co >> 1);
+        v[i] = co + u[i];
+    }
+}
+
+
 static void FUNC(ff_vvc_itx_dsp_init)(VVCItxDSPContext *const itx)
 {
 #define VVC_ITX(TYPE, type, s) 
 \
@@ -112,6 +130,8 @@ static void FUNC(ff_vvc_itx_dsp_init)(VVCItxDSPContext 
*const itx)
 VVC_ITX_COMMON(DCT8, dct8)
 VVC_ITX_COMMON(DST7, dst7)
 
+itx->adaptive_color_transform = FUNC(adaptive_color_transform);
+
 #undef VVC_ITX
 #undef VVC_ITX_COMMON
 }
-- 
2.44.0.windows.1
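
The loop body in adaptive_color_transform is a lifting-based inverse of the YCgCo-R transform, so it should undo the matching forward lifting steps exactly. Below is a minimal standalone sketch of that round trip. The helper names clip_intp2(), ycgco_r_forward(), ycgco_r_inverse() and roundtrip_ok() are hypothetical, the forward step is the usual YCgCo-R lifting form and is not taken from the patch, and the code assumes that >> on negative ints is an arithmetic shift, as FFmpeg itself does.

```c
#include <assert.h>

/* Hypothetical stand-in for av_clip_intp2(): clip v to [-(1 << p), (1 << p) - 1]. */
static int clip_intp2(int v, int p)
{
    const int lo = -(1 << p), hi = (1 << p) - 1;
    return v < lo ? lo : v > hi ? hi : v;
}

/* Forward lifting steps (encoder side), included only to exercise the inverse. */
static void ycgco_r_forward(int g, int b, int r, int *y, int *cg, int *co)
{
    int t;
    *co = r - b;
    t   = b + (*co >> 1);
    *cg = g - t;
    *y  = t + (*cg >> 1);
}

/* Inverse steps, using the same arithmetic as the loop body in the patch. */
static void ycgco_r_inverse(int *y, int *u, int *v, int bit_depth)
{
    const int bits = bit_depth + 1;
    const int y0 = clip_intp2(*y, bits);
    const int cg = clip_intp2(*u, bits);
    const int co = clip_intp2(*v, bits);
    const int t  = y0 - (cg >> 1);

    *y = cg + t;          /* first output component  */
    *u = t - (co >> 1);   /* second output component */
    *v = co + *u;         /* third output component  */
}

/* Returns 1 if forward followed by inverse reproduces the three inputs. */
static int roundtrip_ok(int g, int b, int r, int bit_depth)
{
    int y, cg, co;
    ycgco_r_forward(g, b, r, &y, &cg, &co);
    ycgco_r_inverse(&y, &cg, &co, bit_depth);
    return y == g && cg == b && co == r;
}
```

Calling roundtrip_ok(10, -3, 7, 8) exercises one residual triple at 8-bit depth; the clipping only matters for inputs outside the BIT_DEPTH + 1 signed range.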


[FFmpeg-devel] [PATCH v1 16/23] avcodec/vvc/dsp: update the interface of pred_residual_joint for joint cbcr residual functionality

2025-05-14 Thread toqsxw
From: Wu Jianhua 

Signed-off-by: Wu Jianhua 
---
 libavcodec/vvc/dsp.h          |  2 +-
 libavcodec/vvc/dsp_template.c | 11 ++++-------
 libavcodec/vvc/intra.c        |  2 +-
 3 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/libavcodec/vvc/dsp.h b/libavcodec/vvc/dsp.h
index fc4c3a6799..25b7755109 100644
--- a/libavcodec/vvc/dsp.h
+++ b/libavcodec/vvc/dsp.h
@@ -123,7 +123,7 @@ typedef struct VVCIntraDSPContext {
 typedef struct VVCItxDSPContext {
 void (*add_residual)(uint8_t *dst, const int *res, int width, int height, 
ptrdiff_t stride);
 void (*add_residual_joint)(uint8_t *dst, const int *res, int width, int 
height, ptrdiff_t stride, int c_sign, int shift);
-void (*pred_residual_joint)(int *buf, int width, int height, int c_sign, 
int shift);
+void (*pred_residual_joint)(int *dst, const int *src, int width, int 
height, int c_sign, int shift);
 
 void (*itx[VVC_N_TX_TYPE][VVC_N_TX_SIZE])(int *coeffs, ptrdiff_t step, 
size_t nz);
 void (*transform_bdpcm)(int *coeffs, int width, int height, int vertical, 
int log2_transform_range);
diff --git a/libavcodec/vvc/dsp_template.c b/libavcodec/vvc/dsp_template.c
index 1aa1e027bd..c6dc6e22a7 100644
--- a/libavcodec/vvc/dsp_template.c
+++ b/libavcodec/vvc/dsp_template.c
@@ -62,15 +62,12 @@ static void FUNC(add_residual_joint)(uint8_t *_dst, const int *res,
     }
 }
 
-static void FUNC(pred_residual_joint)(int *buf, const int w, const int h,
+static void FUNC(pred_residual_joint)(int *dst, const int *src, const int w, const int h,
     const int c_sign, const int shift)
 {
-    for (int y = 0; y < h; y++) {
-        for (int x = 0; x < w; x++) {
-            *buf = ((*buf) * c_sign) >> shift;
-            buf++;
-        }
-    }
+    const int size = w * h;
+    for (int i = 0; i < size; i++)
+        dst[i] = (src[i] * c_sign) >> shift;
 }
 
 static void FUNC(transform_bdpcm)(int *coeffs, const int width, const int height,
diff --git a/libavcodec/vvc/intra.c b/libavcodec/vvc/intra.c
index 7f772fa4ae..bdcb193077 100644
--- a/libavcodec/vvc/intra.c
+++ b/libavcodec/vvc/intra.c
@@ -178,7 +178,7 @@ static void 
add_residual_for_joint_coding_chroma(VVCLocalContext *lc,
 uint8_t *dst = &fc->frame->data[c_idx][(tb->y0 >> vs) * stride +
   ((tb->x0 >> hs) << 
fc->ps.sps->pixel_shift)];
 if (chroma_scale) {
-fc->vvcdsp.itx.pred_residual_joint(tb->coeffs, tb->tb_width, 
tb->tb_height, c_sign, shift);
+fc->vvcdsp.itx.pred_residual_joint(tb->coeffs, tb->coeffs, 
tb->tb_width, tb->tb_height, c_sign, shift);
 fc->vvcdsp.intra.lmcs_scale_chroma(lc, tb->coeffs, tb->coeffs, 
tb->tb_width, tb->tb_height, cu->x0, cu->y0);
 fc->vvcdsp.itx.add_residual(dst, tb->coeffs, tb->tb_width, 
tb->tb_height, stride);
 } else {
-- 
2.44.0.windows.1
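
The new dst/src interface lets the caller either work in place, as the intra.c call site above does by passing tb->coeffs twice, or write the derived residual to a separate buffer. A minimal standalone sketch of the same loop, without the FUNC() templating, assuming plain int buffers of w * h entries:

```c
#include <assert.h>

/* Same arithmetic as the patched function: scale each residual by the
 * joint CbCr sign and shift. dst may alias src for in-place operation. */
static void pred_residual_joint(int *dst, const int *src, int w, int h,
                                int c_sign, int shift)
{
    const int size = w * h;
    for (int i = 0; i < size; i++)
        dst[i] = (src[i] * c_sign) >> shift;
}
```

With c_sign = -1 and shift = 1, a source block {8, -8, 4, 0} becomes {-4, 4, -2, 0}; passing the same pointer for dst and src gives the old in-place behavior.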


[FFmpeg-devel] [PATCH v1 15/23] avcodec/vvc/ctu: fix derive_chroma_intra_pred_mode

2025-05-14 Thread toqsxw
From: Wu Jianhua 

See 8.4.3 Derivation process for chroma intra prediction mode

Signed-off-by: Wu Jianhua 
---
 libavcodec/vvc/ctu.c | 57 +++-
 1 file changed, 30 insertions(+), 27 deletions(-)

diff --git a/libavcodec/vvc/ctu.c b/libavcodec/vvc/ctu.c
index a83c59f27c..e160199580 100644
--- a/libavcodec/vvc/ctu.c
+++ b/libavcodec/vvc/ctu.c
@@ -895,7 +895,7 @@ static void derive_chroma_intra_pred_mode(VVCLocalContext 
*lc,
 enum IntraPredMode luma_intra_pred_mode = SAMPLE_CTB(fc->tab.ipm, x_cb, 
y_cb);
 
 if (cu->tree_type == SINGLE_TREE && sps->r->sps_chroma_format_idc == 
CHROMA_FORMAT_444 &&
-intra_chroma_pred_mode == 4 && intra_mip_flag) {
+(intra_chroma_pred_mode == 4 || cu->act_enabled_flag) && 
intra_mip_flag) {
 cu->mip_chroma_direct_flag = 1;
 cu->intra_pred_mode_c = luma_intra_pred_mode;
 return;
@@ -1007,34 +1007,38 @@ static void intra_luma_pred_modes(VVCLocalContext *lc)
 
 static void intra_chroma_pred_modes(VVCLocalContext *lc)
 {
-const VVCSPS *sps   = lc->fc->ps.sps;
-CodingUnit *cu  = lc->cu;
-const int hs= sps->hshift[CHROMA];
-const int vs= sps->vshift[CHROMA];
+const VVCSPS *sps  = lc->fc->ps.sps;
+CodingUnit *cu = lc->cu;
+const int hs   = sps->hshift[CHROMA];
+const int vs   = sps->vshift[CHROMA];
+int cclm_mode_flag = 0;
+int cclm_mode_idx  = 0;
+int intra_chroma_pred_mode = 0;
+
+if (!cu->act_enabled_flag) {
+cu->mip_chroma_direct_flag = 0;
+if (sps->r->sps_bdpcm_enabled_flag &&
+(cu->cb_width  >> hs) <= sps->max_ts_size &&
+(cu->cb_height >> vs) <= sps->max_ts_size) {
+cu->bdpcm_flag[CB] = cu->bdpcm_flag[CR] = 
ff_vvc_intra_bdpcm_chroma_flag(lc);
+}
+if (cu->bdpcm_flag[CHROMA]) {
+cu->intra_pred_mode_c = ff_vvc_intra_bdpcm_chroma_dir_flag(lc) ? 
INTRA_VERT : INTRA_HORZ;
+} else {
+const int cclm_enabled = get_cclm_enabled(lc, cu->x0, cu->y0);
 
-cu->mip_chroma_direct_flag = 0;
-if (sps->r->sps_bdpcm_enabled_flag &&
-(cu->cb_width  >> hs) <= sps->max_ts_size &&
-(cu->cb_height >> vs) <= sps->max_ts_size) {
-cu->bdpcm_flag[CB] = cu->bdpcm_flag[CR] = 
ff_vvc_intra_bdpcm_chroma_flag(lc);
-}
-if (cu->bdpcm_flag[CHROMA]) {
-cu->intra_pred_mode_c = ff_vvc_intra_bdpcm_chroma_dir_flag(lc) ? 
INTRA_VERT : INTRA_HORZ;
-} else {
-const int cclm_enabled = get_cclm_enabled(lc, cu->x0, cu->y0);
-int cclm_mode_flag = 0;
-int cclm_mode_idx = 0;
-int intra_chroma_pred_mode = 0;
+if (cclm_enabled)
+cclm_mode_flag = ff_vvc_cclm_mode_flag(lc);
 
-if (cclm_enabled)
-cclm_mode_flag = ff_vvc_cclm_mode_flag(lc);
+if (cclm_mode_flag)
+cclm_mode_idx = ff_vvc_cclm_mode_idx(lc);
+else
+intra_chroma_pred_mode = ff_vvc_intra_chroma_pred_mode(lc);
+}
+}
 
-if (cclm_mode_flag)
-cclm_mode_idx = ff_vvc_cclm_mode_idx(lc);
-else
-intra_chroma_pred_mode = ff_vvc_intra_chroma_pred_mode(lc);
+if (!cu->bdpcm_flag[CHROMA])
 derive_chroma_intra_pred_mode(lc, cclm_mode_flag, cclm_mode_idx, 
intra_chroma_pred_mode);
-}
 }
 
 static PredMode pred_mode_decode(VVCLocalContext *lc,
@@ -2122,8 +2126,7 @@ static int intra_data(VVCLocalContext *lc)
 if ((ret = hls_palette_coding(lc, tree_type)) < 0)
 return ret;
 } else if (!pred_mode_plt_flag) {
-if (!cu->act_enabled_flag)
-intra_chroma_pred_modes(lc);
+intra_chroma_pred_modes(lc);
 }
 }
 
-- 
2.44.0.windows.1


[FFmpeg-devel] Vulkan hevc hdr decode regression on ffmpeg master?

2025-05-14 Thread Andrew Randrianasulu
So I was experimenting with Vulkan decoding in cinelerra-gg.

After some struggle I got it to build against ffmpeg git
commit 038314bc6be2f35a82e9fba2228bcac2e4fee648.

Now I get a bunch of errors like this:

[hevc @ 0x6f7465c0] Could not find ref with POC 296
[hevc @ 0x6f7465c0] Error constructing the frame RPS.
[hevc @ 0x6f7465c0] Skipping invalid undecodable NALU: 9
[hevc @ 0x6f72ad00] Could not find ref with POC 296
[hevc @ 0x6f72ad00] Error constructing the frame RPS.
[hevc @ 0x6f72ad00] Skipping invalid undecodable NALU: 8
[hevc @ 0x6f77b3c0] Could not find ref with POC 298
[hevc @ 0x6f77b3c0] Error constructing the frame RPS.
[hevc @ 0x6f77b3c0] Skipping invalid undecodable NALU: 8
[hevc @ 0x6f789480] Could not find ref with POC 300
[hevc @ 0x6f789480] Error constructing the frame RPS.
[hevc @ 0x6f789480] Skipping invalid undecodable NALU: 1
[hevc @ 0x6f7465c0] Could not find ref with POC 300
[hevc @ 0x6f7465c0] Error constructing the frame RPS.
[hevc @ 0x6f7465c0] Skipping invalid undecodable NALU: 3
[hevc @ 0x6f72ad00] Could not find ref with POC 300
[hevc @ 0x6f72ad00] Error constructing the frame RPS.
[hevc @ 0x6f72ad00] Skipping invalid undecodable NALU: 2
[hevc @ 0x6f77b3c0] Could not find ref with POC 302
[hevc @ 0x6f77b3c0] Error constructing the frame RPS.
[hevc @ 0x6f77b3c0] Skipping invalid undecodable NALU: 2
[hevc @ 0x6f789480] Could not find ref with POC 304
[hevc @ 0x6f789480] Error constructing the frame RPS.
[hevc @ 0x6f789480] Skipping invalid undecodable NALU: 1
[hevc @ 0x6f7465c0] Could not find ref with POC 304
[hevc @ 0x6f7465c0] Error constructing the frame RPS.
[hevc @ 0x6f7465c0] Skipping invalid undecodable NALU: 3
[hevc @ 0x6f72ad00] Could not find ref with POC 304
[hevc @ 0x6f72ad00] Error constructing the frame RPS.
[hevc @ 0x6f72ad00] Skipping invalid undecodable NALU: 2
[hevc @ 0x6f77b3c0] Could not find ref with POC 306
[hevc @ 0x6f77b3c0] Error constructing the frame RPS.
[hevc @ 0x6f77b3c0] Skipping invalid undecodable NALU: 2
[hevc @ 0x6f789480] Could not find ref with POC 308
[hevc @ 0x6f789480] Error constructing the frame RPS.
[hevc @ 0x6f789480] Skipping invalid undecodable NALU: 1
[hevc @ 0x6f7465c0] Could not find ref with POC 308
[hevc @ 0x6f7465c0] Error constructing the frame RPS.
[hevc @ 0x6f7465c0] Skipping invalid undecodable NALU: 3
[hevc @ 0x6f72ad00] Could not find ref with POC 308
[hevc @ 0x6f72ad00] Error constructing the frame RPS.
[hevc @ 0x6f72ad00] Skipping invalid undecodable NALU: 2
[hevc @ 0x6f77b3c0] Could not find ref with POC 310
[hevc @ 0x6f77b3c0] Error constructing the frame RPS.
[hevc @ 0x6f77b3c0] Skipping invalid undecodable NALU: 2
[hevc @ 0x6f789480] Could not find ref with POC 312
[hevc @ 0x6f789480] Error constructing the frame RPS.
[hevc @ 0x6f789480] Skipping invalid undecodable NALU: 1
[hevc @ 0x6f7465c0] Could not find ref with POC 312
[hevc @ 0x6f7465c0] Error constructing the frame RPS.
[hevc @ 0x6f7465c0] Skipping invalid undecodable NALU: 3
[hevc @ 0x6f72ad00] Could not find ref with POC 312
[hevc @ 0x6f72ad00] Error constructing the frame RPS.
[hevc @ 0x6f72ad00] Skipping invalid undecodable NALU: 2
[hevc @ 0x6f77b3c0] Could not find ref with POC 314
[hevc @ 0x6f77b3c0] Error constructing the frame RPS.
[hevc @ 0x6f77b3c0] Skipping invalid undecodable NALU: 2
[hevc @ 0x6f789480] Could not find ref with POC 316
[hevc @ 0x6f789480] Error constructing the frame RPS.
[hevc @ 0x6f789480] Skipping invalid undecodable NALU: 1
[hevc @ 0x6f7465c0] Could not find ref with POC 316
[hevc @ 0x6f7465c0] Error constructing the frame RPS.
[hevc @ 0x6f7465c0] Skipping invalid undecodable NALU: 3
[hevc @ 0x6f72ad00] Could not find ref with POC 316
[hevc @ 0x6f72ad00] Error constructing the frame RPS.
[hevc @ 0x6f72ad00] Skipping invalid undecodable NALU: 2
[hevc @ 0x6f77b3c0] Could not find ref with POC 318
[hevc @ 0x6f77b3c0] Error constructing the frame RPS.
[hevc @ 0x6f77b3c0] Skipping invalid undecodable NALU: 2
[hevc @ 0x6f789480] Could not find ref with POC 320
[hevc @ 0x6f789480] Error constructing the frame RPS.
[hevc @ 0x6f789480] Skipping invalid undecodable NALU: 1
[hevc @ 0x6f7465c0] Could not find ref with POC 320
[hevc @ 0x6f7465c0] Error constructing the frame RPS.
[hevc @ 0x6f7465c0] Skipping invalid undecodable NALU: 3
[hevc @ 0x6f72ad00] Could not find ref with POC 320
[hevc @ 0x6f72ad00] Error constructing the frame RPS.
[hevc @ 0x6f72ad00] Skipping invalid undecodable NALU: 2
[hevc @ 0x6f77b3c0] Could not find ref with POC 322
[hevc @ 0x6f77b3c0] Error constructing the frame RPS.
[hevc @ 0x6f77b3c0] Skipping invalid undecodable NALU: 2
[hevc @ 0x6f789480] Could not find ref with POC 324
[hevc @ 0x6f789480] Error constructing the frame RPS.
[hevc @ 0x6f789480] Skipping invalid undecodable NALU: 1
[hevc @ 0x6f7465c0] Could not find ref with POC 324
[hevc @ 0x6f7465c0] Error constructing the frame RPS.
[hevc @ 0x6f7465c0] Skipping invalid undecodable NALU: 3
[hevc @ 0x6f72ad00] Could not find ref with POC 324
[hevc @ 0x6f72

[FFmpeg-devel] [PATCH v1 23/23] Changelog: VVC supports all content of SCC

2025-05-14 Thread toqsxw
From: Wu Jianhua 

Signed-off-by: Wu Jianhua 
---
 Changelog | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/Changelog b/Changelog
index a09dcd82c2..4f47b30038 100644
--- a/Changelog
+++ b/Changelog
@@ -11,6 +11,8 @@ version :
 - Enhanced FLV v2: Multitrack audio/video, modern codec support
 - Animated JPEG XL encoding (via libjxl)
 - VVC in Matroska
+- VVC decoder supports all content of SCC (Screen Content Coding):
+  IBC (Inter Block Copy), Palette Mode and ACT (Adaptive Color Transform)
 
 version 7.1:
 - Raw Captions with Time (RCWT) closed caption demuxer
@@ -53,7 +55,6 @@ version 7.1:
   constrains
 - FFV1 parser
 
-
 version 7.0:
 - DXV DXT1 encoder
 - LEAD MCMP decoder
-- 
2.44.0.windows.1


Re: [FFmpeg-devel] [PATCH 1/2] Remove libpostproc

2025-05-14 Thread Kieran Kunhya via ffmpeg-devel
On Wed, May 14, 2025 at 8:26 AM Michael Niedermayer
 wrote:
>
> Hi
>
> On Wed, May 14, 2025 at 11:41:35AM +0100, Kieran Kunhya via ffmpeg-devel 
> wrote:
> > On Wed, May 14, 2025 at 11:21 AM Michael Niedermayer
> >  wrote:
> > >
> > > Hi Andrew
> > >
> > > On Wed, May 14, 2025 at 05:54:54AM +0300, Andrew Randrianasulu wrote:
> > > > Wed, 14 May 2025, 03:55 Andrew Randrianasulu:
> > > >
> > > > > >
> > > > > >
> > > > > Tue, 6 May 2025, 02:27 Michael Niedermayer:
> > > > >
> > > > >> This will be available in https://github.com/michaelni/libpostproc
> > > > >> either as a separate library or a ffmpeg source plugin whatever turns
> > > > >> out more convenient to maintain
> > > > >>
> > > > >
> > > > >
> > > > >
> > > > > > Congratulations, you broke building cinelerra-gg with ffmpeg.git
> > > > > > despite our best efforts :/
> > > > > >
> > > > > > Why all this code movement?!
> > > > > >
> > > > > > For whom is it "simple"?
> > > > > >
> > > > >
> > > > >
> > > > > For some reason this mail did not arrive in my inbox (spam filter
> > > > > ate it?)
> > > >
> > > > https://ffmpeg.org/pipermail/ffmpeg-devel/2025-May/343192.html
> > > >
> > > > =
> > > >
> > > > The idea of course here is to expand this to filters and other
> > > > things. Which again is trivial, nothing really is needed except
> > > > people simply following this style of a source plugin
> > > >
> > > >
> > > > =
> > > >
> > > > > I find this concerning: does this mean ffmpeg will be fragmented,
> > > > > like Python or Rust, into a million pieces that users are supposed
> > > > > to hold together?
> >
> > libpostproc never really fit in FFmpeg,
>
> libpostproc implements part of ISO/IEC 14496-2 (MPEG-4)

Lots of libraries implement parts of MPEG code.

On a technical level it does not match the code quality of FFmpeg.
Furthermore, it was something forced into the project from MPlayer.

Kieran


[FFmpeg-devel] Reply: [PATCH v1 01/23] avcodec/vvc/cabac: add 9.3.3.5 k-th order Exp - Golomb binarization process

2025-05-14 Thread Wu Jianhua
Wu Jianhua:
> From: Wu Jianhua 
>
> Signed-off-by: Wu Jianhua 
> ---
>  libavcodec/vvc/cabac.c | 21 +
>  1 file changed, 21 insertions(+)
>
> diff --git a/libavcodec/vvc/cabac.c b/libavcodec/vvc/cabac.c
> index 5510144893..54055ed736 100644
> --- a/libavcodec/vvc/cabac.c
> +++ b/libavcodec/vvc/cabac.c
> @@ -928,6 +928,27 @@ static int truncated_binary_decode(VVCLocalContext *lc, 
> const int c_max)
>  return v;
>  }

The previous patchset didn't appear on Patchwork, so I am resending it for the CI build.

Thanks,
Jianhua



Re: [FFmpeg-devel] Vulkan hevc hdr decode regression on ffmpeg master?

2025-05-14 Thread Andrew Randrianasulu
On Wed, May 14, 2025 at 1:31 PM Andrew Randrianasulu
 wrote:
>
> So I was experimenting with Vulkan decoding in cinelerra-gg.
>
> After some struggle I got it to build against ffmpeg git
> commit 038314bc6be2f35a82e9fba2228bcac2e4fee648.
>
> Now I get a bunch of errors like this:
>
> [hevc @ 0x6f7465c0] Could not find ref with POC 296
> [hevc @ 0x6f7465c0] Error constructing the frame RPS.
> [hevc @ 0x6f7465c0] Skipping invalid undecodable NALU: 9
> [...]

[FFmpeg-devel] [PATCH v1 22/23] avcodec/vvc: add adaptive color transform support

2025-05-14 Thread toqsxw
From: Wu Jianhua 

passed files:
ACT_A_Kwai_3.bit
ACT_B_Kwai_3.bit

Signed-off-by: Wu Jianhua 
---
 libavcodec/vvc/ctu.c   |  2 ++
 libavcodec/vvc/intra.c | 13 +++++++++++--
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/libavcodec/vvc/ctu.c b/libavcodec/vvc/ctu.c
index e160199580..62c9d4f5c0 100644
--- a/libavcodec/vvc/ctu.c
+++ b/libavcodec/vvc/ctu.c
@@ -392,6 +392,8 @@ static int hls_transform_unit(VVCLocalContext *lc, int x0, 
int y0,int tu_width,
 if (ret < 0)
 return ret;
 set_tb_tab(fc->tab.tu_coded_flag[tb->c_idx], 
tu->coded_flag[tb->c_idx], fc, tb);
+} else if (cu->act_enabled_flag) {
+memset(tb->coeffs, 0, tb->tb_width * tb->tb_height * 
sizeof(*tb->coeffs));
 }
 if (tb->c_idx != CR)
 set_tb_size(fc, tb);
diff --git a/libavcodec/vvc/intra.c b/libavcodec/vvc/intra.c
index 0ea33e1e73..f56b43be66 100644
--- a/libavcodec/vvc/intra.c
+++ b/libavcodec/vvc/intra.c
@@ -523,13 +523,14 @@ static void lmcs_scale_chroma(VVCLocalContext *lc, 
TransformUnit *tu, TransformB
 static void add_residual(const VVCLocalContext *lc, TransformUnit *tu, const 
int target_ch_type)
 {
 const VVCFrameContext *fc = lc->fc;
+const CodingUnit *cu  = lc->cu;
 
 for (int i = 0; i < tu->nb_tbs; i++) {
 TransformBlock *tb  = tu->tbs + i;
 const int c_idx = tb->c_idx;
 const int ch_type   = c_idx > 0;
 const ptrdiff_t stride  = fc->frame->linesize[c_idx];
-const bool has_residual = tb->has_coeffs ||
+const bool has_residual = tb->has_coeffs || cu->act_enabled_flag ||
   (c_idx && tu->joint_cbcr_residual_flag);
 uint8_t *dst= POS(c_idx, tb->x0, tb->y0);
 
@@ -543,12 +544,13 @@ static void itransform(VVCLocalContext *lc, TransformUnit 
*tu, const int target_
 const VVCFrameContext *fc = lc->fc;
 const CodingUnit *cu  = lc->cu;
 TransformBlock *tbs   = tu->tbs;
+const bool is_act_luma= cu->act_enabled_flag && target_ch_type == LUMA;
 
 for (int i = 0; i < tu->nb_tbs; i++) {
 TransformBlock *tb = tbs + i;
 const int c_idx= tb->c_idx;
 const int ch_type  = c_idx > 0;
-const bool do_itx  = ch_type == target_ch_type;
+const bool do_itx  = is_act_luma || !cu->act_enabled_flag && ch_type 
== target_ch_type;
 
 if (tb->has_coeffs && do_itx) {
 if (cu->bdpcm_flag[tb->c_idx])
@@ -568,6 +570,13 @@ static void itransform(VVCLocalContext *lc, TransformUnit 
*tu, const int target_
 lmcs_scale_chroma(lc, tu, tb, target_ch_type);
 }
 }
+
+if (is_act_luma) {
+fc->vvcdsp.itx.adaptive_color_transform(
+tbs[LUMA].coeffs, tbs[CB].coeffs, tbs[CR].coeffs,
+tbs[LUMA].tb_width, tbs[LUMA].tb_height);
+}
+
 add_residual(lc, tu, target_ch_type);
 }
 
-- 
2.44.0.windows.1
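
The new do_itx condition mixes || and && without parentheses; since && binds tighter, it reads as is_act_luma || (!act_enabled_flag && ch_type == target_ch_type). A minimal sketch of that predicate follows, with the parentheses written out. The standalone function do_itx() and the enum values for LUMA and CHROMA are assumptions for illustration only. As far as the diff shows, the effect is that with ACT enabled all transform blocks are inverse-transformed during the luma pass, so the color transform has all three residual blocks available, and none are transformed during the chroma pass.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed channel-type values; the decoder defines its own. */
enum { LUMA = 0, CHROMA = 1 };

/* Hypothetical standalone form of the do_itx expression from the patch. */
static bool do_itx(bool act_enabled_flag, int ch_type, int target_ch_type)
{
    const bool is_act_luma = act_enabled_flag && target_ch_type == LUMA;
    return is_act_luma || (!act_enabled_flag && ch_type == target_ch_type);
}
```

For instance, do_itx(true, CHROMA, LUMA) is true, while do_itx(true, CHROMA, CHROMA) is false; with ACT disabled the predicate reduces to ch_type == target_ch_type.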


Re: [FFmpeg-devel] [PATCH 1/2] Remove libpostproc

2025-05-14 Thread Kieran Kunhya via ffmpeg-devel
On Wed, May 14, 2025 at 11:21 AM Michael Niedermayer
 wrote:
>
> Hi Andrew
>
> On Wed, May 14, 2025 at 05:54:54AM +0300, Andrew Randrianasulu wrote:
> > Wed, 14 May 2025, 03:55 Andrew Randrianasulu:
> >
> > >
> > >
> > > Tue, 6 May 2025, 02:27 Michael Niedermayer:
> > >
> > >> This will be available in https://github.com/michaelni/libpostproc
> > >> either as a separate library or a ffmpeg source plugin whatever turns
> > >> out more convenient to maintain
> > >>
> > >
> > >
> > >
> > > Congratulations, you broke building cinelerra-gg with ffmpeg.git despite
> > > our best efforts :/
> > >
> > > Why all this code movement?!
> > >
> > > For whom is it "simple"?
> > >
> >
> >
> > For some reason this mail did not arrive in my inbox (spam filter ate it?)
> >
> > https://ffmpeg.org/pipermail/ffmpeg-devel/2025-May/343192.html
> >
> > =
> >
> > The idea of course here is to expand this to filters and other
> > things. Which again is trivial, nothing really is needed except
> > people simply following this style of a source plugin
> >
> >
> > =
> >
> > I find this concerning: does this mean ffmpeg will be fragmented, like
> > Python or Rust, into a million pieces that users are supposed to hold
> > together?

libpostproc never really fit in FFmpeg and has a lot of out-of-date code;
that's why it was removed.

>
> simple answer: no
>
> There is an increasing number of filters which do not fit into FFmpeg,
> for a wide range of reasons. ATM these are simply inaccessible and
> invisible to users.
> With plugins you will be able to use filters that have ugly dependencies,
> or cannot be in main FFmpeg for other reasons.
> Or you can also choose not to touch them.
>
> If there is interest we can make releases with and without all plugins
> (in fact I intend to include libpostproc in the next release)

I do not think we should have a plugin API. As we have seen from other
open source multimedia projects like GStreamer, the only use case will
be to incorporate binary blobs. FFmpeg will then get the support
burden from these binary blobs. The Linux kernel suffered from the
same issues.

We should encourage users to upstream patches.

Regards,
Kieran Kunhya


[FFmpeg-devel] [PATCH] Add tools/merge-all-source-plugins

2025-05-14 Thread Michael Niedermayer
Simple script to merge all source plugins.

Signed-off-by: Michael Niedermayer 
---
 INSTALL.md | 3 +++
 tools/merge-all-source-plugins | 3 +++
 2 files changed, 6 insertions(+)
 create mode 100644 tools/merge-all-source-plugins

diff --git a/INSTALL.md b/INSTALL.md
index bdf58140149..0de204cef5b 100644
--- a/INSTALL.md
+++ b/INSTALL.md
@@ -1,5 +1,8 @@
 ## Installing FFmpeg
 
+0. If you like to include source plugins, merge them before configure
+for example run tools/merge-all-source-plugins
+
 1. Type `./configure` to create the configuration. A list of configure
 options is printed by running `configure --help`.
 
diff --git a/tools/merge-all-source-plugins b/tools/merge-all-source-plugins
new file mode 100644
index 000..20764a07737
--- /dev/null
+++ b/tools/merge-all-source-plugins
@@ -0,0 +1,3 @@
+#!/bin/sh
+
+git pull --no-rebase --log --stat --commit --no-edit https://github.com/michaelni/FFmpeg.git sourceplugin-libpostproc
-- 
2.49.0



Re: [FFmpeg-devel] [PATCH 1/2] Remove libpostproc

2025-05-14 Thread Michael Niedermayer
Hi Andrew

On Wed, May 14, 2025 at 05:54:54AM +0300, Andrew Randrianasulu wrote:
> On Wed, 14 May 2025 at 03:55, Andrew Randrianasulu wrote:
> 
> >
> >
> > On Tue, 6 May 2025 at 02:27, Michael Niedermayer wrote:
> >
> >> This will be available in https://github.com/michaelni/libpostproc
> >> either as a separate library or an ffmpeg source plugin, whichever turns
> >> out more convenient to maintain
> >>
> >
> >
> >
> > Congratulations, you broke building cinelerra-gg with ffmpeg.git despite
> > our best efforts :/
> >
> > Why all this code movement?!
> >
> > For whom is it "simple"?
> >
> 
> 
> For some reason this mail did not arrive in my inbox (spam filter ate it?)
> 
> https://ffmpeg.org/pipermail/ffmpeg-devel/2025-May/343192.html
> 
> =
> 
> The idea of course here is to expand this to filters and other
> things. Which again is trivial, nothing really is needed except
> people simply following this style of a source plugin
> 
> 
> =
> 
> I find this concerning: does this mean ffmpeg will be fragmented, like
> Python or Rust, into a million pieces that users are supposed to hold together?

Simple answer: no.

There is an increasing number of filters which do not fit into FFmpeg,
for a wide range of reasons. ATM these are simply inaccessible and
invisible to users.
With plugins you will be able to use filters that have ugly dependencies,
or cannot be in main FFmpeg for other reasons.
Or you can also choose not to touch them.

If there is interest we can make releases with and without all plugins
(in fact I intend to include libpostproc in the next release)

thx

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

it is not once nor twice but times without number that the same ideas make
their appearance in the world. -- Aristotle




Re: [FFmpeg-devel] Vittorio's mailinglist ban, git access removal and message deletion (was: Worsening messages)

2025-05-14 Thread Kieran Kunhya via ffmpeg-devel
On Tue, 31 Dec 2024, 06:44 Michael Niedermayer wrote:

> Hi Ronald, Vittorio
>
> On Mon, Dec 30, 2024 at 11:05:59PM -0500, Ronald S. Bultje wrote:
> > Hi Michael,
> >
> > On Mon, Dec 30, 2024 at 2:13 PM Michael Niedermayer <mich...@niedermayer.cc>
> > wrote:
> >
> > > Hi CC-2024
> > >
> > > On Mon, Dec 30, 2024 at 08:31:00AM -0500, Ronald S. Bultje wrote:
> > > > Dear Michael,
> > > >
> > > > The other 4 members of the previous CC (JB, Steven, James, Ronald)
> > > > would like to make it clear that they were never consulted on this
> > > > decision. You (Michael) have - after the fact - asked us (the other 4
> > > > previous-CC members) to look into Vittorio's emails and Kieran's
> > > > re-post thereof, which the current CC can pick up from here. If you
> > > > disagree with their decision, you can ask the GA to re-evaluate.
> > > >
> > > > We (the other 4 previous-CC members) recommend that - until a decision
> > > > has been rendered - Vittorio's git commit access gets reinstated and
> > > > the deleted email gets restored into the archives. The mailing list
> > > > ban(s) for Vittorio have already been revoked.
> > >
> > > As I said in my reply to Vittorio, I will re-enable his account
> > > once his identity is verified and he explains what he wants to
> > > maintain in ffmpeg. (Note I am behind with my mails, so if I didn't
> > > react to something, it's because I try to get some distance from this
> > > nonsense; not distance from FFmpeg, which is dear to my heart, but
> > > from this political stuff.)
> > >
> > > It seems you want to turn this into a political power play. I find this
> > > unneeded.
> > > You can simply verify that Vittorio is himself when you next meet him
> > > and drink beer with him (which you said you did previously).
> > > It's just a simple "yeah, I met him after this and he is still in control
> > > of his accounts & keys"; there's no KYC here, just a statement from you
> > > saying you know him and he's still your good friend Vittorio.
> >
> >
> > I just had a few beers with Vittorio and can confirm he’s still himself
> > and in control of his keys.
>
> That's excellent, thanks a lot for confirming, and sorry also to you and
> especially to Vittorio for the inconvenience.
>
> I am still quite a bit behind reading emails; I was just pointed to this
> one here.
>
> thx
>
> [...]
> --
> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> "I am not trying to be anyone's saviour, I'm trying to think about the
>  future and not be sad" - Elon Musk
>

Hi Michael,

Just to confirm: four FFmpeg developers saw Vittorio in New York today and
can confirm he is a genuine human.

It's interesting you want Vittorio to comply with KYC when you say "What is
kyc? Its a tool that makes you give out your real ID, while criminals
give out a forged ID card."

So if I understand correctly, you would like Vittorio to be verified but
you don't need verification with the community?

Regards,
Kieran Kunhya



Re: [FFmpeg-devel] [PATCH v2 6/7] checkasm: hevc sao_edge, benchmarking inside the width loop is meaningless

2025-05-14 Thread softworkz .



> -Original Message-
> From: ffmpeg-devel  On Behalf Of Nuo Mi
> Sent: Saturday, 3 May 2025 11:13
> To: ffmpeg-devel@ffmpeg.org
> Cc: Nuo Mi 
> Subject: [FFmpeg-devel] [PATCH v2 6/7] checkasm: hevc sao_edge, benchmarking
> inside the width loop is meaningless
> 
> ---
>  tests/checkasm/hevc_sao.c | 14 +++---
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/tests/checkasm/hevc_sao.c b/tests/checkasm/hevc_sao.c
> index ad47423f10..f597eb5254 100644
> --- a/tests/checkasm/hevc_sao.c
> +++ b/tests/checkasm/hevc_sao.c
> @@ -119,21 +119,21 @@ static void check_sao_edge(HEVCDSPContext *h, int
> bit_depth)
>  declare_func(void, uint8_t *dst, const uint8_t *src, ptrdiff_t
> stride_dst,
>   const int16_t *sao_offset_val, int eo, int width, int
> height);
> 
> -for (int w = prev_size + 4; w <= block_size; w += 4) {
> -randomize_buffers(src0, src1, BUF_SIZE);
> -randomize_buffers2(offset_val, OFFSET_LENGTH);
> -memset(dst0, 0, BUF_SIZE);
> -memset(dst1, 0, BUF_SIZE);
> +if (check_func(h->sao_edge_filter[i], "hevc_sao_edge_%d_%d",
> block_size, bit_depth)) {
> +for (int w = prev_size + 4; w <= block_size; w += 4) {
> +randomize_buffers(src0, src1, BUF_SIZE);
> +randomize_buffers2(offset_val, OFFSET_LENGTH);
> +memset(dst0, 0, BUF_SIZE);
> +memset(dst1, 0, BUF_SIZE);
> 
> -if (check_func(h->sao_edge_filter[i], "hevc_sao_edge_%d_%d",
> block_size, bit_depth)) {
>  call_ref(dst0, src0 + offset, stride, offset_val, eo, w,
> block_size);
>  call_new(dst1, src1 + offset, stride, offset_val, eo, w,
> block_size);
>  for (int j = 0; j < block_size; j++) {
>  if (memcmp(dst0 + j*stride, dst1 + j*stride,
> w*SIZEOF_PIXEL))
>  fail();
>  }
> -bench_new(dst1, src1 + offset, stride, offset_val, eo,
> block_size, block_size);
>  }
> +bench_new(dst1, src1 + offset, stride, offset_val, eo,
> block_size, block_size);
>  }
>  }
>  }
> --
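
As a rough illustration of the restructuring in the quoted diff, here is a minimal standalone sketch: correctness is compared for every width inside the loop, while the timed call runs once afterwards at the full block size. It does not use the checkasm macros; `filter_ref`, `filter_new`, `bench_once`, `bench_calls` and `check_all_widths` are hypothetical names invented for this example, and the "filter" is a trivial pixel inversion rather than SAO.

```c
#include <string.h>

enum { BLOCK = 16, STRIDE = 32 };

/* Hypothetical counter: how many times the timed path ran. */
static int bench_calls;

/* Hypothetical reference filter: inverts w x h pixels. */
static void filter_ref(unsigned char *dst, const unsigned char *src, int w, int h)
{
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++)
            dst[y * STRIDE + x] = (unsigned char)(255 - src[y * STRIDE + x]);
}

/* Hypothetical "optimized" filter: same result, computed with XOR. */
static void filter_new(unsigned char *dst, const unsigned char *src, int w, int h)
{
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++)
            dst[y * STRIDE + x] = (unsigned char)(src[y * STRIDE + x] ^ 0xFF);
}

/* Stand-in for a benchmark macro; a real harness would time repeated calls. */
static void bench_once(unsigned char *dst, const unsigned char *src)
{
    filter_new(dst, src, BLOCK, BLOCK);
    bench_calls++;
}

/* Returns the number of mismatching rows across all tested widths. */
static int check_all_widths(void)
{
    unsigned char src[BLOCK * STRIDE], dst0[BLOCK * STRIDE], dst1[BLOCK * STRIDE];
    int failures = 0;

    for (int i = 0; i < BLOCK * STRIDE; i++)
        src[i] = (unsigned char)(i * 37);

    /* Correctness: compare reference and new output for each width. */
    for (int w = 4; w <= BLOCK; w += 4) {
        memset(dst0, 0, sizeof(dst0));
        memset(dst1, 0, sizeof(dst1));
        filter_ref(dst0, src, w, BLOCK);
        filter_new(dst1, src, w, BLOCK);
        for (int y = 0; y < BLOCK; y++)
            if (memcmp(dst0 + y * STRIDE, dst1 + y * STRIDE, (size_t)w))
                failures++;
    }

    /* Timing happens once, after the loop, at the full block size. */
    bench_once(dst1, src);
    return failures;
}
```

Calling `check_all_widths()` from a test driver exercises four widths but increments `bench_calls` only once, which is the separation the patch subject describes.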

Hi Nuo,

Since you applied this patch (or 7/7) today, both FATE builds on Windows
(MSVC + GCC) have been failing for all submitted patches.

https://patchwork.ffmpeg.org/project/ffmpeg 


Could you please take a look?




With MSVC:

D:\a\1\s\libavcodec\get_bits.h(366): warning C4101: 're_cache': unreferenced local variable
vvc_alf.c
CC  tests/checkasm/vp9dsp.o
vp9dsp.c
D:\a\1\s\libavcodec\get_bits.h(366): warning C4101: 're_cache': unreferenced local variable
STRIP   tests/checkasm/x86/checkasm.o
skipping strip -x tests/checkasm/x86/checkasm.o
CC  tests/checkasm/vvc_sao.o
vvc_sao.c
D:\a\1\s\libavcodec\get_bits.h(366): warning C4101: 're_cache': unreferenced local variable
D:\a\1\s\libavcodec\get_bits.h(366): warning C4101: 're_cache': unreferenced local variable
C:\Program Files (x86)\Windows Kits\10\\include\10.0.26100.0\\um\winnt.h(21227): error C2143: syntax error: missing ':' before 'constant'
C:\Program Files (x86)\Windows Kits\10\\include\10.0.26100.0\\um\winnt.h(21227): error C2143: syntax error: missing ';' before ':'
C:\Program Files (x86)\Windows Kits\10\\include\10.0.26100.0\\um\winnt.h(21227): error C2059: syntax error: ':'
C:\Program Files (x86)\Windows Kits\10\\include\10.0.26100.0\\um\winnt.h(21228): error C2143: syntax error: missing '{' before ':'
C:\Program Files (x86)\Windows Kits\10\\include\10.0.26100.0\\um\winnt.h(21228): error C2059: syntax error: ':'
make: *** [ffbuild/common.mak:81: tests/checkasm/vvc_sao.o] Error 2
C:\Program Files (x86)\Windows Kits\10\\include\10.0.26100.0\\um\winnt.h(21229): error C2059: syntax error: '}'
C:\Program Files (x86)\Windows Kits\10\\include\10.0.26100.0\\um\winnt.h(21230): error C2059: syntax error: '}'
C:\Program Files (x86)\Windows Kits\10\\include\10.0.26100.0\\um\winnt.h(21231): error C2059: syntax error: '}'

https://dev.azure.com/githubsync/ffmpeg/_build/results?buildId=87858&view=logs
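
For readers unfamiliar with this class of failure: the GCC log below points at `#define CR 2` in libavcodec/vvc/dec.h, and the MSVC errors inside winnt.h are consistent with the same cause, namely an object-like macro that is still active when a later header uses the same name as an identifier. The following minimal sketch reproduces the pattern together with one possible workaround. `fake_platform_regs`, `fake_read_cr` and `chroma_index` are hypothetical stand-ins, not real Windows SDK or FFmpeg declarations, and `#pragma push_macro`/`pop_macro` is a compiler extension (available in MSVC, GCC and Clang) rather than standard C. This is not necessarily how the issue was or should be fixed in FFmpeg.

```c
/* Stand-in for the component-index macro shown in the GCC log. */
#define CR 2

/* Hide the macro while the (hypothetical) platform declarations are parsed.
 * Without this, "int CR;" would preprocess to "int 2;" and fail to compile. */
#pragma push_macro("CR")
#undef CR

struct fake_platform_regs {
    int CR;
};

static int fake_read_cr(const struct fake_platform_regs *regs)
{
    return regs->CR;
}

#pragma pop_macro("CR")

/* The macro is visible again here and still expands to 2. */
static int chroma_index(void)
{
    return CR;
}
```

Other options with the same effect would be including the platform header before the macro is defined, or renaming the macro with a project prefix so it cannot collide.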


With GCC:

CC  tests/checkasm/vf_threshold.o
CC  tests/checkasm/videodsp.o
CC  tests/checkasm/vorbisdsp.o
CC  tests/checkasm/vp8dsp.o
CC  tests/checkasm/vp9dsp.o
CC  tests/checkasm/vvc_alf.o
CC  tests/checkasm/vvc_mc.o
CC  tests/checkasm/vvc_sao.o
X86ASM  tests/checkasm/x86/checkasm.o
CC  tests/api/api-threadmessage-test.o
In file included from ./libavcodec/vvc/ctu.h:31,
 from tests/checkasm/vvc_sao.c:27:
./libavcodec/vvc/dec.h:36:33: error: expected identifier or '(' before numeric constant
   36 | #define CR  2
  | ^
make: *** [ffbuild/common.mak:81: tests/checkasm/vvc_sao.o] Error 1
CC  tests/api/api-flac-test.o
CC  tests/api/api-seek-test.o
HOSTCC  tests/audiogen.o
STRIP   tests/checkasm/x86/checkasm.o
HOSTCC  tests/videogen.o
CC  libav