date:20240624

Re: [FFmpeg-devel] [PATCH] aarch64: Add OpenBSD runtime detection of dotprod and i8mm using sysctl

2024-06-24 Thread Martin Storsjö


On Sat, 22 Jun 2024, Brad Smith wrote:


[PATCH] aarch64: Add OpenBSD runtime detection of dotprod and i8mm using sysctl

Signed-off-by: Brad Smith 
---
libavutil/aarch64/cpu.c | 35 +++
1 file changed, 35 insertions(+)

diff --git a/libavutil/aarch64/cpu.c b/libavutil/aarch64/cpu.c
index 196bdaf6b0..40fcc8d1ff 100644
--- a/libavutil/aarch64/cpu.c
+++ b/libavutil/aarch64/cpu.c
@@ -65,6 +65,41 @@ static int detect_flags(void)
return flags;
}

+#elif defined(__OpenBSD__)
+#include 
+#include 
+#include 
+#include 
+
+static int detect_flags(void)
+{
+int flags = 0;
+int mib[2];
+uint64_t isar0;
+uint64_t isar1;
+size_t len;
+
+mib[0] = CTL_MACHDEP;
+mib[1] = CPU_ID_AA64ISAR0;
+len = sizeof(isar0);
+if (sysctl(mib, 2, &isar0, &len, NULL, 0) != -1) {
+if (ID_AA64ISAR0_DP(isar0) >= ID_AA64ISAR0_DP_IMPL)
+flags |= AV_CPU_FLAG_DOTPROD;
+}
+
+mib[0] = CTL_MACHDEP;
+mib[1] = CPU_ID_AA64ISAR1;
+len = sizeof(isar1);
+if (sysctl(mib, 2, &isar1, &len, NULL, 0) != -1) {
+#ifdef ID_AA64ISAR1_I8MM_IMPL
+if (ID_AA64ISAR1_I8MM(isar1) >= ID_AA64ISAR1_I8MM_IMPL)
+flags |= AV_CPU_FLAG_I8MM;
+#endif
+}
+
+return flags;
+}
+
#elif defined(_WIN32)
#include 


This LGTM. Although, in 
https://code.videolan.org/videolan/dav1d/-/merge_requests/1673 you wrapped 
most of this in an #ifdef CPU_ID_AA64ISAR0, so would that be useful here 
too?


Feel free to push either with or without that.
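
For reference, a guard of the kind mentioned above might look roughly like the sketch below. It is not the dav1d or FFmpeg code: the header names are an assumption (they are elided in the quoted patch), and SKETCH_FLAG_* and detect_flags_sketch() are hypothetical stand-ins for AV_CPU_FLAG_* and detect_flags(). Where the system headers do not define CPU_ID_AA64ISAR0, the function simply reports no extensions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for AV_CPU_FLAG_DOTPROD / AV_CPU_FLAG_I8MM. */
#define SKETCH_FLAG_DOTPROD (1 << 0)
#define SKETCH_FLAG_I8MM    (1 << 1)

#if defined(__OpenBSD__) && defined(__aarch64__)
/* Assumed header set; the include names are elided in the quoted patch. */
#include <sys/types.h>
#include <sys/sysctl.h>
#include <machine/cpu.h>
#include <machine/armreg.h>
#endif

static int detect_flags_sketch(void)
{
    int flags = 0;

#ifdef CPU_ID_AA64ISAR0
    /* Only compiled when the headers expose the ISAR sysctl nodes. */
    int mib[2];
    uint64_t isar0;
    uint64_t isar1;
    size_t len;

    mib[0] = CTL_MACHDEP;
    mib[1] = CPU_ID_AA64ISAR0;
    len = sizeof(isar0);
    if (sysctl(mib, 2, &isar0, &len, NULL, 0) != -1) {
        if (ID_AA64ISAR0_DP(isar0) >= ID_AA64ISAR0_DP_IMPL)
            flags |= SKETCH_FLAG_DOTPROD;
    }

    mib[0] = CTL_MACHDEP;
    mib[1] = CPU_ID_AA64ISAR1;
    len = sizeof(isar1);
    if (sysctl(mib, 2, &isar1, &len, NULL, 0) != -1) {
#ifdef ID_AA64ISAR1_I8MM_IMPL
        if (ID_AA64ISAR1_I8MM(isar1) >= ID_AA64ISAR1_I8MM_IMPL)
            flags |= SKETCH_FLAG_I8MM;
#endif
    }
#endif /* CPU_ID_AA64ISAR0 */

    return flags;
}
```

With the guard in place, older OpenBSD headers would still build; they would just never set either flag.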

// Martin

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH v3] movenc: Add an option for resilient, hybrid fragmented/non-fragmented muxing

2024-06-24 Thread Martin Storsjö


On Thu, 20 Jun 2024, Dennis Sädtler wrote:


On 2024-06-20 15:47, Timo Rothenpieler wrote:

On 20/06/2024 15:46, Martin Storsjö wrote:

On Wed, 19 Jun 2024, Martin Storsjö wrote:


This allows ending up with a normal, non-fragmented file when
the file is finished, while keeping the file readable if writing
is aborted abruptly at any point. (Normally when writing a
mov/mp4 file, the unfinished file is completely useless unless it
is finished properly.)

This results in a file where the mdat atom contains (and hides)
all the moof atoms that were part of the fragmented file structure
initially.
---
v3: Renamed the option to hybrid_fragmented.
---
doc/muxers.texi    | 11 ++
libavformat/movenc.c   | 62 +++---
libavformat/movenc.h   |  4 ++-
libavformat/version.h  |  4 +--
tests/fate/lavf-container.mak  |  3 +-
tests/ref/lavf/mov_hybrid_frag |  3 ++
6 files changed, 78 insertions(+), 9 deletions(-)
create mode 100644 tests/ref/lavf/mov_hybrid_frag


If there are no more comments on this one, I'll go ahead and push it soon.


+1 from me


Sounds good to me as well.


Pushed now, thanks for all the input!

// Martin

[FFmpeg-devel] [PATCH 1/4] hlsenc: Fix the return value accumulation in append_single_file

2024-06-24 Thread Martin Storsjö

Both the read_byte variable (which is accumulated into
append_single_file) and the return value are int64_t;
give the ret variable the right corresponding type too.
---
 libavformat/hlsenc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavformat/hlsenc.c b/libavformat/hlsenc.c
index f5c0243cf1..3d5eb47e84 100644
--- a/libavformat/hlsenc.c
+++ b/libavformat/hlsenc.c
@@ -2380,7 +2380,7 @@ static int hls_init_file_resend(AVFormatContext *s, 
VariantStream *vs)
 
 static int64_t append_single_file(AVFormatContext *s, VariantStream *vs)
 {
-int ret = 0;
+int64_t ret = 0;
 int64_t read_byte = 0;
 int64_t total_size = 0;
 char *filename = NULL;
-- 
2.39.3 (Apple Git-146)
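
A small sketch of why the accumulator type matters here, assuming a 32-bit int. The helper names sum_bytes() and fits_in_int() are hypothetical and not part of hlsenc.c; they only show that a byte total built from int64_t counts can exceed what a plain int holds.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: sums per-read byte counts, as a file-append loop would. */
static int64_t sum_bytes(const int64_t *read_bytes, int n)
{
    int64_t total = 0;  /* wide accumulator, matching the int64_t return type */
    for (int i = 0; i < n; i++)
        total += read_bytes[i];
    return total;
}

/* Returns 1 if v can be stored in a plain int without loss. */
static int fits_in_int(int64_t v)
{
    return v >= INT_MIN && v <= INT_MAX;
}
```

Three reads of 1 GiB each already produce a total that fits_in_int() rejects, which is the situation an int accumulator cannot represent.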


[FFmpeg-devel] [PATCH 2/4] hlsenc: Fix setting vs->start_pos when not using HLS_SINGLE_FILE or hls_segment_size

2024-06-24 Thread Martin Storsjö

When not using HLS_SINGLE_FILE or hls_segment_size, we're writing
each segment into a separate file. In that case, the file start pos for
each segment will be zero.

This matches the case in (hls->max_seg_size > 0) above, where we
decide to switch to a new file.

This fixes the calculation of "vs->size = new_start_pos - vs->start_pos"
at the start of hls_write_packet; previously, start_pos would
refer to the byte size of the previous segment file, giving
vs->size entirely bogus values here.
---
 libavformat/hlsenc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavformat/hlsenc.c b/libavformat/hlsenc.c
index 3d5eb47e84..0c72774e29 100644
--- a/libavformat/hlsenc.c
+++ b/libavformat/hlsenc.c
@@ -2659,7 +2659,7 @@ static int hls_write_packet(AVFormatContext *s, AVPacket 
*pkt)
 vs->start_pos = new_start_pos;
 }
 } else {
-vs->start_pos = new_start_pos;
+vs->start_pos = 0;
 sls_flag_file_rename(hls, vs, old_filename);
 ret = hls_start(s, vs);
 }
-- 
2.39.3 (Apple Git-146)
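
The effect on the size calculation can be modelled with a short sketch. It assumes that the write position restarts at zero in every new segment file; second_segment_size() and its parameters are hypothetical and only mirror the arithmetic quoted in the commit message.

```c
#include <stdint.h>

/* Mirrors "vs->size = new_start_pos - vs->start_pos" from the commit message. */
static int64_t segment_size(int64_t new_start_pos, int64_t start_pos)
{
    return new_start_pos - start_pos;
}

/*
 * Hypothetical model of two consecutive segments written to separate files.
 * reset_to_zero selects the patched behaviour (start_pos = 0 for a new file);
 * otherwise start_pos keeps the end offset of the previous file.
 */
static int64_t second_segment_size(int64_t first_len, int64_t second_len,
                                   int reset_to_zero)
{
    int64_t start_pos = 0;
    int64_t new_start_pos = first_len;  /* write position at end of file 1 */

    (void)segment_size(new_start_pos, start_pos);  /* size of segment 1 */
    start_pos = reset_to_zero ? 0 : new_start_pos;

    new_start_pos = second_len;         /* write position at end of file 2 */
    return segment_size(new_start_pos, start_pos);
}
```

For a 1000-byte segment followed by a 1200-byte one, the model returns 1200 with the reset and 200 without it.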


[FFmpeg-devel] [PATCH 3/4] hlsenc: When not using HLS_SINGLE_FILE, set vs->size to range_length

2024-06-24 Thread Martin Storsjö

This matches what is done in the corresponding case for
HLS_SINGLE_FILE.

Normally, vs->size is already initialized correctly - but when
writing the initial segment, with mp4 files, vs->size has been set
to the size of the init segment, while range_length contains the
real size of the first segment.
---
 libavformat/hlsenc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libavformat/hlsenc.c b/libavformat/hlsenc.c
index 0c72774e29..3ca99abdbb 100644
--- a/libavformat/hlsenc.c
+++ b/libavformat/hlsenc.c
@@ -2586,6 +2586,7 @@ static int hls_write_packet(AVFormatContext *s, AVPacket 
*pkt)
 av_dict_free(&options);
 return ret;
 }
+vs->size = range_length;
 ret = hlsenc_io_close(s, &vs->out, filename);
 if (ret < 0) {
 av_log(s, AV_LOG_WARNING, "upload segment failed,"
-- 
2.39.3 (Apple Git-146)


[FFmpeg-devel] [PATCH 4/4] hlsenc: Calculate the average and actual maximum bitrate of segments

2024-06-24 Thread Martin Storsjö

Previously, the bitrate advertised in the master playlist would only
be based on the nominal values in either AVCodecParameters bit_rate,
or via AVCPBProperties max_bitrate. On top of this, a
fudge factor of 10% is added, to account for container overhead.

Neither of these bitrates may be known, and if the encoder is
running in VBR mode, there is no such value to be known. And
the container overhead may be more or less than the given
constant factor of 10%.

Instead, calculate the maximum bitrate per segment based on
what actually gets output from the muxer, and average bitrate
across all segments.

When muxing of the file finishes, update the master playlist
with these values, exposing both the maximum (which previously
was a guesstimate based on the nominal values) via
EXT-X-STREAM-INF BANDWIDTH, and the average via
EXT-X-STREAM-INF AVERAGE-BANDWIDTH.

This makes it possible to use the hlsenc muxer with VBR
encodes, for VOD style muxing.
---
 libavformat/dashenc.c |  4 ++--
 libavformat/hlsenc.c  | 47 ---
 libavformat/hlsplaylist.c |  3 +++
 libavformat/hlsplaylist.h |  1 +
 4 files changed, 40 insertions(+), 15 deletions(-)

diff --git a/libavformat/dashenc.c b/libavformat/dashenc.c
index 8c14aa746e..d4a6fe0304 100644
--- a/libavformat/dashenc.c
+++ b/libavformat/dashenc.c
@@ -1322,7 +1322,7 @@ static int write_manifest(AVFormatContext *s, int final)
 av_strlcat(codec_str, audio_codec_str, sizeof(codec_str));
 }
 get_hls_playlist_name(playlist_file, sizeof(playlist_file), 
NULL, i);
-ff_hls_write_stream_info(st, c->m3u8_out, stream_bitrate,
+ff_hls_write_stream_info(st, c->m3u8_out, stream_bitrate, 0,
  playlist_file, agroup,
  codec_str, NULL, NULL);
 }
@@ -1348,7 +1348,7 @@ static int write_manifest(AVFormatContext *s, int final)
 continue;
 av_strlcpy(codec_str, os->codec_str, sizeof(codec_str));
 get_hls_playlist_name(playlist_file, sizeof(playlist_file), 
NULL, i);
-ff_hls_write_stream_info(st, c->m3u8_out, stream_bitrate,
+ff_hls_write_stream_info(st, c->m3u8_out, stream_bitrate, 0,
  playlist_file, NULL,
  codec_str, NULL, NULL);
 }
diff --git a/libavformat/hlsenc.c b/libavformat/hlsenc.c
index 3ca99abdbb..26722c9b32 100644
--- a/libavformat/hlsenc.c
+++ b/libavformat/hlsenc.c
@@ -150,6 +150,11 @@ typedef struct VariantStream {
 int discontinuity;
 int reference_stream_index;
 
+int64_t total_size;
+double total_duration;
+int64_t avg_bitrate;
+int64_t max_bitrate;
+
 HLSSegment *segments;
 HLSSegment *last_segment;
 HLSSegment *old_segments;
@@ -1108,6 +1113,16 @@ static int hls_append_segment(struct AVFormatContext *s, 
HLSContext *hls,
 if (!en)
 return AVERROR(ENOMEM);
 
+vs->total_size += size;
+vs->total_duration += duration;
+if (duration > 0) {
+int cur_bitrate = (int)(8 * size / duration);
+if (cur_bitrate > vs->max_bitrate)
+vs->max_bitrate = cur_bitrate;
+}
+if (vs->total_duration > 0)
+vs->avg_bitrate = (int)(8 * vs->total_size / vs->total_duration);
+
 en->var_stream_idx = vs->var_stream_idx;
 ret = sls_flags_filename_process(s, hls, vs, en, duration, pos, size);
 if (ret < 0) {
@@ -1362,14 +1377,15 @@ static int64_t get_stream_bit_rate(AVStream *stream)
 }
 
 static int create_master_playlist(AVFormatContext *s,
-  VariantStream * const input_vs)
+  VariantStream * const input_vs,
+  int final)
 {
 HLSContext *hls = s->priv_data;
 VariantStream *vs, *temp_vs;
 AVStream *vid_st, *aud_st;
 AVDictionary *options = NULL;
 unsigned int i, j;
-int ret, bandwidth;
+int ret, bandwidth, avg_bandwidth;
 const char *m3u8_rel_name = NULL;
 const char *vtt_m3u8_rel_name = NULL;
 const char *ccgroup;
@@ -1389,8 +1405,8 @@ static int create_master_playlist(AVFormatContext *s,
 return 0;
 } else {
  /* Keep publishing the master playlist at the configured rate */
-if (&hls->var_streams[0] != input_vs || !hls->master_publish_rate ||
-input_vs->number % hls->master_publish_rate)
+if ((&hls->var_streams[0] != input_vs || !hls->master_publish_rate ||
+input_vs->number % hls->master_publish_rate) && !final)
 return 0;
 }
 
@@ -1480,12 +1496,17 @@ static int create_master_playlist(AVFormatContext *s,
 }
 }
 
-bandwidth = 0;
-if (vid_st)
-bandwidth += get_stream_bit_rate(vid_st);
-if (aud_st)
-bandwidth += get_stream_bi
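
The per-segment bookkeeping described in the commit message can be sketched in isolation as follows. SegmentStats and stats_add_segment() are hypothetical names, not the hlsenc.c structures, and the sketch keeps the per-segment bitrate in int64_t rather than casting through int as the hunk above does.

```c
#include <stdint.h>

/* Hypothetical container for the four counters added to VariantStream. */
typedef struct SegmentStats {
    int64_t total_size;      /* bytes written across all segments */
    double  total_duration;  /* seconds across all segments */
    int64_t avg_bitrate;     /* bits per second over the whole stream */
    int64_t max_bitrate;     /* highest per-segment bits per second */
} SegmentStats;

/* Accounts for one finished segment of `size` bytes lasting `duration` seconds. */
static void stats_add_segment(SegmentStats *st, int64_t size, double duration)
{
    st->total_size     += size;
    st->total_duration += duration;

    if (duration > 0) {
        int64_t cur_bitrate = (int64_t)(8 * size / duration);
        if (cur_bitrate > st->max_bitrate)
            st->max_bitrate = cur_bitrate;
    }
    if (st->total_duration > 0)
        st->avg_bitrate = (int64_t)(8 * st->total_size / st->total_duration);
}
```

The maximum would then feed BANDWIDTH and the average AVERAGE-BANDWIDTH in EXT-X-STREAM-INF, as the commit message describes.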

Re: [FFmpeg-devel] [PATCH v3 2/3] avcodec/jpeg2000dec: Add support for placeholder passes

2024-06-24 Thread Andreas Rheinhardt

Osamu Watanabe:
> This commit adds support for placeholder pass parsing
> 

What is a placeholder pass?

> Signed-off-by: Osamu Watanabe 
> ---
>  libavcodec/jpeg2000.h  |   2 +
>  libavcodec/jpeg2000dec.c   | 292 +++--
>  libavcodec/jpeg2000htdec.c |  18 +--
>  3 files changed, 257 insertions(+), 55 deletions(-)
> 


Re: [FFmpeg-devel] [PATCH v3] movenc: Add an option for resilient, hybrid fragmented/non-fragmented muxing

2024-06-24 Thread Dennis Mungai

On Mon, 24 Jun 2024, 11:24 Martin Storsjö,  wrote:

> On Thu, 20 Jun 2024, Dennis Sädtler wrote:
>
> > On 2024-06-20 15:47, Timo Rothenpieler wrote:
> >> On 20/06/2024 15:46, Martin Storsjö wrote:
> >>> On Wed, 19 Jun 2024, Martin Storsjö wrote:
> >>>
>  This allows ending up with a normal, non-fragmented file when
>  the file is finished, while keeping the file readable if writing
>  is aborted abruptly at any point. (Normally when writing a
>  mov/mp4 file, the unfinished file is completely useless unless it
>  is finished properly.)
> 
>  This results in a file where the mdat atom contains (and hides)
>  all the moof atoms that were part of the fragmented file structure
>  initially.
>  ---
>  v3: Renamed the option to hybrid_fragmented.
>  ---
>  doc/muxers.texi| 11 ++
>  libavformat/movenc.c   | 62 +++---
>  libavformat/movenc.h   |  4 ++-
>  libavformat/version.h  |  4 +--
>  tests/fate/lavf-container.mak  |  3 +-
>  tests/ref/lavf/mov_hybrid_frag |  3 ++
>  6 files changed, 78 insertions(+), 9 deletions(-)
>  create mode 100644 tests/ref/lavf/mov_hybrid_frag
> >>>
> >>> If there are no more comments on this one, I'll go ahead and push it
> soon.
> >>
> >> +1 from me
> >
> > Sounds good to me as well.
>
> Pushed now, thanks for all the input!
>
> // Martin
>



Thanks for the patch, this resolves multiple issues with fragmented output
even when FFmpeg exits suddenly. Retesting shortly with fmp4 + tee
muxer(s).


Re: [FFmpeg-devel] [PATCH v3 2/3] avcodec/jpeg2000dec: Add support for placeholder passes

2024-06-24 Thread WATANABE Osamu

A placeholder pass is a coding pass with zero length. It is needed to preserve 
the pass boundaries of layers when transcoding from an HT to a non-HT 
codestream. It is defined in the HTJ2K specification.

A detailed explanation is available at 
https://ds.jpeg.org/documents/jpeg2000/wg1n100680-101-COM-Guideline_on_Placeholder_Passes_and_Multiple_HT_Sets_in_HTJ2K_codestreams.zip


> On 2024/06/24 18:34, Andreas Rheinhardt wrote:
> 
> Osamu Watanabe:
>> This commit adds support for placeholder pass parsing
>> 
> 
> What is a placeholder pass?
> 
>> Signed-off-by: Osamu Watanabe 
>> ---
>> libavcodec/jpeg2000.h  |   2 +
>> libavcodec/jpeg2000dec.c   | 292 +++--
>> libavcodec/jpeg2000htdec.c |  18 +--
>> 3 files changed, 257 insertions(+), 55 deletions(-)
>> 
> 

Re: [FFmpeg-devel] [PATCH 2/4] hlsenc: Fix setting vs->start_pos when not using HLS_SINGLE_FILE or hls_segment_size

2024-06-24 Thread Steven Liu



> On Jun 24, 2024, at 16:49, Martin Storsjö  wrote:
> 
> When not using HLS_SINGLE_FILE or hls_segment_size, we're writing
> each segment into a separate file. In that case, the file start pos for
> each segment will be zero.
> 
> This matches the case in (hls->max_seg_size > 0) above, where we
> decide to switch to a new file.
> 
> This fixes the calculation of "vs->size = new_start_pos - vs->start_pos"
> at the start of hls_write_packet; previously, start_pos would
> refer to the byte size of the previous segment file, giving
> vs->size entirely bogus values here.
> ---
> libavformat/hlsenc.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/libavformat/hlsenc.c b/libavformat/hlsenc.c
> index 3d5eb47e84..0c72774e29 100644
> --- a/libavformat/hlsenc.c
> +++ b/libavformat/hlsenc.c
> @@ -2659,7 +2659,7 @@ static int hls_write_packet(AVFormatContext *s, 
> AVPacket *pkt)
> vs->start_pos = new_start_pos;
> }
> } else {
> -vs->start_pos = new_start_pos;
> +vs->start_pos = 0;
> sls_flag_file_rename(hls, vs, old_filename);
> ret = hls_start(s, vs);
> }
> -- 
> 2.39.3 (Apple Git-146)
> 

patchset lgtm

Thanks Martin

Steven



[FFmpeg-devel] [PATCH v3 2/3] swscale/aarch64: Add bgra/rgba to yuv

2024-06-24 Thread Zhao Zhili

From: Zhao Zhili 

Test on Apple M1 with kperf
: -O3   : -O3 -fno-vectorize
bgra_to_uv_8_c  : 13.4  : 27.5
bgra_to_uv_8_neon   : 37.4  : 41.7
bgra_to_uv_128_c: 155.9 : 550.2
bgra_to_uv_128_neon : 91.7  : 92.7
bgra_to_uv_1080_c   : 1173.2: 4558.2
bgra_to_uv_1080_neon: 822.7 : 809.5
bgra_to_uv_1920_c   : 2078.2: 8115.2
bgra_to_uv_1920_neon: 1437.7: 1438.7
bgra_to_uv_half_8_c : 17.9  : 14.2
bgra_to_uv_half_8_neon  : 37.4  : 10.5
bgra_to_uv_half_128_c   : 103.9 : 326.0
bgra_to_uv_half_128_neon: 73.9  : 68.7
bgra_to_uv_half_1080_c  : 850.2 : 3732.0
bgra_to_uv_half_1080_neon   : 484.2 : 490.0
bgra_to_uv_half_1920_c  : 1479.2: 4942.7
bgra_to_uv_half_1920_neon   : 824.2 : 824.7
bgra_to_y_8_c   : 8.2   : 29.5
bgra_to_y_8_neon: 18.2  : 32.7
bgra_to_y_128_c : 101.4 : 361.5
bgra_to_y_128_neon  : 74.9  : 73.7
bgra_to_y_1080_c: 739.4 : 3018.0
bgra_to_y_1080_neon : 613.4 : 544.2
bgra_to_y_1920_c: 1298.7: 5326.0
bgra_to_y_1920_neon : 918.7 : 934.2
---
 libswscale/aarch64/input.S   | 91 ++--
 libswscale/aarch64/swscale.c | 16 +++
 2 files changed, 94 insertions(+), 13 deletions(-)

diff --git a/libswscale/aarch64/input.S b/libswscale/aarch64/input.S
index 2cfec4cb6a..6d2c6034bb 100644
--- a/libswscale/aarch64/input.S
+++ b/libswscale/aarch64/input.S
@@ -20,8 +20,12 @@
 
 #include "libavutil/aarch64/asm.S"
 
-.macro rgb_to_yuv_load_rgb src
+.macro rgb_to_yuv_load_rgb src, element=3
+.if \element == 3
 ld3 { v16.16b, v17.16b, v18.16b }, [\src]
+.else
+ld4 { v16.16b, v17.16b, v18.16b, v19.16b }, [\src]
+.endif
 uxtl v19.8h, v16.8b // v19: r
 uxtl v20.8h, v17.8b // v20: g
 uxtl v21.8h, v18.8b // v21: b
@@ -51,7 +55,8 @@ function ff_bgr24ToY_neon, export=1
 ret
 endfunc
 
-function ff_rgb24ToY_neon, export=1
+.macro rgbToY_neon fmt, element
+function ff_\fmt\()ToY_neon, export=1
 cmp w4, #0  // check width > 0
 ldp w10, w11, [x5]  // w10: ry, w11: gy
 ldr w12, [x5, #8]   // w12: by
@@ -67,11 +72,11 @@ function ff_rgb24ToY_neon, export=1
 dup v2.8h, w12
 b.lt 2f
 1:
-rgb_to_yuv_load_rgb x1
+rgb_to_yuv_load_rgb x1, \element
 rgb_to_yuv_product v19, v20, v21, v25, v26, v16, v0, v1, v2, #9
 rgb_to_yuv_product v22, v23, v24, v27, v28, v17, v0, v1, v2, #9
 sub w4, w4, #16 // width -= 16
-add x1, x1, #48 // src += 48
+add x1, x1, #(16*\element)
 cmp w4, #16 // width >= 16 ?
 stp q16, q17, [x0], #32 // store to dst
 b.ge 1b
@@ -86,12 +91,25 @@ function ff_rgb24ToY_neon, export=1
 smaddl  x13, w15, w12, x13  // x13 += by * b
 asr w13, w13, #9// x13 >>= 9
 sub w4, w4, #1  // width--
-add x1, x1, #3  // src += 3
+add x1, x1, #\element
 strh w13, [x0], #2   // store to dst
 cbnz w4, 2b
 3:
 ret
 endfunc
+.endm
+
+rgbToY_neon fmt=rgb24, element=3
+
+function ff_bgra32ToY_neon, export=1
+cmp w4, #0  // check width > 0
+ldp w12, w11, [x5]  // w12: ry, w11: gy
+ldr w10, [x5, #8]   // w10: by
+b.gt 4f
+ret
+endfunc
+
+rgbToY_neon fmt=rgba32, element=4
 
 .macro rgb_set_uv_coeff half
 .if \half
@@ -120,7 +138,8 @@ function ff_bgr24ToUV_half_neon, export=1
 b   4f
 endfunc
 
-function ff_rgb24ToUV_half_neon, export=1
+.macro rgbToUV_half_neon fmt, element
+function ff_\fmt\()ToUV_half_neon, export=1
 cmp w5, #0  // check width > 0
 b.le 3f
 
@@ -132,7 +151,11 @@ function ff_rgb24ToUV_half_neon, export=1
 rgb_set_uv_coeff half=1
 b.lt 2f
 1:
+.if \element == 3
 ld3 { v16.16b, v17.16b, v18.16b }, [x3]
+.else
+ld4 { v16.16b, v17.16b, v18.16b, v19.16b }, [x3]
+.endif
 uaddlp  v19.8h, v16.16b // v19: r
 uaddlp  v20.8h, v17.16b // v20: g
 u

[FFmpeg-devel] [PATCH v3 1/3] swscale/aarch64: Add bgr24 to yuv

2024-06-24 Thread Zhao Zhili

From: Zhao Zhili 

Test on Apple M1 with kperf
: -O3   : -O3 -fno-vectorize
bgr24_to_uv_8_c : 28.5  : 52.5
bgr24_to_uv_8_neon  : 54.5  : 59.7
bgr24_to_uv_128_c   : 294.0 : 830.7
bgr24_to_uv_128_neon: 99.7  : 112.0
bgr24_to_uv_1080_c  : 965.0 : 6624.0
bgr24_to_uv_1080_neon   : 751.5 : 754.7
bgr24_to_uv_1920_c  : 1693.2: 11554.5
bgr24_to_uv_1920_neon   : 1292.5: 1307.5
bgr24_to_uv_half_8_c: 54.2  : 37.0
bgr24_to_uv_half_8_neon : 27.2  : 22.5
bgr24_to_uv_half_128_c  : 127.2 : 392.5
bgr24_to_uv_half_128_neon   : 63.0  : 52.0
bgr24_to_uv_half_1080_c : 880.2 : 3329.0
bgr24_to_uv_half_1080_neon  : 401.5 : 390.7
bgr24_to_uv_half_1920_c : 1585.7: 6390.7
bgr24_to_uv_half_1920_neon  : 694.7 : 698.7
bgr24_to_y_8_c  : 21.7  : 22.5
bgr24_to_y_8_neon   : 797.2 : 25.5
bgr24_to_y_128_c: 88.0  : 280.5
bgr24_to_y_128_neon : 63.7  : 55.0
bgr24_to_y_1080_c   : 616.7 : 2208.7
bgr24_to_y_1080_neon: 900.0 : 452.0
bgr24_to_y_1920_c   : 1093.2: 3894.7
bgr24_to_y_1920_neon: 777.2 : 767.5
---
 libswscale/aarch64/input.S   | 71 ++--
 libswscale/aarch64/swscale.c | 32 +---
 2 files changed, 71 insertions(+), 32 deletions(-)

diff --git a/libswscale/aarch64/input.S b/libswscale/aarch64/input.S
index 33afa34111..2cfec4cb6a 100644
--- a/libswscale/aarch64/input.S
+++ b/libswscale/aarch64/input.S
@@ -20,7 +20,7 @@
 
 #include "libavutil/aarch64/asm.S"
 
-.macro rgb24_to_yuv_load_rgb, src
+.macro rgb_to_yuv_load_rgb src
 ld3 { v16.16b, v17.16b, v18.16b }, [\src]
 uxtl v19.8h, v16.8b // v19: r
 uxtl v20.8h, v17.8b // v20: g
@@ -30,7 +30,7 @@
 uxtl2   v24.8h, v18.16b// v24: b
 .endm
 
-.macro rgb24_to_yuv_product, r, g, b, dst1, dst2, dst, coef0, coef1, coef2, 
right_shift
+.macro rgb_to_yuv_product r, g, b, dst1, dst2, dst, coef0, coef1, coef2, 
right_shift
 mov \dst1\().16b, v6.16b// dst1 = 
const_offset
 mov \dst2\().16b, v6.16b// dst2 = 
const_offset
 smlal   \dst1\().4s, \coef0\().4h, \r\().4h // dst1 += rx 
* r
@@ -43,12 +43,20 @@
 sqshrn2 \dst\().8h, \dst2\().4s, \right_shift   // 
dst_higher_half = dst2 >> right_shift
 .endm
 
+function ff_bgr24ToY_neon, export=1
+cmp w4, #0  // check width > 0
+ldp w12, w11, [x5]  // w12: ry, w11: gy
+ldr w10, [x5, #8]   // w10: by
+b.gt 4f
+ret
+endfunc
+
 function ff_rgb24ToY_neon, export=1
 cmp w4, #0  // check width > 0
 ldp w10, w11, [x5]  // w10: ry, w11: gy
 ldr w12, [x5, #8]   // w12: by
 b.le 3f
-
+4:
 mov w9, #256 // w9 = 1 << (RGB2YUV_SHIFT - 7)
 movk w9, #8, lsl #16 // w9 += 32 << (RGB2YUV_SHIFT - 1)
 dup v6.4s, w9   // w9: const_offset
@@ -59,9 +67,9 @@ function ff_rgb24ToY_neon, export=1
 dup v2.8h, w12
 b.lt 2f
 1:
-rgb24_to_yuv_load_rgb x1
-rgb24_to_yuv_product v19, v20, v21, v25, v26, v16, v0, v1, v2, #9
-rgb24_to_yuv_product v22, v23, v24, v27, v28, v17, v0, v1, v2, #9
+rgb_to_yuv_load_rgb x1
+rgb_to_yuv_product v19, v20, v21, v25, v26, v16, v0, v1, v2, #9
+rgb_to_yuv_product v22, v23, v24, v27, v28, v17, v0, v1, v2, #9
 sub w4, w4, #16 // width -= 16
 add x1, x1, #48 // src += 48
 cmp w4, #16 // width >= 16 ?
@@ -85,10 +93,7 @@ function ff_rgb24ToY_neon, export=1
 ret
 endfunc
 
-.macro rgb24_load_uv_coeff half
-ldp w10, w11, [x6, #12] // w10: ru, w11: gu
-ldp w12, w13, [x6, #20] // w12: bu, w13: rv
-ldp w14, w15, [x6, #28] // w14: gv, w15: bv
+.macro rgb_set_uv_coeff half
 .if \half
 mov w9, #512
 movk w9, #128, lsl #16   // w9: const_offset
@@ -105,12 +110,26 @@ endfunc
 dup v6.4s, w9
 .endm
 
+function ff_bgr24ToUV_half_neon, export=1
+cmp w5, #0  // check width > 0
+b.le 3f
+
+ldp w12, w11, [x6, #12]
+ldp

[FFmpeg-devel] [PATCH v3 3/3] swscale/aarch64: Add argb/abgr to yuv

2024-06-24 Thread Zhao Zhili

From: Zhao Zhili 

Test on Apple M1 with kperf:
: -O3   : -O3 -fno-vectorize
abgr_to_uv_8_c  : 19.4  : 26.1
abgr_to_uv_8_neon   : 29.9  : 51.1
abgr_to_uv_128_c: 146.4 : 558.9
abgr_to_uv_128_neon : 85.1  : 83.4
abgr_to_uv_1080_c   : 1162.6: 4786.4
abgr_to_uv_1080_neon: 819.6 : 826.6
abgr_to_uv_1920_c   : 2063.6: 8492.1
abgr_to_uv_1920_neon: 1435.1: 1447.1
abgr_to_uv_half_8_c : 16.4  : 11.4
abgr_to_uv_half_8_neon  : 35.6  : 20.4
abgr_to_uv_half_128_c   : 108.6 : 359.4
abgr_to_uv_half_128_neon: 75.4  : 42.6
abgr_to_uv_half_1080_c  : 883.4 : 2885.6
abgr_to_uv_half_1080_neon   : 460.6 : 481.1
abgr_to_uv_half_1920_c  : 1553.6: 5106.9
abgr_to_uv_half_1920_neon   : 817.6 : 820.4
abgr_to_y_8_c   : 6.1   : 26.4
abgr_to_y_8_neon: 40.6  : 6.4
abgr_to_y_128_c : 99.9  : 390.1
abgr_to_y_128_neon  : 67.4  : 55.9
abgr_to_y_1080_c: 735.9 : 3170.4
abgr_to_y_1080_neon : 534.6 : 536.6
abgr_to_y_1920_c: 1279.4: 6016.4
abgr_to_y_1920_neon : 932.6 : 927.6
---
 libswscale/aarch64/input.S   | 114 ---
 libswscale/aarch64/swscale.c |  17 ++
 2 files changed, 110 insertions(+), 21 deletions(-)

diff --git a/libswscale/aarch64/input.S b/libswscale/aarch64/input.S
index 6d2c6034bb..f4d587fed0 100644
--- a/libswscale/aarch64/input.S
+++ b/libswscale/aarch64/input.S
@@ -34,6 +34,16 @@
 uxtl2   v24.8h, v18.16b// v24: b
 .endm
 
+.macro argb_to_yuv_load_rgb src
+ld4 { v16.16b, v17.16b, v18.16b, v19.16b }, [\src]
+uxtl v21.8h, v19.8b // v21: b
+uxtl2   v24.8h, v19.16b // v24: b
+uxtl v19.8h, v17.8b // v19: r
+uxtl v20.8h, v18.8b // v20: g
+uxtl2   v22.8h, v17.16b // v22: r
+uxtl2   v23.8h, v18.16b // v23: g
+.endm
+
 .macro rgb_to_yuv_product r, g, b, dst1, dst2, dst, coef0, coef1, coef2, 
right_shift
 mov \dst1\().16b, v6.16b// dst1 = 
const_offset
 mov \dst2\().16b, v6.16b// dst2 = 
const_offset
@@ -55,7 +65,7 @@ function ff_bgr24ToY_neon, export=1
 ret
 endfunc
 
-.macro rgbToY_neon fmt, element
+.macro rgbToY_neon fmt, element, alpha_first=0
 function ff_\fmt\()ToY_neon, export=1
 cmp w4, #0  // check width > 0
 ldp w10, w11, [x5]  // w10: ry, w11: gy
@@ -72,7 +82,11 @@ function ff_\fmt\()ToY_neon, export=1
 dup v2.8h, w12
 b.lt 2f
 1:
+.if \alpha_first
+argb_to_yuv_load_rgb x1
+.else
 rgb_to_yuv_load_rgb x1, \element
+.endif
 rgb_to_yuv_product v19, v20, v21, v25, v26, v16, v0, v1, v2, #9
 rgb_to_yuv_product v22, v23, v24, v27, v28, v17, v0, v1, v2, #9
 sub w4, w4, #16 // width -= 16
@@ -82,9 +96,15 @@ function ff_\fmt\()ToY_neon, export=1
 b.ge 1b
 cbz x4, 3f
 2:
+.if \alpha_first
+ldrb w13, [x1, #1]   // w13: r
+ldrb w14, [x1, #2]   // w14: g
+ldrb w15, [x1, #3]   // w15: b
+.else
 ldrb w13, [x1]   // w13: r
 ldrb w14, [x1, #1]   // w14: g
 ldrb w15, [x1, #2]   // w15: b
+.endif
 
 smaddl  x13, w13, w10, x9   // x13 = ry * r + const_offset
 smaddl  x13, w14, w11, x13  // x13 += gy * g
@@ -101,6 +121,16 @@ endfunc
 
 rgbToY_neon fmt=rgb24, element=3
 
+function ff_abgr32ToY_neon, export=1
+cmp w4, #0  // check width > 0
+ldp w12, w11, [x5]  // w12: ry, w11: gy
+ldr w10, [x5, #8]   // w10: by
+b.gt 4f
+ret
+endfunc
+
+rgbToY_neon fmt=argb32, element=4, alpha_first=1
+
 function ff_bgra32ToY_neon, export=1
 cmp w4, #0  // check width > 0
 ldp w12, w11, [x5]  // w12: ry, w11: gy
@@ -138,7 +168,21 @@ function ff_bgr24ToUV_half_neon, export=1
 b   4f
 endfunc
 
-.macro rgbToUV_half_neon fmt, element
+.macro rgb_load_add_half off_r1, off_r2, off_g1, off_g2, off_b1, off_b2
+ldrb w2, [x3, #\off_r1] // w2: r1
+ldrb w4, [x3, #\off_r2]

Re: [FFmpeg-devel] [PATCH v3 1/3] swscale/aarch64: Add bgr24 to yuv

2024-06-24 Thread Martin Storsjö


On Mon, 24 Jun 2024, Zhao Zhili wrote:


From: Zhao Zhili 

Test on Apple M1 with kperf
: -O3   : -O3 -fno-vectorize
bgr24_to_uv_8_c : 28.5  : 52.5
bgr24_to_uv_8_neon  : 54.5  : 59.7
bgr24_to_uv_128_c   : 294.0 : 830.7
bgr24_to_uv_128_neon: 99.7  : 112.0
bgr24_to_uv_1080_c  : 965.0 : 6624.0
bgr24_to_uv_1080_neon   : 751.5 : 754.7
bgr24_to_uv_1920_c  : 1693.2: 11554.5
bgr24_to_uv_1920_neon   : 1292.5: 1307.5
bgr24_to_uv_half_8_c: 54.2  : 37.0
bgr24_to_uv_half_8_neon : 27.2  : 22.5
bgr24_to_uv_half_128_c  : 127.2 : 392.5
bgr24_to_uv_half_128_neon   : 63.0  : 52.0
bgr24_to_uv_half_1080_c : 880.2 : 3329.0
bgr24_to_uv_half_1080_neon  : 401.5 : 390.7
bgr24_to_uv_half_1920_c : 1585.7: 6390.7
bgr24_to_uv_half_1920_neon  : 694.7 : 698.7
bgr24_to_y_8_c  : 21.7  : 22.5
bgr24_to_y_8_neon   : 797.2 : 25.5
bgr24_to_y_128_c: 88.0  : 280.5
bgr24_to_y_128_neon : 63.7  : 55.0
bgr24_to_y_1080_c   : 616.7 : 2208.7
bgr24_to_y_1080_neon: 900.0 : 452.0
bgr24_to_y_1920_c   : 1093.2: 3894.7
bgr24_to_y_1920_neon: 777.2 : 767.5
---


This patch looks ok now

// Martin


Re: [FFmpeg-devel] [PATCH v3 2/3] swscale/aarch64: Add bgra/rgba to yuv

2024-06-24 Thread Martin Storsjö


On Mon, 24 Jun 2024, Zhao Zhili wrote:


From: Zhao Zhili 

Test on Apple M1 with kperf
: -O3   : -O3 -fno-vectorize
bgra_to_uv_8_c  : 13.4  : 27.5
bgra_to_uv_8_neon   : 37.4  : 41.7
bgra_to_uv_128_c: 155.9 : 550.2
bgra_to_uv_128_neon : 91.7  : 92.7
bgra_to_uv_1080_c   : 1173.2: 4558.2
bgra_to_uv_1080_neon: 822.7 : 809.5
bgra_to_uv_1920_c   : 2078.2: 8115.2
bgra_to_uv_1920_neon: 1437.7: 1438.7
bgra_to_uv_half_8_c : 17.9  : 14.2
bgra_to_uv_half_8_neon  : 37.4  : 10.5
bgra_to_uv_half_128_c   : 103.9 : 326.0
bgra_to_uv_half_128_neon: 73.9  : 68.7
bgra_to_uv_half_1080_c  : 850.2 : 3732.0
bgra_to_uv_half_1080_neon   : 484.2 : 490.0
bgra_to_uv_half_1920_c  : 1479.2: 4942.7
bgra_to_uv_half_1920_neon   : 824.2 : 824.7
bgra_to_y_8_c   : 8.2   : 29.5
bgra_to_y_8_neon: 18.2  : 32.7
bgra_to_y_128_c : 101.4 : 361.5
bgra_to_y_128_neon  : 74.9  : 73.7
bgra_to_y_1080_c: 739.4 : 3018.0
bgra_to_y_1080_neon : 613.4 : 544.2
bgra_to_y_1920_c: 1298.7: 5326.0
bgra_to_y_1920_neon : 918.7 : 934.2
---
libswscale/aarch64/input.S   | 91 ++--
libswscale/aarch64/swscale.c | 16 +++
2 files changed, 94 insertions(+), 13 deletions(-)

diff --git a/libswscale/aarch64/input.S b/libswscale/aarch64/input.S
index 2cfec4cb6a..6d2c6034bb 100644
--- a/libswscale/aarch64/input.S
+++ b/libswscale/aarch64/input.S
@@ -20,8 +20,12 @@

#include "libavutil/aarch64/asm.S"

-.macro rgb_to_yuv_load_rgb src
+.macro rgb_to_yuv_load_rgb src, element=3
+.if \element == 3
ld3 { v16.16b, v17.16b, v18.16b }, [\src]
+.else
+ld4 { v16.16b, v17.16b, v18.16b, v19.16b }, [\src]
+.endif
uxtl v19.8h, v16.8b // v19: r
uxtl v20.8h, v17.8b // v20: g
uxtl v21.8h, v18.8b // v21: b
@@ -51,7 +55,8 @@ function ff_bgr24ToY_neon, export=1
ret
endfunc

-function ff_rgb24ToY_neon, export=1
+.macro rgbToY_neon fmt, element
+function ff_\fmt\()ToY_neon, export=1
cmp w4, #0  // check width > 0
ldp w10, w11, [x5]  // w10: ry, w11: gy
ldr w12, [x5, #8]   // w12: by
@@ -67,11 +72,11 @@ function ff_rgb24ToY_neon, export=1
dup v2.8h, w12
b.lt 2f
1:
-rgb_to_yuv_load_rgb x1
+rgb_to_yuv_load_rgb x1, \element
rgb_to_yuv_product v19, v20, v21, v25, v26, v16, v0, v1, v2, #9
rgb_to_yuv_product v22, v23, v24, v27, v28, v17, v0, v1, v2, #9
sub w4, w4, #16 // width -= 16
-add x1, x1, #48 // src += 48
+add x1, x1, #(16*\element)
cmp w4, #16 // width >= 16 ?
stp q16, q17, [x0], #32 // store to dst
b.ge 1b
@@ -86,12 +91,25 @@ function ff_rgb24ToY_neon, export=1
smaddl  x13, w15, w12, x13  // x13 += by * b
asr w13, w13, #9// x13 >>= 9
sub w4, w4, #1  // width--
-add x1, x1, #3  // src += 3
+add x1, x1, #\element
strh w13, [x0], #2   // store to dst
cbnz w4, 2b
3:
ret
endfunc
+.endm
+
+rgbToY_neon fmt=rgb24, element=3
+
+function ff_bgra32ToY_neon, export=1
+cmp w4, #0  // check width > 0
+ldp w12, w11, [x5]  // w12: ry, w11: gy
+ldr w10, [x5, #8]   // w10: by
+b.gt 4f
+ret
+endfunc
+
+rgbToY_neon fmt=rgba32, element=4


It is extremely obscure to jump to a local label (4f) that is defined by 
the following macro. I think this would be much more readable if you'd 
include the bgr(a) version in the macro, so the reference to 4f is near to 
the actual label it refers to.
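
In C terms, the idea behind the jump to 4f appears to be that the BGR entry point only swaps the red and blue coefficients and then reuses the RGB body. A scalar sketch of that reading, with hypothetical names (pixel_to_y, row_to_y) and RGB2YUV_SHIFT assumed to be 15, as implied by the constants in the quoted assembly:

```c
#include <stdint.h>

#define RGB2YUV_SHIFT 15  /* assumed; consistent with #256 and #8, lsl #16 */
#define CONST_OFFSET ((32 << (RGB2YUV_SHIFT - 1)) + (1 << (RGB2YUV_SHIFT - 7)))

/* One pixel: weighted sum of the first three bytes plus the rounding offset. */
static uint16_t pixel_to_y(const uint8_t *px, int32_t c0, int32_t c1, int32_t c2)
{
    int32_t acc = c0 * px[0] + c1 * px[1] + c2 * px[2] + CONST_OFFSET;
    return (uint16_t)(acc >> 9);  /* same shift as the "#9" in the assembly */
}

/*
 * element is the byte stride per pixel (3 for rgb24, 4 for rgba32).
 * swap_rb models the BGR entry points: only the coefficient order changes,
 * the loop body is shared.
 */
static void row_to_y(uint16_t *dst, const uint8_t *src, int width, int element,
                     int32_t ry, int32_t gy, int32_t by, int swap_rb)
{
    int32_t c0 = swap_rb ? by : ry;
    int32_t c2 = swap_rb ? ry : by;

    for (int i = 0; i < width; i++)
        dst[i] = pixel_to_y(src + i * element, c0, gy, c2);
}
```

Keeping the swap and the shared body in one function is the C analogue of placing the bgr(a) entry point inside the macro next to the label it jumps to.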



.macro rgb_set_uv_coeff half
.if \half
@@ -120,7 +138,8 @@ function ff_bgr24ToUV_half_neon, export=1
b   4f
endfunc

-function ff_rgb24ToUV_half_neon, export=1
+.macro rgbToUV_half_neon fmt, element
+function ff_\fmt\()ToUV_half_neon, export=1
cmp w5, #0  // check width > 0
b.le    3f

@@ -132,7 +151,11 @@ function ff_rgb24ToUV_half_neon, export=1
rgb_set_uv_coeff half=1
b.lt    2f
1:
+.if \element == 3
ld3 { v1

Re: [FFmpeg-devel] [PATCH 01/10 v4] avutil/stereo3d: add a Monoscopic view enum value

2024-06-24 Thread James Almer


On 6/22/2024 8:15 PM, James Almer wrote:

We need a way to signal the frame has a single view that doesn't map to any
particular eye, and it should be the default one.

Signed-off-by: James Almer 
---
  libavutil/stereo3d.c | 1 +
  libavutil/stereo3d.h | 5 +
  2 files changed, 6 insertions(+)

diff --git a/libavutil/stereo3d.c b/libavutil/stereo3d.c
index 19e81e4124..37cf093099 100644
--- a/libavutil/stereo3d.c
+++ b/libavutil/stereo3d.c
@@ -71,6 +71,7 @@ static const char * const stereo3d_view_names[] = {
  [AV_STEREO3D_VIEW_PACKED] = "packed",
  [AV_STEREO3D_VIEW_LEFT]   = "left",
  [AV_STEREO3D_VIEW_RIGHT]  = "right",
+[AV_STEREO3D_VIEW_MONO]   = "monoscopic",
  };
  
  static const char * const stereo3d_primary_eye_names[] = {

diff --git a/libavutil/stereo3d.h b/libavutil/stereo3d.h
index 00a5c3900e..9a004d88a1 100644
--- a/libavutil/stereo3d.h
+++ b/libavutil/stereo3d.h
@@ -156,6 +156,11 @@ enum AVStereo3DView {
   * Frame contains only the right view.
   */
  AV_STEREO3D_VIEW_RIGHT,
+
+/**
+ * Frame is monoscopic.
+ */
+AV_STEREO3D_VIEW_MONO,
  };
  
  /**


Looking more into this, I don't know if this is a good idea, or even 
backwards compatible. AVStereo3DView is right now only ever looked at if 
type is not 2D, so adding a view that only applies to 2D seems 
pointless. And if we make it the default, users who (wrongly) assume 
that the packed view is the default will find themselves with a 3D 
type signaling a monoscopic view.


For now I'll apply patch 2 adding unspec type plus the patches that 
don't deal with the view added here, unless there are objections.
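
To make the concern above concrete, here is a minimal C sketch. It is not libavutil code: the `sketch_*` enums and `sketch_effective_view()` are hypothetical stand-ins that only mirror the shape of a type field plus a view field, and it assumes (as stated above) that consumers look at the view only when the type is not 2D.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for illustration only, NOT the libavutil enums. */
enum sketch_type { SKETCH_TYPE_2D, SKETCH_TYPE_SIDEBYSIDE };
enum sketch_view {
    SKETCH_VIEW_PACKED,
    SKETCH_VIEW_LEFT,
    SKETCH_VIEW_RIGHT,
    SKETCH_VIEW_MONO,
};

static const char *const sketch_view_names[] = {
    [SKETCH_VIEW_PACKED] = "packed",
    [SKETCH_VIEW_LEFT]   = "left",
    [SKETCH_VIEW_RIGHT]  = "right",
    [SKETCH_VIEW_MONO]   = "monoscopic",
};

/* Returns the view a consumer would act on. For 2D content the view
 * field is never consulted, so a 2D-only view value carries no
 * information there. */
static const char *sketch_effective_view(enum sketch_type type,
                                         enum sketch_view view)
{
    assert((unsigned)view <= SKETCH_VIEW_MONO);
    if (type == SKETCH_TYPE_2D)
        return "n/a";
    return sketch_view_names[view];
}
```

Under these assumptions, `sketch_effective_view(SKETCH_TYPE_2D, SKETCH_VIEW_MONO)` ignores the view entirely, while `sketch_effective_view(SKETCH_TYPE_SIDEBYSIDE, SKETCH_VIEW_MONO)` reports "monoscopic" for a 3D type, which is the contradictory state described above.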


[FFmpeg-devel] [PATCH v4 1/3] swscale/aarch64: Add bgr24 to yuv

2024-06-24 Thread Zhao Zhili

From: Zhao Zhili 

Test on Apple M1 with kperf
: -O3   : -O3 -fno-vectorize
bgr24_to_uv_8_c : 28.5  : 52.5
bgr24_to_uv_8_neon  : 54.5  : 59.7
bgr24_to_uv_128_c   : 294.0 : 830.7
bgr24_to_uv_128_neon: 99.7  : 112.0
bgr24_to_uv_1080_c  : 965.0 : 6624.0
bgr24_to_uv_1080_neon   : 751.5 : 754.7
bgr24_to_uv_1920_c  : 1693.2: 11554.5
bgr24_to_uv_1920_neon   : 1292.5: 1307.5
bgr24_to_uv_half_8_c: 54.2  : 37.0
bgr24_to_uv_half_8_neon : 27.2  : 22.5
bgr24_to_uv_half_128_c  : 127.2 : 392.5
bgr24_to_uv_half_128_neon   : 63.0  : 52.0
bgr24_to_uv_half_1080_c : 880.2 : 3329.0
bgr24_to_uv_half_1080_neon  : 401.5 : 390.7
bgr24_to_uv_half_1920_c : 1585.7: 6390.7
bgr24_to_uv_half_1920_neon  : 694.7 : 698.7
bgr24_to_y_8_c  : 21.7  : 22.5
bgr24_to_y_8_neon   : 797.2 : 25.5
bgr24_to_y_128_c: 88.0  : 280.5
bgr24_to_y_128_neon : 63.7  : 55.0
bgr24_to_y_1080_c   : 616.7 : 2208.7
bgr24_to_y_1080_neon: 900.0 : 452.0
bgr24_to_y_1920_c   : 1093.2: 3894.7
bgr24_to_y_1920_neon: 777.2 : 767.5
---
 libswscale/aarch64/input.S   | 71 ++--
 libswscale/aarch64/swscale.c | 32 +---
 2 files changed, 71 insertions(+), 32 deletions(-)

diff --git a/libswscale/aarch64/input.S b/libswscale/aarch64/input.S
index 33afa34111..2cfec4cb6a 100644
--- a/libswscale/aarch64/input.S
+++ b/libswscale/aarch64/input.S
@@ -20,7 +20,7 @@
 
 #include "libavutil/aarch64/asm.S"
 
-.macro rgb24_to_yuv_load_rgb, src
+.macro rgb_to_yuv_load_rgb src
 ld3 { v16.16b, v17.16b, v18.16b }, [\src]
 uxtl    v19.8h, v16.8b // v19: r
 uxtl    v20.8h, v17.8b // v20: g
@@ -30,7 +30,7 @@
 uxtl2   v24.8h, v18.16b// v24: b
 .endm
 
-.macro rgb24_to_yuv_product, r, g, b, dst1, dst2, dst, coef0, coef1, coef2, right_shift
+.macro rgb_to_yuv_product r, g, b, dst1, dst2, dst, coef0, coef1, coef2, right_shift
 mov \dst1\().16b, v6.16b // dst1 = const_offset
 mov \dst2\().16b, v6.16b // dst2 = const_offset
 smlal   \dst1\().4s, \coef0\().4h, \r\().4h // dst1 += rx * r
@@ -43,12 +43,20 @@
 sqshrn2 \dst\().8h, \dst2\().4s, \right_shift   // dst_higher_half = dst2 >> right_shift
 
+function ff_bgr24ToY_neon, export=1
+cmp w4, #0  // check width > 0
+ldp w12, w11, [x5]  // w12: ry, w11: gy
+ldr w10, [x5, #8]   // w10: by
+b.gt    4f
+ret
+endfunc
+
 function ff_rgb24ToY_neon, export=1
 cmp w4, #0  // check width > 0
 ldp w10, w11, [x5]  // w10: ry, w11: gy
 ldr w12, [x5, #8]   // w12: by
 b.le    3f
-
+4:
 mov w9, #256 // w9 = 1 << (RGB2YUV_SHIFT - 7)
 movk    w9, #8, lsl #16 // w9 += 32 << (RGB2YUV_SHIFT - 1)
 dup v6.4s, w9   // w9: const_offset
@@ -59,9 +67,9 @@ function ff_rgb24ToY_neon, export=1
 dup v2.8h, w12
 b.lt    2f
 1:
-rgb24_to_yuv_load_rgb x1
-rgb24_to_yuv_product v19, v20, v21, v25, v26, v16, v0, v1, v2, #9
-rgb24_to_yuv_product v22, v23, v24, v27, v28, v17, v0, v1, v2, #9
+rgb_to_yuv_load_rgb x1
+rgb_to_yuv_product v19, v20, v21, v25, v26, v16, v0, v1, v2, #9
+rgb_to_yuv_product v22, v23, v24, v27, v28, v17, v0, v1, v2, #9
 sub w4, w4, #16 // width -= 16
 add x1, x1, #48 // src += 48
 cmp w4, #16 // width >= 16 ?
@@ -85,10 +93,7 @@ function ff_rgb24ToY_neon, export=1
 ret
 endfunc
 
-.macro rgb24_load_uv_coeff half
-ldp w10, w11, [x6, #12] // w10: ru, w11: gu
-ldp w12, w13, [x6, #20] // w12: bu, w13: rv
-ldp w14, w15, [x6, #28] // w14: gv, w15: bv
+.macro rgb_set_uv_coeff half
 .if \half
 mov w9, #512
 movk    w9, #128, lsl #16   // w9: const_offset
@@ -105,12 +110,26 @@ endfunc
 dup v6.4s, w9
 .endm
 
+function ff_bgr24ToUV_half_neon, export=1
+cmp w5, #0  // check width > 0
+b.le    3f
+
+ldp w12, w11, [x6, #12]
+ldp

[FFmpeg-devel] [PATCH v4 2/3] swscale/aarch64: Add bgra/rgba to yuv

2024-06-24 Thread Zhao Zhili

From: Zhao Zhili 

Test on Apple M1 with kperf
: -O3   : -O3 -fno-vectorize
bgra_to_uv_8_c  : 13.4  : 27.5
bgra_to_uv_8_neon   : 37.4  : 41.7
bgra_to_uv_128_c: 155.9 : 550.2
bgra_to_uv_128_neon : 91.7  : 92.7
bgra_to_uv_1080_c   : 1173.2: 4558.2
bgra_to_uv_1080_neon: 822.7 : 809.5
bgra_to_uv_1920_c   : 2078.2: 8115.2
bgra_to_uv_1920_neon: 1437.7: 1438.7
bgra_to_uv_half_8_c : 17.9  : 14.2
bgra_to_uv_half_8_neon  : 37.4  : 10.5
bgra_to_uv_half_128_c   : 103.9 : 326.0
bgra_to_uv_half_128_neon: 73.9  : 68.7
bgra_to_uv_half_1080_c  : 850.2 : 3732.0
bgra_to_uv_half_1080_neon   : 484.2 : 490.0
bgra_to_uv_half_1920_c  : 1479.2: 4942.7
bgra_to_uv_half_1920_neon   : 824.2 : 824.7
bgra_to_y_8_c   : 8.2   : 29.5
bgra_to_y_8_neon: 18.2  : 32.7
bgra_to_y_128_c : 101.4 : 361.5
bgra_to_y_128_neon  : 74.9  : 73.7
bgra_to_y_1080_c: 739.4 : 3018.0
bgra_to_y_1080_neon : 613.4 : 544.2
bgra_to_y_1920_c: 1298.7: 5326.0
bgra_to_y_1920_neon : 918.7 : 934.2
---
 libswscale/aarch64/input.S   | 68 +++-
 libswscale/aarch64/swscale.c | 16 +
 2 files changed, 68 insertions(+), 16 deletions(-)

diff --git a/libswscale/aarch64/input.S b/libswscale/aarch64/input.S
index 2cfec4cb6a..ce5b042371 100644
--- a/libswscale/aarch64/input.S
+++ b/libswscale/aarch64/input.S
@@ -20,8 +20,12 @@
 
 #include "libavutil/aarch64/asm.S"
 
-.macro rgb_to_yuv_load_rgb src
+.macro rgb_to_yuv_load_rgb src, element=3
+.if \element == 3
 ld3 { v16.16b, v17.16b, v18.16b }, [\src]
+.else
+ld4 { v16.16b, v17.16b, v18.16b, v19.16b }, [\src]
+.endif
 uxtl    v19.8h, v16.8b // v19: r
 uxtl    v20.8h, v17.8b // v20: g
 uxtl    v21.8h, v18.8b // v21: b
@@ -43,7 +47,8 @@
 sqshrn2 \dst\().8h, \dst2\().4s, \right_shift   // dst_higher_half = dst2 >> right_shift
 .endm
 
-function ff_bgr24ToY_neon, export=1
+.macro rgbToY_neon fmt_bgr, fmt_rgb, element
+function ff_\fmt_bgr\()ToY_neon, export=1
 cmp w4, #0  // check width > 0
 ldp w12, w11, [x5]  // w12: ry, w11: gy
 ldr w10, [x5, #8]   // w10: by
@@ -51,7 +56,7 @@ function ff_bgr24ToY_neon, export=1
 ret
 endfunc
 
-function ff_rgb24ToY_neon, export=1
+function ff_\fmt_rgb\()ToY_neon, export=1
 cmp w4, #0  // check width > 0
 ldp w10, w11, [x5]  // w10: ry, w11: gy
 ldr w12, [x5, #8]   // w12: by
@@ -67,11 +72,11 @@ function ff_rgb24ToY_neon, export=1
 dup v2.8h, w12
 b.lt    2f
 1:
-rgb_to_yuv_load_rgb x1
+rgb_to_yuv_load_rgb x1, \element
 rgb_to_yuv_product v19, v20, v21, v25, v26, v16, v0, v1, v2, #9
 rgb_to_yuv_product v22, v23, v24, v27, v28, v17, v0, v1, v2, #9
 sub w4, w4, #16 // width -= 16
-add x1, x1, #48 // src += 48
+add x1, x1, #(16*\element)
 cmp w4, #16 // width >= 16 ?
 stp q16, q17, [x0], #32 // store to dst
 b.ge    1b
@@ -86,12 +91,17 @@ function ff_rgb24ToY_neon, export=1
 smaddl  x13, w15, w12, x13  // x13 += by * b
 asr w13, w13, #9// x13 >>= 9
 sub w4, w4, #1  // width--
-add x1, x1, #3  // src += 3
+add x1, x1, #\element
 strh    w13, [x0], #2   // store to dst
 cbnz    w4, 2b
 3:
 ret
 endfunc
+.endm
+
+rgbToY_neon bgr24, rgb24, element=3
+
+rgbToY_neon bgra32, rgba32, element=4
 
 .macro rgb_set_uv_coeff half
 .if \half
@@ -110,7 +120,8 @@ endfunc
 dup v6.4s, w9
 .endm
 
-function ff_bgr24ToUV_half_neon, export=1
+.macro rgbToUV_half_neon fmt_bgr, fmt_rgb, element
+function ff_\fmt_bgr\()ToUV_half_neon, export=1
 cmp w5, #0  // check width > 0
 b.le    3f
 
@@ -120,7 +131,7 @@ function ff_bgr24ToUV_half_neon, export=1
 b   4f
 endfunc
 
-function ff_rgb24ToUV_half_neon, export=1
+function ff_\fmt_rgb\()ToUV_half_neon, export=1
 cmp w5, #0  // check width > 0
 b.le    3f
 
@@ -132,7 +1

[FFmpeg-devel] [PATCH v4 3/3] swscale/aarch64: Add argb/abgr to yuv

2024-06-24 Thread Zhao Zhili

From: Zhao Zhili 

Test on Apple M1 with kperf:
: -O3   : -O3 -fno-vectorize
abgr_to_uv_8_c  : 19.4  : 26.1
abgr_to_uv_8_neon   : 29.9  : 51.1
abgr_to_uv_128_c: 146.4 : 558.9
abgr_to_uv_128_neon : 85.1  : 83.4
abgr_to_uv_1080_c   : 1162.6: 4786.4
abgr_to_uv_1080_neon: 819.6 : 826.6
abgr_to_uv_1920_c   : 2063.6: 8492.1
abgr_to_uv_1920_neon: 1435.1: 1447.1
abgr_to_uv_half_8_c : 16.4  : 11.4
abgr_to_uv_half_8_neon  : 35.6  : 20.4
abgr_to_uv_half_128_c   : 108.6 : 359.4
abgr_to_uv_half_128_neon: 75.4  : 42.6
abgr_to_uv_half_1080_c  : 883.4 : 2885.6
abgr_to_uv_half_1080_neon   : 460.6 : 481.1
abgr_to_uv_half_1920_c  : 1553.6: 5106.9
abgr_to_uv_half_1920_neon   : 817.6 : 820.4
abgr_to_y_8_c   : 6.1   : 26.4
abgr_to_y_8_neon: 40.6  : 6.4
abgr_to_y_128_c : 99.9  : 390.1
abgr_to_y_128_neon  : 67.4  : 55.9
abgr_to_y_1080_c: 735.9 : 3170.4
abgr_to_y_1080_neon : 534.6 : 536.6
abgr_to_y_1920_c: 1279.4: 6016.4
abgr_to_y_1920_neon : 932.6 : 927.6
---
 libswscale/aarch64/input.S   | 86 +++-
 libswscale/aarch64/swscale.c | 17 +++
 2 files changed, 82 insertions(+), 21 deletions(-)

diff --git a/libswscale/aarch64/input.S b/libswscale/aarch64/input.S
index ce5b042371..5cb18711fb 100644
--- a/libswscale/aarch64/input.S
+++ b/libswscale/aarch64/input.S
@@ -34,6 +34,16 @@
 uxtl2   v24.8h, v18.16b// v24: b
 .endm
 
+.macro argb_to_yuv_load_rgb src
+ld4 { v16.16b, v17.16b, v18.16b, v19.16b }, [\src]
+uxtl    v21.8h, v19.8b // v21: b
+uxtl2   v24.8h, v19.16b// v24: b
+uxtl    v19.8h, v17.8b // v19: r
+uxtl    v20.8h, v18.8b // v20: g
+uxtl2   v22.8h, v17.16b// v22: r
+uxtl2   v23.8h, v18.16b// v23: g
+.endm
+
 .macro rgb_to_yuv_product r, g, b, dst1, dst2, dst, coef0, coef1, coef2, right_shift
 mov \dst1\().16b, v6.16b // dst1 = const_offset
 mov \dst2\().16b, v6.16b // dst2 = const_offset
@@ -47,7 +57,7 @@
 sqshrn2 \dst\().8h, \dst2\().4s, \right_shift   // dst_higher_half = dst2 >> right_shift
 .endm
 
-.macro rgbToY_neon fmt_bgr, fmt_rgb, element
+.macro rgbToY_neon fmt_bgr, fmt_rgb, element, alpha_first=0
 function ff_\fmt_bgr\()ToY_neon, export=1
 cmp w4, #0  // check width > 0
 ldp w12, w11, [x5]  // w12: ry, w11: gy
@@ -72,7 +82,11 @@ function ff_\fmt_rgb\()ToY_neon, export=1
 dup v2.8h, w12
 b.lt    2f
 1:
+.if \alpha_first
+argb_to_yuv_load_rgb x1
+.else
 rgb_to_yuv_load_rgb x1, \element
+.endif
 rgb_to_yuv_product v19, v20, v21, v25, v26, v16, v0, v1, v2, #9
 rgb_to_yuv_product v22, v23, v24, v27, v28, v17, v0, v1, v2, #9
 sub w4, w4, #16 // width -= 16
@@ -82,9 +96,15 @@ function ff_\fmt_rgb\()ToY_neon, export=1
 b.ge    1b
 cbz x4, 3f
 2:
+.if \alpha_first
+ldrb    w13, [x1, #1]   // w13: r
+ldrb    w14, [x1, #2]   // w14: g
+ldrb    w15, [x1, #3]   // w15: b
+.else
 ldrb    w13, [x1]   // w13: r
 ldrb    w14, [x1, #1]   // w14: g
 ldrb    w15, [x1, #2]   // w15: b
+.endif
 
 smaddl  x13, w13, w10, x9   // x13 = ry * r + const_offset
 smaddl  x13, w14, w11, x13  // x13 += gy * g
@@ -103,6 +123,8 @@ rgbToY_neon bgr24, rgb24, element=3
 
 rgbToY_neon bgra32, rgba32, element=4
 
+rgbToY_neon abgr32, argb32, element=4, alpha_first=1
+
 .macro rgb_set_uv_coeff half
 .if \half
 mov w9, #512
@@ -120,7 +142,21 @@ rgbToY_neon bgra32, rgba32, element=4
 dup v6.4s, w9
 .endm
 
-.macro rgbToUV_half_neon fmt_bgr, fmt_rgb, element
+.macro rgb_load_add_half off_r1, off_r2, off_g1, off_g2, off_b1, off_b2
+ldrb    w2, [x3, #\off_r1] // w2: r1
+ldrb    w4, [x3, #\off_r2] // w4: r2
+add w2, w2, w4 // w2 = r1 + r2
+
+ldrb    w4, [x3, #\off_g1] // w4: g1
+ldrb    w7, [x3, #\off_g2] // w7: g2
+add w4, w4, w7

Re: [FFmpeg-devel] [PATCH v3 2/3] swscale/aarch64: Add bgra/rgba to yuv

2024-06-24 Thread Zhao Zhili



> On Jun 24, 2024, at 19:55, Martin Storsjö  wrote:
> 
> On Mon, 24 Jun 2024, Zhao Zhili wrote:
> 
>> From: Zhao Zhili 
>> 
>> Test on Apple M1 with kperf
>>  : -O3   : -O3 -fno-vectorize
>> bgra_to_uv_8_c   : 13.4  : 27.5
>> bgra_to_uv_8_neon: 37.4  : 41.7
>> bgra_to_uv_128_c : 155.9 : 550.2
>> bgra_to_uv_128_neon  : 91.7  : 92.7
>> bgra_to_uv_1080_c: 1173.2: 4558.2
>> bgra_to_uv_1080_neon : 822.7 : 809.5
>> bgra_to_uv_1920_c: 2078.2: 8115.2
>> bgra_to_uv_1920_neon : 1437.7: 1438.7
>> bgra_to_uv_half_8_c  : 17.9  : 14.2
>> bgra_to_uv_half_8_neon   : 37.4  : 10.5
>> bgra_to_uv_half_128_c: 103.9 : 326.0
>> bgra_to_uv_half_128_neon : 73.9  : 68.7
>> bgra_to_uv_half_1080_c   : 850.2 : 3732.0
>> bgra_to_uv_half_1080_neon: 484.2 : 490.0
>> bgra_to_uv_half_1920_c   : 1479.2: 4942.7
>> bgra_to_uv_half_1920_neon: 824.2 : 824.7
>> bgra_to_y_8_c: 8.2   : 29.5
>> bgra_to_y_8_neon : 18.2  : 32.7
>> bgra_to_y_128_c  : 101.4 : 361.5
>> bgra_to_y_128_neon   : 74.9  : 73.7
>> bgra_to_y_1080_c : 739.4 : 3018.0
>> bgra_to_y_1080_neon  : 613.4 : 544.2
>> bgra_to_y_1920_c : 1298.7: 5326.0
>> bgra_to_y_1920_neon  : 918.7 : 934.2
>> ---
>> libswscale/aarch64/input.S   | 91 ++--
>> libswscale/aarch64/swscale.c | 16 +++
>> 2 files changed, 94 insertions(+), 13 deletions(-)
>> 
>> diff --git a/libswscale/aarch64/input.S b/libswscale/aarch64/input.S
>> index 2cfec4cb6a..6d2c6034bb 100644
>> --- a/libswscale/aarch64/input.S
>> +++ b/libswscale/aarch64/input.S
>> @@ -20,8 +20,12 @@
>> 
>> #include "libavutil/aarch64/asm.S"
>> 
>> -.macro rgb_to_yuv_load_rgb src
>> +.macro rgb_to_yuv_load_rgb src, element=3
>> +.if \element == 3
>>ld3 { v16.16b, v17.16b, v18.16b }, [\src]
>> +.else
>> +ld4 { v16.16b, v17.16b, v18.16b, v19.16b }, [\src]
>> +.endif
>>uxtl    v19.8h, v16.8b // v19: r
>>uxtl    v20.8h, v17.8b // v20: g
>>uxtl    v21.8h, v18.8b // v21: b
>> @@ -51,7 +55,8 @@ function ff_bgr24ToY_neon, export=1
>>ret
>> endfunc
>> 
>> -function ff_rgb24ToY_neon, export=1
>> +.macro rgbToY_neon fmt, element
>> +function ff_\fmt\()ToY_neon, export=1
>>cmp w4, #0  // check width > 0
>>ldp w10, w11, [x5]  // w10: ry, w11: gy
>>ldr w12, [x5, #8]   // w12: by
>> @@ -67,11 +72,11 @@ function ff_rgb24ToY_neon, export=1
>>dup v2.8h, w12
>>b.lt    2f
>> 1:
>> -rgb_to_yuv_load_rgb x1
>> +rgb_to_yuv_load_rgb x1, \element
>>rgb_to_yuv_product v19, v20, v21, v25, v26, v16, v0, v1, v2, #9
>>rgb_to_yuv_product v22, v23, v24, v27, v28, v17, v0, v1, v2, #9
>>sub w4, w4, #16 // width -= 16
>> -add x1, x1, #48 // src += 48
>> +add x1, x1, #(16*\element)
>>cmp w4, #16 // width >= 16 ?
>>stp q16, q17, [x0], #32 // store to dst
>>b.ge    1b
>> @@ -86,12 +91,25 @@ function ff_rgb24ToY_neon, export=1
>>smaddl  x13, w15, w12, x13  // x13 += by * b
>>asr w13, w13, #9// x13 >>= 9
>>sub w4, w4, #1  // width--
>> -add x1, x1, #3  // src += 3
>> +add x1, x1, #\element
>>strh    w13, [x0], #2   // store to dst
>>cbnz    w4, 2b
>> 3:
>>ret
>> endfunc
>> +.endm
>> +
>> +rgbToY_neon fmt=rgb24, element=3
>> +
>> +function ff_bgra32ToY_neon, export=1
>> +cmp w4, #0  // check width > 0
>> +ldp w12, w11, [x5]  // w12: ry, w11: gy
>> +ldr w10, [x5, #8]   // w10: by
>> +b.gt    4f
>> +ret
>> +endfunc
>> +
>> +rgbToY_neon fmt=rgba32, element=4
> 
> It is extremely obscure to jump to a local label (4f) that is defined by the 
> following macro. I think this would be much more readable if you'd include 
> the bgr(a) version in the macro, so the reference to 4f is near to the actual 
> label it refers to.

Good idea, it saved a lot of typing. Fixed in v4.

> 
>> .macro rgb_set_uv_coeff half
>>.if \half
>> @@ -120,7 +138,8 @@ function ff_bgr24ToUV_half_neon, export=1
>>b   4f
>> endfunc
>>

Re: [FFmpeg-devel] [PATCH] avformat/tls_schannel: forward AVIO_FLAG_NONBLOCK to tcp stream

2024-06-24 Thread Timo Rothenpieler


On 24/06/2024 00:07, Timo Rothenpieler wrote:

On 03.06.2024 22:28, Timo Rothenpieler wrote:

From: BtbN 

Fixes, for example, RTMPS streaming over Schannel.
---
  libavformat/tls_schannel.c | 15 ++-
  1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/libavformat/tls_schannel.c b/libavformat/tls_schannel.c
index 214a47a218..7265a9794d 100644
--- a/libavformat/tls_schannel.c
+++ b/libavformat/tls_schannel.c
@@ -113,6 +113,7 @@ static int tls_shutdown_client(URLContext *h)
  c->request_flags, 0, 0, NULL, 0, &c->ctxt_handle,
  &outbuf_desc, &c->context_flags, &c->ctxt_timestamp);
  if (sspi_ret == SEC_E_OK || sspi_ret == SEC_I_CONTEXT_EXPIRED) {

+    s->tcp->flags &= ~AVIO_FLAG_NONBLOCK;
  ret = ffurl_write(s->tcp, outbuf.pvBuffer, outbuf.cbBuffer);

  FreeContextBuffer(outbuf.pvBuffer);
  if (ret < 0 || ret != outbuf.cbBuffer)
@@ -316,6 +317,7 @@ static int tls_client_handshake(URLContext *h)
  goto fail;
  }
+    s->tcp->flags &= ~AVIO_FLAG_NONBLOCK;
  ret = ffurl_write(s->tcp, outbuf.pvBuffer, outbuf.cbBuffer);
  FreeContextBuffer(outbuf.pvBuffer);
  if (ret < 0 || ret != outbuf.cbBuffer) {
@@ -416,11 +418,16 @@ static int tls_read(URLContext *h, uint8_t *buf, int len)

  }
  }
+    s->tcp->flags &= ~AVIO_FLAG_NONBLOCK;
+    s->tcp->flags |= h->flags & AVIO_FLAG_NONBLOCK;
+
  ret = ffurl_read(s->tcp, c->enc_buf + c->enc_buf_offset,
   c->enc_buf_size - c->enc_buf_offset);
  if (ret == AVERROR_EOF) {
  c->connection_closed = 1;
  ret = 0;
+    } else if (ret == AVERROR(EAGAIN)) {
+    ret = 0;
  } else if (ret < 0) {
  av_log(h, AV_LOG_ERROR, "Unable to read from socket\n");
  return ret;
@@ -564,8 +571,14 @@ static int tls_write(URLContext *h, const uint8_t *buf, int len)

  sspi_ret = EncryptMessage(&c->ctxt_handle, 0, &outbuf_desc, 0);
  if (sspi_ret == SEC_E_OK)  {
  len = outbuf[0].cbBuffer + outbuf[1].cbBuffer + outbuf[2].cbBuffer;

+
+    s->tcp->flags &= ~AVIO_FLAG_NONBLOCK;
+    s->tcp->flags |= h->flags & AVIO_FLAG_NONBLOCK;
+
  ret = ffurl_write(s->tcp, data, len);
-    if (ret < 0 || ret != len) {
+    if (ret == AVERROR(EAGAIN)) {
+    goto done;
+    } else if (ret < 0 || ret != len) {
  ret = AVERROR(EIO);
  av_log(h, AV_LOG_ERROR, "Writing encrypted data to socket failed\n");

  goto done;


will apply soon


applied
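
The flag handling in the patch above follows a clear-then-copy pattern: the nonblocking bit on the inner TCP context is first cleared, then set again only if the outer TLS context has it. A minimal, self-contained sketch of that pattern is below. `SKETCH_FLAG_NONBLOCK` and `sketch_forward_nonblock()` are hypothetical names used for illustration; the patch itself operates on `s->tcp->flags` and `h->flags` with `AVIO_FLAG_NONBLOCK`.

```c
#include <assert.h>

/* Hypothetical flag bit for illustration; the patch uses AVIO_FLAG_NONBLOCK. */
#define SKETCH_FLAG_NONBLOCK 8

/* Returns inner_flags with its nonblocking bit replaced by the one
 * from outer_flags. All other bits of inner_flags are preserved. */
static int sketch_forward_nonblock(int inner_flags, int outer_flags)
{
    int out;

    inner_flags &= ~SKETCH_FLAG_NONBLOCK;                     /* clear */
    out = inner_flags | (outer_flags & SKETCH_FLAG_NONBLOCK); /* copy  */

    /* The result must agree with the outer context on this one bit. */
    assert((out & SKETCH_FLAG_NONBLOCK) == (outer_flags & SKETCH_FLAG_NONBLOCK));
    return out;
}
```

The handshake and shutdown paths in the patch only perform the clear step, so those writes always block; the read and write paths perform both steps and then treat `AVERROR(EAGAIN)` as a non-fatal result.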

Re: [FFmpeg-devel] [PATCH v4 3/3] swscale/aarch64: Add argb/abgr to yuv

2024-06-24 Thread Martin Storsjö


On Mon, 24 Jun 2024, Zhao Zhili wrote:


From: Zhao Zhili 

Test on Apple M1 with kperf:
: -O3   : -O3 -fno-vectorize
abgr_to_uv_8_c  : 19.4  : 26.1
abgr_to_uv_8_neon   : 29.9  : 51.1
abgr_to_uv_128_c: 146.4 : 558.9
abgr_to_uv_128_neon : 85.1  : 83.4
abgr_to_uv_1080_c   : 1162.6: 4786.4
abgr_to_uv_1080_neon: 819.6 : 826.6
abgr_to_uv_1920_c   : 2063.6: 8492.1
abgr_to_uv_1920_neon: 1435.1: 1447.1
abgr_to_uv_half_8_c : 16.4  : 11.4
abgr_to_uv_half_8_neon  : 35.6  : 20.4
abgr_to_uv_half_128_c   : 108.6 : 359.4
abgr_to_uv_half_128_neon: 75.4  : 42.6
abgr_to_uv_half_1080_c  : 883.4 : 2885.6
abgr_to_uv_half_1080_neon   : 460.6 : 481.1
abgr_to_uv_half_1920_c  : 1553.6: 5106.9
abgr_to_uv_half_1920_neon   : 817.6 : 820.4
abgr_to_y_8_c   : 6.1   : 26.4
abgr_to_y_8_neon: 40.6  : 6.4
abgr_to_y_128_c : 99.9  : 390.1
abgr_to_y_128_neon  : 67.4  : 55.9
abgr_to_y_1080_c: 735.9 : 3170.4
abgr_to_y_1080_neon : 534.6 : 536.6
abgr_to_y_1920_c: 1279.4: 6016.4
abgr_to_y_1920_neon : 932.6 : 927.6
---
libswscale/aarch64/input.S   | 86 +++-
libswscale/aarch64/swscale.c | 17 +++
2 files changed, 82 insertions(+), 21 deletions(-)


This patchset looks ok to me (but wait a little bit in case someone else 
has further opinions on it).


// Martin
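
For readers who want a scalar reference for the tail loop in this patch, the following C sketch mirrors what the assembly comments describe: load r, g, b from byte offsets 0..2 (or 1..3 when alpha comes first), accumulate against the three coefficients on top of the constant offset, and shift right by 9. The function name, the `alpha_first` parameter as a C argument, and the `SKETCH_` constant are assumptions made for illustration; the coefficient values are supplied by the caller and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed from the assembly comments: RGB2YUV_SHIFT is 15. */
#define SKETCH_RGB2YUV_SHIFT 15

/* Scalar sketch of one pixel of the Y conversion for the RGB ordering.
 * px points at one pixel; alpha_first selects offsets 1..3 instead of
 * 0..2. Right-shifting a negative accumulator is implementation-defined
 * in C; callers are expected to pass coefficients that keep it
 * non-negative. */
static uint16_t sketch_pixel_to_y(const uint8_t *px, int alpha_first,
                                  int32_t ry, int32_t gy, int32_t by)
{
    const int off = alpha_first ? 1 : 0;
    /* const_offset = 32 << (SHIFT - 1) plus 1 << (SHIFT - 7) */
    int64_t acc = (32 << (SKETCH_RGB2YUV_SHIFT - 1)) +
                  (1 << (SKETCH_RGB2YUV_SHIFT - 7));

    assert(alpha_first == 0 || alpha_first == 1);

    acc += (int64_t)ry * px[off];     /* ry * r */
    acc += (int64_t)gy * px[off + 1]; /* gy * g */
    acc += (int64_t)by * px[off + 2]; /* by * b */
    return (uint16_t)(acc >> 9);
}
```

In the assembly, the BGR-ordered entry points reuse the same loop by loading the r and b coefficients into swapped registers, so a BGR variant of this sketch would swap `ry` and `by` at the call site rather than change the offsets.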


[FFmpeg-devel] [PATCH] lavc/vvc: Validate IBC block vector

2024-06-24 Thread Frank Plowman

From H.266 (V3) (09/2023) p. 321:

It is a requirement of bitstream conformance that the luma block
vector bvL shall obey the following constraints:
- CtbSizeY is greater than or equal to
((yCb + (bvL[ 1 ] >> 4)) & (CtbSizeY − 1)) + cbHeight

This patch checks this is true, which fixes crashes on fuzzed
bitstreams.

Signed-off-by: Frank Plowman 
---
 libavcodec/vvc/intra.c  | 25 ++---
 libavcodec/vvc/thread.c |  4 +---
 2 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/libavcodec/vvc/intra.c b/libavcodec/vvc/intra.c
index f77a012f09..11371db797 100644
--- a/libavcodec/vvc/intra.c
+++ b/libavcodec/vvc/intra.c
@@ -624,15 +624,26 @@ static void intra_block_copy(const VVCLocalContext *lc, const int c_idx)
 }
 }
 
-static void vvc_predict_ibc(const VVCLocalContext *lc)
+static int vvc_predict_ibc(const VVCLocalContext *lc)
 {
-const H266RawSPS *rsps = lc->fc->ps.sps->r;
+const VVCFrameContext *fc = lc->fc;
+const VVCSPS *sps = lc->fc->ps.sps;
+const H266RawSPS *rsps= sps->r;
+const CodingUnit *cu  = lc->cu;
+const Mv *bv  = &cu->pu.mi.mv[L0][0];
+
+if (sps->ctb_size_y < ((cu->y0 + (bv->y >> 4)) & (sps->ctb_size_y - 1)) + cu->cb_height) {
+av_log(fc->log_ctx, AV_LOG_ERROR, "IBC region spans multiple CTBs.\n");
+return AVERROR_INVALIDDATA;
+}
 
 intra_block_copy(lc, LUMA);
 if (lc->cu->tree_type == SINGLE_TREE && rsps->sps_chroma_format_idc) {
 intra_block_copy(lc, CB);
 intra_block_copy(lc, CR);
 }
+
+return 0;
 }
 
 static void ibc_fill_vir_buf(const VVCLocalContext *lc, const CodingUnit *cu)
@@ -678,7 +689,10 @@ int ff_vvc_reconstruct(VVCLocalContext *lc, const int rs, const int rx, const in
 if (cu->ciip_flag)
 ff_vvc_predict_ciip(lc);
 else if (cu->pred_mode == MODE_IBC)
-vvc_predict_ibc(lc);
+ret = vvc_predict_ibc(lc);
+if (ret)
+goto fail;
+
 if (cu->coded_flag) {
 ret = reconstruct(lc);
 } else {
@@ -687,10 +701,15 @@ int ff_vvc_reconstruct(VVCLocalContext *lc, const int rs, const int rx, const in
 if (sps->r->sps_chroma_format_idc && cu->tree_type != DUAL_TREE_LUMA)
 add_reconstructed_area(lc, CHROMA, cu->x0, cu->y0, cu->cb_width, cu->cb_height);
 }
+if (ret)
+goto fail;
+
 if (sps->r->sps_ibc_enabled_flag)
 ibc_fill_vir_buf(lc, cu);
 cu = cu->next;
 }
+
+fail:
 ff_vvc_ctu_free_cus(ctu);
 return ret;
 }
diff --git a/libavcodec/vvc/thread.c b/libavcodec/vvc/thread.c
index 8777d380bf..5b01dd2d20 100644
--- a/libavcodec/vvc/thread.c
+++ b/libavcodec/vvc/thread.c
@@ -454,9 +454,7 @@ static int run_inter(VVCContext *s, VVCLocalContext *lc, VVCTask *t)
 
 static int run_recon(VVCContext *s, VVCLocalContext *lc, VVCTask *t)
 {
-ff_vvc_reconstruct(lc, t->rs, t->rx, t->ry);
-
-return 0;
+return ff_vvc_reconstruct(lc, t->rs, t->rx, t->ry);
 }
 
 static int run_lmcs(VVCContext *s, VVCLocalContext *lc, VVCTask *t)
-- 
2.45.1


[FFmpeg-devel] [PATCH v4 1/3] avcodec/jpeg2000dec: Add support for CAP and CPF markers

2024-06-24 Thread Osamu Watanabe

This commit adds support for CAP and CPF markers.

The previous version
(v3, https://patchwork.ffmpeg.org/project/ffmpeg/list/?series=12246)
was split into patches incorrectly. This version fixes the split.

The changes are essentially the same as v2
(https://patchwork.ffmpeg.org/project/ffmpeg/list/?series=12199).
The suggested modifications have been made according to the discussion
on this mailing list.

Signed-off-by: Osamu Watanabe 
---
 libavcodec/jpeg2000.h|   8 +++
 libavcodec/jpeg2000dec.c | 112 ++-
 libavcodec/jpeg2000dec.h |   7 +++
 3 files changed, 126 insertions(+), 1 deletion(-)

diff --git a/libavcodec/jpeg2000.h b/libavcodec/jpeg2000.h
index d004c08f10..4bdc38df7c 100644
--- a/libavcodec/jpeg2000.h
+++ b/libavcodec/jpeg2000.h
@@ -37,12 +37,14 @@
 
 enum Jpeg2000Markers {
 JPEG2000_SOC = 0xff4f, // start of codestream
+JPEG2000_CAP = 0xff50, // extended capabilities
 JPEG2000_SIZ = 0xff51, // image and tile size
 JPEG2000_COD,  // coding style default
 JPEG2000_COC,  // coding style component
 JPEG2000_TLM = 0xff55, // tile-part length, main header
 JPEG2000_PLM = 0xff57, // packet length, main header
 JPEG2000_PLT,  // packet length, tile-part header
+JPEG2000_CPF,  // corresponding profile
 JPEG2000_QCD = 0xff5c, // quantization default
 JPEG2000_QCC,  // quantization component
 JPEG2000_RGN,  // region of interest
@@ -58,6 +60,12 @@ enum Jpeg2000Markers {
 JPEG2000_EOC = 0xffd9, // end of codestream
 };
 
+enum JPEG2000_Ccap15_b14_15_params {
+HTJ2K_HTONLY = 0,  // HTONLY, bit 14 and 15 are 0
+HTJ2K_HTDECLARED,  // HTDECLARED, bit 14 = 1 and bit 15 = 0
+HTJ2K_MIXED = 3,   // MIXED, bit 14 and 15 are 1
+};
+
 #define JPEG2000_SOP_FIXED_BYTES 0xFF910004
 #define JPEG2000_SOP_BYTE_LENGTH 6
 
diff --git a/libavcodec/jpeg2000dec.c b/libavcodec/jpeg2000dec.c
index 091931b1ff..d1046661c4 100644
--- a/libavcodec/jpeg2000dec.c
+++ b/libavcodec/jpeg2000dec.c
@@ -408,6 +408,73 @@ static int get_siz(Jpeg2000DecoderContext *s)
 s->avctx->bits_per_raw_sample = s->precision;
 return 0;
 }
+/* get extended capabilities (CAP) marker segment */
+static int get_cap(Jpeg2000DecoderContext *s, Jpeg2000CodingStyle *c)
+{
+uint32_t Pcap;
+uint16_t Ccap_i[32] = { 0 };
+uint16_t Ccap_15;
+uint8_t P;
+
+if (bytestream2_get_bytes_left(&s->g) < 6) {
+av_log(s->avctx, AV_LOG_ERROR, "Insufficient space for CAP\n");
+return AVERROR_INVALIDDATA;
+}
+
+Pcap = bytestream2_get_be32u(&s->g);
+s->isHT = (Pcap >> (31 - (15 - 1))) & 1;
+for (int i = 0; i < 32; i++) {
+if ((Pcap >> (31 - i)) & 1)
+Ccap_i[i] = bytestream2_get_be16u(&s->g);
+}
+Ccap_15 = Ccap_i[14];
+if (s->isHT == 1) {
+av_log(s->avctx, AV_LOG_INFO, "This is an HTJ2K codestream.\n");
+// Bits 14-15
+switch ((Ccap_15 >> 14) & 0x3) {
+case 0x3:
+s->Ccap15_b14_15 = HTJ2K_MIXED;
+break;
+case 0x1:
+s->Ccap15_b14_15 = HTJ2K_HTDECLARED;
+break;
+case 0x0:
+s->Ccap15_b14_15 = HTJ2K_HTONLY;
+break;
+default:
+av_log(s->avctx, AV_LOG_ERROR, "Unknown CCap value.\n");
+return AVERROR(EINVAL);
+break;
+}
+// Bit 13
+if ((Ccap_15 >> 13) & 1) {
+av_log(s->avctx, AV_LOG_ERROR, "MULTIHT set is not supported.\n");
+return AVERROR_PATCHWELCOME;
+}
+// Bit 12
+s->Ccap15_b12 = (Ccap_15 >> 12) & 1;
+// Bit 11
+s->Ccap15_b11 = (Ccap_15 >> 11) & 1;
+// Bit 5
+s->Ccap15_b05 = (Ccap_15 >> 5) & 1;
+// Bit 0-4
+P = Ccap_15 & 0x1F;
+if (!P)
+s->HT_MAGB = 8;
+else if (P < 20)
+s->HT_MAGB = P + 8;
+else if (P < 31)
+s->HT_MAGB = 4 * (P - 19) + 27;
+else
+s->HT_MAGB = 74;
+
+if (s->HT_MAGB > 31) {
+av_log(s->avctx, AV_LOG_ERROR, "Available internal precision is exceeded (MAGB> 31).\n");
+return AVERROR_PATCHWELCOME;
+}
+}
+return 0;
+}
 
 /* get common part for COD and COC segments */
 static int get_cox(Jpeg2000DecoderContext *s, Jpeg2000CodingStyle *c)
@@ -802,6 +869,15 @@ static int read_crg(Jpeg2000DecoderContext *s, int n)
 bytestream2_skip(&s->g, n - 2);
 return 0;
 }
+
+static int read_cpf(Jpeg2000DecoderContext *s, int n)
+{
+if (bytestream2_get_bytes_left(&s->g) < (n - 2))
+return AVERROR_INVALIDDATA;
+bytestream2_skip(&s->g, n - 2);
+return 0;
+}
+
 /* Tile-part lengths: see ISO 15444-1:2002, section A.7.1
  * Used to know the number of tile parts and lengths.
  * There may be multiple TLMs in the header.
@@ -965,6 +1041,14 @@ static int init_tile(Jp

[FFmpeg-devel] [PATCH v4 2/3] avcodec/jpeg2000dec: Add support for placeholder passes

2024-06-24 Thread Osamu Watanabe

This commit adds support for parsing placeholder passes.

Signed-off-by: Osamu Watanabe 
---
 libavcodec/jpeg2000.h  |   2 +
 libavcodec/jpeg2000dec.c   | 352 +
 libavcodec/jpeg2000htdec.c |  90 +-
 3 files changed, 326 insertions(+), 118 deletions(-)

diff --git a/libavcodec/jpeg2000.h b/libavcodec/jpeg2000.h
index 4bdc38df7c..93221d90ca 100644
--- a/libavcodec/jpeg2000.h
+++ b/libavcodec/jpeg2000.h
@@ -200,6 +200,8 @@ typedef struct Jpeg2000Cblk {
 /* specific to HT code-blocks */
 int zbp;
 int pass_lengths[2];
+uint8_t modes; // copy of SPcod/SPcoc field to parse HT-MIXED mode
+uint8_t ht_plhd; // are we looking for HT placeholder passes?
 } Jpeg2000Cblk; // code block
 
 typedef struct Jpeg2000Prec {
diff --git a/libavcodec/jpeg2000dec.c b/libavcodec/jpeg2000dec.c
index d1046661c4..2c66c21b88 100644
--- a/libavcodec/jpeg2000dec.c
+++ b/libavcodec/jpeg2000dec.c
@@ -54,6 +54,15 @@
 #define HAD_COC 0x01
 #define HAD_QCC 0x02
 
+// Values of flag for placeholder passes
+enum HT_PLHD_STATUS {
+HT_PLHD_OFF,
+HT_PLHD_ON
+};
+
+#define HT_MIXED 0x80 // bit 7 of SPcod/SPcoc
+
+
 /* get_bits functions for JPEG2000 packet bitstream
  * It is a get_bit function with a bit-stuffing routine. If the value of the
  * byte is 0xFF, the next byte includes an extra zero bit stuffed into the MSB.
@@ -1160,100 +1169,293 @@ static int jpeg2000_decode_packet(Jpeg2000DecoderContext *s, Jpeg2000Tile *tile,
 int incl, newpasses, llen;
 void *tmp;
 
-if (cblk->npasses)
-incl = get_bits(s, 1);
-else
+if (!cblk->incl) {
+incl = 0;
+cblk->modes = codsty->cblk_style;
+if (cblk->modes >= JPEG2000_CTSY_HTJ2K_F)
+cblk->ht_plhd = HT_PLHD_ON;
+if (layno > 0)
+incl = tag_tree_decode(s, prec->cblkincl + cblkno, 0 + 1) == 0;
 incl = tag_tree_decode(s, prec->cblkincl + cblkno, layno + 1) == layno;
-if (!incl)
-continue;
-else if (incl < 0)
-return incl;
-
-if (!cblk->npasses) {
-int zbp = tag_tree_decode(s, prec->zerobits + cblkno, 100);
-int v = expn[bandno] + numgbits - 1 - zbp;
 
-if (v < 0 || v > 30) {
-av_log(s->avctx, AV_LOG_ERROR,
-   "nonzerobits %d invalid or unsupported\n", v);
-return AVERROR_INVALIDDATA;
+if (incl) {
+int zbp = tag_tree_decode(s, prec->zerobits + cblkno, 100);
+int v = expn[bandno] + numgbits - 1 - (zbp - tile->comp->roi_shift);
+if (v < 0 || v > 30) {
+av_log(s->avctx, AV_LOG_ERROR,
+   "nonzerobits %d invalid or unsupported\n", v);
+return AVERROR_INVALIDDATA;
+}
+cblk->incl = 1;
+cblk->nonzerobits = v;
+cblk->zbp = zbp;
+cblk->lblock = 3;
 }
-cblk->zbp = zbp;
-cblk->nonzerobits = v;
-}
-if ((newpasses = getnpasses(s)) < 0)
-return newpasses;
-av_assert2(newpasses > 0);
-if (cblk->npasses + newpasses >= JPEG2000_MAX_PASSES) {
-avpriv_request_sample(s->avctx, "Too many passes");
-return AVERROR_PATCHWELCOME;
-}
-if ((llen = getlblockinc(s)) < 0)
-return llen;
-if (cblk->lblock + llen + av_log2(newpasses) > 16) {
-avpriv_request_sample(s->avctx,
-  "Block with length beyond 16 bits");
-return AVERROR_PATCHWELCOME;
+} else {
+incl = get_bits(s, 1);
 }
 
-cblk->lblock += llen;
-
-cblk->nb_lengthinc = 0;
-cblk->nb_terminationsinc = 0;
-av_free(cblk->lengthinc);
-cblk->lengthinc = av_calloc(newpasses, sizeof(*cblk->lengthinc));
-if (!cblk->lengthinc)
-return AVERROR(ENOMEM);
-tmp = av_realloc_array(cblk->data_start, cblk->nb_terminations + newpasses + 1, sizeof(*cblk->data_start));
-if (!tmp)
-return AVERROR(ENOMEM);
-cblk->data_start = tmp;
-do {
-int newpasses1 = 0;
+if (incl) {
+uint8_t bypass_term_threshold = 0;
+uint8_t bits_to_read = 0;
+uint32_t segment_bytes = 0;
+int32_t segment_passes = 0;
+uint8_t next_segment_passes = 0;
+int32_t href_passes, pass_bound;
+uint32_t tmp_length = 0;
+int32_t newpasses_copy, npasses

[FFmpeg-devel] [PATCH v4 3/3] avcodec/jpeg2000dec: Fix HT decoding

2024-06-24 Thread Osamu Watanabe

This commit fixes the incorrect handling of the MAGBP value in Ccap15 and
bugs in HT block decoding.

Signed-off-by: Osamu Watanabe 
---
 libavcodec/jpeg2000dec.c   |  11 +--
 libavcodec/jpeg2000htdec.c | 136 ++---
 libavcodec/jpeg2000htdec.h |   2 +-
 3 files changed, 89 insertions(+), 60 deletions(-)

diff --git a/libavcodec/jpeg2000dec.c b/libavcodec/jpeg2000dec.c
index 2c66c21b88..83cd5dbc7c 100644
--- a/libavcodec/jpeg2000dec.c
+++ b/libavcodec/jpeg2000dec.c
@@ -391,6 +391,9 @@ static int get_siz(Jpeg2000DecoderContext *s)
 } else if (ncomponents == 1 && s->precision == 8) {
 s->avctx->pix_fmt = AV_PIX_FMT_GRAY8;
 i = 0;
+} else if (ncomponents == 1 && s->precision == 12) {
+s->avctx->pix_fmt = AV_PIX_FMT_GRAY16LE;
+i = 0;
 }
 }
 
@@ -2204,7 +2207,7 @@ static inline int tile_codeblocks(const 
Jpeg2000DecoderContext *s, Jpeg2000Tile
 Jpeg2000Band *band = rlevel->band + bandno;
 int cblkno = 0, bandpos;
 /* See Rec. ITU-T T.800, Equation E-2 */
-int magp = quantsty->expn[subbandno] + quantsty->nguardbits - 1;
+int M_b = quantsty->expn[subbandno] + quantsty->nguardbits - 1;
 
 bandpos = bandno + (reslevelno > 0);
 
@@ -2212,8 +2215,8 @@ static inline int tile_codeblocks(const 
Jpeg2000DecoderContext *s, Jpeg2000Tile
 band->coord[1][0] == band->coord[1][1])
 continue;
 
-if ((codsty->cblk_style & JPEG2000_CTSY_HTJ2K_F) && magp >= 31) {
-avpriv_request_sample(s->avctx, "JPEG2000_CTSY_HTJ2K_F and magp >= 31");
+if ((codsty->cblk_style & JPEG2000_CTSY_HTJ2K_F) && M_b >= 31) {
+avpriv_request_sample(s->avctx, "JPEG2000_CTSY_HTJ2K_F and M_b >= 31");
 return AVERROR_PATCHWELCOME;
 }
 
@@ -2234,7 +2237,7 @@ static inline int tile_codeblocks(const 
Jpeg2000DecoderContext *s, Jpeg2000Tile
 ret = ff_jpeg2000_decode_htj2k(s, codsty, &t1, cblk,
 cblk->coord[0][1] - cblk->coord[0][0],
 cblk->coord[1][1] - cblk->coord[1][0],
-   magp, comp->roi_shift);
+   M_b, comp->roi_shift);
 else
 ret = decode_cblk(s, codsty, &t1, cblk,
   cblk->coord[0][1] - cblk->coord[0][0],
diff --git a/libavcodec/jpeg2000htdec.c b/libavcodec/jpeg2000htdec.c
index 9b473e11d3..0296792a6a 100644
--- a/libavcodec/jpeg2000htdec.c
+++ b/libavcodec/jpeg2000htdec.c
@@ -122,7 +122,7 @@ static void jpeg2000_init_mel(StateVars *s, uint32_t Pcup)
 
 static void jpeg2000_init_mag_ref(StateVars *s, uint32_t Lref)
 {
-s->pos   = Lref - 2;
+s->pos   = Lref - 1;
 s->bits  = 0;
 s->last  = 0xFF;
 s->tmp   = 0;
@@ -145,9 +145,10 @@ static void jpeg2000_init_mel_decoder(MelDecoderState 
*mel_state)
 static int jpeg2000_bitbuf_refill_backwards(StateVars *buffer, const uint8_t 
*array)
 {
 uint64_t tmp = 0;
-int32_t position = buffer->pos - 4;
 uint32_t new_bits = 32;
 
+buffer->last = array[buffer->pos + 1];
+
 if (buffer->bits_left >= 32)
 return 0; // enough data, no need to pull in more bits
 
@@ -157,9 +158,24 @@ static int jpeg2000_bitbuf_refill_backwards(StateVars 
*buffer, const uint8_t *ar
  *  the bottom most bits.
  */
 
-for(int i = FFMAX(0, position + 1); i <= buffer->pos + 1; i++)
-tmp = 256*tmp + array[i];
-
+if (buffer->pos >= 3) {  // Common case; we have at least 4 bytes available
+ tmp = array[buffer->pos - 3];
+ tmp = (tmp << 8) | array[buffer->pos - 2];
+ tmp = (tmp << 8) | array[buffer->pos - 1];
+ tmp = (tmp << 8) | array[buffer->pos];
+ tmp = (tmp << 8) | buffer->last;  // For stuffing bit detection
+ buffer->pos -= 4;
+} else {
+if (buffer->pos >= 2)
+tmp = array[buffer->pos - 2]; 
+if (buffer->pos >= 1)
+tmp = (tmp << 8) | array[buffer->pos - 1];
+if (buffer->pos >= 0)
+tmp = (tmp << 8) | array[buffer->pos];
+buffer->pos = 0;
+tmp = (tmp << 8) | buffer->last;  // For stuffing bit detection
+}
+// Now remove any stuffing bits, shifting things down as we go
 if ((tmp & 0x7FFF00) > 0x7F8F00) {
 tmp &= 0x7F;
 new_bits--;
@@ -176,13 +192,11 @@ static int jpeg2000_bitbuf_refill_backwards(StateVars 
*buffer, const uint8_t *ar
 tmp = (tmp & 0x007FFF) + ((tmp & 0xFF) >> 1);
 new_bits--;
 }
-
-tmp >>= 8; // Remove temporary byte loaded
+tmp >>= 8;  // Shift
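
As a reading aid for the backward refill above, here is a minimal sketch of only the byte-loading step. It is not the code from the patch: `load_backwards` is a hypothetical helper, it ignores stuffing-bit removal and the `last` byte, and it lets the position drop below zero once the buffer is exhausted, which is an assumption of this sketch.

```c
#include <stdint.h>

/* Hypothetical helper: load up to four bytes ending at index *pos
 * (inclusive), walking towards the start of the array. The byte at the
 * lowest address ends up most significant, matching the order used in
 * the common case of the patch. *pos is moved past the consumed bytes
 * and *count receives how many bytes were loaded (0..4). */
static uint32_t load_backwards(const uint8_t *array, int32_t *pos, int *count)
{
    uint32_t word = 0;
    int n = *pos >= 3 ? 4 : *pos + 1;   /* bytes still available */

    if (n < 0)
        n = 0;
    for (int i = n - 1; i >= 0; i--)
        word = (word << 8) | array[*pos - i];

    *pos  -= n;
    *count = n;
    return word;
}
```

Calling it repeatedly on a six-byte buffer starting at position 5 yields a four-byte word, then a two-byte word, then nothing.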

Re: [FFmpeg-devel] [RFC]] swscale modernization proposal

2024-06-24 Thread Niklas Haas

On Sun, 23 Jun 2024 14:57:31 -0300 James Almer  wrote:
> On 6/22/2024 7:19 PM, Vittorio Giovara wrote:
> > Needless to say I support the plan of renaming the library so that it can
> > be inline with the other libraries names, and the use of a separate header
> > since downstream applications will need to update a lot to use the new
> > library (or the new apis in the existing library) and/or we could provide a
> > thin conversion layer when the new lib is finalized.
> 
> I don't quite agree with renaming it. As Michael already pointed out, 
> the av prefix wouldn't fit a scaling library nor a resampling one, as 
> they only handle one or the other.

By this logic, both libswscale and libswresample should be merged into
libavscale. The mathematics of resampling and scaling is the same :)

Anyway, renaming a library needs a really strong motivating reason, and
I don't see that reason being present here. As discussed further
up-thread, I will try and re-use the existing swscale public API, but
internally restructure things so that SwsContext is itself the
"high-level wrapper" that I intended  to be.

We are very fortunate that SwsContext is entirely private, so I'm not
too concerned about the code implications of this. At worst it will
involve a bunch of renaming commits.

> There's also the precedent of avresample, which was ultimately dropped 
> in favor of swresample, so trying to replace swscale with a new avscale 
> library will be both confusing and going against what was already 
> established.

Re: [FFmpeg-devel] [RFC]] swscale modernization proposal

2024-06-24 Thread Vittorio Giovara

On Sun, Jun 23, 2024 at 7:57 PM James Almer  wrote:

> On 6/22/2024 7:19 PM, Vittorio Giovara wrote:
> > Needless to say I support the plan of renaming the library so that it can
> > be inline with the other libraries names, and the use of a separate header
> > since downstream applications will need to update a lot to use the new
> > library (or the new apis in the existing library) and/or we could provide a
> > thin conversion layer when the new lib is finalized.
>
> I don't quite agree with renaming it. As Michael already pointed out,
> the av prefix wouldn't fit a scaling library nor a resampling one, as
> they only handle one or the other.
>

by that reasoning we should ban all subtitles from all the libraries

av is a shorthand of multimedia and many people in the industry refer to
ffmpeg libs as "libav*" so it feels a bit odd to push for preserving an
alternative name


> There's also the precedent of avresample, which was ultimately dropped
> in favor of swresample, so trying to replace swscale with a new avscale
> library will be both confusing and going against what was already
> established.


it's still 4 libraries vs 2... and swr/avr is shrouded in bad history that
is not worth bringing up

I'd understand opposing a rename just for the sake of renaming, but this is
essentially a new library, i see no value in preserving the old naming
scheme, if not making downstream life worse :x
-- 
Vittorio

[FFmpeg-devel] [PATCH] lavc/vvc: Always set flags for the current picture

2024-06-24 Thread Frank Plowman

ff_vvc_frame_rpl uses the flags to detect whether a frame is in use.
Therefore, in the case of a CVSS AU (RASL/GDR with
NoOutputBeforeRecoveryFlag) with ph_non_ref_pic_flag = 1, the frame
would be freed before it is used.  Fix this by always marking the
current frame with VVC_FRAME_FLAG_SHORT_REF, as is done by the HEVC
decoder.

Additionally, add an assert0 to mitigate the effects of a frame being
freed before it is used.

Signed-off-by: Frank Plowman 
---
 libavcodec/vvc/refs.c   | 2 +-
 libavcodec/vvc/thread.c | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/libavcodec/vvc/refs.c b/libavcodec/vvc/refs.c
index 8b7ba639a3..26a5b0b34c 100644
--- a/libavcodec/vvc/refs.c
+++ b/libavcodec/vvc/refs.c
@@ -191,7 +191,7 @@ int ff_vvc_set_new_ref(VVCContext *s, VVCFrameContext *fc, 
AVFrame **frame)
 fc->ref = ref;
 
 if (s->no_output_before_recovery_flag && (IS_RASL(s) || !GDR_IS_RECOVERED(s)))
-ref->flags = 0;
+ref->flags = VVC_FRAME_FLAG_SHORT_REF;
 else if (ph->r->ph_pic_output_flag)
 ref->flags = VVC_FRAME_FLAG_OUTPUT;
 
diff --git a/libavcodec/vvc/thread.c b/libavcodec/vvc/thread.c
index 5b01dd2d20..e87ed5b676 100644
--- a/libavcodec/vvc/thread.c
+++ b/libavcodec/vvc/thread.c
@@ -801,6 +801,8 @@ int ff_vvc_frame_wait(VVCContext *s, VVCFrameContext *fc)
 {
 VVCFrameThread *ft = fc->ft;
 
+av_assert0(fc->ref->progress);
+
 ff_mutex_lock(&ft->lock);
 
 while (atomic_load(&ft->nb_scheduled_tasks) || atomic_load(&ft->nb_scheduled_listeners))
-- 
2.45.1


Re: [FFmpeg-devel] [PATCH v2] lavu/stereo3d: change the horizontal FOV field to a rational

2024-06-24 Thread Derek Buitenhuis

On 6/24/2024 1:13 AM, James Almer wrote:
> If Derek is also ok with this then LGTM.

I do not object.

- Derek

[FFmpeg-devel] [PATCH] avcodec/dovi_rpudec: fix reading el_bit_depth_minus8

2024-06-24 Thread Cosmin Stejerean via ffmpeg-devel

From: Cosmin Stejerean 

now that we are reading ext_mapping_idc as the upper 8 bits of
el_bit_depth_minus8 we need to use get_ue_golomb_long rather than
get_ue_golomb_31 for reading it

---
 libavcodec/dovi_rpudec.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavcodec/dovi_rpudec.c b/libavcodec/dovi_rpudec.c
index 8cafdcf5e6..c025800206 100644
--- a/libavcodec/dovi_rpudec.c
+++ b/libavcodec/dovi_rpudec.c
@@ -420,7 +420,7 @@ int ff_dovi_rpu_parse(DOVIContext *s, const uint8_t *rpu, 
size_t rpu_size,
 
 if ((hdr->rpu_format & 0x700) == 0) {
 int bl_bit_depth_minus8 = get_ue_golomb_31(gb);
-int el_bit_depth_minus8 = get_ue_golomb_31(gb);
+int el_bit_depth_minus8 = get_ue_golomb_long(gb);
 int vdr_bit_depth_minus8 = get_ue_golomb_31(gb);
 int reserved_zero_3bits;
 /* ext_mapping_idc is in the upper 8 bits of el_bit_depth_minus8 */
-- 
2.42.1



Re: [FFmpeg-devel] [PATCH] avcodec/dovi_rpudec: fix reading el_bit_depth_minus8

2024-06-24 Thread Niklas Haas

On Mon, 24 Jun 2024 15:56:12 + Cosmin Stejerean via ffmpeg-devel 
 wrote:
> From: Cosmin Stejerean 
> 
> now that we are reading ext_mapping_idc as the upper 8 bits of
> el_bit_depth_minus8 we need to use get_ue_golomb_long rather than
> get_ue_golomb_31 for reading it
> 
> ---
>  libavcodec/dovi_rpudec.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/libavcodec/dovi_rpudec.c b/libavcodec/dovi_rpudec.c
> index 8cafdcf5e6..c025800206 100644
> --- a/libavcodec/dovi_rpudec.c
> +++ b/libavcodec/dovi_rpudec.c
> @@ -420,7 +420,7 @@ int ff_dovi_rpu_parse(DOVIContext *s, const uint8_t *rpu, 
> size_t rpu_size,
>  
>  if ((hdr->rpu_format & 0x700) == 0) {
>  int bl_bit_depth_minus8 = get_ue_golomb_31(gb);
> -int el_bit_depth_minus8 = get_ue_golomb_31(gb);
> +int el_bit_depth_minus8 = get_ue_golomb_long(gb);
>  int vdr_bit_depth_minus8 = get_ue_golomb_31(gb);
>  int reserved_zero_3bits;
>  /* ext_mapping_idc is in the upper 8 bits of el_bit_depth_minus8 */
> -- 
> 2.42.1

LGTM. I checked also the equivalent for set_ue_golomb(), but it's fine
up to 2^16-2, which is enough for the max value of (0xFF << 8) | 8.
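
To make the range argument concrete, here is a minimal sketch, not the FFmpeg implementation. It packs an `ext_mapping_idc` value into the upper 8 bits as the patch comment describes and computes how long the unsigned Exp-Golomb code word for the result is. The helper names `pack_el_field` and `ue_golomb_bits` are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: combine ext_mapping_idc (upper 8 bits) with
 * el_bit_depth_minus8 (lower 8 bits), as described in the patch comment. */
static uint32_t pack_el_field(uint8_t ext_mapping_idc, uint8_t bit_depth_minus8)
{
    return ((uint32_t)ext_mapping_idc << 8) | bit_depth_minus8;
}

/* Length in bits of the unsigned Exp-Golomb code word for v:
 * 2 * floor(log2(v + 1)) + 1. */
static int ue_golomb_bits(uint32_t v)
{
    uint64_t x = (uint64_t)v + 1;
    int log2 = 0;

    while (x >>= 1)
        log2++;
    return 2 * log2 + 1;
}
```

With this formula the value 31 needs an 11-bit code word, while (0xFF << 8) | 8 = 65288 needs 31 bits, which is well outside what a reader limited to the range 0..31 can handle.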


[FFmpeg-devel] [GASPP PATCH 1/2] Translate .xword and .dword to .quad

2024-06-24 Thread Martin Storsjö

---
 gas-preprocessor.pl | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gas-preprocessor.pl b/gas-preprocessor.pl
index b0c343e..20b927f 100755
--- a/gas-preprocessor.pl
+++ b/gas-preprocessor.pl
@@ -1169,6 +1169,8 @@ sub handle_serialized_line {
 $line =~ s/\.syntax/$comm$&/x  if $as_type =~ /armasm/;
 
 $line =~ s/\.hword/.short/x;
+$line =~ s/\.xword/.quad/x;
+$line =~ s/\.dword/.quad/x;
 
 if ($as_type =~ /^apple-/) {
 # the syntax for these is a little different
-- 
2.34.1


[FFmpeg-devel] [GASPP PATCH 2/2] Handle local labels in expressions with .xword/.dword/.quad

2024-06-24 Thread Martin Storsjö

---
This might be needed in dav1d in the future.
---
 gas-preprocessor.pl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gas-preprocessor.pl b/gas-preprocessor.pl
index 20b927f..19b0131 100755
--- a/gas-preprocessor.pl
+++ b/gas-preprocessor.pl
@@ -958,7 +958,7 @@ sub handle_serialized_line {
 $xreg =~ s/w/x/;
 $line =~ s/\b$reg\b/$xreg/;
 }
-} elsif ($line =~ /^\s*.h?word.*\b\d+[bf]\b/) {
+} elsif ($line =~ /^\s*.([hxd]?word|quad).*\b\d+[bf]\b/) {
 while ($line =~ /\b(\d+)([bf])\b/g) {
 $line = handle_local_label($line, $1, $2);
 }
-- 
2.34.1


[FFmpeg-devel] [PATCH 1/9] avcodec/dovi_rpudec: clarify semantics

2024-06-24 Thread Niklas Haas

From: Niklas Haas 

ff_dovi_rpu_parse() and ff_dovi_rpu_generate() are a bit inconsistent in
that they expect different levels of encapsulation, due to the nature of
how this is handled in the context of different APIs. Clarify the status
quo. (And fix an incorrect reference to the RPU payload bytes as 'RBSP')
---
 libavcodec/dovi_rpu.h| 5 +++--
 libavcodec/dovi_rpudec.c | 2 +-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/libavcodec/dovi_rpu.h b/libavcodec/dovi_rpu.h
index bfb118d6b5..205d16ffbc 100644
--- a/libavcodec/dovi_rpu.h
+++ b/libavcodec/dovi_rpu.h
@@ -95,8 +95,9 @@ void ff_dovi_ctx_unref(DOVIContext *s);
 void ff_dovi_ctx_flush(DOVIContext *s);
 
 /**
- * Parse the contents of a Dovi RPU NAL and update the parsed values in the
- * DOVIContext struct.
+ * Parse the contents of a Dolby Vision RPU and update the parsed values in the
+ * DOVIContext struct. This function should receive the decoded unit payload,
+ * without any T.35 or NAL unit headers.
  *
  * Returns 0 or an error code.
  *
diff --git a/libavcodec/dovi_rpudec.c b/libavcodec/dovi_rpudec.c
index c025800206..375e6e560b 100644
--- a/libavcodec/dovi_rpudec.c
+++ b/libavcodec/dovi_rpudec.c
@@ -360,7 +360,7 @@ int ff_dovi_rpu_parse(DOVIContext *s, const uint8_t *rpu, 
size_t rpu_size,
 emdf_protection = get_bits(gb, 5 + 12);
 VALIDATE(emdf_protection, 0x400, 0x400);
 } else {
-/* NAL RBSP with prefix and trailing zeroes */
+/* NAL unit with prefix and trailing zeroes */
 VALIDATE(rpu[0], 25, 25); /* NAL prefix */
 rpu++;
 rpu_size--;
-- 
2.45.1


[FFmpeg-devel] [PATCH 2/9] avcodec/dovi_rpuenc: also copy ext blocks to dovi ctx

2024-06-24 Thread Niklas Haas

From: Niklas Haas 

As the comment implies, DOVIContext.ext_blocks should also reflect the
current state after ff_dovi_rpu_generate().

Fluff for now, but will be needed once we start implementing metadata
compression for extension blocks as well.
---
 libavcodec/dovi_rpuenc.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/libavcodec/dovi_rpuenc.c b/libavcodec/dovi_rpuenc.c
index a14c9cc181..f0cfecc91b 100644
--- a/libavcodec/dovi_rpuenc.c
+++ b/libavcodec/dovi_rpuenc.c
@@ -506,6 +506,12 @@ int ff_dovi_rpu_generate(DOVIContext *s, const 
AVDOVIMetadata *metadata,
 }
 }
 
+if (metadata->num_ext_blocks && !s->ext_blocks) {
+s->ext_blocks = ff_refstruct_allocz(sizeof(AVDOVIDmData) * AV_DOVI_MAX_EXT_BLOCKS);
+if (!s->ext_blocks)
+return AVERROR(ENOMEM);
+}
+
 vdr_dm_metadata_present = memcmp(color, &ff_dovi_color_default, sizeof(*color));
 use_prev_vdr_rpu = !memcmp(s->vdr[vdr_rpu_id], mapping, sizeof(*mapping));
 if (num_ext_blocks_v1 || num_ext_blocks_v2)
@@ -636,6 +642,7 @@ int ff_dovi_rpu_generate(DOVIContext *s, const 
AVDOVIMetadata *metadata,
 }
 
 if (vdr_dm_metadata_present) {
+size_t ext_sz;
 const int denom = profile == 4 ? (1 << 30) : (1 << 28);
 set_ue_golomb(pb, color->dm_metadata_id); /* affected_dm_id */
 set_ue_golomb(pb, color->dm_metadata_id); /* current_dm_id */
@@ -673,6 +680,11 @@ int ff_dovi_rpu_generate(DOVIContext *s, const 
AVDOVIMetadata *metadata,
 for (int i = 0; i < metadata->num_ext_blocks; i++)
 generate_ext_v2(pb, av_dovi_get_ext(metadata, i));
 }
+
+ext_sz = FFMIN(sizeof(AVDOVIDmData), metadata->ext_block_size);
+for (int i = 0; i < metadata->num_ext_blocks; i++)
+memcpy(&s->ext_blocks[i], av_dovi_get_ext(metadata, i), ext_sz);
+s->num_ext_blocks = metadata->num_ext_blocks;
 } else {
 s->color = &ff_dovi_color_default;
 s->num_ext_blocks = 0;
-- 
2.45.1
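
The copy loop in this patch takes the smaller of the compile-time struct size and the run-time element size. A minimal sketch of that pattern follows, using a hypothetical `Block` type and `copy_blocks` helper rather than the real AVDOVIDmData API. Unlike the patch, the sketch also zeroes the destination first, which is an assumption made here so that short source elements leave well-defined bytes behind.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a block type whose size is fixed at compile time. */
typedef struct Block {
    uint8_t level;
    uint8_t payload[7];
} Block;

/* Copy `count` blocks out of a packed source array whose element stride
 * (src_stride) is only known at run time. At most sizeof(Block) bytes are
 * copied per element; bytes the source does not provide stay zero. */
static void copy_blocks(Block *dst, const uint8_t *src,
                        size_t src_stride, size_t count)
{
    size_t n = src_stride < sizeof(Block) ? src_stride : sizeof(Block);

    memset(dst, 0, count * sizeof(Block));
    for (size_t i = 0; i < count; i++)
        memcpy(&dst[i], src + i * src_stride, n);
}
```

Clamping the copy size this way keeps the code safe whether the producer of the data was built with a larger or a smaller element layout than the consumer.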


[FFmpeg-devel] [PATCH 3/9] avcodec/dovi_rpuenc: try to re-use existing vdr_rpu_id

2024-06-24 Thread Niklas Haas

From: Niklas Haas 

And only override it if we either have an exact match, or if we still
have unused metadata slots (to avoid an overwrite).
---
 libavcodec/dovi_rpuenc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libavcodec/dovi_rpuenc.c b/libavcodec/dovi_rpuenc.c
index f0cfecc91b..30b6b09f1d 100644
--- a/libavcodec/dovi_rpuenc.c
+++ b/libavcodec/dovi_rpuenc.c
@@ -463,12 +463,12 @@ int ff_dovi_rpu_generate(DOVIContext *s, const 
AVDOVIMetadata *metadata,
 return AVERROR_INVALIDDATA;
 }
 
-vdr_rpu_id = -1;
+vdr_rpu_id = mapping->vdr_rpu_id;
 for (int i = 0; i <= DOVI_MAX_DM_ID; i++) {
 if (s->vdr[i] && !memcmp(s->vdr[i], mapping, sizeof(*mapping))) {
 vdr_rpu_id = i;
 break;
-} else if (vdr_rpu_id < 0 && (!s->vdr[i] || i == DOVI_MAX_DM_ID)) {
+} else if (s->vdr[vdr_rpu_id] && !s->vdr[i]) {
 vdr_rpu_id = i;
 }
 }
-- 
2.45.1
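
The slot selection after this change can be sketched in isolation as follows. This is an illustration under stated assumptions, not the code from dovi_rpuenc.c: `Mapping`, `MAX_ID` and `pick_slot` are hypothetical names, and the requested id is assumed to be within 0..MAX_ID.

```c
#include <stddef.h>
#include <string.h>

#define MAX_ID 15  /* assumption for illustration */

/* Hypothetical, padding-free stand-in for the mapping struct. */
typedef struct Mapping {
    int id;
    int a;
    int b;
} Mapping;

/* Start from the id requested by the caller. Re-use a slot that already
 * holds identical contents; otherwise, if the requested slot is occupied,
 * fall back to the first free slot. If every slot is occupied, the
 * requested slot is returned and will be overwritten. */
static int pick_slot(Mapping *const *slots, const Mapping *m)
{
    int id = m->id;

    for (int i = 0; i <= MAX_ID; i++) {
        if (slots[i] && !memcmp(slots[i], m, sizeof(*m)))
            return i;               /* exact match: re-use */
        if (slots[id] && !slots[i])
            id = i;                 /* requested slot taken: use a free one */
    }
    return id;
}
```

Once `id` points at a free slot the second condition can no longer fire, so the lowest-numbered free slot wins unless a later exact match is found.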


[FFmpeg-devel] [PATCH 4/9] avcodec/dovi_rpuenc: allow changing vdr_rpu_id

2024-06-24 Thread Niklas Haas

From: Niklas Haas 

The version as written also compared the vdr_rpu_id field, which would
defeat the purpose of trying to look for a matching slot in the
first place.
---
 libavcodec/dovi_rpuenc.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/libavcodec/dovi_rpuenc.c b/libavcodec/dovi_rpuenc.c
index 30b6b09f1d..f10e175350 100644
--- a/libavcodec/dovi_rpuenc.c
+++ b/libavcodec/dovi_rpuenc.c
@@ -20,6 +20,8 @@
  * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
  */
 
+#include 
+
 #include "libavutil/avassert.h"
 #include "libavutil/crc.h"
 #include "libavutil/mem.h"
@@ -201,6 +203,15 @@ skip:
 return 0;
 }
 
+/* compares data mappings, excluding vdr_rpu_id */
+static int cmp_data_mapping(const AVDOVIDataMapping *m1,
+const AVDOVIDataMapping *m2)
+{
+static_assert(offsetof(AVDOVIDataMapping, vdr_rpu_id) == 0, "vdr_rpu_id is first field");
+const void *p1 = &m1->vdr_rpu_id + 1, *p2 = &m2->vdr_rpu_id + 1;
+return memcmp(p1, p2, sizeof(AVDOVIDataMapping) - sizeof(m1->vdr_rpu_id));
+}
+
 static inline void put_ue_coef(PutBitContext *pb, const AVDOVIRpuDataHeader *hdr,
uint64_t coef)
 {
@@ -465,7 +476,7 @@ int ff_dovi_rpu_generate(DOVIContext *s, const 
AVDOVIMetadata *metadata,
 
 vdr_rpu_id = mapping->vdr_rpu_id;
 for (int i = 0; i <= DOVI_MAX_DM_ID; i++) {
-if (s->vdr[i] && !memcmp(s->vdr[i], mapping, sizeof(*mapping))) {
+if (s->vdr[i] && !cmp_data_mapping(s->vdr[i], mapping)) {
 vdr_rpu_id = i;
 break;
 } else if (s->vdr[vdr_rpu_id] && !s->vdr[i]) {
@@ -639,6 +650,7 @@ int ff_dovi_rpu_generate(DOVIContext *s, const 
AVDOVIMetadata *metadata,
 }
 
 memcpy(s->vdr[vdr_rpu_id], mapping, sizeof(*mapping));
+s->vdr[vdr_rpu_id]->vdr_rpu_id = vdr_rpu_id;
 }
 
 if (vdr_dm_metadata_present) {
-- 
2.45.1
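
The idea behind cmp_data_mapping() is to compare two structs while skipping a leading id field. A minimal standalone sketch, with a hypothetical `Mapping` struct and `cmp_without_id` helper, might look like this. It uses offsetof() of the second member as the starting point, which differs slightly from the pointer arithmetic in the patch, and it assumes both structs were zero-initialized so that any padding bytes compare equal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical struct whose first field is an id that should not take
 * part in the comparison. */
typedef struct Mapping {
    int id;
    int method;
    int pivots[4];
} Mapping;

/* Compare everything after the leading id field.
 * Returns 0 when the remaining contents are byte-wise identical. */
static int cmp_without_id(const Mapping *m1, const Mapping *m2)
{
    static_assert(offsetof(Mapping, id) == 0, "id must be the first field");
    const size_t skip = offsetof(Mapping, method);

    return memcmp((const char *)m1 + skip, (const char *)m2 + skip,
                  sizeof(Mapping) - skip);
}
```

The static assertion documents the layout assumption at compile time, so reordering the struct fields later fails the build instead of silently comparing the wrong bytes.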


[FFmpeg-devel] [PATCH 5/9] avcodec/dovi_rpuenc: add `flags` to ff_dovi_rpu_generate()

2024-06-24 Thread Niklas Haas

From: Niklas Haas 

Will be used to control compression, encapsulation etc.
---
 libavcodec/dovi_rpu.h| 2 +-
 libavcodec/dovi_rpuenc.c | 2 +-
 libavcodec/libaomenc.c   | 2 +-
 libavcodec/libsvtav1.c   | 2 +-
 libavcodec/libx265.c | 3 ++-
 5 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/libavcodec/dovi_rpu.h b/libavcodec/dovi_rpu.h
index 205d16ffbc..65a4529106 100644
--- a/libavcodec/dovi_rpu.h
+++ b/libavcodec/dovi_rpu.h
@@ -135,7 +135,7 @@ int ff_dovi_configure(DOVIContext *s, AVCodecContext 
*avctx);
  * including the EMDF header (profile 10) or NAL encapsulation (otherwise).
  */
 int ff_dovi_rpu_generate(DOVIContext *s, const AVDOVIMetadata *metadata,
- uint8_t **out_rpu, int *out_size);
+ int flags, uint8_t **out_rpu, int *out_size);
 
 
 /***
diff --git a/libavcodec/dovi_rpuenc.c b/libavcodec/dovi_rpuenc.c
index f10e175350..6bfb39a7ea 100644
--- a/libavcodec/dovi_rpuenc.c
+++ b/libavcodec/dovi_rpuenc.c
@@ -446,7 +446,7 @@ static void generate_ext_v2(PutBitContext *pb, const 
AVDOVIDmData *dm)
 }
 
 int ff_dovi_rpu_generate(DOVIContext *s, const AVDOVIMetadata *metadata,
- uint8_t **out_rpu, int *out_size)
+ int flags, uint8_t **out_rpu, int *out_size)
 {
 PutBitContext *pb = &(PutBitContext){0};
 const AVDOVIRpuDataHeader *hdr;
diff --git a/libavcodec/libaomenc.c b/libavcodec/libaomenc.c
index dec74ebecd..aa51c89e29 100644
--- a/libavcodec/libaomenc.c
+++ b/libavcodec/libaomenc.c
@@ -1294,7 +1294,7 @@ FF_ENABLE_DEPRECATION_WARNINGS
 const AVDOVIMetadata *metadata = (const AVDOVIMetadata *)sd->data;
 uint8_t *t35;
 int size;
-if ((res = ff_dovi_rpu_generate(&ctx->dovi, metadata, &t35, &size)) < 0)
+if ((res = ff_dovi_rpu_generate(&ctx->dovi, metadata, 0, &t35, &size)) < 0)
 return res;
 res = aom_img_add_metadata(rawimg, OBU_METADATA_TYPE_ITUT_T35,
t35, size, AOM_MIF_ANY_FRAME);
diff --git a/libavcodec/libsvtav1.c b/libavcodec/libsvtav1.c
index 2fef8c8971..b6db63fd7a 100644
--- a/libavcodec/libsvtav1.c
+++ b/libavcodec/libsvtav1.c
@@ -541,7 +541,7 @@ static int eb_send_frame(AVCodecContext *avctx, const 
AVFrame *frame)
 const AVDOVIMetadata *metadata = (const AVDOVIMetadata *)sd->data;
 uint8_t *t35;
 int size;
-if ((ret = ff_dovi_rpu_generate(&svt_enc->dovi, metadata, &t35, &size)) < 0)
+if ((ret = ff_dovi_rpu_generate(&svt_enc->dovi, metadata, 0, &t35, &size)) < 0)
 return ret;
 ret = svt_add_metadata(headerPtr, EB_AV1_METADATA_TYPE_ITUT_T35, t35, size);
 av_free(t35);
diff --git a/libavcodec/libx265.c b/libavcodec/libx265.c
index 0dc7ab6eeb..4302c3d587 100644
--- a/libavcodec/libx265.c
+++ b/libavcodec/libx265.c
@@ -783,7 +783,8 @@ static int libx265_encode_frame(AVCodecContext *avctx, 
AVPacket *pkt,
 sd = av_frame_get_side_data(pic, AV_FRAME_DATA_DOVI_METADATA);
 if (ctx->dovi.cfg.dv_profile && sd) {
 const AVDOVIMetadata *metadata = (const AVDOVIMetadata *)sd->data;
-ret = ff_dovi_rpu_generate(&ctx->dovi, metadata, &x265pic.rpu.payload,
+ret = ff_dovi_rpu_generate(&ctx->dovi, metadata, 0,
+   &x265pic.rpu.payload,
&x265pic.rpu.payloadSize);
 if (ret < 0) {
 free_picture(ctx, &x265pic);
-- 
2.45.1


[FFmpeg-devel] [PATCH 6/9] avcodec/dovi_rpuenc: make encapsulation optional

2024-06-24 Thread Niklas Haas

From: Niklas Haas 

And move the choice of desired container to `flags`. This is needed to
handle differing API requirements (e.g. libx265 requires the NAL RBSP,
but CBS BSF requires the unescaped bytes).
---
 libavcodec/dovi_rpu.h| 16 ++--
 libavcodec/dovi_rpuenc.c | 22 ++
 libavcodec/libaomenc.c   |  3 ++-
 libavcodec/libsvtav1.c   |  3 ++-
 libavcodec/libx265.c |  2 +-
 5 files changed, 25 insertions(+), 21 deletions(-)

diff --git a/libavcodec/dovi_rpu.h b/libavcodec/dovi_rpu.h
index 65a4529106..226a769bff 100644
--- a/libavcodec/dovi_rpu.h
+++ b/libavcodec/dovi_rpu.h
@@ -123,16 +123,20 @@ int ff_dovi_attach_side_data(DOVIContext *s, AVFrame 
*frame);
  */
 int ff_dovi_configure(DOVIContext *s, AVCodecContext *avctx);
 
+enum {
+FF_DOVI_WRAP_NAL= 1 << 0, ///< wrap inside NAL RBSP
+FF_DOVI_WRAP_T35= 1 << 1, ///< wrap inside T.35+EMDF
+};
+
 /**
- * Synthesize a Dolby Vision RPU reflecting the current state. Note that this
- * assumes all previous calls to `ff_dovi_rpu_generate` have been appropriately
- * signalled, i.e. it will not re-send already transmitted redundant data.
+ * Synthesize a Dolby Vision RPU reflecting the current state. By default, the
+ * RPU is not encapsulated (see `flags` for more options). Note that this
+ * assumes all previous calls to `ff_dovi_rpu_generate` have been
+ * appropriately signalled, i.e. it will not re-send already transmitted
+ * redundant data.
  *
  * Mutates the internal state of DOVIContext to reflect the change.
  * Returns 0 or a negative error code.
- *
- * This generates a fully formed RPU ready for inclusion in the bitstream,
- * including the EMDF header (profile 10) or NAL encapsulation (otherwise).
  */
 int ff_dovi_rpu_generate(DOVIContext *s, const AVDOVIMetadata *metadata,
  int flags, uint8_t **out_rpu, int *out_size);
diff --git a/libavcodec/dovi_rpuenc.c b/libavcodec/dovi_rpuenc.c
index 6bfb39a7ea..41080521e1 100644
--- a/libavcodec/dovi_rpuenc.c
+++ b/libavcodec/dovi_rpuenc.c
@@ -710,9 +710,7 @@ int ff_dovi_rpu_generate(DOVIContext *s, const 
AVDOVIMetadata *metadata,
 flush_put_bits(pb);
 
 rpu_size = put_bytes_output(pb);
-switch (s->cfg.dv_profile) {
-case 10:
-/* AV1 uses T.35 OBU with EMDF header */
+if (flags & FF_DOVI_WRAP_T35) {
 *out_rpu = av_malloc(rpu_size + 15);
 if (!*out_rpu)
 return AVERROR(ENOMEM);
@@ -739,10 +737,8 @@ int ff_dovi_rpu_generate(DOVIContext *s, const 
AVDOVIMetadata *metadata,
 flush_put_bits(pb);
 *out_size = put_bytes_output(pb);
 return 0;
-
-case 5:
-case 8:
-*out_rpu = dst = av_malloc(1 + rpu_size * 3 / 2); /* worst case */
+} else if (flags & FF_DOVI_WRAP_NAL) {
+*out_rpu = dst = av_malloc(4 + rpu_size * 3 / 2); /* worst case */
 if (!*out_rpu)
 return AVERROR(ENOMEM);
 *dst++ = 25; /* NAL prefix */
@@ -765,10 +761,12 @@ int ff_dovi_rpu_generate(DOVIContext *s, const 
AVDOVIMetadata *metadata,
 }
 *out_size = dst - *out_rpu;
 return 0;
-
-default:
-/* Should be unreachable */
-av_assert0(0);
-return AVERROR_BUG;
+} else {
+/* Return intermediate buffer directly */
+*out_rpu = s->rpu_buf;
+*out_size = rpu_size;
+s->rpu_buf = NULL;
+s->rpu_buf_sz = 0;
+return 0;
 }
 }
diff --git a/libavcodec/libaomenc.c b/libavcodec/libaomenc.c
index aa51c89e29..fd9bea2505 100644
--- a/libavcodec/libaomenc.c
+++ b/libavcodec/libaomenc.c
@@ -1294,7 +1294,8 @@ FF_ENABLE_DEPRECATION_WARNINGS
 const AVDOVIMetadata *metadata = (const AVDOVIMetadata *)sd->data;
 uint8_t *t35;
 int size;
-if ((res = ff_dovi_rpu_generate(&ctx->dovi, metadata, 0, &t35, &size)) < 0)
+if ((res = ff_dovi_rpu_generate(&ctx->dovi, metadata, FF_DOVI_WRAP_T35,
+&t35, &size)) < 0)
 return res;
 res = aom_img_add_metadata(rawimg, OBU_METADATA_TYPE_ITUT_T35,
t35, size, AOM_MIF_ANY_FRAME);
diff --git a/libavcodec/libsvtav1.c b/libavcodec/libsvtav1.c
index b6db63fd7a..e7b12fb488 100644
--- a/libavcodec/libsvtav1.c
+++ b/libavcodec/libsvtav1.c
@@ -541,7 +541,8 @@ static int eb_send_frame(AVCodecContext *avctx, const 
AVFrame *frame)
 const AVDOVIMetadata *metadata = (const AVDOVIMetadata *)sd->data;
 uint8_t *t35;
 int size;
-if ((ret = ff_dovi_rpu_generate(&svt_enc->dovi, metadata, 0, &t35, &size)) < 0)
+if ((ret = ff_dovi_rpu_generate(&svt_enc->dovi, metadata, FF_DOVI_WRAP_T35,
+&t35, &size)) < 0)
 return ret;
 ret = svt_add_metadata(headerPtr, EB_AV1_METADATA_TYPE_ITUT_T35, t35, size);
 av_free(t35);
diff --git a/libavcodec/libx265.c b/libavcodec/libx265.c
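
The 3/2 factor in the worst-case allocation for the NAL path corresponds to one inserted byte for every two payload bytes. A generic sketch of emulation-prevention escaping, as used in H.264/HEVC-style byte streams, shows where that bound comes from. `escape_payload` is a hypothetical helper written for this note and is not claimed to match the loop in dovi_rpuenc.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Escape `size` payload bytes into dst, inserting an emulation prevention
 * byte (0x03) whenever two zero bytes would otherwise be followed by a
 * byte <= 3. dst must have room for size * 3 / 2 + 1 bytes.
 * Returns the number of bytes written. */
static size_t escape_payload(uint8_t *dst, const uint8_t *src, size_t size)
{
    size_t out = 0;
    int zero_run = 0;   /* zero bytes written since the last non-zero byte or escape */

    for (size_t i = 0; i < size; i++) {
        if (zero_run == 2 && src[i] <= 3) {
            dst[out++] = 3;
            zero_run = 0;
        }
        dst[out++] = src[i];
        zero_run = src[i] ? 0 : zero_run + 1;
    }
    return out;
}
```

An all-zero payload is the worst case: after the first two bytes, every second input byte triggers an escape, so four zero bytes become five output bytes and the output never exceeds 3/2 of the input.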

[FFmpeg-devel] [PATCH 7/9] avcodec/dovi_rpuenc: disable metadata compression by default

2024-06-24 Thread Niklas Haas

From: Niklas Haas 

Keyframes must reset the metadata compression state, so we cannot enable
metadata compression inside the encoders. Solve this by adding a new
flag, rather than removing it entirely, because I plan on adding
a bitstream filter for metadata compression.
---
 libavcodec/dovi_rpu.h|  3 +++
 libavcodec/dovi_rpuenc.c | 26 ++
 2 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/libavcodec/dovi_rpu.h b/libavcodec/dovi_rpu.h
index 226a769bff..f0d9c24379 100644
--- a/libavcodec/dovi_rpu.h
+++ b/libavcodec/dovi_rpu.h
@@ -126,6 +126,9 @@ int ff_dovi_configure(DOVIContext *s, AVCodecContext 
*avctx);
 enum {
 FF_DOVI_WRAP_NAL= 1 << 0, ///< wrap inside NAL RBSP
 FF_DOVI_WRAP_T35= 1 << 1, ///< wrap inside T.35+EMDF
+
+FF_DOVI_COMPRESS_VDR= 1 << 2, ///< enable VDR RPU compression
+FF_DOVI_COMPRESS_ALL= FF_DOVI_COMPRESS_VDR,
 };
 
 /**
diff --git a/libavcodec/dovi_rpuenc.c b/libavcodec/dovi_rpuenc.c
index 41080521e1..08170a9e84 100644
--- a/libavcodec/dovi_rpuenc.c
+++ b/libavcodec/dovi_rpuenc.c
@@ -21,6 +21,7 @@
  */
 
 #include 
+#include 
 
 #include "libavutil/avassert.h"
 #include "libavutil/crc.h"
@@ -452,9 +453,10 @@ int ff_dovi_rpu_generate(DOVIContext *s, const 
AVDOVIMetadata *metadata,
 const AVDOVIRpuDataHeader *hdr;
 const AVDOVIDataMapping *mapping;
 const AVDOVIColorMetadata *color;
-int vdr_dm_metadata_present, vdr_rpu_id, use_prev_vdr_rpu, profile,
+int vdr_dm_metadata_present, vdr_rpu_id, profile,
 buffer_size, rpu_size, pad, zero_run;
 int num_ext_blocks_v1, num_ext_blocks_v2;
+int use_prev_vdr_rpu = false;
 uint32_t crc;
 uint8_t *dst;
 if (!metadata) {
@@ -475,12 +477,21 @@ int ff_dovi_rpu_generate(DOVIContext *s, const 
AVDOVIMetadata *metadata,
 }
 
 vdr_rpu_id = mapping->vdr_rpu_id;
-for (int i = 0; i <= DOVI_MAX_DM_ID; i++) {
-if (s->vdr[i] && !cmp_data_mapping(s->vdr[i], mapping)) {
-vdr_rpu_id = i;
-break;
-} else if (s->vdr[vdr_rpu_id] && !s->vdr[i]) {
-vdr_rpu_id = i;
+if (flags & FF_DOVI_COMPRESS_VDR) {
+for (int i = 0; i <= DOVI_MAX_DM_ID; i++) {
+if (s->vdr[i] && !cmp_data_mapping(s->vdr[i], mapping)) {
+use_prev_vdr_rpu = true;
+vdr_rpu_id = i;
+break;
+} else if (s->vdr[vdr_rpu_id] && !s->vdr[i]) {
+vdr_rpu_id = i;
+}
+}
+} else {
+/* Flush VDRs to avoid leaking old state after keyframe */
+for (int i = 0; i <= DOVI_MAX_DM_ID; i++) {
+if (i != vdr_rpu_id)
+ff_refstruct_unref(&s->vdr[i]);
 }
 }
 
@@ -524,7 +535,6 @@ int ff_dovi_rpu_generate(DOVIContext *s, const AVDOVIMetadata *metadata,
 }
 
 vdr_dm_metadata_present = memcmp(color, &ff_dovi_color_default, sizeof(*color));
-use_prev_vdr_rpu = !memcmp(s->vdr[vdr_rpu_id], mapping, sizeof(*mapping));
 if (num_ext_blocks_v1 || num_ext_blocks_v2)
 vdr_dm_metadata_present = 1;
 
-- 
2.45.1

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
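
The slot selection that the patch above gates behind FF_DOVI_COMPRESS_VDR can be sketched in isolation. This is a simplified model, not the encoder's code: `Slot`, `pick_slot`, `MAX_ID` and `COMPRESS_VDR` are hypothetical stand-ins for s->vdr[], the loop in ff_dovi_rpu_generate(), DOVI_MAX_DM_ID and the new flag. A mapping is reduced to a single int, and storing it into the chosen slot at the end is an assumption about what the surrounding function does.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_ID        15        /* hypothetical stand-in for DOVI_MAX_DM_ID */
#define COMPRESS_VDR  (1 << 2)  /* mirrors the value of the new flag */

/* Hypothetical slot: "used" models a non-NULL s->vdr[i]. */
typedef struct Slot {
    bool used;
    int  payload;
} Slot;

/* Returns the slot id to signal; *use_prev is set when an identical
 * mapping is already stored and can be referenced instead of resent. */
static int pick_slot(Slot slots[MAX_ID + 1], int wanted_id, int payload,
                     int flags, bool *use_prev)
{
    int id = wanted_id;

    assert(wanted_id >= 0 && wanted_id <= MAX_ID);
    *use_prev = false;

    if (flags & COMPRESS_VDR) {
        for (int i = 0; i <= MAX_ID; i++) {
            if (slots[i].used && slots[i].payload == payload) {
                /* identical mapping already known: reuse it */
                *use_prev = true;
                id = i;
                break;
            } else if (slots[id].used && !slots[i].used) {
                /* wanted slot is taken: move to a free one */
                id = i;
            }
        }
    } else {
        /* no compression: drop every other slot so that no old
         * state survives, as a keyframe requires */
        for (int i = 0; i <= MAX_ID; i++)
            if (i != id)
                slots[i].used = false;
    }

    /* assumption: the caller stores the mapping in the chosen slot */
    slots[id].used    = true;
    slots[id].payload = payload;
    return id;
}
```

With the flag set, a second call with the same payload reports use_prev as true. Without it, every call starts from a flushed table.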

[FFmpeg-devel] [PATCH 8/9] avcodec/dovi_rpu: add ff_dovi_get_metadata()

2024-06-24 Thread Niklas Haas

From: Niklas Haas 

Provides direct access to the AVDOVIMetadata without having to attach it
to a frame.
---
 libavcodec/dovi_rpu.h|  9 +
 libavcodec/dovi_rpudec.c | 40 +++-
 2 files changed, 36 insertions(+), 13 deletions(-)

diff --git a/libavcodec/dovi_rpu.h b/libavcodec/dovi_rpu.h
index f0d9c24379..10d5c7f566 100644
--- a/libavcodec/dovi_rpu.h
+++ b/libavcodec/dovi_rpu.h
@@ -108,8 +108,17 @@ void ff_dovi_ctx_flush(DOVIContext *s);
 int ff_dovi_rpu_parse(DOVIContext *s, const uint8_t *rpu, size_t rpu_size,
   int err_recognition);
 
+/**
+ * Get the decoded AVDOVIMetadata. Ownership passes to the caller.
+ *
+ * Returns the size of *out_metadata, a negative error code, or 0 if no
+ * metadata is available to return.
+ */
+int ff_dovi_get_metadata(DOVIContext *s, AVDOVIMetadata **out_metadata);
+
 /**
  * Attach the decoded AVDOVIMetadata as side data to an AVFrame.
+ * Returns 0 or a negative error code.
  */
 int ff_dovi_attach_side_data(DOVIContext *s, AVFrame *frame);
 
diff --git a/libavcodec/dovi_rpudec.c b/libavcodec/dovi_rpudec.c
index 375e6e560b..e8c25e9f3b 100644
--- a/libavcodec/dovi_rpudec.c
+++ b/libavcodec/dovi_rpudec.c
@@ -30,10 +30,8 @@
 #include "get_bits.h"
 #include "refstruct.h"
 
-int ff_dovi_attach_side_data(DOVIContext *s, AVFrame *frame)
+int ff_dovi_get_metadata(DOVIContext *s, AVDOVIMetadata **out_metadata)
 {
-AVFrameSideData *sd;
-AVBufferRef *buf;
 AVDOVIMetadata *dovi;
 size_t dovi_size, ext_sz;
 
@@ -44,7 +42,32 @@ int ff_dovi_attach_side_data(DOVIContext *s, AVFrame *frame)
 if (!dovi)
 return AVERROR(ENOMEM);
 
-buf = av_buffer_create((uint8_t *) dovi, dovi_size, NULL, NULL, 0);
+/* Copy only the parts of these structs known to us at compiler-time. */
+#define COPY(t, a, b, last) memcpy(a, b, offsetof(t, last) + sizeof((b)->last))
+COPY(AVDOVIRpuDataHeader, av_dovi_get_header(dovi), &s->header, ext_mapping_idc_5_7);
+COPY(AVDOVIDataMapping, av_dovi_get_mapping(dovi), s->mapping, nlq_pivots);
+COPY(AVDOVIColorMetadata, av_dovi_get_color(dovi), s->color, source_diagonal);
+ext_sz = FFMIN(sizeof(AVDOVIDmData), dovi->ext_block_size);
+for (int i = 0; i < s->num_ext_blocks; i++)
+memcpy(av_dovi_get_ext(dovi, i), &s->ext_blocks[i], ext_sz);
+dovi->num_ext_blocks = s->num_ext_blocks;
+
+*out_metadata = dovi;
+return dovi_size;
+}
+
+int ff_dovi_attach_side_data(DOVIContext *s, AVFrame *frame)
+{
+AVFrameSideData *sd;
+AVDOVIMetadata *dovi;
+AVBufferRef *buf;
+int size;
+
+size = ff_dovi_get_metadata(s, &dovi);
+if (size <= 0)
+return size;
+
+buf = av_buffer_create((uint8_t *) dovi, size, NULL, NULL, 0);
 if (!buf) {
 av_free(dovi);
 return AVERROR(ENOMEM);
@@ -56,15 +79,6 @@ int ff_dovi_attach_side_data(DOVIContext *s, AVFrame *frame)
 return AVERROR(ENOMEM);
 }
 
-/* Copy only the parts of these structs known to us at compiler-time. */
-#define COPY(t, a, b, last) memcpy(a, b, offsetof(t, last) + sizeof((b)->last))
-COPY(AVDOVIRpuDataHeader, av_dovi_get_header(dovi), &s->header, ext_mapping_idc_5_7);
-COPY(AVDOVIDataMapping, av_dovi_get_mapping(dovi), s->mapping, nlq_pivots);
-COPY(AVDOVIColorMetadata, av_dovi_get_color(dovi), s->color, source_diagonal);
-ext_sz = FFMIN(sizeof(AVDOVIDmData), dovi->ext_block_size);
-for (int i = 0; i < s->num_ext_blocks; i++)
-memcpy(av_dovi_get_ext(dovi, i), &s->ext_blocks[i], ext_sz);
-dovi->num_ext_blocks = s->num_ext_blocks;
 return 0;
 }
 
-- 
2.45.1
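
The COPY() macro moved by this patch copies a struct only up to and including its last field known at compile time. A minimal standalone sketch of that idiom follows. `HeaderV1`, `HeaderV2`, `COPY_PREFIX` and `copy_known_fields` are hypothetical names for illustration; the real code applies the same pattern to the AVDOVI* structs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout the copying code was compiled against. */
typedef struct HeaderV1 {
    int a;
    int b;
} HeaderV1;

/* Hypothetical newer layout of the destination, with a field appended. */
typedef struct HeaderV2 {
    int a;
    int b;
    int c;
} HeaderV2;

/* Copy everything from the start of the struct through field "last". */
#define COPY_PREFIX(type, dst, src, last) \
    memcpy((dst), (src), offsetof(type, last) + sizeof((src)->last))

static void copy_known_fields(HeaderV2 *dst, const HeaderV1 *src)
{
    assert(dst && src);
    /* Only a and b are touched; dst->c keeps its previous value. */
    COPY_PREFIX(HeaderV1, dst, src, b);
}
```

Fields appended after the named last member are left alone, so the destination can grow without the copy overrunning the source.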


[FFmpeg-devel] [PATCH 9/9] avcodec/bsf/dovi_rpu: add new bitstream filter

2024-06-24 Thread Niklas Haas

From: Niklas Haas 

This can be used to strip dovi metadata, or enable/disable dovi
metadata compression. Possibly more use cases in the future.
---
 configure  |   1 +
 doc/bitstream_filters.texi |  21 +++
 libavcodec/bitstream_filters.c |   1 +
 libavcodec/bsf/Makefile|   1 +
 libavcodec/bsf/dovi_rpu.c  | 258 +
 5 files changed, 282 insertions(+)
 create mode 100644 libavcodec/bsf/dovi_rpu.c

diff --git a/configure b/configure
index 3bca638459..32076079e7 100755
--- a/configure
+++ b/configure
@@ -3437,6 +3437,7 @@ aac_adtstoasc_bsf_select="adts_header mpeg4audio"
 av1_frame_merge_bsf_select="cbs_av1"
 av1_frame_split_bsf_select="cbs_av1"
 av1_metadata_bsf_select="cbs_av1"
+dovi_rpu_bsf_select="cbs_h265 cbs_av1 dovi_rpudec dovi_rpuenc"
 dts2pts_bsf_select="cbs_h264 h264parse"
 eac3_core_bsf_select="ac3_parser"
 evc_frame_merge_bsf_select="evcparse"
diff --git a/doc/bitstream_filters.texi b/doc/bitstream_filters.texi
index c03f04f858..918735e8c5 100644
--- a/doc/bitstream_filters.texi
+++ b/doc/bitstream_filters.texi
@@ -101,6 +101,27 @@ Remove zero padding at the end of a packet.
 Extract the core from a DCA/DTS stream, dropping extensions such as
 DTS-HD.
 
+@section dovi_rpu
+
+Manipulate Dolby Vision metadata in a HEVC/AV1 bitstream, optionally enabling
+metadata compression.
+
+@table @option
+@item strip
+If enabled, strip all Dolby Vision metadata (configuration record + RPU data
+blocks) from the stream.
+@item compression
+A bit mask of compression methods to enable.
+@table @samp
+@item none
+No compression. Selected automatically for keyframes.
+@item vdr
+Compress VDR metadata (color reshaping / data mapping parameters).
+@item all
+Enable all implemented compression methods. This is the default.
+@end table
+@end table
+
 @section dump_extra
 
 Add extradata to the beginning of the filtered packets except when
diff --git a/libavcodec/bitstream_filters.c b/libavcodec/bitstream_filters.c
index 138246c50e..f923411bee 100644
--- a/libavcodec/bitstream_filters.c
+++ b/libavcodec/bitstream_filters.c
@@ -31,6 +31,7 @@ extern const FFBitStreamFilter ff_av1_metadata_bsf;
 extern const FFBitStreamFilter ff_chomp_bsf;
 extern const FFBitStreamFilter ff_dump_extradata_bsf;
 extern const FFBitStreamFilter ff_dca_core_bsf;
+extern const FFBitStreamFilter ff_dovi_rpu_bsf;
 extern const FFBitStreamFilter ff_dts2pts_bsf;
 extern const FFBitStreamFilter ff_dv_error_marker_bsf;
 extern const FFBitStreamFilter ff_eac3_core_bsf;
diff --git a/libavcodec/bsf/Makefile b/libavcodec/bsf/Makefile
index fb70ad0c21..40b7fc6e9b 100644
--- a/libavcodec/bsf/Makefile
+++ b/libavcodec/bsf/Makefile
@@ -19,6 +19,7 @@ OBJS-$(CONFIG_H264_MP4TOANNEXB_BSF)   += bsf/h264_mp4toannexb.o
 OBJS-$(CONFIG_H264_REDUNDANT_PPS_BSF) += bsf/h264_redundant_pps.o
 OBJS-$(CONFIG_HAPQA_EXTRACT_BSF)  += bsf/hapqa_extract.o
 OBJS-$(CONFIG_HEVC_METADATA_BSF)  += bsf/h265_metadata.o
+OBJS-$(CONFIG_DOVI_RPU_BSF)   += bsf/dovi_rpu.o
 OBJS-$(CONFIG_HEVC_MP4TOANNEXB_BSF)   += bsf/hevc_mp4toannexb.o
 OBJS-$(CONFIG_IMX_DUMP_HEADER_BSF)+= bsf/imx_dump_header.o
 OBJS-$(CONFIG_MEDIA100_TO_MJPEGB_BSF) += bsf/media100_to_mjpegb.o
diff --git a/libavcodec/bsf/dovi_rpu.c b/libavcodec/bsf/dovi_rpu.c
new file mode 100644
index 00..c57c3d87dd
--- /dev/null
+++ b/libavcodec/bsf/dovi_rpu.c
@@ -0,0 +1,258 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/common.h"
+#include "libavutil/mem.h"
+#include "libavutil/opt.h"
+
+#include "bsf.h"
+#include "bsf_internal.h"
+#include "cbs.h"
+#include "cbs_bsf.h"
+#include "cbs_av1.h"
+#include "cbs_h265.h"
+#include "dovi_rpu.h"
+#include "h2645data.h"
+#include "h265_profile_level.h"
+#include "itut35.h"
+
+#include "hevc/hevc.h"
+
+typedef struct DoviRpuContext {
+CBSBSFContext common;
+DOVIContext dec;
+DOVIContext enc;
+
+int strip;
+int compression;
+} DoviRpuContext;
+
+static int update_rpu(AVBSFContext *bsf, const AVPacket *pkt, int flags,
+  const uint8_t *rpu, size_t rpu_size,
+  uint8_t **out_rpu, int *out_size)
+{
+DoviRpuContext *s = bsf->priv_data;
+AVDOVIMetadata *metadata = NULL

[FFmpeg-devel] [PATCH] fftools/ffplay_renderer: use correct NULL value for Vulkan type

2024-06-24 Thread Timo Rothenpieler

---
 fftools/ffplay_renderer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fftools/ffplay_renderer.c b/fftools/ffplay_renderer.c
index 80b700b3c5..f272cb46f1 100644
--- a/fftools/ffplay_renderer.c
+++ b/fftools/ffplay_renderer.c
@@ -766,7 +766,7 @@ static void destroy(VkRenderer *renderer)
 vkDestroySurfaceKHR = (PFN_vkDestroySurfaceKHR)
 ctx->get_proc_addr(ctx->inst, "vkDestroySurfaceKHR");
 vkDestroySurfaceKHR(ctx->inst, ctx->vk_surface, NULL);
-ctx->vk_surface = NULL;
+ctx->vk_surface = VK_NULL_HANDLE;
 }
 
 av_buffer_unref(&ctx->hw_device_ref);
-- 
2.44.2
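
The patch carries no explanation, so the following is only a guess at the motivation. On targets where Vulkan defines non-dispatchable handles as 64-bit integers rather than pointers, assigning the pointer constant NULL to such a handle is a type mismatch. The sketch models that case without the Vulkan headers; `FakeSurface`, `FAKE_NULL_HANDLE`, `FakeCtx` and `reset_surface` are hypothetical stand-ins for VkSurfaceKHR, VK_NULL_HANDLE and the renderer context.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a non-dispatchable handle on a target where
 * it is an integer type, not a pointer. */
typedef uint64_t FakeSurface;

/* Hypothetical stand-in for VK_NULL_HANDLE: an integer zero. */
#define FAKE_NULL_HANDLE 0

typedef struct FakeCtx {
    FakeSurface surface;
} FakeCtx;

static void reset_surface(FakeCtx *ctx)
{
    assert(ctx != NULL);
    /* "ctx->surface = NULL;" would assign a pointer constant to an
     * integer here, which compilers diagnose. */
    ctx->surface = FAKE_NULL_HANDLE;
}
```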


[FFmpeg-devel] [PATCH] Allow enabling SVC in libaomenc

2024-06-24 Thread Chun-Min Chang

This patch updates libaomenc.c to accept parameters for SVC (Scalable
Video Coding) settings via the FFmpeg API `av_opt_set`. The SVC
configuration is applied based on the provided parameters. As libaom's
SVC functionality only operates with constant bitrate encoding [1],
these parameters will only take effect when the bitrate is set to
constant.

[1] https://aomedia.googlesource.com/aom/+/a7ef80c44bfb34b08254194b1ab72d4e93ff4b07/av1/encoder/svc_layercontext.h#115
---
 libavcodec/libaomenc.c | 75 ++
 1 file changed, 75 insertions(+)

diff --git a/libavcodec/libaomenc.c b/libavcodec/libaomenc.c
index dec74ebecd..a8602a6b56 100644
--- a/libavcodec/libaomenc.c
+++ b/libavcodec/libaomenc.c
@@ -30,6 +30,7 @@
 #include 
 
 #include "libavutil/avassert.h"
+#include "libavutil/avstring.h"
 #include "libavutil/base64.h"
 #include "libavutil/common.h"
 #include "libavutil/cpu.h"
@@ -137,6 +138,7 @@ typedef struct AOMEncoderContext {
 int enable_diff_wtd_comp;
 int enable_dist_wtd_comp;
 int enable_dual_filter;
+AVDictionary *svc_parameters;
 AVDictionary *aom_params;
 } AOMContext;
 
@@ -201,6 +203,7 @@ static const char *const ctlidstr[] = {
 [AV1E_GET_TARGET_SEQ_LEVEL_IDX] = "AV1E_GET_TARGET_SEQ_LEVEL_IDX",
 #endif
 [AV1_GET_NEW_FRAME_IMAGE]   = "AV1_GET_NEW_FRAME_IMAGE",
+[AV1E_SET_SVC_PARAMS]   = "AV1E_SET_SVC_PARAMS",
 };
 
 static av_cold void log_encoder_error(AVCodecContext *avctx, const char *desc)
@@ -382,6 +385,31 @@ static av_cold int codecctl_imgp(AVCodecContext *avctx,
 return 0;
 }
 
+static av_cold int codecctl_svcp(AVCodecContext *avctx,
+#ifdef UENUM1BYTE
+ aome_enc_control_id id,
+#else
+ enum aome_enc_control_id id,
+#endif
+ aom_svc_params_t *svc_params)
+{
+AOMContext *ctx = avctx->priv_data;
+char buf[80];
+int res;
+
+snprintf(buf, sizeof(buf), "%s:", ctlidstr[id]);
+
+res = aom_codec_control(&ctx->encoder, id, svc_params);
+if (res != AOM_CODEC_OK) {
+snprintf(buf, sizeof(buf), "Failed to set %s codec control",
+ ctlidstr[id]);
+log_encoder_error(avctx, buf);
+return AVERROR(EINVAL);
+}
+
+return 0;
+}
+
 static av_cold int aom_free(AVCodecContext *avctx)
 {
 AOMContext *ctx = avctx->priv_data;
@@ -673,6 +701,18 @@ static int choose_tiling(AVCodecContext *avctx,
 return 0;
 }
 
+static void aom_svc_parse_int_array(int *dest, char *value, int max_entries)
+{
+int dest_idx = 0;
+char *saveptr = NULL;
+char *token = av_strtok(value, ",", &saveptr);
+
+while (token && dest_idx < max_entries) {
+dest[dest_idx++] = strtoul(token, NULL, 10);
+token = av_strtok(NULL, ",", &saveptr);
+}
+}
+
 static av_cold int aom_init(AVCodecContext *avctx,
 const struct aom_codec_iface *iface)
 {
@@ -968,6 +1008,40 @@ static av_cold int aom_init(AVCodecContext *avctx,
 if (ctx->enable_intrabc >= 0)
 codecctl_int(avctx, AV1E_SET_ENABLE_INTRABC, ctx->enable_intrabc);
 
+if (enccfg.rc_end_usage == AOM_CBR) {
+aom_svc_params_t svc_params = { 0 };
+svc_params.framerate_factor[0] = 1;
+svc_params.number_spatial_layers = 1;
+svc_params.number_temporal_layers = 1;
+
+const AVDictionaryEntry *en = NULL;
+while ((en = av_dict_iterate(ctx->svc_parameters, en))) {
+if (!strlen(en->value))
+return AVERROR(EINVAL);
+
+if (!strcmp(en->key, "number_spatial_layers"))
+svc_params.number_spatial_layers = strtoul(en->value, NULL, 10);
+else if (!strcmp(en->key, "number_temporal_layers"))
+svc_params.number_temporal_layers = strtoul(en->value, NULL, 10);
+else if (!strcmp(en->key, "max_quantizers"))
+aom_svc_parse_int_array(svc_params.max_quantizers, en->value, AOM_MAX_LAYERS);
+else if (!strcmp(en->key, "min_quantizers"))
+aom_svc_parse_int_array(svc_params.min_quantizers, en->value, AOM_MAX_LAYERS);
+else if (!strcmp(en->key, "scaling_factor_num"))
+aom_svc_parse_int_array(svc_params.scaling_factor_num, en->value, AOM_MAX_SS_LAYERS);
+else if (!strcmp(en->key, "scaling_factor_den"))
+aom_svc_parse_int_array(svc_params.scaling_factor_den, en->value, AOM_MAX_SS_LAYERS);
+else if (!strcmp(en->key, "layer_target_bitrate"))
+aom_svc_parse_int_array(svc_params.layer_target_bitrate, en->value, AOM_MAX_LAYERS);
+else if (!strcmp(en->key, "framerate_factor"))
+aom_svc_parse_int_array(svc_params.framerate_factor, en->value, AOM_MAX_TS_LAYERS);
+}
+
+res = codecctl_svcp(avctx, AV1E_SET_SVC_PARAMS, &svc_params);
+if (res < 0)
+return res;
+}
+
 #if AOM_ENC
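
The helper aom_svc_parse_int_array() in the patch above stores strtoul() results into an int array and skips malformed tokens silently. A possible stricter variant is sketched below. It is not the patch's code: `parse_int_list` is a hypothetical name, and the sketch scans with strtol() instead of av_strtok() so that it builds without FFmpeg and leaves the input string unmodified.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parse up to max_entries comma-separated decimal integers from value
 * into dest. Returns the number of entries stored, or -1 on a
 * malformed or out-of-range token. */
static int parse_int_list(int *dest, const char *value, int max_entries)
{
    const char *p = value;
    int n = 0;

    assert(dest && value && max_entries >= 0);

    while (*p && n < max_entries) {
        char *end;
        long v;

        errno = 0;
        v = strtol(p, &end, 10);
        if (end == p || errno == ERANGE || v < INT_MIN || v > INT_MAX)
            return -1;              /* empty token or overflow */
        dest[n++] = (int)v;

        if (*end == ',')
            p = end + 1;            /* step over the separator */
        else if (*end == '\0')
            p = end;                /* end of input */
        else
            return -1;              /* trailing garbage in the token */
    }
    return n;
}
```

Returning the entry count would let a caller reject lists that are shorter than the configured number of layers, which the patch as posted cannot detect.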

Re: [FFmpeg-devel] [PATCH 2/2] avformat/mov: default to Monoscopic view when parsing eyes box

2024-06-24 Thread Michael Niedermayer

On Sat, Jun 22, 2024 at 06:34:49PM -0300, James Almer wrote:
> On 6/22/2024 6:25 PM, Michael Niedermayer wrote:
> > On Fri, Jun 21, 2024 at 10:25:31PM -0300, James Almer wrote:
> > > Signed-off-by: James Almer 
> > > ---
> > >   libavformat/mov.c | 2 +-
> > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > doesn't apply automatically with "git am" with the v2
> > 
> > Applying: avformat/mov: default to Monoscopic view when parsing eyes box
> > error: sha1 information is lacking or useless (libavformat/mov.c).
> > error: could not build fake ancestor
> > Patch failed at 0001 avformat/mov: default to Monoscopic view when parsing eyes box
> > 
> > it applies with patch but inability to automatically apply patches
> > could affect tools which try to test patches posted
> > 
> > git am --show-current-patch=diff | patch -p1
> > patching file libavformat/mov.c
> > Hunk #1 succeeded at 6546 with fuzz 2.
> 
> Are you sure your tree is clean and up to date? There's no reason for this
> patch to not apply, standalone or after 1/1 v1 or v2.

the patch says this:
index 50e171c960..4fa39cf4fd 100644

git fetch origin
git fetch jamrial
git show 50e171c960
fatal: ambiguous argument '50e171c960': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'

git show 4fa39cf4fd
fatal: ambiguous argument '4fa39cf4fd': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'

So I think the blob this patch was based on is not in any repository known to
my git. It might be able to apply the patch anyway, but not having the full
file this patch is based on makes it harder for git. I did have other patches
applied, so git would try to merge this in and, if needed, produce conflict
markers, but given that it doesn't seem to have the file this was based on,
it freaked out.

thx

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

While the State exists there can be no freedom; when there is freedom there
will be no State. -- Vladimir Lenin


Re: [FFmpeg-devel] [PATCH v3 2/2] lavc/hevcdec: Update slice index before hwaccel decode slice

2024-06-24 Thread Hendrik Leppkes

On Mon, Jun 24, 2024 at 8:32 AM  wrote:
>
> From: Fei Wang 
>
> Otherwise, the slice index is never updated for hwaccel decoding, and every
> slice RPL, which is constructed using the slice index, overwrites the first one.
>
> Fixes hwaccel decoding after 47d34ba7fbb81
>
> Signed-off-by: Fei Wang 
> ---
> 1. Update commit message.
>
>  libavcodec/hevc/hevcdec.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/libavcodec/hevc/hevcdec.c b/libavcodec/hevc/hevcdec.c
> index 39beb7e4dc..8bb564f1b3 100644
> --- a/libavcodec/hevc/hevcdec.c
> +++ b/libavcodec/hevc/hevcdec.c
> @@ -2770,6 +2770,9 @@ static int decode_slice_data(HEVCContext *s, const H2645NAL *nal, GetBitContext
>  const HEVCPPS *pps = s->pps;
>  int ret;
>
> +if (!s->sh.first_slice_in_pic_flag)
> +s->slice_idx += !s->sh.dependent_slice_segment_flag;
> +
>  if (!s->sh.dependent_slice_segment_flag && s->sh.slice_type != HEVC_SLICE_I) {
>  ret = ff_hevc_slice_rpl(s);
>  if (ret < 0) {
> @@ -2807,8 +2810,6 @@ static int decode_slice_data(HEVCContext *s, const H2645NAL *nal, GetBitContext
>  s->local_ctx[0].tu.cu_qp_offset_cb = 0;
>  s->local_ctx[0].tu.cu_qp_offset_cr = 0;
>
> -s->slice_idx += !s->sh.dependent_slice_segment_flag;
> -
>  if (s->avctx->active_thread_type == FF_THREAD_SLICE  &&
>  s->sh.num_entry_point_offsets > 0&&
>  pps->num_tile_rows == 1 && pps->num_tile_columns == 1)
> --
> 2.34.1

I can confirm that this set fixes hwaccel with slices, LGTM from me.
Hopefully Anton can also quickly look over it; it's his changes
after all.

- Hendrik

Re: [FFmpeg-devel] [PATCH v4 2/4] lavc/vp9dsp: R-V V mc bilin hv

2024-06-24 Thread Rémi Denis-Courmont

On Saturday, 15 June 2024 at 14:50:32 EEST, u...@foxmail.com wrote:
> From: sunyuechi 
> 
>                                     C908     X60
> vp9_avg_bilin_4hv_8bpp_c       :    10.7     9.5
> vp9_avg_bilin_4hv_8bpp_rvv_i32 :     4.0     3.5
> vp9_avg_bilin_8hv_8bpp_c       :    38.5    34.2
> vp9_avg_bilin_8hv_8bpp_rvv_i32 :     7.2     6.5
> vp9_avg_bilin_16hv_8bpp_c      :   147.2   130.5
> vp9_avg_bilin_16hv_8bpp_rvv_i32:    14.5    12.7
> vp9_avg_bilin_32hv_8bpp_c      :   574.2   509.7
> vp9_avg_bilin_32hv_8bpp_rvv_i32:    42.5    38.0
> vp9_avg_bilin_64hv_8bpp_c      :  2321.2  2017.7
> vp9_avg_bilin_64hv_8bpp_rvv_i32:   163.5   131.0
> vp9_put_bilin_4hv_8bpp_c       :    10.0     8.7
> vp9_put_bilin_4hv_8bpp_rvv_i32 :     3.5     3.0
> vp9_put_bilin_8hv_8bpp_c       :    35.2    31.2
> vp9_put_bilin_8hv_8bpp_rvv_i32 :     6.5     5.7
> vp9_put_bilin_16hv_8bpp_c      :   134.0   119.0
> vp9_put_bilin_16hv_8bpp_rvv_i32:    12.7    11.5
> vp9_put_bilin_32hv_8bpp_c      :   538.5   464.2
> vp9_put_bilin_32hv_8bpp_rvv_i32:    39.7    35.2
> vp9_put_bilin_64hv_8bpp_c      :  2111.7  1833.2
> vp9_put_bilin_64hv_8bpp_rvv_i32:   138.5   122.5
> ---
>  libavcodec/riscv/vp9_mc_rvv.S  | 38 +-
>  libavcodec/riscv/vp9dsp_init.c | 10 +
>  2 files changed, 47 insertions(+), 1 deletion(-)
> 
> diff --git a/libavcodec/riscv/vp9_mc_rvv.S b/libavcodec/riscv/vp9_mc_rvv.S
> index fb7377048a..5241562531 100644
> --- a/libavcodec/riscv/vp9_mc_rvv.S
> +++ b/libavcodec/riscv/vp9_mc_rvv.S
> @@ -147,6 +147,40 @@ func ff_\op\()_vp9_bilin_64\type\()_rvv, zve32x
>  endfunc
>  .endm
> 
> +.macro bilin_hv op
> +func ff_\op\()_vp9_bilin_64hv_rvv, zve32x
> +vsetvlstatic8   64, t0, 64
> +.Lbilin_hv\op:
> +.ifc \op,avg
> +csrwi   vxrm, 0
> +.endif
> +neg t1, a5
> +neg t2, a6
> +li  t4, 8
> +bilin_load_h    v24, put, a5
> +add a2, a2, a3
> +1:
> +addi    a4, a4, -1
> +bilin_load_h    v4, put, a5
> +vwmulu.vx   v16, v4, a6
> +vwmaccsu.vx v16, t2, v24
> +vwadd.wx    v16, v16, t4
> +vnsra.wi    v16, v16, 4

Why round manually?
It looks like vnclip.wi would be more straightforward here.

> +vadd.vv v0, v16, v24
> +.ifc \op,avg
> +vle8.v  v16, (a0)
> +vaaddu.vv   v0, v0, v16
> +.endif
> +vse8.v  v0, (a0)
> +vmv.v.v v24, v4
> +add a2, a2, a3
> +add a0, a0, a1
> +bnez    a4, 1b
> +
> +ret
> +endfunc
> +.endm
> +
>  .irp len, 64, 32, 16, 8, 4
>  copy_avg \len
>  .endr
> @@ -155,6 +189,8 @@ bilin_h_v  put, h, a5
>  bilin_h_v  avg, h, a5
>  bilin_h_v  put, v, a6
>  bilin_h_v  avg, v, a6
> +bilin_hv   put
> +bilin_hv   avg
> 
>  .macro func_bilin_h_v len, op, type
>  func ff_\op\()_vp9_bilin_\len\()\type\()_rvv, zve32x
> @@ -165,7 +201,7 @@ endfunc
> 
>  .irp len, 32, 16, 8, 4
>  .irp op, put, avg
> -.irp type, h, v
> +.irp type, h, v, hv
>  func_bilin_h_v \len, \op, \type
>  .endr
>  .endr
> diff --git a/libavcodec/riscv/vp9dsp_init.c b/libavcodec/riscv/vp9dsp_init.c
> index 9606d8545f..b3700dfb08 100644
> --- a/libavcodec/riscv/vp9dsp_init.c
> +++ b/libavcodec/riscv/vp9dsp_init.c
> @@ -83,6 +83,16 @@ static av_cold void vp9dsp_mc_init_riscv(VP9DSPContext *dsp, int bpp)
>  dsp->mc[4][FILTER_BILINEAR ][0][1][0] = ff_put_vp9_bilin_4h_rvv;
>  dsp->mc[4][FILTER_BILINEAR ][1][0][1] = ff_avg_vp9_bilin_4v_rvv;
>  dsp->mc[4][FILTER_BILINEAR ][1][1][0] = ff_avg_vp9_bilin_4h_rvv;
> +dsp->mc[0][FILTER_BILINEAR ][0][1][1] = ff_put_vp9_bilin_64hv_rvv;
> +dsp->mc[0][FILTER_BILINEAR ][1][1][1] = ff_avg_vp9_bilin_64hv_rvv;
> +dsp->mc[1][FILTER_BILINEAR ][0][1][1] = ff_put_vp9_bilin_32hv_rvv;
> +dsp->mc[1][FILTER_BILINEAR ][1][1][1] = ff_avg_vp9_bilin_32hv_rvv;
> +dsp->mc[2][FILTER_BILINEAR ][0][1][1] = ff_put_vp9_bilin_16hv_rvv;
> +dsp->mc[2][FILTER_BILINEAR ][1][1][1] = ff_avg_vp9_bilin_16hv_rvv;
> +dsp->mc[3][FILTER_BILINEAR ][0][1][1] = ff_put_vp9_bilin_8hv_rvv;
> +dsp->mc[3][FILTER_BILINEAR ][1][1][1] = ff_avg_vp9_bilin_8hv_rvv;
> +dsp->mc[4][FILTER_BILINEAR ][0][1][1] = ff_put_vp9_bilin_4hv_rvv;
> +dsp->mc[4][FILTER_BILINEAR ][1][1][1] = ff_avg_vp9_bilin_4hv_rvv;
> 
>  #undef init_fpel
>  }


-- 
Rémi Denis-Courmont
http://www.remlab.net/
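
The review question above ("Why round manually?") rests on the manual add-then-shift sequence matching a rounding right shift in round-to-nearest-up mode. A scalar model of that equivalence is sketched below, under the assumption that vxrm is 0 (round-to-nearest-up). `round_manual` and `round_rnu` are hypothetical helper names. The sketch ignores the saturation that vnclip additionally performs, and whether that saturation is acceptable in this function is not settled by the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Manual rounding as in the quoted code: add half of the divisor,
 * then shift right. Assumes the compiler implements >> on negative
 * values as an arithmetic shift, which is implementation-defined. */
static int32_t round_manual(int32_t x, unsigned shift)
{
    assert(shift >= 1 && shift < 31);
    return (x + ((int32_t)1 << (shift - 1))) >> shift;
}

/* Scalar model of a rounding shift in round-to-nearest-up mode:
 * shift right, then add the most significant bit that was shifted
 * out. */
static int32_t round_rnu(int32_t x, unsigned shift)
{
    assert(shift >= 1 && shift < 31);
    return (x >> shift) + ((x >> (shift - 1)) & 1);
}
```

For the shift of 4 used in the quoted loop, both helpers agree over the whole 16-bit range, so the explicit add of 8 before the shift carries no extra information.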




Re: [FFmpeg-devel] [PATCH 2/4] lavc/vp8dsp: R-V V loop_filter_simple

2024-06-24 Thread Rémi Denis-Courmont

On Saturday, 22 June 2024 at 18:58:04 EEST, u...@foxmail.com wrote:
> From: sunyuechi 
> 
>                                      C908     X60
> vp8_loop_filter_simple_h_c      :     7.0     6.0
> vp8_loop_filter_simple_h_rvv_i32:     3.2     2.7
> vp8_loop_filter_simple_v_c      :     7.2     6.5
> vp8_loop_filter_simple_v_rvv_i32:     1.7     1.2
> ---
>  libavcodec/riscv/vp8dsp_init.c | 18 ++-
>  libavcodec/riscv/vp8dsp_rvv.S  | 87 ++
>  2 files changed, 104 insertions(+), 1 deletion(-)
> 
> diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c
> index dcb6307d5b..8c5b2c8b04 100644
> --- a/libavcodec/riscv/vp8dsp_init.c
> +++ b/libavcodec/riscv/vp8dsp_init.c
> @@ -49,6 +49,9 @@ VP8_BILIN(16, rvv256);
>  VP8_BILIN(8,  rvv256);
>  VP8_BILIN(4,  rvv256);
> 
> +VP8_LF(rvv128);
> +VP8_LF(rvv256);
> +
>  av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c)
>  {
>  #if HAVE_RV
> @@ -147,9 +150,15 @@ av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c)
>  av_cold void ff_vp8dsp_init_riscv(VP8DSPContext *c)
>  {
>  #if HAVE_RVV
> +int vlenb = ff_get_rv_vlenb();
> +
> +#define init_loop_filter(vlen)   \
> +c->vp8_v_loop_filter_simple = ff_vp8_v_loop_filter16_simple_rvv##vlen; \
> +c->vp8_h_loop_filter_simple = ff_vp8_h_loop_filter16_simple_rvv##vlen;
> +
>  int flags = av_get_cpu_flags();
> 
> -if (flags & AV_CPU_FLAG_RVV_I32 && ff_rv_vlen_least(128)) {
> +if (flags & AV_CPU_FLAG_RVV_I32 && vlenb >= 16) {
>  #if __riscv_xlen >= 64
>  if (flags & AV_CPU_FLAG_RVV_I64)
>  c->vp8_luma_dc_wht = ff_vp8_luma_dc_wht_rvv;
> @@ -159,6 +168,13 @@ av_cold void ff_vp8dsp_init_riscv(VP8DSPContext *c)
>  c->vp8_idct_dc_add4y = ff_vp8_idct_dc_add4y_rvv;
>  if (flags & AV_CPU_FLAG_RVV_I64)
>  c->vp8_idct_dc_add4uv = ff_vp8_idct_dc_add4uv_rvv;
> +
> +if (vlenb >= 32) {
> +init_loop_filter(256);
> +} else {
> +init_loop_filter(128);
> +}
>  }
> +#undef init_loop_filter
>  #endif
>  }
> diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S
> index 0cbf1672f7..b5f8bb31b4 100644
> --- a/libavcodec/riscv/vp8dsp_rvv.S
> +++ b/libavcodec/riscv/vp8dsp_rvv.S
> @@ -275,6 +275,93 @@ func ff_vp78_idct_dc_add4uv_rvv, zve64x
>  ret
>  endfunc
> 
> +.macro filter_fmin len, vlen, a, f1, p0f2, q0f1
> +vsetvlstatic16  \len, \vlen
> +vsext.vf2   \q0f1, \a
> +vmin.vx \p0f2, \q0f1, a7
> +vmin.vx \q0f1, \q0f1, t3
> +vadd.vi \p0f2, \p0f2, 3
> +vadd.vi \q0f1, \q0f1, 4
> +vsra.vi \p0f2, \p0f2, 3
> +vsra.vi \f1,   \q0f1, 3

vssra.vi

> +vadd.vv \p0f2, \p0f2, v8
> +vsub.vv \q0f1, v16, \f1
> +vmax.vx \p0f2, \p0f2, zero
> +vmax.vx \q0f1, \q0f1, zero
> +.endm
> +
> +.macro filter len, vlen, type, normal, inner, dst, stride, fE, fI, thresh
> +.ifc \type,v
> +slli    a6, \stride, 1
> +sub t2, \dst, a6
> +add t4, \dst, \stride
> +sub t1, \dst, \stride
> +vle8.v  v1, (t2)
> +vle8.v  v11, (t4)
> +vle8.v  v17, (t1)
> +vle8.v  v22, (\dst)
> +.else
> +addi    t1, \dst, -1
> +addi    a6, \dst, -2
> +addi    t4, \dst, 1
> +vlse8.v v1, (a6), \stride
> +vlse8.v v11, (t4), \stride
> +vlse8.v v17, (t1), \stride
> +vlse8.v v22, (\dst), \stride

vlsseg4e8.v

> +.endif
> +vwsubu.vv   v12, v1, v11 // p1-q1
> +vwsubu.vv   v24, v22, v17// q0-p0
> +vnclip.wi   v23, v12, 0

I can't find where VXRM is initialised for that.

> +vsetvlstatic16  \len, \vlen
> +// vp8_simple_limit(dst + i, stride, flim)
> +li  a7, 2
> +vneg.v  v18, v12
> +vmax.vv v18, v18, v12
> +vneg.v  v8, v24
> +vmax.vv v8, v8, v24
> +vsrl.vi v18, v18, 1
> +vmacc.vx    v18, a7, v8
> +vmsleu.vx   v0, v18, \fE
> +
> +li  t5, 3
> +li  a7, 124
> +li  t3, 123
> +vmul.vx v30, v24, t5
> +vsext.vf2   v4, v23
> +vzext.vf2   v8, v17  // p0
> +vzext.vf2   v16, v22 // q0
> +vadd.vv v12, v30, v4

vwadd.wv

> +vsetvlstatic8   \len, \vlen
> +vnclip.wi   v11, v12, 0
> +filter_fmin \len, \vlen, v11, v24, v4, v6
> +vsetvlstatic8   \len, \vlen
> +vnclipu.wi  v4, v4, 0
> +
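
The vector sequence annotated "vp8_simple_limit(dst + i, stride, flim)" in the quoted patch computes an edge test that can be written per pixel in scalar form. The sketch below is a model of that test as read from the quoted instructions: absolute differences, a shift by one, and a multiply-accumulate by two, compared against the limit. `simple_limit` is a hypothetical name and the pixels are passed by value rather than loaded from a row.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Scalar model of the simple loop filter edge test:
 * filter only if 2*|p0 - q0| + (|p1 - q1| >> 1) <= flim.
 * p1, p0 are the two pixels before the edge, q0, q1 the two after. */
static int simple_limit(uint8_t p1, uint8_t p0, uint8_t q0, uint8_t q1,
                        int flim)
{
    assert(flim >= 0);
    return 2 * abs(p0 - q0) + (abs(p1 - q1) >> 1) <= flim;
}
```

The result corresponds to one lane of the mask that vmsleu.vx writes to v0 in the quoted code.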

Re: [FFmpeg-devel] [PATCH v2] lavu/stereo3d: change the horizontal FOV field to a rational

2024-06-24 Thread Lynne via ffmpeg-devel


On 24/06/2024 17:51, Derek Buitenhuis wrote:

> On 6/24/2024 1:13 AM, James Almer wrote:
> > If Derek is also ok with this then LGTM.
>
> I do not object.
>
> - Derek


Thanks for the reviews.
Pushed with the requested changes.



Re: [FFmpeg-devel] [PATCH] fftools/ffplay_renderer: use correct NULL value for Vulkan type

2024-06-24 Thread Lynne via ffmpeg-devel


On 24/06/2024 20:48, Timo Rothenpieler wrote:

> ---
>  fftools/ffplay_renderer.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fftools/ffplay_renderer.c b/fftools/ffplay_renderer.c
> index 80b700b3c5..f272cb46f1 100644
> --- a/fftools/ffplay_renderer.c
> +++ b/fftools/ffplay_renderer.c
> @@ -766,7 +766,7 @@ static void destroy(VkRenderer *renderer)
>  vkDestroySurfaceKHR = (PFN_vkDestroySurfaceKHR)
>  ctx->get_proc_addr(ctx->inst, "vkDestroySurfaceKHR");
>  vkDestroySurfaceKHR(ctx->inst, ctx->vk_surface, NULL);
> -ctx->vk_surface = NULL;
> +ctx->vk_surface = VK_NULL_HANDLE;
>  }
>
>  av_buffer_unref(&ctx->hw_device_ref);


Sure, LGTM
Thanks



Re: [FFmpeg-devel] [PATCH 2/2] configure: align conditional library deps assignments

2024-06-24 Thread Gyan Doshi





On 2024-06-21 04:18 pm, Gyan Doshi wrote:

---
  configure | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

Plan to improve commit messages and push set in 24h.



diff --git a/configure b/configure
index 1e58c0dbac..db11a78c74 100755
--- a/configure
+++ b/configure
@@ -7764,14 +7764,14 @@ enabled elbg_filter && prepend avfilter_deps "avcodec"
  enabled find_rect_filter&& prepend avfilter_deps "avformat avcodec"
  enabled fsync_filter&& prepend avfilter_deps "avformat"
  enabled mcdeint_filter  && prepend avfilter_deps "avcodec"
-enabled movie_filter&& prepend avfilter_deps "avformat avcodec"
+enabled movie_filter&& prepend avfilter_deps "avformat avcodec"
  enabled pan_filter  && prepend avfilter_deps "swresample"
  enabled pp_filter   && prepend avfilter_deps "postproc"
  enabled qrencode_filter && prepend avfilter_deps "swscale"
  enabled qrencodesrc_filter  && prepend avfilter_deps "swscale"
  enabled removelogo_filter   && prepend avfilter_deps "avformat avcodec swscale"
  enabled sab_filter  && prepend avfilter_deps "swscale"
-enabled scale_filter&& prepend avfilter_deps "swscale"
+enabled scale_filter&& prepend avfilter_deps "swscale"
  enabled scale2ref_filter&& prepend avfilter_deps "swscale"
  enabled showcqt_filter  && prepend avfilter_deps "avformat swscale"
  enabled signature_filter&& prepend avfilter_deps "avcodec avformat"

