Re: [FFmpeg-devel] [PATCH 1/2] lavf/dvenc: improve error messaging

2024-01-21 Thread Stefano Sabatini
On date Sunday 2024-01-21 07:36:48 +0100, Anton Khirnov wrote:
> Quoting Stefano Sabatini (2024-01-20 16:24:07)
> >  if ((c->sys->time_base.den != 25 && c->sys->time_base.den != 50) || 
> > c->sys->time_base.num != 1) {
> > -if (c->ast[0] && c->ast[0]->codecpar->sample_rate != 48000)
> > -goto bail_out;
> > -if (c->ast[1] && c->ast[1]->codecpar->sample_rate != 48000)
> > -goto bail_out;
> > +int j;
> 
> No need to declare a loop variable outside of the loop. Also, there's
> already i.

fixed locally

> > -if (((c->n_ast > 1) && (c->sys->n_difchan < 2)) ||
> > -((c->n_ast > 2) && (c->sys->n_difchan < 4))) {
> > -/* only 2 stereo pairs allowed in 50Mbps mode */
> > -goto bail_out;
> > +if ((c->n_ast > 1) && (c->sys->n_difchan < 2)) {
> > +av_log(s, AV_LOG_ERROR,
> > +   "Invalid number of channels %d, only 1 stereo pairs is 
> > allowed in 25Mps mode.\n",
> > +   c->n_ast);
> > +return AVERROR_INVALIDDATA;
> > +}
> > +if ((c->n_ast > 2) && (c->sys->n_difchan < 4)) {
> > +av_log(s, AV_LOG_ERROR,
> > +   "Invalid number of channels %d, only 2 stereo pairs are 
> > allowed in 50Mps mode.\n",
> > +   c->n_ast);
> > +return AVERROR_INVALIDDATA;
> 

> Surely this can be done in one log statement.

Yes, but merging them would complicate the logic for little gain.
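
For illustration, a merged variant could look like this self-contained sketch (the constant and function names here are made up for the example, not the actual dvenc.c symbols):

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical merged check: derive the maximum number of audio streams
 * from n_difchan once, then emit a single log message. */
static int check_audio_streams(int n_ast, int n_difchan, char *msg, size_t size)
{
    /* 25Mbps (n_difchan < 2) allows 1 stereo pair, 50Mbps allows 2 */
    int max_ast = n_difchan < 2 ? 1 : (n_difchan < 4 ? 2 : 4);

    if (n_ast > max_ast) {
        snprintf(msg, size,
                 "Invalid number of audio streams %d, only %d stereo pair%s "
                 "allowed in %dMbps mode.\n",
                 n_ast, max_ast, max_ast > 1 ? "s are" : " is",
                 max_ast > 1 ? 50 : 25);
        return -1; /* AVERROR_INVALIDDATA in the real code */
    }
    return 0;
}
```

Whether the conditional pluralization is worth it over two plain messages is exactly the trade-off discussed above.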

> 
> >  }
> >  
> >  /* Ok, everything seems to be in working order */
> > @@ -376,14 +427,14 @@ static DVMuxContext* dv_init_mux(AVFormatContext* s)
> >  if (!c->ast[i])
> > continue;
> >  c->audio_data[i] = av_fifo_alloc2(100 * MAX_AUDIO_FRAME_SIZE, 1, 
> > 0);
> > -if (!c->audio_data[i])
> > -goto bail_out;
> > +if (!c->audio_data[i]) {
> > +av_log(s, AV_LOG_ERROR,
> > +   "Failed to allocate internal buffer.\n");
> 
> Dedicated log messages for small malloc failures are useless bloat.

Dropped.

> 
> > +return AVERROR(ENOMEM);
> > +}
> >  }
> >  
> > -return c;
> > -
> > -bail_out:
> > -return NULL;
> > +return 0;
> >  }
> >  
> >  static int dv_write_header(AVFormatContext *s)
> > @@ -392,10 +443,10 @@ static int dv_write_header(AVFormatContext *s)
> >  DVMuxContext *dvc = s->priv_data;
> >  AVDictionaryEntry *tcr = av_dict_get(s->metadata, "timecode", NULL, 0);
> >  
> > -if (!dv_init_mux(s)) {
> > +if (dv_init_mux(s) < 0) {
> >  av_log(s, AV_LOG_ERROR, "Can't initialize DV format!\n"
> >  "Make sure that you supply exactly two streams:\n"
> 
> This seems inconsistent with the other checks.

Yes, it's probably better to drop this entirely, since we now have more
precise reporting (and "exactly two streams" is wrong).

> 
> > -" video: 25fps or 29.97fps, audio: 
> > 2ch/48|44|32kHz/PCM\n"

> > +" video: 25fps or 29.97fps, audio: 
> > 2ch/48000|44100|32000Hz/PCM\n"
> 
> This does not seem like an improvement.

44 kHz != 44100 Hz.

I could use 44.1, but kHz is not the unit used when setting the
option, so it is better to be explicit.

Thanks.
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH 1/2 v2] avutil: add a Tile Grid API

2024-01-21 Thread James Almer

On 1/21/2024 3:27 AM, Anton Khirnov wrote:

Quoting James Almer (2024-01-20 23:04:06)

This includes a struct and helpers. It will be used to support container level
cropping and tiled image formats, but should be generic enough for general
usage.

Signed-off-by: James Almer 
---
Extended to include fields used for cropping. Should make the struct reusable
even for non-tiled images, e.g. setting both rows and tiles to 1, in which case
tile width and height would become analogous to coded_{width,height}.


But why? What does cropping have to do with tiling? What advantage is
there to handling them in one struct?


The struct does not need to be used for non-tiled image scenarios, but
could if we decide we don't want to add another struct that would only
contain a subset of the fields present here.


As to why said fields are present here: HEIF may use a clap box to
define cropping for the final image, not for the tiles. This needs to be
propagated, and the previous version of this API, which only defined
cropping from the right and bottom edges if the output dimensions were
smaller than the grid (the standard case for tiled HEIF with no clap box),
was not enough. Hence this change.


I can rename this struct to Image Grid or something else, which might 
make it feel less awkward if we decide to reuse it. We still need to 
propagate container cropping from clap boxes and from Matroska elements 
after all.
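
For illustration, a minimal sketch of such a reusable grid struct — the field names are assumptions for the example, not the actual proposed API:

```c
#include <assert.h>

/* A grid of rows x cols uniform tiles, plus the final output rectangle
 * (e.g. from a HEIF clap box). With rows = cols = 1 this degenerates to
 * a plain image, where tile_{width,height} play the role of
 * coded_{width,height}. Illustrative sketch only. */
typedef struct ImageGrid {
    int rows, cols;              /* 1x1 for a plain (non-tiled) image */
    int tile_width, tile_height; /* dimensions of each tile           */
    int horizontal_offset;       /* left edge of the output rectangle */
    int vertical_offset;         /* top edge of the output rectangle  */
    int width, height;           /* final (cropped) output dimensions */
} ImageGrid;

/* Full uncropped canvas size of the grid. */
static int grid_coded_width(const ImageGrid *g)  { return g->cols * g->tile_width; }
static int grid_coded_height(const ImageGrid *g) { return g->rows * g->tile_height; }
```

The cropping fields stay meaningful in both cases, which is the reuse argument being made above.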



[FFmpeg-devel] [vaapi-cavs 1/7] cavs: add cavs profile defs

2024-01-21 Thread jianfeng.zheng
Signed-off-by: jianfeng.zheng 
---
 libavcodec/defs.h | 3 +++
 libavcodec/profiles.c | 6 ++
 libavcodec/profiles.h | 1 +
 3 files changed, 10 insertions(+)

diff --git a/libavcodec/defs.h b/libavcodec/defs.h
index 00d840ec19..d59816a70f 100644
--- a/libavcodec/defs.h
+++ b/libavcodec/defs.h
@@ -192,6 +192,9 @@
 #define AV_PROFILE_EVC_BASELINE 0
 #define AV_PROFILE_EVC_MAIN 1
 
+#define AV_PROFILE_CAVS_JIZHUN  0x20
+#define AV_PROFILE_CAVS_GUANGDIAN   0x48
+
 
 #define AV_LEVEL_UNKNOWN  -99
 
diff --git a/libavcodec/profiles.c b/libavcodec/profiles.c
index 5bb8f150e6..b312f12281 100644
--- a/libavcodec/profiles.c
+++ b/libavcodec/profiles.c
@@ -200,4 +200,10 @@ const AVProfile ff_evc_profiles[] = {
 { AV_PROFILE_UNKNOWN },
 };
 
+const AVProfile ff_cavs_profiles[] = {
+{ AV_PROFILE_CAVS_JIZHUN,   "Jizhun"},
+{ AV_PROFILE_CAVS_GUANGDIAN,"Guangdian" },
+{ AV_PROFILE_UNKNOWN },
+};
+
 #endif /* !CONFIG_SMALL */
diff --git a/libavcodec/profiles.h b/libavcodec/profiles.h
index 270430a48b..9a2b348ad4 100644
--- a/libavcodec/profiles.h
+++ b/libavcodec/profiles.h
@@ -75,5 +75,6 @@ extern const AVProfile ff_prores_profiles[];
 extern const AVProfile ff_mjpeg_profiles[];
 extern const AVProfile ff_arib_caption_profiles[];
 extern const AVProfile ff_evc_profiles[];
+extern const AVProfile ff_cavs_profiles[];
 
 #endif /* AVCODEC_PROFILES_H */
-- 
2.25.1



[FFmpeg-devel] [vaapi-cavs 2/7] cavs: skip bits between pic header and slc header

2024-01-21 Thread jianfeng.zheng
Signed-off-by: jianfeng.zheng 
---
 libavcodec/cavs.h|  2 +
 libavcodec/cavsdec.c | 87 
 2 files changed, 89 insertions(+)

diff --git a/libavcodec/cavs.h b/libavcodec/cavs.h
index 244c322b35..ad49abff92 100644
--- a/libavcodec/cavs.h
+++ b/libavcodec/cavs.h
@@ -39,8 +39,10 @@
 #define EXT_START_CODE  0x01b5
 #define USER_START_CODE 0x01b2
 #define CAVS_START_CODE 0x01b0
+#define VIDEO_SEQ_END_CODE  0x01b1
 #define PIC_I_START_CODE0x01b3
 #define PIC_PB_START_CODE   0x01b6
+#define VIDEO_EDIT_CODE 0x01b7
 
 #define A_AVAIL  1
 #define B_AVAIL  2
diff --git a/libavcodec/cavsdec.c b/libavcodec/cavsdec.c
index b356da0b04..9742bd1011 100644
--- a/libavcodec/cavsdec.c
+++ b/libavcodec/cavsdec.c
@@ -954,6 +954,80 @@ static inline int decode_slice_header(AVSContext *h, 
GetBitContext *gb)
 return 0;
 }
 
+/**
+ * skip stuffing bits before the next start code "0x01"
+ * @return 0 if no stuffing bits at h->gb were skipped, else 1.
+ */
+static inline int skip_stuffing_bits(AVSContext *h)
+{
+GetBitContext gb0 = h->gb;
+GetBitContext *gb = &h->gb;
+const uint8_t *start;
+const uint8_t *ptr;
+const uint8_t *end;
+int align;
+int stuffing_zeros;
+
+/**
+ * According to the spec, there should be one stuffing bit '1' and
+ * 0-7 stuffing bits '0'. But it seems not all streams follow
+ * "next_start_code()" strictly.
+ */
+align = (-get_bits_count(gb)) & 7;
+if (align == 0 && show_bits_long(gb, 8) == 0x80) {
+skip_bits_long(gb, 8);
+}
+
+/**
+ *  skip leading zero bytes before 0x 00 00 01 stc
+ */
+ptr = start = align_get_bits(gb);
+end = gb->buffer_end;
+while (ptr < end && *ptr == 0)
+ptr++;
+
+if ((ptr >= end) || (*ptr == 1 && ptr - start >= 2)) {
+stuffing_zeros = (ptr >= end ? end - start : ptr - start - 2);
+if (stuffing_zeros > 0)
+av_log(h->avctx, AV_LOG_DEBUG, "Skip 0x%x stuffing zeros @0x%x.\n",
+stuffing_zeros, (int)(start - gb->buffer));
+skip_bits_long(gb, stuffing_zeros * 8);
+return 1;
+} else {
+av_log(h->avctx, AV_LOG_DEBUG, "No next_start_code() found @0x%x.\n",
+(int)(start - gb->buffer));
+goto restore_get_bits;
+}
+
+restore_get_bits:
+h->gb = gb0;
+return 0;
+}
+
+static inline int skip_extension_and_user_data(AVSContext *h)
+{
+int stc = -1;
+const uint8_t *start = align_get_bits(&h->gb);
+const uint8_t *end = h->gb.buffer_end;
+const uint8_t *ptr, *next;
+
+for (ptr = start; ptr + 4 < end; ptr = next) {
+stc = show_bits_long(&h->gb, 32);
+if (stc != EXT_START_CODE && stc != USER_START_CODE) {
+break;
+}
+next = avpriv_find_start_code(ptr + 4, end, &stc);
+if (next < end) {
+next -= 4;
+}
+skip_bits(&h->gb, (next - ptr) * 8);
+av_log(h->avctx, AV_LOG_DEBUG, "skip %d byte ext/user data\n",
+(int)(next - ptr));
+}
+
+return ptr > start;
+}
+
 static inline int check_for_slice(AVSContext *h)
 {
 GetBitContext *gb = &h->gb;
@@ -1019,6 +1093,8 @@ static int decode_pic(AVSContext *h)
 h->stream_revision = 1;
 if (h->stream_revision > 0)
 skip_bits(&h->gb, 1); //marker_bit
+
+av_log(h->avctx, AV_LOG_DEBUG, "stream_revision: %d\n", 
h->stream_revision);
 }
 
 if (get_bits_left(&h->gb) < 23)
@@ -1096,6 +1172,11 @@ static int decode_pic(AVSContext *h)
 h->alpha_offset = h->beta_offset  = 0;
 }
 
+if (h->stream_revision > 0) {
+skip_stuffing_bits(h);
+skip_extension_and_user_data(h);
+}
+
 ret = 0;
 if (h->cur.f->pict_type == AV_PICTURE_TYPE_I) {
 do {
@@ -1309,6 +1390,12 @@ static int cavs_decode_frame(AVCodecContext *avctx, 
AVFrame *rframe,
 case USER_START_CODE:
 //mpeg_decode_user_data(avctx, buf_ptr, input_size);
 break;
+case VIDEO_EDIT_CODE:
+av_log(h->avctx, AV_LOG_WARNING, "Skip video_edit_code\n");
+break;
+case VIDEO_SEQ_END_CODE:
+av_log(h->avctx, AV_LOG_WARNING, "Skip video_sequence_end_code\n");
+break;
 default:
 if (stc <= SLICE_MAX_START_CODE) {
 init_get_bits(&h->gb, buf_ptr, input_size);
-- 
2.25.1



[FFmpeg-devel] [vaapi-cavs 3/7] cavs: time code debug

2024-01-21 Thread jianfeng.zheng
Signed-off-by: jianfeng.zheng 
---
 libavcodec/cavsdec.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/libavcodec/cavsdec.c b/libavcodec/cavsdec.c
index 9742bd1011..9ad0f29b01 100644
--- a/libavcodec/cavsdec.c
+++ b/libavcodec/cavsdec.c
@@ -1061,6 +1061,7 @@ static int decode_pic(AVSContext *h)
 int ret;
 int skip_count= -1;
 enum cavs_mb mb_type;
+char tc[4];
 
 if (!h->top_qp) {
 av_log(h->avctx, AV_LOG_ERROR, "No sequence header decoded yet\n");
@@ -1082,8 +1083,16 @@ static int decode_pic(AVSContext *h)
 return AVERROR_INVALIDDATA;
 } else {
 h->cur.f->pict_type = AV_PICTURE_TYPE_I;
-if (get_bits1(&h->gb))
-skip_bits(&h->gb, 24);//time_code
+if (get_bits1(&h->gb)) {//time_code
+skip_bits(&h->gb, 1);
+tc[0] = get_bits(&h->gb, 5);
+tc[1] = get_bits(&h->gb, 6);
+tc[2] = get_bits(&h->gb, 6);
+tc[3] = get_bits(&h->gb, 6);
+av_log(h->avctx, AV_LOG_DEBUG, "timecode: %d:%d:%d.%d\n", 
+tc[0], tc[1], tc[2], tc[3]);
+}
+
 /* old sample clips were all progressive and no low_delay,
bump stream revision if detected otherwise */
 if (h->low_delay || !(show_bits(&h->gb, 9) & 1))
-- 
2.25.1



[FFmpeg-devel] [vaapi-cavs 4/7] cavs: fix dpb reorder issues when 'low_delay' is varied

2024-01-21 Thread jianfeng.zheng
Consider multiple sequences in one stream: 'low_delay' may change
between sequences.

Signed-off-by: jianfeng.zheng 
---
 libavcodec/cavs.c|  12 +
 libavcodec/cavs.h|   2 +
 libavcodec/cavsdec.c | 105 +--
 3 files changed, 95 insertions(+), 24 deletions(-)

diff --git a/libavcodec/cavs.c b/libavcodec/cavs.c
index fdd577f7fb..ed7b278336 100644
--- a/libavcodec/cavs.c
+++ b/libavcodec/cavs.c
@@ -810,6 +810,14 @@ av_cold int ff_cavs_init(AVCodecContext *avctx)
 if (!h->cur.f || !h->DPB[0].f || !h->DPB[1].f)
 return AVERROR(ENOMEM);
 
+h->out[0].f = av_frame_alloc();
+h->out[1].f = av_frame_alloc();
+h->out[2].f = av_frame_alloc();
+if (!h->out[0].f || !h->out[1].f || !h->out[2].f) {
+ff_cavs_end(avctx);
+return AVERROR(ENOMEM);
+}
+
 h->luma_scan[0] = 0;
 h->luma_scan[1] = 8;
 h->intra_pred_l[INTRA_L_VERT]   = intra_pred_vert;
@@ -840,6 +848,10 @@ av_cold int ff_cavs_end(AVCodecContext *avctx)
 av_frame_free(&h->DPB[0].f);
 av_frame_free(&h->DPB[1].f);
 
+av_frame_free(&h->out[0].f);
+av_frame_free(&h->out[1].f);
+av_frame_free(&h->out[2].f);
+
 av_freep(&h->top_qp);
 av_freep(&h->top_mv[0]);
 av_freep(&h->top_mv[1]);
diff --git a/libavcodec/cavs.h b/libavcodec/cavs.h
index ad49abff92..f490657959 100644
--- a/libavcodec/cavs.h
+++ b/libavcodec/cavs.h
@@ -166,6 +166,7 @@ struct dec_2dvlc {
 typedef struct AVSFrame {
 AVFrame *f;
 int poc;
+int outputed;
 } AVSFrame;
 
 typedef struct AVSContext {
@@ -177,6 +178,7 @@ typedef struct AVSContext {
 GetBitContext gb;
 AVSFrame cur; ///< currently decoded frame
 AVSFrame DPB[2];  ///< reference frames
+AVSFrame out[3];  ///< output queue, size 2 maybe enough
 int dist[2]; ///< temporal distances from current frame to ref frames
 int low_delay;
 int profile, level;
diff --git a/libavcodec/cavsdec.c b/libavcodec/cavsdec.c
index 9ad0f29b01..6f462d861c 100644
--- a/libavcodec/cavsdec.c
+++ b/libavcodec/cavsdec.c
@@ -1056,6 +1056,44 @@ static inline int check_for_slice(AVSContext *h)
  *
  /
 
+/**
+ * @brief remove frame out of dpb
+ */
+static void cavs_frame_unref(AVSFrame *frame)
+{
+/* frame->f can be NULL if context init failed */
+if (!frame->f || !frame->f->buf[0])
+return;
+
+av_frame_unref(frame->f);
+}
+
+static int output_one_frame(AVSContext *h, AVFrame *data, int *got_frame)
+{
+if (h->out[0].f->buf[0]) {
+av_log(h->avctx, AV_LOG_DEBUG, "output frame: poc=%d\n", 
h->out[0].poc);
+av_frame_move_ref(data, h->out[0].f);
+*got_frame = 1;
+
+// out[0] <- out[1] <- out[2] <- out[0]
+cavs_frame_unref(&h->out[2]);
+FFSWAP(AVSFrame, h->out[0], h->out[2]);
+FFSWAP(AVSFrame, h->out[0], h->out[1]);
+
+return 1;
+}
+
+return 0;
+}
+
+static void queue_one_frame(AVSContext *h, AVSFrame *out)
+{
+int idx = !h->out[0].f->buf[0] ? 0 : (!h->out[1].f->buf[0] ? 1 : 2);
+av_log(h->avctx, AV_LOG_DEBUG, "queue in out[%d]: poc=%d\n", idx, 
out->poc);
+av_frame_ref(h->out[idx].f, out->f);
+h->out[idx].poc = out->poc;
+}
+
 static int decode_pic(AVSContext *h)
 {
 int ret;
@@ -1068,7 +1106,7 @@ static int decode_pic(AVSContext *h)
 return AVERROR_INVALIDDATA;
 }
 
-av_frame_unref(h->cur.f);
+cavs_frame_unref(&h->cur);
 
 skip_bits(&h->gb, 16);//bbv_dwlay
 if (h->stc == PIC_PB_START_CODE) {
@@ -1077,10 +1115,13 @@ static int decode_pic(AVSContext *h)
 av_log(h->avctx, AV_LOG_ERROR, "illegal picture type\n");
 return AVERROR_INVALIDDATA;
 }
+
 /* make sure we have the reference frames we need */
-if (!h->DPB[0].f->data[0] ||
-   (!h->DPB[1].f->data[0] && h->cur.f->pict_type == AV_PICTURE_TYPE_B))
+if (!h->DPB[0].f->buf[0] ||
+(!h->DPB[1].f->buf[0] && h->cur.f->pict_type == 
AV_PICTURE_TYPE_B)) {
+av_log(h->avctx, AV_LOG_ERROR, "Invalid reference frame\n");
 return AVERROR_INVALIDDATA;
+}
 } else {
 h->cur.f->pict_type = AV_PICTURE_TYPE_I;
 if (get_bits1(&h->gb)) {//time_code
@@ -1124,6 +1165,8 @@ static int decode_pic(AVSContext *h)
 if ((ret = ff_cavs_init_pic(h)) < 0)
 return ret;
 h->cur.poc = get_bits(&h->gb, 8) * 2;
+av_log(h->avctx, AV_LOG_DEBUG, "poc=%d, type=%d\n",
+h->cur.poc, h->cur.f->pict_type);
 
 /* get temporal distances and MV scaling factors */
 if (h->cur.f->pict_type != AV_PICTURE_TYPE_B) {
@@ -1137,6 +1180,8 @@ static int decode_pic(AVSContext *h)
 if (h->cur.f->pict_type == AV_PICTURE_TYPE_B) {
 h->sym_factor = h->dist[0] * h->scale_den[1];
 if (FFABS(h->sym_factor) > 32768) {
+av_log(h->avctx, AV_LOG_ERROR, "poc=%d/%d/

[FFmpeg-devel] [vaapi-cavs 5/7] cavs: decode wqm and slice weighting for future usage

2024-01-21 Thread jianfeng.zheng
Signed-off-by: jianfeng.zheng 
---
 libavcodec/cavs.h|  26 +++-
 libavcodec/cavsdec.c | 142 +--
 2 files changed, 147 insertions(+), 21 deletions(-)

diff --git a/libavcodec/cavs.h b/libavcodec/cavs.h
index f490657959..33ef10e850 100644
--- a/libavcodec/cavs.h
+++ b/libavcodec/cavs.h
@@ -186,12 +186,36 @@ typedef struct AVSContext {
 int mb_width, mb_height;
 int width, height;
 int stream_revision; ///<0 for samples from 2006, 1 for rm52j encoder
-int progressive;
+int progressive_seq;
+int progressive_frame;
 int pic_structure;
 int skip_mode_flag; ///< select between skip_count or one skip_flag per MB
 int loop_filter_disable;
 int alpha_offset, beta_offset;
 int ref_flag;
+
+/** \defgroup guangdian profile
+ * @{
+ */
+int aec_flag;
+int weight_quant_flag;
+int chroma_quant_param_delta_cb;
+int chroma_quant_param_delta_cr;
+uint8_t wqm_8x8[64];
+/**@}*/
+
+/** \defgroup slice weighting
+ * FFmpeg doesn't support slice weighting natively, but it may be needed for 
HWaccel.
+ * @{
+ */
+uint32_t slice_weight_pred_flag : 1;
+uint32_t mb_weight_pred_flag: 1;
+uint8_t luma_scale[4];
+int8_t luma_shift[4];
+uint8_t chroma_scale[4];
+int8_t chroma_shift[4];
+/**@}*/
+
 int mbx, mby, mbidx; ///< macroblock coordinates
 int flags; ///< availability flags of neighbouring macroblocks
 int stc;   ///< last start code
diff --git a/libavcodec/cavsdec.c b/libavcodec/cavsdec.c
index 6f462d861c..8d3ba530a6 100644
--- a/libavcodec/cavsdec.c
+++ b/libavcodec/cavsdec.c
@@ -30,6 +30,7 @@
 #include "avcodec.h"
 #include "get_bits.h"
 #include "golomb.h"
+#include "profiles.h"
 #include "cavs.h"
 #include "codec_internal.h"
 #include "decode.h"
@@ -37,6 +38,43 @@
 #include "mpeg12data.h"
 #include "startcode.h"
 
+static const uint8_t default_wq_param[4][6] = {
+{128,  98, 106, 116, 116, 128},
+{135, 143, 143, 160, 160, 213},
+{128,  98, 106, 116, 116, 128},
+{128, 128, 128, 128, 128, 128},
+};
+static const uint8_t wq_model_2_param[4][64] = {
+{
+0, 0, 0, 4, 4, 4, 5, 5,
+0, 0, 3, 3, 3, 3, 5, 5,
+0, 3, 2, 2, 1, 1, 5, 5,
+4, 3, 2, 2, 1, 5, 5, 5,
+4, 3, 1, 1, 5, 5, 5, 5,
+4, 3, 1, 5, 5, 5, 5, 5,
+5, 5, 5, 5, 5, 5, 5, 5,
+5, 5, 5, 5, 5, 5, 5, 5,
+}, {
+0, 0, 0, 4, 4, 4, 5, 5,
+0, 0, 4, 4, 4, 4, 5, 5,
+0, 3, 2, 2, 2, 1, 5, 5,
+3, 3, 2, 2, 1, 5, 5, 5,
+3, 3, 2, 1, 5, 5, 5, 5,
+3, 3, 1, 5, 5, 5, 5, 5,
+5, 5, 5, 5, 5, 5, 5, 5,
+5, 5, 5, 5, 5, 5, 5, 5,
+}, {
+0, 0, 0, 4, 4, 3, 5, 5,
+0, 0, 4, 4, 3, 2, 5, 5,
+0, 4, 4, 3, 2, 1, 5, 5,
+4, 4, 3, 2, 1, 5, 5, 5,
+4, 3, 2, 1, 5, 5, 5, 5,
+3, 2, 1, 5, 5, 5, 5, 5,
+5, 5, 5, 5, 5, 5, 5, 5,
+5, 5, 5, 5, 5, 5, 5, 5,
+}
+};
+
 static const uint8_t mv_scan[4] = {
 MV_FWD_X0, MV_FWD_X1,
 MV_FWD_X2, MV_FWD_X3
@@ -927,7 +965,11 @@ static int decode_mb_b(AVSContext *h, enum cavs_mb mb_type)
 
 static inline int decode_slice_header(AVSContext *h, GetBitContext *gb)
 {
-if (h->stc > 0xAF)
+int i, nref;
+
+av_log(h->avctx, AV_LOG_TRACE, "slice start code 0x%02x\n", h->stc);
+
+if (h->stc > SLICE_MAX_START_CODE)
 av_log(h->avctx, AV_LOG_ERROR, "unexpected start code 0x%02x\n", 
h->stc);
 
 if (h->stc >= h->mb_height) {
@@ -946,11 +988,29 @@ static inline int decode_slice_header(AVSContext *h, 
GetBitContext *gb)
 }
 /* inter frame or second slice can have weighting params */
 if ((h->cur.f->pict_type != AV_PICTURE_TYPE_I) ||
-(!h->pic_structure && h->mby >= h->mb_width / 2))
-if (get_bits1(gb)) { //slice_weighting_flag
+(!h->pic_structure && h->mby >= h->mb_height / 2)) {
+h->slice_weight_pred_flag = get_bits1(gb);
+if (h->slice_weight_pred_flag) {
+nref = h->cur.f->pict_type == AV_PICTURE_TYPE_I ? 1 : 
(h->pic_structure ? 2 : 4);
+for (i = 0; i < nref; i++) {
+h->luma_scale[i] = get_bits(gb, 8);
+h->luma_shift[i] = get_sbits(gb, 8);
+skip_bits1(gb);
+h->chroma_scale[i] = get_bits(gb, 8);
+h->chroma_shift[i] = get_sbits(gb, 8);
+skip_bits1(gb);
+}
+h->mb_weight_pred_flag = get_bits1(gb);
+if (!h->avctx->hwaccel) {
 av_log(h->avctx, AV_LOG_ERROR,
"weighted prediction not yet supported\n");
 }
+}
+}
+if (h->aec_flag) {
+align_get_bits(gb);
+}
+
 return 0;
 }
 
@@ -1108,7 +1168,11 @@ static int decode_pic(AVSContext *h)
 
 cavs_frame_unref(&h->cur);
 
-skip_bits(&h->gb, 16);//bbv_dwlay
+skip_bits(&h->gb, 16);//bbv_delay
+if (h->profile == AV_PROFILE_CAVS_GUANGDIAN) {

[FFmpeg-devel] [vaapi-cavs 7/7] cavs: support vaapi hwaccel decoding

2024-01-21 Thread jianfeng.zheng
see https://github.com/intel/libva/pull/738

Signed-off-by: jianfeng.zheng 
---
 configure |  14 
 libavcodec/Makefile   |   1 +
 libavcodec/cavs.h |   4 +
 libavcodec/cavsdec.c  | 101 +--
 libavcodec/hwaccels.h |   1 +
 libavcodec/vaapi_cavs.c   | 164 ++
 libavcodec/vaapi_decode.c |   4 +
 7 files changed, 284 insertions(+), 5 deletions(-)
 create mode 100644 libavcodec/vaapi_cavs.c

diff --git a/configure b/configure
index c8ae0a061d..89759eda5d 100755
--- a/configure
+++ b/configure
@@ -2463,6 +2463,7 @@ HAVE_LIST="
 xmllint
 zlib_gzip
 openvino2
+va_profile_avs
 "
 
 # options emitted with CONFIG_ prefix but not available on the command line
@@ -3202,6 +3203,7 @@ wmv3_dxva2_hwaccel_select="vc1_dxva2_hwaccel"
 wmv3_nvdec_hwaccel_select="vc1_nvdec_hwaccel"
 wmv3_vaapi_hwaccel_select="vc1_vaapi_hwaccel"
 wmv3_vdpau_hwaccel_select="vc1_vdpau_hwaccel"
+cavs_vaapi_hwaccel_deps="vaapi va_profile_avs VAPictureParameterBufferAVS"
 
 # hardware-accelerated codecs
 mediafoundation_deps="mftransform_h MFCreateAlignedMemoryBuffer"
@@ -7175,6 +7177,18 @@ if enabled vaapi; then
 check_type "va/va.h va/va_enc_vp8.h"  "VAEncPictureParameterBufferVP8"
 check_type "va/va.h va/va_enc_vp9.h"  "VAEncPictureParameterBufferVP9"
 check_type "va/va.h va/va_enc_av1.h"  "VAEncPictureParameterBufferAV1"
+
+#
+# Using 'VA_CHECK_VERSION' in source codes make things easy. But we have 
to wait
+# until newly added VAProfile being distributed by VAAPI released version.
+#
+# Before or after that, we can use auto-detection to keep version 
compatibility.
+# It always works.
+#
+disable va_profile_avs &&
+test_code cc va/va.h "VAProfile p1 = VAProfileAVSJizhun, p2 = 
VAProfileAVSGuangdian;" &&
+enable va_profile_avs
+enabled va_profile_avs && check_type "va/va.h va/va_dec_avs.h" 
"VAPictureParameterBufferAVS"
 fi
 
 if enabled_all opencl libdrm ; then
diff --git a/libavcodec/Makefile b/libavcodec/Makefile
index bb42095165..7d92375fed 100644
--- a/libavcodec/Makefile
+++ b/libavcodec/Makefile
@@ -1055,6 +1055,7 @@ OBJS-$(CONFIG_VP9_VAAPI_HWACCEL)  += vaapi_vp9.o
 OBJS-$(CONFIG_VP9_VDPAU_HWACCEL)  += vdpau_vp9.o
 OBJS-$(CONFIG_VP9_VIDEOTOOLBOX_HWACCEL)   += videotoolbox_vp9.o
 OBJS-$(CONFIG_VP8_QSV_HWACCEL)+= qsvdec.o
+OBJS-$(CONFIG_CAVS_VAAPI_HWACCEL) += vaapi_cavs.o
 
 # Objects duplicated from other libraries for shared builds
 SHLIBOBJS  += log2_tab.o reverse.o
diff --git a/libavcodec/cavs.h b/libavcodec/cavs.h
index 33ef10e850..4a0918da5a 100644
--- a/libavcodec/cavs.h
+++ b/libavcodec/cavs.h
@@ -167,10 +167,14 @@ typedef struct AVSFrame {
 AVFrame *f;
 int poc;
 int outputed;
+
+AVBufferRef   *hwaccel_priv_buf;
+void  *hwaccel_picture_private;
 } AVSFrame;
 
 typedef struct AVSContext {
 AVCodecContext *avctx;
+int got_pix_fmt;
 BlockDSPContext bdsp;
 H264ChromaContext h264chroma;
 VideoDSPContext vdsp;
diff --git a/libavcodec/cavsdec.c b/libavcodec/cavsdec.c
index 5036ef50f7..5ca021c098 100644
--- a/libavcodec/cavsdec.c
+++ b/libavcodec/cavsdec.c
@@ -25,11 +25,14 @@
  * @author Stefan Gehrer 
  */
 
+#include "config_components.h"
 #include "libavutil/avassert.h"
 #include "libavutil/emms.h"
 #include "avcodec.h"
 #include "get_bits.h"
 #include "golomb.h"
+#include "hwaccel_internal.h"
+#include "hwconfig.h"
 #include "profiles.h"
 #include "cavs.h"
 #include "codec_internal.h"
@@ -1002,9 +1005,9 @@ static inline int decode_slice_header(AVSContext *h, 
GetBitContext *gb)
 }
 h->mb_weight_pred_flag = get_bits1(gb);
 if (!h->avctx->hwaccel) {
-av_log(h->avctx, AV_LOG_ERROR,
-   "weighted prediction not yet supported\n");
-}
+av_log(h->avctx, AV_LOG_ERROR,
+"weighted prediction not yet supported\n");
+}
 }
 }
 if (h->aec_flag) {
@@ -1115,6 +1118,46 @@ static inline int check_for_slice(AVSContext *h)
  * frame level
  *
  /
+static int hwaccel_pic(AVSContext *h)
+{
+int ret = 0;
+int stc = -1;
+const uint8_t *frm_start = align_get_bits(&h->gb);
+const uint8_t *frm_end = h->gb.buffer_end;
+const uint8_t *slc_start = frm_start;
+const uint8_t *slc_end = frm_end;
+GetBitContext gb = h->gb;
+const FFHWAccel *hwaccel = ffhwaccel(h->avctx->hwaccel);
+
+ret = hwaccel->start_frame(h->avctx, NULL, 0);
+if (ret < 0)
+return ret;
+
+for (slc_start = frm_start; slc_start + 4 < frm_end; slc_start = slc_end) {
+slc_end = avpriv_find_start_code(slc_start + 4, frm_end, &stc);
+if (slc_end < frm_end) {
+slc_end -= 4;
+}
+
+init_get_bits(&h->gb, slc_start, (slc_end - s

[FFmpeg-devel] [vaapi-cavs 6/7] cavs: set profile & level for AVCodecContext

2024-01-21 Thread jianfeng.zheng
Signed-off-by: jianfeng.zheng 
---
 libavcodec/cavsdec.c  | 5 -
 tests/ref/fate/cavs-demux | 2 +-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/libavcodec/cavsdec.c b/libavcodec/cavsdec.c
index 8d3ba530a6..5036ef50f7 100644
--- a/libavcodec/cavsdec.c
+++ b/libavcodec/cavsdec.c
@@ -1499,7 +1499,10 @@ static int cavs_decode_frame(AVCodecContext *avctx, 
AVFrame *rframe,
 switch (stc) {
 case CAVS_START_CODE:
 init_get_bits(&h->gb, buf_ptr, input_size);
-decode_seq_header(h);
+if ((ret = decode_seq_header(h)) < 0)
+return ret;
+avctx->profile = h->profile;
+avctx->level = h->level;
 break;
 case PIC_I_START_CODE:
 if (!h->got_keyframe) {
diff --git a/tests/ref/fate/cavs-demux b/tests/ref/fate/cavs-demux
index 000b32ab05..6381f2075b 100644
--- a/tests/ref/fate/cavs-demux
+++ b/tests/ref/fate/cavs-demux
@@ -58,5 +58,5 @@ 
packet|codec_type=video|stream_index=0|pts=228|pts_time=1.90|dts=228
 
packet|codec_type=video|stream_index=0|pts=232|pts_time=1.93|dts=232|dts_time=1.93|duration=4|duration_time=0.03|size=67|pos=172185|flags=K__|data_hash=CRC32:42484449
 
packet|codec_type=video|stream_index=0|pts=236|pts_time=1.97|dts=236|dts_time=1.97|duration=4|duration_time=0.03|size=83|pos=172252|flags=K__|data_hash=CRC32:a941bdf0
 
packet|codec_type=video|stream_index=0|pts=240|pts_time=2.00|dts=240|dts_time=2.00|duration=4|duration_time=0.03|size=5417|pos=172335|flags=K__|data_hash=CRC32:9d0d503b
-stream|index=0|codec_name=cavs|profile=unknown|codec_type=video|codec_tag_string=[0][0][0][0]|codec_tag=0x|width=1280|height=720|coded_width=1280|coded_height=720|closed_captions=0|film_grain=0|has_b_frames=0|sample_aspect_ratio=N/A|display_aspect_ratio=N/A|pix_fmt=yuv420p|level=-99|color_range=unknown|color_space=unknown|color_transfer=unknown|color_primaries=unknown|chroma_location=unspecified|field_order=unknown|refs=1|id=N/A|r_frame_rate=30/1|avg_frame_rate=25/1|time_base=1/120|start_pts=N/A|start_time=N/A|duration_ts=N/A|duration=N/A|bit_rate=N/A|max_bit_rate=N/A|bits_per_raw_sample=N/A|nb_frames=N/A|nb_read_frames=N/A|nb_read_packets=60|extradata_size=18|extradata_hash=CRC32:1255d52e|disposition:default=0|disposition:dub=0|disposition:original=0|disposition:comment=0|disposition:lyrics=0|disposition:karaoke=0|disposition:forced=0|disposition:hearing_impaired=0|disposition:visual_impaired=0|disposition:clean_effects=0|disposition:attached_pic=0|disposition:timed_thumbna
 
ils=0|disposition:non_diegetic=0|disposition:captions=0|disposition:descriptions=0|disposition:metadata=0|disposition:dependent=0|disposition:still_image=0
+stream|index=0|codec_name=cavs|profile=32|codec_type=video|codec_tag_string=[0][0][0][0]|codec_tag=0x|width=1280|height=720|coded_width=1280|coded_height=720|closed_captions=0|film_grain=0|has_b_frames=0|sample_aspect_ratio=N/A|display_aspect_ratio=N/A|pix_fmt=yuv420p|level=64|color_range=unknown|color_space=unknown|color_transfer=unknown|color_primaries=unknown|chroma_location=unspecified|field_order=unknown|refs=1|id=N/A|r_frame_rate=30/1|avg_frame_rate=25/1|time_base=1/120|start_pts=N/A|start_time=N/A|duration_ts=N/A|duration=N/A|bit_rate=N/A|max_bit_rate=N/A|bits_per_raw_sample=N/A|nb_frames=N/A|nb_read_frames=N/A|nb_read_packets=60|extradata_size=18|extradata_hash=CRC32:1255d52e|disposition:default=0|disposition:dub=0|disposition:original=0|disposition:comment=0|disposition:lyrics=0|disposition:karaoke=0|disposition:forced=0|disposition:hearing_impaired=0|disposition:visual_impaired=0|disposition:clean_effects=0|disposition:attached_pic=0|disposition:timed_thumbnails=0|
 
disposition:non_diegetic=0|disposition:captions=0|disposition:descriptions=0|disposition:metadata=0|disposition:dependent=0|disposition:still_image=0
 
format|filename=bunny.mp4|nb_streams=1|nb_programs=0|format_name=cavsvideo|start_time=N/A|duration=N/A|size=177752|bit_rate=N/A|probe_score=51
-- 
2.25.1



Re: [FFmpeg-devel] [PATCH v1 2/2] vaapi: add vaapi_avs2 support

2024-01-21 Thread Jianfeng Zheng
Zhao Zhili wrote on Sat, 2024-01-20 at 12:22:
>
>
> > -Original Message-
> > From: ffmpeg-devel  On Behalf Of 
> > jianfeng.zheng
> > Sent: 2024-01-19 23:53
> > To: ffmpeg-devel@ffmpeg.org
> > Cc: jianfeng.zheng 
> > Subject: [FFmpeg-devel] [PATCH v1 2/2] vaapi: add vaapi_avs2 support
> >
> > see https://github.com/intel/libva/pull/738
> >
> > [Moore Threads](https://www.mthreads.com) (short for Mthreads) is a
> > Chinese GPU manufacturer. All our products, like MTTS70/MTTS80/.. ,
> > support AVS2 8bit/10bit HW decoding at max 8k resolution.
> >
> > Signed-off-by: jianfeng.zheng 
> > ---
> >  configure|   7 +
> >  libavcodec/Makefile  |   2 +
> >  libavcodec/allcodecs.c   |   1 +
> >  libavcodec/avs2.c| 345 ++-
> >  libavcodec/avs2.h| 460 +++-
> >  libavcodec/avs2_parser.c |   5 +-
> >  libavcodec/avs2dec.c | 569 +
> >  libavcodec/avs2dec.h |  48 +++
> >  libavcodec/avs2dec_headers.c | 787 +++
> >  libavcodec/codec_desc.c  |   5 +-
> >  libavcodec/defs.h|   4 +
> >  libavcodec/hwaccels.h|   1 +
> >  libavcodec/libdavs2.c|   2 +-
> >  libavcodec/profiles.c|   6 +
> >  libavcodec/profiles.h|   1 +
> >  libavcodec/vaapi_avs2.c  | 227 ++
> >  libavcodec/vaapi_decode.c|   5 +
> >  libavformat/matroska.c   |   1 +
> >  libavformat/mpeg.h   |   1 +
> >  19 files changed, 2450 insertions(+), 27 deletions(-)
> >  create mode 100644 libavcodec/avs2dec.c
> >  create mode 100644 libavcodec/avs2dec.h
> >  create mode 100644 libavcodec/avs2dec_headers.c
> >  create mode 100644 libavcodec/vaapi_avs2.c
> >
>
> Please split the patch properly. It's hard to review in a single chunk, and 
> it can't be tested
> without the hardware.

As a new player in the GPU market, we have always attached great importance
to participating in the open-source community, and we are willing to
contribute our new features to the field of video hardware acceleration.

As a pioneer, these new features may only be supported by our hardware at
the current time. We are willing to provide some market devices for free to
community-accredited contributors for testing the related functions.

>


[FFmpeg-devel] [PATCH] flv: fix stereo flag when writing PCMA/PCMU

2024-01-21 Thread Alessandro Ros
Currently, when writing PCMA or PCMU tracks to FLV or RTMP, the
stereo flag and sample rate flag inside RTMP audio messages are
overridden, making it impossible to distinguish between mono and stereo
tracks. This patch fixes the issue by restoring the same flag mechanism
used by all other codecs, which takes into account the actual channel
count and sample rate.

Signed-off-by: Alessandro Ros 
---
 libavformat/flvenc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libavformat/flvenc.c b/libavformat/flvenc.c
index 874560fac1..772d891136 100644
--- a/libavformat/flvenc.c
+++ b/libavformat/flvenc.c
@@ -208,10 +208,10 @@ error:
 flags |= FLV_CODECID_NELLYMOSER| FLV_SAMPLESSIZE_16BIT;
 break;
 case AV_CODEC_ID_PCM_MULAW:
-flags = FLV_CODECID_PCM_MULAW | FLV_SAMPLERATE_SPECIAL | FLV_SAMPLESSIZE_16BIT;
+flags |= FLV_CODECID_PCM_MULAW | FLV_SAMPLESSIZE_16BIT;
 break;
 case AV_CODEC_ID_PCM_ALAW:
-flags = FLV_CODECID_PCM_ALAW  | FLV_SAMPLERATE_SPECIAL | FLV_SAMPLESSIZE_16BIT;
+flags |= FLV_CODECID_PCM_ALAW | FLV_SAMPLESSIZE_16BIT;
 break;
 case 0:
 flags |= par->codec_tag << 4;
-- 
2.34.1



Re: [FFmpeg-devel] [PATCH v4 0/2] GSoC 2023: Add Audio Overlay Filter

2024-01-21 Thread Harshit Karwal
Ping

On Tue, 16 Jan 2024 at 5:46 PM, Harshit Karwal 
wrote:

> Includes some fixes authored by Paul over the v3 patch I sent earlier, and
> FATE tests for the filter.
>
> Harshit Karwal (2):
>   avfilter: add audio overlay filter
>   fate: Add tests for aoverlay filter
>
>  doc/filters.texi   |  40 ++
>  libavfilter/Makefile   |   1 +
>  libavfilter/af_aoverlay.c  | 538 +
>  libavfilter/allfilters.c   |   1 +
>  tests/fate/filter-audio.mak|  22 +
>  tests/ref/fate/filter-aoverlay-crossfade-d | 224 +
>  tests/ref/fate/filter-aoverlay-crossfade-t | 202 
>  tests/ref/fate/filter-aoverlay-default | 259 ++
>  tests/ref/fate/filter-aoverlay-timeline| 254 ++
>  9 files changed, 1541 insertions(+)
>  create mode 100644 libavfilter/af_aoverlay.c
>  create mode 100644 tests/ref/fate/filter-aoverlay-crossfade-d
>  create mode 100644 tests/ref/fate/filter-aoverlay-crossfade-t
>  create mode 100644 tests/ref/fate/filter-aoverlay-default
>  create mode 100644 tests/ref/fate/filter-aoverlay-timeline
>
> --
> 2.39.3 (Apple Git-145)
>
>


Re: [FFmpeg-devel] [PATCH 1/2 v2] avutil: add a Tile Grid API

2024-01-21 Thread Anton Khirnov
Quoting James Almer (2024-01-21 13:06:28)
> On 1/21/2024 3:27 AM, Anton Khirnov wrote:
> > Quoting James Almer (2024-01-20 23:04:06)
> >> This includes a struct and helpers. It will be used to support container 
> >> level
> >> cropping and tiled image formats, but should be generic enough for general
> >> usage.
> >>
> >> Signed-off-by: James Almer 
> >> ---
> >> Extended to include fields used for cropping. Should make the struct 
> >> reusable
> >> even for non tiled images, e.g. setting both rows and tiles to 1, in which 
> >> case
> >> tile width and height would become analogous to coded_{width,height}.
> > 
> > But why? What does cropping have to do with tiling? What advantage is
> > there to handling them in one struct?
> 
> The struct does not need to be used for non tiled image scenarios, but 
> could if we decide we don't want to add another struct that would only 
> contain a subset of the fields present here.
> 
> As to why said fields are present here, HEIF may use a clap box to 
> define cropping for the final image, not for the tiles. This needs to be 
> propagated, and the previous version of this API, which only defined 
> cropping from right and bottom edges if output dimensions were smaller 
> than the grid (standard case for tiled heif with no clap box), was not 
> enough. Hence this change.
> 
> I can rename this struct to Image Grid or something else, which might 
> make it feel less awkward if we decide to reuse it. We still need to 
> propagate container cropping from clap boxes and from Matroska elements 
> after all.

Honestly this whole new API strikes me as massively overthinking it. All
you should need to describe an arbitrary partition of an image into
sub-rectangles is an array of (x, y, width, height). Instead you're
proposing a new public header, struct, three functions, multiple "tile
types", and if I'm not mistaken it still cannot describe an arbitrary
partitioning. Plus it's in libavutil for some reason, even though
libavformat seems to be the only intended user.

Is all this complexity really warranted?

-- 
Anton Khirnov


Re: [FFmpeg-devel] [PATCH 1/2] lavf/dvenc: improve error messaging

2024-01-21 Thread Anton Khirnov
Quoting Stefano Sabatini (2024-01-21 11:30:27)
> > > -if (((c->n_ast > 1) && (c->sys->n_difchan < 2)) ||
> > > -((c->n_ast > 2) && (c->sys->n_difchan < 4))) {
> > > -/* only 2 stereo pairs allowed in 50Mbps mode */
> > > -goto bail_out;
> > > +if ((c->n_ast > 1) && (c->sys->n_difchan < 2)) {
> > > +av_log(s, AV_LOG_ERROR,
> > > +   "Invalid number of channels %d, only 1 stereo pairs is allowed in 25Mps mode.\n",
> > > +   c->n_ast);
> > > +return AVERROR_INVALIDDATA;
> > > +}
> > > +if ((c->n_ast > 2) && (c->sys->n_difchan < 4)) {
> > > +av_log(s, AV_LOG_ERROR,
> > > +   "Invalid number of channels %d, only 2 stereo pairs are allowed in 50Mps mode.\n",
> > > +   c->n_ast);
> > > +return AVERROR_INVALIDDATA;
> > 
> 
> > Surely this can be done in one log statement.
> 
> Yes, but this would complicate the logic for small gain.

More complicated than duplicating 5 lines? I wouldn't say so, not to
mention the string also has to be duplicated in the binary.

Also, can the second case even trigger? Seems like the block above
ensures n_ast is never larger than 2.

> > > -" video: 25fps or 29.97fps, audio: 
> > > 2ch/48|44|32kHz/PCM\n"
> 
> > > +" video: 25fps or 29.97fps, audio: 
> > > 2ch/48000|44100|32000Hz/PCM\n"
> > 
> > This does not seem like an improvement.
> 
> 44kHz != 44100
> 
> I could use 44.1 but this is not the unit used when setting the
> option

It can be.

-- 
Anton Khirnov


Re: [FFmpeg-devel] [PATCH 7/8] fftools/ffmpeg_demux: implement -bsf for input

2024-01-21 Thread Anton Khirnov
Quoting Stefano Sabatini (2024-01-20 12:32:42)
> On date Wednesday 2024-01-17 10:02:31 +0100, Anton Khirnov wrote:
> > Quoting Stefano Sabatini (2024-01-06 13:12:19)
> > > 
> > > This looks spurious, since this suggests the example is about the
> > > listing, and it's applying a weird order of example/explanation
> > > (rather than the opposite).
> 
> Use the @code{-bsfs} option to get the list of bitstream filters. E.g.
> @example
> ...
> 
> The problem here is that "E.g." is placed close to a statement about
> the listing, therefore it might sound like the example is about the
> listing (which is not).

I moved it to a new paragraph.

> > I see nothing weird about this order, it's the standard way it is done
> > in most literature I encounter. I find the reverse order you're
> > suggesting far more weird and unnatural.
> 
> When you present an example you usually start with an explanation
> (what it does) and then present the command, not the other way around.

I don't, neither does most literature I can recall. Typically you first
present a thing, then explain its structure. Explaining the structure of
something the reader has not seen yet is backwards, unnatural, and hard
to understand.

> 
> Also the following:
> --
> ffmpeg -bsf:v h264_mp4toannexb -i h264.mp4 -c:v copy -an out.h264
> @end example
> applies the @code{h264_mp4toannexb} bitstream filter (which converts
> MP4-encapsulated H.264 stream to Annex B) to the @emph{input} video stream.
> 
> On the other hand,
> @example
> ffmpeg -i file.mov -an -vn -bsf:s mov2textsub -c:s copy -f rawvideo sub.txt
> @end example
> applies the @code{mov2textsub} bitstream filter (which extracts text from MOV
> subtitles) to the @emph{output} subtitle stream. Note, however, that since 
> both
> examples use @code{-c copy}, it matters little whether the filters are applied
on input or output - that would change if transcoding was happening.
> ---
> 
> this makes the reader need to correlate the two examples to figure
> them out, that's why I reworked the presentation in my suggestion as a
> more linear sequence of presentation/command/presentation/command.
> 
> In general examples should focus on how a task can be done, not on the
> explanation of the command itself.

I disagree. Examples should focus on whatever can be usefully explained
with an example.

-- 
Anton Khirnov


Re: [FFmpeg-devel] [PATCH 1/2 v2] avutil: add a Tile Grid API

2024-01-21 Thread James Almer

On 1/21/2024 2:29 PM, Anton Khirnov wrote:

Quoting James Almer (2024-01-21 13:06:28)

On 1/21/2024 3:27 AM, Anton Khirnov wrote:

Quoting James Almer (2024-01-20 23:04:06)

This includes a struct and helpers. It will be used to support container level
cropping and tiled image formats, but should be generic enough for general
usage.

Signed-off-by: James Almer 
---
Extended to include fields used for cropping. Should make the struct reusable
even for non tiled images, e.g. setting both rows and tiles to 1, in which case
tile width and height would become analogous to coded_{width,height}.


But why? What does cropping have to do with tiling? What advantage is
there to handling them in one struct?


The struct does not need to be used for non tiled image scenarios, but
could if we decide we don't want to add another struct that would only
contain a subset of the fields present here.

As to why said fields are present here, HEIF may use a clap box to
define cropping for the final image, not for the tiles. This needs to be
propagated, and the previous version of this API, which only defined
cropping from right and bottom edges if output dimensions were smaller
than the grid (standard case for tiled heif with no clap box), was not
enough. Hence this change.

I can rename this struct to Image Grid or something else, which might
make it feel less awkward if we decide to reuse it. We still need to
propagate container cropping from clap boxes and from Matroska elements
after all.


Honestly this whole new API strikes me as massively overthinking it. All
you should need to describe an arbitrary partition of an image into
sub-rectangles is an array of (x, y, width, height). Instead you're
proposing a new public header, struct, three functions, multiple "tile
types", and if I'm not mistaken it still cannot describe an arbitrary
partitioning. Plus it's in libavutil for some reason, even though
libavformat seems to be the only intended user.

Is all this complexity really warranted?


1. It needs to be usable as a Stream Group type, so a struct is 
required. Said struct needs an allocator unless we want to have its size 
be part of the ABI. I can remove the free function, but then the caller 
needs to manually free any internal data.
2. We need tile dimensions (Width and height) plus row and column count, 
which give you the final size of the grid, then offsets x and y to get 
the actual image within the grid meant for presentation.
3. I want to support uniform tiles as well as variable tile dimensions, 
hence multiple tile types. The latter currently has no use case, but 
eventually might. I can if you prefer not include said type at first, 
but i want to keep the union in place so it and other extensions can be 
added.
4. It's in lavu because its meant to be generic. It can also be used to 
transport tiling and cropping information as stream and packet side 
data, which can't depend on something defined in lavf.


And what do you mean with not supporting describing arbitrary 
partitioning? Isn't that what variable tile dimensions achieve?



Re: [FFmpeg-devel] [PATCH 7/8] fftools/ffmpeg_demux: implement -bsf for input

2024-01-21 Thread Stefano Sabatini
On date Sunday 2024-01-21 18:43:36 +0100, Anton Khirnov wrote:
> Quoting Stefano Sabatini (2024-01-20 12:32:42)
[...]
> > When you present an example you usually start with an explanation
> > (what it does) and then present the command, not the other way around.
> 
> I don't, neither does most literature I can recall. Typically you first
> present a thing, then explain its structure. Explaining the structure of
> something the reader has not seen yet is backwards, unnatural, and hard
> to understand.

I still don't understand what "literature" you are referring to.

If you see most examples in the FFmpeg docs they are in the form:
@item
This does this and that...:
@example
...
@end example

An explanation is presented *before* introducing the example itself,
in other words plain English before the actual command/code.


Re: [FFmpeg-devel] [PATCH 1/2 v2] avutil: add a Tile Grid API

2024-01-21 Thread Anton Khirnov
Quoting James Almer (2024-01-21 18:47:43)
> On 1/21/2024 2:29 PM, Anton Khirnov wrote:
> > Honestly this whole new API strikes me as massively overthinking it. All
> > you should need to describe an arbitrary partition of an image into
> > sub-rectangles is an array of (x, y, width, height). Instead you're
> > proposing a new public header, struct, three functions, multiple "tile
> > types", and if I'm not mistaken it still cannot describe an arbitrary
> > partitioning. Plus it's in libavutil for some reason, even though
> > libavformat seems to be the only intended user.
> > 
> > Is all this complexity really warranted?
> 
> 1. It needs to be usable as a Stream Group type, so a struct is 
> required. Said struct needs an allocator unless we want to have its size 
> be part of the ABI. I can remove the free function, but then the caller 
> needs to manually free any internal data.

If the struct lives in lavf and is always allocated as a part of
AVStreamGroup then you don't need a public constructor/destructor and
can still extend the struct.

> 2. We need tile dimensions (Width and height) plus row and column count, 
> which give you the final size of the grid, then offsets x and y to get 
> the actual image within the grid meant for presentation.
> 3. I want to support uniform tiles as well as variable tile dimensions, 
> hence multiple tile types. The latter currently has no use case, but 
> eventually might. I can if you prefer not include said type at first, 
> but i want to keep the union in place so it and other extensions can be 
> added.
> 4. It's in lavu because its meant to be generic. It can also be used to 
> transport tiling and cropping information as stream and packet side 
> data, which can't depend on something defined in lavf.

When would you have tiling information associated with a specific
stream?

> And what do you mean with not supporting describing arbitrary 
> partitioning? Isn't that what variable tile dimensions achieve?

IIUC your tiling scheme still assumes that the partitioning is by rows
and columns. A completely generic partitioning could be irregular.

-- 
Anton Khirnov


Re: [FFmpeg-devel] 0001-fix-segment-fault-in-function-decode

2024-01-21 Thread Stefano Sabatini
On date Saturday 2024-01-13 05:57:18 +0800, 陈督 wrote:
> /* When it is not a planar arrangement, data[1] is empty,
>and all the data is interleaved in data[0].
>This can result in a segmentation fault when accessing data[ch]. */
> 
> // So I delete the code below:
> for (i = 0; i < frame->nb_samples; i++)
> for (ch = 0; ch < dec_ctx->ch_layout.nb_channels; ch++)
> fwrite(frame->data[ch] + data_size*i, 1, data_size, outfile);
> 
> // And I write this instead
> // L R data order
> if (av_sample_fmt_is_planar(dec_ctx->sample_fmt))
> {
> // planar: LLL...RRR... in different data[ch]
> for (ch = 0; ch < dec_ctx->ch_layout.nb_channels; ch++)
> {
> fwrite(frame->data[ch], 1, frame->linesize[0], outfile); // only linesize[0] has data.

The problem with this approach is that this is generating output in a
format which cannot be played by ffplay, which is assuming packed
(i.e. non planar) format. So it is expecting the output file to be
written as:
LRLRLR...

rather than as:
LLL...RRR...

also because ffplay does not know the linesize.

...

But I see the example code should be fixed, it was designed with the
assumption that the input sample format was always packed, which is
not the case anymore.

[...]


Re: [FFmpeg-devel] [PATCH 7/8] fftools/ffmpeg_demux: implement -bsf for input

2024-01-21 Thread Anton Khirnov
Quoting Stefano Sabatini (2024-01-21 19:22:35)
> On date Sunday 2024-01-21 18:43:36 +0100, Anton Khirnov wrote:
> > Quoting Stefano Sabatini (2024-01-20 12:32:42)
> [...]
> > > When you present an example you usually start with an explanation
> > > (what it does) and then present the command, not the other way around.
> > 
> > I don't, neither does most literature I can recall. Typically you first
> > present a thing, then explain its structure. Explaining the structure of
> > something the reader has not seen yet is backwards, unnatural, and hard
> > to understand.
> 
> I still don't understand what "literature" you are referring to.

Various manuals and textbooks I've read.

> If you see most examples in the FFmpeg docs they are in the form:

Our documentation is widely considered to be somewhere between atrocious
and unusable (and sometimes actively misleading), so the fact that it
does something in a specific way does not at all mean that it's a good
idea.

I have also personally seen (and fixed) countless instances of
mindlessly perpetuated cargo cults in it.

-- 
Anton Khirnov


Re: [FFmpeg-devel] [PATCH 1/2 v2] avutil: add a Tile Grid API

2024-01-21 Thread James Almer

On 1/21/2024 3:29 PM, Anton Khirnov wrote:

Quoting James Almer (2024-01-21 18:47:43)

On 1/21/2024 2:29 PM, Anton Khirnov wrote:

Honestly this whole new API strikes me as massively overthinking it. All
you should need to describe an arbitrary partition of an image into
sub-rectangles is an array of (x, y, width, height). Instead you're
proposing a new public header, struct, three functions, multiple "tile
types", and if I'm not mistaken it still cannot describe an arbitrary
partitioning. Plus it's in libavutil for some reason, even though
libavformat seems to be the only intended user.

Is all this complexity really warranted?


1. It needs to be usable as a Stream Group type, so a struct is
required. Said struct needs an allocator unless we want to have its size
be part of the ABI. I can remove the free function, but then the caller
needs to manually free any internal data.


If the struct lives in lavf and is always allocated as a part of
AVStreamGroup then you don't need a public constructor/destructor and
can still extend the struct.


Yes, but that would be the case if it's only meant to be allocated by 
AVStreamGroup and nothing else.





2. We need tile dimensions (Width and height) plus row and column count,
which give you the final size of the grid, then offsets x and y to get
the actual image within the grid meant for presentation.
3. I want to support uniform tiles as well as variable tile dimensions,
hence multiple tile types. The latter currently has no use case, but
eventually might. I can if you prefer not include said type at first,
but i want to keep the union in place so it and other extensions can be
added.
4. It's in lavu because its meant to be generic. It can also be used to
transport tiling and cropping information as stream and packet side
data, which can't depend on something defined in lavf.


When would you have tiling information associated with a specific
stream?


Can't think of an example for tiling, but i can for cropping. If you 
insist on not reusing this for non-HEIF cropping usage in mp4/matroska, 
then ok, I'll move it to lavf.





And what do you mean with not supporting describing arbitrary
partitioning? Isn't that what variable tile dimensions achieve?


IIUC your tiling scheme still assumes that the partitioning is by rows
and columns. A completely generic partitioning could be irregular.


A new tile type that doesn't define rows and columns can be added if 
needed. But the current variable tile type can support things like grids 
of two rows and two columns where the second row is effectively a single 
tile, simply by setting the second tile in said row as having a width of 0.



Re: [FFmpeg-devel] [PATCH 1/2 v2] avutil: add a Tile Grid API

2024-01-21 Thread Anton Khirnov
Quoting James Almer (2024-01-21 19:38:50)
> On 1/21/2024 3:29 PM, Anton Khirnov wrote:
> > Quoting James Almer (2024-01-21 18:47:43)
> >> On 1/21/2024 2:29 PM, Anton Khirnov wrote:
> >>> Honestly this whole new API strikes me as massively overthinking it. All
> >>> you should need to describe an arbitrary partition of an image into
> >>> sub-rectangles is an array of (x, y, width, height). Instead you're
> >>> proposing a new public header, struct, three functions, multiple "tile
> >>> types", and if I'm not mistaken it still cannot describe an arbitrary
> >>> partitioning. Plus it's in libavutil for some reason, even though
> >>> libavformat seems to be the only intended user.
> >>>
> >>> Is all this complexity really warranted?
> >>
> >> 1. It needs to be usable as a Stream Group type, so a struct is
> >> required. Said struct needs an allocator unless we want to have its size
> >> be part of the ABI. I can remove the free function, but then the caller
> >> needs to manually free any internal data.
> > 
> > If the struct lives in lavf and is always allocated as a part of
> > AVStreamGroup then you don't need a public constructor/destructor and
> > can still extend the struct.
> 
> Yes, but that would be the case if it's only meant to be allocated by 
> AVStreamGroup and nothing else.

That is the case right now, no?

If that ever changes then the constructor can be added.

> > 
> >> 2. We need tile dimensions (Width and height) plus row and column count,
> >> which give you the final size of the grid, then offsets x and y to get
> >> the actual image within the grid meant for presentation.
> >> 3. I want to support uniform tiles as well as variable tile dimensions,
> >> hence multiple tile types. The latter currently has no use case, but
> >> eventually might. I can if you prefer not include said type at first,
> >> but i want to keep the union in place so it and other extensions can be
> >> added.
> >> 4. It's in lavu because its meant to be generic. It can also be used to
> >> transport tiling and cropping information as stream and packet side
> >> data, which can't depend on something defined in lavf.
> > 
> > When would you have tiling information associated with a specific
> > stream?
> 
> Can't think of an example for tiling, but i can for cropping. If you 
> insist on not reusing this for non-HEIF cropping usage in mp4/matroska, 
> then ok, I'll move it to lavf.

I still don't see why should it be a good idea to use this struct for
generic container cropping. It feels very much like a hammer in search
of a nail.

> > 
> >> And what do you mean with not supporting describing arbitrary
> >> partitioning? Isn't that what variable tile dimensions achieve?
> > 
> > IIUC your tiling scheme still assumes that the partitioning is by rows
> > and columns. A completely generic partitioning could be irregular.
> 
> A new tile type that doesn't define rows and columns can be added if 
> needed. But the current variable tile type can support things like grids 
> of two rows and two columns where the second row is effectively a single 
> tile, simply by setting the second tile in said row as having a width of 0.

The problem I see here is that every consumer of this struct then has to
explicitly support every type, and adding a new type requires updating
all callers. This seems unnecessary when "list of N rectangles" covers
all possible partitionings.

That does not mean you actually have to store it that way - the struct
could be a list of N rectangles logically, while actually being
represented more efficiently (in the same way a channel layout is always
logically a list of channels, even though it's often represented by an
uint64 rather than a malloced array).

-- 
Anton Khirnov


Re: [FFmpeg-devel] [PATCH 7/8] fftools/ffmpeg_demux: implement -bsf for input

2024-01-21 Thread Stefano Sabatini
On date Sunday 2024-01-21 19:35:01 +0100, Anton Khirnov wrote:
> Quoting Stefano Sabatini (2024-01-21 19:22:35)
> > On date Sunday 2024-01-21 18:43:36 +0100, Anton Khirnov wrote:
> > > Quoting Stefano Sabatini (2024-01-20 12:32:42)
> > [...]
> > > > When you present an example you usually start with an explanation
> > > > (what it does) and then present the command, not the other way around.
> > > 
> > > I don't, neither does most literature I can recall. Typically you first
> > > present a thing, then explain its structure. Explaining the structure of
> > > something the reader has not seen yet is backwards, unnatural, and hard
> > > to understand.
> > 
> > I still don't understand what "literature" you are referring to.
> 
> Various manuals and textbooks I've read.
> 
> > If you see most examples in the FFmpeg docs they are in the form:
> 

> Our documentation is widely considered to be somewhere between atrocious
> and unusable

nah, it's not so bad, also this applies to most documentation

Besides FFmpeg is possibly the most sophisticated existing toolkit in
terms of features/configuration, so this is somehow expected (at least
if you expect a tutorial rather than a reference).

> (and sometimes actively misleading), so the fact that it
> does something in a specific way does not at all mean that it's a good
> idea.

So what do you propose instead? The fact that it is not perfect does
not mean that everything is bad.


Re: [FFmpeg-devel] [PATCH 1/2] lavf/dvenc: improve error messaging

2024-01-21 Thread Stefano Sabatini
On date Sunday 2024-01-21 18:39:19 +0100, Anton Khirnov wrote:
> Quoting Stefano Sabatini (2024-01-21 11:30:27)
[...]
> Also, can the second case even trigger? Seems like the block above
> ensures n_ast is never larger than 2.

Yes, this seems a miss from commit
eafa8e859297813dcf0e6b43e85720be0a5f.



Re: [FFmpeg-devel] [PATCH 1/2 v2] avutil: add a Tile Grid API

2024-01-21 Thread James Almer

On 1/21/2024 4:02 PM, Anton Khirnov wrote:

Quoting James Almer (2024-01-21 19:38:50)

On 1/21/2024 3:29 PM, Anton Khirnov wrote:

Quoting James Almer (2024-01-21 18:47:43)

On 1/21/2024 2:29 PM, Anton Khirnov wrote:

Honestly this whole new API strikes me as massively overthinking it. All
you should need to describe an arbitrary partition of an image into
sub-rectangles is an array of (x, y, width, height). Instead you're
proposing a new public header, struct, three functions, multiple "tile
types", and if I'm not mistaken it still cannot describe an arbitrary
partitioning. Plus it's in libavutil for some reason, even though
libavformat seems to be the only intended user.

Is all this complexity really warranted?


1. It needs to be usable as a Stream Group type, so a struct is
required. Said struct needs an allocator unless we want to have its size
be part of the ABI. I can remove the free function, but then the caller
needs to manually free any internal data.


If the struct lives in lavf and is always allocated as a part of
AVStreamGroup then you don't need a public constructor/destructor and
can still extend the struct.


Yes, but that would be the case if it's only meant to be allocated by
AVStreamGroup and nothing else.


That is the case right now, no?

If that ever changes then the constructor can be added.




2. We need tile dimensions (Width and height) plus row and column count,
which give you the final size of the grid, then offsets x and y to get
the actual image within the grid meant for presentation.
3. I want to support uniform tiles as well as variable tile dimensions,
hence multiple tile types. The latter currently has no use case, but
eventually might. I can if you prefer not include said type at first,
but i want to keep the union in place so it and other extensions can be
added.
4. It's in lavu because its meant to be generic. It can also be used to
transport tiling and cropping information as stream and packet side
data, which can't depend on something defined in lavf.


When would you have tiling information associated with a specific
stream?


Can't think of an example for tiling, but i can for cropping. If you
insist on not reusing this for non-HEIF cropping usage in mp4/matroska,
then ok, I'll move it to lavf.


I still don't see why should it be a good idea to use this struct for
generic container cropping. It feels very much like a hammer in search
of a nail.


Because once we support container cropping, we will be defining a 
stream/packet side data type that will contain a subset of the fields 
from this struct.


If we reuse this struct, we can export a clap box as an AVTileGrid (or I
can rename it to AVImageGrid, and tile to subrectangle) either as the
stream group tile grid specific parameters if HEIF, or as stream side
data otherwise.







And what do you mean by not supporting describing arbitrary
partitioning? Isn't that what variable tile dimensions achieve?


IIUC your tiling scheme still assumes that the partitioning is by rows
and columns. A completely generic partitioning could be irregular.


A new tile type that doesn't define rows and columns can be added if
needed. But the current variable tile type can support things like grids
of two rows and two columns where the second row is effectively a single
tile, simply by setting the second tile in said row as having a width of 0.


The problem I see here is that every consumer of this struct then has to
explicitly support every type, and adding a new type requires updating
all callers. This seems unnecessary when "list of N rectangles" covers
all possible partitionings.


Well, the variable type supports a list of N rectangles where each
rectangle has arbitrary dimensions, and you can do things like having
three tiles/rectangles that together still form a rectangle, while
defining row and column count. So I don't personally see the need for a
new type to begin with.




That does not mean you actually have to store it that way - the struct
could be a list of N rectangles logically, while actually being
represented more efficiently (in the same way a channel layout is always
logically a list of channels, even though it's often represented by an
uint64 rather than a malloced array).
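Anton's channel-layout analogy can be sketched concretely. Everything below is a hypothetical illustration, not FFmpeg API: the uniform case is stored compactly (one tile size for the whole grid), while consumers only ever see the logical "list of N rectangles", computed on demand.

```c
#include <assert.h>

/* Hypothetical names throughout -- a sketch of "logically a list of
 * rectangles, represented compactly" for the uniform-tile case. */
typedef struct Rect { int x, y, w, h; } Rect;

typedef struct Grid {
    int rows, cols;       /* grid layout                         */
    int tile_w, tile_h;   /* one size shared by all tiles        */
} Grid;

/* Logical view: the rectangle for tile index idx, derived on demand
 * instead of being stored per tile. */
static Rect grid_get_rect(const Grid *g, int idx)
{
    Rect r;
    r.x = (idx % g->cols) * g->tile_w;
    r.y = (idx / g->cols) * g->tile_h;
    r.w = g->tile_w;
    r.h = g->tile_h;
    return r;
}
```

A consumer iterating `rows * cols` indices sees exactly a list of rectangles, and an irregular-partitioning representation could later back the same accessor without touching callers.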


___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


[FFmpeg-devel] [PATCH] liavcodec: add bit-rate support to RoQ video encoder

2024-01-21 Thread Victor Luchits

One can now use the bitrate option (-b) to specify bit rate of the video
stream in the RoQ encoder. The option only becomes effective for values
above 800kbit/s, which is roughly equivalent to bandwidth of a 1x-speed
CD-ROM drive, minus the bandwidth taken up by stereo DPCM stream. Values
below this threshold produce visually inadequate results.
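As a rough sanity check of that threshold (the figures below are assumptions, not taken from the patch: a 1x CD-ROM reads about 150 KiB/s, and RoQ stereo DPCM audio runs at 22050 Hz with 8 bits per encoded sample per channel):

```python
# Back-of-the-envelope: bandwidth left for video on a 1x CD-ROM drive
# after the stereo DPCM stream is accounted for.
cd_1x_bps = 150 * 1024 * 8    # 1x CD-ROM: 150 KiB/s -> 1228800 bit/s
dpcm_bps = 22050 * 2 * 8      # stereo DPCM: 22050 Hz, 2 ch, 8 bit/sample
video_budget = cd_1x_bps - dpcm_bps
print(video_budget)           # 876000 bit/s, in the ballpark of 800*1024 = 819200
```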

Original patch by Joseph Fenton aka Chilly Willy

Signed-off-by: Victor Luchits 
---
 Changelog|   1 +
 libavcodec/roqvideo.h|   1 +
 libavcodec/roqvideodec.c |  15 +
 libavcodec/roqvideoenc.c | 118 ++-
 libavcodec/version.h |   2 +-
 5 files changed, 123 insertions(+), 14 deletions(-)

diff --git a/Changelog b/Changelog
index c40b6d08fd..6974312f9d 100644
--- a/Changelog
+++ b/Changelog
@@ -22,6 +22,7 @@ version <next>:
 - ffmpeg CLI -bsf option may now be used for input as well as output
 - ffmpeg CLI options may now be used as -/opt <path>, which is equivalent
   to -opt <contents of file <path>>
+- RoQ video bit rate option support
  version 6.1:
 - libaribcaption decoder
diff --git a/libavcodec/roqvideo.h b/libavcodec/roqvideo.h
index 2c2e42884d..6d30bcaada 100644
--- a/libavcodec/roqvideo.h
+++ b/libavcodec/roqvideo.h
@@ -43,6 +43,7 @@ typedef struct RoqContext {
 AVFrame *last_frame;
 AVFrame *current_frame;
 int width, height;
+int key_frame;
  roq_cell cb2x2[256];
 roq_qcell cb4x4[256];
diff --git a/libavcodec/roqvideodec.c b/libavcodec/roqvideodec.c
index bfc69a65c9..07d6b8bb8f 100644
--- a/libavcodec/roqvideodec.c
+++ b/libavcodec/roqvideodec.c
@@ -70,6 +70,7 @@ static void roqvideo_decode_frame(RoqContext *ri, 
GetByteContext *gb)

  chunk_start = bytestream2_tell(gb);
 xpos = ypos = 0;
+ri->key_frame = 1;
  if (chunk_size > bytestream2_get_bytes_left(gb)) {
 av_log(ri->logctx, AV_LOG_ERROR, "Chunk does not fit in input 
buffer\n");
@@ -92,12 +93,14 @@ static void roqvideo_decode_frame(RoqContext *ri, 
GetByteContext *gb)

  switch(vqid) {
 case RoQ_ID_MOT:
+ri->key_frame = 0;
 break;
 case RoQ_ID_FCC: {
 int byte = bytestream2_get_byte(gb);
 mx = 8 - (byte >> 4) - ((signed char) (chunk_arg 
>> 8));

 my = 8 - (byte & 0xf) - ((signed char) chunk_arg);
 ff_apply_motion_8x8(ri, xp, yp, mx, my);
+ri->key_frame = 0;
 break;
 }
 case RoQ_ID_SLD:
@@ -125,12 +128,14 @@ static void roqvideo_decode_frame(RoqContext *ri, 
GetByteContext *gb)

 vqflg_pos--;
 switch(vqid) {
 case RoQ_ID_MOT:
+ri->key_frame = 0;
 break;
 case RoQ_ID_FCC: {
 int byte = bytestream2_get_byte(gb);
 mx = 8 - (byte >> 4) - ((signed char) 
(chunk_arg >> 8));
 my = 8 - (byte & 0xf) - ((signed char) 
chunk_arg);

 ff_apply_motion_4x4(ri, x, y, mx, my);
+ri->key_frame = 0;
 break;
 }
 case RoQ_ID_SLD:
@@ -214,6 +219,16 @@ static int roq_decode_frame(AVCodecContext *avctx, 
AVFrame *rframe,

  if ((ret = av_frame_ref(rframe, s->current_frame)) < 0)
 return ret;
+
+/* Keyframe when no MOT or FCC codes in frame */
+if (s->key_frame) {
+av_log(avctx, AV_LOG_VERBOSE, "\nFound keyframe!\n");
+rframe->pict_type = AV_PICTURE_TYPE_I;
+avpkt->flags |= AV_PKT_FLAG_KEY;
+} else {
+rframe->pict_type = AV_PICTURE_TYPE_P;
+}
+
 *got_frame  = 1;
  /* shuffle frames */
diff --git a/libavcodec/roqvideoenc.c b/libavcodec/roqvideoenc.c
index 0933abf4f9..bcead80bbd 100644
--- a/libavcodec/roqvideoenc.c
+++ b/libavcodec/roqvideoenc.c
@@ -79,6 +79,9 @@
 /* The cast is useful when multiplying it by INT_MAX */
 #define ROQ_LAMBDA_SCALE ((uint64_t) FF_LAMBDA_SCALE)
 +/* The default minimum bitrate, set around the value of a 1x speed 
CD-ROM drive */

+#define ROQ_DEFAULT_MIN_BIT_RATE 800*1024
+
 typedef struct RoqCodebooks {
 int numCB4;
 int numCB2;
@@ -136,6 +139,8 @@ typedef struct RoqEncContext {
 struct ELBGContext *elbg;
 AVLFG randctx;
 uint64_t lambda;
+uint64_t last_lambda;
+int lambda_delta;
  motion_vect *this_motion4;
 motion_vect *last_motion4;
@@ -887,8 +892,9 @@ static int generate_new_codebooks(RoqEncContext *enc)
 return 0;
 }
 -static int roq_encode_video(RoqEncContext *enc)
+static int roq_encode_video(AVCodecContext *avctx)
 {
+RoqEncContext *const enc = avctx->priv_data;
 RoqTempData *const tempData = &enc->tmp_data;
 RoqContext *const roq = &enc->common;
 int ret;
@@ -910,14 +916,14 @@ static int r

Re: [FFmpeg-devel] [PATCH] liavcodec: add bit-rate support to RoQ video encoder

2024-01-21 Thread Michael Niedermayer
On Sun, Jan 21, 2024 at 11:19:43PM +0300, Victor Luchits wrote:
> One can now use the bitrate option (-b) to specify bit rate of the video
> stream in the RoQ encoder. The option only becomes effective for values
> above 800kbit/s, which is roughly equivalent to bandwidth of a 1x-speed
> CD-ROM drive, minus the bandwidth taken up by stereo DPCM stream. Values
> below this threshold produce visually inadequate results.
> 
> Original patch by Joseph Fenton aka Chilly Willy
> 
> Signed-off-by: Victor Luchits 

[...]

> diff --git a/libavcodec/roqvideodec.c b/libavcodec/roqvideodec.c
> index bfc69a65c9..07d6b8bb8f 100644
> --- a/libavcodec/roqvideodec.c
> +++ b/libavcodec/roqvideodec.c
> @@ -70,6 +70,7 @@ static void roqvideo_decode_frame(RoqContext *ri,
> GetByteContext *gb)
>   chunk_start = bytestream2_tell(gb);
>  xpos = ypos = 0;
> +ri->key_frame = 1;
>   if (chunk_size > bytestream2_get_bytes_left(gb)) {
>  av_log(ri->logctx, AV_LOG_ERROR, "Chunk does not fit in input
> buffer\n");
> @@ -92,12 +93,14 @@ static void roqvideo_decode_frame(RoqContext *ri,
> GetByteContext *gb)
>   switch(vqid) {

There seems to be some line wraping problem

please repost the patch without linewraping / extra newlines

Applying: liavcodec: add bit-rate support to RoQ video encoder
error: corrupt patch at line 20
error: could not build fake ancestor
Patch failed at 0001 liavcodec: add bit-rate support to RoQ video encoder
Use 'git am --show-current-patch' to see the failed patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".


thx

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

I have never wished to cater to the crowd; for what I know they do not
approve, and what they approve I do not know. -- Epicurus




Re: [FFmpeg-devel] [PATCH 1/2 v2] avutil: add a Tile Grid API

2024-01-21 Thread James Almer

On 1/21/2024 4:29 PM, James Almer wrote:

On 1/21/2024 4:02 PM, Anton Khirnov wrote:

Quoting James Almer (2024-01-21 19:38:50)

On 1/21/2024 3:29 PM, Anton Khirnov wrote:

Quoting James Almer (2024-01-21 18:47:43)

On 1/21/2024 2:29 PM, Anton Khirnov wrote:
Honestly this whole new API strikes me as massively overthinking it. All
you should need to describe an arbitrary partition of an image into
sub-rectangles is an array of (x, y, width, height). Instead you're
proposing a new public header, struct, three functions, multiple "tile
types", and if I'm not mistaken it still cannot describe an arbitrary
partitioning. Plus it's in libavutil for some reason, even though
libavformat seems to be the only intended user.

Is all this complexity really warranted?


1. It needs to be usable as a Stream Group type, so a struct is
required. Said struct needs an allocator unless we want to have its size
be part of the ABI. I can remove the free function, but then the caller
needs to manually free any internal data.


If the struct lives in lavf and is always allocated as a part of
AVStreamGroup then you don't need a public constructor/destructor and
can still extend the struct.


Yes, but that would be the case if it's only meant to be allocated by
AVStreamGroup and nothing else.


That is the case right now, no?

If that ever changes then the constructor can be added.



2. We need tile dimensions (width and height) plus row and column count,
which give you the final size of the grid, then offsets x and y to get
the actual image within the grid meant for presentation.
3. I want to support uniform tiles as well as variable tile dimensions,
hence multiple tile types. The latter currently has no use case, but
eventually might. I can, if you prefer, not include said type at first,
but I want to keep the union in place so it and other extensions can be
added.
4. It's in lavu because it's meant to be generic. It can also be used to
transport tiling and cropping information as stream and packet side
data, which can't depend on something defined in lavf.


When would you have tiling information associated with a specific
stream?


Can't think of an example for tiling, but I can for cropping. If you
insist on not reusing this for non-HEIF cropping usage in mp4/matroska,
then OK, I'll move it to lavf.


I still don't see why it should be a good idea to use this struct for
generic container cropping. It feels very much like a hammer in search
of a nail.


Because once we support container cropping, we will be defining a 
stream/packet side data type that will contain a subset of the fields 
from this struct.


If we reuse this struct, we can export a clap box as an AVTileGrid (or I
can rename it to AVImageGrid, and tile to subrectangle) either as the
stream group tile grid specific parameters if HEIF, or as stream side
data otherwise.







And what do you mean by not supporting describing arbitrary
partitioning? Isn't that what variable tile dimensions achieve?


IIUC your tiling scheme still assumes that the partitioning is by rows
and columns. A completely generic partitioning could be irregular.


A new tile type that doesn't define rows and columns can be added if
needed. But the current variable tile type can support things like grids
of two rows and two columns where the second row is effectively a single
tile, simply by setting the second tile in said row as having a width of 0.


The problem I see here is that every consumer of this struct then has to
explicitly support every type, and adding a new type requires updating
all callers. This seems unnecessary when "list of N rectangles" covers
all possible partitionings.


Well, the variable type supports a list of N rectangles where each
rectangle has arbitrary dimensions, and you can do things like having
three tiles/rectangles that together still form a rectangle, while
defining row and column count. So I don't personally see the need for a
new type to begin with.


I could remove the types and the union altogether and leave only the
array even for uniform tiles if you think that simplifies the API, but
it seems like a waste of memory to allocate a rows x cols array of ints
just to have the same value written for every entry.
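The memory tradeoff described above can be illustrated directly (the grid size and tile dimensions below are made up):

```python
# A uniform 16x16 grid described as an explicit per-tile array repeats
# one (width, height) pair rows*cols times, whereas a "uniform" tile
# type needs to store that pair only once.
rows, cols = 16, 16
uniform = (512, 512)                  # one pair for the whole grid
per_tile = [uniform] * (rows * cols)  # explicit "list of rectangles"
assert all(t == uniform for t in per_tile)
print(len(per_tile))                  # 256 entries all carrying the same value
```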



[FFmpeg-devel] [PATCH] liavcodec: add bit-rate support to RoQ video encoder

2024-01-21 Thread Victor Luchits

One can now use the bitrate option (-b) to specify bit rate of the video
stream in the RoQ encoder. The option only becomes effective for values
above 800kbit/s, which is roughly equivalent to bandwidth of a 1x-speed
CD-ROM drive, minus the bandwidth taken up by stereo DPCM stream. Values
below this threshold produce visually inadequate results.

Original patch by Joseph Fenton aka Chilly Willy

Signed-off-by: Victor Luchits 
---
 Changelog|   1 +
 libavcodec/roqvideo.h|   1 +
 libavcodec/roqvideodec.c |  15 +
 libavcodec/roqvideoenc.c | 118 ++-
 libavcodec/version.h |   2 +-
 5 files changed, 123 insertions(+), 14 deletions(-)

diff --git a/Changelog b/Changelog
index c40b6d08fd..6974312f9d 100644
--- a/Changelog
+++ b/Changelog
@@ -22,6 +22,7 @@ version <next>:
 - ffmpeg CLI -bsf option may now be used for input as well as output
 - ffmpeg CLI options may now be used as -/opt <path>, which is equivalent
   to -opt <contents of file <path>>
+- RoQ video bit rate option support
 
 version 6.1:

 - libaribcaption decoder
diff --git a/libavcodec/roqvideo.h b/libavcodec/roqvideo.h
index 2c2e42884d..6d30bcaada 100644
--- a/libavcodec/roqvideo.h
+++ b/libavcodec/roqvideo.h
@@ -43,6 +43,7 @@ typedef struct RoqContext {
 AVFrame *last_frame;
 AVFrame *current_frame;
 int width, height;
+int key_frame;
 
 roq_cell cb2x2[256];

 roq_qcell cb4x4[256];
diff --git a/libavcodec/roqvideodec.c b/libavcodec/roqvideodec.c
index bfc69a65c9..07d6b8bb8f 100644
--- a/libavcodec/roqvideodec.c
+++ b/libavcodec/roqvideodec.c
@@ -70,6 +70,7 @@ static void roqvideo_decode_frame(RoqContext *ri, 
GetByteContext *gb)
 
 chunk_start = bytestream2_tell(gb);

 xpos = ypos = 0;
+ri->key_frame = 1;
 
 if (chunk_size > bytestream2_get_bytes_left(gb)) {

 av_log(ri->logctx, AV_LOG_ERROR, "Chunk does not fit in input 
buffer\n");
@@ -92,12 +93,14 @@ static void roqvideo_decode_frame(RoqContext *ri, 
GetByteContext *gb)
 
 switch(vqid) {

 case RoQ_ID_MOT:
+ri->key_frame = 0;
 break;
 case RoQ_ID_FCC: {
 int byte = bytestream2_get_byte(gb);
 mx = 8 - (byte >> 4) - ((signed char) (chunk_arg >> 8));
 my = 8 - (byte & 0xf) - ((signed char) chunk_arg);
 ff_apply_motion_8x8(ri, xp, yp, mx, my);
+ri->key_frame = 0;
 break;
 }
 case RoQ_ID_SLD:
@@ -125,12 +128,14 @@ static void roqvideo_decode_frame(RoqContext *ri, 
GetByteContext *gb)
 vqflg_pos--;
 switch(vqid) {
 case RoQ_ID_MOT:
+ri->key_frame = 0;
 break;
 case RoQ_ID_FCC: {
 int byte = bytestream2_get_byte(gb);
 mx = 8 - (byte >> 4) - ((signed char) (chunk_arg 
>> 8));
 my = 8 - (byte & 0xf) - ((signed char) chunk_arg);
 ff_apply_motion_4x4(ri, x, y, mx, my);
+ri->key_frame = 0;
 break;
 }
 case RoQ_ID_SLD:
@@ -214,6 +219,16 @@ static int roq_decode_frame(AVCodecContext *avctx, AVFrame 
*rframe,
 
 if ((ret = av_frame_ref(rframe, s->current_frame)) < 0)

 return ret;
+
+/* Keyframe when no MOT or FCC codes in frame */
+if (s->key_frame) {
+av_log(avctx, AV_LOG_VERBOSE, "\nFound keyframe!\n");
+rframe->pict_type = AV_PICTURE_TYPE_I;
+avpkt->flags |= AV_PKT_FLAG_KEY;
+} else {
+rframe->pict_type = AV_PICTURE_TYPE_P;
+}
+
 *got_frame  = 1;
 
 /* shuffle frames */

diff --git a/libavcodec/roqvideoenc.c b/libavcodec/roqvideoenc.c
index 0933abf4f9..bcead80bbd 100644
--- a/libavcodec/roqvideoenc.c
+++ b/libavcodec/roqvideoenc.c
@@ -79,6 +79,9 @@
 /* The cast is useful when multiplying it by INT_MAX */
 #define ROQ_LAMBDA_SCALE ((uint64_t) FF_LAMBDA_SCALE)
 
+/* The default minimum bitrate, set around the value of a 1x speed CD-ROM drive */

+#define ROQ_DEFAULT_MIN_BIT_RATE 800*1024
+
 typedef struct RoqCodebooks {
 int numCB4;
 int numCB2;
@@ -136,6 +139,8 @@ typedef struct RoqEncContext {
 struct ELBGContext *elbg;
 AVLFG randctx;
 uint64_t lambda;
+uint64_t last_lambda;
+int lambda_delta;
 
 motion_vect *this_motion4;

 motion_vect *last_motion4;
@@ -887,8 +892,9 @@ static int generate_new_codebooks(RoqEncContext *enc)
 return 0;
 }
 
-static int roq_encode_video(RoqEncContext *enc)

+static int roq_encode_video(AVCodecContext *avctx)
 {
+RoqEncContext *const enc = avctx->priv_data;
 RoqTempData *const tempData = &enc->tmp_data;
 RoqContext *const roq = &enc->common;
 int ret;
@@ -910,14 +916,14 @@ st

[FFmpeg-devel] [PATCH] lavc/aarch64: fix include for cpu.h

2024-01-21 Thread Ramiro Polla
---
 libavcodec/aarch64/idctdsp_init_aarch64.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavcodec/aarch64/idctdsp_init_aarch64.c 
b/libavcodec/aarch64/idctdsp_init_aarch64.c
index eec21aa5a2..8efd5f5323 100644
--- a/libavcodec/aarch64/idctdsp_init_aarch64.c
+++ b/libavcodec/aarch64/idctdsp_init_aarch64.c
@@ -22,7 +22,7 @@
 
 #include "libavutil/attributes.h"
 #include "libavutil/cpu.h"
-#include "libavutil/arm/cpu.h"
+#include "libavutil/aarch64/cpu.h"
 #include "libavcodec/avcodec.h"
 #include "libavcodec/idctdsp.h"
 #include "idct.h"
-- 
2.30.2



Re: [FFmpeg-devel] [PATCH v1 2/2] vaapi: add vaapi_avs2 support

2024-01-21 Thread Liu Steven


> On Jan 21, 2024, at 22:47, Jianfeng Zheng  wrote:
> 
> Zhao Zhili wrote on Saturday, January 20, 2024 at 12:22:
>> 
>> 
>>> -Original Message-
>>> From: ffmpeg-devel  On Behalf Of 
>>> jianfeng.zheng
>>> Sent: January 19, 2024 23:53
>>> To: ffmpeg-devel@ffmpeg.org
>>> Cc: jianfeng.zheng 
>>> Subject: [FFmpeg-devel] [PATCH v1 2/2] vaapi: add vaapi_avs2 support
>>> 
>>> see https://github.com/intel/libva/pull/738
>>> 
>>> [Moore Threads](https://www.mthreads.com) (short for Mthreads) is a
>>> Chinese GPU manufacturer. All our products, like MTTS70/MTTS80/.. ,
>>> support AVS2 8bit/10bit HW decoding at max 8k resolution.
>>> 
>>> Signed-off-by: jianfeng.zheng 
>>> ---
>>> configure|   7 +
>>> libavcodec/Makefile  |   2 +
>>> libavcodec/allcodecs.c   |   1 +
>>> libavcodec/avs2.c| 345 ++-
>>> libavcodec/avs2.h| 460 +++-
>>> libavcodec/avs2_parser.c |   5 +-
>>> libavcodec/avs2dec.c | 569 +
>>> libavcodec/avs2dec.h |  48 +++
>>> libavcodec/avs2dec_headers.c | 787 +++
>>> libavcodec/codec_desc.c  |   5 +-
>>> libavcodec/defs.h|   4 +
>>> libavcodec/hwaccels.h|   1 +
>>> libavcodec/libdavs2.c|   2 +-
>>> libavcodec/profiles.c|   6 +
>>> libavcodec/profiles.h|   1 +
>>> libavcodec/vaapi_avs2.c  | 227 ++
>>> libavcodec/vaapi_decode.c|   5 +
>>> libavformat/matroska.c   |   1 +
>>> libavformat/mpeg.h   |   1 +
>>> 19 files changed, 2450 insertions(+), 27 deletions(-)
>>> create mode 100644 libavcodec/avs2dec.c
>>> create mode 100644 libavcodec/avs2dec.h
>>> create mode 100644 libavcodec/avs2dec_headers.c
>>> create mode 100644 libavcodec/vaapi_avs2.c
>>> 
>> 
>> Please split the patch properly. It's hard to review in a single chunk, and 
>> it can't be tested
>> without the hardware.
> 
> As a new player in the GPU market, we have always attached great importance
> to the participation of the open source community, and we are willing to
> feed back our new features to the field of video hardware acceleration.
> 
> As a pioneer, these new features may be supported only by our hardware at
> the current time. We are willing to provide some devices currently on the
> market, free of charge, to community-accredited contributors for testing
> the related functions.
I would accredit Zhili Zhao, but I cannot be sure Zhili Zhao has interest in
the GPU card, and I am not sure he has a computer that can take a PCI GPU card :D



Re: [FFmpeg-devel] [PATCH v4 1/2] avfilter: add audio overlay filter

2024-01-21 Thread Stefano Sabatini
On date Tuesday 2024-01-16 17:46:42 +0530, Harshit Karwal wrote:
> Co-authored-by: Paul B Mahol 
> Signed-off-by: Harshit Karwal 
> ---
>  doc/filters.texi  |  40 +++
>  libavfilter/Makefile  |   1 +
>  libavfilter/af_aoverlay.c | 538 ++
>  libavfilter/allfilters.c  |   1 +
>  4 files changed, 580 insertions(+)
>  create mode 100644 libavfilter/af_aoverlay.c
> 
> diff --git a/doc/filters.texi b/doc/filters.texi
> index 20c91bab3a..79eb600ae3 100644
> --- a/doc/filters.texi
> +++ b/doc/filters.texi
> @@ -2779,6 +2779,46 @@ This filter supports the same commands as options, 
> excluding option @code{order}
>  
>  Pass the audio source unchanged to the output.
>  
> +@section aoverlay
> +
> +Replace a specified section of an audio stream with another input audio 
> stream.
> +

> +In case no enable option for timeline editing is specified, the second audio 
> stream will

nit: @option{enable}

> +be output at sections of the first stream which have a gap in PTS 
> (Presentation TimeStamp) values
> +such that the output stream's PTS values are monotonic.
> +
> +This filter also supports linear cross fading when transitioning from one
> +input stream to another.
> +

> +The filter accepts the following option:

nit: options in case we add more

> +

> +@table @option
> +@item cf_duration
> +Set duration (in seconds) for cross fade between the inputs. Default value 
> is @code{100} milliseconds.
> +@end table
> +
> +@subsection Examples
> +
> +@itemize
> +@item
> +Replace the first stream with the second stream from @code{t=10} seconds to 
> @code{t=20} seconds:
> +@example
> +ffmpeg -i first.wav -i second.wav -filter_complex 
> "aoverlay=enable='between(t,10,20)'" output.wav
> +@end example
> +
> +@item
> +Do the same as above, but with crossfading for @code{2} seconds between the 
> streams:
> +@example
> +ffmpeg -i first.wav -i second.wav -filter_complex 
> "aoverlay=cf_duration=2:enable='between(t,10,20)'" output.wav
> +@end example
> +
> +@item
> +Introduce a PTS gap from @code{t=4} seconds to @code{t=8} seconds in the 
> first stream and output the second stream during this gap:
> +@example
> +ffmpeg -i first.wav -i second.wav -filter_complex 
> "[0]aselect='not(between(t,4,8))'[temp];[temp][1]aoverlay[out]" -map "[out]" 
> output.wav
> +@end example
> +@end itemize
> +
>  @section apad
>  
>  Pad the end of an audio stream with silence.
> diff --git a/libavfilter/Makefile b/libavfilter/Makefile
> index bba0219876..0f2b403441 100644
> --- a/libavfilter/Makefile
> +++ b/libavfilter/Makefile
> @@ -81,6 +81,7 @@ OBJS-$(CONFIG_ANLMDN_FILTER) += af_anlmdn.o
>  OBJS-$(CONFIG_ANLMF_FILTER)  += af_anlms.o
>  OBJS-$(CONFIG_ANLMS_FILTER)  += af_anlms.o
>  OBJS-$(CONFIG_ANULL_FILTER)  += af_anull.o
> +OBJS-$(CONFIG_AOVERLAY_FILTER)   += af_aoverlay.o
>  OBJS-$(CONFIG_APAD_FILTER)   += af_apad.o
>  OBJS-$(CONFIG_APERMS_FILTER) += f_perms.o
>  OBJS-$(CONFIG_APHASER_FILTER)+= af_aphaser.o 
> generate_wave_table.o
> diff --git a/libavfilter/af_aoverlay.c b/libavfilter/af_aoverlay.c
> new file mode 100644
> index 00..f7ac00dda1
> --- /dev/null
> +++ b/libavfilter/af_aoverlay.c
[...]
> +static int crossfade_prepare(AOverlayContext *s, AVFilterLink *main_inlink, 
> AVFilterLink *overlay_inlink, AVFilterLink *outlink,
> + int nb_samples, AVFrame **main_buffer, AVFrame 
> **overlay_buffer, int mode)
> +{
> +int ret;
> +
> +*main_buffer = ff_get_audio_buffer(outlink, nb_samples);
> +if (!(*main_buffer))
> +return AVERROR(ENOMEM);
> +
> +(*main_buffer)->pts = s->pts;
> +s->pts += av_rescale_q(nb_samples, (AVRational){ 1, outlink->sample_rate 
> }, outlink->time_base);
> +
> +if ((ret = av_audio_fifo_read(s->main_sample_buffers, (void 
> **)(*main_buffer)->extended_data, nb_samples)) < 0)
> +return ret;
> +

> +if (mode == 1) {
> +s->previous_samples = (*main_buffer)->nb_samples;
> +} else if (mode == -1 || (mode == 0 && s->is_disabled)) {

it would help to use an enum to describe the mode value

Also would help to introduce some debug log messages to aid
troubleshooting/debugging.

For instance, it would be very useful to show the exact time when the
overlay stream is inserted.
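A minimal sketch of the suggested enum; the names and the meanings attached to them are hypothetical, inferred only from the bare -1/0/1 values passed as `mode` in the patch:

```c
#include <assert.h>

/* Hypothetical replacement for the bare -1/0/1 mode argument of
 * crossfade_prepare(); the labels are illustrative, not from the patch. */
enum AOverlayMode {
    AOVERLAY_MODE_TO_MAIN    = -1,  /* transitioning back to the main input */
    AOVERLAY_MODE_TIMELINE   =  0,  /* timeline (enable-expression) driven  */
    AOVERLAY_MODE_TO_OVERLAY =  1   /* transitioning into the overlay input */
};
```

With named values, call sites such as `mode == -1 || (mode == 0 && s->is_disabled)` become self-documenting, and a debug `av_log()` can print the symbolic state alongside the insertion timestamp as suggested.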

[...]
> +static int activate(AVFilterContext *ctx)
> +{
> +AOverlayContext *s = ctx->priv;
> +int status, ret, nb_samples;
> +int64_t pts;
> +AVFrame *out = NULL, *main_buffer = NULL, *overlay_buffer = NULL;
> +
> +AVFilterLink *main_inlink = ctx->inputs[0];
> +AVFilterLink *overlay_inlink = ctx->inputs[1];
> +AVFilterLink *outlink = ctx->outputs[0];
> +
> +FF_FILTER_FORWARD_STATUS_BACK_ALL(outlink, ctx);
> +
> +if (s->default_mode && (s->pts_gap_end - s->pts_gap_start <= 0 || 
> s->overlay_eof)) {
> +s->default_mode = 0;
> +s->transition_pts

[FFmpeg-devel] [PATCH 1/9] avcodec/vaapi_encode: move pic->input_surface initialization to encode_alloc

2024-01-21 Thread tong1 . wu-at-intel . com
From: Tong Wu 

When allocating the VAAPIEncodePicture, pic->input_surface can be
initialized right in the place. This movement simplifies the send_frame
logic and is the preparation for moving vaapi_encode_send_frame to the base 
layer.

Signed-off-by: Tong Wu 
---
 libavcodec/vaapi_encode.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/libavcodec/vaapi_encode.c b/libavcodec/vaapi_encode.c
index 86f4110cd2..38d855fbd4 100644
--- a/libavcodec/vaapi_encode.c
+++ b/libavcodec/vaapi_encode.c
@@ -878,7 +878,8 @@ static int vaapi_encode_discard(AVCodecContext *avctx,
 return 0;
 }
 
-static VAAPIEncodePicture *vaapi_encode_alloc(AVCodecContext *avctx)
+static VAAPIEncodePicture *vaapi_encode_alloc(AVCodecContext *avctx,
+  const AVFrame *frame)
 {
 VAAPIEncodeContext *ctx = avctx->priv_data;
 VAAPIEncodePicture *pic;
@@ -895,7 +896,7 @@ static VAAPIEncodePicture 
*vaapi_encode_alloc(AVCodecContext *avctx)
 }
 }
 
-pic->input_surface = VA_INVALID_ID;
+pic->input_surface = (VASurfaceID)(uintptr_t)frame->data[3];
 pic->recon_surface = VA_INVALID_ID;
 pic->output_buffer = VA_INVALID_ID;
 
@@ -1332,7 +1333,7 @@ static int vaapi_encode_send_frame(AVCodecContext *avctx, 
AVFrame *frame)
 if (err < 0)
 return err;
 
-pic = vaapi_encode_alloc(avctx);
+pic = vaapi_encode_alloc(avctx, frame);
 if (!pic)
 return AVERROR(ENOMEM);
 
@@ -1345,7 +1346,6 @@ static int vaapi_encode_send_frame(AVCodecContext *avctx, 
AVFrame *frame)
 if (ctx->input_order == 0 || frame->pict_type == AV_PICTURE_TYPE_I)
 pic->force_idr = 1;
 
-pic->input_surface = (VASurfaceID)(uintptr_t)frame->data[3];
 pic->pts = frame->pts;
 pic->duration = frame->duration;
 
-- 
2.41.0.windows.1



[FFmpeg-devel] [PATCH 3/9] avcodec/vaapi_encode: extract set_output_property to base layer

2024-01-21 Thread tong1 . wu-at-intel . com
From: Tong Wu 

Signed-off-by: Tong Wu 
---
 libavcodec/hw_base_encode.c | 40 +
 libavcodec/hw_base_encode.h |  3 +++
 libavcodec/vaapi_encode.c   | 44 ++---
 3 files changed, 45 insertions(+), 42 deletions(-)

diff --git a/libavcodec/hw_base_encode.c b/libavcodec/hw_base_encode.c
index 62adda2fc3..f0e4ef9655 100644
--- a/libavcodec/hw_base_encode.c
+++ b/libavcodec/hw_base_encode.c
@@ -385,6 +385,46 @@ static int hw_base_encode_clear_old(AVCodecContext *avctx)
 return 0;
 }
 
+int ff_hw_base_encode_set_output_property(AVCodecContext *avctx,
+  HWBaseEncodePicture *pic,
+  AVPacket *pkt, int flag_no_delay)
+{
+HWBaseEncodeContext *ctx = avctx->priv_data;
+
+if (pic->type == PICTURE_TYPE_IDR)
+pkt->flags |= AV_PKT_FLAG_KEY;
+
+pkt->pts = pic->pts;
+pkt->duration = pic->duration;
+
+// for no-delay encoders this is handled in generic codec
+if (avctx->codec->capabilities & AV_CODEC_CAP_DELAY &&
+avctx->flags & AV_CODEC_FLAG_COPY_OPAQUE) {
+pkt->opaque  = pic->opaque;
+pkt->opaque_ref  = pic->opaque_ref;
+pic->opaque_ref = NULL;
+}
+
+if (flag_no_delay) {
+pkt->dts = pkt->pts;
+return 0;
+}
+
+if (ctx->output_delay == 0) {
+pkt->dts = pkt->pts;
+} else if (pic->encode_order < ctx->decode_delay) {
+if (ctx->ts_ring[pic->encode_order] < INT64_MIN + ctx->dts_pts_diff)
+pkt->dts = INT64_MIN;
+else
+pkt->dts = ctx->ts_ring[pic->encode_order] - ctx->dts_pts_diff;
+} else {
+pkt->dts = ctx->ts_ring[(pic->encode_order - ctx->decode_delay) %
+(3 * ctx->output_delay + ctx->async_depth)];
+}
+
+return 0;
+}
+
 static int hw_base_encode_check_frame(AVCodecContext *avctx,
   const AVFrame *frame)
 {
diff --git a/libavcodec/hw_base_encode.h b/libavcodec/hw_base_encode.h
index be4c6b034e..d215d6a32b 100644
--- a/libavcodec/hw_base_encode.h
+++ b/libavcodec/hw_base_encode.h
@@ -237,6 +237,9 @@ typedef struct HWBaseEncodeContext {
 AVPacket*tail_pkt;
 } HWBaseEncodeContext;
 
+int ff_hw_base_encode_set_output_property(AVCodecContext *avctx, 
HWBaseEncodePicture *pic,
+  AVPacket *pkt, int flag_no_delay);
+
 int ff_hw_base_encode_receive_packet(AVCodecContext *avctx, AVPacket *pkt);
 
 int ff_hw_base_encode_init(AVCodecContext *avctx);
diff --git a/libavcodec/vaapi_encode.c b/libavcodec/vaapi_encode.c
index e2f968c36d..2d839a1202 100644
--- a/libavcodec/vaapi_encode.c
+++ b/libavcodec/vaapi_encode.c
@@ -668,47 +668,6 @@ fail_at_end:
 return err;
 }
 
-static int vaapi_encode_set_output_property(AVCodecContext *avctx,
-HWBaseEncodePicture *pic,
-AVPacket *pkt)
-{
-HWBaseEncodeContext *base_ctx = avctx->priv_data;
-VAAPIEncodeContext *ctx = avctx->priv_data;
-
-if (pic->type == PICTURE_TYPE_IDR)
-pkt->flags |= AV_PKT_FLAG_KEY;
-
-pkt->pts = pic->pts;
-pkt->duration = pic->duration;
-
-// for no-delay encoders this is handled in generic codec
-if (avctx->codec->capabilities & AV_CODEC_CAP_DELAY &&
-avctx->flags & AV_CODEC_FLAG_COPY_OPAQUE) {
-pkt->opaque = pic->opaque;
-pkt->opaque_ref = pic->opaque_ref;
-pic->opaque_ref = NULL;
-}
-
-if (ctx->codec->flags & FLAG_TIMESTAMP_NO_DELAY) {
-pkt->dts = pkt->pts;
-return 0;
-}
-
-if (base_ctx->output_delay == 0) {
-pkt->dts = pkt->pts;
-} else if (pic->encode_order < base_ctx->decode_delay) {
-if (base_ctx->ts_ring[pic->encode_order] < INT64_MIN + 
base_ctx->dts_pts_diff)
-pkt->dts = INT64_MIN;
-else
-pkt->dts = base_ctx->ts_ring[pic->encode_order] - 
base_ctx->dts_pts_diff;
-} else {
-pkt->dts = base_ctx->ts_ring[(pic->encode_order - 
base_ctx->decode_delay) %
- (3 * base_ctx->output_delay + 
base_ctx->async_depth)];
-}
-
-return 0;
-}
-
 static int vaapi_encode_get_coded_buffer_size(AVCodecContext *avctx, 
VABufferID buf_id)
 {
 VAAPIEncodeContext *ctx = avctx->priv_data;
@@ -860,7 +819,8 @@ static int vaapi_encode_output(AVCodecContext *avctx,
 av_log(avctx, AV_LOG_DEBUG, "Output read for pic %"PRId64"/%"PRId64".\n",
base_pic->display_order, base_pic->encode_order);
 
-vaapi_encode_set_output_property(avctx, base_pic, pkt_ptr);
+ff_hw_base_encode_set_output_property(avctx, base_pic, pkt_ptr,
+  ctx->codec->flags & 
FLAG_TIMESTAMP_NO_DELAY);
 
 end:
 ff_refstruct_unref(&pic->output_buffer_ref);
-- 
2.41.0.windows.1


[FFmpeg-devel] [PATCH 4/9] avcodec/vaapi_encode: extract rc parameter configuration to base layer

2024-01-21 Thread tong1 . wu-at-intel . com
From: Tong Wu 

VAAPI and D3D12VA can share the rate control configuration code. Hence, it
can be moved to the base layer for simplification.

Signed-off-by: Tong Wu 
---
 libavcodec/hw_base_encode.c| 151 
 libavcodec/hw_base_encode.h|  34 ++
 libavcodec/vaapi_encode.c  | 210 ++---
 libavcodec/vaapi_encode.h  |  14 +--
 libavcodec/vaapi_encode_av1.c  |   2 +-
 libavcodec/vaapi_encode_h264.c |   2 +-
 libavcodec/vaapi_encode_vp9.c  |   2 +-
 7 files changed, 227 insertions(+), 188 deletions(-)

diff --git a/libavcodec/hw_base_encode.c b/libavcodec/hw_base_encode.c
index f0e4ef9655..c20c47bf55 100644
--- a/libavcodec/hw_base_encode.c
+++ b/libavcodec/hw_base_encode.c
@@ -631,6 +631,157 @@ end:
 return 0;
 }
 
+int ff_hw_base_rc_mode_configure(AVCodecContext *avctx, const 
HWBaseEncodeRCMode *rc_mode,
+ int default_quality, HWBaseEncodeRCConfigure 
*rc_conf)
+{
+HWBaseEncodeContext *ctx = avctx->priv_data;
+
+if (!rc_mode || !rc_conf)
+return -1;
+
+if (rc_mode->bitrate) {
+if (avctx->bit_rate <= 0) {
+av_log(avctx, AV_LOG_ERROR, "Bitrate must be set for %s "
+   "RC mode.\n", rc_mode->name);
+return AVERROR(EINVAL);
+}
+
+if (rc_mode->mode == RC_MODE_AVBR) {
+// For maximum confusion AVBR is hacked into the existing API
+// by overloading some of the fields with completely different
+// meanings.
+
+// Target percentage does not apply in AVBR mode.
+rc_conf->rc_bits_per_second = avctx->bit_rate;
+
+// Accuracy tolerance range for meeting the specified target
+// bitrate.  It's very unclear how this is actually intended
+// to work - since we do want to get the specified bitrate,
+// set the accuracy to 100% for now.
+rc_conf->rc_target_percentage = 100;
+
+// Convergence period in frames.  The GOP size reflects the
+// user's intended block size for cutting, so reusing that
+// as the convergence period seems a reasonable default.
+rc_conf->rc_window_size = avctx->gop_size > 0 ? avctx->gop_size : 
60;
+
+} else if (rc_mode->maxrate) {
+if (avctx->rc_max_rate > 0) {
+if (avctx->rc_max_rate < avctx->bit_rate) {
+av_log(avctx, AV_LOG_ERROR, "Invalid bitrate settings: "
+   "bitrate (%"PRId64") must not be greater than "
+   "maxrate (%"PRId64").\n", avctx->bit_rate,
+   avctx->rc_max_rate);
+return AVERROR(EINVAL);
+}
+rc_conf->rc_bits_per_second   = avctx->rc_max_rate;
+rc_conf->rc_target_percentage = (avctx->bit_rate * 100) /
+ avctx->rc_max_rate;
+} else {
+// We only have a target bitrate, but this mode requires
+// that a maximum rate be supplied as well.  Since the
+// user does not want this to be a constraint, arbitrarily
+// pick a maximum rate of double the target rate.
+rc_conf->rc_bits_per_second   = 2 * avctx->bit_rate;
+rc_conf->rc_target_percentage = 50;
+}
+} else {
+if (avctx->rc_max_rate > avctx->bit_rate) {
+av_log(avctx, AV_LOG_WARNING, "Max bitrate is ignored "
+   "in %s RC mode.\n", rc_mode->name);
+}
+rc_conf->rc_bits_per_second   = avctx->bit_rate;
+rc_conf->rc_target_percentage = 100;
+}
+} else {
+rc_conf->rc_bits_per_second   = 0;
+rc_conf->rc_target_percentage = 100;
+}
+
+if (rc_mode->quality) {
+if (ctx->explicit_qp) {
+rc_conf->rc_quality = ctx->explicit_qp;
+} else if (avctx->global_quality > 0) {
+rc_conf->rc_quality = avctx->global_quality;
+} else {
+rc_conf->rc_quality = default_quality;
+av_log(avctx, AV_LOG_WARNING, "No quality level set; "
+   "using default (%d).\n", rc_conf->rc_quality);
+}
+} else {
+rc_conf->rc_quality = 0;
+}
+
+if (rc_mode->hrd) {
+if (avctx->rc_buffer_size)
+rc_conf->hrd_buffer_size = avctx->rc_buffer_size;
+else if (avctx->rc_max_rate > 0)
+rc_conf->hrd_buffer_size = avctx->rc_max_rate;
+else
+rc_conf->hrd_buffer_size = avctx->bit_rate;
+if (avctx->rc_initial_buffer_occupancy) {
+if (avctx->rc_initial_buffer_occupancy > rc_conf->hrd_buffer_size) 
{
+av_log(avctx, AV_LOG_ERROR, "Invalid RC buffer settings: "
+   "must have initial buffer size (%d) <= "
+   "buffer size (%"

[FFmpeg-devel] [PATCH 6/9] avcodec/vaapi_encode: extract a get_recon_format function to base layer

2024-01-21 Thread tong1 . wu-at-intel . com
From: Tong Wu 

Getting the constraints and setting the recon frame format can be shared
with other HW encoders such as D3D12. Extract this part into a new
function in the base layer.

Signed-off-by: Tong Wu 
---
 libavcodec/hw_base_encode.c | 58 +
 libavcodec/hw_base_encode.h |  2 ++
 libavcodec/vaapi_encode.c   | 51 ++--
 3 files changed, 63 insertions(+), 48 deletions(-)

diff --git a/libavcodec/hw_base_encode.c b/libavcodec/hw_base_encode.c
index bb9fe70239..7497e0397e 100644
--- a/libavcodec/hw_base_encode.c
+++ b/libavcodec/hw_base_encode.c
@@ -836,6 +836,64 @@ int ff_hw_base_init_gop_structure(AVCodecContext *avctx, 
uint32_t ref_l0, uint32
 return 0;
 }
 
+int ff_hw_base_get_recon_format(AVCodecContext *avctx, const void *hwconfig, 
enum AVPixelFormat *fmt)
+{
+HWBaseEncodeContext *ctx = avctx->priv_data;
+AVHWFramesConstraints *constraints = NULL;
+enum AVPixelFormat recon_format;
+int err, i;
+
+constraints = av_hwdevice_get_hwframe_constraints(ctx->device_ref,
+  hwconfig);
+if (!constraints) {
+err = AVERROR(ENOMEM);
+goto fail;
+}
+
+// Probably we can use the input surface format as the surface format
+// of the reconstructed frames.  If not, we just pick the first (only?)
+// format in the valid list and hope that it all works.
+recon_format = AV_PIX_FMT_NONE;
+if (constraints->valid_sw_formats) {
+for (i = 0; constraints->valid_sw_formats[i] != AV_PIX_FMT_NONE; i++) {
+if (ctx->input_frames->sw_format ==
+constraints->valid_sw_formats[i]) {
+recon_format = ctx->input_frames->sw_format;
+break;
+}
+}
+if (recon_format == AV_PIX_FMT_NONE) {
+// No match.  Just use the first in the supported list and
+// hope for the best.
+recon_format = constraints->valid_sw_formats[0];
+}
+} else {
+// No idea what to use; copy input format.
+recon_format = ctx->input_frames->sw_format;
+}
+av_log(avctx, AV_LOG_DEBUG, "Using %s as format of "
+   "reconstructed frames.\n", av_get_pix_fmt_name(recon_format));
+
+if (ctx->surface_width  < constraints->min_width  ||
+ctx->surface_height < constraints->min_height ||
+ctx->surface_width  > constraints->max_width ||
+ctx->surface_height > constraints->max_height) {
+av_log(avctx, AV_LOG_ERROR, "Hardware does not support encoding at "
+   "size %dx%d (constraints: width %d-%d height %d-%d).\n",
+   ctx->surface_width, ctx->surface_height,
+   constraints->min_width,  constraints->max_width,
+   constraints->min_height, constraints->max_height);
+err = AVERROR(EINVAL);
+goto fail;
+}
+
+*fmt = recon_format;
+err = 0;
+fail:
+av_hwframe_constraints_free(&constraints);
+return err;
+}
+
 int ff_hw_base_encode_init(AVCodecContext *avctx)
 {
 HWBaseEncodeContext *ctx = avctx->priv_data;
diff --git a/libavcodec/hw_base_encode.h b/libavcodec/hw_base_encode.h
index d6d2fc03c5..3d026ee23e 100644
--- a/libavcodec/hw_base_encode.h
+++ b/libavcodec/hw_base_encode.h
@@ -279,6 +279,8 @@ int ff_hw_base_rc_mode_configure(AVCodecContext *avctx, 
const HWBaseEncodeRCMode
 int ff_hw_base_init_gop_structure(AVCodecContext *avctx, uint32_t ref_l0, 
uint32_t ref_l1,
   int flags, int prediction_pre_only);
 
+int ff_hw_base_get_recon_format(AVCodecContext *avctx, const void *hwconfig, 
enum AVPixelFormat *fmt);
+
 int ff_hw_base_encode_init(AVCodecContext *avctx);
 
 int ff_hw_base_encode_close(AVCodecContext *avctx);
diff --git a/libavcodec/vaapi_encode.c b/libavcodec/vaapi_encode.c
index 0bce3ce105..84a81559e1 100644
--- a/libavcodec/vaapi_encode.c
+++ b/libavcodec/vaapi_encode.c
@@ -1898,9 +1898,8 @@ static av_cold int 
vaapi_encode_create_recon_frames(AVCodecContext *avctx)
 HWBaseEncodeContext *base_ctx = avctx->priv_data;
 VAAPIEncodeContext   *ctx = avctx->priv_data;
 AVVAAPIHWConfig *hwconfig = NULL;
-AVHWFramesConstraints *constraints = NULL;
 enum AVPixelFormat recon_format;
-int err, i;
+int err;
 
 hwconfig = av_hwdevice_hwconfig_alloc(base_ctx->device_ref);
 if (!hwconfig) {
@@ -1909,52 +1908,9 @@ static av_cold int 
vaapi_encode_create_recon_frames(AVCodecContext *avctx)
 }
 hwconfig->config_id = ctx->va_config;
 
-constraints = av_hwdevice_get_hwframe_constraints(base_ctx->device_ref,
-  hwconfig);
-if (!constraints) {
-err = AVERROR(ENOMEM);
-goto fail;
-}
-
-// Probably we can use the input surface format as the surface format
-// of the reconstructed frames.  If not, we just pick the first (only?)
-// format in the valid list and hope that it all works.

[FFmpeg-devel] [PATCH 7/9] avutil/hwcontext_d3d12va: add Flags for resource creation

2024-01-21 Thread tong1 . wu-at-intel . com
From: Tong Wu 

A Flags field is added to support different resource creation options.

Signed-off-by: Tong Wu 
---
 doc/APIchanges| 3 +++
 libavutil/hwcontext_d3d12va.c | 2 +-
 libavutil/hwcontext_d3d12va.h | 5 +
 libavutil/version.h   | 2 +-
 4 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/doc/APIchanges b/doc/APIchanges
index e477ed78e0..a33e54dd3b 100644
--- a/doc/APIchanges
+++ b/doc/APIchanges
@@ -2,6 +2,9 @@ The last version increases of all libraries were on 2023-02-09
 
 API changes, most recent first:
 
+2024-01-xx - xx - lavu 58.37.100 - hwcontext_d3d12va.h
+ Add AVD3D12VAFramesContext.Flags
+
 2023-11-xx - xx - lavfi 9.16.100 - buffersink.h buffersrc.h
   Add av_buffersink_get_colorspace and av_buffersink_get_color_range.
   Add AVBufferSrcParameters.color_space and AVBufferSrcParameters.color_range.
diff --git a/libavutil/hwcontext_d3d12va.c b/libavutil/hwcontext_d3d12va.c
index 414dd44290..0d94f48543 100644
--- a/libavutil/hwcontext_d3d12va.c
+++ b/libavutil/hwcontext_d3d12va.c
@@ -237,7 +237,7 @@ static AVBufferRef *d3d12va_pool_alloc(void *opaque, size_t 
size)
 .Format   = hwctx->format,
 .SampleDesc   = {.Count = 1, .Quality = 0 },
 .Layout   = D3D12_TEXTURE_LAYOUT_UNKNOWN,
-.Flags= D3D12_RESOURCE_FLAG_NONE,
+.Flags= hwctx->Flags,
 };
 
 frame = av_mallocz(sizeof(AVD3D12VAFrame));
diff --git a/libavutil/hwcontext_d3d12va.h b/libavutil/hwcontext_d3d12va.h
index ff06e6f2ef..dc1c17d3f9 100644
--- a/libavutil/hwcontext_d3d12va.h
+++ b/libavutil/hwcontext_d3d12va.h
@@ -129,6 +129,11 @@ typedef struct AVD3D12VAFramesContext {
  * If unset, will be automatically set.
  */
 DXGI_FORMAT format;
+
+/**
+ * This field is used for resource creation.
+ */
+D3D12_RESOURCE_FLAGS Flags;
 } AVD3D12VAFramesContext;
 
 #endif /* AVUTIL_HWCONTEXT_D3D12VA_H */
diff --git a/libavutil/version.h b/libavutil/version.h
index 772c4e209c..3ad1a9446c 100644
--- a/libavutil/version.h
+++ b/libavutil/version.h
@@ -79,7 +79,7 @@
  */
 
 #define LIBAVUTIL_VERSION_MAJOR  58
-#define LIBAVUTIL_VERSION_MINOR  36
+#define LIBAVUTIL_VERSION_MINOR  37
 #define LIBAVUTIL_VERSION_MICRO 101
 
 #define LIBAVUTIL_VERSION_INT   AV_VERSION_INT(LIBAVUTIL_VERSION_MAJOR, \
-- 
2.41.0.windows.1

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


[FFmpeg-devel] [PATCH 5/9] avcodec/vaapi_encode: extract gop configuration to base layer

2024-01-21 Thread tong1 . wu-at-intel . com
From: Tong Wu 

Signed-off-by: Tong Wu 
---
 libavcodec/hw_base_encode.c | 54 +
 libavcodec/hw_base_encode.h |  3 +++
 libavcodec/vaapi_encode.c   | 52 +++
 3 files changed, 61 insertions(+), 48 deletions(-)

diff --git a/libavcodec/hw_base_encode.c b/libavcodec/hw_base_encode.c
index c20c47bf55..bb9fe70239 100644
--- a/libavcodec/hw_base_encode.c
+++ b/libavcodec/hw_base_encode.c
@@ -782,6 +782,60 @@ int ff_hw_base_rc_mode_configure(AVCodecContext *avctx, 
const HWBaseEncodeRCMode
 return 0;
 }
 
+int ff_hw_base_init_gop_structure(AVCodecContext *avctx, uint32_t ref_l0, 
uint32_t ref_l1,
+  int flags, int prediction_pre_only)
+{
+HWBaseEncodeContext *ctx = avctx->priv_data;
+
+if (flags & FLAG_INTRA_ONLY || avctx->gop_size <= 1) {
+av_log(avctx, AV_LOG_VERBOSE, "Using intra frames only.\n");
+ctx->gop_size = 1;
+} else if (ref_l0 < 1) {
+av_log(avctx, AV_LOG_ERROR, "Driver does not support any "
+   "reference frames.\n");
+return AVERROR(EINVAL);
+} else if (!(flags & FLAG_B_PICTURES) || ref_l1 < 1 ||
+   avctx->max_b_frames < 1 || prediction_pre_only) {
+if (ctx->p_to_gpb)
+   av_log(avctx, AV_LOG_VERBOSE, "Using intra and B-frames "
+  "(supported references: %d / %d).\n",
+  ref_l0, ref_l1);
+else
+av_log(avctx, AV_LOG_VERBOSE, "Using intra and P-frames "
+   "(supported references: %d / %d).\n", ref_l0, ref_l1);
+ctx->gop_size = avctx->gop_size;
+ctx->p_per_i  = INT_MAX;
+ctx->b_per_p  = 0;
+} else {
+   if (ctx->p_to_gpb)
+   av_log(avctx, AV_LOG_VERBOSE, "Using intra and B-frames "
+  "(supported references: %d / %d).\n",
+  ref_l0, ref_l1);
+   else
+   av_log(avctx, AV_LOG_VERBOSE, "Using intra, P- and B-frames "
+  "(supported references: %d / %d).\n", ref_l0, ref_l1);
+ctx->gop_size = avctx->gop_size;
+ctx->p_per_i  = INT_MAX;
+ctx->b_per_p  = avctx->max_b_frames;
+if (flags & FLAG_B_PICTURE_REFERENCES) {
+ctx->max_b_depth = FFMIN(ctx->desired_b_depth,
+ av_log2(ctx->b_per_p) + 1);
+} else {
+ctx->max_b_depth = 1;
+}
+}
+
+if (flags & FLAG_NON_IDR_KEY_PICTURES) {
+ctx->closed_gop  = !!(avctx->flags & AV_CODEC_FLAG_CLOSED_GOP);
+ctx->gop_per_idr = ctx->idr_interval + 1;
+} else {
+ctx->closed_gop  = 1;
+ctx->gop_per_idr = 1;
+}
+
+return 0;
+}
+
 int ff_hw_base_encode_init(AVCodecContext *avctx)
 {
 HWBaseEncodeContext *ctx = avctx->priv_data;
diff --git a/libavcodec/hw_base_encode.h b/libavcodec/hw_base_encode.h
index 57cfa12e73..d6d2fc03c5 100644
--- a/libavcodec/hw_base_encode.h
+++ b/libavcodec/hw_base_encode.h
@@ -276,6 +276,9 @@ int ff_hw_base_encode_receive_packet(AVCodecContext *avctx, 
AVPacket *pkt);
 int ff_hw_base_rc_mode_configure(AVCodecContext *avctx, const 
HWBaseEncodeRCMode *rc_mode,
  int default_quality, HWBaseEncodeRCConfigure 
*rc_conf);
 
+int ff_hw_base_init_gop_structure(AVCodecContext *avctx, uint32_t ref_l0, 
uint32_t ref_l1,
+  int flags, int prediction_pre_only);
+
 int ff_hw_base_encode_init(AVCodecContext *avctx);
 
 int ff_hw_base_encode_close(AVCodecContext *avctx);
diff --git a/libavcodec/vaapi_encode.c b/libavcodec/vaapi_encode.c
index 30e5deac08..0bce3ce105 100644
--- a/libavcodec/vaapi_encode.c
+++ b/libavcodec/vaapi_encode.c
@@ -1443,7 +1443,7 @@ static av_cold int 
vaapi_encode_init_gop_structure(AVCodecContext *avctx)
 VAStatus vas;
 VAConfigAttrib attr = { VAConfigAttribEncMaxRefFrames };
 uint32_t ref_l0, ref_l1;
-int prediction_pre_only;
+int prediction_pre_only, err;
 
 vas = vaGetConfigAttributes(ctx->hwctx->display,
 ctx->va_profile,
@@ -1507,53 +1507,9 @@ static av_cold int 
vaapi_encode_init_gop_structure(AVCodecContext *avctx)
 }
 #endif
 
-if (ctx->codec->flags & FLAG_INTRA_ONLY ||
-avctx->gop_size <= 1) {
-av_log(avctx, AV_LOG_VERBOSE, "Using intra frames only.\n");
-base_ctx->gop_size = 1;
-} else if (ref_l0 < 1) {
-av_log(avctx, AV_LOG_ERROR, "Driver does not support any "
-   "reference frames.\n");
-return AVERROR(EINVAL);
-} else if (!(ctx->codec->flags & FLAG_B_PICTURES) ||
-   ref_l1 < 1 || avctx->max_b_frames < 1 ||
-   prediction_pre_only) {
-if (base_ctx->p_to_gpb)
-   av_log(avctx, AV_LOG_VERBOSE, "Using intra and B-frames "
-  "(supported references: %d / %d).\n",
-  ref_l0, ref_l1);
-else
-av_log(avctx, AV_LOG_VERBOSE, "Using intra and P-frames

[FFmpeg-devel] [PATCH 9/9] Changelog: add D3D12VA HEVC encoder changelog

2024-01-21 Thread tong1 . wu-at-intel . com
From: Tong Wu 

Signed-off-by: Tong Wu 
---
 Changelog | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Changelog b/Changelog
index c40b6d08fd..b3b5c16e0a 100644
--- a/Changelog
+++ b/Changelog
@@ -22,6 +22,7 @@ version :
 - ffmpeg CLI -bsf option may now be used for input as well as output
 - ffmpeg CLI options may now be used as -/opt , which is equivalent
   to -opt >
+- D3D12VA HEVC encoder
 
 version 6.1:
 - libaribcaption decoder
-- 
2.41.0.windows.1



[FFmpeg-devel] [PATCH 8/9] avcodec: add D3D12VA hardware HEVC encoder

2024-01-21 Thread tong1 . wu-at-intel . com
From: Tong Wu 

This implementation is based on D3D12 Video Encoding Spec:
https://microsoft.github.io/DirectX-Specs/d3d/D3D12VideoEncoding.html

Sample command line for transcoding:
ffmpeg.exe -hwaccel d3d12va -hwaccel_output_format d3d12 -i input.mp4
-c:v hevc_d3d12va output.mp4

Signed-off-by: Tong Wu 
---
 configure|6 +
 libavcodec/Makefile  |4 +-
 libavcodec/allcodecs.c   |1 +
 libavcodec/d3d12va_encode.c  | 1441 ++
 libavcodec/d3d12va_encode.h  |  200 +
 libavcodec/d3d12va_encode_hevc.c | 1016 +
 libavcodec/hw_base_encode.h  |2 +-
 7 files changed, 2668 insertions(+), 2 deletions(-)
 create mode 100644 libavcodec/d3d12va_encode.c
 create mode 100644 libavcodec/d3d12va_encode.h
 create mode 100644 libavcodec/d3d12va_encode_hevc.c

diff --git a/configure b/configure
index c8ae0a061d..3a186a3454 100755
--- a/configure
+++ b/configure
@@ -2561,6 +2561,7 @@ CONFIG_EXTRA="
 tpeldsp
 vaapi_1
 vaapi_encode
+d3d12va_encode
 vc1dsp
 videodsp
 vp3dsp
@@ -3204,6 +3205,7 @@ wmv3_vaapi_hwaccel_select="vc1_vaapi_hwaccel"
 wmv3_vdpau_hwaccel_select="vc1_vdpau_hwaccel"
 
 # hardware-accelerated codecs
+d3d12va_encode_deps="d3d12va ID3D12VideoEncoder d3d12_encoder_feature"
 mediafoundation_deps="mftransform_h MFCreateAlignedMemoryBuffer"
 omx_deps="libdl pthreads"
 omx_rpi_select="omx"
@@ -3271,6 +3273,7 @@ h264_v4l2m2m_encoder_deps="v4l2_m2m h264_v4l2_m2m"
 hevc_amf_encoder_deps="amf"
 hevc_cuvid_decoder_deps="cuvid"
 hevc_cuvid_decoder_select="hevc_mp4toannexb_bsf"
+hevc_d3d12va_encoder_select="atsc_a53 cbs_h265 d3d12va_encode"
 hevc_mediacodec_decoder_deps="mediacodec"
 hevc_mediacodec_decoder_select="hevc_mp4toannexb_bsf hevc_parser"
 hevc_mediacodec_encoder_deps="mediacodec"
@@ -6612,6 +6615,9 @@ check_type "windows.h d3d11.h" "ID3D11VideoDecoder"
 check_type "windows.h d3d11.h" "ID3D11VideoContext"
 check_type "windows.h d3d12.h" "ID3D12Device"
 check_type "windows.h d3d12video.h" "ID3D12VideoDecoder"
+check_type "windows.h d3d12video.h" "ID3D12VideoEncoder"
+test_code cc "windows.h d3d12video.h" "D3D12_FEATURE_VIDEO feature = 
D3D12_FEATURE_VIDEO_ENCODER_CODEC" && \
+test_code cc "windows.h d3d12video.h" 
"D3D12_FEATURE_DATA_VIDEO_ENCODER_RESOURCE_REQUIREMENTS req" && enable 
d3d12_encoder_feature
 check_type "windows.h" "DPI_AWARENESS_CONTEXT" -D_WIN32_WINNT=0x0A00
 check_type "d3d9.h dxva2api.h" DXVA2_ConfigPictureDecode -D_WIN32_WINNT=0x0602
 check_func_headers mfapi.h MFCreateAlignedMemoryBuffer -lmfplat
diff --git a/libavcodec/Makefile b/libavcodec/Makefile
index f9a5c9d616..d7c24a1867 100644
--- a/libavcodec/Makefile
+++ b/libavcodec/Makefile
@@ -85,6 +85,7 @@ OBJS-$(CONFIG_CBS_MPEG2)   += cbs_mpeg2.o
 OBJS-$(CONFIG_CBS_VP8) += cbs_vp8.o vp8data.o
 OBJS-$(CONFIG_CBS_VP9) += cbs_vp9.o
 OBJS-$(CONFIG_CRYSTALHD)   += crystalhd.o
+OBJS-$(CONFIG_D3D12VA_ENCODE)  += d3d12va_encode.o hw_base_encode.o
 OBJS-$(CONFIG_DEFLATE_WRAPPER) += zlib_wrapper.o
 OBJS-$(CONFIG_DOVI_RPU)+= dovi_rpu.o
 OBJS-$(CONFIG_ERROR_RESILIENCE)+= error_resilience.o
@@ -435,6 +436,7 @@ OBJS-$(CONFIG_HEVC_DECODER)+= hevcdec.o 
hevc_mvs.o \
   h274.o
 OBJS-$(CONFIG_HEVC_AMF_ENCODER)+= amfenc_hevc.o
 OBJS-$(CONFIG_HEVC_CUVID_DECODER)  += cuviddec.o
+OBJS-$(CONFIG_HEVC_D3D12VA_ENCODER)+= d3d12va_encode_hevc.o
 OBJS-$(CONFIG_HEVC_MEDIACODEC_DECODER) += mediacodecdec.o
 OBJS-$(CONFIG_HEVC_MEDIACODEC_ENCODER) += mediacodecenc.o
 OBJS-$(CONFIG_HEVC_MF_ENCODER) += mfenc.o mf_utils.o
@@ -1304,7 +1306,7 @@ SKIPHEADERS+= %_tablegen.h
  \
 
 SKIPHEADERS-$(CONFIG_AMF)  += amfenc.h
 SKIPHEADERS-$(CONFIG_D3D11VA)  += d3d11va.h dxva2_internal.h
-SKIPHEADERS-$(CONFIG_D3D12VA)  += d3d12va_decode.h
+SKIPHEADERS-$(CONFIG_D3D12VA)  += d3d12va_decode.h d3d12va_encode.h
 SKIPHEADERS-$(CONFIG_DXVA2)+= dxva2.h dxva2_internal.h
 SKIPHEADERS-$(CONFIG_JNI)  += ffjni.h
 SKIPHEADERS-$(CONFIG_LCMS2)+= fflcms2.h
diff --git a/libavcodec/allcodecs.c b/libavcodec/allcodecs.c
index 93ce8e3224..b9df2b4752 100644
--- a/libavcodec/allcodecs.c
+++ b/libavcodec/allcodecs.c
@@ -864,6 +864,7 @@ extern const FFCodec ff_h264_vaapi_encoder;
 extern const FFCodec ff_h264_videotoolbox_encoder;
 extern const FFCodec ff_hevc_amf_encoder;
 extern const FFCodec ff_hevc_cuvid_decoder;
+extern const FFCodec ff_hevc_d3d12va_encoder;
 extern const FFCodec ff_hevc_mediacodec_decoder;
 extern const FFCodec ff_hevc_mediacodec_encoder;
 extern const FFCodec ff_hevc_mf_encoder;
diff --git a/libavcodec/d3d12va_encode.c b/libavcodec/d3d12va_encode.c
new file mode 100644
index 00..2dbf41d4b1
--- /dev/null
+++ b/libavcodec/d3d12va_encode.c
@@ -0,0 +1,1441 @@
+/*
+

[FFmpeg-devel] [PATCH] libavfi/dnn: add LibTorch as one of DNN backend

2024-01-21 Thread wenbin . chen-at-intel . com
From: Wenbin Chen 

PyTorch is an open source machine learning framework that accelerates
the path from research prototyping to production deployment. Official
website: https://pytorch.org/. Below, we refer to the C++ library of
PyTorch as LibTorch.

To build FFmpeg with LibTorch, please take the following steps as a reference:
1. download the LibTorch C++ library from https://pytorch.org/get-started/locally/,
select C++/Java as the language, and other options as needed.
2. unzip the file to your own dir, with command
unzip libtorch-shared-with-deps-latest.zip -d your_dir
3. export libtorch_root/libtorch/include and
libtorch_root/libtorch/include/torch/csrc/api/include to $PATH
export libtorch_root/libtorch/lib/ to $LD_LIBRARY_PATH
4. configure FFmpeg with ../configure --enable-libtorch 
--extra-cflag=-I/libtorch_root/libtorch/include 
--extra-cflag=-I/libtorch_root/libtorch/include/torch/csrc/api/include 
--extra-ldflags=-L/libtorch_root/libtorch/lib/
5. make

To run FFmpeg DNN inference with LibTorch backend:
./ffmpeg -i input.jpg -vf 
dnn_processing=dnn_backend=torch:model=LibTorch_model.pt -y output.jpg
The LibTorch_model.pt can be generated in Python with the torch.jit.script() API.
Please note that torch.jit.trace() is not recommended, since it does not support
inputs of varying size.
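For reference, a TorchScript model file like the one above could be exported as follows; this is a minimal sketch (the module and file name are illustrative only, and the block is a no-op when PyTorch is not installed):

```python
# Illustrative sketch: export a TorchScript model usable as the model=
# argument of dnn_processing. Module and file name are hypothetical.
import importlib.util

if importlib.util.find_spec("torch") is not None:
    import torch

    class Identity(torch.nn.Module):
        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return x

    # torch.jit.script() (rather than torch.jit.trace()) keeps the model
    # valid for inputs of varying size.
    scripted = torch.jit.script(Identity())
    scripted.save("LibTorch_model.pt")
```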

Signed-off-by: Ting Fu 
Signed-off-by: Wenbin Chen 
---
 configure |   5 +-
 libavfilter/dnn/Makefile  |   1 +
 libavfilter/dnn/dnn_backend_torch.cpp | 585 ++
 libavfilter/dnn/dnn_interface.c   |   5 +
 libavfilter/dnn_filter_common.c   |  31 +-
 libavfilter/dnn_interface.h   |   2 +-
 libavfilter/vf_dnn_processing.c   |   3 +
 7 files changed, 621 insertions(+), 11 deletions(-)
 create mode 100644 libavfilter/dnn/dnn_backend_torch.cpp

diff --git a/configure b/configure
index c8ae0a061d..75061692b1 100755
--- a/configure
+++ b/configure
@@ -279,6 +279,7 @@ External library support:
   --enable-libtheora   enable Theora encoding via libtheora [no]
   --enable-libtls  enable LibreSSL (via libtls), needed for https 
support
if openssl, gnutls or mbedtls is not used [no]
+  --enable-libtorchenable Torch as one DNN backend [no]
   --enable-libtwolame  enable MP2 encoding via libtwolame [no]
   --enable-libuavs3d   enable AVS3 decoding via libuavs3d [no]
   --enable-libv4l2 enable libv4l2/v4l-utils [no]
@@ -1901,6 +1902,7 @@ EXTERNAL_LIBRARY_LIST="
 libtensorflow
 libtesseract
 libtheora
+libtorch
 libtwolame
 libuavs3d
 libv4l2
@@ -2776,7 +2778,7 @@ cbs_vp9_select="cbs"
 deflate_wrapper_deps="zlib"
 dirac_parse_select="golomb"
 dovi_rpu_select="golomb"
-dnn_suggest="libtensorflow libopenvino"
+dnn_suggest="libtensorflow libopenvino libtorch"
 dnn_deps="avformat swscale"
 error_resilience_select="me_cmp"
 evcparse_select="golomb"
@@ -6872,6 +6874,7 @@ enabled libtensorflow && require libtensorflow 
tensorflow/c/c_api.h TF_Versi
 enabled libtesseract  && require_pkg_config libtesseract tesseract 
tesseract/capi.h TessBaseAPICreate
 enabled libtheora && require libtheora theora/theoraenc.h th_info_init 
-ltheoraenc -ltheoradec -logg
 enabled libtls&& require_pkg_config libtls libtls tls.h 
tls_configure
+enabled libtorch  && check_cxxflags -std=c++14 && require_cpp libtorch 
torch/torch.h "torch::Tensor" -ltorch -lc10 -ltorch_cpu -lstdc++ -lpthread
 enabled libtwolame&& require libtwolame twolame.h twolame_init 
-ltwolame &&
  { check_lib libtwolame twolame.h 
twolame_encode_buffer_float32_interleaved -ltwolame ||
die "ERROR: libtwolame must be installed and 
version must be >= 0.3.10"; }
diff --git a/libavfilter/dnn/Makefile b/libavfilter/dnn/Makefile
index 5d5697ea42..3d09927c98 100644
--- a/libavfilter/dnn/Makefile
+++ b/libavfilter/dnn/Makefile
@@ -6,5 +6,6 @@ OBJS-$(CONFIG_DNN)   += 
dnn/dnn_backend_common.o
 
 DNN-OBJS-$(CONFIG_LIBTENSORFLOW) += dnn/dnn_backend_tf.o
 DNN-OBJS-$(CONFIG_LIBOPENVINO)   += dnn/dnn_backend_openvino.o
+DNN-OBJS-$(CONFIG_LIBTORCH)  += dnn/dnn_backend_torch.o
 
 OBJS-$(CONFIG_DNN)   += $(DNN-OBJS-yes)
diff --git a/libavfilter/dnn/dnn_backend_torch.cpp 
b/libavfilter/dnn/dnn_backend_torch.cpp
new file mode 100644
index 00..4fc76d0ce4
--- /dev/null
+++ b/libavfilter/dnn/dnn_backend_torch.cpp
@@ -0,0 +1,585 @@
+/*
+ * Copyright (c) 2024
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even th

[FFmpeg-devel] [PATCH] configure: autodetect libglslang ldflags

2024-01-21 Thread Matthew White via ffmpeg-devel
Since glslang 14.0.0, OGLCompiler and HLSL stub libraries have been
fully removed from the build.

This fixes configuration by detecting whether the stub libraries are
still present (glslang releases before version 14.0.0).

ffbuild/config.log:
/usr/lib/gcc/x86_64-pc-linux-gnu/13/../../../../x86_64-pc-linux-gnu/bin/ld: 
cannot find -lOSDependent: No such file or directory
/usr/lib/gcc/x86_64-pc-linux-gnu/13/../../../../x86_64-pc-linux-gnu/bin/ld: 
cannot find -lHLSL: No such file or directory
/usr/lib/gcc/x86_64-pc-linux-gnu/13/../../../../x86_64-pc-linux-gnu/bin/ld: 
cannot find -lOGLCompiler: No such file or directory

Addresses https://trac.ffmpeg.org/ticket/10713
See https://bugs.gentoo.org/show_bug.cgi?id=918989
Should fix https://ffmpeg.org/pipermail/ffmpeg-devel/2023-August/313666.html
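The version check in the patch boils down to POSIX parameter expansion on the `glslang -dumpversion` output; a standalone sketch of the same parsing, with a hard-coded version string standing in for the real tool output:

```shell
#!/bin/sh
# Sketch of the major-version parsing used in the patch; "13.1.1" stands
# in for the real "$(glslang -dumpversion)" output.
glslang_version="13.1.1"
glslang_major="${glslang_version%%.*}"   # strip everything after the first dot
glslang_major="${glslang_major#*:}"      # strip an optional "prefix:" part
if [ "${glslang_major}" -le 13 ]; then
    echo "need legacy stubs: -lOSDependent -lHLSL -lOGLCompiler"
else
    echo "stub libraries removed since glslang 14"
fi
```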

Signed-off-by: Matthew White 
---
 configure | 23 +--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/configure b/configure
index c8ae0a061d..abff488dc0 100755
--- a/configure
+++ b/configure
@@ -2626,6 +2626,7 @@ CMDLINE_SET="
 ignore_tests
 install
 ld
+libglslang_ldflags
 ln_s
 logfile
 malloc_prefix
@@ -6652,6 +6653,24 @@ if enabled_all libglslang libshaderc; then
 die "ERROR: libshaderc and libglslang are mutually exclusive, if in doubt, 
disable libglslang"
 fi
 
+if enabled libglslang; then
+if [ -x "$(command -v glslang)" ]; then
+# https://github.com/KhronosGroup/glslang
+# commit 6be56e45e574b375d759b89dad35f780bbd4792f: Remove 
`OGLCompiler` and `HLSL` stub libraries from build
+# StandAlone/StandAlone.cpp: 
"SpirvGeneratorVersion:GLSLANG_VERSION_MAJOR.GLSLANG_VERSION_MINOR.GLSLANG_VERSION_PATCH
 GLSLANG_VERSION_FLAVOR"
+glslang_version="$(glslang -dumpversion)"
+glslang_major="${glslang_version%%.*}"
+glslang_major="${glslang_major#*:}"
+if test ${glslang_major} -le 13; then
+libglslang_ldflags=" -lOSDependent -lHLSL -lOGLCompiler"
+elif ! [[ ${glslang_major} =~ ^[0-9]+$ ]]; then
+die "ERROR: glslang's computed major version isn't a number: 
'${glslang_major}'"
+fi
+else
+die "ERROR: glslang binary not found, impossible to determine 
installed glslang's version"
+fi
+fi
+
 check_cpp_condition winrt windows.h 
"!WINAPI_FAMILY_PARTITION(WINAPI_PARTITION_DESKTOP)"
 
 if ! disabled w32threads && ! enabled pthreads; then
@@ -6771,10 +6790,10 @@ enabled libfreetype   && require_pkg_config 
libfreetype freetype2 "ft2build.
 enabled libfribidi&& require_pkg_config libfribidi fribidi fribidi.h 
fribidi_version_info
 enabled libharfbuzz   && require_pkg_config libharfbuzz harfbuzz hb.h 
hb_buffer_create
 enabled libglslang && { check_lib spirv_compiler 
glslang/Include/glslang_c_interface.h glslang_initialize_process \
--lglslang -lMachineIndependent -lOSDependent 
-lHLSL -lOGLCompiler -lGenericCodeGen \
+-lglslang -lMachineIndependent 
"${libglslang_ldflags}" -lGenericCodeGen \
 -lSPVRemapper -lSPIRV -lSPIRV-Tools-opt 
-lSPIRV-Tools -lpthread -lstdc++ -lm ||
 require spirv_compiler 
glslang/Include/glslang_c_interface.h glslang_initialize_process \
--lglslang -lOSDependent -lHLSL -lOGLCompiler \
+-lglslang "${libglslang_ldflags}" \
 -lSPVRemapper -lSPIRV -lSPIRV-Tools-opt 
-lSPIRV-Tools -lpthread -lstdc++ -lm; }
 enabled libgme&& { check_pkg_config libgme libgme gme/gme.h 
gme_new_emu ||
require libgme gme/gme.h gme_new_emu -lgme 
-lstdc++; }
-- 
2.43.0


