Re: [FFmpeg-devel] [PATCH 1/2] lavf/dvenc: improve error messaging
On date Sunday 2024-01-21 07:36:48 +0100, Anton Khirnov wrote:
> Quoting Stefano Sabatini (2024-01-20 16:24:07)
> >     if ((c->sys->time_base.den != 25 && c->sys->time_base.den != 50) ||
> >         c->sys->time_base.num != 1) {
> > -        if (c->ast[0] && c->ast[0]->codecpar->sample_rate != 48000)
> > -            goto bail_out;
> > -        if (c->ast[1] && c->ast[1]->codecpar->sample_rate != 48000)
> > -            goto bail_out;
> > +        int j;
> 
> No need to declare a loop variable outside of the loop. Also, there's
> already i.

fixed locally

> > -    if (((c->n_ast > 1) && (c->sys->n_difchan < 2)) ||
> > -        ((c->n_ast > 2) && (c->sys->n_difchan < 4))) {
> > -        /* only 2 stereo pairs allowed in 50Mbps mode */
> > -        goto bail_out;
> > +    if ((c->n_ast > 1) && (c->sys->n_difchan < 2)) {
> > +        av_log(s, AV_LOG_ERROR,
> > +               "Invalid number of channels %d, only 1 stereo pair is "
> > +               "allowed in 25Mbps mode.\n", c->n_ast);
> > +        return AVERROR_INVALIDDATA;
> > +    }
> > +    if ((c->n_ast > 2) && (c->sys->n_difchan < 4)) {
> > +        av_log(s, AV_LOG_ERROR,
> > +               "Invalid number of channels %d, only 2 stereo pairs are "
> > +               "allowed in 50Mbps mode.\n", c->n_ast);
> > +        return AVERROR_INVALIDDATA;
> 
> Surely this can be done in one log statement.

Yes, but this would complicate the logic for small gain.

> >     }
> > 
> >     /* Ok, everything seems to be in working order */
> > @@ -376,14 +427,14 @@ static DVMuxContext* dv_init_mux(AVFormatContext* s)
> >         if (!c->ast[i])
> >             continue;
> >         c->audio_data[i] = av_fifo_alloc2(100 * MAX_AUDIO_FRAME_SIZE, 1, 0);
> > -        if (!c->audio_data[i])
> > -            goto bail_out;
> > +        if (!c->audio_data[i]) {
> > +            av_log(s, AV_LOG_ERROR,
> > +                   "Failed to allocate internal buffer.\n");
> 
> Dedicated log messages for small malloc failures are useless bloat.

Dropped.

> > +            return AVERROR(ENOMEM);
> > +        }
> >     }
> > 
> > -    return c;
> > -
> > -bail_out:
> > -    return NULL;
> > +    return 0;
> > }
> > 
> > static int dv_write_header(AVFormatContext *s)
> > @@ -392,10 +443,10 @@ static int dv_write_header(AVFormatContext *s)
> >     DVMuxContext *dvc = s->priv_data;
> >     AVDictionaryEntry *tcr = av_dict_get(s->metadata, "timecode", NULL, 0);
> > 
> > -    if (!dv_init_mux(s)) {
> > +    if (dv_init_mux(s) < 0) {
> >         av_log(s, AV_LOG_ERROR, "Can't initialize DV format!\n"
> >                     "Make sure that you supply exactly two streams:\n"
> 
> This seems inconsistent with the other checks.

Yes, it is probably better to drop this entirely, since we now have more precise reporting (and "exactly two streams" is wrong).

> > -                    " video: 25fps or 29.97fps, audio: 2ch/48|44|32kHz/PCM\n"
> > +                    " video: 25fps or 29.97fps, audio: 2ch/48000|44100|32000Hz/PCM\n"
> 
> This does not seem like an improvement.

44 kHz != 44100. I could use 44.1, but that is not the unit used when setting the option, so it is better to be explicit.

Thanks.
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] [PATCH 1/2 v2] avutil: add a Tile Grid API
On 1/21/2024 3:27 AM, Anton Khirnov wrote:
> Quoting James Almer (2024-01-20 23:04:06)
> > This includes a struct and helpers. It will be used to support container
> > level cropping and tiled image formats, but should be generic enough for
> > general usage.
> >
> > Signed-off-by: James Almer
> > ---
> > Extended to include fields used for cropping. Should make the struct
> > reusable even for non tiled images, e.g. setting both rows and tiles to 1,
> > in which case tile width and height would become analogous to
> > coded_{width,height}.
>
> But why? What does cropping have to do with tiling? What advantage is
> there to handling them in one struct?

The struct does not need to be used for non-tiled image scenarios, but it could be, if we decide we don't want to add another struct that would only contain a subset of the fields present here.

As to why said fields are present here: HEIF may use a clap box to define cropping for the final image, not for the tiles. This needs to be propagated, and the previous version of this API, which only defined cropping from the right and bottom edges when the output dimensions were smaller than the grid (the standard case for tiled HEIF with no clap box), was not enough. Hence this change.

I can rename this struct to Image Grid or something else, which might make it feel less awkward if we decide to reuse it. We still need to propagate container cropping from clap boxes and from Matroska elements after all.
[FFmpeg-devel] [vaapi-cavs 1/7] cavs: add cavs profile defs
Signed-off-by: jianfeng.zheng
---
 libavcodec/defs.h     | 3 +++
 libavcodec/profiles.c | 6 ++++++
 libavcodec/profiles.h | 1 +
 3 files changed, 10 insertions(+)

diff --git a/libavcodec/defs.h b/libavcodec/defs.h
index 00d840ec19..d59816a70f 100644
--- a/libavcodec/defs.h
+++ b/libavcodec/defs.h
@@ -192,6 +192,9 @@
 #define AV_PROFILE_EVC_BASELINE             0
 #define AV_PROFILE_EVC_MAIN                 1
 
+#define AV_PROFILE_CAVS_JIZHUN              0x20
+#define AV_PROFILE_CAVS_GUANGDIAN           0x48
+
 #define AV_LEVEL_UNKNOWN                  -99
 
diff --git a/libavcodec/profiles.c b/libavcodec/profiles.c
index 5bb8f150e6..b312f12281 100644
--- a/libavcodec/profiles.c
+++ b/libavcodec/profiles.c
@@ -200,4 +200,10 @@ const AVProfile ff_evc_profiles[] = {
     { AV_PROFILE_UNKNOWN },
 };
 
+const AVProfile ff_cavs_profiles[] = {
+    { AV_PROFILE_CAVS_JIZHUN,    "Jizhun"    },
+    { AV_PROFILE_CAVS_GUANGDIAN, "Guangdian" },
+    { AV_PROFILE_UNKNOWN },
+};
+
 #endif /* !CONFIG_SMALL */

diff --git a/libavcodec/profiles.h b/libavcodec/profiles.h
index 270430a48b..9a2b348ad4 100644
--- a/libavcodec/profiles.h
+++ b/libavcodec/profiles.h
@@ -75,5 +75,6 @@ extern const AVProfile ff_prores_profiles[];
 extern const AVProfile ff_mjpeg_profiles[];
 extern const AVProfile ff_arib_caption_profiles[];
 extern const AVProfile ff_evc_profiles[];
+extern const AVProfile ff_cavs_profiles[];
 
 #endif /* AVCODEC_PROFILES_H */
-- 
2.25.1
[FFmpeg-devel] [vaapi-cavs 2/7] cavs: skip bits between pic header and slc header
Signed-off-by: jianfeng.zheng
---
 libavcodec/cavs.h    |  2 +
 libavcodec/cavsdec.c | 87 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 89 insertions(+)

diff --git a/libavcodec/cavs.h b/libavcodec/cavs.h
index 244c322b35..ad49abff92 100644
--- a/libavcodec/cavs.h
+++ b/libavcodec/cavs.h
@@ -39,8 +39,10 @@
 #define EXT_START_CODE      0x01b5
 #define USER_START_CODE     0x01b2
 #define CAVS_START_CODE     0x01b0
+#define VIDEO_SEQ_END_CODE  0x01b1
 #define PIC_I_START_CODE    0x01b3
 #define PIC_PB_START_CODE   0x01b6
+#define VIDEO_EDIT_CODE     0x01b7
 
 #define A_AVAIL 1
 #define B_AVAIL 2

diff --git a/libavcodec/cavsdec.c b/libavcodec/cavsdec.c
index b356da0b04..9742bd1011 100644
--- a/libavcodec/cavsdec.c
+++ b/libavcodec/cavsdec.c
@@ -954,6 +954,80 @@ static inline int decode_slice_header(AVSContext *h, GetBitContext *gb)
     return 0;
 }
 
+/**
+ * Skip stuffing bits before the next start code "0x01".
+ * @return 0 if no stuffing bits at h->gb were skipped, 1 otherwise.
+ */
+static inline int skip_stuffing_bits(AVSContext *h)
+{
+    GetBitContext gb0 = h->gb;
+    GetBitContext *gb = &h->gb;
+    const uint8_t *start;
+    const uint8_t *ptr;
+    const uint8_t *end;
+    int align;
+    int stuffing_zeros;
+
+    /**
+     * According to the spec, there should be one stuffing bit '1' and
+     * 0~7 stuffing bits '0'. But it seems not all streams follow
+     * "next_start_code()" strictly.
+     */
+    align = (-get_bits_count(gb)) & 7;
+    if (align == 0 && show_bits_long(gb, 8) == 0x80) {
+        skip_bits_long(gb, 8);
+    }
+
+    /**
+     * skip leading zero bytes before the 0x 00 00 01 start code
+     */
+    ptr = start = align_get_bits(gb);
+    end = gb->buffer_end;
+    while (ptr < end && *ptr == 0)
+        ptr++;
+
+    if ((ptr >= end) || (*ptr == 1 && ptr - start >= 2)) {
+        stuffing_zeros = (ptr >= end ? end - start : ptr - start - 2);
+        if (stuffing_zeros > 0)
+            av_log(h->avctx, AV_LOG_DEBUG, "Skip 0x%x stuffing zeros @0x%x.\n",
+                   stuffing_zeros, (int)(start - gb->buffer));
+        skip_bits_long(gb, stuffing_zeros * 8);
+        return 1;
+    } else {
+        av_log(h->avctx, AV_LOG_DEBUG, "No next_start_code() found @0x%x.\n",
+               (int)(start - gb->buffer));
+        goto restore_get_bits;
+    }
+
+restore_get_bits:
+    h->gb = gb0;
+    return 0;
+}
+
+static inline int skip_extension_and_user_data(AVSContext *h)
+{
+    int stc = -1;
+    const uint8_t *start = align_get_bits(&h->gb);
+    const uint8_t *end = h->gb.buffer_end;
+    const uint8_t *ptr, *next;
+
+    for (ptr = start; ptr + 4 < end; ptr = next) {
+        stc = show_bits_long(&h->gb, 32);
+        if (stc != EXT_START_CODE && stc != USER_START_CODE) {
+            break;
+        }
+        next = avpriv_find_start_code(ptr + 4, end, &stc);
+        if (next < end) {
+            next -= 4;
+        }
+        skip_bits(&h->gb, (next - ptr) * 8);
+        av_log(h->avctx, AV_LOG_DEBUG, "skip %d byte ext/user data\n",
+               (int)(next - ptr));
+    }
+
+    return ptr > start;
+}
+
 static inline int check_for_slice(AVSContext *h)
 {
     GetBitContext *gb = &h->gb;
@@ -1019,6 +1093,8 @@ static int decode_pic(AVSContext *h)
             h->stream_revision = 1;
         if (h->stream_revision > 0)
             skip_bits(&h->gb, 1); //marker_bit
+
+        av_log(h->avctx, AV_LOG_DEBUG, "stream_revision: %d\n", h->stream_revision);
     }
 
     if (get_bits_left(&h->gb) < 23)
@@ -1096,6 +1172,11 @@ static int decode_pic(AVSContext *h)
         h->alpha_offset = h->beta_offset = 0;
     }
 
+    if (h->stream_revision > 0) {
+        skip_stuffing_bits(h);
+        skip_extension_and_user_data(h);
+    }
+
     ret = 0;
     if (h->cur.f->pict_type == AV_PICTURE_TYPE_I) {
         do {
@@ -1309,6 +1390,12 @@ static int cavs_decode_frame(AVCodecContext *avctx, AVFrame *rframe,
         case USER_START_CODE:
             //mpeg_decode_user_data(avctx, buf_ptr, input_size);
             break;
+        case VIDEO_EDIT_CODE:
+            av_log(h->avctx, AV_LOG_WARNING, "Skip video_edit_code\n");
+            break;
+        case VIDEO_SEQ_END_CODE:
+            av_log(h->avctx, AV_LOG_WARNING, "Skip video_sequence_end_code\n");
+            break;
         default:
             if (stc <= SLICE_MAX_START_CODE) {
                 init_get_bits(&h->gb, buf_ptr, input_size);
-- 
2.25.1
[FFmpeg-devel] [vaapi-cavs 3/7] cavs: time code debug
Signed-off-by: jianfeng.zheng
---
 libavcodec/cavsdec.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/libavcodec/cavsdec.c b/libavcodec/cavsdec.c
index 9742bd1011..9ad0f29b01 100644
--- a/libavcodec/cavsdec.c
+++ b/libavcodec/cavsdec.c
@@ -1061,6 +1061,7 @@ static int decode_pic(AVSContext *h)
     int ret;
     int skip_count = -1;
     enum cavs_mb mb_type;
+    char tc[4];
 
     if (!h->top_qp) {
         av_log(h->avctx, AV_LOG_ERROR, "No sequence header decoded yet\n");
@@ -1082,8 +1083,16 @@ static int decode_pic(AVSContext *h)
             return AVERROR_INVALIDDATA;
     } else {
         h->cur.f->pict_type = AV_PICTURE_TYPE_I;
-        if (get_bits1(&h->gb))
-            skip_bits(&h->gb, 24); //time_code
+        if (get_bits1(&h->gb)) { //time_code
+            skip_bits(&h->gb, 1);
+            tc[0] = get_bits(&h->gb, 5);
+            tc[1] = get_bits(&h->gb, 6);
+            tc[2] = get_bits(&h->gb, 6);
+            tc[3] = get_bits(&h->gb, 6);
+            av_log(h->avctx, AV_LOG_DEBUG, "timecode: %d:%d:%d.%d\n",
+                   tc[0], tc[1], tc[2], tc[3]);
+        }
+
         /* old sample clips were all progressive and no low_delay,
          * bump stream revision if detected otherwise */
         if (h->low_delay || !(show_bits(&h->gb, 9) & 1))
-- 
2.25.1
[FFmpeg-devel] [vaapi-cavs 4/7] cavs: fix dpb reorder issues when 'low_delay' is varied
Consider multiple sequences in one stream; 'low_delay' may change between sequences.

Signed-off-by: jianfeng.zheng
---
 libavcodec/cavs.c    |  12 +++++
 libavcodec/cavs.h    |   2 +
 libavcodec/cavsdec.c | 105 ++++++++++++++++++++++++++++++++++---------
 3 files changed, 95 insertions(+), 24 deletions(-)

diff --git a/libavcodec/cavs.c b/libavcodec/cavs.c
index fdd577f7fb..ed7b278336 100644
--- a/libavcodec/cavs.c
+++ b/libavcodec/cavs.c
@@ -810,6 +810,14 @@ av_cold int ff_cavs_init(AVCodecContext *avctx)
     if (!h->cur.f || !h->DPB[0].f || !h->DPB[1].f)
         return AVERROR(ENOMEM);
 
+    h->out[0].f = av_frame_alloc();
+    h->out[1].f = av_frame_alloc();
+    h->out[2].f = av_frame_alloc();
+    if (!h->out[0].f || !h->out[1].f || !h->out[2].f) {
+        ff_cavs_end(avctx);
+        return AVERROR(ENOMEM);
+    }
+
     h->luma_scan[0] = 0;
     h->luma_scan[1] = 8;
     h->intra_pred_l[INTRA_L_VERT] = intra_pred_vert;
@@ -840,6 +848,10 @@ av_cold int ff_cavs_end(AVCodecContext *avctx)
     av_frame_free(&h->DPB[0].f);
     av_frame_free(&h->DPB[1].f);
 
+    av_frame_free(&h->out[0].f);
+    av_frame_free(&h->out[1].f);
+    av_frame_free(&h->out[2].f);
+
     av_freep(&h->top_qp);
     av_freep(&h->top_mv[0]);
     av_freep(&h->top_mv[1]);

diff --git a/libavcodec/cavs.h b/libavcodec/cavs.h
index ad49abff92..f490657959 100644
--- a/libavcodec/cavs.h
+++ b/libavcodec/cavs.h
@@ -166,6 +166,7 @@ struct dec_2dvlc {
 typedef struct AVSFrame {
     AVFrame *f;
     int poc;
+    int outputed;
 } AVSFrame;
 
 typedef struct AVSContext {
@@ -177,6 +178,7 @@ typedef struct AVSContext {
     GetBitContext gb;
     AVSFrame cur;    ///< currently decoded frame
     AVSFrame DPB[2]; ///< reference frames
+    AVSFrame out[3]; ///< output queue, size 2 maybe enough
     int dist[2];     ///< temporal distances from current frame to ref frames
     int low_delay;
     int profile, level;

diff --git a/libavcodec/cavsdec.c b/libavcodec/cavsdec.c
index 9ad0f29b01..6f462d861c 100644
--- a/libavcodec/cavsdec.c
+++ b/libavcodec/cavsdec.c
@@ -1056,6 +1056,44 @@ static inline int check_for_slice(AVSContext *h)
  *
  */
 
+/**
+ * @brief remove a frame from the DPB
+ */
+static void cavs_frame_unref(AVSFrame *frame)
+{
+    /* frame->f can be NULL if context init failed */
+    if (!frame->f || !frame->f->buf[0])
+        return;
+
+    av_frame_unref(frame->f);
+}
+
+static int output_one_frame(AVSContext *h, AVFrame *data, int *got_frame)
+{
+    if (h->out[0].f->buf[0]) {
+        av_log(h->avctx, AV_LOG_DEBUG, "output frame: poc=%d\n", h->out[0].poc);
+        av_frame_move_ref(data, h->out[0].f);
+        *got_frame = 1;
+
+        // out[0] <- out[1] <- out[2] <- out[0]
+        cavs_frame_unref(&h->out[2]);
+        FFSWAP(AVSFrame, h->out[0], h->out[2]);
+        FFSWAP(AVSFrame, h->out[0], h->out[1]);
+
+        return 1;
+    }
+
+    return 0;
+}
+
+static void queue_one_frame(AVSContext *h, AVSFrame *out)
+{
+    int idx = !h->out[0].f->buf[0] ? 0 : (!h->out[1].f->buf[0] ? 1 : 2);
+    av_log(h->avctx, AV_LOG_DEBUG, "queue in out[%d]: poc=%d\n", idx, out->poc);
+    av_frame_ref(h->out[idx].f, out->f);
+    h->out[idx].poc = out->poc;
+}
+
 static int decode_pic(AVSContext *h)
 {
     int ret;
@@ -1068,7 +1106,7 @@ static int decode_pic(AVSContext *h)
         return AVERROR_INVALIDDATA;
     }
 
-    av_frame_unref(h->cur.f);
+    cavs_frame_unref(&h->cur);
 
     skip_bits(&h->gb, 16); //bbv_dwlay
     if (h->stc == PIC_PB_START_CODE) {
@@ -1077,10 +1115,13 @@ static int decode_pic(AVSContext *h)
             av_log(h->avctx, AV_LOG_ERROR, "illegal picture type\n");
             return AVERROR_INVALIDDATA;
         }
+
         /* make sure we have the reference frames we need */
-        if (!h->DPB[0].f->data[0] ||
-            (!h->DPB[1].f->data[0] && h->cur.f->pict_type == AV_PICTURE_TYPE_B))
+        if (!h->DPB[0].f->buf[0] ||
+            (!h->DPB[1].f->buf[0] && h->cur.f->pict_type == AV_PICTURE_TYPE_B)) {
+            av_log(h->avctx, AV_LOG_ERROR, "Invalid reference frame\n");
             return AVERROR_INVALIDDATA;
+        }
     } else {
         h->cur.f->pict_type = AV_PICTURE_TYPE_I;
         if (get_bits1(&h->gb)) { //time_code
@@ -1124,6 +1165,8 @@ static int decode_pic(AVSContext *h)
     if ((ret = ff_cavs_init_pic(h)) < 0)
         return ret;
     h->cur.poc = get_bits(&h->gb, 8) * 2;
+    av_log(h->avctx, AV_LOG_DEBUG, "poc=%d, type=%d\n",
+           h->cur.poc, h->cur.f->pict_type);
 
     /* get temporal distances and MV scaling factors */
     if (h->cur.f->pict_type != AV_PICTURE_TYPE_B) {
@@ -1137,6 +1180,8 @@ static int decode_pic(AVSContext *h)
     if (h->cur.f->pict_type == AV_PICTURE_TYPE_B) {
         h->sym_factor = h->dist[0] * h->scale_den[1];
         if (FFABS(h->sym_factor) > 32768) {
+            av_log(h->avctx, AV_LOG_ERROR, "poc=%d/%d/
[FFmpeg-devel] [vaapi-cavs 5/7] cavs: decode wqm and slice weighting for future usage
Signed-off-by: jianfeng.zheng
---
 libavcodec/cavs.h    |  26 +++++-
 libavcodec/cavsdec.c | 142 ++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 147 insertions(+), 21 deletions(-)

diff --git a/libavcodec/cavs.h b/libavcodec/cavs.h
index f490657959..33ef10e850 100644
--- a/libavcodec/cavs.h
+++ b/libavcodec/cavs.h
@@ -186,12 +186,36 @@ typedef struct AVSContext {
     int mb_width, mb_height;
     int width, height;
     int stream_revision; ///< 0 for samples from 2006, 1 for rm52j encoder
-    int progressive;
+    int progressive_seq;
+    int progressive_frame;
     int pic_structure;
     int skip_mode_flag; ///< select between skip_count or one skip_flag per MB
     int loop_filter_disable;
     int alpha_offset, beta_offset;
     int ref_flag;
+
+    /** \defgroup guangdian profile
+     * @{
+     */
+    int aec_flag;
+    int weight_quant_flag;
+    int chroma_quant_param_delta_cb;
+    int chroma_quant_param_delta_cr;
+    uint8_t wqm_8x8[64];
+    /**@}*/
+
+    /** \defgroup slice weighting
+     * FFmpeg doesn't support slice weighting natively, but it may be needed for HWaccel.
+     * @{
+     */
+    uint32_t slice_weight_pred_flag : 1;
+    uint32_t mb_weight_pred_flag    : 1;
+    uint8_t luma_scale[4];
+    int8_t  luma_shift[4];
+    uint8_t chroma_scale[4];
+    int8_t  chroma_shift[4];
+    /**@}*/
+
     int mbx, mby, mbidx; ///< macroblock coordinates
     int flags;           ///< availability flags of neighbouring macroblocks
     int stc;             ///< last start code

diff --git a/libavcodec/cavsdec.c b/libavcodec/cavsdec.c
index 6f462d861c..8d3ba530a6 100644
--- a/libavcodec/cavsdec.c
+++ b/libavcodec/cavsdec.c
@@ -30,6 +30,7 @@
 #include "avcodec.h"
 #include "get_bits.h"
 #include "golomb.h"
+#include "profiles.h"
 #include "cavs.h"
 #include "codec_internal.h"
 #include "decode.h"
@@ -37,6 +38,43 @@
 #include "mpeg12data.h"
 #include "startcode.h"
 
+static const uint8_t default_wq_param[4][6] = {
+    {128,  98, 106, 116, 116, 128},
+    {135, 143, 143, 160, 160, 213},
+    {128,  98, 106, 116, 116, 128},
+    {128, 128, 128, 128, 128, 128},
+};
+static const uint8_t wq_model_2_param[4][64] = {
+    {
+        0, 0, 0, 4, 4, 4, 5, 5,
+        0, 0, 3, 3, 3, 3, 5, 5,
+        0, 3, 2, 2, 1, 1, 5, 5,
+        4, 3, 2, 2, 1, 5, 5, 5,
+        4, 3, 1, 1, 5, 5, 5, 5,
+        4, 3, 1, 5, 5, 5, 5, 5,
+        5, 5, 5, 5, 5, 5, 5, 5,
+        5, 5, 5, 5, 5, 5, 5, 5,
+    }, {
+        0, 0, 0, 4, 4, 4, 5, 5,
+        0, 0, 4, 4, 4, 4, 5, 5,
+        0, 3, 2, 2, 2, 1, 5, 5,
+        3, 3, 2, 2, 1, 5, 5, 5,
+        3, 3, 2, 1, 5, 5, 5, 5,
+        3, 3, 1, 5, 5, 5, 5, 5,
+        5, 5, 5, 5, 5, 5, 5, 5,
+        5, 5, 5, 5, 5, 5, 5, 5,
+    }, {
+        0, 0, 0, 4, 4, 3, 5, 5,
+        0, 0, 4, 4, 3, 2, 5, 5,
+        0, 4, 4, 3, 2, 1, 5, 5,
+        4, 4, 3, 2, 1, 5, 5, 5,
+        4, 3, 2, 1, 5, 5, 5, 5,
+        3, 2, 1, 5, 5, 5, 5, 5,
+        5, 5, 5, 5, 5, 5, 5, 5,
+        5, 5, 5, 5, 5, 5, 5, 5,
+    }
+};
+
 static const uint8_t mv_scan[4] = {
     MV_FWD_X0, MV_FWD_X1,
     MV_FWD_X2, MV_FWD_X3
@@ -927,7 +965,11 @@ static int decode_mb_b(AVSContext *h, enum cavs_mb mb_type)
 
 static inline int decode_slice_header(AVSContext *h, GetBitContext *gb)
 {
-    if (h->stc > 0xAF)
+    int i, nref;
+
+    av_log(h->avctx, AV_LOG_TRACE, "slice start code 0x%02x\n", h->stc);
+
+    if (h->stc > SLICE_MAX_START_CODE)
         av_log(h->avctx, AV_LOG_ERROR, "unexpected start code 0x%02x\n", h->stc);
 
     if (h->stc >= h->mb_height) {
@@ -946,11 +988,29 @@ static inline int decode_slice_header(AVSContext *h, GetBitContext *gb)
     }
     /* inter frame or second slice can have weighting params */
     if ((h->cur.f->pict_type != AV_PICTURE_TYPE_I) ||
-        (!h->pic_structure && h->mby >= h->mb_width / 2))
-        if (get_bits1(gb)) { //slice_weighting_flag
+        (!h->pic_structure && h->mby >= h->mb_height / 2)) {
+        h->slice_weight_pred_flag = get_bits1(gb);
+        if (h->slice_weight_pred_flag) {
+            nref = h->cur.f->pict_type == AV_PICTURE_TYPE_I ? 1 : (h->pic_structure ? 2 : 4);
+            for (i = 0; i < nref; i++) {
+                h->luma_scale[i]   = get_bits(gb, 8);
+                h->luma_shift[i]   = get_sbits(gb, 8);
+                skip_bits1(gb);
+                h->chroma_scale[i] = get_bits(gb, 8);
+                h->chroma_shift[i] = get_sbits(gb, 8);
+                skip_bits1(gb);
+            }
+            h->mb_weight_pred_flag = get_bits1(gb);
+            if (!h->avctx->hwaccel) {
                 av_log(h->avctx, AV_LOG_ERROR,
                        "weighted prediction not yet supported\n");
             }
+        }
+    }
+
+    if (h->aec_flag) {
+        align_get_bits(gb);
+    }
+
     return 0;
 }
 
@@ -1108,7 +1168,11 @@ static int decode_pic(AVSContext *h)
 
     cavs_frame_unref(&h->cur);
 
-    skip_bits(&h->gb, 16); //bbv_dwlay
+    skip_bits(&h->gb, 16); //bbv_delay
+    if (h->profile == AV_PROFILE_CAVS_GUANGDIAN) {
[FFmpeg-devel] [vaapi-cavs 7/7] cavs: support vaapi hwaccel decoding
see https://github.com/intel/libva/pull/738

Signed-off-by: jianfeng.zheng
---
 configure                 |  14 +++
 libavcodec/Makefile       |   1 +
 libavcodec/cavs.h         |   4 +
 libavcodec/cavsdec.c      | 101 ++++++++++++++--
 libavcodec/hwaccels.h     |   1 +
 libavcodec/vaapi_cavs.c   | 164 ++++++++++++++++++++++++++
 libavcodec/vaapi_decode.c |   4 +
 7 files changed, 284 insertions(+), 5 deletions(-)
 create mode 100644 libavcodec/vaapi_cavs.c

diff --git a/configure b/configure
index c8ae0a061d..89759eda5d 100755
--- a/configure
+++ b/configure
@@ -2463,6 +2463,7 @@ HAVE_LIST="
     xmllint
     zlib_gzip
     openvino2
+    va_profile_avs
 "
 
 # options emitted with CONFIG_ prefix but not available on the command line
@@ -3202,6 +3203,7 @@
 wmv3_dxva2_hwaccel_select="vc1_dxva2_hwaccel"
 wmv3_nvdec_hwaccel_select="vc1_nvdec_hwaccel"
 wmv3_vaapi_hwaccel_select="vc1_vaapi_hwaccel"
 wmv3_vdpau_hwaccel_select="vc1_vdpau_hwaccel"
+cavs_vaapi_hwaccel_deps="vaapi va_profile_avs VAPictureParameterBufferAVS"
 
 # hardware-accelerated codecs
 mediafoundation_deps="mftransform_h MFCreateAlignedMemoryBuffer"
@@ -7175,6 +7177,18 @@
 if enabled vaapi; then
     check_type "va/va.h va/va_enc_vp8.h" "VAEncPictureParameterBufferVP8"
     check_type "va/va.h va/va_enc_vp9.h" "VAEncPictureParameterBufferVP9"
     check_type "va/va.h va/va_enc_av1.h" "VAEncPictureParameterBufferAV1"
+
+    #
+    # Using 'VA_CHECK_VERSION' in the source code would make things easy, but
+    # we would have to wait until the newly added VAProfile is distributed in
+    # a released VAAPI version.
+    #
+    # Before or after that, we can use auto-detection to keep version
+    # compatibility. It always works.
+    #
+    disable va_profile_avs &&
+        test_code cc va/va.h "VAProfile p1 = VAProfileAVSJizhun, p2 = VAProfileAVSGuangdian;" &&
+        enable va_profile_avs
+    enabled va_profile_avs && check_type "va/va.h va/va_dec_avs.h" "VAPictureParameterBufferAVS"
 fi
 
 if enabled_all opencl libdrm ; then

diff --git a/libavcodec/Makefile b/libavcodec/Makefile
index bb42095165..7d92375fed 100644
--- a/libavcodec/Makefile
+++ b/libavcodec/Makefile
@@ -1055,6 +1055,7 @@
 OBJS-$(CONFIG_VP9_VAAPI_HWACCEL)          += vaapi_vp9.o
 OBJS-$(CONFIG_VP9_VDPAU_HWACCEL)          += vdpau_vp9.o
 OBJS-$(CONFIG_VP9_VIDEOTOOLBOX_HWACCEL)   += videotoolbox_vp9.o
 OBJS-$(CONFIG_VP8_QSV_HWACCEL)            += qsvdec.o
+OBJS-$(CONFIG_CAVS_VAAPI_HWACCEL)         += vaapi_cavs.o
 
 # Objects duplicated from other libraries for shared builds
 SHLIBOBJS                                 += log2_tab.o reverse.o

diff --git a/libavcodec/cavs.h b/libavcodec/cavs.h
index 33ef10e850..4a0918da5a 100644
--- a/libavcodec/cavs.h
+++ b/libavcodec/cavs.h
@@ -167,10 +167,14 @@ typedef struct AVSFrame {
     AVFrame *f;
     int poc;
     int outputed;
+
+    AVBufferRef *hwaccel_priv_buf;
+    void *hwaccel_picture_private;
 } AVSFrame;
 
 typedef struct AVSContext {
     AVCodecContext *avctx;
+    int got_pix_fmt;
     BlockDSPContext bdsp;
     H264ChromaContext h264chroma;
     VideoDSPContext vdsp;

diff --git a/libavcodec/cavsdec.c b/libavcodec/cavsdec.c
index 5036ef50f7..5ca021c098 100644
--- a/libavcodec/cavsdec.c
+++ b/libavcodec/cavsdec.c
@@ -25,11 +25,14 @@
  * @author Stefan Gehrer
  */
 
+#include "config_components.h"
 #include "libavutil/avassert.h"
 #include "libavutil/emms.h"
 #include "avcodec.h"
 #include "get_bits.h"
 #include "golomb.h"
+#include "hwaccel_internal.h"
+#include "hwconfig.h"
 #include "profiles.h"
 #include "cavs.h"
 #include "codec_internal.h"
 #include "decode.h"
@@ -1002,9 +1005,9 @@ static inline int decode_slice_header(AVSContext *h, GetBitContext *gb)
             }
             h->mb_weight_pred_flag = get_bits1(gb);
             if (!h->avctx->hwaccel) {
-            av_log(h->avctx, AV_LOG_ERROR,
-                   "weighted prediction not yet supported\n");
-        }
+                av_log(h->avctx, AV_LOG_ERROR,
+                       "weighted prediction not yet supported\n");
+            }
         }
     }
     if (h->aec_flag) {
@@ -1115,6 +1118,46 @@ static inline int check_for_slice(AVSContext *h)
  * frame level
  */
 
+static int hwaccel_pic(AVSContext *h)
+{
+    int ret = 0;
+    int stc = -1;
+    const uint8_t *frm_start = align_get_bits(&h->gb);
+    const uint8_t *frm_end   = h->gb.buffer_end;
+    const uint8_t *slc_start = frm_start;
+    const uint8_t *slc_end   = frm_end;
+    GetBitContext gb = h->gb;
+    const FFHWAccel *hwaccel = ffhwaccel(h->avctx->hwaccel);
+
+    ret = hwaccel->start_frame(h->avctx, NULL, 0);
+    if (ret < 0)
+        return ret;
+
+    for (slc_start = frm_start; slc_start + 4 < frm_end; slc_start = slc_end) {
+        slc_end = avpriv_find_start_code(slc_start + 4, frm_end, &stc);
+        if (slc_end < frm_end) {
+            slc_end -= 4;
+        }
+
+        init_get_bits(&h->gb, slc_start, (slc_end - s
[FFmpeg-devel] [vaapi-cavs 6/7] cavs: set profile & level for AVCodecContext
Signed-off-by: jianfeng.zheng
---
 libavcodec/cavsdec.c      | 5 ++++-
 tests/ref/fate/cavs-demux | 2 +-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/libavcodec/cavsdec.c b/libavcodec/cavsdec.c
index 8d3ba530a6..5036ef50f7 100644
--- a/libavcodec/cavsdec.c
+++ b/libavcodec/cavsdec.c
@@ -1499,7 +1499,10 @@ static int cavs_decode_frame(AVCodecContext *avctx, AVFrame *rframe,
         switch (stc) {
         case CAVS_START_CODE:
             init_get_bits(&h->gb, buf_ptr, input_size);
-            decode_seq_header(h);
+            if ((ret = decode_seq_header(h)) < 0)
+                return ret;
+            avctx->profile = h->profile;
+            avctx->level   = h->level;
             break;
         case PIC_I_START_CODE:
             if (!h->got_keyframe) {

diff --git a/tests/ref/fate/cavs-demux b/tests/ref/fate/cavs-demux
index 000b32ab05..6381f2075b 100644
--- a/tests/ref/fate/cavs-demux
+++ b/tests/ref/fate/cavs-demux
@@ -58,5 +58,5 @@
 packet|codec_type=video|stream_index=0|pts=228|pts_time=1.90|dts=228
 packet|codec_type=video|stream_index=0|pts=232|pts_time=1.93|dts=232|dts_time=1.93|duration=4|duration_time=0.03|size=67|pos=172185|flags=K__|data_hash=CRC32:42484449
 packet|codec_type=video|stream_index=0|pts=236|pts_time=1.97|dts=236|dts_time=1.97|duration=4|duration_time=0.03|size=83|pos=172252|flags=K__|data_hash=CRC32:a941bdf0
 packet|codec_type=video|stream_index=0|pts=240|pts_time=2.00|dts=240|dts_time=2.00|duration=4|duration_time=0.03|size=5417|pos=172335|flags=K__|data_hash=CRC32:9d0d503b
-stream|index=0|codec_name=cavs|profile=unknown|codec_type=video|codec_tag_string=[0][0][0][0]|codec_tag=0x|width=1280|height=720|coded_width=1280|coded_height=720|closed_captions=0|film_grain=0|has_b_frames=0|sample_aspect_ratio=N/A|display_aspect_ratio=N/A|pix_fmt=yuv420p|level=-99|color_range=unknown|color_space=unknown|color_transfer=unknown|color_primaries=unknown|chroma_location=unspecified|field_order=unknown|refs=1|id=N/A|r_frame_rate=30/1|avg_frame_rate=25/1|time_base=1/120|start_pts=N/A|start_time=N/A|duration_ts=N/A|duration=N/A|bit_rate=N/A|max_bit_rate=N/A|bits_per_raw_sample=N/A|nb_frames=N/A|nb_read_frames=N/A|nb_read_packets=60|extradata_size=18|extradata_hash=CRC32:1255d52e|disposition:default=0|disposition:dub=0|disposition:original=0|disposition:comment=0|disposition:lyrics=0|disposition:karaoke=0|disposition:forced=0|disposition:hearing_impaired=0|disposition:visual_impaired=0|disposition:clean_effects=0|disposition:attached_pic=0|disposition:timed_thumbnails=0|disposition:non_diegetic=0|disposition:captions=0|disposition:descriptions=0|disposition:metadata=0|disposition:dependent=0|disposition:still_image=0
+stream|index=0|codec_name=cavs|profile=32|codec_type=video|codec_tag_string=[0][0][0][0]|codec_tag=0x|width=1280|height=720|coded_width=1280|coded_height=720|closed_captions=0|film_grain=0|has_b_frames=0|sample_aspect_ratio=N/A|display_aspect_ratio=N/A|pix_fmt=yuv420p|level=64|color_range=unknown|color_space=unknown|color_transfer=unknown|color_primaries=unknown|chroma_location=unspecified|field_order=unknown|refs=1|id=N/A|r_frame_rate=30/1|avg_frame_rate=25/1|time_base=1/120|start_pts=N/A|start_time=N/A|duration_ts=N/A|duration=N/A|bit_rate=N/A|max_bit_rate=N/A|bits_per_raw_sample=N/A|nb_frames=N/A|nb_read_frames=N/A|nb_read_packets=60|extradata_size=18|extradata_hash=CRC32:1255d52e|disposition:default=0|disposition:dub=0|disposition:original=0|disposition:comment=0|disposition:lyrics=0|disposition:karaoke=0|disposition:forced=0|disposition:hearing_impaired=0|disposition:visual_impaired=0|disposition:clean_effects=0|disposition:attached_pic=0|disposition:timed_thumbnails=0|disposition:non_diegetic=0|disposition:captions=0|disposition:descriptions=0|disposition:metadata=0|disposition:dependent=0|disposition:still_image=0
 format|filename=bunny.mp4|nb_streams=1|nb_programs=0|format_name=cavsvideo|start_time=N/A|duration=N/A|size=177752|bit_rate=N/A|probe_score=51
-- 
2.25.1
Re: [FFmpeg-devel] [PATCH v1 2/2] vaapi: add vaapi_avs2 support
Zhao Zhili wrote on Saturday, 2024-01-20 at 12:22:
> > -----Original Message-----
> > From: ffmpeg-devel On Behalf Of jianfeng.zheng
> > Sent: 2024-01-19 23:53
> > To: ffmpeg-devel@ffmpeg.org
> > Cc: jianfeng.zheng
> > Subject: [FFmpeg-devel] [PATCH v1 2/2] vaapi: add vaapi_avs2 support
> >
> > see https://github.com/intel/libva/pull/738
> >
> > [Moore Threads](https://www.mthreads.com) (short for Mthreads) is a
> > Chinese GPU manufacturer. All our products, like MTTS70/MTTS80/.. ,
> > support AVS2 8bit/10bit HW decoding at max 8k resolution.
> >
> > Signed-off-by: jianfeng.zheng
> > ---
> >  configure                    |   7 +
> >  libavcodec/Makefile          |   2 +
> >  libavcodec/allcodecs.c       |   1 +
> >  libavcodec/avs2.c            | 345 ++-
> >  libavcodec/avs2.h            | 460 +++-
> >  libavcodec/avs2_parser.c     |   5 +-
> >  libavcodec/avs2dec.c         | 569 +
> >  libavcodec/avs2dec.h         |  48 +++
> >  libavcodec/avs2dec_headers.c | 787 +++
> >  libavcodec/codec_desc.c      |   5 +-
> >  libavcodec/defs.h            |   4 +
> >  libavcodec/hwaccels.h        |   1 +
> >  libavcodec/libdavs2.c        |   2 +-
> >  libavcodec/profiles.c        |   6 +
> >  libavcodec/profiles.h        |   1 +
> >  libavcodec/vaapi_avs2.c      | 227 ++
> >  libavcodec/vaapi_decode.c    |   5 +
> >  libavformat/matroska.c       |   1 +
> >  libavformat/mpeg.h           |   1 +
> >  19 files changed, 2450 insertions(+), 27 deletions(-)
> >  create mode 100644 libavcodec/avs2dec.c
> >  create mode 100644 libavcodec/avs2dec.h
> >  create mode 100644 libavcodec/avs2dec_headers.c
> >  create mode 100644 libavcodec/vaapi_avs2.c
>
> Please split the patch properly. It's hard to review in a single chunk, and
> it can't be tested without the hardware.

As a new player in the GPU market, we have always attached great importance to participation in the open source community, and we are willing to feed our new features back into the field of video hardware acceleration. As a pioneer, these new features may only be supported by our hardware at the current time. We are willing to provide some market-selling devices free of charge to community-accredited contributors for testing the related functionality.
[FFmpeg-devel] [PATCH] flv: fix stereo flag when writing PCMA/PCMU
Currently, when writing PCMA or PCMU tracks with FLV or RTMP, the stereo flag and sample rate flag inside RTMP audio messages are overridden, making it impossible to distinguish between mono and stereo tracks. This patch fixes the issue by restoring the same flag mechanism used by all other codecs, which takes into account the actual channel count and sample rate.

Signed-off-by: Alessandro Ros
---
 libavformat/flvenc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libavformat/flvenc.c b/libavformat/flvenc.c
index 874560fac1..772d891136 100644
--- a/libavformat/flvenc.c
+++ b/libavformat/flvenc.c
@@ -208,10 +208,10 @@ error:
         flags |= FLV_CODECID_NELLYMOSER | FLV_SAMPLESSIZE_16BIT;
         break;
     case AV_CODEC_ID_PCM_MULAW:
-        flags = FLV_CODECID_PCM_MULAW | FLV_SAMPLERATE_SPECIAL | FLV_SAMPLESSIZE_16BIT;
+        flags |= FLV_CODECID_PCM_MULAW | FLV_SAMPLESSIZE_16BIT;
         break;
     case AV_CODEC_ID_PCM_ALAW:
-        flags = FLV_CODECID_PCM_ALAW | FLV_SAMPLERATE_SPECIAL | FLV_SAMPLESSIZE_16BIT;
+        flags |= FLV_CODECID_PCM_ALAW | FLV_SAMPLESSIZE_16BIT;
         break;
     case 0:
         flags |= par->codec_tag << 4;
-- 
2.34.1
Re: [FFmpeg-devel] [PATCH v4 0/2] GSoC 2023: Add Audio Overlay Filter
Ping On Tue, 16 Jan 2024 at 5:46 PM, Harshit Karwal wrote: > Includes some fixes authored by Paul over the v3 patch I sent earlier, and > FATE tests for the filter. > > Harshit Karwal (2): > avfilter: add audio overlay filter > fate: Add tests for aoverlay filter > > doc/filters.texi | 40 ++ > libavfilter/Makefile | 1 + > libavfilter/af_aoverlay.c | 538 + > libavfilter/allfilters.c | 1 + > tests/fate/filter-audio.mak| 22 + > tests/ref/fate/filter-aoverlay-crossfade-d | 224 + > tests/ref/fate/filter-aoverlay-crossfade-t | 202 > tests/ref/fate/filter-aoverlay-default | 259 ++ > tests/ref/fate/filter-aoverlay-timeline| 254 ++ > 9 files changed, 1541 insertions(+) > create mode 100644 libavfilter/af_aoverlay.c > create mode 100644 tests/ref/fate/filter-aoverlay-crossfade-d > create mode 100644 tests/ref/fate/filter-aoverlay-crossfade-t > create mode 100644 tests/ref/fate/filter-aoverlay-default > create mode 100644 tests/ref/fate/filter-aoverlay-timeline > > -- > 2.39.3 (Apple Git-145) > >
Re: [FFmpeg-devel] [PATCH 1/2 v2] avutil: add a Tile Grid API
Quoting James Almer (2024-01-21 13:06:28) > On 1/21/2024 3:27 AM, Anton Khirnov wrote: > > Quoting James Almer (2024-01-20 23:04:06) > >> This includes a struct and helpers. It will be used to support container > >> level > >> cropping and tiled image formats, but should be generic enough for general > >> usage. > >> > >> Signed-off-by: James Almer > >> --- > >> Extended to include fields used for cropping. Should make the struct > >> reusable > >> even for non tiled images, e.g. setting both rows and tiles to 1, in which > >> case > >> tile width and height would become analogous to coded_{witdh,height}. > > > > But why? What does cropping have to do with tiling? What advantage is > > there to handling them in one struct? > > The struct does not need to be used for non tiled image scenarios, but > could if we decide we don't want to add another struct that would only > contain a subset of the fields present here. > > As to why said fields are present here, HEIF may use a clap box to > define cropping for the final image, not for the tiles. This needs to be > propagated, and the previous version of this API, which only defined > cropping from right and bottom edges if output dimensions were smaller > than the grid (standard case for tiled heif with no clap box), was not > enough. Hence this change. > > I can rename this struct to Image Grid or something else, which might > make it feel less awkward if we decide to reuse it. We still need to > propagate container cropping from clap boxes and from Matroska elements > after all. Honestly this whole new API strikes me as massively overthinking it. All you should need to describe an arbitrary partition of an image into sub-rectangles is an array of (x, y, width, height). Instead you're proposing a new public header, struct, three functions, multiple "tile types", and if I'm not mistaken it still cannot describe an arbitrary partitioning. 
Plus it's in libavutil for some reason, even though libavformat seems to be the only intended user. Is all this complexity really warranted? -- Anton Khirnov
Re: [FFmpeg-devel] [PATCH 1/2] lavf/dvenc: improve error messaging
Quoting Stefano Sabatini (2024-01-21 11:30:27) > > > -if (((c->n_ast > 1) && (c->sys->n_difchan < 2)) || > > > -((c->n_ast > 2) && (c->sys->n_difchan < 4))) { > > > -/* only 2 stereo pairs allowed in 50Mbps mode */ > > > -goto bail_out; > > > +if ((c->n_ast > 1) && (c->sys->n_difchan < 2)) { > > > +av_log(s, AV_LOG_ERROR, > > > + "Invalid number of channels %d, only 1 stereo pairs is > > > allowed in 25Mps mode.\n", > > > + c->n_ast); > > > +return AVERROR_INVALIDDATA; > > > +} > > > +if ((c->n_ast > 2) && (c->sys->n_difchan < 4)) { > > > +av_log(s, AV_LOG_ERROR, > > > + "Invalid number of channels %d, only 2 stereo pairs are > > > allowed in 50Mps mode.\n", > > > + c->n_ast); > > > +return AVERROR_INVALIDDATA; > > > > > Surely this can be done in one log statement. > > Yes, but this would complicate the logic for small gain. More complicated than duplicating 5 lines? I wouldn't say so, not to mention the string also has to be duplicated in the binary. Also, can the second case even trigger? Seems like the block above ensures n_ast is never larger than 2. > > > -" video: 25fps or 29.97fps, audio: > > > 2ch/48|44|32kHz/PCM\n" > > > > +" video: 25fps or 29.97fps, audio: > > > 2ch/48000|44100|32000Hz/PCM\n" > > > > This does not seem like an improvement. > > 44kHz != 44100 > > I could use 44.1 but this is not the unit used when setting the > option It can be. -- Anton Khirnov
Re: [FFmpeg-devel] [PATCH 7/8] fftools/ffmpeg_demux: implement -bsf for input
Quoting Stefano Sabatini (2024-01-20 12:32:42) > On date Wednesday 2024-01-17 10:02:31 +0100, Anton Khirnov wrote: > > Quoting Stefano Sabatini (2024-01-06 13:12:19) > > > > > > This looks spurious, since this suggests the example is about the > > > listing, and it's applying a weird order of example/explanation > > > (rather than the opposite). > > Use the @code{-bsfs} option to get the list of bitstream filters. E.g. > @example > ... > > The problem here is that "E.g." is placed close to a statement about > the listing, therefore it might sound like the example is about the > listing (which is not). I moved it to a new paragraph. > > I see nothing weird about this order, it's the standard way it is done > > in most literature I encounter. I find the reverse order you're > > suggesting far more weird and unnatural. > > When you present an example you usually start with an explanation > (what it does) and then present the command, not the other way around. I don't, neither does most literature I can recall. Typically you first present a thing, then explain its structure. Explaining the structure of something the reader has not seen yet is backwards, unnatural, and hard to understand. > > Also the following: > -- > ffmpeg -bsf:v h264_mp4toannexb -i h264.mp4 -c:v copy -an out.h264 > @end example > applies the @code{h264_mp4toannexb} bitstream filter (which converts > MP4-encapsulated H.264 stream to Annex B) to the @emph{input} video stream. > > On the other hand, > @example > ffmpeg -i file.mov -an -vn -bsf:s mov2textsub -c:s copy -f rawvideo sub.txt > @end example > applies the @code{mov2textsub} bitstream filter (which extracts text from MOV > subtitles) to the @emph{output} subtitle stream. Note, however, that since > both > examples use @code{-c copy}, it matters little whether the filters are applied > on input or output - that would change if transcoding was happening. 
> --- > > this makes the reader need to correlate the two examples to figure > them out, that's why I reworked the presentation in my suggestion as a > more linear sequence of presentation/command/presentation/command. > > In general examples should focus on how a task can be done, not on the > explanation of the command itself. I disagree. Examples should focus on whatever can be usefully explained with an example. -- Anton Khirnov
Re: [FFmpeg-devel] [PATCH 1/2 v2] avutil: add a Tile Grid API
On 1/21/2024 2:29 PM, Anton Khirnov wrote: Quoting James Almer (2024-01-21 13:06:28) On 1/21/2024 3:27 AM, Anton Khirnov wrote: Quoting James Almer (2024-01-20 23:04:06) This includes a struct and helpers. It will be used to support container level cropping and tiled image formats, but should be generic enough for general usage. Signed-off-by: James Almer --- Extended to include fields used for cropping. Should make the struct reusable even for non tiled images, e.g. setting both rows and tiles to 1, in which case tile width and height would become analogous to coded_{witdh,height}. But why? What does cropping have to do with tiling? What advantage is there to handling them in one struct? The struct does not need to be used for non tiled image scenarios, but could if we decide we don't want to add another struct that would only contain a subset of the fields present here. As to why said fields here present here, HEIF may use a clap box to define cropping for the final image, not for the tiles. This needs to be propagated, and the previous version of this API, which only defined cropping from right and bottom edges if output dimensions were smaller than the grid (standard case for tiled heif with no clap box), was not enough. Hence this change. I can rename this struct to Image Grid or something else, which might make it feel less awkward if we decide to reuse it. We still need to propagate container cropping from clap boxes and from Matroska elements after all. Honestly this whole new API strikes me as massively overthinking it. All you should need to describe an arbitrary partition of an image into sub-rectangles is an array of (x, y, width, height). Instead you're proposing a new public header, struct, three functions, multiple "tile types", and if I'm not mistaken it still cannot describe an arbitrary partitioning. Plus it's in libavutil for some reason, even though libavformat seems to be the only intended user. Is all this complexity really warranted? 1. 
It needs to be usable as a Stream Group type, so a struct is required. Said struct needs an allocator unless we want to have its size be part of the ABI. I can remove the free function, but then the caller needs to manually free any internal data. 2. We need tile dimensions (width and height) plus row and column count, which give you the final size of the grid, then offsets x and y to get the actual image within the grid meant for presentation. 3. I want to support uniform tiles as well as variable tile dimensions, hence multiple tile types. The latter currently has no use case, but eventually might. I can if you prefer not include said type at first, but i want to keep the union in place so it and other extensions can be added. 4. It's in lavu because it's meant to be generic. It can also be used to transport tiling and cropping information as stream and packet side data, which can't depend on something defined in lavf. And what do you mean with not supporting describing arbitrary partitioning? Isn't that what variable tile dimensions achieve?
Re: [FFmpeg-devel] [PATCH 7/8] fftools/ffmpeg_demux: implement -bsf for input
On date Sunday 2024-01-21 18:43:36 +0100, Anton Khirnov wrote: > Quoting Stefano Sabatini (2024-01-20 12:32:42) [...] > > When you present an example you usually start with an explanation > > (what it does) and then present the command, not the other way around. > > I don't, neither does most literature I can recall. Typically you first > present a thing, then explain its structure. Explaining the structure of > something the reader has not seen yet is backwards, unnatural, and hard > to understand. I still don't understand what "literature" you are referring to. If you see most examples in the FFmpeg docs they are in the form: @item This does this and that...: @example ... @end example An explanation is presented *before* introducing the example itself, in other words plain English before the actual command/code.
Re: [FFmpeg-devel] [PATCH 1/2 v2] avutil: add a Tile Grid API
Quoting James Almer (2024-01-21 18:47:43) > On 1/21/2024 2:29 PM, Anton Khirnov wrote: > > Honestly this whole new API strikes me as massively overthinking it. All > > you should need to describe an arbitrary partition of an image into > > sub-rectangles is an array of (x, y, width, height). Instead you're > > proposing a new public header, struct, three functions, multiple "tile > > types", and if I'm not mistaken it still cannot describe an arbitrary > > partitioning. Plus it's in libavutil for some reason, even though > > libavformat seems to be the only intended user. > > > > Is all this complexity really warranted? > > 1. It needs to be usable as a Stream Group type, so a struct is > required. Said struct needs an allocator unless we want to have its size > be part of the ABI. I can remove the free function, but then the caller > needs to manually free any internal data. If the struct lives in lavf and is always allocated as a part of AVStreamGroup then you don't need a public constructor/destructor and can still extend the struct. > 2. We need tile dimensions (Width and height) plus row and column count, > which give you the final size of the grid, then offsets x and y to get > the actual image within the grid meant for presentation. > 3. I want to support uniform tiles as well as variable tile dimensions, > hence multiple tile types. The latter currently has no use case, but > eventually might. I can if you prefer not include said type at first, > but i want to keep the union in place so it and other extensions can be > added. > 4. It's in lavu because its meant to be generic. It can also be used to > transport tiling and cropping information as stream and packet side > data, which can't depend on something defined in lavf. When would you have tiling information associated with a specific stream? > And what do you mean with not supporting describing arbitrary > partitioning? Isn't that what variable tile dimensions achieve? 
IIUC your tiling scheme still assumes that the partitioning is by rows and columns. A completely generic partitioning could be irregular. -- Anton Khirnov
Re: [FFmpeg-devel] 0001-fix-segment-fault-in-function-decode
On date Saturday 2024-01-13 05:57:18 +0800, 陈督 wrote: > > > /*When it is not a planar arrangement, data[1] is empty, > > and all the data is interleaved in data[0]. > > This can result in a segmentation fault when accessing data[ch] .*/ > > //So I delete the code below: > > for (i = 0; i < frame->nb_samples; i++) > > for (ch = 0; ch < dec_ctx->ch_layout.nb_channels; ch++) > > fwrite(frame->data[ch] + data_size*i, 1, data_size, outfile); > > > > > //And I write this instead > > // L R data order > > if (av_sample_fmt_is_planar(dec_ctx->sample_fmt)) > > { > > // planar:LLL...RRR... in different data[ch] > > for (ch = 0; ch < dec_ctx->ch_layout.nb_channels; ch++) > > { > > fwrite(frame->data[ch], 1, frame->linesize[0], outfile); // only > linesize[0] has data. > The problem with this approach is that this is generating output in a format which cannot be played by ffplay, which is assuming packed (i.e. non planar) format. So it is expecting the output file to be written as: LRLRLR... rather than as: LL...RR..., also because ffplay does not know the linesize. ... But I see the example code should be fixed, it was designed with the assumption that the input sample format was always packed, which is not the case anymore. [...]
Re: [FFmpeg-devel] [PATCH 7/8] fftools/ffmpeg_demux: implement -bsf for input
Quoting Stefano Sabatini (2024-01-21 19:22:35) > On date Sunday 2024-01-21 18:43:36 +0100, Anton Khirnov wrote: > > Quoting Stefano Sabatini (2024-01-20 12:32:42) > [...] > > > When you present an example you usually start with an explanation > > > (what it does) and then present the command, not the other way around. > > > > I don't, neither does most literature I can recall. Typically you first > > present a thing, then explain its structure. Explaining the structure of > > something the reader has not seen yet is backwards, unnatural, and hard > > to understand. > > I still don't understand what "literature" you are referring to. Various manuals and textbooks I've read. > If you see most examples in the FFmpeg docs they are in the form: Our documentation is widely considered to be somewhere between atrocious and unusable (and sometimes actively misleading), so the fact that it does something in a specific way does not at all mean that it's a good idea. I have also personally seen (and fixed) countless instances of mindlessly perpetuated cargo cults in it. -- Anton Khirnov
Re: [FFmpeg-devel] [PATCH 1/2 v2] avutil: add a Tile Grid API
On 1/21/2024 3:29 PM, Anton Khirnov wrote: Quoting James Almer (2024-01-21 18:47:43) On 1/21/2024 2:29 PM, Anton Khirnov wrote: Honestly this whole new API strikes me as massively overthinking it. All you should need to describe an arbitrary partition of an image into sub-rectangles is an array of (x, y, width, height). Instead you're proposing a new public header, struct, three functions, multiple "tile types", and if I'm not mistaken it still cannot describe an arbitrary partitioning. Plus it's in libavutil for some reason, even though libavformat seems to be the only intended user. Is all this complexity really warranted? 1. It needs to be usable as a Stream Group type, so a struct is required. Said struct needs an allocator unless we want to have its size be part of the ABI. I can remove the free function, but then the caller needs to manually free any internal data. If the struct lives in lavf and is always allocated as a part of AVStreamGroup then you don't need a public constructor/destructor and can still extend the struct. Yes, but that would be the case if it's only meant to be allocated by AVStreamGroup and nothing else. 2. We need tile dimensions (Width and height) plus row and column count, which give you the final size of the grid, then offsets x and y to get the actual image within the grid meant for presentation. 3. I want to support uniform tiles as well as variable tile dimensions, hence multiple tile types. The latter currently has no use case, but eventually might. I can if you prefer not include said type at first, but i want to keep the union in place so it and other extensions can be added. 4. It's in lavu because its meant to be generic. It can also be used to transport tiling and cropping information as stream and packet side data, which can't depend on something defined in lavf. When would you have tiling information associated with a specific stream? Can't think of an example for tiling, but i can for cropping. 
If you insist on not reusing this for non-HEIF cropping usage in mp4/matroska, then ok, I'll move it to lavf. And what do you mean with not supporting describing arbitrary partitioning? Isn't that what variable tile dimensions achieve? IIUC your tiling scheme still assumes that the partitioning is by rows and columns. A completely generic partitioning could be irregular. A new tile type that doesn't define rows and columns can be added if needed. But the current variable tile type can support things like grids of two rows and two columns where the second row is effectively a single tile, simply by setting the second tile in said row as having a width of 0.
Re: [FFmpeg-devel] [PATCH 1/2 v2] avutil: add a Tile Grid API
Quoting James Almer (2024-01-21 19:38:50) > On 1/21/2024 3:29 PM, Anton Khirnov wrote: > > Quoting James Almer (2024-01-21 18:47:43) > >> On 1/21/2024 2:29 PM, Anton Khirnov wrote: > >>> Honestly this whole new API strikes me as massively overthinking it. All > >>> you should need to describe an arbitrary partition of an image into > >>> sub-rectangles is an array of (x, y, width, height). Instead you're > >>> proposing a new public header, struct, three functions, multiple "tile > >>> types", and if I'm not mistaken it still cannot describe an arbitrary > >>> partitioning. Plus it's in libavutil for some reason, even though > >>> libavformat seems to be the only intended user. > >>> > >>> Is all this complexity really warranted? > >> > >> 1. It needs to be usable as a Stream Group type, so a struct is > >> required. Said struct needs an allocator unless we want to have its size > >> be part of the ABI. I can remove the free function, but then the caller > >> needs to manually free any internal data. > > > > If the struct lives in lavf and is always allocated as a part of > > AVStreamGroup then you don't need a public constructor/destructor and > > can still extend the struct. > > Yes, but that would be the case if it's only meant to be allocated by > AVStreamGroup and nothing else. That is the case right now, no? If that ever changes then the constructor can be added. > > > >> 2. We need tile dimensions (Width and height) plus row and column count, > >> which give you the final size of the grid, then offsets x and y to get > >> the actual image within the grid meant for presentation. > >> 3. I want to support uniform tiles as well as variable tile dimensions, > >> hence multiple tile types. The latter currently has no use case, but > >> eventually might. I can if you prefer not include said type at first, > >> but i want to keep the union in place so it and other extensions can be > >> added. > >> 4. It's in lavu because its meant to be generic. 
It can also be used to > >> transport tiling and cropping information as stream and packet side > >> data, which can't depend on something defined in lavf. > > > > When would you have tiling information associated with a specific > > stream? > > Can't think of an example for tiling, but i can for cropping. If you > insist on not reusing this for non-HEIF cropping usage in mp4/matroska, > then ok, I'll move it to lavf. I still don't see why it should be a good idea to use this struct for generic container cropping. It feels very much like a hammer in search of a nail. > > > >> And what do you mean with not supporting describing arbitrary > >> partitioning? Isn't that what variable tile dimensions achieve? > > > > IIUC your tiling scheme still assumes that the partitioning is by rows > > and columns. A completely generic partitioning could be irregular. > > A new tile type that doesn't define rows and columns can be added if > needed. But the current variable tile type can support things like grids > of two rows and two columns where the second row is effectively a single > tile, simply by setting the second tile in said row as having a width of 0. The problem I see here is that every consumer of this struct then has to explicitly support every type, and adding a new type requires updating all callers. This seems unnecessary when "list of N rectangles" covers all possible partitionings. That does not mean you actually have to store it that way - the struct could be a list of N rectangles logically, while actually being represented more efficiently (in the same way a channel layout is always logically a list of channels, even though it's often represented by an uint64 rather than a malloced array). -- Anton Khirnov
Re: [FFmpeg-devel] [PATCH 7/8] fftools/ffmpeg_demux: implement -bsf for input
On date Sunday 2024-01-21 19:35:01 +0100, Anton Khirnov wrote: > Quoting Stefano Sabatini (2024-01-21 19:22:35) > > On date Sunday 2024-01-21 18:43:36 +0100, Anton Khirnov wrote: > > > Quoting Stefano Sabatini (2024-01-20 12:32:42) > > [...] > > > > When you present an example you usually start with an explanation > > > > (what it does) and then present the command, not the other way around. > > > > > > I don't, neither does most literature I can recall. Typically you first > > > present a thing, then explain its structure. Explaining the structure of > > > something the reader has not seen yet is backwards, unnatural, and hard > > > to understand. > > > > I still don't understand what "literature" you are referring to. > > Various manuals and textbooks I've read. > > > If you see most examples in the FFmpeg docs they are in the form: > > Our documentation is widely considered to be somewhere between atrocious > and unusable nah, it's not so bad, also this applies to most documentation. Besides, FFmpeg is possibly the most sophisticated existing toolkit in terms of features/configuration, so this is somehow expected (at least if you expect a tutorial rather than a reference). > (and sometimes actively misleading), so the fact that it > does something in a specific way does not at all mean that it's a good > idea. So what do you propose instead? The fact that it is not perfect does not mean that everything is bad.
Re: [FFmpeg-devel] [PATCH 1/2] lavf/dvenc: improve error messaging
On date Sunday 2024-01-21 18:39:19 +0100, Anton Khirnov wrote: > Quoting Stefano Sabatini (2024-01-21 11:30:27) [...] > Also, can the second case even trigger? Seems like the block above > ensures n_ast is never larger than 2. Yes, this seems a miss from commit eafa8e859297813dcf0e6b43e85720be0a5f.
Re: [FFmpeg-devel] [PATCH 1/2 v2] avutil: add a Tile Grid API
On 1/21/2024 4:02 PM, Anton Khirnov wrote: Quoting James Almer (2024-01-21 19:38:50) On 1/21/2024 3:29 PM, Anton Khirnov wrote: Quoting James Almer (2024-01-21 18:47:43) On 1/21/2024 2:29 PM, Anton Khirnov wrote: Honestly this whole new API strikes me as massively overthinking it. All you should need to describe an arbitrary partition of an image into sub-rectangles is an array of (x, y, width, height). Instead you're proposing a new public header, struct, three functions, multiple "tile types", and if I'm not mistaken it still cannot describe an arbitrary partitioning. Plus it's in libavutil for some reason, even though libavformat seems to be the only intended user. Is all this complexity really warranted? 1. It needs to be usable as a Stream Group type, so a struct is required. Said struct needs an allocator unless we want to have its size be part of the ABI. I can remove the free function, but then the caller needs to manually free any internal data. If the struct lives in lavf and is always allocated as a part of AVStreamGroup then you don't need a public constructor/destructor and can still extend the struct. Yes, but that would be the case if it's only meant to be allocated by AVStreamGroup and nothing else. That is the case right now, no? If that ever changes then the constructor can be added. 2. We need tile dimensions (Width and height) plus row and column count, which give you the final size of the grid, then offsets x and y to get the actual image within the grid meant for presentation. 3. I want to support uniform tiles as well as variable tile dimensions, hence multiple tile types. The latter currently has no use case, but eventually might. I can if you prefer not include said type at first, but i want to keep the union in place so it and other extensions can be added. 4. It's in lavu because its meant to be generic. 
It can also be used to transport tiling and cropping information as stream and packet side data, which can't depend on something defined in lavf. When would you have tiling information associated with a specific stream? Can't think of an example for tiling, but i can for cropping. If you insist on not reusing this for non-HEIF cropping usage in mp4/matroska, then ok, I'll move it to lavf. I still don't see why should it be a good idea to use this struct for generic container cropping. It feels very much like a hammer in search of a nail. Because once we support container cropping, we will be defining a stream/packet side data type that will contain a subset of the fields from this struct. If we reuse this struct, we can export a clap box as an AVTileGrid (Or i can rename it to AVImageGrid, and tile to subrectangle) either as the stream group tile grid specific parameters if HEIF, or as stream side data otherwise. And what do you mean with not supporting describing arbitrary partitioning? Isn't that what variable tile dimensions achieve? IIUC your tiling scheme still assumes that the partitioning is by rows and columns. A completely generic partitioning could be irregular. A new tile type that doesn't define rows and columns can be added if needed. But the current variable tile type can support things like grids of two rows and two columns where the second row is effectively a single tile, simply by setting the second tile in said row as having a width of 0. The problem I see here is that every consumer of this struct then has to explicitly support every type, and adding a new type requires updating all callers. This seems unnecessary when "list of N rectangles" covers all possible partitionings. Well, the variable type supports a list of N rectangles where each rectangle has arbitrary dimensions, and you can do things like having three tiles/rectangles that together still form a rectangle, while defining row and column count. 
So i don't personally see the need for a new type to begin with. That does not mean you actually have to store it that way - the struct could be a list of N rectangles logically, while actually being represented more efficiently (in the same way a channel layout is always logically a list of channels, even though it's often represented by an uint64 rather than a malloced array).
[FFmpeg-devel] [PATCH] libavcodec: add bit-rate support to RoQ video encoder
Re: [FFmpeg-devel] [PATCH] liavcodec: add bit-rate support to RoQ video encoder
On Sun, Jan 21, 2024 at 11:19:43PM +0300, Victor Luchits wrote: > One can now use the bitrate option (-b) to specify bit rate of the video > stream in the RoQ encoder. The option only becomes effective for values > above 800kbit/s, which is roughly equivalent to bandwidth of a 1x-speed > CD-ROM drive, minus the bandwidth taken up by stereo DPCM stream. Values > below this threshold produce visually inadequate results. > > Original patch by Joseph Fenton aka Chilly Willy > > Signed-off-by: Victor Luchits [...] > diff --git a/libavcodec/roqvideodec.c b/libavcodec/roqvideodec.c > index bfc69a65c9..07d6b8bb8f 100644 > --- a/libavcodec/roqvideodec.c > +++ b/libavcodec/roqvideodec.c > @@ -70,6 +70,7 @@ static void roqvideo_decode_frame(RoqContext *ri, > GetByteContext *gb) > chunk_start = bytestream2_tell(gb); > xpos = ypos = 0; > +ri->key_frame = 1; > if (chunk_size > bytestream2_get_bytes_left(gb)) { > av_log(ri->logctx, AV_LOG_ERROR, "Chunk does not fit in input > buffer\n"); > @@ -92,12 +93,14 @@ static void roqvideo_decode_frame(RoqContext *ri, > GetByteContext *gb) > switch(vqid) { There seems to be some line wraping problem please repost the patch without linewraping / extra newlines Applying: liavcodec: add bit-rate support to RoQ video encoder error: corrupt patch at line 20 error: could not build fake ancestor Patch failed at 0001 liavcodec: add bit-rate support to RoQ video encoder Use 'git am --show-current-patch' to see the failed patch When you have resolved this problem, run "git am --continue". If you prefer to skip this patch, run "git am --skip" instead. To restore the original branch and stop patching, run "git am --abort". thx [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB I have never wished to cater to the crowd; for what I know they do not approve, and what they approve I do not know. 
-- Epicurus
Re: [FFmpeg-devel] [PATCH 1/2 v2] avutil: add a Tile Grid API
On 1/21/2024 4:29 PM, James Almer wrote: On 1/21/2024 4:02 PM, Anton Khirnov wrote: Quoting James Almer (2024-01-21 19:38:50) On 1/21/2024 3:29 PM, Anton Khirnov wrote: Quoting James Almer (2024-01-21 18:47:43) On 1/21/2024 2:29 PM, Anton Khirnov wrote: Honestly this whole new API strikes me as massively overthinking it. All you should need to describe an arbitrary partition of an image into sub-rectangles is an array of (x, y, width, height). Instead you're proposing a new public header, struct, three functions, multiple "tile types", and if I'm not mistaken it still cannot describe an arbitrary partitioning. Plus it's in libavutil for some reason, even though libavformat seems to be the only intended user. Is all this complexity really warranted? 1. It needs to be usable as a Stream Group type, so a struct is required. Said struct needs an allocator unless we want to have its size be part of the ABI. I can remove the free function, but then the caller needs to manually free any internal data. If the struct lives in lavf and is always allocated as a part of AVStreamGroup then you don't need a public constructor/destructor and can still extend the struct. Yes, but that would be the case if it's only meant to be allocated by AVStreamGroup and nothing else. That is the case right now, no? If that ever changes then the constructor can be added. 2. We need tile dimensions (Width and height) plus row and column count, which give you the final size of the grid, then offsets x and y to get the actual image within the grid meant for presentation. 3. I want to support uniform tiles as well as variable tile dimensions, hence multiple tile types. The latter currently has no use case, but eventually might. I can if you prefer not include said type at first, but i want to keep the union in place so it and other extensions can be added. 4. It's in lavu because its meant to be generic. 
It can also be used to transport tiling and cropping information as stream and packet side data, which can't depend on something defined in lavf. When would you have tiling information associated with a specific stream? Can't think of an example for tiling, but i can for cropping. If you insist on not reusing this for non-HEIF cropping usage in mp4/matroska, then ok, I'll move it to lavf. I still don't see why should it be a good idea to use this struct for generic container cropping. It feels very much like a hammer in search of a nail. Because once we support container cropping, we will be defining a stream/packet side data type that will contain a subset of the fields from this struct. If we reuse this struct, we can export a clap box as an AVTileGrid (Or i can rename it to AVImageGrid, and tile to subrectangle) either as the stream group tile grid specific parameters if HEIF, or as stream side data otherwise. And what do you mean with not supporting describing arbitrary partitioning? Isn't that what variable tile dimensions achieve? IIUC your tiling scheme still assumes that the partitioning is by rows and columns. A completely generic partitioning could be irregular. A new tile type that doesn't define rows and columns can be added if needed. But the current variable tile type can support things like grids of two rows and two columns where the second row is effectively a single tile, simply by setting the second tile in said row as having a width of 0. The problem I see here is that every consumer of this struct then has to explicitly support every type, and adding a new type requires updating all callers. This seems unnecessary when "list of N rectangles" covers all possible partitionings. Well, the variable type supports a list of N rectangles where each rectangle has arbitrary dimensions, and you can do things like having three tiles/rectangles that together still form a rectangle, while defining row and column count. 
So I don't personally see the need for a new type to begin with. I could remove the types and the union altogether and leave only the array even for uniform tiles if you think that simplifies the API, but it seems like a waste of memory to allocate a rows x cols array of ints just to have the same value written for every entry.
[FFmpeg-devel] [PATCH] liavcodec: add bit-rate support to RoQ video encoder
One can now use the bitrate option (-b) to specify bit rate of the video stream in the RoQ encoder. The option only becomes effective for values above 800kbit/s, which is roughly equivalent to bandwidth of a 1x-speed CD-ROM drive, minus the bandwidth taken up by stereo DPCM stream. Values below this threshold produce visually inadequate results. Original patch by Joseph Fenton aka Chilly Willy Signed-off-by: Victor Luchits --- Changelog| 1 + libavcodec/roqvideo.h| 1 + libavcodec/roqvideodec.c | 15 + libavcodec/roqvideoenc.c | 118 ++- libavcodec/version.h | 2 +- 5 files changed, 123 insertions(+), 14 deletions(-) diff --git a/Changelog b/Changelog index c40b6d08fd..6974312f9d 100644 --- a/Changelog +++ b/Changelog @@ -22,6 +22,7 @@ version : - ffmpeg CLI -bsf option may now be used for input as well as output - ffmpeg CLI options may now be used as -/opt , which is equivalent to -opt > +- RoQ video bit rate option support version 6.1: - libaribcaption decoder diff --git a/libavcodec/roqvideo.h b/libavcodec/roqvideo.h index 2c2e42884d..6d30bcaada 100644 --- a/libavcodec/roqvideo.h +++ b/libavcodec/roqvideo.h @@ -43,6 +43,7 @@ typedef struct RoqContext { AVFrame *last_frame; AVFrame *current_frame; int width, height; +int key_frame; roq_cell cb2x2[256]; roq_qcell cb4x4[256]; diff --git a/libavcodec/roqvideodec.c b/libavcodec/roqvideodec.c index bfc69a65c9..07d6b8bb8f 100644 --- a/libavcodec/roqvideodec.c +++ b/libavcodec/roqvideodec.c @@ -70,6 +70,7 @@ static void roqvideo_decode_frame(RoqContext *ri, GetByteContext *gb) chunk_start = bytestream2_tell(gb); xpos = ypos = 0; +ri->key_frame = 1; if (chunk_size > bytestream2_get_bytes_left(gb)) { av_log(ri->logctx, AV_LOG_ERROR, "Chunk does not fit in input buffer\n"); @@ -92,12 +93,14 @@ static void roqvideo_decode_frame(RoqContext *ri, GetByteContext *gb) switch(vqid) { case RoQ_ID_MOT: +ri->key_frame = 0; break; case RoQ_ID_FCC: { int byte = bytestream2_get_byte(gb); mx = 8 - (byte >> 4) - ((signed char) (chunk_arg >> 
8)); my = 8 - (byte & 0xf) - ((signed char) chunk_arg); ff_apply_motion_8x8(ri, xp, yp, mx, my); +ri->key_frame = 0; break; } case RoQ_ID_SLD: @@ -125,12 +128,14 @@ static void roqvideo_decode_frame(RoqContext *ri, GetByteContext *gb) vqflg_pos--; switch(vqid) { case RoQ_ID_MOT: +ri->key_frame = 0; break; case RoQ_ID_FCC: { int byte = bytestream2_get_byte(gb); mx = 8 - (byte >> 4) - ((signed char) (chunk_arg >> 8)); my = 8 - (byte & 0xf) - ((signed char) chunk_arg); ff_apply_motion_4x4(ri, x, y, mx, my); +ri->key_frame = 0; break; } case RoQ_ID_SLD: @@ -214,6 +219,16 @@ static int roq_decode_frame(AVCodecContext *avctx, AVFrame *rframe, if ((ret = av_frame_ref(rframe, s->current_frame)) < 0) return ret; + +/* Keyframe when no MOT or FCC codes in frame */ +if (s->key_frame) { +av_log(avctx, AV_LOG_VERBOSE, "\nFound keyframe!\n"); +rframe->pict_type = AV_PICTURE_TYPE_I; +avpkt->flags |= AV_PKT_FLAG_KEY; +} else { +rframe->pict_type = AV_PICTURE_TYPE_P; +} + *got_frame = 1; /* shuffle frames */ diff --git a/libavcodec/roqvideoenc.c b/libavcodec/roqvideoenc.c index 0933abf4f9..bcead80bbd 100644 --- a/libavcodec/roqvideoenc.c +++ b/libavcodec/roqvideoenc.c @@ -79,6 +79,9 @@ /* The cast is useful when multiplying it by INT_MAX */ #define ROQ_LAMBDA_SCALE ((uint64_t) FF_LAMBDA_SCALE) +/* The default minimum bitrate, set around the value of a 1x speed CD-ROM drive */ +#define ROQ_DEFAULT_MIN_BIT_RATE 800*1024 + typedef struct RoqCodebooks { int numCB4; int numCB2; @@ -136,6 +139,8 @@ typedef struct RoqEncContext { struct ELBGContext *elbg; AVLFG randctx; uint64_t lambda; +uint64_t last_lambda; +int lambda_delta; motion_vect *this_motion4; motion_vect *last_motion4; @@ -887,8 +892,9 @@ static int generate_new_codebooks(RoqEncContext *enc) return 0; } -static int roq_encode_video(RoqEncContext *enc) +static int roq_encode_video(AVCodecContext *avctx) { +RoqEncContext *const enc = avctx->priv_data; RoqTempData *const tempData = &enc->tmp_data; RoqContext *const roq = 
&enc->common; int ret; @@ -910,14 +916,14 @@ st
[FFmpeg-devel] [PATCH] lavc/aarch64: fix include for cpu.h
---
 libavcodec/aarch64/idctdsp_init_aarch64.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavcodec/aarch64/idctdsp_init_aarch64.c b/libavcodec/aarch64/idctdsp_init_aarch64.c
index eec21aa5a2..8efd5f5323 100644
--- a/libavcodec/aarch64/idctdsp_init_aarch64.c
+++ b/libavcodec/aarch64/idctdsp_init_aarch64.c
@@ -22,7 +22,7 @@

 #include "libavutil/attributes.h"
 #include "libavutil/cpu.h"
-#include "libavutil/arm/cpu.h"
+#include "libavutil/aarch64/cpu.h"
 #include "libavcodec/avcodec.h"
 #include "libavcodec/idctdsp.h"
 #include "idct.h"
--
2.30.2
Re: [FFmpeg-devel] [PATCH v1 2/2] vaapi: add vaapi_avs2 support
> On Jan 21, 2024, at 22:47, Jianfeng Zheng wrote: > > Zhao Zhili 于2024年1月20日周六 12:22写道: >> >> >>> -Original Message- >>> From: ffmpeg-devel On Behalf Of >>> jianfeng.zheng >>> Sent: 2024年1月19日 23:53 >>> To: ffmpeg-devel@ffmpeg.org >>> Cc: jianfeng.zheng >>> Subject: [FFmpeg-devel] [PATCH v1 2/2] vaapi: add vaapi_avs2 support >>> >>> see https://github.com/intel/libva/pull/738 >>> >>> [Moore Threads](https://www.mthreads.com) (short for Mthreads) is a >>> Chinese GPU manufacturer. All our products, like MTTS70/MTTS80/.. , >>> support AVS2 8bit/10bit HW decoding at max 8k resolution. >>> >>> Signed-off-by: jianfeng.zheng >>> --- >>> configure| 7 + >>> libavcodec/Makefile | 2 + >>> libavcodec/allcodecs.c | 1 + >>> libavcodec/avs2.c| 345 ++- >>> libavcodec/avs2.h| 460 +++- >>> libavcodec/avs2_parser.c | 5 +- >>> libavcodec/avs2dec.c | 569 + >>> libavcodec/avs2dec.h | 48 +++ >>> libavcodec/avs2dec_headers.c | 787 +++ >>> libavcodec/codec_desc.c | 5 +- >>> libavcodec/defs.h| 4 + >>> libavcodec/hwaccels.h| 1 + >>> libavcodec/libdavs2.c| 2 +- >>> libavcodec/profiles.c| 6 + >>> libavcodec/profiles.h| 1 + >>> libavcodec/vaapi_avs2.c | 227 ++ >>> libavcodec/vaapi_decode.c| 5 + >>> libavformat/matroska.c | 1 + >>> libavformat/mpeg.h | 1 + >>> 19 files changed, 2450 insertions(+), 27 deletions(-) >>> create mode 100644 libavcodec/avs2dec.c >>> create mode 100644 libavcodec/avs2dec.h >>> create mode 100644 libavcodec/avs2dec_headers.c >>> create mode 100644 libavcodec/vaapi_avs2.c >>> >> >> Please split the patch properly. It's hard to review in a single chunk, and >> it can't be tested >> without the hardware. > > As a new player in the GPU market, we have always attached great importance > to the participation of the open source community. And willing to feed back > our > new features to the field of video hardware acceleration. > > As a pioneer, these new features may only be supported by our hardware > at current > time. 
We are willing to provide some market selling devices for free > to community > accredited contributors for testing related functions. I accredited Zhili Zhao, but I can't be sure Zhili Zhao is interested in the GPU card, or that he has a computer that can take a PCIe GPU card :D
Re: [FFmpeg-devel] [PATCH v4 1/2] avfilter: add audio overlay filter
On date Tuesday 2024-01-16 17:46:42 +0530, Harshit Karwal wrote: > Co-authored-by: Paul B Mahol > Signed-off-by: Harshit Karwal > --- > doc/filters.texi | 40 +++ > libavfilter/Makefile | 1 + > libavfilter/af_aoverlay.c | 538 ++ > libavfilter/allfilters.c | 1 + > 4 files changed, 580 insertions(+) > create mode 100644 libavfilter/af_aoverlay.c > > diff --git a/doc/filters.texi b/doc/filters.texi > index 20c91bab3a..79eb600ae3 100644 > --- a/doc/filters.texi > +++ b/doc/filters.texi > @@ -2779,6 +2779,46 @@ This filter supports the same commands as options, > excluding option @code{order} > > Pass the audio source unchanged to the output. > > +@section aoverlay > + > +Replace a specified section of an audio stream with another input audio > stream. > + > +In case no enable option for timeline editing is specified, the second audio > stream will nit: @option{enable} > +be output at sections of the first stream which have a gap in PTS > (Presentation TimeStamp) values > +such that the output stream's PTS values are monotonous. > + > +This filter also supports linear cross fading when transitioning from one > +input stream to another. > + > +The filter accepts the following option: nit: options in case we add more > + > +@table @option > +@item cf_duration > +Set duration (in seconds) for cross fade between the inputs. Default value > is @code{100} milliseconds. 
> +@end table > + > +@subsection Examples > + > +@itemize > +@item > +Replace the first stream with the second stream from @code{t=10} seconds to > @code{t=20} seconds: > +@example > +ffmpeg -i first.wav -i second.wav -filter_complex > "aoverlay=enable='between(t,10,20)'" output.wav > +@end example > + > +@item > +Do the same as above, but with crossfading for @code{2} seconds between the > streams: > +@example > +ffmpeg -i first.wav -i second.wav -filter_complex > "aoverlay=cf_duration=2:enable='between(t,10,20)'" output.wav > +@end example > + > +@item > +Introduce a PTS gap from @code{t=4} seconds to @code{t=8} seconds in the > first stream and output the second stream during this gap: > +@example > +ffmpeg -i first.wav -i second.wav -filter_complex > "[0]aselect='not(between(t,4,8))'[temp];[temp][1]aoverlay[out]" -map "[out]" > output.wav > +@end example > +@end itemize > + > @section apad > > Pad the end of an audio stream with silence. > diff --git a/libavfilter/Makefile b/libavfilter/Makefile > index bba0219876..0f2b403441 100644 > --- a/libavfilter/Makefile > +++ b/libavfilter/Makefile > @@ -81,6 +81,7 @@ OBJS-$(CONFIG_ANLMDN_FILTER) += af_anlmdn.o > OBJS-$(CONFIG_ANLMF_FILTER) += af_anlms.o > OBJS-$(CONFIG_ANLMS_FILTER) += af_anlms.o > OBJS-$(CONFIG_ANULL_FILTER) += af_anull.o > +OBJS-$(CONFIG_AOVERLAY_FILTER) += af_aoverlay.o > OBJS-$(CONFIG_APAD_FILTER) += af_apad.o > OBJS-$(CONFIG_APERMS_FILTER) += f_perms.o > OBJS-$(CONFIG_APHASER_FILTER)+= af_aphaser.o > generate_wave_table.o > diff --git a/libavfilter/af_aoverlay.c b/libavfilter/af_aoverlay.c > new file mode 100644 > index 00..f7ac00dda1 > --- /dev/null > +++ b/libavfilter/af_aoverlay.c [...] 
> +static int crossfade_prepare(AOverlayContext *s, AVFilterLink *main_inlink, > AVFilterLink *overlay_inlink, AVFilterLink *outlink, > + int nb_samples, AVFrame **main_buffer, AVFrame > **overlay_buffer, int mode) > +{ > +int ret; > + > +*main_buffer = ff_get_audio_buffer(outlink, nb_samples); > +if (!(*main_buffer)) > +return AVERROR(ENOMEM); > + > +(*main_buffer)->pts = s->pts; > +s->pts += av_rescale_q(nb_samples, (AVRational){ 1, outlink->sample_rate > }, outlink->time_base); > + > +if ((ret = av_audio_fifo_read(s->main_sample_buffers, (void > **)(*main_buffer)->extended_data, nb_samples)) < 0) > +return ret; > + > +if (mode == 1) { > +s->previous_samples = (*main_buffer)->nb_samples; > +} else if (mode == -1 || (mode == 0 && s->is_disabled)) { it would help to use an enum to describe the mode value Also would help to introduce some debug log messages to aid troubleshooting/debugging. For instance, it would be very useful to show the exact time when the overlay stream is inserted. [...] > +static int activate(AVFilterContext *ctx) > +{ > +AOverlayContext *s = ctx->priv; > +int status, ret, nb_samples; > +int64_t pts; > +AVFrame *out = NULL, *main_buffer = NULL, *overlay_buffer = NULL; > + > +AVFilterLink *main_inlink = ctx->inputs[0]; > +AVFilterLink *overlay_inlink = ctx->inputs[1]; > +AVFilterLink *outlink = ctx->outputs[0]; > + > +FF_FILTER_FORWARD_STATUS_BACK_ALL(outlink, ctx); > + > +if (s->default_mode && (s->pts_gap_end - s->pts_gap_start <= 0 || > s->overlay_eof)) { > +s->default_mode = 0; > +s->transition_pts
[FFmpeg-devel] [PATCH 1/9] avcodec/vaapi_encode: move pic->input_surface initialization to encode_alloc
From: Tong Wu When allocating the VAAPIEncodePicture, pic->input_surface can be initialized right in the place. This movement simplifies the send_frame logic and is the preparation for moving vaapi_encode_send_frame to the base layer. Signed-off-by: Tong Wu --- libavcodec/vaapi_encode.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/libavcodec/vaapi_encode.c b/libavcodec/vaapi_encode.c index 86f4110cd2..38d855fbd4 100644 --- a/libavcodec/vaapi_encode.c +++ b/libavcodec/vaapi_encode.c @@ -878,7 +878,8 @@ static int vaapi_encode_discard(AVCodecContext *avctx, return 0; } -static VAAPIEncodePicture *vaapi_encode_alloc(AVCodecContext *avctx) +static VAAPIEncodePicture *vaapi_encode_alloc(AVCodecContext *avctx, + const AVFrame *frame) { VAAPIEncodeContext *ctx = avctx->priv_data; VAAPIEncodePicture *pic; @@ -895,7 +896,7 @@ static VAAPIEncodePicture *vaapi_encode_alloc(AVCodecContext *avctx) } } -pic->input_surface = VA_INVALID_ID; +pic->input_surface = (VASurfaceID)(uintptr_t)frame->data[3]; pic->recon_surface = VA_INVALID_ID; pic->output_buffer = VA_INVALID_ID; @@ -1332,7 +1333,7 @@ static int vaapi_encode_send_frame(AVCodecContext *avctx, AVFrame *frame) if (err < 0) return err; -pic = vaapi_encode_alloc(avctx); +pic = vaapi_encode_alloc(avctx, frame); if (!pic) return AVERROR(ENOMEM); @@ -1345,7 +1346,6 @@ static int vaapi_encode_send_frame(AVCodecContext *avctx, AVFrame *frame) if (ctx->input_order == 0 || frame->pict_type == AV_PICTURE_TYPE_I) pic->force_idr = 1; -pic->input_surface = (VASurfaceID)(uintptr_t)frame->data[3]; pic->pts = frame->pts; pic->duration = frame->duration; -- 2.41.0.windows.1 ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
[FFmpeg-devel] [PATCH 3/9] avcodec/vaapi_encode: extract set_output_property to base layer
From: Tong Wu Signed-off-by: Tong Wu --- libavcodec/hw_base_encode.c | 40 + libavcodec/hw_base_encode.h | 3 +++ libavcodec/vaapi_encode.c | 44 ++--- 3 files changed, 45 insertions(+), 42 deletions(-) diff --git a/libavcodec/hw_base_encode.c b/libavcodec/hw_base_encode.c index 62adda2fc3..f0e4ef9655 100644 --- a/libavcodec/hw_base_encode.c +++ b/libavcodec/hw_base_encode.c @@ -385,6 +385,46 @@ static int hw_base_encode_clear_old(AVCodecContext *avctx) return 0; } +int ff_hw_base_encode_set_output_property(AVCodecContext *avctx, + HWBaseEncodePicture *pic, + AVPacket *pkt, int flag_no_delay) +{ +HWBaseEncodeContext *ctx = avctx->priv_data; + +if (pic->type == PICTURE_TYPE_IDR) +pkt->flags |= AV_PKT_FLAG_KEY; + +pkt->pts = pic->pts; +pkt->duration = pic->duration; + +// for no-delay encoders this is handled in generic codec +if (avctx->codec->capabilities & AV_CODEC_CAP_DELAY && +avctx->flags & AV_CODEC_FLAG_COPY_OPAQUE) { +pkt->opaque = pic->opaque; +pkt->opaque_ref = pic->opaque_ref; +pic->opaque_ref = NULL; +} + +if (flag_no_delay) { +pkt->dts = pkt->pts; +return 0; +} + +if (ctx->output_delay == 0) { +pkt->dts = pkt->pts; +} else if (pic->encode_order < ctx->decode_delay) { +if (ctx->ts_ring[pic->encode_order] < INT64_MIN + ctx->dts_pts_diff) +pkt->dts = INT64_MIN; +else +pkt->dts = ctx->ts_ring[pic->encode_order] - ctx->dts_pts_diff; +} else { +pkt->dts = ctx->ts_ring[(pic->encode_order - ctx->decode_delay) % +(3 * ctx->output_delay + ctx->async_depth)]; +} + +return 0; +} + static int hw_base_encode_check_frame(AVCodecContext *avctx, const AVFrame *frame) { diff --git a/libavcodec/hw_base_encode.h b/libavcodec/hw_base_encode.h index be4c6b034e..d215d6a32b 100644 --- a/libavcodec/hw_base_encode.h +++ b/libavcodec/hw_base_encode.h @@ -237,6 +237,9 @@ typedef struct HWBaseEncodeContext { AVPacket*tail_pkt; } HWBaseEncodeContext; +int ff_hw_base_encode_set_output_property(AVCodecContext *avctx, HWBaseEncodePicture *pic, + AVPacket *pkt, int flag_no_delay); + int 
ff_hw_base_encode_receive_packet(AVCodecContext *avctx, AVPacket *pkt); int ff_hw_base_encode_init(AVCodecContext *avctx); diff --git a/libavcodec/vaapi_encode.c b/libavcodec/vaapi_encode.c index e2f968c36d..2d839a1202 100644 --- a/libavcodec/vaapi_encode.c +++ b/libavcodec/vaapi_encode.c @@ -668,47 +668,6 @@ fail_at_end: return err; } -static int vaapi_encode_set_output_property(AVCodecContext *avctx, -HWBaseEncodePicture *pic, -AVPacket *pkt) -{ -HWBaseEncodeContext *base_ctx = avctx->priv_data; -VAAPIEncodeContext *ctx = avctx->priv_data; - -if (pic->type == PICTURE_TYPE_IDR) -pkt->flags |= AV_PKT_FLAG_KEY; - -pkt->pts = pic->pts; -pkt->duration = pic->duration; - -// for no-delay encoders this is handled in generic codec -if (avctx->codec->capabilities & AV_CODEC_CAP_DELAY && -avctx->flags & AV_CODEC_FLAG_COPY_OPAQUE) { -pkt->opaque = pic->opaque; -pkt->opaque_ref = pic->opaque_ref; -pic->opaque_ref = NULL; -} - -if (ctx->codec->flags & FLAG_TIMESTAMP_NO_DELAY) { -pkt->dts = pkt->pts; -return 0; -} - -if (base_ctx->output_delay == 0) { -pkt->dts = pkt->pts; -} else if (pic->encode_order < base_ctx->decode_delay) { -if (base_ctx->ts_ring[pic->encode_order] < INT64_MIN + base_ctx->dts_pts_diff) -pkt->dts = INT64_MIN; -else -pkt->dts = base_ctx->ts_ring[pic->encode_order] - base_ctx->dts_pts_diff; -} else { -pkt->dts = base_ctx->ts_ring[(pic->encode_order - base_ctx->decode_delay) % - (3 * base_ctx->output_delay + base_ctx->async_depth)]; -} - -return 0; -} - static int vaapi_encode_get_coded_buffer_size(AVCodecContext *avctx, VABufferID buf_id) { VAAPIEncodeContext *ctx = avctx->priv_data; @@ -860,7 +819,8 @@ static int vaapi_encode_output(AVCodecContext *avctx, av_log(avctx, AV_LOG_DEBUG, "Output read for pic %"PRId64"/%"PRId64".\n", base_pic->display_order, base_pic->encode_order); -vaapi_encode_set_output_property(avctx, base_pic, pkt_ptr); +ff_hw_base_encode_set_output_property(avctx, base_pic, pkt_ptr, + ctx->codec->flags & FLAG_TIMESTAMP_NO_DELAY); end: 
ff_refstruct_unref(&pic->output_buffer_ref); -- 2.41.0.windows.1
[FFmpeg-devel] [PATCH 4/9] avcodec/vaapi_encode: extract rc parameter configuration to base layer
From: Tong Wu VAAPI and D3D12VA can share rate control configuration codes. Hence, it can be moved to base layer for simplification. Signed-off-by: Tong Wu --- libavcodec/hw_base_encode.c| 151 libavcodec/hw_base_encode.h| 34 ++ libavcodec/vaapi_encode.c | 210 ++--- libavcodec/vaapi_encode.h | 14 +-- libavcodec/vaapi_encode_av1.c | 2 +- libavcodec/vaapi_encode_h264.c | 2 +- libavcodec/vaapi_encode_vp9.c | 2 +- 7 files changed, 227 insertions(+), 188 deletions(-) diff --git a/libavcodec/hw_base_encode.c b/libavcodec/hw_base_encode.c index f0e4ef9655..c20c47bf55 100644 --- a/libavcodec/hw_base_encode.c +++ b/libavcodec/hw_base_encode.c @@ -631,6 +631,157 @@ end: return 0; } +int ff_hw_base_rc_mode_configure(AVCodecContext *avctx, const HWBaseEncodeRCMode *rc_mode, + int default_quality, HWBaseEncodeRCConfigure *rc_conf) +{ +HWBaseEncodeContext *ctx = avctx->priv_data; + +if (!rc_mode || !rc_conf) +return -1; + +if (rc_mode->bitrate) { +if (avctx->bit_rate <= 0) { +av_log(avctx, AV_LOG_ERROR, "Bitrate must be set for %s " + "RC mode.\n", rc_mode->name); +return AVERROR(EINVAL); +} + +if (rc_mode->mode == RC_MODE_AVBR) { +// For maximum confusion AVBR is hacked into the existing API +// by overloading some of the fields with completely different +// meanings. + +// Target percentage does not apply in AVBR mode. +rc_conf->rc_bits_per_second = avctx->bit_rate; + +// Accuracy tolerance range for meeting the specified target +// bitrate. It's very unclear how this is actually intended +// to work - since we do want to get the specified bitrate, +// set the accuracy to 100% for now. +rc_conf->rc_target_percentage = 100; + +// Convergence period in frames. The GOP size reflects the +// user's intended block size for cutting, so reusing that +// as the convergence period seems a reasonable default. +rc_conf->rc_window_size = avctx->gop_size > 0 ? 
avctx->gop_size : 60; + +} else if (rc_mode->maxrate) { +if (avctx->rc_max_rate > 0) { +if (avctx->rc_max_rate < avctx->bit_rate) { +av_log(avctx, AV_LOG_ERROR, "Invalid bitrate settings: " + "bitrate (%"PRId64") must not be greater than " + "maxrate (%"PRId64").\n", avctx->bit_rate, + avctx->rc_max_rate); +return AVERROR(EINVAL); +} +rc_conf->rc_bits_per_second = avctx->rc_max_rate; +rc_conf->rc_target_percentage = (avctx->bit_rate * 100) / + avctx->rc_max_rate; +} else { +// We only have a target bitrate, but this mode requires +// that a maximum rate be supplied as well. Since the +// user does not want this to be a constraint, arbitrarily +// pick a maximum rate of double the target rate. +rc_conf->rc_bits_per_second = 2 * avctx->bit_rate; +rc_conf->rc_target_percentage = 50; +} +} else { +if (avctx->rc_max_rate > avctx->bit_rate) { +av_log(avctx, AV_LOG_WARNING, "Max bitrate is ignored " + "in %s RC mode.\n", rc_mode->name); +} +rc_conf->rc_bits_per_second = avctx->bit_rate; +rc_conf->rc_target_percentage = 100; +} +} else { +rc_conf->rc_bits_per_second = 0; +rc_conf->rc_target_percentage = 100; +} + +if (rc_mode->quality) { +if (ctx->explicit_qp) { +rc_conf->rc_quality = ctx->explicit_qp; +} else if (avctx->global_quality > 0) { +rc_conf->rc_quality = avctx->global_quality; +} else { +rc_conf->rc_quality = default_quality; +av_log(avctx, AV_LOG_WARNING, "No quality level set; " + "using default (%d).\n", rc_conf->rc_quality); +} +} else { +rc_conf->rc_quality = 0; +} + +if (rc_mode->hrd) { +if (avctx->rc_buffer_size) +rc_conf->hrd_buffer_size = avctx->rc_buffer_size; +else if (avctx->rc_max_rate > 0) +rc_conf->hrd_buffer_size = avctx->rc_max_rate; +else +rc_conf->hrd_buffer_size = avctx->bit_rate; +if (avctx->rc_initial_buffer_occupancy) { +if (avctx->rc_initial_buffer_occupancy > rc_conf->hrd_buffer_size) { +av_log(avctx, AV_LOG_ERROR, "Invalid RC buffer settings: " + "must have initial buffer size (%d) <= " + "buffer size (%"
[FFmpeg-devel] [PATCH 6/9] avcodec/vaapi_encode: extract a get_recon_format function to base layer
From: Tong Wu Get constraints and set recon frame format can be shared with other HW encoder such as D3D12. Extract this part as a new function to base layer. Signed-off-by: Tong Wu --- libavcodec/hw_base_encode.c | 58 + libavcodec/hw_base_encode.h | 2 ++ libavcodec/vaapi_encode.c | 51 ++-- 3 files changed, 63 insertions(+), 48 deletions(-) diff --git a/libavcodec/hw_base_encode.c b/libavcodec/hw_base_encode.c index bb9fe70239..7497e0397e 100644 --- a/libavcodec/hw_base_encode.c +++ b/libavcodec/hw_base_encode.c @@ -836,6 +836,64 @@ int ff_hw_base_init_gop_structure(AVCodecContext *avctx, uint32_t ref_l0, uint32 return 0; } +int ff_hw_base_get_recon_format(AVCodecContext *avctx, const void *hwconfig, enum AVPixelFormat *fmt) +{ +HWBaseEncodeContext *ctx = avctx->priv_data; +AVHWFramesConstraints *constraints = NULL; +enum AVPixelFormat recon_format; +int err, i; + +constraints = av_hwdevice_get_hwframe_constraints(ctx->device_ref, + hwconfig); +if (!constraints) { +err = AVERROR(ENOMEM); +goto fail; +} + +// Probably we can use the input surface format as the surface format +// of the reconstructed frames. If not, we just pick the first (only?) +// format in the valid list and hope that it all works. +recon_format = AV_PIX_FMT_NONE; +if (constraints->valid_sw_formats) { +for (i = 0; constraints->valid_sw_formats[i] != AV_PIX_FMT_NONE; i++) { +if (ctx->input_frames->sw_format == +constraints->valid_sw_formats[i]) { +recon_format = ctx->input_frames->sw_format; +break; +} +} +if (recon_format == AV_PIX_FMT_NONE) { +// No match. Just use the first in the supported list and +// hope for the best. +recon_format = constraints->valid_sw_formats[0]; +} +} else { +// No idea what to use; copy input format. 
+recon_format = ctx->input_frames->sw_format; +} +av_log(avctx, AV_LOG_DEBUG, "Using %s as format of " + "reconstructed frames.\n", av_get_pix_fmt_name(recon_format)); + +if (ctx->surface_width < constraints->min_width || +ctx->surface_height < constraints->min_height || +ctx->surface_width > constraints->max_width || +ctx->surface_height > constraints->max_height) { +av_log(avctx, AV_LOG_ERROR, "Hardware does not support encoding at " + "size %dx%d (constraints: width %d-%d height %d-%d).\n", + ctx->surface_width, ctx->surface_height, + constraints->min_width, constraints->max_width, + constraints->min_height, constraints->max_height); +err = AVERROR(EINVAL); +goto fail; +} + +*fmt = recon_format; +err = 0; +fail: +av_hwframe_constraints_free(&constraints); +return err; +} + int ff_hw_base_encode_init(AVCodecContext *avctx) { HWBaseEncodeContext *ctx = avctx->priv_data; diff --git a/libavcodec/hw_base_encode.h b/libavcodec/hw_base_encode.h index d6d2fc03c5..3d026ee23e 100644 --- a/libavcodec/hw_base_encode.h +++ b/libavcodec/hw_base_encode.h @@ -279,6 +279,8 @@ int ff_hw_base_rc_mode_configure(AVCodecContext *avctx, const HWBaseEncodeRCMode int ff_hw_base_init_gop_structure(AVCodecContext *avctx, uint32_t ref_l0, uint32_t ref_l1, int flags, int prediction_pre_only); +int ff_hw_base_get_recon_format(AVCodecContext *avctx, const void *hwconfig, enum AVPixelFormat *fmt); + int ff_hw_base_encode_init(AVCodecContext *avctx); int ff_hw_base_encode_close(AVCodecContext *avctx); diff --git a/libavcodec/vaapi_encode.c b/libavcodec/vaapi_encode.c index 0bce3ce105..84a81559e1 100644 --- a/libavcodec/vaapi_encode.c +++ b/libavcodec/vaapi_encode.c @@ -1898,9 +1898,8 @@ static av_cold int vaapi_encode_create_recon_frames(AVCodecContext *avctx) HWBaseEncodeContext *base_ctx = avctx->priv_data; VAAPIEncodeContext *ctx = avctx->priv_data; AVVAAPIHWConfig *hwconfig = NULL; -AVHWFramesConstraints *constraints = NULL; enum AVPixelFormat recon_format; -int err, i; +int err; hwconfig = 
av_hwdevice_hwconfig_alloc(base_ctx->device_ref); if (!hwconfig) { @@ -1909,52 +1908,9 @@ static av_cold int vaapi_encode_create_recon_frames(AVCodecContext *avctx) } hwconfig->config_id = ctx->va_config; -constraints = av_hwdevice_get_hwframe_constraints(base_ctx->device_ref, - hwconfig); -if (!constraints) { -err = AVERROR(ENOMEM); -goto fail; -} - -// Probably we can use the input surface format as the surface format -// of the reconstructed frames. If not, we just pick the first (only?) -// format in the valid list and hope that it all w
[FFmpeg-devel] [PATCH 7/9] avutil/hwcontext_d3d12va: add Flags for resource creation
From: Tong Wu

The Flags field is added to support different resource creation.

Signed-off-by: Tong Wu
---
 doc/APIchanges                | 3 +++
 libavutil/hwcontext_d3d12va.c | 2 +-
 libavutil/hwcontext_d3d12va.h | 5 +++++
 libavutil/version.h           | 2 +-
 4 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/doc/APIchanges b/doc/APIchanges
index e477ed78e0..a33e54dd3b 100644
--- a/doc/APIchanges
+++ b/doc/APIchanges
@@ -2,6 +2,9 @@ The last version increases of all libraries were on 2023-02-09

 API changes, most recent first:

+2024-01-xx - xx - lavu 58.37.100 - hwcontext_d3d12va.h
+  Add AVD3D12VAFramesContext.Flags
+
 2023-11-xx - xx - lavfi 9.16.100 - buffersink.h buffersrc.h
   Add av_buffersink_get_colorspace and av_buffersink_get_color_range.
   Add AVBufferSrcParameters.color_space and AVBufferSrcParameters.color_range.

diff --git a/libavutil/hwcontext_d3d12va.c b/libavutil/hwcontext_d3d12va.c
index 414dd44290..0d94f48543 100644
--- a/libavutil/hwcontext_d3d12va.c
+++ b/libavutil/hwcontext_d3d12va.c
@@ -237,7 +237,7 @@ static AVBufferRef *d3d12va_pool_alloc(void *opaque, size_t size)
         .Format     = hwctx->format,
         .SampleDesc = {.Count = 1, .Quality = 0 },
         .Layout     = D3D12_TEXTURE_LAYOUT_UNKNOWN,
-        .Flags      = D3D12_RESOURCE_FLAG_NONE,
+        .Flags      = hwctx->Flags,
     };

     frame = av_mallocz(sizeof(AVD3D12VAFrame));

diff --git a/libavutil/hwcontext_d3d12va.h b/libavutil/hwcontext_d3d12va.h
index ff06e6f2ef..dc1c17d3f9 100644
--- a/libavutil/hwcontext_d3d12va.h
+++ b/libavutil/hwcontext_d3d12va.h
@@ -129,6 +129,11 @@ typedef struct AVD3D12VAFramesContext {
      * If unset, will be automatically set.
      */
     DXGI_FORMAT format;
+
+    /**
+     * This field is used for resource creation.
+     */
+    D3D12_RESOURCE_FLAGS Flags;
 } AVD3D12VAFramesContext;

 #endif /* AVUTIL_HWCONTEXT_D3D12VA_H */

diff --git a/libavutil/version.h b/libavutil/version.h
index 772c4e209c..3ad1a9446c 100644
--- a/libavutil/version.h
+++ b/libavutil/version.h
@@ -79,7 +79,7 @@
  */

 #define LIBAVUTIL_VERSION_MAJOR  58
-#define LIBAVUTIL_VERSION_MINOR  36
+#define LIBAVUTIL_VERSION_MINOR  37
#define LIBAVUTIL_VERSION_MICRO 101

 #define LIBAVUTIL_VERSION_INT   AV_VERSION_INT(LIBAVUTIL_VERSION_MAJOR, \
--
2.41.0.windows.1
[FFmpeg-devel] [PATCH 5/9] avcodec/vaapi_encode: extract gop configuration to base layer
From: Tong Wu Signed-off-by: Tong Wu --- libavcodec/hw_base_encode.c | 54 + libavcodec/hw_base_encode.h | 3 +++ libavcodec/vaapi_encode.c | 52 +++ 3 files changed, 61 insertions(+), 48 deletions(-) diff --git a/libavcodec/hw_base_encode.c b/libavcodec/hw_base_encode.c index c20c47bf55..bb9fe70239 100644 --- a/libavcodec/hw_base_encode.c +++ b/libavcodec/hw_base_encode.c @@ -782,6 +782,60 @@ int ff_hw_base_rc_mode_configure(AVCodecContext *avctx, const HWBaseEncodeRCMode return 0; } +int ff_hw_base_init_gop_structure(AVCodecContext *avctx, uint32_t ref_l0, uint32_t ref_l1, + int flags, int prediction_pre_only) +{ +HWBaseEncodeContext *ctx = avctx->priv_data; + +if (flags & FLAG_INTRA_ONLY || avctx->gop_size <= 1) { +av_log(avctx, AV_LOG_VERBOSE, "Using intra frames only.\n"); +ctx->gop_size = 1; +} else if (ref_l0 < 1) { +av_log(avctx, AV_LOG_ERROR, "Driver does not support any " + "reference frames.\n"); +return AVERROR(EINVAL); +} else if (!(flags & FLAG_B_PICTURES) || ref_l1 < 1 || + avctx->max_b_frames < 1 || prediction_pre_only) { +if (ctx->p_to_gpb) + av_log(avctx, AV_LOG_VERBOSE, "Using intra and B-frames " + "(supported references: %d / %d).\n", + ref_l0, ref_l1); +else +av_log(avctx, AV_LOG_VERBOSE, "Using intra and P-frames " + "(supported references: %d / %d).\n", ref_l0, ref_l1); +ctx->gop_size = avctx->gop_size; +ctx->p_per_i = INT_MAX; +ctx->b_per_p = 0; +} else { + if (ctx->p_to_gpb) + av_log(avctx, AV_LOG_VERBOSE, "Using intra and B-frames " + "(supported references: %d / %d).\n", + ref_l0, ref_l1); + else + av_log(avctx, AV_LOG_VERBOSE, "Using intra, P- and B-frames " + "(supported references: %d / %d).\n", ref_l0, ref_l1); +ctx->gop_size = avctx->gop_size; +ctx->p_per_i = INT_MAX; +ctx->b_per_p = avctx->max_b_frames; +if (flags & FLAG_B_PICTURE_REFERENCES) { +ctx->max_b_depth = FFMIN(ctx->desired_b_depth, + av_log2(ctx->b_per_p) + 1); +} else { +ctx->max_b_depth = 1; +} +} + +if (flags & FLAG_NON_IDR_KEY_PICTURES) { +ctx->closed_gop = 
!!(avctx->flags & AV_CODEC_FLAG_CLOSED_GOP); +ctx->gop_per_idr = ctx->idr_interval + 1; +} else { +ctx->closed_gop = 1; +ctx->gop_per_idr = 1; +} + +return 0; +} + int ff_hw_base_encode_init(AVCodecContext *avctx) { HWBaseEncodeContext *ctx = avctx->priv_data; diff --git a/libavcodec/hw_base_encode.h b/libavcodec/hw_base_encode.h index 57cfa12e73..d6d2fc03c5 100644 --- a/libavcodec/hw_base_encode.h +++ b/libavcodec/hw_base_encode.h @@ -276,6 +276,9 @@ int ff_hw_base_encode_receive_packet(AVCodecContext *avctx, AVPacket *pkt); int ff_hw_base_rc_mode_configure(AVCodecContext *avctx, const HWBaseEncodeRCMode *rc_mode, int default_quality, HWBaseEncodeRCConfigure *rc_conf); +int ff_hw_base_init_gop_structure(AVCodecContext *avctx, uint32_t ref_l0, uint32_t ref_l1, + int flags, int prediction_pre_only); + int ff_hw_base_encode_init(AVCodecContext *avctx); int ff_hw_base_encode_close(AVCodecContext *avctx); diff --git a/libavcodec/vaapi_encode.c b/libavcodec/vaapi_encode.c index 30e5deac08..0bce3ce105 100644 --- a/libavcodec/vaapi_encode.c +++ b/libavcodec/vaapi_encode.c @@ -1443,7 +1443,7 @@ static av_cold int vaapi_encode_init_gop_structure(AVCodecContext *avctx) VAStatus vas; VAConfigAttrib attr = { VAConfigAttribEncMaxRefFrames }; uint32_t ref_l0, ref_l1; -int prediction_pre_only; +int prediction_pre_only, err; vas = vaGetConfigAttributes(ctx->hwctx->display, ctx->va_profile, @@ -1507,53 +1507,9 @@ static av_cold int vaapi_encode_init_gop_structure(AVCodecContext *avctx) } #endif -if (ctx->codec->flags & FLAG_INTRA_ONLY || -avctx->gop_size <= 1) { -av_log(avctx, AV_LOG_VERBOSE, "Using intra frames only.\n"); -base_ctx->gop_size = 1; -} else if (ref_l0 < 1) { -av_log(avctx, AV_LOG_ERROR, "Driver does not support any " - "reference frames.\n"); -return AVERROR(EINVAL); -} else if (!(ctx->codec->flags & FLAG_B_PICTURES) || - ref_l1 < 1 || avctx->max_b_frames < 1 || - prediction_pre_only) { -if (base_ctx->p_to_gpb) - av_log(avctx, AV_LOG_VERBOSE, "Using intra and 
B-frames " - "(supported references: %d / %d).\n", - ref_l0, ref_l1); -else -av_log(avctx, AV_LOG_VERBOSE, "Using intra and P-frames
[FFmpeg-devel] [PATCH 9/9] Changelog: add D3D12VA HEVC encoder changelog
From: Tong Wu

Signed-off-by: Tong Wu
---
 Changelog | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Changelog b/Changelog
index c40b6d08fd..b3b5c16e0a 100644
--- a/Changelog
+++ b/Changelog
@@ -22,6 +22,7 @@ version <next>:
 - ffmpeg CLI -bsf option may now be used for input as well as output
 - ffmpeg CLI options may now be used as -/opt <path>, which is equivalent to -opt <contents of file path>
+- D3D12VA HEVC encoder

 version 6.1:
 - libaribcaption decoder
--
2.41.0.windows.1
[FFmpeg-devel] [PATCH 8/9] avcodec: add D3D12VA hardware HEVC encoder
From: Tong Wu This implementation is based on D3D12 Video Encoding Spec: https://microsoft.github.io/DirectX-Specs/d3d/D3D12VideoEncoding.html Sample command line for transcoding: ffmpeg.exe -hwaccel d3d12va -hwaccel_output_format d3d12 -i input.mp4 -c:v hevc_d3d12va output.mp4 Signed-off-by: Tong Wu --- configure|6 + libavcodec/Makefile |4 +- libavcodec/allcodecs.c |1 + libavcodec/d3d12va_encode.c | 1441 ++ libavcodec/d3d12va_encode.h | 200 + libavcodec/d3d12va_encode_hevc.c | 1016 + libavcodec/hw_base_encode.h |2 +- 7 files changed, 2668 insertions(+), 2 deletions(-) create mode 100644 libavcodec/d3d12va_encode.c create mode 100644 libavcodec/d3d12va_encode.h create mode 100644 libavcodec/d3d12va_encode_hevc.c diff --git a/configure b/configure index c8ae0a061d..3a186a3454 100755 --- a/configure +++ b/configure @@ -2561,6 +2561,7 @@ CONFIG_EXTRA=" tpeldsp vaapi_1 vaapi_encode +d3d12va_encode vc1dsp videodsp vp3dsp @@ -3204,6 +3205,7 @@ wmv3_vaapi_hwaccel_select="vc1_vaapi_hwaccel" wmv3_vdpau_hwaccel_select="vc1_vdpau_hwaccel" # hardware-accelerated codecs +d3d12va_encode_deps="d3d12va ID3D12VideoEncoder d3d12_encoder_feature" mediafoundation_deps="mftransform_h MFCreateAlignedMemoryBuffer" omx_deps="libdl pthreads" omx_rpi_select="omx" @@ -3271,6 +3273,7 @@ h264_v4l2m2m_encoder_deps="v4l2_m2m h264_v4l2_m2m" hevc_amf_encoder_deps="amf" hevc_cuvid_decoder_deps="cuvid" hevc_cuvid_decoder_select="hevc_mp4toannexb_bsf" +hevc_d3d12va_encoder_select="atsc_a53 cbs_h265 d3d12va_encode" hevc_mediacodec_decoder_deps="mediacodec" hevc_mediacodec_decoder_select="hevc_mp4toannexb_bsf hevc_parser" hevc_mediacodec_encoder_deps="mediacodec" @@ -6612,6 +6615,9 @@ check_type "windows.h d3d11.h" "ID3D11VideoDecoder" check_type "windows.h d3d11.h" "ID3D11VideoContext" check_type "windows.h d3d12.h" "ID3D12Device" check_type "windows.h d3d12video.h" "ID3D12VideoDecoder" +check_type "windows.h d3d12video.h" "ID3D12VideoEncoder" +test_code cc "windows.h d3d12video.h" 
"D3D12_FEATURE_VIDEO feature = D3D12_FEATURE_VIDEO_ENCODER_CODEC" && \ +test_code cc "windows.h d3d12video.h" "D3D12_FEATURE_DATA_VIDEO_ENCODER_RESOURCE_REQUIREMENTS req" && enable d3d12_encoder_feature check_type "windows.h" "DPI_AWARENESS_CONTEXT" -D_WIN32_WINNT=0x0A00 check_type "d3d9.h dxva2api.h" DXVA2_ConfigPictureDecode -D_WIN32_WINNT=0x0602 check_func_headers mfapi.h MFCreateAlignedMemoryBuffer -lmfplat diff --git a/libavcodec/Makefile b/libavcodec/Makefile index f9a5c9d616..d7c24a1867 100644 --- a/libavcodec/Makefile +++ b/libavcodec/Makefile @@ -85,6 +85,7 @@ OBJS-$(CONFIG_CBS_MPEG2) += cbs_mpeg2.o OBJS-$(CONFIG_CBS_VP8) += cbs_vp8.o vp8data.o OBJS-$(CONFIG_CBS_VP9) += cbs_vp9.o OBJS-$(CONFIG_CRYSTALHD) += crystalhd.o +OBJS-$(CONFIG_D3D12VA_ENCODE) += d3d12va_encode.o hw_base_encode.o OBJS-$(CONFIG_DEFLATE_WRAPPER) += zlib_wrapper.o OBJS-$(CONFIG_DOVI_RPU)+= dovi_rpu.o OBJS-$(CONFIG_ERROR_RESILIENCE)+= error_resilience.o @@ -435,6 +436,7 @@ OBJS-$(CONFIG_HEVC_DECODER)+= hevcdec.o hevc_mvs.o \ h274.o OBJS-$(CONFIG_HEVC_AMF_ENCODER)+= amfenc_hevc.o OBJS-$(CONFIG_HEVC_CUVID_DECODER) += cuviddec.o +OBJS-$(CONFIG_HEVC_D3D12VA_ENCODER)+= d3d12va_encode_hevc.o OBJS-$(CONFIG_HEVC_MEDIACODEC_DECODER) += mediacodecdec.o OBJS-$(CONFIG_HEVC_MEDIACODEC_ENCODER) += mediacodecenc.o OBJS-$(CONFIG_HEVC_MF_ENCODER) += mfenc.o mf_utils.o @@ -1304,7 +1306,7 @@ SKIPHEADERS+= %_tablegen.h \ SKIPHEADERS-$(CONFIG_AMF) += amfenc.h SKIPHEADERS-$(CONFIG_D3D11VA) += d3d11va.h dxva2_internal.h -SKIPHEADERS-$(CONFIG_D3D12VA) += d3d12va_decode.h +SKIPHEADERS-$(CONFIG_D3D12VA) += d3d12va_decode.h d3d12va_encode.h SKIPHEADERS-$(CONFIG_DXVA2)+= dxva2.h dxva2_internal.h SKIPHEADERS-$(CONFIG_JNI) += ffjni.h SKIPHEADERS-$(CONFIG_LCMS2)+= fflcms2.h diff --git a/libavcodec/allcodecs.c b/libavcodec/allcodecs.c index 93ce8e3224..b9df2b4752 100644 --- a/libavcodec/allcodecs.c +++ b/libavcodec/allcodecs.c @@ -864,6 +864,7 @@ extern const FFCodec ff_h264_vaapi_encoder; extern const FFCodec 
ff_h264_videotoolbox_encoder; extern const FFCodec ff_hevc_amf_encoder; extern const FFCodec ff_hevc_cuvid_decoder; +extern const FFCodec ff_hevc_d3d12va_encoder; extern const FFCodec ff_hevc_mediacodec_decoder; extern const FFCodec ff_hevc_mediacodec_encoder; extern const FFCodec ff_hevc_mf_encoder; diff --git a/libavcodec/d3d12va_encode.c b/libavcodec/d3d12va_encode.c new file mode 100644 index 00..2dbf41d4b1 --- /dev/null +++ b/libavcodec/d3d12va_encode.c @@ -0,0 +1,1441 @@ +/* +
[FFmpeg-devel] [PATCH] lavfi/dnn: add LibTorch as one of the DNN backends
From: Wenbin Chen

PyTorch is an open-source machine learning framework that accelerates the path from research prototyping to production deployment. Official website: https://pytorch.org/. We refer to the C++ library of PyTorch as LibTorch, the same below.

To build FFmpeg with LibTorch, please take the following steps as reference:
1. Download the LibTorch C++ library from https://pytorch.org/get-started/locally/; select C++/Java for the language, and the other options as you need.
2. Unzip the file to your own directory, with the command: unzip libtorch-shared-with-deps-latest.zip -d your_dir
3. Add libtorch_root/libtorch/include and libtorch_root/libtorch/include/torch/csrc/api/include to $PATH, and libtorch_root/libtorch/lib/ to $LD_LIBRARY_PATH.
4. Configure FFmpeg with ../configure --enable-libtorch --extra-cflag=-I/libtorch_root/libtorch/include --extra-cflag=-I/libtorch_root/libtorch/include/torch/csrc/api/include --extra-ldflags=-L/libtorch_root/libtorch/lib/
5. make

To run FFmpeg DNN inference with the LibTorch backend:
./ffmpeg -i input.jpg -vf dnn_processing=dnn_backend=torch:model=LibTorch_model.pt -y output.jpg
The LibTorch_model.pt can be generated with the Python torch.jit.script() API. Please note that torch.jit.trace() is not recommended, since it does not support ambiguous input sizes.
Signed-off-by: Ting Fu Signed-off-by: Wenbin Chen --- configure | 5 +- libavfilter/dnn/Makefile | 1 + libavfilter/dnn/dnn_backend_torch.cpp | 585 ++ libavfilter/dnn/dnn_interface.c | 5 + libavfilter/dnn_filter_common.c | 31 +- libavfilter/dnn_interface.h | 2 +- libavfilter/vf_dnn_processing.c | 3 + 7 files changed, 621 insertions(+), 11 deletions(-) create mode 100644 libavfilter/dnn/dnn_backend_torch.cpp diff --git a/configure b/configure index c8ae0a061d..75061692b1 100755 --- a/configure +++ b/configure @@ -279,6 +279,7 @@ External library support: --enable-libtheora enable Theora encoding via libtheora [no] --enable-libtls enable LibreSSL (via libtls), needed for https support if openssl, gnutls or mbedtls is not used [no] + --enable-libtorchenable Torch as one DNN backend [no] --enable-libtwolame enable MP2 encoding via libtwolame [no] --enable-libuavs3d enable AVS3 decoding via libuavs3d [no] --enable-libv4l2 enable libv4l2/v4l-utils [no] @@ -1901,6 +1902,7 @@ EXTERNAL_LIBRARY_LIST=" libtensorflow libtesseract libtheora +libtorch libtwolame libuavs3d libv4l2 @@ -2776,7 +2778,7 @@ cbs_vp9_select="cbs" deflate_wrapper_deps="zlib" dirac_parse_select="golomb" dovi_rpu_select="golomb" -dnn_suggest="libtensorflow libopenvino" +dnn_suggest="libtensorflow libopenvino libtorch" dnn_deps="avformat swscale" error_resilience_select="me_cmp" evcparse_select="golomb" @@ -6872,6 +6874,7 @@ enabled libtensorflow && require libtensorflow tensorflow/c/c_api.h TF_Versi enabled libtesseract && require_pkg_config libtesseract tesseract tesseract/capi.h TessBaseAPICreate enabled libtheora && require libtheora theora/theoraenc.h th_info_init -ltheoraenc -ltheoradec -logg enabled libtls&& require_pkg_config libtls libtls tls.h tls_configure +enabled libtorch && check_cxxflags -std=c++14 && require_cpp libtorch torch/torch.h "torch::Tensor" -ltorch -lc10 -ltorch_cpu -lstdc++ -lpthread enabled libtwolame&& require libtwolame twolame.h twolame_init -ltwolame && { check_lib libtwolame 
twolame.h twolame_encode_buffer_float32_interleaved -ltwolame || die "ERROR: libtwolame must be installed and version must be >= 0.3.10"; } diff --git a/libavfilter/dnn/Makefile b/libavfilter/dnn/Makefile index 5d5697ea42..3d09927c98 100644 --- a/libavfilter/dnn/Makefile +++ b/libavfilter/dnn/Makefile @@ -6,5 +6,6 @@ OBJS-$(CONFIG_DNN) += dnn/dnn_backend_common.o DNN-OBJS-$(CONFIG_LIBTENSORFLOW) += dnn/dnn_backend_tf.o DNN-OBJS-$(CONFIG_LIBOPENVINO) += dnn/dnn_backend_openvino.o +DNN-OBJS-$(CONFIG_LIBTORCH) += dnn/dnn_backend_torch.o OBJS-$(CONFIG_DNN) += $(DNN-OBJS-yes) diff --git a/libavfilter/dnn/dnn_backend_torch.cpp b/libavfilter/dnn/dnn_backend_torch.cpp new file mode 100644 index 00..4fc76d0ce4 --- /dev/null +++ b/libavfilter/dnn/dnn_backend_torch.cpp @@ -0,0 +1,585 @@ +/* + * Copyright (c) 2024 + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even th
[FFmpeg-devel] [PATCH] configure: autodetect libglslang ldflags
Since glslang 14.0.0, OGLCompiler and HLSL stub libraries have been fully removed from the build. This fixes the configuration by detecting if the stub libraries are still present (glslang releases before version 14.0.0). ffbuild/config.log: /usr/lib/gcc/x86_64-pc-linux-gnu/13/../../../../x86_64-pc-linux-gnu/bin/ld: cannot find -lOSDependent: No such file or directory /usr/lib/gcc/x86_64-pc-linux-gnu/13/../../../../x86_64-pc-linux-gnu/bin/ld: cannot find -lHLSL: No such file or directory /usr/lib/gcc/x86_64-pc-linux-gnu/13/../../../../x86_64-pc-linux-gnu/bin/ld: cannot find -lOGLCompiler: No such file or directory Addresses https://trac.ffmpeg.org/ticket/10713 See https://bugs.gentoo.org/show_bug.cgi?id=918989 Should fix https://ffmpeg.org/pipermail/ffmpeg-devel/2023-August/313666.html Signed-off-by: Matthew White --- configure | 23 +-- 1 file changed, 21 insertions(+), 2 deletions(-) diff --git a/configure b/configure index c8ae0a061d..abff488dc0 100755 --- a/configure +++ b/configure @@ -2626,6 +2626,7 @@ CMDLINE_SET=" ignore_tests install ld +libglslang_ldflags ln_s logfile malloc_prefix @@ -6652,6 +6653,24 @@ if enabled_all libglslang libshaderc; then die "ERROR: libshaderc and libglslang are mutually exclusive, if in doubt, disable libglslang" fi +if enabled libglslang; then +if [ -x "$(command -v glslang)" ]; then +# https://github.com/KhronosGroup/glslang +# commit 6be56e45e574b375d759b89dad35f780bbd4792f: Remove `OGLCompiler` and `HLSL` stub libraries from build +# StandAlone/StandAlone.cpp: "SpirvGeneratorVersion:GLSLANG_VERSION_MAJOR.GLSLANG_VERSION_MINOR.GLSLANG_VERSION_PATCH GLSLANG_VERSION_FLAVOR" +glslang_version="$(glslang -dumpversion)" +glslang_major="${glslang_version%%.*}" +glslang_major="${glslang_major#*:}" +if test ${glslang_major} -le 13; then +libglslang_ldflags=" -lOSDependent -lHLSL -lOGLCompiler" +elif ! 
[[ ${glslang_major} =~ ^[0-9]+$ ]]; then +die "ERROR: glslang's computed major version isn't a number: '${glslang_major}'" +fi +else +die "ERROR: glslang binary not found, impossible to determine installed glslang's version" +fi +fi + check_cpp_condition winrt windows.h "!WINAPI_FAMILY_PARTITION(WINAPI_PARTITION_DESKTOP)" if ! disabled w32threads && ! enabled pthreads; then @@ -6771,10 +6790,10 @@ enabled libfreetype && require_pkg_config libfreetype freetype2 "ft2build. enabled libfribidi&& require_pkg_config libfribidi fribidi fribidi.h fribidi_version_info enabled libharfbuzz && require_pkg_config libharfbuzz harfbuzz hb.h hb_buffer_create enabled libglslang && { check_lib spirv_compiler glslang/Include/glslang_c_interface.h glslang_initialize_process \ --lglslang -lMachineIndependent -lOSDependent -lHLSL -lOGLCompiler -lGenericCodeGen \ +-lglslang -lMachineIndependent "${libglslang_ldflags}" -lGenericCodeGen \ -lSPVRemapper -lSPIRV -lSPIRV-Tools-opt -lSPIRV-Tools -lpthread -lstdc++ -lm || require spirv_compiler glslang/Include/glslang_c_interface.h glslang_initialize_process \ --lglslang -lOSDependent -lHLSL -lOGLCompiler \ +-lglslang "${libglslang_ldflags}" \ -lSPVRemapper -lSPIRV -lSPIRV-Tools-opt -lSPIRV-Tools -lpthread -lstdc++ -lm; } enabled libgme&& { check_pkg_config libgme libgme gme/gme.h gme_new_emu || require libgme gme/gme.h gme_new_emu -lgme -lstdc++; } -- 2.43.0