[FFmpeg-devel] suggested patch: avfilter/vf_subtitles: add support for subtitles font scaling
Hello. Recently I used ffmpeg to embed subtitles and needed to scale them. I assumed the "original_size" option scales subtitles, but it does not, so I wrote a short patch to make it do that (attached). If that is considered too complex or a bad idea, I have also attached another patch that adds a "font_scale" option instead. I hope you like one of these.

--
diff -ur ffmpeg-HEAD-649b7a9/libavfilter/vf_subtitles.c ffmpeg-HEAD-649b7a9.new/libavfilter/vf_subtitles.c
--- ffmpeg-HEAD-649b7a9/libavfilter/vf_subtitles.c	2014-09-10 00:07:59.0 +
+++ ffmpeg-HEAD-649b7a9.new/libavfilter/vf_subtitles.c	2014-09-10 00:11:26.0 +
@@ -136,9 +136,11 @@
     ff_draw_init(&ass->draw, inlink->format, 0);
     ass_set_frame_size(ass->renderer, inlink->w, inlink->h);
-    if (ass->original_w && ass->original_h)
+    if (ass->original_w && ass->original_h) {
         ass_set_aspect_ratio(ass->renderer, (double)inlink->w / inlink->h,
                              (double)ass->original_w / ass->original_h);
+        ass_set_font_scale(ass->renderer, (inlink->w + inlink->h) * 1.0 / (ass->original_w + ass->original_h));
+    }
     return 0;
 }

diff -ur ffmpeg-HEAD-649b7a9/libavfilter/vf_subtitles.c ffmpeg-HEAD-649b7a9.new/libavfilter/vf_subtitles.c
--- ffmpeg-HEAD-649b7a9/libavfilter/vf_subtitles.c	2014-09-10 01:02:49.0 +
+++ ffmpeg-HEAD-649b7a9.new/libavfilter/vf_subtitles.c	2014-09-10 03:34:19.0 +
@@ -55,6 +55,7 @@
     uint8_t rgba_map[4];
     int pix_step[4];            ///< steps per pixel for each plane of the main output
     int original_w, original_h;
+    double font_scale;
     FFDrawContext draw;
 } AssContext;

@@ -65,6 +66,7 @@
     {"filename", "set the filename of file to read", OFFSET(filename), AV_OPT_TYPE_STRING, {.str = NULL}, CHAR_MIN, CHAR_MAX, FLAGS }, \
     {"f", "set the filename of file to read", OFFSET(filename), AV_OPT_TYPE_STRING, {.str = NULL}, CHAR_MIN, CHAR_MAX, FLAGS }, \
     {"original_size", "set the size of the original video (used to scale fonts)", OFFSET(original_w), AV_OPT_TYPE_IMAGE_SIZE, {.str = NULL}, CHAR_MIN, CHAR_MAX, FLAGS }, \
+    {"font_scale", "set font scale", OFFSET(font_scale), AV_OPT_TYPE_DOUBLE, {.dbl = 1}, 0.01, 99, FLAGS }, \

 /* libass supports a log level ranging from 0 to 7 */
 static const int ass_libavfilter_log_level_map[] = {

@@ -139,6 +141,7 @@
     if (ass->original_w && ass->original_h)
         ass_set_aspect_ratio(ass->renderer, (double)inlink->w / inlink->h,
                              (double)ass->original_w / ass->original_h);
+    ass_set_font_scale(ass->renderer, ass->font_scale);
     return 0;
 }
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
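The first patch's implicit scale factor is just the ratio of the current frame's half-perimeter to the declared original's. A minimal standalone sketch of that arithmetic (the function name is mine, not part of libavfilter):

```c
#include <assert.h>

/* Sketch of the scale factor the first patch derives from original_size:
 * the ratio of the current frame's (w + h) to the declared original (w + h).
 * implicit_font_scale() is a hypothetical helper, not FFmpeg API. */
static double implicit_font_scale(int w, int h, int orig_w, int orig_h)
{
    return (w + h) * 1.0 / (orig_w + orig_h);
}
```

With this factor, a 1280x720 render of subtitles authored for 640x360 doubles the font size, which is the behavior the patch aims for.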
Re: [FFmpeg-devel] h264: fix RTSP stream decoding
> > The error code returned by decode_extradata_ps() is inconsistent after this
> its not "if any failed" it is returning an error if the last failed

Sorry, I don't get how it is supposed to work. I just found the previous implementation and checked which commit broke it. The other possible solution, at the upper level:

---
From 9fcd003a095b19b9e2fb5f6af3cc57a9e131f308 Mon Sep 17 00:00:00 2001
From: Sergey Gavrushkin
Date: Wed, 3 Jan 2018 12:51:15 +0300
Subject: [PATCH] libavcodec/h264: fix decoding

Fixes ticket #6422.

It is a regression fix for an issue that was introduced in commit
98c97994c5b90bdae02accb155eeceeb5224b8ef. The err_recognition setting is
ignored while extradata is decoded, and the whole decoding process fails
due to a timeout.

Signed-off-by: Sergey Gavrushkin
---
 libavcodec/h264_parse.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavcodec/h264_parse.c b/libavcodec/h264_parse.c
index fee28d9..403fd39 100644
--- a/libavcodec/h264_parse.c
+++ b/libavcodec/h264_parse.c
@@ -487,7 +487,7 @@ int ff_h264_decode_extradata(const uint8_t *data, int size, H264ParamSets *ps,
     } else {
         *is_avc = 0;
         ret = decode_extradata_ps(data, size, ps, 0, logctx);
-        if (ret < 0)
+        if (ret < 0 && (err_recognition & AV_EF_EXPLODE))
             return ret;
     }
     return size;
--
2.6.4

Thank you,
Sergey
[FFmpeg-devel] [PATCH] cuviddec: improved way of finding out if a frame is interlaced or progressive
There are 2 types of problems when using adaptive deinterlace with cuvid:

1. Sometimes, in the middle of transcoding, cuvid outputs frames with visible horizontal lines (as though the weave deinterlace method was chosen);
2. Occasionally, on scene changes, cuvid outputs a wrong frame, which should have been shown several seconds before (as if the frame was assigned some wrong PTS value).

The reason is that sometimes CUVIDPARSERDISPINFO has the property progressive_frame equal to 1 for interlaced videos. To fix the problem, we should check whether the video is interlaced or progressive at the beginning of a video sequence (cuvid_handle_video_sequence), and then use this information instead of the unreliable progressive_frame property in CUVIDPARSERDISPINFO.

More info, samples and reproduction steps are here:
https://github.com/Svechnikov/ffmpeg-cuda-deinterlace-problems
---
 libavcodec/cuviddec.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/libavcodec/cuviddec.c b/libavcodec/cuviddec.c
index 2aecb45..671fc8c 100644
--- a/libavcodec/cuviddec.c
+++ b/libavcodec/cuviddec.c
@@ -77,6 +77,7 @@ typedef struct CuvidContext
     int deint_mode;
     int deint_mode_current;
     int64_t prev_pts;
+    unsigned char progressive_sequence;

     int internal_error;
     int decoder_flushing;
@@ -216,6 +217,8 @@ static int CUDAAPI cuvid_handle_video_sequence(void *opaque, CUVIDEOFORMAT* format
                               ? cudaVideoDeinterlaceMode_Weave
                               : ctx->deint_mode;

+    ctx->progressive_sequence = format->progressive_sequence;
+
     if (!format->progressive_sequence && ctx->deint_mode_current == cudaVideoDeinterlaceMode_Weave)
         avctx->flags |= AV_CODEC_FLAG_INTERLACED_DCT;
     else
@@ -509,6 +512,8 @@ static int cuvid_output_frame(AVCodecContext *avctx, AVFrame *frame)

         av_fifo_generic_read(ctx->frame_queue, &parsed_frame, sizeof(CuvidParsedFrame), NULL);

+        parsed_frame.dispinfo.progressive_frame = ctx->progressive_sequence;
+
         memset(&params, 0, sizeof(params));
         params.progressive_frame = parsed_frame.dispinfo.progressive_frame;
         params.second_field = parsed_frame.second_field;
--
2.7.4
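The fix boils down to trusting the sequence header over the per-frame flag. A minimal model of that logic, with hypothetical names (this is not the cuvid API, just an illustration of the decision the patch makes):

```c
#include <assert.h>

/* Hypothetical model of the fix: cache the sequence-level progressive flag
 * once, then use it instead of the unreliable per-frame flag. */
struct seq_state { int progressive_sequence; };

static void on_sequence(struct seq_state *s, int progressive)
{
    s->progressive_sequence = progressive; /* set in the sequence callback */
}

static int frame_is_progressive(const struct seq_state *s, int per_frame_flag)
{
    (void)per_frame_flag;           /* occasionally wrong, so ignored */
    return s->progressive_sequence; /* trust the sequence header instead */
}
```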
[FFmpeg-devel] [PATCH] libavfilter/vf_scale_cuda: fix frame dimensions
AVHWFramesContext has aligned width and height. When a new AVFrame is initialized, it receives these aligned values (in av_hwframe_get_buffer), which leads to incorrect scaling: the resulting frames are cropped either horizontally or vertically. As a fix, we can overwrite the dimensions with the original values right after av_hwframe_get_buffer.

More info, samples and reproduction steps are here:
https://github.com/Svechnikov/ffmpeg-scale-cuda-problem
---
 libavfilter/vf_scale_cuda.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/libavfilter/vf_scale_cuda.c b/libavfilter/vf_scale_cuda.c
index c97a802..ef1bd82 100644
--- a/libavfilter/vf_scale_cuda.c
+++ b/libavfilter/vf_scale_cuda.c
@@ -463,6 +463,9 @@ static int cudascale_scale(AVFilterContext *ctx, AVFrame *out, AVFrame *in)
     if (ret < 0)
         return ret;

+    s->tmp_frame->width  = s->planes_out[0].width;
+    s->tmp_frame->height = s->planes_out[0].height;
+
     av_frame_move_ref(out, s->frame);
     av_frame_move_ref(s->frame, s->tmp_frame);
--
2.7.4
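To illustrate why the pool hands back larger frames: hardware frame pools typically round dimensions up to an alignment boundary. A sketch of that rounding (align_up() and the 32-pixel alignment are illustrative assumptions, not values taken from the patch):

```c
#include <assert.h>

/* Illustration only: a hw frames pool allocates surfaces with dimensions
 * rounded up to some alignment, so a frame taken from the pool reports the
 * padded size until the caller overwrites width/height with the requested
 * values. align_up() and the alignment of 32 are assumptions for the demo. */
static int align_up(int x, int a)
{
    return (x + a - 1) & ~(a - 1);
}
```

Scaling into a frame that reports the padded size, then displaying it at that size, is what produces the apparent crop the patch describes.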
[FFmpeg-devel] [PATCH] libavfilter/vf_scale_cuda: fix src_pitch for 10bit videos
When scaling a 10bit video using the scale_cuda filter (which uses pixel format AV_PIX_FMT_P010LE), the output video gets distorted. I think it has something to do with the differences in processing between cuda_sdk and ffnvcodec with cuda_nvcc (the problem appears after this commit: https://github.com/FFmpeg/FFmpeg/commit/2544c7ea67ca9521c5de36396bc9ac7058223742). To solve the problem, we should not divide the input frame planes' linesizes by 2 and should leave them as they are.

More info, samples and reproduction steps are here:
https://github.com/Svechnikov/ffmpeg-scale-cuda-10bit-problem
---
 libavfilter/vf_scale_cuda.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libavfilter/vf_scale_cuda.c b/libavfilter/vf_scale_cuda.c
index c97a802..7fc33ee 100644
--- a/libavfilter/vf_scale_cuda.c
+++ b/libavfilter/vf_scale_cuda.c
@@ -423,11 +423,11 @@ static int scalecuda_resize(AVFilterContext *ctx,
         break;
     case AV_PIX_FMT_P010LE:
         call_resize_kernel(ctx, s->cu_func_ushort, 1,
-                           in->data[0], in->width, in->height, in->linesize[0]/2,
+                           in->data[0], in->width, in->height, in->linesize[0],
                            out->data[0], out->width, out->height, out->linesize[0]/2,
                            2);
         call_resize_kernel(ctx, s->cu_func_ushort2, 2,
-                           in->data[1], in->width / 2, in->height / 2, in->linesize[1]/2,
+                           in->data[1], in->width / 2, in->height / 2, in->linesize[1],
                            out->data[0] + out->linesize[0] * ((out->height + 31) & ~0x1f), out->width / 2, out->height / 2, out->linesize[1] / 4,
                            2);
         break;
--
2.7.4
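The underlying unit confusion can be shown with plain arithmetic: AVFrame linesize is in bytes, and P010LE stores each sample in a 16-bit word, so a kernel that expects its source pitch in bytes must not be handed linesize/2. A small sketch (helper names are mine, not FFmpeg API):

```c
#include <assert.h>

/* P010LE stores 10-bit samples in 16-bit words, so a row of `width` luma
 * samples occupies width * 2 bytes; AVFrame linesize is already in bytes.
 * Halving it makes every row after the first start at the wrong offset.
 * Both helpers are illustrative, not FFmpeg API. */
static int p010_luma_row_bytes(int width)
{
    return width * 2;
}

static long row_offset(int row, int pitch_bytes)
{
    return (long)row * pitch_bytes;
}
```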
[FFmpeg-devel] [PATCH] Rename SRT's streamid to srt_streamid to avoid a conflict with standard streamid option
The default streamid is a numeric value and is not used by the SRT code; SRT instead has its own string streamid. The current code uses the same option name for both, which causes a conflict when ffmpeg is started from a terminal. The attached patch fixes this by renaming SRT's "streamid" to "srt_streamid".

Best Regards,
Sergey

From 46d75e066ec828545ebf242ab0530ecb66d7fc6d Mon Sep 17 00:00:00 2001
From: Sergey Ilinykh
Date: Thu, 3 Jun 2021 13:13:32 +0300
Subject: [PATCH] Rename SRT's streamid to srt_streamid to avoid a conflict
 with standard streamid option

---
 libavformat/libsrt.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libavformat/libsrt.c b/libavformat/libsrt.c
index c1e96f700e..10dfc9e9c9 100644
--- a/libavformat/libsrt.c
+++ b/libavformat/libsrt.c
@@ -133,7 +133,7 @@ static const AVOption libsrt_options[] = {
     { "rcvbuf", "Receive buffer size (in bytes)", OFFSET(rcvbuf), AV_OPT_TYPE_INT, { .i64 = -1 }, -1, INT_MAX, .flags = D|E },
     { "lossmaxttl", "Maximum possible packet reorder tolerance", OFFSET(lossmaxttl), AV_OPT_TYPE_INT, { .i64 = -1 }, -1, INT_MAX, .flags = D|E },
     { "minversion", "The minimum SRT version that is required from the peer", OFFSET(minversion), AV_OPT_TYPE_INT, { .i64 = -1 }, -1, INT_MAX, .flags = D|E },
-    { "streamid", "A string of up to 512 characters that an Initiator can pass to a Responder", OFFSET(streamid), AV_OPT_TYPE_STRING, { .str = NULL }, .flags = D|E },
+    { "srt_streamid", "A string of up to 512 characters that an Initiator can pass to a Responder", OFFSET(streamid), AV_OPT_TYPE_STRING, { .str = NULL }, .flags = D|E },
     { "smoother", "The type of Smoother used for the transmission for that socket", OFFSET(smoother), AV_OPT_TYPE_STRING, { .str = NULL }, .flags = D|E },
     { "messageapi", "Enable message API", OFFSET(messageapi), AV_OPT_TYPE_BOOL, { .i64 = -1 }, -1, 1, .flags = D|E },
     { "transtype", "The transmission type for the socket", OFFSET(transtype), AV_OPT_TYPE_INT, { .i64 = SRTT_INVALID }, SRTT_LIVE, SRTT_INVALID, .flags = D|E, "transtype" },
@@ -608,7 +608,7 @@ static int libsrt_open(URLContext *h, const char *uri, int flags)
         if (av_find_info_tag(buf, sizeof(buf), "minversion", p)) {
             s->minversion = strtol(buf, NULL, 0);
         }
-        if (av_find_info_tag(buf, sizeof(buf), "streamid", p)) {
+        if (av_find_info_tag(buf, sizeof(buf), "srt_streamid", p)) {
             av_freep(&s->streamid);
             s->streamid = av_strdup(buf);
             if (!s->streamid) {
--
2.31.1
Re: [FFmpeg-devel] [PATCH] Rename SRT's streamid to srt_streamid to avoid a conflict with standard streamid option
This one does a better job:
http://ffmpeg.org/pipermail/ffmpeg-devel/2021-June/280949.html
Please merge.

Best Regards,
Sergey

On Thu, 3 Jun 2021 at 13:37, Sergey Ilinykh wrote:
> Default streamid is some numeric value and not used by SRT code. Instead
> SRT has its own string streamid. Current code has the same option name for
> both and this causes a conflict when ffmpeg is started from a terminal.
>
> The attached patch fixes it by renaming SRT's "streamid" to "srt_streamid"
>
> Best Regards,
> Sergey
[FFmpeg-devel] [PATCH] Fix potential integer overflow in mov_read_keys
Actual allocation size is computed as (count + 1)*sizeof(meta_keys), so we need to check that (count + 1) won't cause overflow.

From cfc0f5a099284c95476d5c020dca05fb743ff5ae Mon Sep 17 00:00:00 2001
From: Sergey Volk
Date: Wed, 7 Sep 2016 14:05:35 -0700
Subject: [PATCH] Fix potential integer overflow in mov_read_keys

Actual allocation size is computed as (count + 1)*sizeof(meta_keys), so we
need to check that (count + 1) won't cause overflow.
---
 libavformat/mov.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavformat/mov.c b/libavformat/mov.c
index f499906..ea7d051 100644
--- a/libavformat/mov.c
+++ b/libavformat/mov.c
@@ -3278,7 +3278,7 @@ static int mov_read_keys(MOVContext *c, AVIOContext *pb, MOVAtom atom)

     avio_skip(pb, 4);
     count = avio_rb32(pb);
-    if (count > UINT_MAX / sizeof(*c->meta_keys)) {
+    if (count + 1 > UINT_MAX / sizeof(*c->meta_keys)) {
         av_log(c->fc, AV_LOG_ERROR,
                "The 'keys' atom with the invalid key count: %d\n", count);
         return AVERROR_INVALIDDATA;
--
2.8.0.rc3.226.g39d4020
Re: [FFmpeg-devel] [PATCH] Fix potential integer overflow in mov_read_keys
I just realized that count+1 itself might overflow if count==UINT_MAX, so I guess it's better to subtract 1 from the right-hand side. Attached updated patch.

On Wed, Sep 7, 2016 at 2:21 PM, Sergey Volk wrote:
> Actual allocation size is computed as (count + 1)*sizeof(meta_keys), so
> we need to check that (count + 1) won't cause overflow.

From 87a7a2e202ebb63362715054773a89ce1fc71743 Mon Sep 17 00:00:00 2001
From: Sergey Volk
Date: Wed, 7 Sep 2016 14:05:35 -0700
Subject: [PATCH] Fix potential integer overflow in mov_read_keys

Actual allocation size is computed as (count + 1)*sizeof(meta_keys), so we
need to check that (count + 1) won't cause overflow.
---
 libavformat/mov.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavformat/mov.c b/libavformat/mov.c
index f499906..a7595c5 100644
--- a/libavformat/mov.c
+++ b/libavformat/mov.c
@@ -3278,7 +3278,7 @@ static int mov_read_keys(MOVContext *c, AVIOContext *pb, MOVAtom atom)

     avio_skip(pb, 4);
     count = avio_rb32(pb);
-    if (count > UINT_MAX / sizeof(*c->meta_keys)) {
+    if (count > UINT_MAX / sizeof(*c->meta_keys) - 1) {
         av_log(c->fc, AV_LOG_ERROR,
                "The 'keys' atom with the invalid key count: %d\n", count);
         return AVERROR_INVALIDDATA;
--
2.8.0.rc3.226.g39d4020
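The wraparound the updated patch guards against is easy to demonstrate in isolation (both check functions below are illustrative, using a fixed-width uint32_t count as the atom parser does):

```c
#include <assert.h>
#include <stdint.h>

/* Why `count + 1 > UINT_MAX / size` is insufficient: with count == UINT32_MAX
 * the left-hand side wraps to 0 and the check passes, even though
 * (count + 1) * size would overflow. Subtracting 1 on the right-hand side
 * instead keeps all the arithmetic wrap-free. Helper names are illustrative. */
static int overflow_check_v1(uint32_t count, uint32_t size)
{
    return (uint32_t)(count + 1) > UINT32_MAX / size; /* broken for UINT32_MAX */
}

static int overflow_check_v2(uint32_t count, uint32_t size)
{
    return count > UINT32_MAX / size - 1;             /* catches the wrap */
}
```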
[FFmpeg-devel] [PATCH] Use matroska TrackNumber for populating AVStream::id
As far as I can see, FFmpeg currently doesn't set AVStream::id for matroska/webm streams. I think we could use either MatroskaTrack::num (TrackNumber) or MatroskaTrack::uid (TrackUID) for that. I have found a few discussions claiming that TrackUID can be missing, even though TrackUID is marked as a mandatory field in the matroska spec; for example, see
https://github.com/mbunkus/mkvtoolnix/issues/1050
https://lists.w3.org/Archives/Public/public-inbandtracks/2014May/0003.html
So I guess it's safer to use TrackNumber for now.
---
 libavformat/matroskadec.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/libavformat/matroskadec.c b/libavformat/matroskadec.c
index d20568c..8b80df1 100644
--- a/libavformat/matroskadec.c
+++ b/libavformat/matroskadec.c
@@ -1856,6 +1856,8 @@ static int matroska_parse_tracks(AVFormatContext *s)
             return AVERROR(ENOMEM);
         }

+        st->id = (int) track->num;
+
         if (key_id_base64) {
             /* export encryption key id as base64 metadata tag */
             av_dict_set(&st->metadata, "enc_key_id", key_id_base64, 0);
--
2.7.0.rc3.207.g0ac5344
Re: [FFmpeg-devel] [PATCH] Use matroska TrackNumber for populating AVStream::id
Ok, something like this for now, then? I'm new to ffmpeg development. When is the next version bump going to happen?
---
 libavformat/matroskadec.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/libavformat/matroskadec.c b/libavformat/matroskadec.c
index d20568c..4c3e53a 100644
--- a/libavformat/matroskadec.c
+++ b/libavformat/matroskadec.c
@@ -1856,6 +1856,9 @@ static int matroska_parse_tracks(AVFormatContext *s)
             return AVERROR(ENOMEM);
         }

+        if (track->num <= INT_MAX)
+            st->id = (int) track->num;
+
         if (key_id_base64) {
             /* export encryption key id as base64 metadata tag */
             av_dict_set(&st->metadata, "enc_key_id", key_id_base64, 0);
--
2.7.0.rc3.207.g0ac5344

On Thu, Mar 3, 2016 at 2:14 AM, Carl Eugen Hoyos wrote:
> wm4 googlemail.com> writes:
>
> > > +st->id = (int) track->num;
>
> > Might be better after all not to set the id if it's out of range?
>
> Yes, please.
>
> While there, the id field could be changed to 64bit with the
> next version bump.
>
> Carl Eugen
Re: [FFmpeg-devel] [PATCH] Use matroska TrackNumber for populating AVStream::id
Ok, so then I guess we'll need to update AVStream::id to be 64-bit first, and then I'll make the necessary changes in matroskadec. I've prepared a patch that bumps AVStream::id to int64_t in the next major version; I'll send it out shortly. After rebuilding ffmpeg with AVStream::id as int64_t, I got a couple of new warnings in code that was using 32-bit format specifiers to print stream ids; I've fixed those as well. I've also re-run 'make fate' and all the tests pass.

On Sat, Mar 5, 2016 at 2:47 AM, Michael Niedermayer wrote:
> On Fri, Mar 04, 2016 at 04:19:18PM -0800, Sergey Volk wrote:
> > Ok, something like this for now, then?
>
> your original patch contained a nice commit message, this one
> doesnt
>
> > I'm new to ffmpeg development. When is the next version bump going to happen?
>
> you can make changes at the next bump by using #if FF_API...
> see libavfilter/version.h
>
> [...]
> --
> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> It is dangerous to be right in matters on which the established authorities
> are wrong. -- Voltaire
Re: [FFmpeg-devel] [PATCH] Use matroska TrackNumber for populating AVStream::id
From: Sergey Volk
Date: Wed, 9 Mar 2016 14:34:19 -0800
Subject: [PATCH] Change AVStream::id to int64_t in the next version bump

I have also bumped the major version to 58 locally in version.h, re-ran make with the stream id being int64_t, and fixed all new warnings that showed up (only new warnings related to the incorrect format specifier being used for an int64_t value).
---
 ffprobe.c               | 8 +++-
 libavformat/avformat.h  | 4
 libavformat/concatdec.c | 8 ++--
 libavformat/dump.c      | 4
 libavformat/mpegtsenc.c | 7 ++-
 libavformat/version.h   | 3 +++
 6 files changed, 30 insertions(+), 4 deletions(-)

diff --git a/ffprobe.c b/ffprobe.c
index f7b51ad..21eab61 100644
--- a/ffprobe.c
+++ b/ffprobe.c
@@ -2287,7 +2287,13 @@ static int show_stream(WriterContext *w, AVFormatContext *fmt_ctx, int stream_id
         }
     }

-    if (fmt_ctx->iformat->flags & AVFMT_SHOW_IDS) print_fmt("id", "0x%x", stream->id);
+#if FF_API_OLD_INT32_STREAM_ID
+#define STREAM_ID_FORMAT "0x%x"
+#else
+#define STREAM_ID_FORMAT "0x%"PRIx64
+#endif
+    if (fmt_ctx->iformat->flags & AVFMT_SHOW_IDS) print_fmt("id", STREAM_ID_FORMAT, stream->id);
+#undef STREAM_ID_FORMAT
     else print_str_opt("id", "N/A");
     print_q("r_frame_rate", stream->r_frame_rate, '/');
     print_q("avg_frame_rate", stream->avg_frame_rate, '/');
diff --git a/libavformat/avformat.h b/libavformat/avformat.h
index a558f2d..253b293 100644
--- a/libavformat/avformat.h
+++ b/libavformat/avformat.h
@@ -871,7 +871,11 @@ typedef struct AVStream {
      * decoding: set by libavformat
      * encoding: set by the user, replaced by libavformat if left unset
      */
+#if FF_API_OLD_INT32_STREAM_ID
     int id;
+#else
+    int64_t id;
+#endif
     /**
      * Codec context associated with this stream. Allocated and freed by
      * libavformat.
diff --git a/libavformat/concatdec.c b/libavformat/concatdec.c
index e69096f..481c8433 100644
--- a/libavformat/concatdec.c
+++ b/libavformat/concatdec.c
@@ -238,8 +238,12 @@ static int match_streams_exact_id(AVFormatContext *avf)
     for (j = 0; j < avf->nb_streams; j++) {
         if (avf->streams[j]->id == st->id) {
             av_log(avf, AV_LOG_VERBOSE,
-                   "Match slave stream #%d with stream #%d id 0x%x\n",
-                   i, j, st->id);
+#if FF_API_OLD_INT32_STREAM_ID
+                   "Match slave stream #%d with stream #%d id 0x%x\n"
+#else
+                   "Match slave stream #%d with stream #%d id 0x%"PRIx64"\n"
+#endif
+                   , i, j, st->id);
             if ((ret = copy_stream_props(avf->streams[j], st)) < 0)
                 return ret;
             cat->cur_file->streams[i].out_stream_index = j;
diff --git a/libavformat/dump.c b/libavformat/dump.c
index 86bb82d..8b50ec1 100644
--- a/libavformat/dump.c
+++ b/libavformat/dump.c
@@ -453,7 +453,11 @@ static void dump_stream_format(AVFormatContext *ic, int i,
     /* the pid is an important information, so we display it */
     /* XXX: add a generic system */
     if (flags & AVFMT_SHOW_IDS)
+#if FF_API_OLD_INT32_STREAM_ID
         av_log(NULL, AV_LOG_INFO, "[0x%x]", st->id);
+#else
+        av_log(NULL, AV_LOG_INFO, "[0x%"PRIx64"]", st->id);
+#endif
     if (lang)
         av_log(NULL, AV_LOG_INFO, "(%s)", lang->value);
     av_log(NULL, AV_LOG_DEBUG, ", %d, %d/%d", st->codec_info_nb_frames,
diff --git a/libavformat/mpegtsenc.c b/libavformat/mpegtsenc.c
index 68f9867..0244b7f 100644
--- a/libavformat/mpegtsenc.c
+++ b/libavformat/mpegtsenc.c
@@ -833,7 +833,12 @@ static int mpegts_init(AVFormatContext *s)
             ts_st->pid = st->id;
         } else {
             av_log(s, AV_LOG_ERROR,
-                   "Invalid stream id %d, must be less than 8191\n", st->id);
+#if FF_API_OLD_INT32_STREAM_ID
+                   "Invalid stream id %d, must be less than 8191\n",
+#else
+                   "Invalid stream id %"PRId64", must be less than 8191\n",
+#endif
+                   st->id);
             ret = AVERROR(EINVAL);
             goto fail;
         }
diff --git a/libavformat/version.h b/libavformat/version.h
index 7dcce2c..e0ac45a 100644
--- a/libavformat/version.h
+++ b/libavformat/version.h
@@ -74,6 +74,9 @@
 #ifndef FF_API_OLD_OPEN_CALLBACKS
 #define FF_API_OLD_OPEN_CALLBACKS (LIBAVFORMAT_VERSION_MAJOR < 58)
 #endif
+#ifndef FF_API_OLD_INT32_STREAM_ID
+#define FF_API_OLD_INT32_STREAM_ID (LIBAVFORMAT_VERSION_MAJOR < 58)
+#endif
 #ifndef FF_API_R_FRAME_RATE
 #define FF_API_R_FRAME_RATE 1
--
2.7.0.rc3.207.g0ac5344

On Wed, Mar 9, 2016 at 3
Re: [FFmpeg-devel] [PATCH] Use matroska TrackNumber for populating AVStream::id
Yeah, I was using the Gmail web interface; it does that. I'll try attaching the patch file next time.

On Thu, Mar 10, 2016 at 1:23 AM, Moritz Barsnick wrote:
> On Wed, Mar 09, 2016 at 15:56:53 -0800, Sergey Volk wrote:
> > -if (fmt_ctx->iformat->flags & AVFMT_SHOW_IDS) print_fmt("id",
> > "0x%x", stream->id);
> > +#if FF_API_OLD_INT32_STREAM_ID
> > +#define STREAM_ID_FORMAT "0x%x"
> > +#else
> > +#define STREAM_ID_FORMAT "0x%"PRIx64
> > +#endif
> > +if (fmt_ctx->iformat->flags & AVFMT_SHOW_IDS) print_fmt("id",
> > STREAM_ID_FORMAT, stream->id);
> > +#undef STREAM_ID_FORMAT
>
> From pure visual inspection, I believe your patch got broken (wrapped
> lines) by your mailer agent or something along the line.
>
> Moritz
Re: [FFmpeg-devel] [PATCH] Use matroska TrackNumber for populating AVStream::id
Thanks for the comments, I'll update my next patch to take that into account. But first I wanted to discuss your second point (regarding the int64_t/uint64_t choice). I have actually looked at all the places that use AVStream::id (34 files under libavformat/ plus a few more outside it). A few places treat the stream id as unsigned (e.g. libavformat/bink.c, which assigns AVStream::id the result of avio_rl32(pb)). But most places either assign a signed int value to AVStream::id (the int value often comes as an input parameter to some function that actually assigns the stream id) or use negative values for special cases (e.g. libavformat/swfdec.c, libavformat/wtvdec.c). Since most of the code expects AVStream::id to be signed for now, I've decided to make it an int64_t; that makes for a smaller, easier change.

I have also been trying to figure out FFmpeg's code style stance on doing something like 'typedef int64_t StreamId' and then using the StreamId type whenever we deal with stream ids. But that's probably a more C++-style approach, and I'm not sure it's appropriate here (https://ffmpeg.org/developer.html doesn't seem to address this directly). Any opinions on this?

On Thu, Mar 10, 2016 at 12:19 AM, Nicolas George wrote:
> Thanks for the patch.
>
> On decadi 20 ventôse, an CCXXIV, Sergey Volk wrote:
> > I have also bumped the major version to 58 locally in version.h, and
> > re-ran make with the stream id being int64_t and fixed all new
> > warnings that showed up (only saw new warnings related to the
> > incorrect format being used for int64_t value).
>
> Commit messages are usually written in an impersonal form. Remember that
> they will stay. That does not matter much.
>
> > av_log(avf, AV_LOG_VERBOSE,
> > -       "Match slave stream #%d with stream #%d id 0x%x\n",
> > -       i, j, st->id);
> > +#if FF_API_OLD_INT32_STREAM_ID
> > +       "Match slave stream #%d with stream #%d id 0x%x\n"
> > +#else
> > +       "Match slave stream #%d with stream #%d id 0x%"PRIx64"\n"
> > +#endif
> > +       , i, j, st->id);
>
> You could do much simpler by casting the id unconditionally to int64_t:
>
>     /* TODO remove cast after FF_API_OLD_INT32_STREAM_ID removal */
>     av_log(... "0x%"PRIx64"\n", (int64_t)st->id);
>
> (I would put the comment at each place the cast is used, to ease finding all
> the casts that can be removed.)
>
> As a side note, I wonder if uint64_t would not be better than the signed
> variant.
>
> Regards,
>
> --
> Nicolas George
Re: [FFmpeg-devel] Chrome not able to playback aac_he_v2 when remuxed from mpegts to mp4 using the aac_adtstoasc bitstream filter
Looks like it's failing here:
https://code.google.com/p/chromium/codesearch#chromium/src/media/filters/ffmpeg_audio_decoder.cc&l=419

Here is the error message I got from Chrome:

[1:9:0428/101459:VERBOSE2:decoder_selector.cc(195)] InitializeDecoder
[1:9:0428/101459:ERROR:ffmpeg_audio_decoder.cc(421)] Audio configuration specified 2 channels, but FFmpeg thinks the file contains 1 channels

So codec_context_->channels is 1, but Chrome expects it to be 2. And ffprobe confirms that the generated test.mp4 actually has 2 channels:

  Duration: 00:00:10.16, start: 0.00, bitrate: 52 kb/s
    Stream #0:0(und): Audio: aac (HE-AACv2) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 51 kb/s (default)

So we need to figure out why avcodec_open2 set channels=1.

On Thu, Apr 28, 2016 at 4:16 AM, Anders Rein wrote:
> Google Chrome is not able to play back aac_he_v2 streams remuxed from
> mpegts to mp4. To reproduce the problem:
>
> ffmpeg -f lavfi -i 'aevalsrc=sin(2*PI*t*440)[out0]' -t 10 -movflags faststart -c:a libfdk_aac -ac 2 -ar 48000 -profile:a aac_he_v2 -f mpegts tmp.ts
> ffmpeg -i tmp.ts -c copy -bsf:a aac_adtstoasc test.mp4
>
> However, if the audio is encoded directly to mp4 it works fine:
>
> ffmpeg -f lavfi -i 'aevalsrc=sin(2*PI*t*440)[out0]' -t 10 -movflags faststart -c:a libfdk_aac -ac 2 -ar 48000 -profile:a aac_he_v2 -f mp4 test.mp4
>
> It is only with the aac_he_v2 profile that Chrome refuses to play back
> the file. Using the aac_he profile works fine:
>
> ffmpeg -f lavfi -i 'aevalsrc=sin(2*PI*t*440)[out0]' -t 10 -movflags faststart -c:a libfdk_aac -ac 2 -ar 48000 -profile:a aac_he -f mpegts tmp.ts
> ffmpeg -i tmp.ts -c copy -bsf:a aac_adtstoasc test.mp4
>
> There seems to be something wrong with the aac_adtstoasc bitstream filter
> that does not account for aac_he_v2. It might be a bug in Chrome as well,
> since ffplay is able to play back all the variations without any problem,
> but please note that Chrome IS able to play back the file when it is
> encoded directly.
[FFmpeg-devel] [PATCH] libavfilter/af_biquads: warn about clipping only after frame with clipping
---
 libavfilter/af_biquads.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libavfilter/af_biquads.c b/libavfilter/af_biquads.c
index 4953202..79f1b7c 100644
--- a/libavfilter/af_biquads.c
+++ b/libavfilter/af_biquads.c
@@ -420,6 +420,7 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *buf)
     if (s->clippings > 0)
         av_log(ctx, AV_LOG_WARNING, "clipping %d times. Please reduce gain.\n",
                s->clippings);
+    s->clippings = 0;

     if (buf != out_buf)
         av_frame_free(&buf);
--
1.9.1
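The behavioral change is small: the counter now accumulates within one frame and is cleared after the warning, so the message only appears for frames that actually clipped rather than repeating forever once any clipping has occurred. A minimal model (the helper name is mine, not libavfilter code):

```c
#include <assert.h>

/* Model of the change: report clipping accumulated over one frame, then
 * reset the counter so the next frame starts clean. Illustrative only. */
static int warn_and_reset(int *clippings)
{
    int warned = (*clippings > 0); /* the AV_LOG_WARNING would fire here */
    *clippings = 0;                /* the line the patch adds */
    return warned;
}
```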
Re: [FFmpeg-devel] [GSOC] [PATCH] SRCNN filter
> [...] > > +#define OFFSET(x) offsetof(SRCNNContext, x) > > +#define FLAGS AV_OPT_FLAG_FILTERING_PARAM | AV_OPT_FLAG_VIDEO_PARAM > > +static const AVOption srcnn_options[] = { > > +{ "config_file", "path to configuration file with network > parameters", OFFSET(config_file_path), AV_OPT_TYPE_STRING, {.str=NULL}, 0, > 0, FLAGS }, > > +{ NULL } > > +}; > > + > > +AVFILTER_DEFINE_CLASS(srcnn); > > + > > +#define CHECK_FILE(file)if (ferror(file) || feof(file)){ \ > > +av_log(context, AV_LOG_ERROR, "error > reading configuration file\n");\ > > +fclose(file); \ > > +return AVERROR(EIO); \ > > +} > > + > > +#define CHECK_ALLOCATION(conv, file)if > (allocate_and_read_convolution_data(&conv, file)){ \ > > +av_log(context, > AV_LOG_ERROR, "could not allocate memory for convolutions\n"); \ > > +fclose(file); \ > > +return AVERROR(ENOMEM); \ > > +} > > + > > > +static int allocate_and_read_convolution_data(Convolution* conv, FILE* > config_file) > > +{ > > +int32_t kernel_size = conv->output_channels * conv->size * > conv->size * conv->input_channels; > > +conv->kernel = av_malloc(kernel_size * sizeof(double)); > > +if (!conv->kernel){ > > +return 1; > > this should return an AVERROR code for consistency with the rest of > the codebase > Ok. > > +} > > > +fread(conv->kernel, sizeof(double), kernel_size, config_file); > > directly reading data types is not portable, it would for example be > endian specific > and using avio for reading may be better, though fread is as far as iam > concerned also ok > Ok, I understand the problem, but I have not really worked with it before, so I need an advice of how to properly fix it. If I understand correctly, for int32_t I need to check endianness and reverse bytes if necessary. But with doubles it is more complicated. 
Should I write an IEEE 754 converter from binary to double, or maybe I can somehow check for IEEE 754 double support and, depending on it, either stick to the default network weights or just read the bytes and fix the endianness, if IEEE 754 doubles are supported? Or maybe avio provides some utility to deal with this problem? > [...] > > +/** > > + * @file > > + * Default cnn weights for x2 upsampling with srcnn filter. > > + */ > > + > > +/// First convolution kernel > > > +static double conv1_kernel[] = { > > static data should also be const, otherwise it may be changed and could > cause > thread safety issues > Ok, I just wanted to not allocate additional memory in case of using default weights.
Re: [FFmpeg-devel] [GSOC] [PATCH] SRCNN filter
2018-05-07 17:41 GMT+03:00 Pedro Arthur : > 2018-05-07 0:30 GMT-03:00 Steven Liu : > > Hi Sergey, > > > > How should I test this filter? > > I tested it some days ago, the picture gets worse from the 2nd frame. > > input resolution 640x480 to 1280x720; > > > > ffmpeg -i input -vf srcnn output > Hi, > The filter expects the input upscaled by 2x, therefore the proper > command would be > > ffmpeg -i input -vf "scale=2*iw:2*ih, srcnn, scale=1280:720" > > The default filter is trained for 2x upscale, anything different from > that may generate bad results. Hi, Moreover, the filter expects the input upscaled with bicubic upscaling, so for other upscaling algorithms bad results are also possible. Also, other models for x2, x3, x4 upsampling can be specified using the following command: ffmpeg -i input -vf scale=iw*factor:-1,srcnn=path_to_model Configuration files with other models can be found here: https://drive.google.com/drive/folders/1-M9azWTtZ4egf8ndRU7Y_tiGP6QtN-Fp?usp=sharing
[FFmpeg-devel] [PATCH] avformat/cafenc: fixed packet_size calculation
The problem is that the very last packet can be shorter than the default packet_size, so it must be excluded from the packet_size calculation. Fixes #10465 --- libavformat/cafenc.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/libavformat/cafenc.c b/libavformat/cafenc.c index 67be59806c..fcc4838392 100644 --- a/libavformat/cafenc.c +++ b/libavformat/cafenc.c @@ -34,6 +34,8 @@ typedef struct { int size_buffer_size; int size_entries_used; int packets; +int64_t duration; +int64_t last_packet_duration; } CAFContext; static uint32_t codec_flags(enum AVCodecID codec_id) { @@ -238,6 +240,8 @@ static int caf_write_packet(AVFormatContext *s, AVPacket *pkt) pkt_sizes[caf->size_entries_used++] = 128 | top; } pkt_sizes[caf->size_entries_used++] = pkt->size & 127; +caf->duration += pkt->duration; +caf->last_packet_duration = pkt->duration; caf->packets++; } avio_write(s->pb, pkt->data, pkt->size); @@ -259,7 +263,11 @@ static int caf_write_trailer(AVFormatContext *s) if (!par->block_align) { int packet_size = samples_per_packet(par); if (!packet_size) { +if (caf->duration) { +packet_size = (caf->duration - caf->last_packet_duration) / (caf->packets - 1); +} else { packet_size = st->duration / (caf->packets - 1); +} avio_seek(pb, FRAME_SIZE_OFFSET, SEEK_SET); avio_wb32(pb, packet_size); } -- 2.40.1
Re: [FFmpeg-devel] [GSOC] [PATCH] DNN module introduction and SRCNN filter update
2018-05-24 22:52 GMT+03:00 James Almer : > On 5/24/2018 4:24 PM, Sergey Lavrushkin wrote: > > Hello, > > > > This patch introduces DNN inference interface and simple native backend. > > For now implemented backend supports only convolutions with relu > activation > > function, that are sufficient for simple convolutional networks, > > particularly SRCNN. > > SRCNN filter was updated using implemented DNN inference interface and > > native backend. > > > > > > adds_dnn_srcnn.patch > > > > > > From 60247d3deca3c822da0ef8d7390cda08db958830 Mon Sep 17 00:00:00 2001 > > From: Sergey Lavrushkin > > Date: Thu, 24 May 2018 22:05:54 +0300 > > Subject: [PATCH] Adds dnn inference module for simple convolutional > networks. > > Reimplements srcnn filter based on it. > > > > --- > > Changelog | 2 + > > libavfilter/vf_srcnn.c | 300 +++ > > libavfilter/vf_srcnn.h | 855 -- > --- > > libavutil/Makefile | 3 + > > libavutil/dnn_backend_native.c | 382 ++ > > libavutil/dnn_backend_native.h | 40 ++ > > libavutil/dnn_interface.c | 48 +++ > > libavutil/dnn_interface.h | 64 +++ > > libavutil/dnn_srcnn.h | 854 ++ > ++ > > 9 files changed, 1455 insertions(+), 1093 deletions(-) > > delete mode 100644 libavfilter/vf_srcnn.h > > create mode 100644 libavutil/dnn_backend_native.c > > create mode 100644 libavutil/dnn_backend_native.h > > create mode 100644 libavutil/dnn_interface.c > > create mode 100644 libavutil/dnn_interface.h > > create mode 100644 libavutil/dnn_srcnn.h > > With this change you're trying to use libavformat API from libavutil, > which is not ok as the latter must not depend on the former at all. So > if anything, this belongs in libavformat. > > That aside, you're using the ff_ prefix on an installed header, which is > unusable from outside the library that contains the symbol, and the > structs are not using the AV* namespace either. > Does this need to be public API to begin with? 
If it's only going to be > used by one or more filters, then it might as well remain as an internal > module in libavfilter. > Yes, I think it will be used only in libavfilter. I'll move it there then. And should I use the ff_ prefix and the AV* namespace for structs in an internal module in libavfilter? > And you need to indent and prettify the tables a bit. > Do you mean the kernels and biases in dnn_srcnn.h? Their formatting represents their 4D structure a little bit and it is similar to the one that I used for the first srcnn filter version that was successfully pushed before.
Re: [FFmpeg-devel] [GSOC] [PATCH] DNN module introduction and SRCNN filter update
2018-05-29 4:08 GMT+03:00 Pedro Arthur : > 2018-05-28 19:52 GMT-03:00 Sergey Lavrushkin : > > 2018-05-28 9:32 GMT+03:00 Guo, Yejun : > > > >> looks that no tensorflow dependency is introduced, a new model format is > >> created together with some CPU implementation for inference. With this > >> idea, Android Neural Network would be a very good reference, see > >> https://developer.android.google.cn/ndk/guides/neuralnetworks/. It > >> defines how the model is organized, and also provided a CPU optimized > >> inference implementation (within the NNAPI runtime, it is open source). > It > >> is still under development but mature enough to run some popular dnn > models > >> with proper performance. We can absorb some basic design. Anyway, just a > >> reference fyi. (btw, I'm not sure about any IP issue) > >> > > > > The idea was to first introduce something to use when tensorflow is not > > available. Here is another patch, that introduces tensorflow backend. > I think it would be better for reviewing if you send the second patch > in a new email. Then we need to push the first patch, I think. > > > > > >> For this patch, I have two comments. > >> > >> 1. change from "DNNModel* (*load_default_model)(DNNDefaultModel > >> model_type);" to " DNNModel* (*load_builtin_model)(DNNBuiltinModel > >> model_type);" > >> The DNNModule can be invoked by many filters, default model is a good > >> name at the filter level, while built-in model is better within the DNN > >> scope. > >> > >> typedef struct DNNModule{ > >> // Loads model and parameters from given file. Returns NULL if it is > >> not possible. > >> DNNModel* (*load_model)(const char* model_filename); > >> // Loads one of the default models > >> DNNModel* (*load_default_model)(DNNDefaultModel model_type); > >> // Executes model with specified input and output. Returns DNN_ERROR > >> otherwise. > >> DNNReturnType (*execute_model)(const DNNModel* model); > >> // Frees memory allocated for model. 
> >> void (*free_model)(DNNModel** model); > >> } DNNModule; > >> > >> > >> 2. add a new variable 'number' for DNNData/InputParams > >> As a typical DNN concept, the data shape usually is <batch, height, > >> width, channel> or <batch, channel, height, width>; the last component > >> denotes the index that changes the fastest in memory. We can add this > >> concept into the API, and decide to support one layout or the other, or both. > > > > > > I did not add the number of elements in a batch because I thought that we > would > > not feed more than one element at once to a network in an ffmpeg filter. > > But it can be easily added if necessary. > > > > So here is the patch that adds the tensorflow backend together with the previous > patch. > > I forgot to change the include guards from AVUTIL_* to AVFILTER_* in it. > You moved the files from libavutil to libavfilter while it was > proposed to move them to libavformat. Not only there: it was also proposed to move it to libavfilter if it is going to be used only in filters. I do not know if this module is useful anywhere else besides libavfilter.
Re: [FFmpeg-devel] [GSOC] [PATCH] TensorFlow backend introduction for DNN module
2018-06-01 6:09 GMT+03:00 Guo, Yejun : > Did you try to build ffmpeg with TENSORFLOW_BACKEND enabled, and run it > without the TF library? This case is possible when an end user installs a > pre-built package on a machine without the TF library. > > In function init, the logic is to fall back to the cpu path (DNN_NATIVE) if > unable to load the tensorflow backend. While in function ff_get_dnn_module, it > has no chance to 'return NULL'. > I tried to run ffmpeg built with libtensorflow enabled and without the tensorflow library; it didn't start. I got this message: ffmpeg: error while loading shared libraries: libtensorflow.so: cannot open shared object file: No such file or directory Is it even possible to run it without a library that was enabled during configuration? Maybe I need to change something in the configure script? Otherwise there is no point in adding any fallback to DNN_NATIVE, if it just won't start.
Re: [FFmpeg-devel] [GSOC] [PATCH] TensorFlow backend introduction for DNN module
2018-06-02 19:45 GMT+03:00 James Almer : > On 5/31/2018 12:01 PM, Sergey Lavrushkin wrote: > > diff --git a/Changelog b/Changelog > > index df2024fb59..a667fd045d 100644 > > --- a/Changelog > > +++ b/Changelog > > @@ -11,6 +11,7 @@ version : > > - support mbedTLS based TLS > > - DNN inference interface > > - Reimplemented SRCNN filter using DNN inference interface > > +- TensorFlow DNN backend > > This and the two entries you added earlier don't really belong here. > It's enough with the line stating the filter was introduced back in > ffmpeg 4.0 > I should not add any line regarding introduced DNN inference module, that can be usefull for someone writing another filter based on DNN? > > > > > > version 4.0: > > diff --git a/configure b/configure > > index 09ff0c55e2..47e21fec39 100755 > > --- a/configure > > +++ b/configure > > @@ -259,6 +259,7 @@ External library support: > >--enable-libspeexenable Speex de/encoding via libspeex [no] > >--enable-libsrt enable Haivision SRT protocol via libsrt [no] > >--enable-libssh enable SFTP protocol via libssh [no] > > + --enable-libtensorflow enable TensorFlow as a DNN module backend > [no] > > Maybe mention it's for the srcnn filter. > > >--enable-libtesseractenable Tesseract, needed for ocr filter [no] > >--enable-libtheora enable Theora encoding via libtheora [no] > >--enable-libtls enable LibreSSL (via libtls), needed for > https support > > @@ -1713,6 +1714,7 @@ EXTERNAL_LIBRARY_LIST=" > > libspeex > > libsrt > > libssh > > +libtensorflow > > libtesseract > > libtheora > > libtwolame > > @@ -3453,7 +3455,7 @@ avcodec_select="null_bsf" > > avdevice_deps="avformat avcodec avutil" > > avdevice_suggest="libm" > > avfilter_deps="avutil" > > -avfilter_suggest="libm" > > +avfilter_suggest="libm libtensorflow" > > Add instead > > srcnn_filter_suggest="libtensorflow" > > To the corresponding section. > But this DNN inference module can be used for other filters. 
At least, I think that after training more complicated models for super resolution I'll have to add them as separate filters. So, I thought, this module shouldn't be a part of the srcnn filter from the beginning. Or is it better to add *_filter_suggest="libtensorflow" to the configure script and dnn_*.o to the Makefile for every new filter based on this module? > > avformat_deps="avcodec avutil" > > avformat_suggest="libm network zlib" > > avresample_deps="avutil" > > @@ -6055,6 +6057,7 @@ enabled libsoxr && require libsoxr > soxr.h soxr_create -lsoxr > > enabled libssh&& require_pkg_config libssh libssh > libssh/sftp.h sftp_init > > enabled libspeex && require_pkg_config libspeex speex > speex/speex.h speex_decoder_init > > enabled libsrt&& require_pkg_config libsrt "srt >= 1.2.0" > srt/srt.h srt_socket > > +enabled libtensorflow && require libtensorflow tensorflow/c/c_api.h > TF_Version -ltensorflow && add_cflags -DTENSORFLOW_BACKEND > > Superfluous define. Just check for CONFIG_LIBTENSORFLOW instead. > > > enabled libtesseract && require_pkg_config libtesseract tesseract > tesseract/capi.h TessBaseAPICreate > > enabled libtheora && require libtheora theora/theoraenc.h > th_info_init -ltheoraenc -ltheoradec -logg > > enabled libtls&& require_pkg_config libtls libtls tls.h > tls_configure > > diff --git a/libavfilter/Makefile b/libavfilter/Makefile > > index 3201cbeacf..82915e2f75 100644 > > --- a/libavfilter/Makefile > > +++ b/libavfilter/Makefile > > @@ -14,6 +14,7 @@ OBJS = allfilters.o > \ > > buffersrc.o > \ > > dnn_interface.o > \ > > dnn_backend_native.o > \ > > + dnn_backend_tf.o > \ > > See Jan Ekström's patch. Add this to the filter's entry as all these > source files should not be compiled unconditionally. > > > drawutils.o > \ > > fifo.o > \ > > formats.o > \
Re: [FFmpeg-devel] [GSOC] [PATCH] TensorFlow backend introduction for DNN module
2018-06-03 19:57 GMT+03:00 Pedro Arthur : > 2018-05-31 12:01 GMT-03:00 Sergey Lavrushkin : > > Hello, > > > > This patch introduces a TensorFlow backend for the DNN inference module. > > This backend uses TensorFlow binary models and requires the model > > to have the operation named 'x' as an input operation and the operation > > named 'y' as an output operation. Models are executed using > libtensorflow. > > Hi, > > You added the tf model in dnn_srcnn.h, it seems the data is being > duplicated as it already contains the weights as C float arrays. > Is it possible to construct the model graph via the C api and set the > weights using the ones we already have, eliminating the need for > storing the whole tf model? Hi, I think it is possible, but it would require manually creating every operation and specifying each of their attributes and inputs in a certain order specified by the operations' declarations. Here is that model: https://drive.google.com/file/d/1s7bW7QnUfmTaYoMLPdYYTOLujqNgRq0J/view?usp=sharing It is just a lot easier to store the whole model and not construct it manually. Another way I can think of is to pass the weights in placeholders instead of saving them in the model, but that has to be done when the session is already created and not during model loading. Maybe an init operation that assigns variables from values passed through placeholders could be specified during model loading, if that is possible. But is it really crucial not to store the whole tf model? It is not that big.
Re: [FFmpeg-devel] [GSOC] [PATCH] TensorFlow backend introduction for DNN module
> > My concern is when we add more models, currently we have to store 2 > models, one for the "native" implementation and one for the TF > backend. > There is also the case where one wants to update the weights for a > model, it will be necessary to update both the native and TF data. > With duplicated data it is much easier to get inconsistencies between > implementations. > I understand the problem, but I am afraid that manual graph construction can take a lot of time, especially if we add something more complicated than srcnn, and the second approach, passing weights in placeholders, will require adding some logic for it in other parts of the API besides model loading. I am thinking of another way: extract the weights for the native model from this binary tf model, if they are stored there consistently, instead of specifying them as float arrays. But then for each new model we would need to find the offsets of each weight array.
Re: [FFmpeg-devel] [GSOC] [PATCH] TensorFlow backend introduction for DNN module
2018-06-05 17:20 GMT+03:00 James Almer : > On 6/3/2018 3:02 PM, Sergey Lavrushkin wrote: > > diff --git a/libavfilter/vf_srcnn.c b/libavfilter/vf_srcnn.c > > index d6efe9b478..5c5e26b33a 100644 > > --- a/libavfilter/vf_srcnn.c > > +++ b/libavfilter/vf_srcnn.c > > @@ -41,7 +41,6 @@ typedef struct SRCNNContext { > > DNNData input_output; > > } SRCNNContext; > > > > - > > #define OFFSET(x) offsetof(SRCNNContext, x) > > #define FLAGS AV_OPT_FLAG_FILTERING_PARAM | AV_OPT_FLAG_VIDEO_PARAM > > static const AVOption srcnn_options[] = { > > @@ -55,10 +54,19 @@ static av_cold int init(AVFilterContext* context) > > { > > SRCNNContext* srcnn_context = context->priv; > > > > -srcnn_context->dnn_module = ff_get_dnn_module(DNN_NATIVE); > > +srcnn_context->dnn_module = ff_get_dnn_module(DNN_TF); > > This should be a filter AVOption, not hardcoded to one or another. What > if i, for whatever reason, want to use the native backend when i have > libtensorflow enabled? > > > if (!srcnn_context->dnn_module){ > > -av_log(context, AV_LOG_ERROR, "could not create dnn module\n"); > > -return AVERROR(ENOMEM); > > +srcnn_context->dnn_module = ff_get_dnn_module(DNN_NATIVE); > > +if (!srcnn_context->dnn_module){ > > +av_log(context, AV_LOG_ERROR, "could not create dnn > module\n"); > > +return AVERROR(ENOMEM); > > +} > > +else{ > > +av_log(context, AV_LOG_INFO, "using native backend for DNN > inference\n"); > > VERBOSE, not INFO > > > +} > > +} > > +else{ > > +av_log(context, AV_LOG_INFO, "using tensorflow backend for DNN > inference\n"); > > Ditto. > > > } > > if (!srcnn_context->model_filename){ > > av_log(context, AV_LOG_INFO, "model file for network was not > specified, using default network for x2 upsampling\n"); Here is the patch, that fixes described issues. 
From 971e15b4b1e3f2747aa07d0221f99226cba622ac Mon Sep 17 00:00:00 2001 From: Sergey Lavrushkin Date: Wed, 6 Jun 2018 01:44:40 +0300 Subject: [PATCH] libavfilter/vf_srcnn.c: adds DNN module backend AVOption, changes AV_LOG_INFO message to AV_LOG_VERBOSE. --- libavfilter/vf_srcnn.c | 23 +-- 1 file changed, 9 insertions(+), 14 deletions(-) diff --git a/libavfilter/vf_srcnn.c b/libavfilter/vf_srcnn.c index 5c5e26b33a..17e380503e 100644 --- a/libavfilter/vf_srcnn.c +++ b/libavfilter/vf_srcnn.c @@ -36,6 +36,7 @@ typedef struct SRCNNContext { char* model_filename; float* input_output_buf; +DNNBackendType backend_type; DNNModule* dnn_module; DNNModel* model; DNNData input_output; @@ -44,6 +45,9 @@ typedef struct SRCNNContext { #define OFFSET(x) offsetof(SRCNNContext, x) #define FLAGS AV_OPT_FLAG_FILTERING_PARAM | AV_OPT_FLAG_VIDEO_PARAM static const AVOption srcnn_options[] = { +{ "dnn_backend", "DNN backend used for model execution", OFFSET(backend_type), AV_OPT_TYPE_FLAGS, { .i64 = 0 }, 0, 1, FLAGS, "backend" }, +{ "native", "native backend flag", 0, AV_OPT_TYPE_CONST, { .i64 = 0 }, 0, 0, FLAGS, "backend" }, +{ "tensorflow", "tensorflow backend flag", 0, AV_OPT_TYPE_CONST, { .i64 = 1 }, 0, 0, FLAGS, "backend" }, { "model_filename", "path to model file specifying network architecture and its parameters", OFFSET(model_filename), AV_OPT_TYPE_STRING, {.str=NULL}, 0, 0, FLAGS }, { NULL } }; @@ -54,29 +58,20 @@ static av_cold int init(AVFilterContext* context) { SRCNNContext* srcnn_context = context->priv; -srcnn_context->dnn_module = ff_get_dnn_module(DNN_TF); +srcnn_context->dnn_module = ff_get_dnn_module(srcnn_context->backend_type); if (!srcnn_context->dnn_module){ -srcnn_context->dnn_module = ff_get_dnn_module(DNN_NATIVE); -if (!srcnn_context->dnn_module){ -av_log(context, AV_LOG_ERROR, "could not create dnn module\n"); -return AVERROR(ENOMEM); -} -else{ -av_log(context, AV_LOG_INFO, "using native backend for DNN inference\n"); -} -} -else{ -av_log(context, AV_LOG_INFO, 
"using tensorflow backend for DNN inference\n"); +av_log(context, AV_LOG_ERROR, "could not create DNN module for requested backend\n"); +return AVERROR(ENOMEM); } if (!srcnn_context->model_filename){ -av_log(context, AV_LOG_INFO, "model file for network was not specified, using default network for x2 upsampling\n"); +
Re: [FFmpeg-devel] [GSOC] [PATCH] TensorFlow backend introduction for DNN module
2018-06-06 17:22 GMT+03:00 Pedro Arthur : > Hi, > > 2018-06-05 20:23 GMT-03:00 Sergey Lavrushkin : > > Here is the patch, that fixes described issues. > When I try to run (video input), when tf is not enabled in configure it > crashes. > > > $ffmpeg -i in.mp4 -vf srcnn=dnn_backend=tensorflow out.mp4 > > ffmpeg version N-91232-g256386fd3e Copyright (c) 2000-2018 the FFmpeg > developers > built with gcc 7 (Ubuntu 7.3.0-16ubuntu3) > configuration: > libavutil 56. 18.102 / 56. 18.102 > libavcodec 58. 19.105 / 58. 19.105 > libavformat58. 17.100 / 58. 17.100 > libavdevice58. 4.100 / 58. 4.100 > libavfilter 7. 25.100 / 7. 25.100 > libswscale 5. 2.100 / 5. 2.100 > libswresample 3. 2.100 / 3. 2.100 > Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'in.mp4': > Metadata: > major_brand : isom > minor_version : 512 > compatible_brands: isomiso2mp41 > encoder : Lavf58.17.100 > Duration: 00:06:13.70, start: 0.00, bitrate: 5912 kb/s > Stream #0:0(und): Video: mpeg4 (Simple Profile) (mp4v / > 0x7634706D), yuv420p, 1280x720 [SAR 1:1 DAR 16:9], 5777 kb/s, 29.97 > fps, 29.97 tbr, 30k tbn, 30k tbc (default) > Metadata: > handler_name: VideoHandler > Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, > stereo, fltp, 128 kb/s (default) > Metadata: > handler_name: SoundHandler > Stream mapping: > Stream #0:0 -> #0:0 (mpeg4 (native) -> mpeg4 (native)) > Stream #0:1 -> #0:1 (aac (native) -> aac (native)) > Press [q] to stop, [?] for help > free(): invalid pointer > Aborted (core dumped) > > > > When the output is an image, t does not crashes but neither fallback to > native > > > $ffmpeg -i in.jpg -vf srcnn=dnn_backend=tensorflow out.png > > ffmpeg version N-91232-g256386fd3e Copyright (c) 2000-2018 the FFmpeg > developers > built with gcc 7 (Ubuntu 7.3.0-16ubuntu3) > configuration: > libavutil 56. 18.102 / 56. 18.102 > libavcodec 58. 19.105 / 58. 19.105 > libavformat58. 17.100 / 58. 17.100 > libavdevice58. 4.100 / 58. 4.100 > libavfilter 7. 25.100 / 7. 25.100 > libswscale 5. 
2.100 / 5. 2.100 > libswresample 3. 2.100 / 3. 2.100 > Input #0, image2, from 'in.jpg': > Duration: 00:00:00.04, start: 0.00, bitrate: 43469 kb/s > Stream #0:0: Video: mjpeg, yuvj444p(pc, bt470bg/unknown/unknown), > 1192x670 [SAR 1:1 DAR 596:335], 25 tbr, 25 tbn, 25 tbc > Stream mapping: > Stream #0:0 -> #0:0 (mjpeg (native) -> png (native)) > Press [q] to stop, [?] for help > [Parsed_srcnn_0 @ 0x557d3ea55980] could not create DNN module for > requested backend > [AVFilterGraph @ 0x557d3ea102c0] Error initializing filter 'srcnn' > with args 'dnn_backend=tensorflow' > Error reinitializing filters! > Failed to inject frame into filter network: Cannot allocate memory > Error while processing the decoded data for stream #0:0 > Conversion failed! > > > I think you could disable the tensorflow option if it is not enabled in > configure, or fall back to native, either solution is ok for me. I disabled the tensorflow option when ffmpeg is not configured with it. Here is the updated patch. I think the crash occurred due to an improper call to av_freep for dnn_module. Here is also a patch that fixes this bug. From 33c1e08b650f3724c1317f024d716c8234e283b6 Mon Sep 17 00:00:00 2001 From: Sergey Lavrushkin Date: Wed, 6 Jun 2018 01:44:40 +0300 Subject: [PATCH 1/2] libavfilter/vf_srcnn.c: adds DNN module backend AVOption, changes AV_LOG_INFO message to AV_LOG_VERBOSE.
--- libavfilter/vf_srcnn.c | 25 +++-- 1 file changed, 11 insertions(+), 14 deletions(-) diff --git a/libavfilter/vf_srcnn.c b/libavfilter/vf_srcnn.c index 5c5e26b33a..bba54f6780 100644 --- a/libavfilter/vf_srcnn.c +++ b/libavfilter/vf_srcnn.c @@ -36,6 +36,7 @@ typedef struct SRCNNContext { char* model_filename; float* input_output_buf; +DNNBackendType backend_type; DNNModule* dnn_module; DNNModel* model; DNNData input_output; @@ -44,6 +45,11 @@ typedef struct SRCNNContext { #define OFFSET(x) offsetof(SRCNNContext, x) #define FLAGS AV_OPT_FLAG_FILTERING_PARAM | AV_OPT_FLAG_VIDEO_PARAM static const AVOption srcnn_options[] = { +{ "dnn_backend", "DNN backend used for model execution", OFFSET(backend_type), AV_OPT_TYPE_FLAGS, { .i64 = 0 }, 0, 1, FLAGS, "backend" }, +{ "native", "native backend flag", 0, AV_OPT_TYPE_CONST, { .i64 = 0 }, 0, 0, FLAGS, "backend" }, +#if (CONFIG_LIBTENSORFLOW == 1) +{ "tensorflow", "tensorflow backend flag", 0, AV_OPT_TYPE_CONST, { .i
Re: [FFmpeg-devel] [GSOC] [PATCH] On the fly generation of default DNN models and code style fixes
2018-07-28 4:31 GMT+03:00 Michael Niedermayer : > On Fri, Jul 27, 2018 at 08:06:15PM +0300, Sergey Lavrushkin wrote: > > Hello, > > > > The first patch provides on-the-fly generation of default DNN models, > > which eliminates data duplication for model weights. Also, the files with > > internal weights > > were replaced with automatically generated ones for the models I trained. > > Scripts for training and generating these files can be found here: > > https://github.com/HighVoltageRocknRoll/sr > > Later, I will add a description to this repo on how to use it and > benchmark > > results for trained models. > > > > The second patch fixes some code style issues for pointers in the DNN module > > and sr filter. Are there any other code style fixes I should make for > this > > code? > > > It seems the code with these patches produces some warnings: > > In file included from libavfilter/dnn_backend_native.c:27:0: > libavfilter/dnn_srcnn.h:2113:21: warning: ‘srcnn_consts’ defined but not > used [-Wunused-variable] > static const float *srcnn_consts[] = { > ^ > libavfilter/dnn_srcnn.h:2122:24: warning: ‘srcnn_consts_dims’ defined but > not used [-Wunused-variable] > static const long int *srcnn_consts_dims[] = { > ^ > libavfilter/dnn_srcnn.h:2142:20: warning: ‘srcnn_activations’ defined but > not used [-Wunused-variable] > static const char *srcnn_activations[] = { > ^ > In file included from libavfilter/dnn_backend_native.c:28:0: > libavfilter/dnn_espcn.h:5401:21: warning: ‘espcn_consts’ defined but not > used [-Wunused-variable] > static const float *espcn_consts[] = { > ^ > libavfilter/dnn_espcn.h:5410:24: warning: ‘espcn_consts_dims’ defined but > not used [-Wunused-variable] > static const long int *espcn_consts_dims[] = { > ^ > libavfilter/dnn_espcn.h:5432:20: warning: ‘espcn_activations’ defined but > not used [-Wunused-variable] > static const char *espcn_activations[] = { > ^ > Here is the patch that fixes these warnings.
From 37cd7bdf2610e1c3e89210a49e8f5f3832726281 Mon Sep 17 00:00:00 2001 From: Sergey Lavrushkin Date: Sat, 28 Jul 2018 12:55:02 +0300 Subject: [PATCH 3/3] libavfilter: Fixes warnings for unused variables in dnn_srcnn.h, dnn_espcn.h, dnn_backend_tf.c. --- libavfilter/dnn_backend_tf.c | 64 +++- libavfilter/dnn_espcn.h | 37 - libavfilter/dnn_srcnn.h | 35 3 files changed, 63 insertions(+), 73 deletions(-) diff --git a/libavfilter/dnn_backend_tf.c b/libavfilter/dnn_backend_tf.c index 6307c794a5..7a4ad72d27 100644 --- a/libavfilter/dnn_backend_tf.c +++ b/libavfilter/dnn_backend_tf.c @@ -374,9 +374,71 @@ DNNModel *ff_dnn_load_default_model_tf(DNNDefaultModel model_type) TFModel *tf_model = NULL; TF_OperationDescription *op_desc; TF_Operation *op; -TF_Operation *const_ops_buffer[6]; TF_Output input; int64_t input_shape[] = {1, -1, -1, 1}; +const char tanh[] = "Tanh"; +const char sigmoid[] = "Sigmoid"; +const char relu[] = "Relu"; + +const float *srcnn_consts[] = { +srcnn_conv1_kernel, +srcnn_conv1_bias, +srcnn_conv2_kernel, +srcnn_conv2_bias, +srcnn_conv3_kernel, +srcnn_conv3_bias +}; +const long int *srcnn_consts_dims[] = { +srcnn_conv1_kernel_dims, +srcnn_conv1_bias_dims, +srcnn_conv2_kernel_dims, +srcnn_conv2_bias_dims, +srcnn_conv3_kernel_dims, +srcnn_conv3_bias_dims +}; +const int srcnn_consts_dims_len[] = { +4, +1, +4, +1, +4, +1 +}; +const char *srcnn_activations[] = { +relu, +relu, +relu +}; + +const float *espcn_consts[] = { +espcn_conv1_kernel, +espcn_conv1_bias, +espcn_conv2_kernel, +espcn_conv2_bias, +espcn_conv3_kernel, +espcn_conv3_bias +}; +const long int *espcn_consts_dims[] = { +espcn_conv1_kernel_dims, +espcn_conv1_bias_dims, +espcn_conv2_kernel_dims, +espcn_conv2_bias_dims, +espcn_conv3_kernel_dims, +espcn_conv3_bias_dims +}; +const int espcn_consts_dims_len[] = { +4, +1, +4, +1, +4, +1 +}; +const char *espcn_activations[] = { +tanh, +tanh, +sigmoid +}; input.index = 0; diff --git a/libavfilter/dnn_espcn.h b/libavfilter/dnn_espcn.h index 
a0dd61cd0d..9344aa90fe 100644 --- a/libavfilter/dnn_espcn.h +++ b/libavfilter/dnn_espcn.h @@ -5398,41 +5398,4 @@ static const long int espcn_conv3_bias_dims
Re: [FFmpeg-devel] [GSOC] [PATCH] On the fly generation of default DNN models and code style fixes
2018-07-30 2:01 GMT+03:00 Michael Niedermayer : > On Sat, Jul 28, 2018 at 01:00:53PM +0300, Sergey Lavrushkin wrote: > > 2018-07-28 4:31 GMT+03:00 Michael Niedermayer : > > > > > On Fri, Jul 27, 2018 at 08:06:15PM +0300, Sergey Lavrushkin wrote: > > > > Hello, > > > > > > > > The first patch provides on the fly generation of default DNN models, > > > > that eliminates data duplication for model weights. Also, files with > > > > internal weights > > > > were replaced with automatically generated one for models I trained. > > > > Scripts for training and generating these files can be found here: > > > > https://github.com/HighVoltageRocknRoll/sr > > > > Later, I will add a description to this repo on how to use it and > > > benchmark > > > > results for trained models. > > > > > > > > The second patch fixes some code style issues for pointers in DNN > module > > > > and sr filter. Are there any other code style fixes I should make for > > > this > > > > code? > > > > > > > > > It seems the code with these patches produces some warnings: > > > > > > In file included from libavfilter/dnn_backend_native.c:27:0: > > > libavfilter/dnn_srcnn.h:2113:21: warning: ‘srcnn_consts’ defined but > not > > > used [-Wunused-variable] > > > static const float *srcnn_consts[] = { > > > ^ > > > libavfilter/dnn_srcnn.h:2122:24: warning: ‘srcnn_consts_dims’ defined > but > > > not used [-Wunused-variable] > > > static const long int *srcnn_consts_dims[] = { > > > ^ > > > libavfilter/dnn_srcnn.h:2142:20: warning: ‘srcnn_activations’ defined > but > > > not used [-Wunused-variable] > > > static const char *srcnn_activations[] = { > > > ^ > > > In file included from libavfilter/dnn_backend_native.c:28:0: > > > libavfilter/dnn_espcn.h:5401:21: warning: ‘espcn_consts’ defined but > not > > > used [-Wunused-variable] > > > static const float *espcn_consts[] = { > > > ^ > > > libavfilter/dnn_espcn.h:5410:24: warning: ‘espcn_consts_dims’ defined > but > > > not used [-Wunused-variable] > > > 
static const long int *espcn_consts_dims[] = { > > > ^ > > > libavfilter/dnn_espcn.h:5432:20: warning: ‘espcn_activations’ defined > but > > > not used [-Wunused-variable] > > > static const char *espcn_activations[] = { > > > ^ > > > > > > > Here is the patch, that fixes these warnings. > > > dnn_backend_tf.c | 64 ++ > - > > dnn_espcn.h | 37 --- > > dnn_srcnn.h | 35 -- > > 3 files changed, 63 insertions(+), 73 deletions(-) > > 1faef51b86165326a4693c07a203113e2c85f7fb 0003-libavfilter-Fixes- > warnings-for-unused-variables-in-d.patch > > From 37cd7bdf2610e1c3e89210a49e8f5f3832726281 Mon Sep 17 00:00:00 2001 > > From: Sergey Lavrushkin > > Date: Sat, 28 Jul 2018 12:55:02 +0300 > > Subject: [PATCH 3/3] libavfilter: Fixes warnings for unused variables in > > dnn_srcnn.h, dnn_espcn.h, dnn_backend_tf.c. > > > > --- > > libavfilter/dnn_backend_tf.c | 64 ++ > +- > > libavfilter/dnn_espcn.h | 37 - > > libavfilter/dnn_srcnn.h | 35 > > 3 files changed, 63 insertions(+), 73 deletions(-) > > > > diff --git a/libavfilter/dnn_backend_tf.c b/libavfilter/dnn_backend_tf.c > > index 6307c794a5..7a4ad72d27 100644 > > --- a/libavfilter/dnn_backend_tf.c > > +++ b/libavfilter/dnn_backend_tf.c > > @@ -374,9 +374,71 @@ DNNModel *ff_dnn_load_default_model_tf(DNNDefaultModel > model_type) > > TFModel *tf_model = NULL; > > TF_OperationDescription *op_desc; > > TF_Operation *op; > > -TF_Operation *const_ops_buffer[6]; > > TF_Output input; > > int64_t input_shape[] = {1, -1, -1, 1}; > > +const char tanh[] = "Tanh"; > > +const char sigmoid[] = "Sigmoid"; > > +const char relu[] = "Relu"; > > + > > +const float *srcnn_consts[] = { > > +srcnn_conv1_kernel, > > +srcnn_conv1_bias, > > +srcnn_conv2_kernel, > > +srcnn_conv2_bias, > > +srcnn_conv3_kernel, &
[FFmpeg-devel] [PATCH 7/7] libavfilter: Adds proper file descriptions to dnn_srcnn.h and dnn_espcn.h.
--- libavfilter/dnn_espcn.h | 3 ++- libavfilter/dnn_srcnn.h | 3 ++- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/libavfilter/dnn_espcn.h b/libavfilter/dnn_espcn.h index 9344aa90fe..e0013fe1dd 100644 --- a/libavfilter/dnn_espcn.h +++ b/libavfilter/dnn_espcn.h @@ -20,7 +20,8 @@ /** * @file - * Default cnn weights for x2 upscaling with espcn model. + * This file contains CNN weights for ESPCN model (https://arxiv.org/abs/1609.05158), + * auto generated by scripts provided in the repository: https://github.com/HighVoltageRocknRoll/sr.git. */ #ifndef AVFILTER_DNN_ESPCN_H diff --git a/libavfilter/dnn_srcnn.h b/libavfilter/dnn_srcnn.h index 4f5332ce18..8bf563bd62 100644 --- a/libavfilter/dnn_srcnn.h +++ b/libavfilter/dnn_srcnn.h @@ -20,7 +20,8 @@ /** * @file - * Default cnn weights for x2 upscaling with srcnn model. + * This file contains CNN weights for SRCNN model (https://arxiv.org/abs/1501.00092), + * auto generated by scripts provided in the repository: https://github.com/HighVoltageRocknRoll/sr.git. */ #ifndef AVFILTER_DNN_SRCNN_H -- 2.14.1 ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
[FFmpeg-devel] [PATCH 5/7] libavfilter/dnn_backend_tf.c: Fixes ff_dnn_free_model_tf.
--- libavfilter/dnn_backend_tf.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/libavfilter/dnn_backend_tf.c b/libavfilter/dnn_backend_tf.c index 7a4ad72d27..662a2a3c6e 100644 --- a/libavfilter/dnn_backend_tf.c +++ b/libavfilter/dnn_backend_tf.c @@ -570,7 +570,9 @@ void ff_dnn_free_model_tf(DNNModel **model) if (tf_model->input_tensor){ TF_DeleteTensor(tf_model->input_tensor); } -av_freep(&tf_model->output_data->data); +if (tf_model->output_data){ +av_freep(&(tf_model->output_data->data)); +} av_freep(&tf_model); av_freep(model); } -- 2.14.1
[FFmpeg-devel] [PATCH 6/7] libavfilter/vf_sr.c: Removes uint8 -> float and float -> uint8 conversions.
This patch removes conversions, declared inside the sr filter, and uses libswscale inside the filter to perform them for only Y channel of input. The sr filter still has uint formats as input, as it does not use chroma channels in models and these channels are upscaled using libswscale, float formats for input would cause unnecessary conversions during scaling for these channels. --- libavfilter/vf_sr.c | 134 +++- 1 file changed, 48 insertions(+), 86 deletions(-) diff --git a/libavfilter/vf_sr.c b/libavfilter/vf_sr.c index 944a0e28e7..5ad1baa4c0 100644 --- a/libavfilter/vf_sr.c +++ b/libavfilter/vf_sr.c @@ -45,8 +45,8 @@ typedef struct SRContext { DNNModel *model; DNNData input, output; int scale_factor; -struct SwsContext *sws_context; -int sws_slice_h; +struct SwsContext *sws_contexts[3]; +int sws_slice_h, sws_input_linesize, sws_output_linesize; } SRContext; #define OFFSET(x) offsetof(SRContext, x) @@ -95,6 +95,10 @@ static av_cold int init(AVFilterContext *context) return AVERROR(EIO); } +sr_context->sws_contexts[0] = NULL; +sr_context->sws_contexts[1] = NULL; +sr_context->sws_contexts[2] = NULL; + return 0; } @@ -110,6 +114,7 @@ static int query_formats(AVFilterContext *context) av_log(context, AV_LOG_ERROR, "could not create formats list\n"); return AVERROR(ENOMEM); } + return ff_set_common_formats(context, formats_list); } @@ -140,21 +145,31 @@ static int config_props(AVFilterLink *inlink) else{ outlink->h = sr_context->output.height; outlink->w = sr_context->output.width; +sr_context->sws_contexts[1] = sws_getContext(sr_context->input.width, sr_context->input.height, AV_PIX_FMT_GRAY8, + sr_context->input.width, sr_context->input.height, AV_PIX_FMT_GRAYF32, + 0, NULL, NULL, NULL); +sr_context->sws_input_linesize = sr_context->input.width << 2; +sr_context->sws_contexts[2] = sws_getContext(sr_context->output.width, sr_context->output.height, AV_PIX_FMT_GRAYF32, + sr_context->output.width, sr_context->output.height, AV_PIX_FMT_GRAY8, + 0, NULL, NULL, NULL); 
+sr_context->sws_output_linesize = sr_context->output.width << 2; +if (!sr_context->sws_contexts[1] || !sr_context->sws_contexts[2]){ +av_log(context, AV_LOG_ERROR, "could not create SwsContext for conversions\n"); +return AVERROR(ENOMEM); +} switch (sr_context->model_type){ case SRCNN: -sr_context->sws_context = sws_getContext(inlink->w, inlink->h, inlink->format, - outlink->w, outlink->h, outlink->format, SWS_BICUBIC, NULL, NULL, NULL); -if (!sr_context->sws_context){ -av_log(context, AV_LOG_ERROR, "could not create SwsContext\n"); +sr_context->sws_contexts[0] = sws_getContext(inlink->w, inlink->h, inlink->format, + outlink->w, outlink->h, outlink->format, + SWS_BICUBIC, NULL, NULL, NULL); +if (!sr_context->sws_contexts[0]){ +av_log(context, AV_LOG_ERROR, "could not create SwsContext for scaling\n"); return AVERROR(ENOMEM); } sr_context->sws_slice_h = inlink->h; break; case ESPCN: -if (inlink->format == AV_PIX_FMT_GRAY8){ -sr_context->sws_context = NULL; -} -else{ +if (inlink->format != AV_PIX_FMT_GRAY8){ sws_src_h = sr_context->input.height; sws_src_w = sr_context->input.width; sws_dst_h = sr_context->output.height; @@ -184,13 +199,14 @@ static int config_props(AVFilterLink *inlink) sws_dst_w = AV_CEIL_RSHIFT(sws_dst_w, 2); break; default: -av_log(context, AV_LOG_ERROR, "could not create SwsContext for input pixel format"); +av_log(context, AV_LOG_ERROR, "could not create SwsContext for scaling for given input pixel format"); return AVERROR(EIO); } -sr_context->sws_context = sws_getContext(sws_src_w, sws_src_h, AV_PIX_FMT_GRAY8, - sws_dst_w, sws_dst_h, AV_PIX_FMT_GRAY8, SWS_BICUBIC, NULL, NULL, NULL); -if (!sr_context->sws_context){ -av_log(context, AV_LOG_ERROR, "could not create SwsContext\n"); +sr_context->sws_contexts[0] = sws_getContext(sws_src_w, sws_src_h, AV_PIX_FMT_GRAY8, +
[FFmpeg-devel] [PATCH 3/7] libavfilter: Fixes warnings for unused variables in dnn_srcnn.h, dnn_espcn.h, dnn_backend_tf.c.
--- libavfilter/dnn_backend_tf.c | 64 +++- libavfilter/dnn_espcn.h | 37 - libavfilter/dnn_srcnn.h | 35 3 files changed, 63 insertions(+), 73 deletions(-) diff --git a/libavfilter/dnn_backend_tf.c b/libavfilter/dnn_backend_tf.c index 6307c794a5..7a4ad72d27 100644 --- a/libavfilter/dnn_backend_tf.c +++ b/libavfilter/dnn_backend_tf.c @@ -374,9 +374,71 @@ DNNModel *ff_dnn_load_default_model_tf(DNNDefaultModel model_type) TFModel *tf_model = NULL; TF_OperationDescription *op_desc; TF_Operation *op; -TF_Operation *const_ops_buffer[6]; TF_Output input; int64_t input_shape[] = {1, -1, -1, 1}; +const char tanh[] = "Tanh"; +const char sigmoid[] = "Sigmoid"; +const char relu[] = "Relu"; + +const float *srcnn_consts[] = { +srcnn_conv1_kernel, +srcnn_conv1_bias, +srcnn_conv2_kernel, +srcnn_conv2_bias, +srcnn_conv3_kernel, +srcnn_conv3_bias +}; +const long int *srcnn_consts_dims[] = { +srcnn_conv1_kernel_dims, +srcnn_conv1_bias_dims, +srcnn_conv2_kernel_dims, +srcnn_conv2_bias_dims, +srcnn_conv3_kernel_dims, +srcnn_conv3_bias_dims +}; +const int srcnn_consts_dims_len[] = { +4, +1, +4, +1, +4, +1 +}; +const char *srcnn_activations[] = { +relu, +relu, +relu +}; + +const float *espcn_consts[] = { +espcn_conv1_kernel, +espcn_conv1_bias, +espcn_conv2_kernel, +espcn_conv2_bias, +espcn_conv3_kernel, +espcn_conv3_bias +}; +const long int *espcn_consts_dims[] = { +espcn_conv1_kernel_dims, +espcn_conv1_bias_dims, +espcn_conv2_kernel_dims, +espcn_conv2_bias_dims, +espcn_conv3_kernel_dims, +espcn_conv3_bias_dims +}; +const int espcn_consts_dims_len[] = { +4, +1, +4, +1, +4, +1 +}; +const char *espcn_activations[] = { +tanh, +tanh, +sigmoid +}; input.index = 0; diff --git a/libavfilter/dnn_espcn.h b/libavfilter/dnn_espcn.h index a0dd61cd0d..9344aa90fe 100644 --- a/libavfilter/dnn_espcn.h +++ b/libavfilter/dnn_espcn.h @@ -5398,41 +5398,4 @@ static const long int espcn_conv3_bias_dims[] = { 4 }; -static const float *espcn_consts[] = { -espcn_conv1_kernel, -espcn_conv1_bias, 
-espcn_conv2_kernel, -espcn_conv2_bias, -espcn_conv3_kernel, -espcn_conv3_bias -}; - -static const long int *espcn_consts_dims[] = { -espcn_conv1_kernel_dims, -espcn_conv1_bias_dims, -espcn_conv2_kernel_dims, -espcn_conv2_bias_dims, -espcn_conv3_kernel_dims, -espcn_conv3_bias_dims -}; - -static const int espcn_consts_dims_len[] = { -4, -1, -4, -1, -4, -1 -}; - -static const char espcn_tanh[] = "Tanh"; - -static const char espcn_sigmoid[] = "Sigmoid"; - -static const char *espcn_activations[] = { -espcn_tanh, -espcn_tanh, -espcn_sigmoid -}; - #endif diff --git a/libavfilter/dnn_srcnn.h b/libavfilter/dnn_srcnn.h index 26143654b8..4f5332ce18 100644 --- a/libavfilter/dnn_srcnn.h +++ b/libavfilter/dnn_srcnn.h @@ -2110,39 +2110,4 @@ static const long int srcnn_conv3_bias_dims[] = { 1 }; -static const float *srcnn_consts[] = { -srcnn_conv1_kernel, -srcnn_conv1_bias, -srcnn_conv2_kernel, -srcnn_conv2_bias, -srcnn_conv3_kernel, -srcnn_conv3_bias -}; - -static const long int *srcnn_consts_dims[] = { -srcnn_conv1_kernel_dims, -srcnn_conv1_bias_dims, -srcnn_conv2_kernel_dims, -srcnn_conv2_bias_dims, -srcnn_conv3_kernel_dims, -srcnn_conv3_bias_dims -}; - -static const int srcnn_consts_dims_len[] = { -4, -1, -4, -1, -4, -1 -}; - -static const char srcnn_relu[] = "Relu"; - -static const char *srcnn_activations[] = { -srcnn_relu, -srcnn_relu, -srcnn_relu -}; - #endif -- 2.14.1
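[Editor's note: the pattern behind this patch is that `static const` tables defined at file scope in a header are instantiated in every .c file that includes the header, so any translation unit that does not use them gets -Wunused-variable; moving the tables into the one function that consumes them removes the warning. A minimal standalone sketch of that pattern — hypothetical layer names, not the actual FFmpeg tables:]

```c
#include <assert.h>
#include <stddef.h>

/* Before the patch these tables lived in a header at file scope; any
 * including .c file that did not reference them triggered
 * -Wunused-variable. The patch's fix is function-scoped tables. */
static const float conv1_bias[] = {0.1f, 0.2f};
static const float conv2_bias[] = {0.3f};

static size_t total_bias_count(void)
{
    /* Tables scoped to the only function that uses them, mirroring how
     * the patch moves srcnn_consts/espcn_consts into
     * ff_dnn_load_default_model_tf. */
    const float *biases[] = {conv1_bias, conv2_bias};
    const size_t lens[]   = {2, 1};
    size_t total = 0;
    for (size_t i = 0; i < sizeof(lens) / sizeof(lens[0]); i++)
        total += lens[i];
    (void)biases; /* referenced here, so no unused-variable warning */
    return total;
}
```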
[FFmpeg-devel] [GSOC][PATCH 0/7] Improvements for sr filter and DNN module
Hello, These patches address several concerns raised regarding the sr filter and DNN module. I have included three patches that I already sent, but they still have not been properly reviewed. libavfilter: Adds on the fly generation of default DNN models for tensorflow backend instead of storing binary model. libavfilter: Code style fixes for pointers in DNN module and sr filter. libavfilter: Fixes warnings for unused variables in dnn_srcnn.h, dnn_espcn.h, dnn_backend_tf.c. Adds gray floating-point pixel formats. libavfilter/dnn_backend_tf.c: Fixes ff_dnn_free_model_tf. libavfilter/vf_sr.c: Removes uint8 -> float and float -> uint8 conversions. libavfilter: Adds proper file descriptions to dnn_srcnn.h and dnn_espcn.h. libavfilter/dnn_backend_native.c | 96 +- libavfilter/dnn_backend_native.h | 8 +- libavfilter/dnn_backend_tf.c | 396 +- libavfilter/dnn_backend_tf.h | 8 +- libavfilter/dnn_espcn.h | 17947 +++-- libavfilter/dnn_interface.c | 4 +- libavfilter/dnn_interface.h | 16 +- libavfilter/dnn_srcnn.h | 6979 +- libavfilter/vf_sr.c | 194 +- libavutil/pixdesc.c | 22 + libavutil/pixfmt.h | 5 + libswscale/swscale_internal.h | 7 + libswscale/swscale_unscaled.c | 54 +- libswscale/utils.c | 5 +- 14 files changed, 7983 insertions(+), 17758 deletions(-) -- 2.14.1
[FFmpeg-devel] [PATCH 2/7] libavfilter: Code style fixes for pointers in DNN module and sr filter.
--- libavfilter/dnn_backend_native.c | 84 +++--- libavfilter/dnn_backend_native.h | 8 +-- libavfilter/dnn_backend_tf.c | 108 +++ libavfilter/dnn_backend_tf.h | 8 +-- libavfilter/dnn_espcn.h | 6 +-- libavfilter/dnn_interface.c | 4 +- libavfilter/dnn_interface.h | 16 +++--- libavfilter/dnn_srcnn.h | 6 +-- libavfilter/vf_sr.c | 60 +++--- 9 files changed, 150 insertions(+), 150 deletions(-) diff --git a/libavfilter/dnn_backend_native.c b/libavfilter/dnn_backend_native.c index 3e6b86280d..baefea7fcb 100644 --- a/libavfilter/dnn_backend_native.c +++ b/libavfilter/dnn_backend_native.c @@ -34,15 +34,15 @@ typedef enum {RELU, TANH, SIGMOID} ActivationFunc; typedef struct Layer{ LayerType type; -float* output; -void* params; +float *output; +void *params; } Layer; typedef struct ConvolutionalParams{ int32_t input_num, output_num, kernel_size; ActivationFunc activation; -float* kernel; -float* biases; +float *kernel; +float *biases; } ConvolutionalParams; typedef struct InputParams{ @@ -55,16 +55,16 @@ typedef struct DepthToSpaceParams{ // Represents simple feed-forward convolutional network. 
typedef struct ConvolutionalNetwork{ -Layer* layers; +Layer *layers; int32_t layers_num; } ConvolutionalNetwork; -static DNNReturnType set_input_output_native(void* model, DNNData* input, DNNData* output) +static DNNReturnType set_input_output_native(void *model, DNNData *input, DNNData *output) { -ConvolutionalNetwork* network = (ConvolutionalNetwork*)model; -InputParams* input_params; -ConvolutionalParams* conv_params; -DepthToSpaceParams* depth_to_space_params; +ConvolutionalNetwork *network = (ConvolutionalNetwork *)model; +InputParams *input_params; +ConvolutionalParams *conv_params; +DepthToSpaceParams *depth_to_space_params; int cur_width, cur_height, cur_channels; int32_t layer; @@ -72,7 +72,7 @@ static DNNReturnType set_input_output_native(void* model, DNNData* input, DNNDat return DNN_ERROR; } else{ -input_params = (InputParams*)network->layers[0].params; +input_params = (InputParams *)network->layers[0].params; input_params->width = cur_width = input->width; input_params->height = cur_height = input->height; input_params->channels = cur_channels = input->channels; @@ -88,14 +88,14 @@ static DNNReturnType set_input_output_native(void* model, DNNData* input, DNNDat for (layer = 1; layer < network->layers_num; ++layer){ switch (network->layers[layer].type){ case CONV: -conv_params = (ConvolutionalParams*)network->layers[layer].params; +conv_params = (ConvolutionalParams *)network->layers[layer].params; if (conv_params->input_num != cur_channels){ return DNN_ERROR; } cur_channels = conv_params->output_num; break; case DEPTH_TO_SPACE: -depth_to_space_params = (DepthToSpaceParams*)network->layers[layer].params; +depth_to_space_params = (DepthToSpaceParams *)network->layers[layer].params; if (cur_channels % (depth_to_space_params->block_size * depth_to_space_params->block_size) != 0){ return DNN_ERROR; } @@ -127,16 +127,16 @@ static DNNReturnType set_input_output_native(void* model, DNNData* input, DNNDat // 
layers_num,layer_type,layer_parameterss,layer_type,layer_parameters... // For CONV layer: activation_function, input_num, output_num, kernel_size, kernel, biases // For DEPTH_TO_SPACE layer: block_size -DNNModel* ff_dnn_load_model_native(const char* model_filename) +DNNModel *ff_dnn_load_model_native(const char *model_filename) { -DNNModel* model = NULL; -ConvolutionalNetwork* network = NULL; -AVIOContext* model_file_context; +DNNModel *model = NULL; +ConvolutionalNetwork *network = NULL; +AVIOContext *model_file_context; int file_size, dnn_size, kernel_size, i; int32_t layer; LayerType layer_type; -ConvolutionalParams* conv_params; -DepthToSpaceParams* depth_to_space_params; +ConvolutionalParams *conv_params; +DepthToSpaceParams *depth_to_space_params; model = av_malloc(sizeof(DNNModel)); if (!model){ @@ -155,7 +155,7 @@ DNNModel* ff_dnn_load_model_native(const char* model_filename) av_freep(&model); return NULL; } -model->model = (void*)network; +model->model = (void *)network; network->layers_num = 1 + (int32_t)avio_rl32(model_file_context); dnn_size = 4; @@ -251,10 +251,10 @@ DNNModel* ff_dnn_load_model_native(const char* model_filename) return model; } -static int set_up_conv_layer(Layer* layer, const float* kernel, const float* biases, ActivationFunc activation, +stat
[FFmpeg-devel] [PATCH 4/7] Adds gray floating-point pixel formats.
This patch adds two floating-point gray formats to use them in sr filter for conversion with libswscale. I added conversion from uint gray to float and backwards in swscale_unscaled.c, that is enough for sr filter. But for proper format addition, should I add anything else? --- libavutil/pixdesc.c | 22 ++ libavutil/pixfmt.h| 5 libswscale/swscale_internal.h | 7 ++ libswscale/swscale_unscaled.c | 54 +-- libswscale/utils.c| 5 +++- 5 files changed, 90 insertions(+), 3 deletions(-) diff --git a/libavutil/pixdesc.c b/libavutil/pixdesc.c index 96e079584a..7d307d9120 100644 --- a/libavutil/pixdesc.c +++ b/libavutil/pixdesc.c @@ -2198,6 +2198,28 @@ static const AVPixFmtDescriptor av_pix_fmt_descriptors[AV_PIX_FMT_NB] = { .flags = AV_PIX_FMT_FLAG_PLANAR | AV_PIX_FMT_FLAG_ALPHA | AV_PIX_FMT_FLAG_RGB | AV_PIX_FMT_FLAG_FLOAT, }, +[AV_PIX_FMT_GRAYF32BE] = { +.name = "grayf32be", +.nb_components = 1, +.log2_chroma_w = 0, +.log2_chroma_h = 0, +.comp = { +{ 0, 4, 0, 0, 32, 3, 31, 1 }, /* Y */ +}, +.flags = AV_PIX_FMT_FLAG_BE | AV_PIX_FMT_FLAG_FLOAT, +.alias = "yf32be", +}, +[AV_PIX_FMT_GRAYF32LE] = { +.name = "grayf32le", +.nb_components = 1, +.log2_chroma_w = 0, +.log2_chroma_h = 0, +.comp = { +{ 0, 4, 0, 0, 32, 3, 31, 1 }, /* Y */ +}, +.flags = AV_PIX_FMT_FLAG_FLOAT, +.alias = "yf32le", +}, [AV_PIX_FMT_DRM_PRIME] = { .name = "drm_prime", .flags = AV_PIX_FMT_FLAG_HWACCEL, diff --git a/libavutil/pixfmt.h b/libavutil/pixfmt.h index 2b3307845e..aa9a4f60c1 100644 --- a/libavutil/pixfmt.h +++ b/libavutil/pixfmt.h @@ -320,6 +320,9 @@ enum AVPixelFormat { AV_PIX_FMT_GBRAPF32BE, ///< IEEE-754 single precision planar GBRA 4:4:4:4, 128bpp, big-endian AV_PIX_FMT_GBRAPF32LE, ///< IEEE-754 single precision planar GBRA 4:4:4:4, 128bpp, little-endian +AV_PIX_FMT_GRAYF32BE, ///< IEEE-754 single precision Y, 32bpp, big-endian +AV_PIX_FMT_GRAYF32LE, ///< IEEE-754 single precision Y, 32bpp, little-endian + /** * DRM-managed buffers exposed through PRIME buffer sharing. 
* @@ -405,6 +408,8 @@ enum AVPixelFormat { #define AV_PIX_FMT_GBRPF32AV_PIX_FMT_NE(GBRPF32BE, GBRPF32LE) #define AV_PIX_FMT_GBRAPF32 AV_PIX_FMT_NE(GBRAPF32BE, GBRAPF32LE) +#define AV_PIX_FMT_GRAYF32 AV_PIX_FMT_NE(GRAYF32BE, GRAYF32LE) + #define AV_PIX_FMT_YUVA420P9 AV_PIX_FMT_NE(YUVA420P9BE , YUVA420P9LE) #define AV_PIX_FMT_YUVA422P9 AV_PIX_FMT_NE(YUVA422P9BE , YUVA422P9LE) #define AV_PIX_FMT_YUVA444P9 AV_PIX_FMT_NE(YUVA444P9BE , YUVA444P9LE) diff --git a/libswscale/swscale_internal.h b/libswscale/swscale_internal.h index 1703856ab2..4a2cdfe658 100644 --- a/libswscale/swscale_internal.h +++ b/libswscale/swscale_internal.h @@ -764,6 +764,13 @@ static av_always_inline int isAnyRGB(enum AVPixelFormat pix_fmt) pix_fmt == AV_PIX_FMT_MONOBLACK || pix_fmt == AV_PIX_FMT_MONOWHITE; } +static av_always_inline int isFloat(enum AVPixelFormat pix_fmt) +{ +const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(pix_fmt); +av_assert0(desc); +return desc->flags & AV_PIX_FMT_FLAG_FLOAT; +} + static av_always_inline int isALPHA(enum AVPixelFormat pix_fmt) { const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(pix_fmt); diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c index 6480070cbf..f5b4c9be9d 100644 --- a/libswscale/swscale_unscaled.c +++ b/libswscale/swscale_unscaled.c @@ -1467,6 +1467,46 @@ static int yvu9ToYv12Wrapper(SwsContext *c, const uint8_t *src[], return srcSliceH; } +static int uint_y_to_float_y_wrapper(SwsContext *c, const uint8_t *src[], + int srcStride[], int srcSliceY, + int srcSliceH, uint8_t *dst[], int dstStride[]) +{ +int y, x; +int dstStrideFloat = dstStride[0] >> 2;; +const uint8_t *srcPtr = src[0]; +float *dstPtr = (float *)(dst[0] + dstStride[0] * srcSliceY); + +for (y = 0; y < srcSliceH; ++y){ +for (x = 0; x < c->srcW; ++x){ +dstPtr[x] = (float)srcPtr[x] / 255.0f; +} +srcPtr += srcStride[0]; +dstPtr += dstStrideFloat; +} + +return srcSliceH; +} + +static int float_y_to_uint_y_wrapper(SwsContext *c, const uint8_t* src[], + int 
srcStride[], int srcSliceY, + int srcSliceH, uint8_t* dst[], int dstStride[]) +{ +int y, x; +int srcStrideFloat = srcStride[0] >> 2; +const float *srcPtr = (const float *)src[0]; +uint8_t *dstPtr = dst[0] + dstStride[0] * srcSliceY; + +for (y = 0; y < srcSliceH; ++y){ +for (x = 0; x < c->srcW; ++x){ +dstPtr[x] = (uint8_t)(255.0
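[Editor's note: the two wrappers above boil down to a linear mapping between [0,255] and [0.0,1.0]. A standalone sketch of that round trip — plain functions rather than the SwsContext-based wrappers; the `+ 0.5f` rounding in the float-to-uint8 direction is an assumption, since the hunk is truncated before that point:]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Gray8 -> GrayF32: map [0,255] onto [0.0,1.0], as in the patch's
 * uint_y_to_float_y_wrapper. */
static void gray8_to_grayf32(const uint8_t *src, float *dst, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (float)src[i] / 255.0f;
}

/* GrayF32 -> Gray8: scale back and clamp; +0.5f rounds to nearest
 * (assumption -- the quoted hunk is cut off before any rounding term). */
static void grayf32_to_gray8(const float *src, uint8_t *dst, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        float v = 255.0f * src[i] + 0.5f;
        dst[i] = v < 0.0f ? 0 : v > 255.0f ? 255 : (uint8_t)v;
    }
}
```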
Re: [FFmpeg-devel] [PATCH 4/7] Adds gray floating-point pixel formats.
2018-08-03 16:07 GMT+03:00 Michael Niedermayer : > On Thu, Aug 02, 2018 at 09:52:45PM +0300, Sergey Lavrushkin wrote: > > This patch adds two floating-point gray formats to use them in sr filter > for > > conversion with libswscale. I added conversion from uint gray to float > and > > backwards in swscale_unscaled.c, that is enough for sr filter. But for > > proper format addition, should I add anything else? > > > > --- > > libavutil/pixdesc.c | 22 ++ > > libavutil/pixfmt.h | 5 > > libswscale/swscale_internal.h | 7 ++ > > libswscale/swscale_unscaled.c | 54 ++ > +++-- > > libavutil/utils.c | 5 +++- > please split this into a patch for libavutil and one for libswscale > they also need some version.h bump Ok. > also fate tests need an update, (make fate) fails otherwise, the update > should > be part of the patch that causes the failure otherwise In one test for these formats I get: filter-pixfmts-scale grayf32be grayf32le monob f01cb0b623357387827902d9d0963435 I guess it is because I only implemented conversion in swscale_unscaled. What can I do to fix it? Should I implement conversion for scaling, or maybe change something in the test so it would not check these formats (if that is possible)? Anyway, I need to know what changes I should make and where.
> > 5 files changed, 90 insertions(+), 3 deletions(-) > > > > diff --git a/libavutil/pixdesc.c b/libavutil/pixdesc.c > > index 96e079584a..7d307d9120 100644 > > --- a/libavutil/pixdesc.c > > +++ b/libavutil/pixdesc.c > > @@ -2198,6 +2198,28 @@ static const AVPixFmtDescriptor > av_pix_fmt_descriptors[AV_PIX_FMT_NB] = { > > .flags = AV_PIX_FMT_FLAG_PLANAR | AV_PIX_FMT_FLAG_ALPHA | > > AV_PIX_FMT_FLAG_RGB | AV_PIX_FMT_FLAG_FLOAT, > > }, > > +[AV_PIX_FMT_GRAYF32BE] = { > > +.name = "grayf32be", > > +.nb_components = 1, > > +.log2_chroma_w = 0, > > +.log2_chroma_h = 0, > > +.comp = { > > +{ 0, 4, 0, 0, 32, 3, 31, 1 }, /* Y */ > > +}, > > +.flags = AV_PIX_FMT_FLAG_BE | AV_PIX_FMT_FLAG_FLOAT, > > +.alias = "yf32be", > > +}, > > +[AV_PIX_FMT_GRAYF32LE] = { > > +.name = "grayf32le", > > +.nb_components = 1, > > +.log2_chroma_w = 0, > > +.log2_chroma_h = 0, > > +.comp = { > > +{ 0, 4, 0, 0, 32, 3, 31, 1 }, /* Y */ > > +}, > > +.flags = AV_PIX_FMT_FLAG_FLOAT, > > +.alias = "yf32le", > > +}, > > [AV_PIX_FMT_DRM_PRIME] = { > > .name = "drm_prime", > > .flags = AV_PIX_FMT_FLAG_HWACCEL, > > > diff --git a/libavutil/pixfmt.h b/libavutil/pixfmt.h > > index 2b3307845e..aa9a4f60c1 100644 > > --- a/libavutil/pixfmt.h > > +++ b/libavutil/pixfmt.h > > @@ -320,6 +320,9 @@ enum AVPixelFormat { > > AV_PIX_FMT_GBRAPF32BE, ///< IEEE-754 single precision planar GBRA > 4:4:4:4, 128bpp, big-endian > > AV_PIX_FMT_GBRAPF32LE, ///< IEEE-754 single precision planar GBRA > 4:4:4:4, 128bpp, little-endian > > > > +AV_PIX_FMT_GRAYF32BE, ///< IEEE-754 single precision Y, 32bpp, > big-endian > > +AV_PIX_FMT_GRAYF32LE, ///< IEEE-754 single precision Y, 32bpp, > little-endian > > + > > /** > > * DRM-managed buffers exposed through PRIME buffer sharing. > > * > > new enum values can only be added in such a way that no value of an > existing > enum changes. This would change the value of the following enums Ok. 
> @@ -405,6 +408,8 @@ enum AVPixelFormat { > > #define AV_PIX_FMT_GBRPF32AV_PIX_FMT_NE(GBRPF32BE, GBRPF32LE) > > #define AV_PIX_FMT_GBRAPF32 AV_PIX_FMT_NE(GBRAPF32BE, GBRAPF32LE) > > > > +#define AV_PIX_FMT_GRAYF32 AV_PIX_FMT_NE(GRAYF32BE, GRAYF32LE) > > + > > #define AV_PIX_FMT_YUVA420P9 AV_PIX_FMT_NE(YUVA420P9BE , YUVA420P9LE) > > #define AV_PIX_FMT_YUVA422P9 AV_PIX_FMT_NE(YUVA422P9BE , YUVA422P9LE) > > #define AV_PIX_FMT_YUVA444P9 AV_PIX_FMT_NE(YUVA444P9BE , YUVA444P9LE) > > diff --git a/libswscale/swscale_internal.h > b/libswscale/swscale_internal.h > > index 1703856ab2..4a2cdfe658 100644 > > --- a/libswscale/swscale_internal.h > > +++ b/libswscale/swscale_internal.h > > @@ -764,6 +764,13 @@ static av_always_inline int isAnyRGB(enum > AVPixelFormat pix_fmt) > > pix_fmt == AV_PIX_FMT_MONOBLACK || pix
Re: [FFmpeg-devel] [PATCH 4/7] Adds gray floating-point pixel formats.
2018-08-04 0:11 GMT+03:00 Michael Niedermayer : > On Fri, Aug 03, 2018 at 10:33:00PM +0300, Sergey Lavrushkin wrote: > > 2018-08-03 16:07 GMT+03:00 Michael Niedermayer : > > > > > On Thu, Aug 02, 2018 at 09:52:45PM +0300, Sergey Lavrushkin wrote: > > > > This patch adds two floating-point gray formats to use them in sr > filter > > > for > > > > conversion with libswscale. I added conversion from uint gray to > float > > > and > > > > backwards in swscale_unscaled.c, that is enough for sr filter. But > for > > > > proper format addition, should I add anything else? > > > > > > > > --- > > > > libavutil/pixdesc.c | 22 ++ > > > > libavutil/pixfmt.h| 5 > > > > libswscale/swscale_internal.h | 7 ++ > > > > libswscale/swscale_unscaled.c | 54 ++ > > > +++-- > > > > libswscale/utils.c| 5 +++- > > > > > > please split this in a patch or libavutil and one for libswscale > > > they also need some version.h bump > > > > > > > Ok. > > > > also fate tests need an update, (make fate) fails otherwise, the update > > > should > > > be part of the patch that causes the failure otherwise > > > > > > In one test for these formats I get: > > > > filter-pixfmts-scale > > grayf32be grayf32le monob > > f01cb0b623357387827902d9d0963435 > > > > I guess, it is because I only implemented conversion in swscale_unscaled. > > What can I do to fix it? Should I implement conversion for scaling or > maybe > > change something in the test, so it would not check these formats (if it > is > > possible). > > Anyway, I need to know what changes should I do and where. > > well, swscale shouldnt really have formats only half supported > so for any supported format in and out it should work with any > width / height in / out > > Theres a wide range of possibilities how to implement this. > The correct / ideal way is of course to implement a full floating point > path > for scaling along side the integer code. 
> a simpler approach would be to convert from/to float to/from integers and > use > the existing code. (this of course has the disadvantage of losing > precision) > Well, I want to implement the simpler approach, as I still have to finish correcting the sr filter. But I need some explanations regarding what I should add. If I understand correctly, I need to add conversion from float to the ff_sws_init_input_funcs function in libswscale/input.c and conversion to float to the ff_sws_init_output_funcs function in libswscale/output.c. If I am not mistaken, in the first case I need to provide c->lumToYV12, and in the second case yuv2plane1 and yuv2planeX. So, in the first case, to what format should I add conversion, specifically what number of bits per pixel should be used? Looking through the other conversion functions, it seems that uint8 is used in some places and uint16 in others. Is that somehow determined later during scaling? If I am going to convert to uint8 from my float format, should I declare somewhere that I am converting to uint8? And in the second case, I don't completely understand what these two functions are doing, especially the last one with filters. Are they also just simple conversions, or do these functions cover something else? In their descriptions it is written that: * @param src scaled source data, 15 bits for 8-10-bit output, * 19 bits for 16-bit output (in int32_t) * @param dest pointer to the output plane. For >8-bit * output, this is in uint16_t In my case, the output is 32-bit. Does this mean that the float type is basically not supported and I also have to modify something in scaling? If so, what should I add? > [...] > > > +const uint8_t *srcPtr = src[0]; > > > > +float *dstPtr = (float *)(dst[0] + dstStride[0] * srcSliceY); > > > > + > > > > +for (y = 0; y < srcSliceH; ++y){ > > > > +for (x = 0; x < c->srcW; ++x){ > > > > +dstPtr[x] = (float)srcPtr[x] / 255.0f; > > > > > > division is slow.
This should either be a multiplication with the > > > inverse or a LUT with 8bit index changing to float. > > > > > > The faster of them should be used > > > > > > > LUT seems to be faster. Can I place it in SwsContext and initialize it in > > sws_init_context when necessary? > > yes of course > > thanks > >
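[Editor's note: the LUT approach agreed on here can be sketched standalone — a 256-entry 8-bit-to-float table built once at init; the thread proposes storing it in SwsContext, but the struct and function names below are hypothetical:]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the SwsContext field suggested in the thread: one float per
 * possible 8-bit sample, filled once at context init. */
typedef struct GrayFloatCtx {
    float uint2float_lut[256];
} GrayFloatCtx;

static void init_gray_float_ctx(GrayFloatCtx *ctx)
{
    for (int i = 0; i < 256; i++)
        ctx->uint2float_lut[i] = (float)i / 255.0f; /* division only at init */
}

/* Hot loop: one table load per pixel instead of a floating-point divide. */
static void gray8_to_grayf32_lut(const GrayFloatCtx *ctx,
                                 const uint8_t *src, float *dst, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = ctx->uint2float_lut[src[i]];
}
```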
Re: [FFmpeg-devel] [PATCH 2/7] libavfilter: Code style fixes for pointers in DNN module and sr filter.
Updated patch. 2018-08-06 17:55 GMT+03:00 Pedro Arthur : > 2018-08-02 15:52 GMT-03:00 Sergey Lavrushkin : > > --- > > libavfilter/dnn_backend_native.c | 84 +++--- > > libavfilter/dnn_backend_native.h | 8 +-- > > libavfilter/dnn_backend_tf.c | 108 +++--- > - > > libavfilter/dnn_backend_tf.h | 8 +-- > > libavfilter/dnn_espcn.h | 6 +-- > > libavfilter/dnn_interface.c | 4 +- > > libavfilter/dnn_interface.h | 16 +++--- > > libavfilter/dnn_srcnn.h | 6 +-- > > libavfilter/vf_sr.c | 60 +++--- > > 9 files changed, 150 insertions(+), 150 deletions(-) > > > > diff --git a/libavfilter/dnn_backend_native.c b/libavfilter/dnn_backend_ > native.c > > index 3e6b86280d..baefea7fcb 100644 > > --- a/libavfilter/dnn_backend_native.c > > +++ b/libavfilter/dnn_backend_native.c > > @@ -34,15 +34,15 @@ typedef enum {RELU, TANH, SIGMOID} ActivationFunc; > > > > typedef struct Layer{ > > LayerType type; > > -float* output; > > -void* params; > > +float *output; > > +void *params; > > } Layer; > > > > typedef struct ConvolutionalParams{ > > int32_t input_num, output_num, kernel_size; > > ActivationFunc activation; > > -float* kernel; > > -float* biases; > > +float *kernel; > > +float *biases; > > } ConvolutionalParams; > > > > typedef struct InputParams{ > > @@ -55,16 +55,16 @@ typedef struct DepthToSpaceParams{ > > > > // Represents simple feed-forward convolutional network. 
> > typedef struct ConvolutionalNetwork{ > > -Layer* layers; > > +Layer *layers; > > int32_t layers_num; > > } ConvolutionalNetwork; > > > > -static DNNReturnType set_input_output_native(void* model, DNNData* > input, DNNData* output) > > +static DNNReturnType set_input_output_native(void *model, DNNData > *input, DNNData *output) > > { > > -ConvolutionalNetwork* network = (ConvolutionalNetwork*)model; > > -InputParams* input_params; > > -ConvolutionalParams* conv_params; > > -DepthToSpaceParams* depth_to_space_params; > > +ConvolutionalNetwork *network = (ConvolutionalNetwork *)model; > > +InputParams *input_params; > > +ConvolutionalParams *conv_params; > > +DepthToSpaceParams *depth_to_space_params; > > int cur_width, cur_height, cur_channels; > > int32_t layer; > > > > @@ -72,7 +72,7 @@ static DNNReturnType set_input_output_native(void* > model, DNNData* input, DNNDat > > return DNN_ERROR; > > } > > else{ > > -input_params = (InputParams*)network->layers[0].params; > > +input_params = (InputParams *)network->layers[0].params; > > input_params->width = cur_width = input->width; > > input_params->height = cur_height = input->height; > > input_params->channels = cur_channels = input->channels; > > @@ -88,14 +88,14 @@ static DNNReturnType set_input_output_native(void* > model, DNNData* input, DNNDat > > for (layer = 1; layer < network->layers_num; ++layer){ > > switch (network->layers[layer].type){ > > case CONV: > > -conv_params = (ConvolutionalParams*)network- > >layers[layer].params; > > +conv_params = (ConvolutionalParams *)network->layers[layer]. 
> params; > > if (conv_params->input_num != cur_channels){ > > return DNN_ERROR; > > } > > cur_channels = conv_params->output_num; > > break; > > case DEPTH_TO_SPACE: > > -depth_to_space_params = (DepthToSpaceParams*)network-> > layers[layer].params; > > +depth_to_space_params = (DepthToSpaceParams > *)network->layers[layer].params; > > if (cur_channels % (depth_to_space_params->block_size * > depth_to_space_params->block_size) != 0){ > > return DNN_ERROR; > > } > > @@ -127,16 +127,16 @@ static DNNReturnType set_input_output_native(void* > model, DNNData* input, DNNDat > > // layers_num,layer_type,layer_parameterss,layer_type,layer_ > parameters... > > // For CONV layer: activation_function, input_num, output_num, > kernel_size, kernel, biases > > // For DEPTH_TO_SPACE layer: block_size > > -DNNModel* ff_dnn_load_model_native(const char* model_filename) > > +DNNMode
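The quoted set_input_output_native() walks the layer list once, propagating the channel count forward and rejecting any layer whose expectations do not match. The idea can be sketched in self-contained C (a simplified sketch with illustrative types — the real code uses FFmpeg's ConvolutionalNetwork/DNNData structures, returns DNN_ERROR, and also tracks width/height):

```c
#include <assert.h>
#include <stdint.h>

/* Simplified sketch of the channel-count validation performed by the
 * native DNN backend. Types and field names are illustrative only. */
typedef enum { INPUT, CONV, DEPTH_TO_SPACE } LayerType;

typedef struct Layer {
    LayerType type;
    int32_t conv_input_num;  /* CONV: expected input channels */
    int32_t conv_output_num; /* CONV: produced output channels */
    int32_t block_size;      /* DEPTH_TO_SPACE: upscaling block */
} Layer;

/* Walk layers[1..], starting from input_channels; return the final
 * channel count, or -1 if any layer is inconsistent. */
int32_t propagate_channels(const Layer *layers, int32_t layers_num,
                           int32_t input_channels)
{
    int32_t cur = input_channels;
    for (int32_t i = 1; i < layers_num; ++i) {
        switch (layers[i].type) {
        case CONV:
            if (layers[i].conv_input_num != cur)
                return -1;
            cur = layers[i].conv_output_num;
            break;
        case DEPTH_TO_SPACE: {
            int32_t b2 = layers[i].block_size * layers[i].block_size;
            /* depth_to_space folds b*b channels into one spatial block,
             * so the current count must divide evenly */
            if (cur % b2 != 0)
                return -1;
            cur /= b2;
            break;
        }
        default:
            return -1;
        }
    }
    return cur;
}
```

In the real backend the same walk also records each layer's output width and height, so output buffers can be allocated once before execution.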
Re: [FFmpeg-devel] [PATCH 3/7] libavfilter: Fixes warnings for unused variables in dnn_srcnn.h, dnn_espcn.h, dnn_backend_tf.c.
Made variables static. 2018-08-06 21:19 GMT+03:00 Pedro Arthur : > 2018-08-02 15:52 GMT-03:00 Sergey Lavrushkin : > > --- > > libavfilter/dnn_backend_tf.c | 64 ++ > +- > > libavfilter/dnn_espcn.h | 37 - > > libavfilter/dnn_srcnn.h | 35 > > 3 files changed, 63 insertions(+), 73 deletions(-) > > > > diff --git a/libavfilter/dnn_backend_tf.c b/libavfilter/dnn_backend_tf.c > > index 6307c794a5..7a4ad72d27 100644 > > --- a/libavfilter/dnn_backend_tf.c > > +++ b/libavfilter/dnn_backend_tf.c > > @@ -374,9 +374,71 @@ DNNModel *ff_dnn_load_default_model_tf(DNNDefaultModel > model_type) > > TFModel *tf_model = NULL; > > TF_OperationDescription *op_desc; > > TF_Operation *op; > > -TF_Operation *const_ops_buffer[6]; > > TF_Output input; > > int64_t input_shape[] = {1, -1, -1, 1}; > > +const char tanh[] = "Tanh"; > > +const char sigmoid[] = "Sigmoid"; > > +const char relu[] = "Relu"; > > + > > +const float *srcnn_consts[] = { > > +srcnn_conv1_kernel, > > +srcnn_conv1_bias, > > +srcnn_conv2_kernel, > > +srcnn_conv2_bias, > > +srcnn_conv3_kernel, > > +srcnn_conv3_bias > > +}; > > +const long int *srcnn_consts_dims[] = { > > +srcnn_conv1_kernel_dims, > > +srcnn_conv1_bias_dims, > > +srcnn_conv2_kernel_dims, > > +srcnn_conv2_bias_dims, > > +srcnn_conv3_kernel_dims, > > +srcnn_conv3_bias_dims > > +}; > > +const int srcnn_consts_dims_len[] = { > > +4, > > +1, > > +4, > > +1, > > +4, > > +1 > > +}; > > +const char *srcnn_activations[] = { > > +relu, > > +relu, > > +relu > > +}; > > + > > +const float *espcn_consts[] = { > > +espcn_conv1_kernel, > > +espcn_conv1_bias, > > +espcn_conv2_kernel, > > +espcn_conv2_bias, > > +espcn_conv3_kernel, > > +espcn_conv3_bias > > +}; > > +const long int *espcn_consts_dims[] = { > > +espcn_conv1_kernel_dims, > > +espcn_conv1_bias_dims, > > +espcn_conv2_kernel_dims, > > +espcn_conv2_bias_dims, > > +espcn_conv3_kernel_dims, > > +espcn_conv3_bias_dims > > +}; > > +const int espcn_consts_dims_len[] = { > > +4, > > +1, > > +4, > > +1, > > +4, > > +1 > > 
+}; > > +const char *espcn_activations[] = { > > +tanh, > > +tanh, > > +sigmoid > > +}; > > > > input.index = 0; > > > > diff --git a/libavfilter/dnn_espcn.h b/libavfilter/dnn_espcn.h > > index a0dd61cd0d..9344aa90fe 100644 > > --- a/libavfilter/dnn_espcn.h > > +++ b/libavfilter/dnn_espcn.h > > @@ -5398,41 +5398,4 @@ static const long int espcn_conv3_bias_dims[] = { > > 4 > > }; > > > > -static const float *espcn_consts[] = { > > -espcn_conv1_kernel, > > -espcn_conv1_bias, > > -espcn_conv2_kernel, > > -espcn_conv2_bias, > > -espcn_conv3_kernel, > > -espcn_conv3_bias > > -}; > > - > > -static const long int *espcn_consts_dims[] = { > > -espcn_conv1_kernel_dims, > > -espcn_conv1_bias_dims, > > -espcn_conv2_kernel_dims, > > -espcn_conv2_bias_dims, > > -espcn_conv3_kernel_dims, > > -espcn_conv3_bias_dims > > -}; > > - > > -static const int espcn_consts_dims_len[] = { > > -4, > > -1, > > -4, > > -1, > > -4, > > -1 > > -}; > > - > > -static const char espcn_tanh[] = "Tanh"; > > - > > -static const char espcn_sigmoid[] = "Sigmoid"; > > - > > -static const char *espcn_activations[] = { > > -espcn_tanh, > > -espcn_tanh, > > -espcn_sigmoid > > -}; > > - > > #endif > > diff --git a/libavfilter/dnn_srcnn.h b/libavfilter/dnn_srcnn.h > > index 26143654b8..4f5332ce18 100644 > > --- a/libavfilter/dnn_srcnn.h > > +++ b/libavfilter/dnn_srcnn.h > > @@ -2110,39 +2110,4 @@ static const long int srcnn_conv3_bias_dims[] = { > > 1 > > }; > > > > -static const floa
Re: [FFmpeg-devel] [PATCH 4/7] Adds gray floating-point pixel formats.
I split patch to one for libavutil and another for libswscale, also added LUT for unscaled conversion, added conversions for scaling and updated fate tests. From 8bcc10b49c41612b4d6549e64d90acf3f0b3fc6a Mon Sep 17 00:00:00 2001 From: Sergey Lavrushkin Date: Fri, 3 Aug 2018 18:02:49 +0300 Subject: [PATCH 4/9] libavutil: Adds gray floating-point pixel formats. --- libavutil/pixdesc.c | 22 ++ libavutil/pixfmt.h | 5 + libavutil/version.h | 2 +- tests/ref/fate/sws-pixdesc-query | 3 +++ 4 files changed, 31 insertions(+), 1 deletion(-) diff --git a/libavutil/pixdesc.c b/libavutil/pixdesc.c index 96e079584a..970a83214c 100644 --- a/libavutil/pixdesc.c +++ b/libavutil/pixdesc.c @@ -2206,6 +2206,28 @@ static const AVPixFmtDescriptor av_pix_fmt_descriptors[AV_PIX_FMT_NB] = { .name = "opencl", .flags = AV_PIX_FMT_FLAG_HWACCEL, }, +[AV_PIX_FMT_GRAYF32BE] = { +.name = "grayf32be", +.nb_components = 1, +.log2_chroma_w = 0, +.log2_chroma_h = 0, +.comp = { +{ 0, 4, 0, 0, 32, 3, 31, 1 }, /* Y */ +}, +.flags = AV_PIX_FMT_FLAG_BE | AV_PIX_FMT_FLAG_FLOAT, +.alias = "yf32be", +}, +[AV_PIX_FMT_GRAYF32LE] = { +.name = "grayf32le", +.nb_components = 1, +.log2_chroma_w = 0, +.log2_chroma_h = 0, +.comp = { +{ 0, 4, 0, 0, 32, 3, 31, 1 }, /* Y */ +}, +.flags = AV_PIX_FMT_FLAG_FLOAT, +.alias = "yf32le", +}, }; #if FF_API_PLUS1_MINUS1 FF_ENABLE_DEPRECATION_WARNINGS diff --git a/libavutil/pixfmt.h b/libavutil/pixfmt.h index 2b3307845e..7b254732d8 100644 --- a/libavutil/pixfmt.h +++ b/libavutil/pixfmt.h @@ -337,6 +337,9 @@ enum AVPixelFormat { AV_PIX_FMT_GRAY14BE, ///<Y, 14bpp, big-endian AV_PIX_FMT_GRAY14LE, ///<Y, 14bpp, little-endian +AV_PIX_FMT_GRAYF32BE, ///< IEEE-754 single precision Y, 32bpp, big-endian +AV_PIX_FMT_GRAYF32LE, ///< IEEE-754 single precision Y, 32bpp, little-endian + AV_PIX_FMT_NB ///< number of pixel formats, DO NOT USE THIS if you want to link with shared libav* because the number of formats might differ between versions }; @@ -405,6 +408,8 @@ enum AVPixelFormat { #define 
AV_PIX_FMT_GBRPF32AV_PIX_FMT_NE(GBRPF32BE, GBRPF32LE) #define AV_PIX_FMT_GBRAPF32 AV_PIX_FMT_NE(GBRAPF32BE, GBRAPF32LE) +#define AV_PIX_FMT_GRAYF32 AV_PIX_FMT_NE(GRAYF32BE, GRAYF32LE) + #define AV_PIX_FMT_YUVA420P9 AV_PIX_FMT_NE(YUVA420P9BE , YUVA420P9LE) #define AV_PIX_FMT_YUVA422P9 AV_PIX_FMT_NE(YUVA422P9BE , YUVA422P9LE) #define AV_PIX_FMT_YUVA444P9 AV_PIX_FMT_NE(YUVA444P9BE , YUVA444P9LE) diff --git a/libavutil/version.h b/libavutil/version.h index 44bdebdc93..5205c5bc60 100644 --- a/libavutil/version.h +++ b/libavutil/version.h @@ -79,7 +79,7 @@ */ #define LIBAVUTIL_VERSION_MAJOR 56 -#define LIBAVUTIL_VERSION_MINOR 18 +#define LIBAVUTIL_VERSION_MINOR 19 #define LIBAVUTIL_VERSION_MICRO 102 #define LIBAVUTIL_VERSION_INT AV_VERSION_INT(LIBAVUTIL_VERSION_MAJOR, \ diff --git a/tests/ref/fate/sws-pixdesc-query b/tests/ref/fate/sws-pixdesc-query index 8071ec484d..451c7d83b9 100644 --- a/tests/ref/fate/sws-pixdesc-query +++ b/tests/ref/fate/sws-pixdesc-query @@ -126,6 +126,7 @@ isBE: gray14be gray16be gray9be + grayf32be nv20be p010be p016be @@ -412,6 +413,8 @@ Gray: gray16le gray9be gray9le + grayf32be + grayf32le ya16be ya16le ya8 -- 2.14.1 From 35f97f77465bec4344ac7d5a6742388d9c1470cc Mon Sep 17 00:00:00 2001 From: Sergey Lavrushkin Date: Fri, 3 Aug 2018 18:06:50 +0300 Subject: [PATCH 5/9] libswscale: Adds conversions from/to float gray format. 
--- libswscale/input.c | 38 libswscale/output.c | 60 libswscale/ppc/swscale_altivec.c | 1 + libswscale/swscale_internal.h| 9 + libswscale/swscale_unscaled.c| 54 ++-- libswscale/utils.c | 20 ++- libswscale/x86/swscale_template.c| 3 +- tests/ref/fate/filter-pixdesc-grayf32be | 1 + tests/ref/fate/filter-pixdesc-grayf32le | 1 + tests/ref/fate/filter-pixfmts-copy | 2 ++ tests/ref/fate/filter-pixfmts-crop | 2 ++ tests/ref/fate/filter-pixfmts-field | 2 ++ tests/ref/fate/filter-pixfmts-fieldorder | 2 ++ tests/ref/fate/filter-pixfmts-hflip | 2 ++ tests/ref/fate/filter-pixfmts-il | 2 ++ tests/ref/fate/filter-pixfmts-null | 2 ++ tests/ref/fate/filter-pixfmts-scale | 2 ++ tests/ref/fate/filter-pixfmts-transpose | 2 ++ tests/ref/fate/filter-pixfmts-vflip | 2 ++ 19 files changed, 203 insertions(+), 4 deletions(-) create mode 100644 tests/ref/fate/filter-pi
Re: [FFmpeg-devel] [PATCH 5/7] libavfilter/dnn_backend_tf.c: Fixes ff_dnn_free_model_tf.
Updated patch. From 11186187d0b5a4725415a91947f38d5e166e024c Mon Sep 17 00:00:00 2001 From: Sergey Lavrushkin Date: Tue, 31 Jul 2018 18:40:24 +0300 Subject: [PATCH 6/9] libavfilter/dnn_backend_tf.c: Fixes ff_dnn_free_model_tf. --- libavfilter/dnn_backend_tf.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/libavfilter/dnn_backend_tf.c b/libavfilter/dnn_backend_tf.c index bd21137a8a..971a914c67 100644 --- a/libavfilter/dnn_backend_tf.c +++ b/libavfilter/dnn_backend_tf.c @@ -571,7 +571,9 @@ void ff_dnn_free_model_tf(DNNModel **model) if (tf_model->input_tensor){ TF_DeleteTensor(tf_model->input_tensor); } -av_freep(&tf_model->output_data->data); +if (tf_model->output_data){ +av_freep(&(tf_model->output_data->data)); +} av_freep(&tf_model); av_freep(model); } -- 2.14.1 ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
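The guard added here matters because model loading can fail before output_data is allocated; unconditionally dereferencing it in the free path would crash. The pattern can be shown with a minimal, self-contained sketch (freep() below imitates av_freep()'s free-and-NULL contract for illustration; the struct names are hypothetical, not FFmpeg's real ones):

```c
#include <stdlib.h>

/* Illustration of av_freep()'s contract: free the pointed-to buffer
 * and NULL the caller's pointer, so repeated frees are harmless. */
static void freep(void *ptr)
{
    void **p = (void **)ptr;
    free(*p);
    *p = NULL;
}

typedef struct OutputData {
    float *data;
} OutputData;

typedef struct Model {
    OutputData *output_data;
} Model;

void free_model(Model **model)
{
    Model *m = *model;
    /* The fix from the patch: only reach through output_data if it was
     * actually allocated (loading may have failed before that point). */
    if (m->output_data)
        freep(&m->output_data->data);
    freep(&m->output_data); /* free(NULL) is a no-op, so no guard needed */
    freep(model);
}
```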
Re: [FFmpeg-devel] [PATCH 7/7] libavfilter: Adds proper file descriptions to dnn_srcnn.h and dnn_espcn.h.
Updated patch. 2018-08-02 21:52 GMT+03:00 Sergey Lavrushkin : > --- > libavfilter/dnn_espcn.h | 3 ++- > libavfilter/dnn_srcnn.h | 3 ++- > 2 files changed, 4 insertions(+), 2 deletions(-) > > diff --git a/libavfilter/dnn_espcn.h b/libavfilter/dnn_espcn.h > index 9344aa90fe..e0013fe1dd 100644 > --- a/libavfilter/dnn_espcn.h > +++ b/libavfilter/dnn_espcn.h > @@ -20,7 +20,8 @@ > > /** > * @file > - * Default cnn weights for x2 upscaling with espcn model. > + * This file contains CNN weights for ESPCN model ( > https://arxiv.org/abs/1609.05158), > + * auto generated by scripts provided in the repository: > https://github.com/HighVoltageRocknRoll/sr.git. > */ > > #ifndef AVFILTER_DNN_ESPCN_H > diff --git a/libavfilter/dnn_srcnn.h b/libavfilter/dnn_srcnn.h > index 4f5332ce18..8bf563bd62 100644 > --- a/libavfilter/dnn_srcnn.h > +++ b/libavfilter/dnn_srcnn.h > @@ -20,7 +20,8 @@ > > /** > * @file > - * Default cnn weights for x2 upscaling with srcnn model. > + * This file contains CNN weights for SRCNN model ( > https://arxiv.org/abs/1501.00092), > + * auto generated by scripts provided in the repository: > https://github.com/HighVoltageRocknRoll/sr.git. > */ > > #ifndef AVFILTER_DNN_SRCNN_H > -- > 2.14.1 > > From c2060d992664087fcfffa447768a6ad8f5e38623 Mon Sep 17 00:00:00 2001 From: Sergey Lavrushkin Date: Thu, 2 Aug 2018 19:56:23 +0300 Subject: [PATCH 8/9] libavfilter: Adds proper file descriptions to dnn_srcnn.h and dnn_espcn.h. --- libavfilter/dnn_espcn.h | 3 ++- libavfilter/dnn_srcnn.h | 3 ++- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/libavfilter/dnn_espcn.h b/libavfilter/dnn_espcn.h index 9344aa90fe..e0013fe1dd 100644 --- a/libavfilter/dnn_espcn.h +++ b/libavfilter/dnn_espcn.h @@ -20,7 +20,8 @@ /** * @file - * Default cnn weights for x2 upscaling with espcn model. 
+ * This file contains CNN weights for ESPCN model (https://arxiv.org/abs/1609.05158), + * auto generated by scripts provided in the repository: https://github.com/HighVoltageRocknRoll/sr.git. */ #ifndef AVFILTER_DNN_ESPCN_H diff --git a/libavfilter/dnn_srcnn.h b/libavfilter/dnn_srcnn.h index 4f5332ce18..8bf563bd62 100644 --- a/libavfilter/dnn_srcnn.h +++ b/libavfilter/dnn_srcnn.h @@ -20,7 +20,8 @@ /** * @file - * Default cnn weights for x2 upscaling with srcnn model. + * This file contains CNN weights for SRCNN model (https://arxiv.org/abs/1501.00092), + * auto generated by scripts provided in the repository: https://github.com/HighVoltageRocknRoll/sr.git. */ #ifndef AVFILTER_DNN_SRCNN_H -- 2.14.1
[FFmpeg-devel] [PATCH] Documentation for sr filter
From f076c4be5455331958b928fcea6b3dd8da287527 Mon Sep 17 00:00:00 2001 From: Sergey Lavrushkin Date: Fri, 3 Aug 2018 17:24:00 +0300 Subject: [PATCH 9/9] doc/filters.texi: Adds documentation for sr filter. --- doc/filters.texi | 60 1 file changed, 60 insertions(+) diff --git a/doc/filters.texi b/doc/filters.texi index 0b0903e5a7..e2436a24e7 100644 --- a/doc/filters.texi +++ b/doc/filters.texi @@ -15394,6 +15394,66 @@ option may cause flicker since the B-Frames have often larger QP. Default is @code{0} (not enabled). @end table +@section sr + +Scale the input by applying one of the super-resolution methods based on +convolutional neural networks. + +Training scripts as well as scripts for model generation are provided in +the repository @url{https://github.com/HighVoltageRocknRoll/sr.git}. + +The filter accepts the following options: + +@table @option +@item model +Specify what super-resolution model to use. This option accepts the following values: + +@table @samp +@item srcnn +Super-Resolution Convolutional Neural Network model +@url{https://arxiv.org/abs/1501.00092}. + +@item espcn +Efficient Sub-Pixel Convolutional Neural Network model +@url{https://arxiv.org/abs/1609.05158}. + +@end table + +Default value is @samp{srcnn}. + +@item dnn_backend +Specify what DNN backend to use for model loading and execution. This option accepts +the following values: + +@table @samp +@item native +Native implementation of DNN loading and execution. + +@item tensorflow +TensorFlow backend @url{https://www.tensorflow.org/}. To enable this backend you +need to install the TensorFlow for C library (see +@url{https://www.tensorflow.org/install/install_c}) and configure FFmpeg with +@code{--enable-libtensorflow} + +@end table + +Default value is @samp{native}. + +@item scale_factor +Set scale factor for SRCNN model, for which custom model file was provided. +Allowed values are @code{2}, @code{3} and @code{4}. 
Scale factor is neccessary +for SRCNN model, because it accepts input upscaled using bicubic upscaling with +proper scale factor. + +Default value is @code{2}. + +@item model_filename +Set path to model file specifying network architecture and its parameters. +Note that different backends use different file format. If path to model +file is not specified, built-in models for 2x upscaling are used. + +@end table + @anchor{subtitles} +@section subtitles -- 2.14.1
Re: [FFmpeg-devel] [PATCH 6/7] libavfilter/vf_sr.c: Removes uint8 -> float and float -> uint8 conversions.
Updated patch. 2018-08-02 21:52 GMT+03:00 Sergey Lavrushkin : > This patch removes conversions, declared inside the sr filter, and uses > libswscale inside > the filter to perform them for only Y channel of input. The sr filter > still has uint > formats as input, as it does not use chroma channels in models and these > channels are > upscaled using libswscale, float formats for input would cause unnecessary > conversions > during scaling for these channels. > > --- > libavfilter/vf_sr.c | 134 +++--- > -- > 1 file changed, 48 insertions(+), 86 deletions(-) > > diff --git a/libavfilter/vf_sr.c b/libavfilter/vf_sr.c > index 944a0e28e7..5ad1baa4c0 100644 > --- a/libavfilter/vf_sr.c > +++ b/libavfilter/vf_sr.c > @@ -45,8 +45,8 @@ typedef struct SRContext { > DNNModel *model; > DNNData input, output; > int scale_factor; > -struct SwsContext *sws_context; > -int sws_slice_h; > +struct SwsContext *sws_contexts[3]; > +int sws_slice_h, sws_input_linesize, sws_output_linesize; > } SRContext; > > #define OFFSET(x) offsetof(SRContext, x) > @@ -95,6 +95,10 @@ static av_cold int init(AVFilterContext *context) > return AVERROR(EIO); > } > > +sr_context->sws_contexts[0] = NULL; > +sr_context->sws_contexts[1] = NULL; > +sr_context->sws_contexts[2] = NULL; > + > return 0; > } > > @@ -110,6 +114,7 @@ static int query_formats(AVFilterContext *context) > av_log(context, AV_LOG_ERROR, "could not create formats list\n"); > return AVERROR(ENOMEM); > } > + > return ff_set_common_formats(context, formats_list); > } > > @@ -140,21 +145,31 @@ static int config_props(AVFilterLink *inlink) > else{ > outlink->h = sr_context->output.height; > outlink->w = sr_context->output.width; > +sr_context->sws_contexts[1] = sws_getContext(sr_context->input.width, > sr_context->input.height, AV_PIX_FMT_GRAY8, > + > sr_context->input.width, sr_context->input.height, AV_PIX_FMT_GRAYF32, > + 0, NULL, NULL, NULL); > +sr_context->sws_input_linesize = sr_context->input.width << 2; > +sr_context->sws_contexts[2] = 
> sws_getContext(sr_context->output.width, > sr_context->output.height, AV_PIX_FMT_GRAYF32, > + > sr_context->output.width, sr_context->output.height, AV_PIX_FMT_GRAY8, > + 0, NULL, NULL, NULL); > +sr_context->sws_output_linesize = sr_context->output.width << 2; > +if (!sr_context->sws_contexts[1] || !sr_context->sws_contexts[2]){ > +av_log(context, AV_LOG_ERROR, "could not create SwsContext > for conversions\n"); > +return AVERROR(ENOMEM); > +} > switch (sr_context->model_type){ > case SRCNN: > -sr_context->sws_context = sws_getContext(inlink->w, > inlink->h, inlink->format, > - outlink->w, > outlink->h, outlink->format, SWS_BICUBIC, NULL, NULL, NULL); > -if (!sr_context->sws_context){ > -av_log(context, AV_LOG_ERROR, "could not create > SwsContext\n"); > +sr_context->sws_contexts[0] = sws_getContext(inlink->w, > inlink->h, inlink->format, > + outlink->w, > outlink->h, outlink->format, > + SWS_BICUBIC, > NULL, NULL, NULL); > +if (!sr_context->sws_contexts[0]){ > +av_log(context, AV_LOG_ERROR, "could not create > SwsContext for scaling\n"); > return AVERROR(ENOMEM); > } > sr_context->sws_slice_h = inlink->h; > break; > case ESPCN: > -if (inlink->format == AV_PIX_FMT_GRAY8){ > -sr_context->sws_context = NULL; > -} > -else{ > +if (inlink->format != AV_PIX_FMT_GRAY8){ > sws_src_h = sr_context->input.height; > sws_src_w = sr_context->input.width; > sws_dst_h = sr_context->output.height; > @@ -184,13 +199,14 @@ static int config_props(AVFilterLink *inlink) > sws_dst_w = AV_CEIL_RSHIFT(sws_dst_w, 2); > break; > default: > -av_log(context, AV_LOG_ERROR, "could not
Re: [FFmpeg-devel] [PATCH 4/7] Adds gray floating-point pixel formats.
Here are updated patches with fixes. I updated conversion functions, so they should properly work with format for different endianness. 2018-08-08 1:47 GMT+03:00 Michael Niedermayer : > On Tue, Aug 07, 2018 at 12:17:58AM +0300, Sergey Lavrushkin wrote: > > I split patch to one for libavutil and another for libswscale, > > also added LUT for unscaled conversion, added > > conversions for scaling and updated fate tests. > > > libavutil/pixdesc.c | 22 ++ > > libavutil/pixfmt.h |5 + > > libavutil/version.h |2 +- > > tests/ref/fate/sws-pixdesc-query |3 +++ > > 4 files changed, 31 insertions(+), 1 deletion(-) > > b58f328f5d90954c62957f127b1acbfad5795a4d 0004-libavutil-Adds-gray- > floating-point-pixel-formats.patch > > From 8bcc10b49c41612b4d6549e64d90acf3f0b3fc6a Mon Sep 17 00:00:00 2001 > > From: Sergey Lavrushkin > > Date: Fri, 3 Aug 2018 18:02:49 +0300 > > Subject: [PATCH 4/9] libavutil: Adds gray floating-point pixel formats. > > > > --- > > libavutil/pixdesc.c | 22 ++ > > libavutil/pixfmt.h | 5 + > > libavutil/version.h | 2 +- > > tests/ref/fate/sws-pixdesc-query | 3 +++ > > 4 files changed, 31 insertions(+), 1 deletion(-) > > > > diff --git a/libavutil/pixdesc.c b/libavutil/pixdesc.c > > index 96e079584a..970a83214c 100644 > > --- a/libavutil/pixdesc.c > > +++ b/libavutil/pixdesc.c > > @@ -2206,6 +2206,28 @@ static const AVPixFmtDescriptor > av_pix_fmt_descriptors[AV_PIX_FMT_NB] = { > > .name = "opencl", > > .flags = AV_PIX_FMT_FLAG_HWACCEL, > > }, > > +[AV_PIX_FMT_GRAYF32BE] = { > > +.name = "grayf32be", > > +.nb_components = 1, > > +.log2_chroma_w = 0, > > +.log2_chroma_h = 0, > > +.comp = { > > +{ 0, 4, 0, 0, 32, 3, 31, 1 }, /* Y */ > > +}, > > +.flags = AV_PIX_FMT_FLAG_BE | AV_PIX_FMT_FLAG_FLOAT, > > +.alias = "yf32be", > > +}, > > +[AV_PIX_FMT_GRAYF32LE] = { > > +.name = "grayf32le", > > +.nb_components = 1, > > +.log2_chroma_w = 0, > > +.log2_chroma_h = 0, > > +.comp = { > > +{ 0, 4, 0, 0, 32, 3, 31, 1 }, /* Y */ > > +}, > > +.flags = AV_PIX_FMT_FLAG_FLOAT, 
> > +.alias = "yf32le", > > +}, > > }; > > #if FF_API_PLUS1_MINUS1 > > FF_ENABLE_DEPRECATION_WARNINGS > > diff --git a/libavutil/pixfmt.h b/libavutil/pixfmt.h > > index 2b3307845e..7b254732d8 100644 > > --- a/libavutil/pixfmt.h > > +++ b/libavutil/pixfmt.h > > @@ -337,6 +337,9 @@ enum AVPixelFormat { > > AV_PIX_FMT_GRAY14BE, ///<Y, 14bpp, big-endian > > AV_PIX_FMT_GRAY14LE, ///<Y, 14bpp, little-endian > > > > +AV_PIX_FMT_GRAYF32BE, ///< IEEE-754 single precision Y, 32bpp, > big-endian > > +AV_PIX_FMT_GRAYF32LE, ///< IEEE-754 single precision Y, 32bpp, > little-endian > > + > > AV_PIX_FMT_NB ///< number of pixel formats, DO NOT USE THIS > if you want to link with shared libav* because the number of formats might > differ between versions > > }; > > > > @@ -405,6 +408,8 @@ enum AVPixelFormat { > > #define AV_PIX_FMT_GBRPF32AV_PIX_FMT_NE(GBRPF32BE, GBRPF32LE) > > #define AV_PIX_FMT_GBRAPF32 AV_PIX_FMT_NE(GBRAPF32BE, GBRAPF32LE) > > > > +#define AV_PIX_FMT_GRAYF32 AV_PIX_FMT_NE(GRAYF32BE, GRAYF32LE) > > + > > #define AV_PIX_FMT_YUVA420P9 AV_PIX_FMT_NE(YUVA420P9BE , YUVA420P9LE) > > #define AV_PIX_FMT_YUVA422P9 AV_PIX_FMT_NE(YUVA422P9BE , YUVA422P9LE) > > #define AV_PIX_FMT_YUVA444P9 AV_PIX_FMT_NE(YUVA444P9BE , YUVA444P9LE) > > diff --git a/libavutil/version.h b/libavutil/version.h > > index 44bdebdc93..5205c5bc60 100644 > > --- a/libavutil/version.h > > +++ b/libavutil/version.h > > > @@ -79,7 +79,7 @@ > > */ > > > > #define LIBAVUTIL_VERSION_MAJOR 56 > > -#define LIBAVUTIL_VERSION_MINOR 18 > > +#define LIBAVUTIL_VERSION_MINOR 19 > > #define LIBAVUTIL_VERSION_MICRO 102 > > a bump to minor must reset micro to 100 > > [...] > > -- > Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB > > In a rich man's house there is no place to spit but his face. > -- Diogenes of Sinope > > ___ > ffmp
Re: [FFmpeg-devel] [PATCH] Documentation for sr filter
2018-08-07 13:14 GMT+03:00 Moritz Barsnick : > On Tue, Aug 07, 2018 at 00:24:29 +0300, Sergey Lavrushkin wrote: > > +@table @option > > +@item model > > +Specify what super-resolution model to use. This option accepts the > following values: >^ nit: which > > > +Specify what DNN backend to use for model loading and execution. This > option accepts > Ditto > > > +Allowed values are @code{2}, @code{3} and @code{4}. Scale factor is > neccessary >^ > necessary > > > +Note that different backends use different file format. If path to model >^ formats > Here is updated patch. From 1fed1ea07a5727d937228307bffbde13e6727669 Mon Sep 17 00:00:00 2001 From: Sergey Lavrushkin Date: Fri, 3 Aug 2018 17:24:00 +0300 Subject: [PATCH 9/9] doc/filters.texi: Adds documentation for sr filter. --- doc/filters.texi | 60 1 file changed, 60 insertions(+) diff --git a/doc/filters.texi b/doc/filters.texi index 0b0903e5a7..e2436a24e7 100644 --- a/doc/filters.texi +++ b/doc/filters.texi @@ -15394,6 +15394,66 @@ option may cause flicker since the B-Frames have often larger QP. Default is @code{0} (not enabled). @end table +@section sr + +Scale the input by applying one of the super-resolution methods based on +convolutional neural networks. + +Training scripts as well as scripts for model generation are provided in +the repository @url{https://github.com/HighVoltageRocknRoll/sr.git}. + +The filter accepts the following options: + +@table @option +@item model +Specify what super-resolution model to use. This option accepts the following values: + +@table @samp +@item srcnn +Super-Resolution Convolutional Neural Network model +@url{https://arxiv.org/abs/1501.00092}. + +@item espcn +Efficient Sub-Pixel Convolutional Neural Network model +@url{https://arxiv.org/abs/1609.05158}. + +@end table + +Default value is @samp{srcnn}. + +@item dnn_backend +Specify what DNN backend to use for model loading and execution. 
This option accepts +the following values: + +@table @samp +@item native +Native implementation of DNN loading and execution. + +@item tensorflow +TensorFlow backend @url{https://www.tensorflow.org/}. To enable this backend you +need to install the TensorFlow for C library (see +@url{https://www.tensorflow.org/install/install_c}) and configure FFmpeg with +@code{--enable-libtensorflow} + +@end table + +Default value is @samp{native}. + +@item scale_factor +Set scale factor for SRCNN model, for which custom model file was provided. +Allowed values are @code{2}, @code{3} and @code{4}. Scale factor is neccessary +for SRCNN model, because it accepts input upscaled using bicubic upscaling with +proper scale factor. + +Default value is @code{2}. + +@item model_filename +Set path to model file specifying network architecture and its parameters. +Note that different backends use different file format. If path to model +file is not specified, built-in models for 2x upscaling are used. + +@end table + @anchor{subtitles} +@section subtitles -- 2.14.1
Re: [FFmpeg-devel] [PATCH] Documentation for sr filter
Sorry, I accidentally sent previous patch, here is updated version. From 99afeefe4add5b932140388f48ec4111734aa593 Mon Sep 17 00:00:00 2001 From: Sergey Lavrushkin Date: Fri, 3 Aug 2018 17:24:00 +0300 Subject: [PATCH 9/9] doc/filters.texi: Adds documentation for sr filter. --- doc/filters.texi | 60 1 file changed, 60 insertions(+) diff --git a/doc/filters.texi b/doc/filters.texi index 0b0903e5a7..9995ca532b 100644 --- a/doc/filters.texi +++ b/doc/filters.texi @@ -15394,6 +15394,66 @@ option may cause flicker since the B-Frames have often larger QP. Default is @code{0} (not enabled). @end table +@section sr + +Scale the input by applying one of the super-resolution methods based on +convolutional neural networks. + +Training scripts as well as scripts for model generation are provided in +the repository @url{https://github.com/HighVoltageRocknRoll/sr.git}. + +The filter accepts the following options: + +@table @option +@item model +Specify which super-resolution model to use. This option accepts the following values: + +@table @samp +@item srcnn +Super-Resolution Convolutional Neural Network model +@url{https://arxiv.org/abs/1501.00092}. + +@item espcn +Efficient Sub-Pixel Convolutional Neural Network model +@url{https://arxiv.org/abs/1609.05158}. + +@end table + +Default value is @samp{srcnn}. + +@item dnn_backend +Specify which DNN backend to use for model loading and execution. This option accepts +the following values: + +@table @samp +@item native +Native implementation of DNN loading and execution. + +@item tensorflow +TensorFlow backend @url{https://www.tensorflow.org/}. To enable this backend you +need to install the TensorFlow for C library (see +@url{https://www.tensorflow.org/install/install_c}) and configure FFmpeg with +@code{--enable-libtensorflow} + +@end table + +Default value is @samp{native}. + +@item scale_factor +Set scale factor for SRCNN model, for which custom model file was provided. +Allowed values are @code{2}, @code{3} and @code{4}. 
Scale factor is necessary +for SRCNN model, because it accepts input upscaled using bicubic upscaling with +proper scale factor. + +Default value is @code{2}. + +@item model_filename +Set path to model file specifying network architecture and its parameters. +Note that different backends use different file formats. If path to model +file is not specified, built-in models for 2x upscaling are used. + +@end table + @anchor{subtitles} +@section subtitles -- 2.14.1
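The options documented above combine into ordinary -vf invocations. A hedged sketch of usage (paths and filenames are hypothetical; the TensorFlow line assumes an FFmpeg build configured with --enable-libtensorflow):

```shell
# Built-in ESPCN weights (2x) via the TensorFlow backend:
ffmpeg -i input.mp4 -vf sr=model=espcn:dnn_backend=tensorflow out2x.mp4

# SRCNN with a custom 3x model file; scale_factor must match the model,
# since SRCNN expects input already upscaled with bicubic interpolation
# ("model3x.model" is a hypothetical path):
ffmpeg -i input.mp4 \
    -vf sr=model=srcnn:scale_factor=3:model_filename=model3x.model \
    out3x.mp4
```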
Re: [FFmpeg-devel] [PATCH 4/7] Adds gray floating-point pixel formats.
2018-08-10 20:24 GMT+03:00 Michael Niedermayer : > On Thu, Aug 09, 2018 at 08:15:16PM +0300, Sergey Lavrushkin wrote: > > Here are updated patches with fixes. I updated conversion functions, so > > they should > > properly work with format for different endianness. > [...] > > diff --git a/libswscale/input.c b/libswscale/input.c > > index 3fd3a5d81e..0e016d387f 100644 > > --- a/libswscale/input.c > > +++ b/libswscale/input.c > > @@ -942,6 +942,30 @@ static av_always_inline void > planar_rgb16_to_uv(uint8_t *_dstU, uint8_t *_dstV, > > } > > #undef rdpx > > > > +static av_always_inline void grayf32ToY16_c(uint8_t *_dst, const > uint8_t *_src, const uint8_t *unused1, > > +const uint8_t *unused2, int > width, uint32_t *unused) > > +{ > > +int i; > > +const float *src = (const float *)_src; > > +uint16_t *dst= (uint16_t *)_dst; > > + > > +for (i = 0; i < width; ++i){ > > +dst[i] = lrintf(65535.0f * FFMIN(FFMAX(src[i], 0.0f), 1.0f)); > > +} > > +} > > is it faster to clip the float before lrintf() than the integer afterwards > ? > Clipping integers is faster, switched to it. > [...] > > diff --git a/libswscale/output.c b/libswscale/output.c > > index 0af2fffea4..cd408fb285 100644 > > --- a/libswscale/output.c > > +++ b/libswscale/output.c > > @@ -208,6 +208,121 @@ static void yuv2p016cX_c(SwsContext *c, const > int16_t *chrFilter, int chrFilterS > > } > > } > > > > +static av_always_inline void > > +yuv2plane1_float_c_template(const int32_t *src, float *dest, int dstW) > > +{ > > +#if HAVE_BIGENDIAN > > +static const int big_endian = 1; > > +#else > > +static const int big_endian = 0; > > +#endif > > you can use HAVE_BIGENDIAN in place of big_endian > its either 0 or 1 already > or static const int big_endian = HAVE_BIGENDIAN > Ok. Here is updated patch. From cf523bcb50537abbf6daf0eb799341d8b706d366 Mon Sep 17 00:00:00 2001 From: Sergey Lavrushkin Date: Fri, 3 Aug 2018 18:06:50 +0300 Subject: [PATCH 5/9] libswscale: Adds conversions from/to float gray format. 
--- libswscale/input.c | 38 +++ libswscale/output.c | 105 +++ libswscale/ppc/swscale_altivec.c | 1 + libswscale/swscale_internal.h| 9 +++ libswscale/swscale_unscaled.c| 54 +++- libswscale/utils.c | 20 +- libswscale/x86/swscale_template.c| 3 +- tests/ref/fate/filter-pixdesc-grayf32be | 1 + tests/ref/fate/filter-pixdesc-grayf32le | 1 + tests/ref/fate/filter-pixfmts-copy | 2 + tests/ref/fate/filter-pixfmts-crop | 2 + tests/ref/fate/filter-pixfmts-field | 2 + tests/ref/fate/filter-pixfmts-fieldorder | 2 + tests/ref/fate/filter-pixfmts-hflip | 2 + tests/ref/fate/filter-pixfmts-il | 2 + tests/ref/fate/filter-pixfmts-null | 2 + tests/ref/fate/filter-pixfmts-scale | 2 + tests/ref/fate/filter-pixfmts-transpose | 2 + tests/ref/fate/filter-pixfmts-vflip | 2 + 19 files changed, 248 insertions(+), 4 deletions(-) create mode 100644 tests/ref/fate/filter-pixdesc-grayf32be create mode 100644 tests/ref/fate/filter-pixdesc-grayf32le diff --git a/libswscale/input.c b/libswscale/input.c index 3fd3a5d81e..7e45df50ce 100644 --- a/libswscale/input.c +++ b/libswscale/input.c @@ -942,6 +942,30 @@ static av_always_inline void planar_rgb16_to_uv(uint8_t *_dstU, uint8_t *_dstV, } #undef rdpx +static av_always_inline void grayf32ToY16_c(uint8_t *_dst, const uint8_t *_src, const uint8_t *unused1, +const uint8_t *unused2, int width, uint32_t *unused) +{ +int i; +const float *src = (const float *)_src; +uint16_t *dst= (uint16_t *)_dst; + +for (i = 0; i < width; ++i){ +dst[i] = FFMIN(FFMAX(lrintf(65535.0f * src[i]), 0), 65535); +} +} + +static av_always_inline void grayf32ToY16_bswap_c(uint8_t *_dst, const uint8_t *_src, const uint8_t *unused1, + const uint8_t *unused2, int width, uint32_t *unused) +{ +int i; +const uint32_t *src = (const uint32_t *)_src; +uint16_t *dst= (uint16_t *)_dst; + +for (i = 0; i < width; ++i){ +dst[i] = FFMIN(FFMAX(lrintf(65535.0f * av_int2float(av_bswap32(src[i]))), 0.0f), 65535); +} +} + #define rgb9plus_planar_funcs_endian(nbits, endian_name, endian)\ static void 
planar_rgb##nbits##endian_name##_to_y(uint8_t *dst, const uint8_t *src[4], \ int w, int32_t *rgb2yuv)
Re: [FFmpeg-devel] [PATCH 4/7] Adds gray floating-point pixel formats.
2018-08-12 0:45 GMT+03:00 Michael Niedermayer : > On Sat, Aug 11, 2018 at 05:52:32PM +0300, Sergey Lavrushkin wrote: > > 2018-08-10 20:24 GMT+03:00 Michael Niedermayer : > > > > > On Thu, Aug 09, 2018 at 08:15:16PM +0300, Sergey Lavrushkin wrote: > > > > Here are updated patches with fixes. I updated conversion functions, > so > > > > they should > > > > properly work with format for different endianness. > > > [...] > > > > diff --git a/libswscale/input.c b/libswscale/input.c > > > > index 3fd3a5d81e..0e016d387f 100644 > > > > --- a/libswscale/input.c > > > > +++ b/libswscale/input.c > > > > @@ -942,6 +942,30 @@ static av_always_inline void > > > planar_rgb16_to_uv(uint8_t *_dstU, uint8_t *_dstV, > > > > } > > > > #undef rdpx > > > > > > > > +static av_always_inline void grayf32ToY16_c(uint8_t *_dst, const > > > uint8_t *_src, const uint8_t *unused1, > > > > +const uint8_t *unused2, > int > > > width, uint32_t *unused) > > > > +{ > > > > +int i; > > > > +const float *src = (const float *)_src; > > > > +uint16_t *dst= (uint16_t *)_dst; > > > > + > > > > +for (i = 0; i < width; ++i){ > > > > +dst[i] = lrintf(65535.0f * FFMIN(FFMAX(src[i], 0.0f), > 1.0f)); > > > > +} > > > > +} > > > > > > is it faster to clip the float before lrintf() than the integer > afterwards > > > ? > > > > > > > Clipping integers is faster, switched to it. > > > > > > > [...] 
> > > > diff --git a/libswscale/output.c b/libswscale/output.c > > > > index 0af2fffea4..cd408fb285 100644 > > > > --- a/libswscale/output.c > > > > +++ b/libswscale/output.c > > > > @@ -208,6 +208,121 @@ static void yuv2p016cX_c(SwsContext *c, const > > > int16_t *chrFilter, int chrFilterS > > > > } > > > > } > > > > > > > > +static av_always_inline void > > > > +yuv2plane1_float_c_template(const int32_t *src, float *dest, int > dstW) > > > > +{ > > > > +#if HAVE_BIGENDIAN > > > > +static const int big_endian = 1; > > > > +#else > > > > +static const int big_endian = 0; > > > > +#endif > > > > > > you can use HAVE_BIGENDIAN in place of big_endian > > > its either 0 or 1 already > > > or static const int big_endian = HAVE_BIGENDIAN > > > > > > > Ok. > > > > Here is updated patch. > > > libswscale/input.c | 38 +++ > > libswscale/output.c | 105 > +++ > > libswscale/ppc/swscale_altivec.c |1 > > libswscale/swscale_internal.h|9 ++ > > libswscale/swscale_unscaled.c| 54 +++ > > libswscale/utils.c | 20 + > > libswscale/x86/swscale_template.c|3 > > tests/ref/fate/filter-pixdesc-grayf32be |1 > > tests/ref/fate/filter-pixdesc-grayf32le |1 > > tests/ref/fate/filter-pixfmts-copy |2 > > tests/ref/fate/filter-pixfmts-crop |2 > > tests/ref/fate/filter-pixfmts-field |2 > > tests/ref/fate/filter-pixfmts-fieldorder |2 > > tests/ref/fate/filter-pixfmts-hflip |2 > > tests/ref/fate/filter-pixfmts-il |2 > > tests/ref/fate/filter-pixfmts-null |2 > > tests/ref/fate/filter-pixfmts-scale |2 > > tests/ref/fate/filter-pixfmts-transpose |2 > > tests/ref/fate/filter-pixfmts-vflip |2 > > 19 files changed, 248 insertions(+), 4 deletions(-) > > db401051d0e42132f7ce76cb78de584951be704b 0005-libswscale-Adds- > conversions-from-to-float-gray-forma.patch > > From cf523bcb50537abbf6daf0eb799341d8b706d366 Mon Sep 17 00:00:00 2001 > > From: Sergey Lavrushkin > > Date: Fri, 3 Aug 2018 18:06:50 +0300 > > Subject: [PATCH 5/9] libswscale: Adds conversions from/to float gray > format. 
> > > > --- > > libswscale/input.c | 38 +++ > > libswscale/output.c | 105 > +++ > > libswscale/ppc/swscale_altivec.c | 1 + > > libswscale/swscale_internal.h| 9 +++ > > libswscale/swscale_unscaled.c| 54 +++- > > libswscale/utils.c
Re: [FFmpeg-devel] [PATCH 4/7] Adds gray floating-point pixel formats.
> > Just use av_clipf instead of FFMIN/FFMAX. Changed FFMIN/FFMAX to av_clip_uint16 and av_clip_uint8. From 210e497d76328947fdf424b169728fa728cc18f2 Mon Sep 17 00:00:00 2001 From: Sergey Lavrushkin Date: Fri, 3 Aug 2018 18:06:50 +0300 Subject: [PATCH 5/9] libswscale: Adds conversions from/to float gray format. --- libswscale/input.c | 38 +++ libswscale/output.c | 105 +++ libswscale/ppc/swscale_altivec.c | 1 + libswscale/swscale_internal.h| 9 +++ libswscale/swscale_unscaled.c| 54 +++- libswscale/utils.c | 20 +- libswscale/x86/swscale_template.c| 3 +- tests/ref/fate/filter-pixdesc-grayf32be | 1 + tests/ref/fate/filter-pixdesc-grayf32le | 1 + tests/ref/fate/filter-pixfmts-copy | 2 + tests/ref/fate/filter-pixfmts-crop | 2 + tests/ref/fate/filter-pixfmts-field | 2 + tests/ref/fate/filter-pixfmts-fieldorder | 2 + tests/ref/fate/filter-pixfmts-hflip | 2 + tests/ref/fate/filter-pixfmts-il | 2 + tests/ref/fate/filter-pixfmts-null | 2 + tests/ref/fate/filter-pixfmts-scale | 2 + tests/ref/fate/filter-pixfmts-transpose | 2 + tests/ref/fate/filter-pixfmts-vflip | 2 + 19 files changed, 248 insertions(+), 4 deletions(-) create mode 100644 tests/ref/fate/filter-pixdesc-grayf32be create mode 100644 tests/ref/fate/filter-pixdesc-grayf32le diff --git a/libswscale/input.c b/libswscale/input.c index 3fd3a5d81e..4099c19c2b 100644 --- a/libswscale/input.c +++ b/libswscale/input.c @@ -942,6 +942,30 @@ static av_always_inline void planar_rgb16_to_uv(uint8_t *_dstU, uint8_t *_dstV, } #undef rdpx +static av_always_inline void grayf32ToY16_c(uint8_t *_dst, const uint8_t *_src, const uint8_t *unused1, +const uint8_t *unused2, int width, uint32_t *unused) +{ +int i; +const float *src = (const float *)_src; +uint16_t *dst= (uint16_t *)_dst; + +for (i = 0; i < width; ++i){ +dst[i] = av_clip_uint16(lrintf(65535.0f * src[i])); +} +} + +static av_always_inline void grayf32ToY16_bswap_c(uint8_t *_dst, const uint8_t *_src, const uint8_t *unused1, + const uint8_t *unused2, int width, uint32_t *unused) 
+{ +int i; +const uint32_t *src = (const uint32_t *)_src; +uint16_t *dst= (uint16_t *)_dst; + +for (i = 0; i < width; ++i){ +dst[i] = av_clip_uint16(lrintf(65535.0f * av_int2float(av_bswap32(src[i])))); +} +} + #define rgb9plus_planar_funcs_endian(nbits, endian_name, endian)\ static void planar_rgb##nbits##endian_name##_to_y(uint8_t *dst, const uint8_t *src[4], \ int w, int32_t *rgb2yuv) \ @@ -1538,6 +1562,20 @@ av_cold void ff_sws_init_input_funcs(SwsContext *c) case AV_PIX_FMT_P010BE: c->lumToYV12 = p010BEToY_c; break; +case AV_PIX_FMT_GRAYF32LE: +#if HAVE_BIGENDIAN +c->lumToYV12 = grayf32ToY16_bswap_c; +#else +c->lumToYV12 = grayf32ToY16_c; +#endif +break; +case AV_PIX_FMT_GRAYF32BE: +#if HAVE_BIGENDIAN +c->lumToYV12 = grayf32ToY16_c; +#else +c->lumToYV12 = grayf32ToY16_bswap_c; +#endif +break; } if (c->needAlpha) { if (is16BPS(srcFormat) || isNBPS(srcFormat)) { diff --git a/libswscale/output.c b/libswscale/output.c index 0af2fffea4..de8637aa3b 100644 --- a/libswscale/output.c +++ b/libswscale/output.c @@ -208,6 +208,105 @@ static void yuv2p016cX_c(SwsContext *c, const int16_t *chrFilter, int chrFilterS } } +static av_always_inline void +yuv2plane1_float_c_template(const int32_t *src, float *dest, int dstW) +{ +static const int big_endian = HAVE_BIGENDIAN; +static const int shift = 3; +static const float float_mult = 1.0f / 65535.0f; +int i, val; +uint16_t val_uint; + +for (i = 0; i < dstW; ++i){ +val = src[i] + (1 << (shift - 1)); +output_pixel(&val_uint, val, 0, uint); +dest[i] = float_mult * (float)val_uint; +} +} + +static av_always_inline void +yuv2plane1_float_bswap_c_template(const int32_t *src, uint32_t *dest, int dstW) +{ +static const int big_endian = HAVE_BIGENDIAN; +static const int shift = 3; +static const float float_mult = 1.0f / 65535.0f; +int i, val; +uint16_t val_uint; + +for (i = 0; i < dstW; ++i){ +val = src[i] + (1 << (shift - 1)); +output_pixel(&val_uint, val, 0, uint); +dest[i] = av_bswap32(av_float2int(float_mult * (float)val_uint)); +} +} + 
+static av_always_inline void +yuv2planeX_float_c_template(const int16_t *filter, int filterSize, const int32_t **src, +float *dest, int ds
Re: [FFmpeg-devel] [PATCH 6/7] libavfilter/vf_sr.c: Removes uint8 -> float and float -> uint8 conversions.
2018-08-15 1:49 GMT+03:00 Marton Balint : > > On Tue, 14 Aug 2018, Pedro Arthur wrote: > > 2018-08-14 15:45 GMT-03:00 Rostislav Pehlivanov : >> >>> On Thu, 2 Aug 2018 at 20:00, Sergey Lavrushkin >>> wrote: >>> >>> This patch removes conversions, declared inside the sr filter, and uses >>>> libswscale inside >>>> the filter to perform them for only Y channel of input. The sr filter >>>> still has uint >>>> formats as input, as it does not use chroma channels in models and these >>>> channels are >>>> upscaled using libswscale, float formats for input would cause >>>> unnecessary >>>> conversions >>>> during scaling for these channels. >>>> >>>> > [...] > > You are planning to remove *all* conversion still, right? Its still >>> unacceptable that there *are* conversions. >>> >> >> They are here because it is the most efficient way to do it. The >> filter works only on luminance channel therefore we only apply >> conversion to Y channel, and bicubic upscale to chrominance. >> I can't see how one can achieve the same result, without doing useless >> computations, if not in this way. >> > > Is there a reason why only the luminance channel is scaled this way? Can't > you also train scaling chroma planes the same way? This way you could > really eliminate the internal calls to swscale. If the user prefers to > scale only one channel, he can always split the planes and scale them > separately (using different filters) and then merge them. > If it is possible, I can then change sr filter to work only for Y channel. Can you give me some examples of how to split the planes, filter them separately and merge them back? ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
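For the question at the end of the message above: splitting, filtering, and re-merging planes can already be expressed with the extractplanes and mergeplanes filters. The command below is only a sketch of Marton's suggestion — the file names are placeholders, plain scale stands in for the per-plane filters, and a yuv420p input is assumed:

```sh
ffmpeg -i in.mp4 -filter_complex \
  "extractplanes=y+u+v[y][u][v]; \
   [y]scale=iw*2:ih*2[ys]; \
   [u]scale=iw*2:ih*2[us]; \
   [v]scale=iw*2:ih*2[vs]; \
   [ys][us][vs]mergeplanes=0x001020:yuv420p" out.mp4
```

A per-plane sr filter would then replace scale on whichever branches it supports, with no internal swscale calls needed.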
[FFmpeg-devel] [PATCH 1/2] doc/filters.texi: Adds documentation for sr filter.
Resending patch with documentation for sr filter. --- doc/filters.texi | 60 1 file changed, 60 insertions(+) diff --git a/doc/filters.texi b/doc/filters.texi index 267bd04a43..b2a74cb1ce 100644 --- a/doc/filters.texi +++ b/doc/filters.texi @@ -15403,6 +15403,66 @@ option may cause flicker since the B-Frames have often larger QP. Default is @code{0} (not enabled). @end table +@section sr + +Scale the input by applying one of the super-resolution methods based on +convolutional neural networks. + +Training scripts as well as scripts for model generation are provided in +the repository @url{https://github.com/HighVoltageRocknRoll/sr.git}. + +The filter accepts the following options: + +@table @option +@item model +Specify which super-resolution model to use. This option accepts the following values: + +@table @samp +@item srcnn +Super-Resolution Convolutional Neural Network model +@url{https://arxiv.org/abs/1501.00092}. + +@item espcn +Efficient Sub-Pixel Convolutional Neural Network model +@url{https://arxiv.org/abs/1609.05158}. + +@end table + +Default value is @samp{srcnn}. + +@item dnn_backend +Specify which DNN backend to use for model loading and execution. This option accepts +the following values: + +@table @samp +@item native +Native implementation of DNN loading and execution. + +@item tensorflow +TensorFlow backend @url{https://www.tensorflow.org/}. To enable this backend you +need to install the TensorFlow for C library (see +@url{https://www.tensorflow.org/install/install_c}) and configure FFmpeg with +@code{--enable-libtensorflow} + +@end table + +Default value is @samp{native}. + +@item scale_factor +Set scale factor for SRCNN model, for which custom model file was provided. +Allowed values are @code{2}, @code{3} and @code{4}. Scale factor is necessary +for SRCNN model, because it accepts input upscaled using bicubic upscaling with +proper scale factor. + +Default value is @code{2}. 
+ +@item model_filename +Set path to model file specifying network architecture and its parameters. +Note that different backends use different file formats. If path to model +file is not specified, built-in models for 2x upscaling are used. + +@end table + @anchor{subtitles} @section subtitles -- 2.14.1 ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
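Going by the options documented in the patch above, invocations could look as follows (a sketch only — input/output file names and the model path are placeholders):

```sh
# Built-in 2x SRCNN model on the native backend
ffmpeg -i input.mp4 -vf "sr=model=srcnn:dnn_backend=native:scale_factor=2" output.mp4

# Custom ESPCN model file on the TensorFlow backend
ffmpeg -i input.mp4 -vf "sr=model=espcn:dnn_backend=tensorflow:model_filename=espcn_model.pb" output.mp4
```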
Re: [FFmpeg-devel] [PATCH 1/2] doc/filters.texi: Adds documentation for sr filter.
2018-08-15 19:59 GMT+03:00 Gyan Doshi : > > > On 15-08-2018 10:05 PM, Sergey Lavrushkin wrote: > >> Resending patch with documentation for sr filter. >> > > LGTM. Will apply with some small changes. > > I've merged the docs entry in the 2nd part, so remove it from there. > This entry corresponds to changes made in the second patch; without those changes it would not be accurate. ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 4/7] Adds gray floating-point pixel formats.
Fri, 17 Aug 2018, 6:47 James Almer : > On 8/14/2018 1:23 PM, Michael Niedermayer wrote: > > On Mon, Aug 13, 2018 at 04:58:42PM +0300, Sergey Lavrushkin wrote: > >>> > >>> Just use av_clipf instead of FFMIN/FFMAX. > >> > >> > >> Changed FFMIN/FFMAX to av_clip_uint16 and av_clip_uint8. > > > > will apply > > > > thanks > > This broke fate on some 32bit hosts. Guess float pixfmts shouldn't be > tested for bitexact output. The gbrpf32 ones aren't, for example. > > http://fate.ffmpeg.org/report.cgi?time=20180816131312&slot=x86_32-debian-kfreebsd-gcc-4.4-cpuflags-mmx If I am not mistaken, the gbrpf32 formats are not supported in libswscale and are not tested because of that. > > Was a float gray pixfmt needed for this filter? Gray16 was not an option? > All calculations in the neural network are done using floats. What can I do to fix this issue? Can I get a VM image for this host to test it? > ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 2/2] libavfilter: Removes stored DNN models. Adds support for native backend model file format in tf backend.
2018-08-17 17:46 GMT+03:00 Pedro Arthur : > Hi, > > You did not provide any pre-trained model files, so anyone trying to > test it has to perform the whole training! > I'm attaching the models I generated, if anyone is interested in testing > it. > > When applying the filter with the tf backend there are artifacts in the > borders, for both srcnn and espcn (out_[srcnn|espcn]_tf.jpg). > It seems that a few lines in the top row of the image are repeated for > espcn using the native backend (out_srcnn_nt.jpg). > I guess it is because I didn't add any padding to the image, and tf fills borders with 0 for 'SAME' padding in convolutions. I'll add the required padding size calculation and insert a padding operation into the graph. > The model/model_filename options are not coherent; the model type > should be defined in the file anyway, therefore there is no need for > both options. > It is also buggy: if you specify the model_filename but not the model > type, it will default to srcnn even if the model file is for espcn; no > error is generated and the output ofc is buggy. > I think I can remove the model type option and check whether the model changes the input size; all my switches on model type actually depend on this condition. If I remove the conversions inside the filter and make it work on only one plane, it basically becomes a filter that executes a neural network on one-channel input. But there is a problem with the float format - it breaks fate on some 32-bit hosts, as James stated, and I first need to fix this issue or, otherwise, revert to doing conversions in the filter. > I personally would prefer to use only model=file as it is shorter than > model_filename=file. ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 4/7] Adds gray floating-point pixel formats.
2018-08-17 23:28 GMT+03:00 Michael Niedermayer : > On Fri, Aug 17, 2018 at 12:46:52AM -0300, James Almer wrote: > > On 8/14/2018 1:23 PM, Michael Niedermayer wrote: > > > On Mon, Aug 13, 2018 at 04:58:42PM +0300, Sergey Lavrushkin wrote: > > >>> > > >>> Just use av_clipf instead of FFMIN/FFMAX. > > >> > > >> > > >> Changed FFMIN/FFMAX to av_clip_uint16 and av_clip_uint8. > > > > > > will apply > > > > > > thanks > > > > This broke fate on some 32bit hosts. Guess float pixfmts shouldn't be > > tested for bitexact output. The gbrpf32 ones aren't, for example. > > http://fate.ffmpeg.org/report.cgi?time=20180816131312&slot= > x86_32-debian-kfreebsd-gcc-4.4-cpuflags-mmx > > h > i remember i had tested this locally on 32bit > can something be slightly adjusted (like an offset or factor) to avoid any > values becoming close to 0.5 and rounding differently on platforms ? If not the tests should skip float pixel formats or try the nearest > neighbor scaler > Can it really be a problem with the scaler? Do all these failed tests use scaling? Isn't the problem that different platforms can give slightly different results for floating-point operations? Is input for the float format somehow generated for these tests, so that the input conversion is tested? Maybe it uses the output conversion first? If the problem is that floating-point operations give different results on different platforms, maybe it is possible to use a precomputed LUT for the output conversion, so it gives the same results? Or is it possible to modify the tests for the float format so they check whether the pixels of the result are merely close to some reference? > Sergey, can you look into this (it's your patch)? (just asking to make sure > not everyone thinks someone else will work on this) > Yes, I can; I just need to know what can be done to fix this issue besides skipping the tests. ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 4/7] Adds gray floating-point pixel formats.
2018-08-18 23:20 GMT+03:00 Michael Niedermayer : > On Sat, Aug 18, 2018 at 02:10:21PM +0300, Sergey Lavrushkin wrote: > > 2018-08-17 23:28 GMT+03:00 Michael Niedermayer : > > > > > On Fri, Aug 17, 2018 at 12:46:52AM -0300, James Almer wrote: > > > > On 8/14/2018 1:23 PM, Michael Niedermayer wrote: > > > > > On Mon, Aug 13, 2018 at 04:58:42PM +0300, Sergey Lavrushkin wrote: > > > > >>> > > > > >>> Just use av_clipf instead of FFMIN/FFMAX. > > > > >> > > > > >> > > > > >> Changed FFMIN/FFMAX to av_clip_uint16 and av_clip_uint8. > > > > > > > > > > will apply > > > > > > > > > > thanks > > > > > > > > This broke fate on some 32bit hosts. Guess float pixfmts shouldn't be > > > > tested for bitexact output. The gbrpf32 ones aren't, for example. > > > > http://fate.ffmpeg.org/report.cgi?time=20180816131312&slot= > > > x86_32-debian-kfreebsd-gcc-4.4-cpuflags-mmx > > > > > > h > > > i remember i had tested this locally on 32bit > > > can something be slightly adjusted (like an offset or factor) to avoid > any > > > values becoming close to 0.5 and rounding differently on platforms ? > > > > If not the tests should skip float pixel formats or try the nearest > > > neighbor scaler > > > > > > > Can it really be the problem with scaler? Do all these failed test use > > scaling? > > Is not it the problem, that different platforms can give slightly > different > > results for > > floating-point operations? Does input for the float format is somehow > > generated > > for these tests, so the input conversion is tested? Maybe it uses output > > conversion first? > > If it is the problem of different floating-point operations results on > > different platforms, > > > maybe it is possible to use precomputed LUT for output conversion, so it > > I dont think we should change the "algorithm" to achive "bitexactness" > we could of course but it feels like the wrong reason to make such a > change. 
How it's done should be chosen based on what is fast (and to a > lesser extent clean, simple and maintainable) > > > > will give > > the same results? Or is it possible to modify tests for the float format, > > so it will > > check if pixels of the result are just close to some reference. > > It's possible to compare to a reference, we do this in some other tests, > but that's surely more work than just disabling the specific tests or trying > to nudge them a little to see if that makes nothing fall too close to n + > 0.5 > > > > > > > > Sergey, can you look into this (it's your patch)? (just asking to make > sure > > > not everyone thinks someone else will work on this) > > > > > > > Yes, I can; I just need to know what can be done to fix this issue, > > besides skipping the tests. > > most things are possible > Hi, I am having trouble reproducing this error. These tests are fine on 32-bit VMs on my computers, so the only thing I can do is disable these tests for these formats; otherwise, I need some way to test other changes. Here is a patch that skips pixfmts tests for these formats. From a92e6965f9c328fcaa18460ac9da975748272e0a Mon Sep 17 00:00:00 2001 From: Sergey Lavrushkin Date: Mon, 20 Aug 2018 23:14:07 +0300 Subject: [PATCH] tests: Disables pixfmts tests for float gray formats. 
--- tests/fate-run.sh| 4 ++-- tests/ref/fate/filter-pixfmts-copy | 2 -- tests/ref/fate/filter-pixfmts-crop | 2 -- tests/ref/fate/filter-pixfmts-field | 2 -- tests/ref/fate/filter-pixfmts-fieldorder | 2 -- tests/ref/fate/filter-pixfmts-hflip | 2 -- tests/ref/fate/filter-pixfmts-il | 2 -- tests/ref/fate/filter-pixfmts-null | 2 -- tests/ref/fate/filter-pixfmts-scale | 2 -- tests/ref/fate/filter-pixfmts-transpose | 2 -- tests/ref/fate/filter-pixfmts-vflip | 2 -- 11 files changed, 2 insertions(+), 22 deletions(-) diff --git a/tests/fate-run.sh b/tests/fate-run.sh index aece90a01d..e8d71707b0 100755 --- a/tests/fate-run.sh +++ b/tests/fate-run.sh @@ -288,8 +288,8 @@ pixfmts(){ in_fmts=${outfile}_in_fmts # exclude pixel formats which are not supported as input -$showfiltfmts scale | awk -F '[ \r]' '/^INPUT/{ fmt=substr($3, 5); print fmt }' | sort >$scale_in_fmts -$showfiltfmts scale | awk -F '[ \r]' '/^OUTPUT/{ fmt=substr($3, 5); print fmt }' | sort >$scale_out_fmts +$showfiltfmts scale | awk -F
[FFmpeg-devel] [PATCH] avformat/cafenc: fixed packet_size calculation
The problem is that the very last packet can be shorter than the default packet_size, so it must be excluded from the packet_size calculation. Fixes #10465 --- libavformat/cafenc.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/libavformat/cafenc.c b/libavformat/cafenc.c index 67be59806c..fcc4838392 100644 --- a/libavformat/cafenc.c +++ b/libavformat/cafenc.c @@ -34,6 +34,8 @@ typedef struct { int size_buffer_size; int size_entries_used; int packets; +int64_t duration; +int64_t last_packet_duration; } CAFContext; static uint32_t codec_flags(enum AVCodecID codec_id) { @@ -238,6 +240,8 @@ static int caf_write_packet(AVFormatContext *s, AVPacket *pkt) pkt_sizes[caf->size_entries_used++] = 128 | top; } pkt_sizes[caf->size_entries_used++] = pkt->size & 127; +caf->duration += pkt->duration; +caf->last_packet_duration = pkt->duration; caf->packets++; } avio_write(s->pb, pkt->data, pkt->size); @@ -259,7 +263,11 @@ static int caf_write_trailer(AVFormatContext *s) if (!par->block_align) { int packet_size = samples_per_packet(par); if (!packet_size) { -packet_size = st->duration / (caf->packets - 1); +if (caf->duration) { +packet_size = (caf->duration - caf->last_packet_duration) / (caf->packets - 1); +} else { +packet_size = st->duration / (caf->packets - 1); +} avio_seek(pb, FRAME_SIZE_OFFSET, SEEK_SET); avio_wb32(pb, packet_size); } -- 2.40.1 ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".