from:"sergey"

[FFmpeg-devel] suggested patch: avfilter/vf_subtitles: add support for subtitles font scaling

2014-09-10 Thread Sergey

Hello.

Recently I used ffmpeg to embed subtitles, and I needed to scale them.

I thought "original_size" option scales subtitles, but it does not.
So I wrote a short patch for it to do that (attached)

If that is considered too complex/bad idea I've also attached another patch
that adds a "font_scale" option instead.

I hope you like one of these.
-- diff -ur ffmpeg-HEAD-649b7a9/libavfilter/vf_subtitles.c ffmpeg-HEAD-649b7a9.new/libavfilter/vf_subtitles.c
--- ffmpeg-HEAD-649b7a9/libavfilter/vf_subtitles.c	2014-09-10 00:07:59.0 +
+++ ffmpeg-HEAD-649b7a9.new/libavfilter/vf_subtitles.c	2014-09-10 00:11:26.0 +
@@ -136,9 +136,11 @@
 ff_draw_init(&ass->draw, inlink->format, 0);
 
 ass_set_frame_size  (ass->renderer, inlink->w, inlink->h);
-if (ass->original_w && ass->original_h)
+if (ass->original_w && ass->original_h) {
 ass_set_aspect_ratio(ass->renderer, (double)inlink->w / inlink->h,
  (double)ass->original_w / ass->original_h);
+ass_set_font_scale(ass->renderer, (inlink->w + inlink->h) * 1.0 / (ass->original_w + ass->original_h));
+}
 
 return 0;
 }
diff -ur ffmpeg-HEAD-649b7a9/libavfilter/vf_subtitles.c ffmpeg-HEAD-649b7a9.new/libavfilter/vf_subtitles.c
--- ffmpeg-HEAD-649b7a9/libavfilter/vf_subtitles.c	2014-09-10 01:02:49.0 +
+++ ffmpeg-HEAD-649b7a9.new/libavfilter/vf_subtitles.c	2014-09-10 03:34:19.0 +
@@ -55,6 +55,7 @@
 uint8_t rgba_map[4];
 int pix_step[4];   ///< steps per pixel for each plane of the main output
 int original_w, original_h;
+double font_scale;
 FFDrawContext draw;
 } AssContext;
 
@@ -65,6 +66,7 @@
 {"filename",   "set the filename of file to read", OFFSET(filename),   AV_OPT_TYPE_STRING, {.str = NULL},  CHAR_MIN, CHAR_MAX, FLAGS }, \
 {"f",  "set the filename of file to read", OFFSET(filename),   AV_OPT_TYPE_STRING, {.str = NULL},  CHAR_MIN, CHAR_MAX, FLAGS }, \
 {"original_size",  "set the size of the original video (used to scale fonts)", OFFSET(original_w), AV_OPT_TYPE_IMAGE_SIZE, {.str = NULL},  CHAR_MIN, CHAR_MAX, FLAGS }, \
+{"font_scale", "set font scale",   OFFSET(font_scale), AV_OPT_TYPE_DOUBLE, {.dbl = 1}, 0.01, 99,   FLAGS }, \
 
 /* libass supports a log level ranging from 0 to 7 */
 static const int ass_libavfilter_log_level_map[] = {
@@ -139,6 +141,7 @@
 if (ass->original_w && ass->original_h)
 ass_set_aspect_ratio(ass->renderer, (double)inlink->w / inlink->h,
  (double)ass->original_w / ass->original_h);
+ass_set_font_scale(ass->renderer, ass->font_scale);
 
 return 0;
 }
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] h264: fix RTSP stream decoding

2018-01-03 Thread sergey

> The error code returned by decode_extradata_ps() is inconsistent after this
> its not "if any failed" it is returning an error if the last failed 

Sorry, I don't get how it is supposed to work. I just found the previous 
implementation and checked which commit broke it.

The other possible solution on upper level: 

---

From 9fcd003a095b19b9e2fb5f6af3cc57a9e131f308 Mon Sep 17 00:00:00 2001
From: Sergey Gavrushkin 
Date: Wed, 3 Jan 2018 12:51:15 +0300
Subject: [PATCH] libavcodec/h264: fix decoding

Fixes ticket #6422. It is a regression fix for an issue that was introduced in 
commit
98c97994c5b90bdae02accb155eeceeb5224b8ef. Variable err_recognition is
ignored while extradata is decoded and the whole decoding process is
failed due to timeout.

Signed-off-by: Sergey Gavrushkin 
---
 libavcodec/h264_parse.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavcodec/h264_parse.c b/libavcodec/h264_parse.c
index fee28d9..403fd39 100644
--- a/libavcodec/h264_parse.c
+++ b/libavcodec/h264_parse.c
@@ -487,7 +487,7 @@ int ff_h264_decode_extradata(const uint8_t *data, int size, 
H264ParamSets *ps,
 } else {
 *is_avc = 0;
 ret = decode_extradata_ps(data, size, ps, 0, logctx);
-if (ret < 0)
+if (ret < 0 && (err_recognition & AV_EF_EXPLODE))
 return ret;
 }
     return size;
--
2.6.4


Thank you, 
Sergey
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

[FFmpeg-devel] [PATCH] cuviddec: improved way of finding out if a frame is interlaced or progressive

2019-04-22 Thread Sergey Svechnikov

There are 2 types of problems when using adaptive deinterlace with cuvid:

1. Sometimes, in the middle of transcoding, cuvid outputs frames with visible 
horizontal lines (as though weave deinterlace method was chosen);
2. Occasionally, on scene changes, cuvid outputs a wrong frame, which should 
have been shown several seconds before (as if the frame was assigned some wrong 
PTS value).

The reason is that sometimes CUVIDPARSERDISPINFO has property progressive_frame 
equal to 1 with interlaced videos.
In order to fix the problem we should check if the video is interlaced or 
progressive in the beginning of a video sequence (cuvid_handle_video_sequence).
And then we just use this information instead of the property progressive_frame 
in CUVIDPARSERDISPINFO (which is unreliable).

More info, samples and reproduction steps are here 
https://github.com/Svechnikov/ffmpeg-cuda-deinterlace-problems
---
 libavcodec/cuviddec.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/libavcodec/cuviddec.c b/libavcodec/cuviddec.c
index 2aecb45..671fc8c 100644
--- a/libavcodec/cuviddec.c
+++ b/libavcodec/cuviddec.c
@@ -77,6 +77,7 @@ typedef struct CuvidContext
 int deint_mode;
 int deint_mode_current;
 int64_t prev_pts;
+unsigned char progressive_sequence;
 
 int internal_error;
 int decoder_flushing;
@@ -216,6 +217,8 @@ static int CUDAAPI cuvid_handle_video_sequence(void 
*opaque, CUVIDEOFORMAT* form
   ? cudaVideoDeinterlaceMode_Weave
   : ctx->deint_mode;
 
+ctx->progressive_sequence = format->progressive_sequence;
+
 if (!format->progressive_sequence && ctx->deint_mode_current == 
cudaVideoDeinterlaceMode_Weave)
 avctx->flags |= AV_CODEC_FLAG_INTERLACED_DCT;
 else
@@ -509,6 +512,8 @@ static int cuvid_output_frame(AVCodecContext *avctx, 
AVFrame *frame)
 
 av_fifo_generic_read(ctx->frame_queue, &parsed_frame, 
sizeof(CuvidParsedFrame), NULL);
 
+parsed_frame.dispinfo.progressive_frame = ctx->progressive_sequence;
+
 memset(¶ms, 0, sizeof(params));
 params.progressive_frame = parsed_frame.dispinfo.progressive_frame;
 params.second_field = parsed_frame.second_field;
-- 
2.7.4

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

[FFmpeg-devel] [PATCH] libavfilter/vf_scale_cuda: fix frame dimensions

2019-05-08 Thread Sergey Svechnikov

AVHWFramesContext has aligned width and height.
When initializing a new AVFrame, it receives these aligned values (in 
av_hwframe_get_buffer), which leads to incorrect scaling.
The resulting frames are cropped either horizontally or vertically.
As a fix we can overwrite the dimensions to original values right after 
av_hwframe_get_buffer.
More info, samples and reproduction steps are here 
https://github.com/Svechnikov/ffmpeg-scale-cuda-problem
---
 libavfilter/vf_scale_cuda.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/libavfilter/vf_scale_cuda.c b/libavfilter/vf_scale_cuda.c
index c97a802..ef1bd82 100644
--- a/libavfilter/vf_scale_cuda.c
+++ b/libavfilter/vf_scale_cuda.c
@@ -463,6 +463,9 @@ static int cudascale_scale(AVFilterContext *ctx, AVFrame 
*out, AVFrame *in)
 if (ret < 0)
 return ret;
 
+s->tmp_frame->width = s->planes_out[0].width;
+s->tmp_frame->height = s->planes_out[0].height;
+
 av_frame_move_ref(out, s->frame);
 av_frame_move_ref(s->frame, s->tmp_frame);
 
-- 
2.7.4

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

[FFmpeg-devel] [PATCH] libavfilter/vf_scale_cuda: fix src_pitch for 10bit videos

2019-05-13 Thread Sergey Svechnikov

When scaling a 10bit video using scale_cuda filter (witch uses pixel format 
AV_PIX_FMT_P010LE), the output video gets distorted.
I think it has something to do with the differences in processing between 
cuda_sdk and ffnvcodec with cuda_nvcc
(the problem appears after this commit 
https://github.com/FFmpeg/FFmpeg/commit/2544c7ea67ca9521c5de36396bc9ac7058223742).
To solve the problem we should not divide the input frame planes' linesizes by 
2 and leave them as they are.
More info, samples and reproduction steps are here 
https://github.com/Svechnikov/ffmpeg-scale-cuda-10bit-problem
---
 libavfilter/vf_scale_cuda.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libavfilter/vf_scale_cuda.c b/libavfilter/vf_scale_cuda.c
index c97a802..7fc33ee 100644
--- a/libavfilter/vf_scale_cuda.c
+++ b/libavfilter/vf_scale_cuda.c
@@ -423,11 +423,11 @@ static int scalecuda_resize(AVFilterContext *ctx,
 break;
 case AV_PIX_FMT_P010LE:
 call_resize_kernel(ctx, s->cu_func_ushort, 1,
-   in->data[0], in->width, in->height, 
in->linesize[0]/2,
+   in->data[0], in->width, in->height, in->linesize[0],
out->data[0], out->width, out->height, 
out->linesize[0]/2,
2);
 call_resize_kernel(ctx, s->cu_func_ushort2, 2,
-   in->data[1], in->width / 2, in->height / 2, 
in->linesize[1]/2,
+   in->data[1], in->width / 2, in->height / 2, 
in->linesize[1],
out->data[0] + out->linesize[0] * ((out->height + 
31) & ~0x1f), out->width / 2, out->height / 2, out->linesize[1] / 4,
2);
 break;
-- 
2.7.4
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

[FFmpeg-devel] [PATCH] Rename SRT's streamid to srt_streamid to avoid a conflict with standard streamid option

2021-06-03 Thread Sergey Ilinykh

Default streamid is some numeric value and not used by SRT code. Instead
SRT has its own string streamid. Current code has the same option name for
both and this causes a conflict when ffmpeg is started from a terminal.

The attached patch fixes it by renaming SRT's "streamid" to "srt_streamid"

Best Regards,
Sergey
From 46d75e066ec828545ebf242ab0530ecb66d7fc6d Mon Sep 17 00:00:00 2001
From: Sergey Ilinykh 
Date: Thu, 3 Jun 2021 13:13:32 +0300
Subject: [PATCH] Rename SRT's streamid to srt_streamid to avoid a conflict
 with standard streamid option

---
 libavformat/libsrt.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libavformat/libsrt.c b/libavformat/libsrt.c
index c1e96f700e..10dfc9e9c9 100644
--- a/libavformat/libsrt.c
+++ b/libavformat/libsrt.c
@@ -133,7 +133,7 @@ static const AVOption libsrt_options[] = {
 { "rcvbuf", "Receive buffer size (in bytes)",   OFFSET(rcvbuf),   AV_OPT_TYPE_INT,  { .i64 = -1 }, -1, INT_MAX,   .flags = D|E },
 { "lossmaxttl", "Maximum possible packet reorder tolerance",OFFSET(lossmaxttl),   AV_OPT_TYPE_INT,  { .i64 = -1 }, -1, INT_MAX,   .flags = D|E },
 { "minversion", "The minimum SRT version that is required from the peer",   OFFSET(minversion),   AV_OPT_TYPE_INT,  { .i64 = -1 }, -1, INT_MAX,   .flags = D|E },
-{ "streamid",   "A string of up to 512 characters that an Initiator can pass to a Responder",  OFFSET(streamid),  AV_OPT_TYPE_STRING,   { .str = NULL },  .flags = D|E },
+{ "srt_streamid",   "A string of up to 512 characters that an Initiator can pass to a Responder",  OFFSET(streamid),  AV_OPT_TYPE_STRING,   { .str = NULL },  .flags = D|E },
 { "smoother",   "The type of Smoother used for the transmission for that socket",   OFFSET(smoother), AV_OPT_TYPE_STRING,   { .str = NULL },  .flags = D|E },
 { "messageapi", "Enable message API",   OFFSET(messageapi),   AV_OPT_TYPE_BOOL, { .i64 = -1 }, -1, 1, .flags = D|E },
 { "transtype",  "The transmission type for the socket", OFFSET(transtype),AV_OPT_TYPE_INT,  { .i64 = SRTT_INVALID }, SRTT_LIVE, SRTT_INVALID, .flags = D|E, "transtype" },
@@ -608,7 +608,7 @@ static int libsrt_open(URLContext *h, const char *uri, int flags)
 if (av_find_info_tag(buf, sizeof(buf), "minversion", p)) {
 s->minversion = strtol(buf, NULL, 0);
 }
-if (av_find_info_tag(buf, sizeof(buf), "streamid", p)) {
+if (av_find_info_tag(buf, sizeof(buf), "srt_streamid", p)) {
 av_freep(&s->streamid);
 s->streamid = av_strdup(buf);
 if (!s->streamid) {
-- 
2.31.1

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH] Rename SRT's streamid to srt_streamid to avoid a conflict with standard streamid option

2021-06-08 Thread Sergey Ilinykh

this one http://ffmpeg.org/pipermail/ffmpeg-devel/2021-June/280949.html
does a better job. please merge


Best Regards,
Sergey


чт, 3 июн. 2021 г. в 13:37, Sergey Ilinykh :

> Default streamid is some numeric value and not used by SRT code. Instead
> SRT has its own string streamid. Current code has the same option name for
> both and this causes a conflict when ffmpeg is started from a terminal.
>
> The attached patch fixes it by renaming SRT's "streamid" to "srt_streamid"
>
> Best Regards,
> Sergey
>
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

[FFmpeg-devel] [PATCH] Fix potential integer overflow in mov_read_keys

2016-09-07 Thread Sergey Volk

Actual allocation size is computed as (count + 1)*sizeof(meta_keys), so
we need to check that (count + 1) won't cause overflow.
From cfc0f5a099284c95476d5c020dca05fb743ff5ae Mon Sep 17 00:00:00 2001
From: Sergey Volk 
Date: Wed, 7 Sep 2016 14:05:35 -0700
Subject: [PATCH] Fix potential integer overflow in mov_read_keys

Actual allocation size is computed as (count + 1)*sizeof(meta_keys), so
we need to check that (count + 1) won't cause overflow.
---
 libavformat/mov.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavformat/mov.c b/libavformat/mov.c
index f499906..ea7d051 100644
--- a/libavformat/mov.c
+++ b/libavformat/mov.c
@@ -3278,7 +3278,7 @@ static int mov_read_keys(MOVContext *c, AVIOContext *pb, MOVAtom atom)
 
 avio_skip(pb, 4);
 count = avio_rb32(pb);
-if (count > UINT_MAX / sizeof(*c->meta_keys)) {
+if (count + 1 > UINT_MAX / sizeof(*c->meta_keys)) {
 av_log(c->fc, AV_LOG_ERROR,
"The 'keys' atom with the invalid key count: %d\n", count);
 return AVERROR_INVALIDDATA;
-- 
2.8.0.rc3.226.g39d4020

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH] Fix potential integer overflow in mov_read_keys

2016-09-07 Thread Sergey Volk

I just realized that count+1 itself might overflow if count==UINT_MAX, so I
guess it's better to subtract 1 from the right-hand side. Attached updated
patch.

On Wed, Sep 7, 2016 at 2:21 PM, Sergey Volk  wrote:

> Actual allocation size is computed as (count + 1)*sizeof(meta_keys), so
> we need to check that (count + 1) won't cause overflow.
>
>
From 87a7a2e202ebb63362715054773a89ce1fc71743 Mon Sep 17 00:00:00 2001
From: Sergey Volk 
Date: Wed, 7 Sep 2016 14:05:35 -0700
Subject: [PATCH] Fix potential integer overflow in mov_read_keys

Actual allocation size is computed as (count + 1)*sizeof(meta_keys), so
we need to check that (count + 1) won't cause overflow.
---
 libavformat/mov.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavformat/mov.c b/libavformat/mov.c
index f499906..a7595c5 100644
--- a/libavformat/mov.c
+++ b/libavformat/mov.c
@@ -3278,7 +3278,7 @@ static int mov_read_keys(MOVContext *c, AVIOContext *pb, MOVAtom atom)
 
 avio_skip(pb, 4);
 count = avio_rb32(pb);
-if (count > UINT_MAX / sizeof(*c->meta_keys)) {
+if (count > UINT_MAX / sizeof(*c->meta_keys) - 1) {
 av_log(c->fc, AV_LOG_ERROR,
"The 'keys' atom with the invalid key count: %d\n", count);
 return AVERROR_INVALIDDATA;
-- 
2.8.0.rc3.226.g39d4020

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

[FFmpeg-devel] [PATCH] Use matroska TrackNumber for populating AVStream::id

2016-03-02 Thread Sergey Volk

As far as I can see FFmpeg currently doesn't set AVStream::id for
matroska/webm streams. I think we could use either MatroskaTrack::num
(TrackNumber) or MatroskaTrack::uid (TrackUID) for that.
I have found a few discussions claiming that TrackUID could be missing,
even though TrackUID is marked as mandatory field in matroska spec, for
example see
https://github.com/mbunkus/mkvtoolnix/issues/1050
https://lists.w3.org/Archives/Public/public-inbandtracks/2014May/0003.html
So I guess it's safer to use TrackNumber for now.
---
 libavformat/matroskadec.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/libavformat/matroskadec.c b/libavformat/matroskadec.c
index d20568c..8b80df1 100644
--- a/libavformat/matroskadec.c
+++ b/libavformat/matroskadec.c
@@ -1856,6 +1856,8 @@ static int matroska_parse_tracks(AVFormatContext *s)
 return AVERROR(ENOMEM);
 }

+st->id = (int) track->num;
+
 if (key_id_base64) {
 /* export encryption key id as base64 metadata tag */
 av_dict_set(&st->metadata, "enc_key_id", key_id_base64, 0);
-- 
2.7.0.rc3.207.g0ac5344
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH] Use matroska TrackNumber for populating AVStream::id

2016-03-04 Thread Sergey Volk

Ok, something like this for now, then?
I'm new to ffmpeg development. When is the next version bump going to happen?
---
 libavformat/matroskadec.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/libavformat/matroskadec.c b/libavformat/matroskadec.c
index d20568c..4c3e53a 100644
--- a/libavformat/matroskadec.c
+++ b/libavformat/matroskadec.c
@@ -1856,6 +1856,9 @@ static int matroska_parse_tracks(AVFormatContext *s)
 return AVERROR(ENOMEM);
 }

+if (track->num <= INT_MAX)
+  st->id = (int) track->num;
+
 if (key_id_base64) {
 /* export encryption key id as base64 metadata tag */
 av_dict_set(&st->metadata, "enc_key_id", key_id_base64, 0);
-- 
2.7.0.rc3.207.g0ac5344

On Thu, Mar 3, 2016 at 2:14 AM, Carl Eugen Hoyos  wrote:
> wm4  googlemail.com> writes:
>
>> > +st->id = (int) track->num;
>
>> Might be better after all not to set the id if it's out of range?
>
> Yes, please.
>
> While there, the id field could be changed to 64bit with the
> next version bump.
>
> Carl Eugen
>
> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH] Use matroska TrackNumber for populating AVStream::id

2016-03-09 Thread Sergey Volk

Ok, so then I guess we'll need to update AVStream::id to be 64-bit
first, and then I'll make the necessary changes in matroskadec.
I've prepared a patch to bump AVStream::id to be int64_t in the next
major version, I'll send it out shortly. After I rebuilt ffmpeg
with AVStream::id being int64_t I got a couple of new warnings in the
code that was using 32-bit format specifiers for printing
stream ids, I've fixed those as well. I've also re-ran 'make fate' and
all the tests seem to be good.

On Sat, Mar 5, 2016 at 2:47 AM, Michael Niedermayer
 wrote:
> On Fri, Mar 04, 2016 at 04:19:18PM -0800, Sergey Volk wrote:
>> Ok, something like this for now, then?
>
> your original patch contained a nice commit message, this one
> doesnt
>
>
>> I'm new to ffmpeg development. When is the next version bump going to happen?
>
> you can make changes at the next bump by using #if FF_API...
> see libavfilter/version.h
>
>
> [...]
> --
> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> It is dangerous to be right in matters on which the established authorities
> are wrong. -- Voltaire
>
> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH] Use matroska TrackNumber for populating AVStream::id

2016-03-09 Thread Sergey Volk

From: Sergey Volk 
Date: Wed, 9 Mar 2016 14:34:19 -0800
Subject: [PATCH] Change AVStream::id to int64_t in the next version bump

I have also bumped the major version to 58 locally in version.h, and
re-ran make with the stream id being int64_t and fixed all new
warnings that showed up (only saw new warnings related to the
incorrect format being used for int64_t value).
---
 ffprobe.c   | 8 +++-
 libavformat/avformat.h  | 4 
 libavformat/concatdec.c | 8 ++--
 libavformat/dump.c  | 4 
 libavformat/mpegtsenc.c | 7 ++-
 libavformat/version.h   | 3 +++
 6 files changed, 30 insertions(+), 4 deletions(-)

diff --git a/ffprobe.c b/ffprobe.c
index f7b51ad..21eab61 100644
--- a/ffprobe.c
+++ b/ffprobe.c
@@ -2287,7 +2287,13 @@ static int show_stream(WriterContext *w,
AVFormatContext *fmt_ctx, int stream_id
 }
 }

-if (fmt_ctx->iformat->flags & AVFMT_SHOW_IDS) print_fmt("id",
"0x%x", stream->id);
+#if FF_API_OLD_INT32_STREAM_ID
+#define STREAM_ID_FORMAT "0x%x"
+#else
+#define STREAM_ID_FORMAT "0x%"PRIx64
+#endif
+if (fmt_ctx->iformat->flags & AVFMT_SHOW_IDS) print_fmt("id",
STREAM_ID_FORMAT, stream->id);
+#undef STREAM_ID_FORMAT
 else  print_str_opt("id", "N/A");
 print_q("r_frame_rate",   stream->r_frame_rate,   '/');
 print_q("avg_frame_rate", stream->avg_frame_rate, '/');
diff --git a/libavformat/avformat.h b/libavformat/avformat.h
index a558f2d..253b293 100644
--- a/libavformat/avformat.h
+++ b/libavformat/avformat.h
@@ -871,7 +871,11 @@ typedef struct AVStream {
  * decoding: set by libavformat
  * encoding: set by the user, replaced by libavformat if left unset
  */
+#if FF_API_OLD_INT32_STREAM_ID
 int id;
+#else
+int64_t id;
+#endif
 /**
  * Codec context associated with this stream. Allocated and freed by
  * libavformat.
diff --git a/libavformat/concatdec.c b/libavformat/concatdec.c
index e69096f..481c8433 100644
--- a/libavformat/concatdec.c
+++ b/libavformat/concatdec.c
@@ -238,8 +238,12 @@ static int match_streams_exact_id(AVFormatContext *avf)
 for (j = 0; j < avf->nb_streams; j++) {
 if (avf->streams[j]->id == st->id) {
 av_log(avf, AV_LOG_VERBOSE,
-   "Match slave stream #%d with stream #%d id 0x%x\n",
-   i, j, st->id);
+#if FF_API_OLD_INT32_STREAM_ID
+   "Match slave stream #%d with stream #%d id 0x%x\n"
+#else
+   "Match slave stream #%d with stream #%d id
0x%"PRIx64"\n"
+#endif
+   , i, j, st->id);
 if ((ret = copy_stream_props(avf->streams[j], st)) < 0)
 return ret;
 cat->cur_file->streams[i].out_stream_index = j;
diff --git a/libavformat/dump.c b/libavformat/dump.c
index 86bb82d..8b50ec1 100644
--- a/libavformat/dump.c
+++ b/libavformat/dump.c
@@ -453,7 +453,11 @@ static void dump_stream_format(AVFormatContext *ic, int i,
 /* the pid is an important information, so we display it */
 /* XXX: add a generic system */
 if (flags & AVFMT_SHOW_IDS)
+#if FF_API_OLD_INT32_STREAM_ID
 av_log(NULL, AV_LOG_INFO, "[0x%x]", st->id);
+#else
+av_log(NULL, AV_LOG_INFO, "[0x%"PRIx64"]", st->id);
+#endif
 if (lang)
 av_log(NULL, AV_LOG_INFO, "(%s)", lang->value);
 av_log(NULL, AV_LOG_DEBUG, ", %d, %d/%d", st->codec_info_nb_frames,
diff --git a/libavformat/mpegtsenc.c b/libavformat/mpegtsenc.c
index 68f9867..0244b7f 100644
--- a/libavformat/mpegtsenc.c
+++ b/libavformat/mpegtsenc.c
@@ -833,7 +833,12 @@ static int mpegts_init(AVFormatContext *s)
 ts_st->pid = st->id;
 } else {
 av_log(s, AV_LOG_ERROR,
-   "Invalid stream id %d, must be less than 8191\n", st->id);
+#if FF_API_OLD_INT32_STREAM_ID
+   "Invalid stream id %d, must be less than 8191\n",
+#else
+   "Invalid stream id %"PRId64", must be less than 8191\n",
+#endif
+st->id);
 ret = AVERROR(EINVAL);
 goto fail;
 }
diff --git a/libavformat/version.h b/libavformat/version.h
index 7dcce2c..e0ac45a 100644
--- a/libavformat/version.h
+++ b/libavformat/version.h
@@ -74,6 +74,9 @@
 #ifndef FF_API_OLD_OPEN_CALLBACKS
 #define FF_API_OLD_OPEN_CALLBACKS   (LIBAVFORMAT_VERSION_MAJOR < 58)
 #endif
+#ifndef FF_API_OLD_INT32_STREAM_ID
+#define FF_API_OLD_INT32_STREAM_ID  (LIBAVFORMAT_VERSION_MAJOR < 58)
+#endif

 #ifndef FF_API_R_FRAME_RATE
 #define FF_API_R_FRAME_RATE1
-- 
2.7.0.rc3.207.g0ac5344

On Wed, Mar 9, 2016 at 3

Re: [FFmpeg-devel] [PATCH] Use matroska TrackNumber for populating AVStream::id

2016-03-15 Thread Sergey Volk

Yeah, I was using Gmail web interface, it does that. I'll try attaching the
patch file next time.

On Thu, Mar 10, 2016 at 1:23 AM, Moritz Barsnick  wrote:

> On Wed, Mar 09, 2016 at 15:56:53 -0800, Sergey Volk wrote:
> > -if (fmt_ctx->iformat->flags & AVFMT_SHOW_IDS) print_fmt("id",
> > "0x%x", stream->id);
> > +#if FF_API_OLD_INT32_STREAM_ID
> > +#define STREAM_ID_FORMAT "0x%x"
> > +#else
> > +#define STREAM_ID_FORMAT "0x%"PRIx64
> > +#endif
> > +if (fmt_ctx->iformat->flags & AVFMT_SHOW_IDS) print_fmt("id",
> > STREAM_ID_FORMAT, stream->id);
> > +#undef STREAM_ID_FORMAT
>
> From pure visual inspection, I believe your patch got broken (wrapped
> lines) by your mailer agent or something along the line.
>
> Moritz
> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH] Use matroska TrackNumber for populating AVStream::id

2016-03-15 Thread Sergey Volk

Thanks for the comments, I'll update my next patch to take that into
account.
But first I wanted to discuss your second point (regarding int64_t/uint64_t
choice).
I have actually looked at all places that use AVStream::id (34 files under
libavformat/ + a few more files outside it). There are a few places which
treat stream id as unsigned (e.g. libavformat/bink.c, which assigns
AVStream::id to be the result of avio_rl32(pb) ). But most places either
assign some signed int value to AVStream::id (the int value often comes as
an input parameter to some function that actually assigns stream id) or use
negative values for special cases
(e.g. libavformat/swfdec.c, libavformat/wtvdec.c). Since most of the code
expects AVStream::id to be signed for now, I've decided to make it an
int64_t, it makes for a smaller/easier change.
I have also been trying to figure out what's FFmpeg code style stance on
doing something like 'typedef int64_t StreamId' and then using StreamId
type whenever we deal with stream ids. But that's probably a more C++-style
approach, not sure if it's appropriate here (and
https://ffmpeg.org/developer.html doesn't seem to address this directly).
Any opinions on this?

On Thu, Mar 10, 2016 at 12:19 AM, Nicolas George  wrote:

> Thanks for the patch.
>
> Le decadi 20 ventôse, an CCXXIV, Sergey Volk a écrit :
> > I have also bumped the major version to 58 locally in version.h, and
> > re-ran make with the stream id being int64_t and fixed all new
> > warnings that showed up (only saw new warnings related to the
> > incorrect format being used for int64_t value).
>
> Commit messages are usually written in an impersonal form. Remember that
> they will stay. That does not matter much.
>
> >  av_log(avf, AV_LOG_VERBOSE,
> > -   "Match slave stream #%d with stream #%d id
> 0x%x\n",
> > -   i, j, st->id);
> > +#if FF_API_OLD_INT32_STREAM_ID
> > +   "Match slave stream #%d with stream #%d id
> 0x%x\n"
> > +#else
> > +   "Match slave stream #%d with stream #%d id
> > 0x%"PRIx64"\n"
> > +#endif
> > +   , i, j, st->id);
>
> You could do much simpler by casting the id unconditionally to int64_t:
>
> /* TODO remove cast after FF_API_OLD_INT32_STREAM_ID removal */
> av_log(... "0x%"PRIx64"\n", (int64_t)st->id);
>
> (I would put the comment at each place the cast is used, to ease finding
> all
> the casts that can be removed.)
>
> As a side note, I wonder if uint64_t would not be better than the signed
> variant.
>
> Regards,
>
> --
>   Nicolas George
>
> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
>
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] Chrome not able to playback aac_he_v2 when remuxed from mpegts to mp4 using the aac_adtstoasc bitstream filter

2016-04-28 Thread Sergey Volk

Looks like it's failing here:
https://code.google.com/p/chromium/codesearch#chromium/src/media/filters/ffmpeg_audio_decoder.cc&l=419

Here is the error message I got from Chrome:
[1:9:0428/101459:VERBOSE2:decoder_selector.cc(195)] InitializeDecoder
[1:9:0428/101459:ERROR:ffmpeg_audio_decoder.cc(421)] Audio configuration
specified 2 channels, but FFmpeg thinks the file contains 1 channels

So codec_context_->channels is 1, but Chrome expects it to be 2. And
ffprobe confirms that the generated test.mp4 actually has 2 channels:
  Duration: 00:00:10.16, start: 0.00, bitrate: 52 kb/s
Stream #0:0(und): Audio: aac (HE-AACv2) (mp4a / 0x6134706D), 48000 Hz,
stereo, fltp, 51 kb/s (default)

So we need to figure out why avcodec_open2 set channels=1.

On Thu, Apr 28, 2016 at 4:16 AM, Anders Rein  wrote:

> Google Chrome is not able to playback aac_he_v2 streams remuxed from
> mpegts to mp4. To reproduce the problem:
>
>
> ffmpeg -f lavfi -i 'aevalsrc=sin(2*PI*t*440)[out0]' -t 10 -movflags
> faststart -c:a libfdk_aac -ac 2 -ar 48000 -profile:a aac_he_v2 -f mpegts
> tmp.ts
>
> ffmpeg -i tmp.ts -c copy -bsf:a aac_adtstoasc test.mp4
>
>
> However if the audio is encoded directly to mp4 it works fine:
>
>
> ffmpeg -f lavfi -i 'aevalsrc=sin(2*PI*t*440)[out0]' -t 10 -movflags
> faststart -c:a libfdk_aac -ac 2 -ar 48000 -profile:a aac_he_v2 -f mp4
> test.mp4
>
>
> It is only when using aac_he_v2 profile that Chrome refuses to playback
> the file. Using the aac_he profile works fine:
>
>
> ffmpeg -f lavfi -i 'aevalsrc=sin(2*PI*t*440)[out0]' -t 10 -movflags
> faststart -c:a libfdk_aac -ac 2 -ar 48000 -profile:a aac_he -f mpegts tmp.ts
>
> ffmpeg -i tmp.ts -c copy -bsf:a aac_adtstoasc test.mp4
>
>
> There seems to be something wrong with the aac_adtstoasc bitstream filter
> that does not account for aac_he_v2. It might be a bug in Chrome as well
> since ffplay is able to playback all the variations without any problem,
> but please note that Chrome IS able to playback the file when it is encoded
> directly.
> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

[FFmpeg-devel] [PATCH] libavfilter/af_biquads: warn about clipping only after frame with clipping

2017-01-12 Thread Sergey Kudryashov

---
 libavfilter/af_biquads.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libavfilter/af_biquads.c b/libavfilter/af_biquads.c
index 4953202..79f1b7c 100644
--- a/libavfilter/af_biquads.c
+++ b/libavfilter/af_biquads.c
@@ -420,6 +420,7 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *buf)
 
 if (s->clippings > 0)
 av_log(ctx, AV_LOG_WARNING, "clipping %d times. Please reduce 
gain.\n", s->clippings);
+s->clippings = 0;
 
 if (buf != out_buf)
 av_frame_free(&buf);
-- 
1.9.1

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [GSOC] [PATCH] SRCNN filter

2018-03-28 Thread Sergey Lavrushkin

> [...]
> > +#define OFFSET(x) offsetof(SRCNNContext, x)
> > +#define FLAGS AV_OPT_FLAG_FILTERING_PARAM | AV_OPT_FLAG_VIDEO_PARAM
> > +static const AVOption srcnn_options[] = {
> > +{ "config_file", "path to configuration file with network
> parameters", OFFSET(config_file_path), AV_OPT_TYPE_STRING, {.str=NULL}, 0,
> 0, FLAGS },
> > +{ NULL }
> > +};
> > +
> > +AVFILTER_DEFINE_CLASS(srcnn);
> > +
> > +#define CHECK_FILE(file)if (ferror(file) || feof(file)){ \
> > +av_log(context, AV_LOG_ERROR, "error
> reading configuration file\n");\
> > +fclose(file); \
> > +return AVERROR(EIO); \
> > +}
> > +
> > +#define CHECK_ALLOCATION(conv, file)if
> (allocate_and_read_convolution_data(&conv, file)){ \
> > +av_log(context,
> AV_LOG_ERROR, "could not allocate memory for convolutions\n"); \
> > +fclose(file); \
> > +return AVERROR(ENOMEM); \
> > +}
> > +
>
> > +static int allocate_and_read_convolution_data(Convolution* conv, FILE*
> config_file)
> > +{
> > +int32_t kernel_size = conv->output_channels * conv->size *
> conv->size * conv->input_channels;
> > +conv->kernel = av_malloc(kernel_size * sizeof(double));
> > +if (!conv->kernel){
> > +return 1;
>
> this should return an AVERROR code for consistency with the rest of
> the codebase
>

Ok.


> > +}
>
> > +fread(conv->kernel, sizeof(double), kernel_size, config_file);
>
> directly reading data types is not portable, it would for example be
> endian specific
> and using avio for reading may be better, though fread is as far as iam
> concerned also ok
>

Ok, I understand the problem, but I have not really worked with it before,
so I need an advice of how to properly fix it. If I understand correctly,
for
int32_t I need to check endianness and reverse bytes if necessary. But with
doubles it is more complicated. Should I write a IEEE 754 converter from
binary
to double or maybe I can somehow check IEEE 754 doubles support and
depending
on it either stick to the default network weights, or just read bytes and
check
endianness, if IEEE 754 doubles are supported? Or maybe avio provide some
utility to deal with this problem?


> [...]
> > +/**
> > + * @file
> > + * Default cnn weights for x2 upsampling with srcnn filter.
> > + */
> > +
> > +/// First convolution kernel
>
> > +static double conv1_kernel[] = {
>
> static data should be also const, otherwise it may be changed and could
> cause
> thread saftey issues
>

Ok, I just wanted to not allocate additional memory in case of using
default weights.
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [GSOC] [PATCH] SRCNN filter

2018-05-07 Thread Sergey Lavrushkin

2018-05-07 17:41 GMT+03:00 Pedro Arthur :

> 2018-05-07 0:30 GMT-03:00 Steven Liu :
> > Hi Sergey,
> >
> > How should i test this filter?
> > I tested it some days ago, the picture get worse from 2nd frame.
> > input resolution 640x480 to 1280x720;
> >
> > ffmpeg -i input -vf srcnn output
> Hi,
> The filter expects the input upscaled by 2x, therefore the proper
> command would be
>
> ffmpeg -i input -vf "scale=2*iw:2*ih, srcnn, scale=1280:720"
>
> The default filter is trained for 2x upscale, anything different from
> that may generate bad results.


Hi,
Moreover, the filter expects the input upscaled with bicubic upscaling, so
for other upscaling algorithms bad results are also possible.
Also other models for x2, x3, x4 upsampling can be specified using
following command:
ffmpeg -i input -vf scale=iw*factor:-1,srcnn=path_to_model
Configuration files with other models can be found here:
https://drive.google.com/drive/folders/1-M9azWTtZ4egf8ndRU7Y_tiGP6QtN-Fp?usp=sharing
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

[FFmpeg-devel] [PATCH] avformat/cafenc: fixed packet_size calculation

2024-02-18 Thread sergey radionov

the problem is the very last packet
can be shorter than default packet_size
so it's required to exclude it from
packet_size calculations.
fixes #10465
---
 libavformat/cafenc.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/libavformat/cafenc.c b/libavformat/cafenc.c
index 67be59806c..fcc4838392 100644
--- a/libavformat/cafenc.c
+++ b/libavformat/cafenc.c
@@ -34,6 +34,8 @@ typedef struct {
 int size_buffer_size;
 int size_entries_used;
 int packets;
+int64_t duration;
+int64_t last_packet_duration;
 } CAFContext;
 
 static uint32_t codec_flags(enum AVCodecID codec_id) {
@@ -238,6 +240,8 @@ static int caf_write_packet(AVFormatContext *s, AVPacket 
*pkt)
 pkt_sizes[caf->size_entries_used++] = 128 | top;
 }
 pkt_sizes[caf->size_entries_used++] = pkt->size & 127;
+caf->duration += pkt->duration;
+caf->last_packet_duration = pkt->duration;
 caf->packets++;
 }
 avio_write(s->pb, pkt->data, pkt->size);
@@ -259,7 +263,11 @@ static int caf_write_trailer(AVFormatContext *s)
 if (!par->block_align) {
 int packet_size = samples_per_packet(par);
 if (!packet_size) {
-packet_size = st->duration / (caf->packets - 1);
+if (caf->duration) {
+packet_size = (caf->duration - caf->last_packet_duration) 
/ (caf->packets - 1);
+} else {
+packet_size = st->duration / (caf->packets - 1);
+}
 avio_seek(pb, FRAME_SIZE_OFFSET, SEEK_SET);
 avio_wb32(pb, packet_size);
 }
-- 
2.40.1

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [GSOC] [PATCH] DNN module introduction and SRCNN filter update

2018-05-25 Thread Sergey Lavrushkin

2018-05-24 22:52 GMT+03:00 James Almer :

> On 5/24/2018 4:24 PM, Sergey Lavrushkin wrote:
> > Hello,
> >
> > This patch introduces DNN inference interface and simple native backend.
> > For now implemented backend supports only convolutions with relu
> activation
> > function, that are sufficient for simple convolutional networks,
> > particularly SRCNN.
> > SRCNN filter was updated using implemented DNN inference interface and
> > native backend.
> >
> >
> > adds_dnn_srcnn.patch
> >
> >
> > From 60247d3deca3c822da0ef8d7390cda08db958830 Mon Sep 17 00:00:00 2001
> > From: Sergey Lavrushkin 
> > Date: Thu, 24 May 2018 22:05:54 +0300
> > Subject: [PATCH] Adds dnn inference module for simple convolutional
> networks.
> >  Reimplements srcnn filter based on it.
> >
> > ---
> >  Changelog  |   2 +
> >  libavfilter/vf_srcnn.c | 300 +++
> >  libavfilter/vf_srcnn.h | 855 --
> ---
> >  libavutil/Makefile |   3 +
> >  libavutil/dnn_backend_native.c | 382 ++
> >  libavutil/dnn_backend_native.h |  40 ++
> >  libavutil/dnn_interface.c  |  48 +++
> >  libavutil/dnn_interface.h  |  64 +++
> >  libavutil/dnn_srcnn.h  | 854 ++
> ++
> >  9 files changed, 1455 insertions(+), 1093 deletions(-)
> >  delete mode 100644 libavfilter/vf_srcnn.h
> >  create mode 100644 libavutil/dnn_backend_native.c
> >  create mode 100644 libavutil/dnn_backend_native.h
> >  create mode 100644 libavutil/dnn_interface.c
> >  create mode 100644 libavutil/dnn_interface.h
> >  create mode 100644 libavutil/dnn_srcnn.h
>
> With this change you're trying to use libavformat API from libavutil,
> which is not ok as the latter must not depend on the former at all. So
> if anything, this belongs in libavformat.
>
> That aside, you're using the ff_ prefix on an installed header, which is
> unusable from outside the library that contains the symbol, and the
> structs are not using the AV* namespace either.
> Does this need to be public API to begin with? If it's only going to be
> used by one or more filters, then it might as well remain as an internal
> module in libavfilter.
>

Yes, I think it will be used only in libavfilter. I'll move it there then.
And should I use  ff_ prefix  and  AV* namespace for structs in
an internal module in libavfilter?


> And you need to indent and prettify the tables a bit.
>

Do you mean kernels and biases in dnn_srcnn.h?
Their formatting represents their 4D structure a little bit and it is
similar to one that I used for the first srcnn filter version that was
successfully pushed before.
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [GSOC] [PATCH] DNN module introduction and SRCNN filter update

2018-05-29 Thread Sergey Lavrushkin

2018-05-29 4:08 GMT+03:00 Pedro Arthur :

> 2018-05-28 19:52 GMT-03:00 Sergey Lavrushkin :
> > 2018-05-28 9:32 GMT+03:00 Guo, Yejun :
> >
> >> looks that no tensorflow dependency is introduced, a new model format is
> >> created together with some CPU implementation for inference.   With this
> >> idea, Android Neural Network would be a very good reference, see
> >> https://developer.android.google.cn/ndk/guides/neuralnetworks/. It
> >> defines how the model is organized, and also provided a CPU optimized
> >> inference implementation (within the NNAPI runtime, it is open source).
> It
> >> is still under development but mature enough to run some popular dnn
> models
> >> with proper performance. We can absorb some basic design. Anyway, just a
> >> reference fyi.  (btw, I'm not sure about any IP issue)
> >>
> >
> > The idea was to first introduce something to use when tensorflow is not
> > available. Here is another patch, that introduces tensorflow backend.
> I think it would be better for reviewing if you send the second patch
> in a new email.


Then we need to push the first patch, I think.


> >
> >
> >> For this patch, I have two comments.
> >>
> >> 1. change from "DNNModel* (*load_default_model)(DNNDefaultModel
> >> model_type);" to " DNNModel* (*load_builtin_model)(DNNBuiltinModel
> >> model_type);"
> >> The DNNModule can be invoked by many filters,  default model is a good
> >> name at the filter level, while built-in model is better within the DNN
> >> scope.
> >>
> >> typedef struct DNNModule{
> >> // Loads model and parameters from given file. Returns NULL if it is
> >> not possible.
> >> DNNModel* (*load_model)(const char* model_filename);
> >> // Loads one of the default models
> >> DNNModel* (*load_default_model)(DNNDefaultModel model_type);
> >> // Executes model with specified input and output. Returns DNN_ERROR
> >> otherwise.
> >> DNNReturnType (*execute_model)(const DNNModel* model);
> >> // Frees memory allocated for model.
> >> void (*free_model)(DNNModel** model);
> >> } DNNModule;
> >>
> >>
> >> 2. add a new variable 'number' for DNNData/InputParams
> >> As a typical DNN concept, the data shape usually is:  >> width, channel> or , the last component
> >> denotes its index changes the fastest in the memory. We can add this
> >> concept into the API, and decide to support  or  or both.
> >
> >
> > I did not add number of elements in batch because I thought, that we
> would
> > not feed more than one element at once to a network in a ffmpeg filter.
> > But it can be easily added if necessary.
> >
> > So here is the patch that adds tensorflow backend with the previous
> patch.
> > I forgot to change include guards from AVUTIL_* to AVFILTER_* in it.
> You moved the files from libavutil to libavfilter while it was
> proposed to move them to libavformat.


Not only, it was also proposed to move it to libavfilter if it is going to
be used only
in filters. I do not know if this module is useful anywhere else besides
libavfilter.
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [GSOC] [PATCH] TensorFlow backend introduction for DNN module

2018-06-01 Thread Sergey Lavrushkin

2018-06-01 6:09 GMT+03:00 Guo, Yejun :

> Did you try to build ffmpeg with TENSORFLOW_BACKEND enabled, and run it
> without TF library?  This case is possible when an end user install
> pre-built package on a machine without TF library.
>
> In function init, the logic is to fall back to cpu path (DNN_NATIVE) if
> unable to load tensorflow backend. While in function ff_get_dnn_module, it
> has no chance to 'return NULL'.
>

I tried to run ffmpeg built with libtensorflow enabled and without
tensorflow library, it didn't start. I got this message:

ffmpeg: error while loading shared libraries: libtensorflow.so: cannot open
shared object file: No such file or directory

Is it even possible to run it without library that was enabled during
configuration? Maybe I need to change something in the
configure script? Otherwise there is no point to add any fallback to
DNN_NATIVE, if it just won't start.
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [GSOC] [PATCH] TensorFlow backend introduction for DNN module

2018-06-02 Thread Sergey Lavrushkin

2018-06-02 19:45 GMT+03:00 James Almer :

> On 5/31/2018 12:01 PM, Sergey Lavrushkin wrote:
> > diff --git a/Changelog b/Changelog
> > index df2024fb59..a667fd045d 100644
> > --- a/Changelog
> > +++ b/Changelog
> > @@ -11,6 +11,7 @@ version :
> >  - support mbedTLS based TLS
> >  - DNN inference interface
> >  - Reimplemented SRCNN filter using DNN inference interface
> > +- TensorFlow DNN backend
>
> This and the two entries you added earlier don't really belong here.
> It's enough with the line stating the filter was introduced back in
> ffmpeg 4.0
>

I should not add any line regarding introduced DNN inference module,
that can be usefull for someone writing another filter based on DNN?


> >
> >
> >  version 4.0:
> > diff --git a/configure b/configure
> > index 09ff0c55e2..47e21fec39 100755
> > --- a/configure
> > +++ b/configure
> > @@ -259,6 +259,7 @@ External library support:
> >--enable-libspeexenable Speex de/encoding via libspeex [no]
> >--enable-libsrt  enable Haivision SRT protocol via libsrt [no]
> >--enable-libssh  enable SFTP protocol via libssh [no]
> > +  --enable-libtensorflow   enable TensorFlow as a DNN module backend
> [no]
>
> Maybe mention it's for the srcnn filter.
>
> >--enable-libtesseractenable Tesseract, needed for ocr filter [no]
> >--enable-libtheora   enable Theora encoding via libtheora [no]
> >--enable-libtls  enable LibreSSL (via libtls), needed for
> https support
> > @@ -1713,6 +1714,7 @@ EXTERNAL_LIBRARY_LIST="
> >  libspeex
> >  libsrt
> >  libssh
> > +libtensorflow
> >  libtesseract
> >  libtheora
> >  libtwolame
> > @@ -3453,7 +3455,7 @@ avcodec_select="null_bsf"
> >  avdevice_deps="avformat avcodec avutil"
> >  avdevice_suggest="libm"
> >  avfilter_deps="avutil"
> > -avfilter_suggest="libm"
> > +avfilter_suggest="libm libtensorflow"
>
> Add instead
>
> srcnn_filter_suggest="libtensorflow"
>
> To the corresponding section.
>

But this DNN inference module can be used for other filters.
At least, I think, that after training more complicated models for super
resolution I'll have to add them as separate filters.
So, I thought, this module shouldn't be a part of srcnn filter from the
begining.
Or is it better to add  *_filter_suggest="libtensorflow" to the configure
script and
dnn_*.o to the Makefile for every new filter based on this module?


> >  avformat_deps="avcodec avutil"
> >  avformat_suggest="libm network zlib"
> >  avresample_deps="avutil"
> > @@ -6055,6 +6057,7 @@ enabled libsoxr   && require libsoxr
> soxr.h soxr_create -lsoxr
> >  enabled libssh&& require_pkg_config libssh libssh
> libssh/sftp.h sftp_init
> >  enabled libspeex  && require_pkg_config libspeex speex
> speex/speex.h speex_decoder_init
> >  enabled libsrt&& require_pkg_config libsrt "srt >= 1.2.0"
> srt/srt.h srt_socket
> > +enabled libtensorflow && require libtensorflow tensorflow/c/c_api.h
> TF_Version -ltensorflow && add_cflags -DTENSORFLOW_BACKEND
>
> Superfluous define. Just check for CONFIG_LIBTENSORFLOW instead.
>
> >  enabled libtesseract  && require_pkg_config libtesseract tesseract
> tesseract/capi.h TessBaseAPICreate
> >  enabled libtheora && require libtheora theora/theoraenc.h
> th_info_init -ltheoraenc -ltheoradec -logg
> >  enabled libtls&& require_pkg_config libtls libtls tls.h
> tls_configure
> > diff --git a/libavfilter/Makefile b/libavfilter/Makefile
> > index 3201cbeacf..82915e2f75 100644
> > --- a/libavfilter/Makefile
> > +++ b/libavfilter/Makefile
> > @@ -14,6 +14,7 @@ OBJS = allfilters.o
>  \
> > buffersrc.o
> \
> > dnn_interface.o
> \
> > dnn_backend_native.o
>  \
> > +   dnn_backend_tf.o
>  \
>
> See Jan Ekström's patch. Add this to the filter's entry as all these
> source files should not be compiled unconditionally.
>
> > drawutils.o
> \
> > fifo.o
>  \
> > formats.o
> \
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [GSOC] [PATCH] TensorFlow backend introduction for DNN module

2018-06-03 Thread Sergey Lavrushkin

2018-06-03 19:57 GMT+03:00 Pedro Arthur :

> 2018-05-31 12:01 GMT-03:00 Sergey Lavrushkin :
> > Hello,
> >
> > This patch introduces TensorFlow backend for DNN inference module.
> > This backend uses TensorFlow binary models and requires from model
> > to have the operation named 'x' as an input operation and the operation
> > named 'y' as an output operation. Models are executed using
> libtensorflow.
>
> Hi,
>
> You added the tf model in dnn_srcnn.h, it seems the data is being
> duplicated as it already contains the weights as C float arrays.
> Is it possible to construct the model graph via C api and set the
> weights using the ones we already have, eliminating the need for
> storing the whole tf model?


Hi,

I think, it is possible, but it will require to manually create every
operation
and specify each of their attributes and inputs in a certain order
specified by
operations declaration. Here is that model:
https://drive.google.com/file/d/1s7bW7QnUfmTaYoMLPdYYTOLujqNgRq0J/view?usp=sharing
It is just a lot easier to store the whole model and not construct it
manually.
Another way, I think of, is to pass weights in placeholders and not save
them in
model, but it has to be done when session is already created and not during
model
loading. Maybe some init operation can be specified with variables
assignment to values
passed through placeholders during model loading, if it is possible. But is
it really crucial
to not store the whole tf model? It is not that big.
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [GSOC] [PATCH] TensorFlow backend introduction for DNN module

2018-06-03 Thread Sergey Lavrushkin

>
> My concern is when we add more models, currently we have to store 2
> models, one for the "native" implementation and one for the TF
> backend.
> There is also the case were one wants to update the weights for a
> model, it will be necessary to update both the native and TF data.
> Having duplicated data is much easier to get inconsistencies between
> implementations.
>

I understand the problem, but I am afraid that manual graph construction can
take a lot of time, especially if we add something more complicated than
srcnn,
and the second approach passing weights in placeholders will require to
add some logic for it in other parts of API besides model loading.
I am thinking of another way, that is to get weights for native model from
this binary
tf model, if they are stored there consistently, and not specify them as
float arrays.
But then for each new model we need to find offsets for each weights array.
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [GSOC] [PATCH] TensorFlow backend introduction for DNN module

2018-06-05 Thread Sergey Lavrushkin

2018-06-05 17:20 GMT+03:00 James Almer :

> On 6/3/2018 3:02 PM, Sergey Lavrushkin wrote:
> > diff --git a/libavfilter/vf_srcnn.c b/libavfilter/vf_srcnn.c
> > index d6efe9b478..5c5e26b33a 100644
> > --- a/libavfilter/vf_srcnn.c
> > +++ b/libavfilter/vf_srcnn.c
> > @@ -41,7 +41,6 @@ typedef struct SRCNNContext {
> >  DNNData input_output;
> >  } SRCNNContext;
> >
> > -
> >  #define OFFSET(x) offsetof(SRCNNContext, x)
> >  #define FLAGS AV_OPT_FLAG_FILTERING_PARAM | AV_OPT_FLAG_VIDEO_PARAM
> >  static const AVOption srcnn_options[] = {
> > @@ -55,10 +54,19 @@ static av_cold int init(AVFilterContext* context)
> >  {
> >  SRCNNContext* srcnn_context = context->priv;
> >
> > -srcnn_context->dnn_module = ff_get_dnn_module(DNN_NATIVE);
> > +srcnn_context->dnn_module = ff_get_dnn_module(DNN_TF);
>
> This should be a filter AVOption, not hardcoded to one or another. What
> if i, for whatever reason, want to use the native backend when i have
> libtensorflow enabled?
>
> >  if (!srcnn_context->dnn_module){
> > -av_log(context, AV_LOG_ERROR, "could not create dnn module\n");
> > -return AVERROR(ENOMEM);
> > +srcnn_context->dnn_module = ff_get_dnn_module(DNN_NATIVE);
> > +if (!srcnn_context->dnn_module){
> > +av_log(context, AV_LOG_ERROR, "could not create dnn
> module\n");
> > +return AVERROR(ENOMEM);
> > +}
> > +else{
> > +av_log(context, AV_LOG_INFO, "using native backend for DNN
> inference\n");
>
> VERBOSE, not INFO
>
> > +}
> > +}
> > +else{
> > +av_log(context, AV_LOG_INFO, "using tensorflow backend for DNN
> inference\n");
>
> Ditto.
>
> >  }
> >  if (!srcnn_context->model_filename){
> >  av_log(context, AV_LOG_INFO, "model file for network was not
> specified, using default network for x2 upsampling\n");


Here is the patch, that fixes described issues.
From 971e15b4b1e3f2747aa07d0221f99226cba622ac Mon Sep 17 00:00:00 2001
From: Sergey Lavrushkin 
Date: Wed, 6 Jun 2018 01:44:40 +0300
Subject: [PATCH] libavfilter/vf_srcnn.c: adds DNN module backend AVOption,
 changes AV_LOG_INFO message to AV_LOG_VERBOSE.

---
 libavfilter/vf_srcnn.c | 23 +--
 1 file changed, 9 insertions(+), 14 deletions(-)

diff --git a/libavfilter/vf_srcnn.c b/libavfilter/vf_srcnn.c
index 5c5e26b33a..17e380503e 100644
--- a/libavfilter/vf_srcnn.c
+++ b/libavfilter/vf_srcnn.c
@@ -36,6 +36,7 @@ typedef struct SRCNNContext {
 
 char* model_filename;
 float* input_output_buf;
+DNNBackendType backend_type;
 DNNModule* dnn_module;
 DNNModel* model;
 DNNData input_output;
@@ -44,6 +45,9 @@ typedef struct SRCNNContext {
 #define OFFSET(x) offsetof(SRCNNContext, x)
 #define FLAGS AV_OPT_FLAG_FILTERING_PARAM | AV_OPT_FLAG_VIDEO_PARAM
 static const AVOption srcnn_options[] = {
+{ "dnn_backend", "DNN backend used for model execution", OFFSET(backend_type), AV_OPT_TYPE_FLAGS, { .i64 = 0 }, 0, 1, FLAGS, "backend" },
+{ "native", "native backend flag", 0, AV_OPT_TYPE_CONST, { .i64 = 0 }, 0, 0, FLAGS, "backend" },
+{ "tensorflow", "tensorflow backend flag", 0, AV_OPT_TYPE_CONST, { .i64 = 1 }, 0, 0, FLAGS, "backend" },
 { "model_filename", "path to model file specifying network architecture and its parameters", OFFSET(model_filename), AV_OPT_TYPE_STRING, {.str=NULL}, 0, 0, FLAGS },
 { NULL }
 };
@@ -54,29 +58,20 @@ static av_cold int init(AVFilterContext* context)
 {
 SRCNNContext* srcnn_context = context->priv;
 
-srcnn_context->dnn_module = ff_get_dnn_module(DNN_TF);
+srcnn_context->dnn_module = ff_get_dnn_module(srcnn_context->backend_type);
 if (!srcnn_context->dnn_module){
-srcnn_context->dnn_module = ff_get_dnn_module(DNN_NATIVE);
-if (!srcnn_context->dnn_module){
-av_log(context, AV_LOG_ERROR, "could not create dnn module\n");
-return AVERROR(ENOMEM);
-}
-else{
-av_log(context, AV_LOG_INFO, "using native backend for DNN inference\n");
-}
-}
-else{
-av_log(context, AV_LOG_INFO, "using tensorflow backend for DNN inference\n");
+av_log(context, AV_LOG_ERROR, "could not create DNN module for requested backend\n");
+return AVERROR(ENOMEM);
 }
 if (!srcnn_context->model_filename){
-av_log(context, AV_LOG_INFO, "model file for network was not specified, using default network for x2 upsampling\n");
+

Re: [FFmpeg-devel] [GSOC] [PATCH] TensorFlow backend introduction for DNN module

2018-06-07 Thread Sergey Lavrushkin

2018-06-06 17:22 GMT+03:00 Pedro Arthur :

> Hi,
>
> 2018-06-05 20:23 GMT-03:00 Sergey Lavrushkin :
> > Here is the patch, that fixes described issues.
> When I try to run (video input), when tf is not enabled in configure it
> crashes.
>
>
> $ffmpeg -i in.mp4 -vf srcnn=dnn_backend=tensorflow out.mp4
>
> ffmpeg version N-91232-g256386fd3e Copyright (c) 2000-2018 the FFmpeg
> developers
>   built with gcc 7 (Ubuntu 7.3.0-16ubuntu3)
>   configuration:
>   libavutil  56. 18.102 / 56. 18.102
>   libavcodec 58. 19.105 / 58. 19.105
>   libavformat58. 17.100 / 58. 17.100
>   libavdevice58.  4.100 / 58.  4.100
>   libavfilter 7. 25.100 /  7. 25.100
>   libswscale  5.  2.100 /  5.  2.100
>   libswresample   3.  2.100 /  3.  2.100
> Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'in.mp4':
>   Metadata:
> major_brand : isom
> minor_version   : 512
> compatible_brands: isomiso2mp41
> encoder : Lavf58.17.100
>   Duration: 00:06:13.70, start: 0.00, bitrate: 5912 kb/s
> Stream #0:0(und): Video: mpeg4 (Simple Profile) (mp4v /
> 0x7634706D), yuv420p, 1280x720 [SAR 1:1 DAR 16:9], 5777 kb/s, 29.97
> fps, 29.97 tbr, 30k tbn, 30k tbc (default)
> Metadata:
>   handler_name: VideoHandler
> Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz,
> stereo, fltp, 128 kb/s (default)
> Metadata:
>   handler_name: SoundHandler
> Stream mapping:
>   Stream #0:0 -> #0:0 (mpeg4 (native) -> mpeg4 (native))
>   Stream #0:1 -> #0:1 (aac (native) -> aac (native))
> Press [q] to stop, [?] for help
> free(): invalid pointer
> Aborted (core dumped)
>
>
>
> When the output is an image, t does not crashes but neither fallback to
> native
>
>
> $ffmpeg -i in.jpg -vf srcnn=dnn_backend=tensorflow out.png
>
> ffmpeg version N-91232-g256386fd3e Copyright (c) 2000-2018 the FFmpeg
> developers
>   built with gcc 7 (Ubuntu 7.3.0-16ubuntu3)
>   configuration:
>   libavutil  56. 18.102 / 56. 18.102
>   libavcodec 58. 19.105 / 58. 19.105
>   libavformat58. 17.100 / 58. 17.100
>   libavdevice58.  4.100 / 58.  4.100
>   libavfilter 7. 25.100 /  7. 25.100
>   libswscale  5.  2.100 /  5.  2.100
>   libswresample   3.  2.100 /  3.  2.100
> Input #0, image2, from 'in.jpg':
>   Duration: 00:00:00.04, start: 0.00, bitrate: 43469 kb/s
> Stream #0:0: Video: mjpeg, yuvj444p(pc, bt470bg/unknown/unknown),
> 1192x670 [SAR 1:1 DAR 596:335], 25 tbr, 25 tbn, 25 tbc
> Stream mapping:
>   Stream #0:0 -> #0:0 (mjpeg (native) -> png (native))
> Press [q] to stop, [?] for help
> [Parsed_srcnn_0 @ 0x557d3ea55980] could not create DNN module for
> requested backend
> [AVFilterGraph @ 0x557d3ea102c0] Error initializing filter 'srcnn'
> with args 'dnn_backend=tensorflow'
> Error reinitializing filters!
> Failed to inject frame into filter network: Cannot allocate memory
> Error while processing the decoded data for stream #0:0
> Conversion failed!
>
>
> I think you could disable the tensorflow option if it is not enable in
> configure or fallback to native, either solution is ok for me.


I disabled tensorflow option when it is not configured with it. Here is the
updated patch.
I think, crash occurred due to improper call to av_freep for dnn_module.
Here is also the patch, that fixes this bug.
From 33c1e08b650f3724c1317f024d716c8234e283b6 Mon Sep 17 00:00:00 2001
From: Sergey Lavrushkin 
Date: Wed, 6 Jun 2018 01:44:40 +0300
Subject: [PATCH 1/2] libavfilter/vf_srcnn.c: adds DNN module backend AVOption,
 changes AV_LOG_INFO message to AV_LOG_VERBOSE.

---
 libavfilter/vf_srcnn.c | 25 +++--
 1 file changed, 11 insertions(+), 14 deletions(-)

diff --git a/libavfilter/vf_srcnn.c b/libavfilter/vf_srcnn.c
index 5c5e26b33a..bba54f6780 100644
--- a/libavfilter/vf_srcnn.c
+++ b/libavfilter/vf_srcnn.c
@@ -36,6 +36,7 @@ typedef struct SRCNNContext {
 
 char* model_filename;
 float* input_output_buf;
+DNNBackendType backend_type;
 DNNModule* dnn_module;
 DNNModel* model;
 DNNData input_output;
@@ -44,6 +45,11 @@ typedef struct SRCNNContext {
 #define OFFSET(x) offsetof(SRCNNContext, x)
 #define FLAGS AV_OPT_FLAG_FILTERING_PARAM | AV_OPT_FLAG_VIDEO_PARAM
 static const AVOption srcnn_options[] = {
+{ "dnn_backend", "DNN backend used for model execution", OFFSET(backend_type), AV_OPT_TYPE_FLAGS, { .i64 = 0 }, 0, 1, FLAGS, "backend" },
+{ "native", "native backend flag", 0, AV_OPT_TYPE_CONST, { .i64 = 0 }, 0, 0, FLAGS, "backend" },
+#if (CONFIG_LIBTENSORFLOW == 1)
+{ "tensorflow", "tensorflow backend flag", 0, AV_OPT_TYPE_CONST, { .i

Re: [FFmpeg-devel] [GSOC] [PATCH] On the fly generation of default DNN models and code style fixes

2018-07-28 Thread Sergey Lavrushkin

2018-07-28 4:31 GMT+03:00 Michael Niedermayer :

> On Fri, Jul 27, 2018 at 08:06:15PM +0300, Sergey Lavrushkin wrote:
> > Hello,
> >
> > The first patch provides on the fly generation of default DNN models,
> > that eliminates data duplication for model weights. Also, files with
> > internal weights
> > were replaced with automatically generated one for models I trained.
> > Scripts for training and generating these files can be found here:
> > https://github.com/HighVoltageRocknRoll/sr
> > Later, I will add a description to this repo on how to use it and
> benchmark
> > results for trained models.
> >
> > The second patch fixes some code style issues for pointers in DNN module
> > and sr filter. Are there any other code style fixes I should make for
> this
> > code?
>
>
> It seems the code with these patches produces some warnings:
>
> In file included from libavfilter/dnn_backend_native.c:27:0:
> libavfilter/dnn_srcnn.h:2113:21: warning: ‘srcnn_consts’ defined but not
> used [-Wunused-variable]
>  static const float *srcnn_consts[] = {
>  ^
> libavfilter/dnn_srcnn.h:2122:24: warning: ‘srcnn_consts_dims’ defined but
> not used [-Wunused-variable]
>  static const long int *srcnn_consts_dims[] = {
> ^
> libavfilter/dnn_srcnn.h:2142:20: warning: ‘srcnn_activations’ defined but
> not used [-Wunused-variable]
>  static const char *srcnn_activations[] = {
> ^
> In file included from libavfilter/dnn_backend_native.c:28:0:
> libavfilter/dnn_espcn.h:5401:21: warning: ‘espcn_consts’ defined but not
> used [-Wunused-variable]
>  static const float *espcn_consts[] = {
>  ^
> libavfilter/dnn_espcn.h:5410:24: warning: ‘espcn_consts_dims’ defined but
> not used [-Wunused-variable]
>  static const long int *espcn_consts_dims[] = {
> ^
> libavfilter/dnn_espcn.h:5432:20: warning: ‘espcn_activations’ defined but
> not used [-Wunused-variable]
>  static const char *espcn_activations[] = {
> ^
>

Here is the patch, that fixes these warnings.
From 37cd7bdf2610e1c3e89210a49e8f5f3832726281 Mon Sep 17 00:00:00 2001
From: Sergey Lavrushkin 
Date: Sat, 28 Jul 2018 12:55:02 +0300
Subject: [PATCH 3/3] libavfilter: Fixes warnings for unused variables in
 dnn_srcnn.h, dnn_espcn.h, dnn_backend_tf.c.

---
 libavfilter/dnn_backend_tf.c | 64 +++-
 libavfilter/dnn_espcn.h  | 37 -
 libavfilter/dnn_srcnn.h  | 35 
 3 files changed, 63 insertions(+), 73 deletions(-)

diff --git a/libavfilter/dnn_backend_tf.c b/libavfilter/dnn_backend_tf.c
index 6307c794a5..7a4ad72d27 100644
--- a/libavfilter/dnn_backend_tf.c
+++ b/libavfilter/dnn_backend_tf.c
@@ -374,9 +374,71 @@ DNNModel *ff_dnn_load_default_model_tf(DNNDefaultModel model_type)
 TFModel *tf_model = NULL;
 TF_OperationDescription *op_desc;
 TF_Operation *op;
-TF_Operation *const_ops_buffer[6];
 TF_Output input;
 int64_t input_shape[] = {1, -1, -1, 1};
+const char tanh[] = "Tanh";
+const char sigmoid[] = "Sigmoid";
+const char relu[] = "Relu";
+
+const float *srcnn_consts[] = {
+srcnn_conv1_kernel,
+srcnn_conv1_bias,
+srcnn_conv2_kernel,
+srcnn_conv2_bias,
+srcnn_conv3_kernel,
+srcnn_conv3_bias
+};
+const long int *srcnn_consts_dims[] = {
+srcnn_conv1_kernel_dims,
+srcnn_conv1_bias_dims,
+srcnn_conv2_kernel_dims,
+srcnn_conv2_bias_dims,
+srcnn_conv3_kernel_dims,
+srcnn_conv3_bias_dims
+};
+const int srcnn_consts_dims_len[] = {
+4,
+1,
+4,
+1,
+4,
+1
+};
+const char *srcnn_activations[] = {
+relu,
+relu,
+relu
+};
+
+const float *espcn_consts[] = {
+espcn_conv1_kernel,
+espcn_conv1_bias,
+espcn_conv2_kernel,
+espcn_conv2_bias,
+espcn_conv3_kernel,
+espcn_conv3_bias
+};
+const long int *espcn_consts_dims[] = {
+espcn_conv1_kernel_dims,
+espcn_conv1_bias_dims,
+espcn_conv2_kernel_dims,
+espcn_conv2_bias_dims,
+espcn_conv3_kernel_dims,
+espcn_conv3_bias_dims
+};
+const int espcn_consts_dims_len[] = {
+4,
+1,
+4,
+1,
+4,
+1
+};
+const char *espcn_activations[] = {
+tanh,
+tanh,
+sigmoid
+};
 
 input.index = 0;
 
diff --git a/libavfilter/dnn_espcn.h b/libavfilter/dnn_espcn.h
index a0dd61cd0d..9344aa90fe 100644
--- a/libavfilter/dnn_espcn.h
+++ b/libavfilter/dnn_espcn.h
@@ -5398,41 +5398,4 @@ static const long int espcn_conv3_bias_dims

Re: [FFmpeg-devel] [GSOC] [PATCH] On the fly generation of default DNN models and code style fixes

2018-07-30 Thread Sergey Lavrushkin

2018-07-30 2:01 GMT+03:00 Michael Niedermayer :

> On Sat, Jul 28, 2018 at 01:00:53PM +0300, Sergey Lavrushkin wrote:
> > 2018-07-28 4:31 GMT+03:00 Michael Niedermayer :
> >
> > > On Fri, Jul 27, 2018 at 08:06:15PM +0300, Sergey Lavrushkin wrote:
> > > > Hello,
> > > >
> > > > The first patch provides on the fly generation of default DNN models,
> > > > that eliminates data duplication for model weights. Also, files with
> > > > internal weights
> > > > were replaced with automatically generated one for models I trained.
> > > > Scripts for training and generating these files can be found here:
> > > > https://github.com/HighVoltageRocknRoll/sr
> > > > Later, I will add a description to this repo on how to use it and
> > > benchmark
> > > > results for trained models.
> > > >
> > > > The second patch fixes some code style issues for pointers in DNN
> module
> > > > and sr filter. Are there any other code style fixes I should make for
> > > this
> > > > code?
> > >
> > >
> > > It seems the code with these patches produces some warnings:
> > >
> > > In file included from libavfilter/dnn_backend_native.c:27:0:
> > > libavfilter/dnn_srcnn.h:2113:21: warning: ‘srcnn_consts’ defined but
> not
> > > used [-Wunused-variable]
> > >  static const float *srcnn_consts[] = {
> > >  ^
> > > libavfilter/dnn_srcnn.h:2122:24: warning: ‘srcnn_consts_dims’ defined
> but
> > > not used [-Wunused-variable]
> > >  static const long int *srcnn_consts_dims[] = {
> > > ^
> > > libavfilter/dnn_srcnn.h:2142:20: warning: ‘srcnn_activations’ defined
> but
> > > not used [-Wunused-variable]
> > >  static const char *srcnn_activations[] = {
> > > ^
> > > In file included from libavfilter/dnn_backend_native.c:28:0:
> > > libavfilter/dnn_espcn.h:5401:21: warning: ‘espcn_consts’ defined but
> not
> > > used [-Wunused-variable]
> > >  static const float *espcn_consts[] = {
> > >  ^
> > > libavfilter/dnn_espcn.h:5410:24: warning: ‘espcn_consts_dims’ defined
> but
> > > not used [-Wunused-variable]
> > >  static const long int *espcn_consts_dims[] = {
> > > ^
> > > libavfilter/dnn_espcn.h:5432:20: warning: ‘espcn_activations’ defined
> but
> > > not used [-Wunused-variable]
> > >  static const char *espcn_activations[] = {
> > > ^
> > >
> >
> > Here is the patch, that fixes these warnings.
>
> >  dnn_backend_tf.c |   64 ++
> -
> >  dnn_espcn.h  |   37 ---
> >  dnn_srcnn.h  |   35 --
> >  3 files changed, 63 insertions(+), 73 deletions(-)
> > 1faef51b86165326a4693c07a203113e2c85f7fb  0003-libavfilter-Fixes-
> warnings-for-unused-variables-in-d.patch
> > From 37cd7bdf2610e1c3e89210a49e8f5f3832726281 Mon Sep 17 00:00:00 2001
> > From: Sergey Lavrushkin 
> > Date: Sat, 28 Jul 2018 12:55:02 +0300
> > Subject: [PATCH 3/3] libavfilter: Fixes warnings for unused variables in
> >  dnn_srcnn.h, dnn_espcn.h, dnn_backend_tf.c.
> >
> > ---
> >  libavfilter/dnn_backend_tf.c | 64 ++
> +-
> >  libavfilter/dnn_espcn.h  | 37 -
> >  libavfilter/dnn_srcnn.h  | 35 
> >  3 files changed, 63 insertions(+), 73 deletions(-)
> >
> > diff --git a/libavfilter/dnn_backend_tf.c b/libavfilter/dnn_backend_tf.c
> > index 6307c794a5..7a4ad72d27 100644
> > --- a/libavfilter/dnn_backend_tf.c
> > +++ b/libavfilter/dnn_backend_tf.c
> > @@ -374,9 +374,71 @@ DNNModel *ff_dnn_load_default_model_tf(DNNDefaultModel
> model_type)
> >  TFModel *tf_model = NULL;
> >  TF_OperationDescription *op_desc;
> >  TF_Operation *op;
> > -TF_Operation *const_ops_buffer[6];
> >  TF_Output input;
> >  int64_t input_shape[] = {1, -1, -1, 1};
> > +const char tanh[] = "Tanh";
> > +const char sigmoid[] = "Sigmoid";
> > +const char relu[] = "Relu";
> > +
> > +const float *srcnn_consts[] = {
> > +srcnn_conv1_kernel,
> > +srcnn_conv1_bias,
> > +srcnn_conv2_kernel,
> > +srcnn_conv2_bias,
> > +srcnn_conv3_kernel,
&

[FFmpeg-devel] [PATCH 7/7] libavfilter: Adds proper file descriptions to dnn_srcnn.h and dnn_espcn.h.

2018-08-02 Thread Sergey Lavrushkin

---
 libavfilter/dnn_espcn.h | 3 ++-
 libavfilter/dnn_srcnn.h | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/libavfilter/dnn_espcn.h b/libavfilter/dnn_espcn.h
index 9344aa90fe..e0013fe1dd 100644
--- a/libavfilter/dnn_espcn.h
+++ b/libavfilter/dnn_espcn.h
@@ -20,7 +20,8 @@
 
 /**
  * @file
- * Default cnn weights for x2 upscaling with espcn model.
+ * This file contains CNN weights for ESPCN model 
(https://arxiv.org/abs/1609.05158),
+ * auto generated by scripts provided in the repository: 
https://github.com/HighVoltageRocknRoll/sr.git.
  */
 
 #ifndef AVFILTER_DNN_ESPCN_H
diff --git a/libavfilter/dnn_srcnn.h b/libavfilter/dnn_srcnn.h
index 4f5332ce18..8bf563bd62 100644
--- a/libavfilter/dnn_srcnn.h
+++ b/libavfilter/dnn_srcnn.h
@@ -20,7 +20,8 @@
 
 /**
  * @file
- * Default cnn weights for x2 upscaling with srcnn model.
+ * This file contains CNN weights for SRCNN model 
(https://arxiv.org/abs/1501.00092),
+ * auto generated by scripts provided in the repository: 
https://github.com/HighVoltageRocknRoll/sr.git.
  */
 
 #ifndef AVFILTER_DNN_SRCNN_H
-- 
2.14.1

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

[FFmpeg-devel] [PATCH 5/7] libavfilter/dnn_backend_tf.c: Fixes ff_dnn_free_model_tf.

2018-08-02 Thread Sergey Lavrushkin

---
 libavfilter/dnn_backend_tf.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/libavfilter/dnn_backend_tf.c b/libavfilter/dnn_backend_tf.c
index 7a4ad72d27..662a2a3c6e 100644
--- a/libavfilter/dnn_backend_tf.c
+++ b/libavfilter/dnn_backend_tf.c
@@ -570,7 +570,9 @@ void ff_dnn_free_model_tf(DNNModel **model)
 if (tf_model->input_tensor){
 TF_DeleteTensor(tf_model->input_tensor);
 }
-av_freep(&tf_model->output_data->data);
+if (tf_model->output_data){
+av_freep(&(tf_model->output_data->data));
+}
 av_freep(&tf_model);
 av_freep(model);
 }
-- 
2.14.1

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

[FFmpeg-devel] [PATCH 6/7] libavfilter/vf_sr.c: Removes uint8 -> float and float -> uint8 conversions.

2018-08-02 Thread Sergey Lavrushkin

This patch removes conversions, declared inside the sr filter, and uses 
libswscale inside
the filter to perform them for only Y channel of input. The sr filter still has 
uint
formats as input, as it does not use chroma channels in models and these 
channels are
upscaled using libswscale, float formats for input would cause unnecessary 
conversions
during scaling for these channels.

---
 libavfilter/vf_sr.c | 134 +++-
 1 file changed, 48 insertions(+), 86 deletions(-)

diff --git a/libavfilter/vf_sr.c b/libavfilter/vf_sr.c
index 944a0e28e7..5ad1baa4c0 100644
--- a/libavfilter/vf_sr.c
+++ b/libavfilter/vf_sr.c
@@ -45,8 +45,8 @@ typedef struct SRContext {
 DNNModel *model;
 DNNData input, output;
 int scale_factor;
-struct SwsContext *sws_context;
-int sws_slice_h;
+struct SwsContext *sws_contexts[3];
+int sws_slice_h, sws_input_linesize, sws_output_linesize;
 } SRContext;
 
 #define OFFSET(x) offsetof(SRContext, x)
@@ -95,6 +95,10 @@ static av_cold int init(AVFilterContext *context)
 return AVERROR(EIO);
 }
 
+sr_context->sws_contexts[0] = NULL;
+sr_context->sws_contexts[1] = NULL;
+sr_context->sws_contexts[2] = NULL;
+
 return 0;
 }
 
@@ -110,6 +114,7 @@ static int query_formats(AVFilterContext *context)
 av_log(context, AV_LOG_ERROR, "could not create formats list\n");
 return AVERROR(ENOMEM);
 }
+
 return ff_set_common_formats(context, formats_list);
 }
 
@@ -140,21 +145,31 @@ static int config_props(AVFilterLink *inlink)
 else{
 outlink->h = sr_context->output.height;
 outlink->w = sr_context->output.width;
+sr_context->sws_contexts[1] = sws_getContext(sr_context->input.width, 
sr_context->input.height, AV_PIX_FMT_GRAY8,
+ sr_context->input.width, 
sr_context->input.height, AV_PIX_FMT_GRAYF32,
+ 0, NULL, NULL, NULL);
+sr_context->sws_input_linesize = sr_context->input.width << 2;
+sr_context->sws_contexts[2] = sws_getContext(sr_context->output.width, 
sr_context->output.height, AV_PIX_FMT_GRAYF32,
+ sr_context->output.width, 
sr_context->output.height, AV_PIX_FMT_GRAY8,
+ 0, NULL, NULL, NULL);
+sr_context->sws_output_linesize = sr_context->output.width << 2;
+if (!sr_context->sws_contexts[1] || !sr_context->sws_contexts[2]){
+av_log(context, AV_LOG_ERROR, "could not create SwsContext for 
conversions\n");
+return AVERROR(ENOMEM);
+}
 switch (sr_context->model_type){
 case SRCNN:
-sr_context->sws_context = sws_getContext(inlink->w, inlink->h, 
inlink->format,
- outlink->w, outlink->h, 
outlink->format, SWS_BICUBIC, NULL, NULL, NULL);
-if (!sr_context->sws_context){
-av_log(context, AV_LOG_ERROR, "could not create SwsContext\n");
+sr_context->sws_contexts[0] = sws_getContext(inlink->w, inlink->h, 
inlink->format,
+ outlink->w, 
outlink->h, outlink->format,
+ SWS_BICUBIC, NULL, 
NULL, NULL);
+if (!sr_context->sws_contexts[0]){
+av_log(context, AV_LOG_ERROR, "could not create SwsContext for 
scaling\n");
 return AVERROR(ENOMEM);
 }
 sr_context->sws_slice_h = inlink->h;
 break;
 case ESPCN:
-if (inlink->format == AV_PIX_FMT_GRAY8){
-sr_context->sws_context = NULL;
-}
-else{
+if (inlink->format != AV_PIX_FMT_GRAY8){
 sws_src_h = sr_context->input.height;
 sws_src_w = sr_context->input.width;
 sws_dst_h = sr_context->output.height;
@@ -184,13 +199,14 @@ static int config_props(AVFilterLink *inlink)
 sws_dst_w = AV_CEIL_RSHIFT(sws_dst_w, 2);
 break;
 default:
-av_log(context, AV_LOG_ERROR, "could not create SwsContext 
for input pixel format");
+av_log(context, AV_LOG_ERROR, "could not create SwsContext 
for scaling for given input pixel format");
 return AVERROR(EIO);
 }
-sr_context->sws_context = sws_getContext(sws_src_w, sws_src_h, 
AV_PIX_FMT_GRAY8,
- sws_dst_w, sws_dst_h, 
AV_PIX_FMT_GRAY8, SWS_BICUBIC, NULL, NULL, NULL);
-if (!sr_context->sws_context){
-av_log(context, AV_LOG_ERROR, "could not create 
SwsContext\n");
+sr_context->sws_contexts[0] = sws_getContext(sws_src_w, 
sws_src_h, AV_PIX_FMT_GRAY8,
+

[FFmpeg-devel] [PATCH 3/7] libavfilter: Fixes warnings for unused variables in dnn_srcnn.h, dnn_espcn.h, dnn_backend_tf.c.

2018-08-02 Thread Sergey Lavrushkin

---
 libavfilter/dnn_backend_tf.c | 64 +++-
 libavfilter/dnn_espcn.h  | 37 -
 libavfilter/dnn_srcnn.h  | 35 
 3 files changed, 63 insertions(+), 73 deletions(-)

diff --git a/libavfilter/dnn_backend_tf.c b/libavfilter/dnn_backend_tf.c
index 6307c794a5..7a4ad72d27 100644
--- a/libavfilter/dnn_backend_tf.c
+++ b/libavfilter/dnn_backend_tf.c
@@ -374,9 +374,71 @@ DNNModel *ff_dnn_load_default_model_tf(DNNDefaultModel 
model_type)
 TFModel *tf_model = NULL;
 TF_OperationDescription *op_desc;
 TF_Operation *op;
-TF_Operation *const_ops_buffer[6];
 TF_Output input;
 int64_t input_shape[] = {1, -1, -1, 1};
+const char tanh[] = "Tanh";
+const char sigmoid[] = "Sigmoid";
+const char relu[] = "Relu";
+
+const float *srcnn_consts[] = {
+srcnn_conv1_kernel,
+srcnn_conv1_bias,
+srcnn_conv2_kernel,
+srcnn_conv2_bias,
+srcnn_conv3_kernel,
+srcnn_conv3_bias
+};
+const long int *srcnn_consts_dims[] = {
+srcnn_conv1_kernel_dims,
+srcnn_conv1_bias_dims,
+srcnn_conv2_kernel_dims,
+srcnn_conv2_bias_dims,
+srcnn_conv3_kernel_dims,
+srcnn_conv3_bias_dims
+};
+const int srcnn_consts_dims_len[] = {
+4,
+1,
+4,
+1,
+4,
+1
+};
+const char *srcnn_activations[] = {
+relu,
+relu,
+relu
+};
+
+const float *espcn_consts[] = {
+espcn_conv1_kernel,
+espcn_conv1_bias,
+espcn_conv2_kernel,
+espcn_conv2_bias,
+espcn_conv3_kernel,
+espcn_conv3_bias
+};
+const long int *espcn_consts_dims[] = {
+espcn_conv1_kernel_dims,
+espcn_conv1_bias_dims,
+espcn_conv2_kernel_dims,
+espcn_conv2_bias_dims,
+espcn_conv3_kernel_dims,
+espcn_conv3_bias_dims
+};
+const int espcn_consts_dims_len[] = {
+4,
+1,
+4,
+1,
+4,
+1
+};
+const char *espcn_activations[] = {
+tanh,
+tanh,
+sigmoid
+};
 
 input.index = 0;
 
diff --git a/libavfilter/dnn_espcn.h b/libavfilter/dnn_espcn.h
index a0dd61cd0d..9344aa90fe 100644
--- a/libavfilter/dnn_espcn.h
+++ b/libavfilter/dnn_espcn.h
@@ -5398,41 +5398,4 @@ static const long int espcn_conv3_bias_dims[] = {
 4
 };
 
-static const float *espcn_consts[] = {
-espcn_conv1_kernel,
-espcn_conv1_bias,
-espcn_conv2_kernel,
-espcn_conv2_bias,
-espcn_conv3_kernel,
-espcn_conv3_bias
-};
-
-static const long int *espcn_consts_dims[] = {
-espcn_conv1_kernel_dims,
-espcn_conv1_bias_dims,
-espcn_conv2_kernel_dims,
-espcn_conv2_bias_dims,
-espcn_conv3_kernel_dims,
-espcn_conv3_bias_dims
-};
-
-static const int espcn_consts_dims_len[] = {
-4,
-1,
-4,
-1,
-4,
-1
-};
-
-static const char espcn_tanh[] = "Tanh";
-
-static const char espcn_sigmoid[] = "Sigmoid";
-
-static const char *espcn_activations[] = {
-espcn_tanh,
-espcn_tanh,
-espcn_sigmoid
-};
-
 #endif
diff --git a/libavfilter/dnn_srcnn.h b/libavfilter/dnn_srcnn.h
index 26143654b8..4f5332ce18 100644
--- a/libavfilter/dnn_srcnn.h
+++ b/libavfilter/dnn_srcnn.h
@@ -2110,39 +2110,4 @@ static const long int srcnn_conv3_bias_dims[] = {
 1
 };
 
-static const float *srcnn_consts[] = {
-srcnn_conv1_kernel,
-srcnn_conv1_bias,
-srcnn_conv2_kernel,
-srcnn_conv2_bias,
-srcnn_conv3_kernel,
-srcnn_conv3_bias
-};
-
-static const long int *srcnn_consts_dims[] = {
-srcnn_conv1_kernel_dims,
-srcnn_conv1_bias_dims,
-srcnn_conv2_kernel_dims,
-srcnn_conv2_bias_dims,
-srcnn_conv3_kernel_dims,
-srcnn_conv3_bias_dims
-};
-
-static const int srcnn_consts_dims_len[] = {
-4,
-1,
-4,
-1,
-4,
-1
-};
-
-static const char srcnn_relu[] = "Relu";
-
-static const char *srcnn_activations[] = {
-srcnn_relu,
-srcnn_relu,
-srcnn_relu
-};
-
 #endif
-- 
2.14.1

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

[FFmpeg-devel] [GSOC][PATCH 0/7] Improvements for sr filter and DNN module

2018-08-02 Thread Sergey Lavrushkin

Hello,

These patches address several raised concerns regarding sr filter
and DNN module. I included three patches, that I've already sent,
but they still have not been reviewed properly.

  libavfilter: Adds on the fly generation of default DNN models for
tensorflow backend instead of storing binary model.
  libavfilter: Code style fixes for pointers in DNN module and sr
filter.
  libavfilter: Fixes warnings for unused variables in dnn_srcnn.h,
dnn_espcn.h, dnn_backend_tf.c.
  Adds gray floating-point pixel formats.
  libavfilter/dnn_backend_tf.c: Fixes ff_dnn_free_model_tf.
  libavfilter/vf_sr.c: Removes uint8 -> float and float -> uint8
conversions.
  libavfilter: Adds proper file descriptions to dnn_srcnn.h and
dnn_espcn.h.

 libavfilter/dnn_backend_native.c |96 +-
 libavfilter/dnn_backend_native.h | 8 +-
 libavfilter/dnn_backend_tf.c |   396 +-
 libavfilter/dnn_backend_tf.h | 8 +-
 libavfilter/dnn_espcn.h  | 17947 +++--
 libavfilter/dnn_interface.c  | 4 +-
 libavfilter/dnn_interface.h  |16 +-
 libavfilter/dnn_srcnn.h  |  6979 +-
 libavfilter/vf_sr.c  |   194 +-
 libavutil/pixdesc.c  |22 +
 libavutil/pixfmt.h   | 5 +
 libswscale/swscale_internal.h| 7 +
 libswscale/swscale_unscaled.c|54 +-
 libswscale/utils.c   | 5 +-
 14 files changed, 7983 insertions(+), 17758 deletions(-)

-- 
2.14.1

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

[FFmpeg-devel] [PATCH 2/7] libavfilter: Code style fixes for pointers in DNN module and sr filter.

2018-08-02 Thread Sergey Lavrushkin

---
 libavfilter/dnn_backend_native.c |  84 +++---
 libavfilter/dnn_backend_native.h |   8 +--
 libavfilter/dnn_backend_tf.c | 108 +++
 libavfilter/dnn_backend_tf.h |   8 +--
 libavfilter/dnn_espcn.h  |   6 +--
 libavfilter/dnn_interface.c  |   4 +-
 libavfilter/dnn_interface.h  |  16 +++---
 libavfilter/dnn_srcnn.h  |   6 +--
 libavfilter/vf_sr.c  |  60 +++---
 9 files changed, 150 insertions(+), 150 deletions(-)

diff --git a/libavfilter/dnn_backend_native.c b/libavfilter/dnn_backend_native.c
index 3e6b86280d..baefea7fcb 100644
--- a/libavfilter/dnn_backend_native.c
+++ b/libavfilter/dnn_backend_native.c
@@ -34,15 +34,15 @@ typedef enum {RELU, TANH, SIGMOID} ActivationFunc;
 
 typedef struct Layer{
 LayerType type;
-float* output;
-void* params;
+float *output;
+void *params;
 } Layer;
 
 typedef struct ConvolutionalParams{
 int32_t input_num, output_num, kernel_size;
 ActivationFunc activation;
-float* kernel;
-float* biases;
+float *kernel;
+float *biases;
 } ConvolutionalParams;
 
 typedef struct InputParams{
@@ -55,16 +55,16 @@ typedef struct DepthToSpaceParams{
 
 // Represents simple feed-forward convolutional network.
 typedef struct ConvolutionalNetwork{
-Layer* layers;
+Layer *layers;
 int32_t layers_num;
 } ConvolutionalNetwork;
 
-static DNNReturnType set_input_output_native(void* model, DNNData* input, 
DNNData* output)
+static DNNReturnType set_input_output_native(void *model, DNNData *input, 
DNNData *output)
 {
-ConvolutionalNetwork* network = (ConvolutionalNetwork*)model;
-InputParams* input_params;
-ConvolutionalParams* conv_params;
-DepthToSpaceParams* depth_to_space_params;
+ConvolutionalNetwork *network = (ConvolutionalNetwork *)model;
+InputParams *input_params;
+ConvolutionalParams *conv_params;
+DepthToSpaceParams *depth_to_space_params;
 int cur_width, cur_height, cur_channels;
 int32_t layer;
 
@@ -72,7 +72,7 @@ static DNNReturnType set_input_output_native(void* model, 
DNNData* input, DNNDat
 return DNN_ERROR;
 }
 else{
-input_params = (InputParams*)network->layers[0].params;
+input_params = (InputParams *)network->layers[0].params;
 input_params->width = cur_width = input->width;
 input_params->height = cur_height = input->height;
 input_params->channels = cur_channels = input->channels;
@@ -88,14 +88,14 @@ static DNNReturnType set_input_output_native(void* model, 
DNNData* input, DNNDat
 for (layer = 1; layer < network->layers_num; ++layer){
 switch (network->layers[layer].type){
 case CONV:
-conv_params = (ConvolutionalParams*)network->layers[layer].params;
+conv_params = (ConvolutionalParams *)network->layers[layer].params;
 if (conv_params->input_num != cur_channels){
 return DNN_ERROR;
 }
 cur_channels = conv_params->output_num;
 break;
 case DEPTH_TO_SPACE:
-depth_to_space_params = 
(DepthToSpaceParams*)network->layers[layer].params;
+depth_to_space_params = (DepthToSpaceParams 
*)network->layers[layer].params;
 if (cur_channels % (depth_to_space_params->block_size * 
depth_to_space_params->block_size) != 0){
 return DNN_ERROR;
 }
@@ -127,16 +127,16 @@ static DNNReturnType set_input_output_native(void* model, 
DNNData* input, DNNDat
 // layers_num,layer_type,layer_parameterss,layer_type,layer_parameters...
 // For CONV layer: activation_function, input_num, output_num, kernel_size, 
kernel, biases
 // For DEPTH_TO_SPACE layer: block_size
-DNNModel* ff_dnn_load_model_native(const char* model_filename)
+DNNModel *ff_dnn_load_model_native(const char *model_filename)
 {
-DNNModel* model = NULL;
-ConvolutionalNetwork* network = NULL;
-AVIOContext* model_file_context;
+DNNModel *model = NULL;
+ConvolutionalNetwork *network = NULL;
+AVIOContext *model_file_context;
 int file_size, dnn_size, kernel_size, i;
 int32_t layer;
 LayerType layer_type;
-ConvolutionalParams* conv_params;
-DepthToSpaceParams* depth_to_space_params;
+ConvolutionalParams *conv_params;
+DepthToSpaceParams *depth_to_space_params;
 
 model = av_malloc(sizeof(DNNModel));
 if (!model){
@@ -155,7 +155,7 @@ DNNModel* ff_dnn_load_model_native(const char* 
model_filename)
 av_freep(&model);
 return NULL;
 }
-model->model = (void*)network;
+model->model = (void *)network;
 
 network->layers_num = 1 + (int32_t)avio_rl32(model_file_context);
 dnn_size = 4;
@@ -251,10 +251,10 @@ DNNModel* ff_dnn_load_model_native(const char* 
model_filename)
 return model;
 }
 
-static int set_up_conv_layer(Layer* layer, const float* kernel, const float* 
biases, ActivationFunc activation,
+stat

[FFmpeg-devel] [PATCH 4/7] Adds gray floating-point pixel formats.

2018-08-02 Thread Sergey Lavrushkin

This patch adds two floating-point gray formats to use them in sr filter for 
conversion with libswscale. I added conversion from uint gray to float and
backwards in swscale_unscaled.c, that is enough for sr filter. But for
proper format addition, should I add anything else?

---
 libavutil/pixdesc.c   | 22 ++
 libavutil/pixfmt.h|  5 
 libswscale/swscale_internal.h |  7 ++
 libswscale/swscale_unscaled.c | 54 +--
 libswscale/utils.c|  5 +++-
 5 files changed, 90 insertions(+), 3 deletions(-)

diff --git a/libavutil/pixdesc.c b/libavutil/pixdesc.c
index 96e079584a..7d307d9120 100644
--- a/libavutil/pixdesc.c
+++ b/libavutil/pixdesc.c
@@ -2198,6 +2198,28 @@ static const AVPixFmtDescriptor 
av_pix_fmt_descriptors[AV_PIX_FMT_NB] = {
 .flags = AV_PIX_FMT_FLAG_PLANAR | AV_PIX_FMT_FLAG_ALPHA |
  AV_PIX_FMT_FLAG_RGB | AV_PIX_FMT_FLAG_FLOAT,
 },
+[AV_PIX_FMT_GRAYF32BE] = {
+.name = "grayf32be",
+.nb_components = 1,
+.log2_chroma_w = 0,
+.log2_chroma_h = 0,
+.comp = {
+{ 0, 4, 0, 0, 32, 3, 31, 1 },   /* Y */
+},
+.flags = AV_PIX_FMT_FLAG_BE | AV_PIX_FMT_FLAG_FLOAT,
+.alias = "yf32be",
+},
+[AV_PIX_FMT_GRAYF32LE] = {
+.name = "grayf32le",
+.nb_components = 1,
+.log2_chroma_w = 0,
+.log2_chroma_h = 0,
+.comp = {
+{ 0, 4, 0, 0, 32, 3, 31, 1 },   /* Y */
+},
+.flags = AV_PIX_FMT_FLAG_FLOAT,
+.alias = "yf32le",
+},
 [AV_PIX_FMT_DRM_PRIME] = {
 .name = "drm_prime",
 .flags = AV_PIX_FMT_FLAG_HWACCEL,
diff --git a/libavutil/pixfmt.h b/libavutil/pixfmt.h
index 2b3307845e..aa9a4f60c1 100644
--- a/libavutil/pixfmt.h
+++ b/libavutil/pixfmt.h
@@ -320,6 +320,9 @@ enum AVPixelFormat {
 AV_PIX_FMT_GBRAPF32BE, ///< IEEE-754 single precision planar GBRA 4:4:4:4, 
128bpp, big-endian
 AV_PIX_FMT_GBRAPF32LE, ///< IEEE-754 single precision planar GBRA 4:4:4:4, 
128bpp, little-endian
 
+AV_PIX_FMT_GRAYF32BE,  ///< IEEE-754 single precision Y, 32bpp, big-endian
+AV_PIX_FMT_GRAYF32LE,  ///< IEEE-754 single precision Y, 32bpp, 
little-endian
+
 /**
  * DRM-managed buffers exposed through PRIME buffer sharing.
  *
@@ -405,6 +408,8 @@ enum AVPixelFormat {
 #define AV_PIX_FMT_GBRPF32AV_PIX_FMT_NE(GBRPF32BE,  GBRPF32LE)
 #define AV_PIX_FMT_GBRAPF32   AV_PIX_FMT_NE(GBRAPF32BE, GBRAPF32LE)
 
+#define AV_PIX_FMT_GRAYF32 AV_PIX_FMT_NE(GRAYF32BE, GRAYF32LE)
+
 #define AV_PIX_FMT_YUVA420P9  AV_PIX_FMT_NE(YUVA420P9BE , YUVA420P9LE)
 #define AV_PIX_FMT_YUVA422P9  AV_PIX_FMT_NE(YUVA422P9BE , YUVA422P9LE)
 #define AV_PIX_FMT_YUVA444P9  AV_PIX_FMT_NE(YUVA444P9BE , YUVA444P9LE)
diff --git a/libswscale/swscale_internal.h b/libswscale/swscale_internal.h
index 1703856ab2..4a2cdfe658 100644
--- a/libswscale/swscale_internal.h
+++ b/libswscale/swscale_internal.h
@@ -764,6 +764,13 @@ static av_always_inline int isAnyRGB(enum AVPixelFormat 
pix_fmt)
 pix_fmt == AV_PIX_FMT_MONOBLACK || pix_fmt == AV_PIX_FMT_MONOWHITE;
 }
 
+static av_always_inline int isFloat(enum AVPixelFormat pix_fmt)
+{
+const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(pix_fmt);
+av_assert0(desc);
+return desc->flags & AV_PIX_FMT_FLAG_FLOAT;
+}
+
 static av_always_inline int isALPHA(enum AVPixelFormat pix_fmt)
 {
 const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(pix_fmt);
diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c
index 6480070cbf..f5b4c9be9d 100644
--- a/libswscale/swscale_unscaled.c
+++ b/libswscale/swscale_unscaled.c
@@ -1467,6 +1467,46 @@ static int yvu9ToYv12Wrapper(SwsContext *c, const 
uint8_t *src[],
 return srcSliceH;
 }
 
+static int uint_y_to_float_y_wrapper(SwsContext *c, const uint8_t *src[],
+ int srcStride[], int srcSliceY,
+ int srcSliceH, uint8_t *dst[], int 
dstStride[])
+{
+int y, x;
+int dstStrideFloat = dstStride[0] >> 2;;
+const uint8_t *srcPtr = src[0];
+float *dstPtr = (float *)(dst[0] + dstStride[0] * srcSliceY);
+
+for (y = 0; y < srcSliceH; ++y){
+for (x = 0; x < c->srcW; ++x){
+dstPtr[x] = (float)srcPtr[x] / 255.0f;
+}
+srcPtr += srcStride[0];
+dstPtr += dstStrideFloat;
+}
+
+return srcSliceH;
+}
+
+static int float_y_to_uint_y_wrapper(SwsContext *c, const uint8_t* src[],
+ int srcStride[], int srcSliceY,
+ int srcSliceH, uint8_t* dst[], int 
dstStride[])
+{
+int y, x;
+int srcStrideFloat = srcStride[0] >> 2;
+const float *srcPtr = (const float *)src[0];
+uint8_t *dstPtr = dst[0] + dstStride[0] * srcSliceY;
+
+for (y = 0; y < srcSliceH; ++y){
+for (x = 0; x < c->srcW; ++x){
+dstPtr[x] = (uint8_t)(255.0

Re: [FFmpeg-devel] [PATCH 4/7] Adds gray floating-point pixel formats.

2018-08-03 Thread Sergey Lavrushkin

2018-08-03 16:07 GMT+03:00 Michael Niedermayer :

> On Thu, Aug 02, 2018 at 09:52:45PM +0300, Sergey Lavrushkin wrote:
> > This patch adds two floating-point gray formats to use them in sr filter
> for
> > conversion with libswscale. I added conversion from uint gray to float
> and
> > backwards in swscale_unscaled.c, that is enough for sr filter. But for
> > proper format addition, should I add anything else?
> >
> > ---
> >  libavutil/pixdesc.c   | 22 ++
> >  libavutil/pixfmt.h|  5 
> >  libswscale/swscale_internal.h |  7 ++
> >  libswscale/swscale_unscaled.c | 54 ++
> +++--
> >  libswscale/utils.c|  5 +++-
>
> please split this in a patch or libavutil and one for libswscale
> they also need some version.h bump
>

Ok.

also fate tests need an update, (make fate) fails otherwise, the update
> should
> be part of the patch that causes the failure otherwise


In one test for these formats I get:

filter-pixfmts-scale
grayf32be   grayf32le   monob
 f01cb0b623357387827902d9d0963435

I guess, it is because I only implemented conversion in swscale_unscaled.
What can I do to fix it? Should I implement conversion for scaling or maybe
change something in the test, so it would not check these formats (if it is
possible).
Anyway, I need to know what changes should I do and where.


> >  5 files changed, 90 insertions(+), 3 deletions(-)
> >
> > diff --git a/libavutil/pixdesc.c b/libavutil/pixdesc.c
> > index 96e079584a..7d307d9120 100644
> > --- a/libavutil/pixdesc.c
> > +++ b/libavutil/pixdesc.c
> > @@ -2198,6 +2198,28 @@ static const AVPixFmtDescriptor
> av_pix_fmt_descriptors[AV_PIX_FMT_NB] = {
> >  .flags = AV_PIX_FMT_FLAG_PLANAR | AV_PIX_FMT_FLAG_ALPHA |
> >   AV_PIX_FMT_FLAG_RGB | AV_PIX_FMT_FLAG_FLOAT,
> >  },
> > +[AV_PIX_FMT_GRAYF32BE] = {
> > +.name = "grayf32be",
> > +.nb_components = 1,
> > +.log2_chroma_w = 0,
> > +.log2_chroma_h = 0,
> > +.comp = {
> > +{ 0, 4, 0, 0, 32, 3, 31, 1 },   /* Y */
> > +},
> > +.flags = AV_PIX_FMT_FLAG_BE | AV_PIX_FMT_FLAG_FLOAT,
> > +.alias = "yf32be",
> > +},
> > +[AV_PIX_FMT_GRAYF32LE] = {
> > +.name = "grayf32le",
> > +.nb_components = 1,
> > +.log2_chroma_w = 0,
> > +.log2_chroma_h = 0,
> > +.comp = {
> > +{ 0, 4, 0, 0, 32, 3, 31, 1 },   /* Y */
> > +},
> > +.flags = AV_PIX_FMT_FLAG_FLOAT,
> > +.alias = "yf32le",
> > +},
> >  [AV_PIX_FMT_DRM_PRIME] = {
> >  .name = "drm_prime",
> >  .flags = AV_PIX_FMT_FLAG_HWACCEL,
>
> > diff --git a/libavutil/pixfmt.h b/libavutil/pixfmt.h
> > index 2b3307845e..aa9a4f60c1 100644
> > --- a/libavutil/pixfmt.h
> > +++ b/libavutil/pixfmt.h
> > @@ -320,6 +320,9 @@ enum AVPixelFormat {
> >  AV_PIX_FMT_GBRAPF32BE, ///< IEEE-754 single precision planar GBRA
> 4:4:4:4, 128bpp, big-endian
> >  AV_PIX_FMT_GBRAPF32LE, ///< IEEE-754 single precision planar GBRA
> 4:4:4:4, 128bpp, little-endian
> >
> > +AV_PIX_FMT_GRAYF32BE,  ///< IEEE-754 single precision Y, 32bpp,
> big-endian
> > +AV_PIX_FMT_GRAYF32LE,  ///< IEEE-754 single precision Y, 32bpp,
> little-endian
> > +
> >  /**
> >   * DRM-managed buffers exposed through PRIME buffer sharing.
> >   *
>
> new enum values can only be added in such a way that no value of an
> existing
> enum changes. This would change the value of the following enums


Ok.

> @@ -405,6 +408,8 @@ enum AVPixelFormat {
> >  #define AV_PIX_FMT_GBRPF32AV_PIX_FMT_NE(GBRPF32BE,  GBRPF32LE)
> >  #define AV_PIX_FMT_GBRAPF32   AV_PIX_FMT_NE(GBRAPF32BE, GBRAPF32LE)
> >
> > +#define AV_PIX_FMT_GRAYF32 AV_PIX_FMT_NE(GRAYF32BE, GRAYF32LE)
> > +
> >  #define AV_PIX_FMT_YUVA420P9  AV_PIX_FMT_NE(YUVA420P9BE , YUVA420P9LE)
> >  #define AV_PIX_FMT_YUVA422P9  AV_PIX_FMT_NE(YUVA422P9BE , YUVA422P9LE)
> >  #define AV_PIX_FMT_YUVA444P9  AV_PIX_FMT_NE(YUVA444P9BE , YUVA444P9LE)
> > diff --git a/libswscale/swscale_internal.h
> b/libswscale/swscale_internal.h
> > index 1703856ab2..4a2cdfe658 100644
> > --- a/libswscale/swscale_internal.h
> > +++ b/libswscale/swscale_internal.h
> > @@ -764,6 +764,13 @@ static av_always_inline int isAnyRGB(enum
> AVPixelFormat pix_fmt)
> >  pix_fmt == AV_PIX_FMT_MONOBLACK || pix

Re: [FFmpeg-devel] [PATCH 4/7] Adds gray floating-point pixel formats.

2018-08-03 Thread Sergey Lavrushkin

2018-08-04 0:11 GMT+03:00 Michael Niedermayer :

> On Fri, Aug 03, 2018 at 10:33:00PM +0300, Sergey Lavrushkin wrote:
> > 2018-08-03 16:07 GMT+03:00 Michael Niedermayer :
> >
> > > On Thu, Aug 02, 2018 at 09:52:45PM +0300, Sergey Lavrushkin wrote:
> > > > This patch adds two floating-point gray formats to use them in sr
> filter
> > > for
> > > > conversion with libswscale. I added conversion from uint gray to
> float
> > > and
> > > > backwards in swscale_unscaled.c, that is enough for sr filter. But
> for
> > > > proper format addition, should I add anything else?
> > > >
> > > > ---
> > > >  libavutil/pixdesc.c   | 22 ++
> > > >  libavutil/pixfmt.h|  5 
> > > >  libswscale/swscale_internal.h |  7 ++
> > > >  libswscale/swscale_unscaled.c | 54 ++
> > > +++--
> > > >  libswscale/utils.c|  5 +++-
> > >
> > > please split this in a patch or libavutil and one for libswscale
> > > they also need some version.h bump
> > >
> >
> > Ok.
> >
> > also fate tests need an update, (make fate) fails otherwise, the update
> > > should
> > > be part of the patch that causes the failure otherwise
> >
> >
> > In one test for these formats I get:
> >
> > filter-pixfmts-scale
> > grayf32be   grayf32le   monob
> >  f01cb0b623357387827902d9d0963435
> >
> > I guess, it is because I only implemented conversion in swscale_unscaled.
> > What can I do to fix it? Should I implement conversion for scaling or
> maybe
> > change something in the test, so it would not check these formats (if it
> is
> > possible).
> > Anyway, I need to know what changes should I do and where.
>
> well, swscale shouldnt really have formats only half supported
> so for any supported format in and out it should work with any
> width / height in / out
>
> Theres a wide range of possibilities how to implement this.
> The correct / ideal way is of course to implement a full floating point
> path
> for scaling along side the integer code.
> a simpler aprouch would be to convert from/to float to/from  integers and
> use
> the existing code. (this of course has the disadvantage of loosing
> precission)
>

Well, I want to implement simpler approach, as I still have to finish
correcting sr filter.
But I need some explanations regarding what I should add. If I understand
correcly,
I need to add conversion from float to the ff_sws_init_input_funcs function
in libswscale/input.c
and conversion to float to the ff_sws_init_output_funcs function in
libswscale/output.c
If I am not mistaken, in the first case I need to provide c->lumToYV12 and
in the second case -
yuv2plane1 and yuv2planeX. So, in the first case, to what format should I
add
conversion, specifically what number of bits per pixel should be used? As I
look through other
conversion functions, it seems that somewhere uint8 is used and somewhere -
uint16.
Is it somehow determined later during scaling? If I am going to convert to
uint8 from
my float format, should I define it somewhere, that I am converting to
uint8?
And in the second case, I don't completely understand, what these two
functions are
doing, especially tha last one with filters. Is it also just simple
conversions or
these functions also cover something else? And in their descriptions it is
written, that:

 * @param src scaled source data, 15 bits for 8-10-bit output,
 *19 bits for 16-bit output (in int32_t)
 * @param destpointer to the output plane. For >8-bit
 *output, this is in uint16_t

In my case, the output is 32-bit. Does this mean, that float type,
basically, is not
supported and I also have to modify something in scaling? If so, what
should I add?

> [...]
> > > +const uint8_t *srcPtr = src[0];
> > > > +float *dstPtr = (float *)(dst[0] + dstStride[0] * srcSliceY);
> > > > +
> > > > +for (y = 0; y < srcSliceH; ++y){
> > > > +for (x = 0; x < c->srcW; ++x){
> > > > +dstPtr[x] = (float)srcPtr[x] / 255.0f;
> > >
> > > division is slow. This should either be a multiplication with the
> > > inverse or a LUT with 8bit index changing to float.
> > >
> > > The faster of them should be used
> > >
> >
> > LUT seems to be faster. Can I place it in SwsContext and initialize it in
> > sws_init_context when necessary?
>
> yes of course
>
> thanks
>
>
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH 2/7] libavfilter: Code style fixes for pointers in DNN module and sr filter.

2018-08-06 Thread Sergey Lavrushkin

Updated patch.

2018-08-06 17:55 GMT+03:00 Pedro Arthur :

> 2018-08-02 15:52 GMT-03:00 Sergey Lavrushkin :
> > ---
> >  libavfilter/dnn_backend_native.c |  84 +++---
> >  libavfilter/dnn_backend_native.h |   8 +--
> >  libavfilter/dnn_backend_tf.c | 108 +++---
> -
> >  libavfilter/dnn_backend_tf.h |   8 +--
> >  libavfilter/dnn_espcn.h  |   6 +--
> >  libavfilter/dnn_interface.c  |   4 +-
> >  libavfilter/dnn_interface.h  |  16 +++---
> >  libavfilter/dnn_srcnn.h  |   6 +--
> >  libavfilter/vf_sr.c  |  60 +++---
> >  9 files changed, 150 insertions(+), 150 deletions(-)
> >
> > diff --git a/libavfilter/dnn_backend_native.c b/libavfilter/dnn_backend_
> native.c
> > index 3e6b86280d..baefea7fcb 100644
> > --- a/libavfilter/dnn_backend_native.c
> > +++ b/libavfilter/dnn_backend_native.c
> > @@ -34,15 +34,15 @@ typedef enum {RELU, TANH, SIGMOID} ActivationFunc;
> >
> >  typedef struct Layer{
> >  LayerType type;
> > -float* output;
> > -void* params;
> > +float *output;
> > +void *params;
> >  } Layer;
> >
> >  typedef struct ConvolutionalParams{
> >  int32_t input_num, output_num, kernel_size;
> >  ActivationFunc activation;
> > -float* kernel;
> > -float* biases;
> > +float *kernel;
> > +float *biases;
> >  } ConvolutionalParams;
> >
> >  typedef struct InputParams{
> > @@ -55,16 +55,16 @@ typedef struct DepthToSpaceParams{
> >
> >  // Represents simple feed-forward convolutional network.
> >  typedef struct ConvolutionalNetwork{
> > -Layer* layers;
> > +Layer *layers;
> >  int32_t layers_num;
> >  } ConvolutionalNetwork;
> >
> > -static DNNReturnType set_input_output_native(void* model, DNNData*
> input, DNNData* output)
> > +static DNNReturnType set_input_output_native(void *model, DNNData
> *input, DNNData *output)
> >  {
> > -ConvolutionalNetwork* network = (ConvolutionalNetwork*)model;
> > -InputParams* input_params;
> > -ConvolutionalParams* conv_params;
> > -DepthToSpaceParams* depth_to_space_params;
> > +ConvolutionalNetwork *network = (ConvolutionalNetwork *)model;
> > +InputParams *input_params;
> > +ConvolutionalParams *conv_params;
> > +DepthToSpaceParams *depth_to_space_params;
> >  int cur_width, cur_height, cur_channels;
> >  int32_t layer;
> >
> > @@ -72,7 +72,7 @@ static DNNReturnType set_input_output_native(void*
> model, DNNData* input, DNNDat
> >  return DNN_ERROR;
> >  }
> >  else{
> > -input_params = (InputParams*)network->layers[0].params;
> > +input_params = (InputParams *)network->layers[0].params;
> >  input_params->width = cur_width = input->width;
> >  input_params->height = cur_height = input->height;
> >  input_params->channels = cur_channels = input->channels;
> > @@ -88,14 +88,14 @@ static DNNReturnType set_input_output_native(void*
> model, DNNData* input, DNNDat
> >  for (layer = 1; layer < network->layers_num; ++layer){
> >  switch (network->layers[layer].type){
> >  case CONV:
> > -conv_params = (ConvolutionalParams*)network-
> >layers[layer].params;
> > +conv_params = (ConvolutionalParams *)network->layers[layer].
> params;
> >  if (conv_params->input_num != cur_channels){
> >  return DNN_ERROR;
> >  }
> >  cur_channels = conv_params->output_num;
> >  break;
> >  case DEPTH_TO_SPACE:
> > -depth_to_space_params = (DepthToSpaceParams*)network->
> layers[layer].params;
> > +depth_to_space_params = (DepthToSpaceParams
> *)network->layers[layer].params;
> >  if (cur_channels % (depth_to_space_params->block_size *
> depth_to_space_params->block_size) != 0){
> >  return DNN_ERROR;
> >  }
> > @@ -127,16 +127,16 @@ static DNNReturnType set_input_output_native(void*
> model, DNNData* input, DNNDat
> >  // layers_num,layer_type,layer_parameterss,layer_type,layer_
> parameters...
> >  // For CONV layer: activation_function, input_num, output_num,
> kernel_size, kernel, biases
> >  // For DEPTH_TO_SPACE layer: block_size
> > -DNNModel* ff_dnn_load_model_native(const char* model_filename)
> > +DNNMode

Re: [FFmpeg-devel] [PATCH 3/7] libavfilter: Fixes warnings for unused variables in dnn_srcnn.h, dnn_espcn.h, dnn_backend_tf.c.

2018-08-06 Thread Sergey Lavrushkin

Made variables static.

2018-08-06 21:19 GMT+03:00 Pedro Arthur :

> 2018-08-02 15:52 GMT-03:00 Sergey Lavrushkin :
> > ---
> >  libavfilter/dnn_backend_tf.c | 64 ++
> +-
> >  libavfilter/dnn_espcn.h  | 37 -
> >  libavfilter/dnn_srcnn.h  | 35 
> >  3 files changed, 63 insertions(+), 73 deletions(-)
> >
> > diff --git a/libavfilter/dnn_backend_tf.c b/libavfilter/dnn_backend_tf.c
> > index 6307c794a5..7a4ad72d27 100644
> > --- a/libavfilter/dnn_backend_tf.c
> > +++ b/libavfilter/dnn_backend_tf.c
> > @@ -374,9 +374,71 @@ DNNModel *ff_dnn_load_default_model_tf(DNNDefaultModel
> model_type)
> >  TFModel *tf_model = NULL;
> >  TF_OperationDescription *op_desc;
> >  TF_Operation *op;
> > -TF_Operation *const_ops_buffer[6];
> >  TF_Output input;
> >  int64_t input_shape[] = {1, -1, -1, 1};
> > +const char tanh[] = "Tanh";
> > +const char sigmoid[] = "Sigmoid";
> > +const char relu[] = "Relu";
> > +
> > +const float *srcnn_consts[] = {
> > +srcnn_conv1_kernel,
> > +srcnn_conv1_bias,
> > +srcnn_conv2_kernel,
> > +srcnn_conv2_bias,
> > +srcnn_conv3_kernel,
> > +srcnn_conv3_bias
> > +};
> > +const long int *srcnn_consts_dims[] = {
> > +srcnn_conv1_kernel_dims,
> > +srcnn_conv1_bias_dims,
> > +srcnn_conv2_kernel_dims,
> > +srcnn_conv2_bias_dims,
> > +srcnn_conv3_kernel_dims,
> > +srcnn_conv3_bias_dims
> > +};
> > +const int srcnn_consts_dims_len[] = {
> > +4,
> > +1,
> > +4,
> > +1,
> > +4,
> > +1
> > +};
> > +const char *srcnn_activations[] = {
> > +relu,
> > +relu,
> > +relu
> > +};
> > +
> > +const float *espcn_consts[] = {
> > +espcn_conv1_kernel,
> > +espcn_conv1_bias,
> > +espcn_conv2_kernel,
> > +espcn_conv2_bias,
> > +espcn_conv3_kernel,
> > +espcn_conv3_bias
> > +};
> > +const long int *espcn_consts_dims[] = {
> > +espcn_conv1_kernel_dims,
> > +espcn_conv1_bias_dims,
> > +espcn_conv2_kernel_dims,
> > +espcn_conv2_bias_dims,
> > +espcn_conv3_kernel_dims,
> > +espcn_conv3_bias_dims
> > +};
> > +const int espcn_consts_dims_len[] = {
> > +4,
> > +1,
> > +4,
> > +1,
> > +4,
> > +1
> > +};
> > +const char *espcn_activations[] = {
> > +tanh,
> > +tanh,
> > +sigmoid
> > +};
> >
> >  input.index = 0;
> >
> > diff --git a/libavfilter/dnn_espcn.h b/libavfilter/dnn_espcn.h
> > index a0dd61cd0d..9344aa90fe 100644
> > --- a/libavfilter/dnn_espcn.h
> > +++ b/libavfilter/dnn_espcn.h
> > @@ -5398,41 +5398,4 @@ static const long int espcn_conv3_bias_dims[] = {
> >  4
> >  };
> >
> > -static const float *espcn_consts[] = {
> > -espcn_conv1_kernel,
> > -espcn_conv1_bias,
> > -espcn_conv2_kernel,
> > -espcn_conv2_bias,
> > -espcn_conv3_kernel,
> > -espcn_conv3_bias
> > -};
> > -
> > -static const long int *espcn_consts_dims[] = {
> > -espcn_conv1_kernel_dims,
> > -espcn_conv1_bias_dims,
> > -espcn_conv2_kernel_dims,
> > -espcn_conv2_bias_dims,
> > -espcn_conv3_kernel_dims,
> > -espcn_conv3_bias_dims
> > -};
> > -
> > -static const int espcn_consts_dims_len[] = {
> > -4,
> > -1,
> > -4,
> > -1,
> > -4,
> > -1
> > -};
> > -
> > -static const char espcn_tanh[] = "Tanh";
> > -
> > -static const char espcn_sigmoid[] = "Sigmoid";
> > -
> > -static const char *espcn_activations[] = {
> > -espcn_tanh,
> > -espcn_tanh,
> > -espcn_sigmoid
> > -};
> > -
> >  #endif
> > diff --git a/libavfilter/dnn_srcnn.h b/libavfilter/dnn_srcnn.h
> > index 26143654b8..4f5332ce18 100644
> > --- a/libavfilter/dnn_srcnn.h
> > +++ b/libavfilter/dnn_srcnn.h
> > @@ -2110,39 +2110,4 @@ static const long int srcnn_conv3_bias_dims[] = {
> >  1
> >  };
> >
> > -static const floa

Re: [FFmpeg-devel] [PATCH 4/7] Adds gray floating-point pixel formats.

2018-08-06 Thread Sergey Lavrushkin

I split patch to one for libavutil and another for libswscale,
also added LUT for unscaled conversion, added
conversions for scaling and updated fate tests.
From 8bcc10b49c41612b4d6549e64d90acf3f0b3fc6a Mon Sep 17 00:00:00 2001
From: Sergey Lavrushkin 
Date: Fri, 3 Aug 2018 18:02:49 +0300
Subject: [PATCH 4/9] libavutil: Adds gray floating-point pixel formats.

---
 libavutil/pixdesc.c  | 22 ++
 libavutil/pixfmt.h   |  5 +
 libavutil/version.h  |  2 +-
 tests/ref/fate/sws-pixdesc-query |  3 +++
 4 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/libavutil/pixdesc.c b/libavutil/pixdesc.c
index 96e079584a..970a83214c 100644
--- a/libavutil/pixdesc.c
+++ b/libavutil/pixdesc.c
@@ -2206,6 +2206,28 @@ static const AVPixFmtDescriptor av_pix_fmt_descriptors[AV_PIX_FMT_NB] = {
 .name  = "opencl",
 .flags = AV_PIX_FMT_FLAG_HWACCEL,
 },
+[AV_PIX_FMT_GRAYF32BE] = {
+.name = "grayf32be",
+.nb_components = 1,
+.log2_chroma_w = 0,
+.log2_chroma_h = 0,
+.comp = {
+{ 0, 4, 0, 0, 32, 3, 31, 1 },   /* Y */
+},
+.flags = AV_PIX_FMT_FLAG_BE | AV_PIX_FMT_FLAG_FLOAT,
+.alias = "yf32be",
+},
+[AV_PIX_FMT_GRAYF32LE] = {
+.name = "grayf32le",
+.nb_components = 1,
+.log2_chroma_w = 0,
+.log2_chroma_h = 0,
+.comp = {
+{ 0, 4, 0, 0, 32, 3, 31, 1 },   /* Y */
+},
+.flags = AV_PIX_FMT_FLAG_FLOAT,
+.alias = "yf32le",
+},
 };
 #if FF_API_PLUS1_MINUS1
 FF_ENABLE_DEPRECATION_WARNINGS
diff --git a/libavutil/pixfmt.h b/libavutil/pixfmt.h
index 2b3307845e..7b254732d8 100644
--- a/libavutil/pixfmt.h
+++ b/libavutil/pixfmt.h
@@ -337,6 +337,9 @@ enum AVPixelFormat {
 AV_PIX_FMT_GRAY14BE,   ///<Y, 14bpp, big-endian
 AV_PIX_FMT_GRAY14LE,   ///<Y, 14bpp, little-endian
 
+AV_PIX_FMT_GRAYF32BE,  ///< IEEE-754 single precision Y, 32bpp, big-endian
+AV_PIX_FMT_GRAYF32LE,  ///< IEEE-754 single precision Y, 32bpp, little-endian
+
 AV_PIX_FMT_NB ///< number of pixel formats, DO NOT USE THIS if you want to link with shared libav* because the number of formats might differ between versions
 };
 
@@ -405,6 +408,8 @@ enum AVPixelFormat {
 #define AV_PIX_FMT_GBRPF32AV_PIX_FMT_NE(GBRPF32BE,  GBRPF32LE)
 #define AV_PIX_FMT_GBRAPF32   AV_PIX_FMT_NE(GBRAPF32BE, GBRAPF32LE)
 
+#define AV_PIX_FMT_GRAYF32 AV_PIX_FMT_NE(GRAYF32BE, GRAYF32LE)
+
 #define AV_PIX_FMT_YUVA420P9  AV_PIX_FMT_NE(YUVA420P9BE , YUVA420P9LE)
 #define AV_PIX_FMT_YUVA422P9  AV_PIX_FMT_NE(YUVA422P9BE , YUVA422P9LE)
 #define AV_PIX_FMT_YUVA444P9  AV_PIX_FMT_NE(YUVA444P9BE , YUVA444P9LE)
diff --git a/libavutil/version.h b/libavutil/version.h
index 44bdebdc93..5205c5bc60 100644
--- a/libavutil/version.h
+++ b/libavutil/version.h
@@ -79,7 +79,7 @@
  */
 
 #define LIBAVUTIL_VERSION_MAJOR  56
-#define LIBAVUTIL_VERSION_MINOR  18
+#define LIBAVUTIL_VERSION_MINOR  19
 #define LIBAVUTIL_VERSION_MICRO 102
 
 #define LIBAVUTIL_VERSION_INT   AV_VERSION_INT(LIBAVUTIL_VERSION_MAJOR, \
diff --git a/tests/ref/fate/sws-pixdesc-query b/tests/ref/fate/sws-pixdesc-query
index 8071ec484d..451c7d83b9 100644
--- a/tests/ref/fate/sws-pixdesc-query
+++ b/tests/ref/fate/sws-pixdesc-query
@@ -126,6 +126,7 @@ isBE:
   gray14be
   gray16be
   gray9be
+  grayf32be
   nv20be
   p010be
   p016be
@@ -412,6 +413,8 @@ Gray:
   gray16le
   gray9be
   gray9le
+  grayf32be
+  grayf32le
   ya16be
   ya16le
   ya8
-- 
2.14.1

From 35f97f77465bec4344ac7d5a6742388d9c1470cc Mon Sep 17 00:00:00 2001
From: Sergey Lavrushkin 
Date: Fri, 3 Aug 2018 18:06:50 +0300
Subject: [PATCH 5/9] libswscale: Adds conversions from/to float gray format.

---
 libswscale/input.c   | 38 
 libswscale/output.c  | 60 
 libswscale/ppc/swscale_altivec.c |  1 +
 libswscale/swscale_internal.h|  9 +
 libswscale/swscale_unscaled.c| 54 ++--
 libswscale/utils.c   | 20 ++-
 libswscale/x86/swscale_template.c|  3 +-
 tests/ref/fate/filter-pixdesc-grayf32be  |  1 +
 tests/ref/fate/filter-pixdesc-grayf32le  |  1 +
 tests/ref/fate/filter-pixfmts-copy   |  2 ++
 tests/ref/fate/filter-pixfmts-crop   |  2 ++
 tests/ref/fate/filter-pixfmts-field  |  2 ++
 tests/ref/fate/filter-pixfmts-fieldorder |  2 ++
 tests/ref/fate/filter-pixfmts-hflip  |  2 ++
 tests/ref/fate/filter-pixfmts-il |  2 ++
 tests/ref/fate/filter-pixfmts-null   |  2 ++
 tests/ref/fate/filter-pixfmts-scale  |  2 ++
 tests/ref/fate/filter-pixfmts-transpose  |  2 ++
 tests/ref/fate/filter-pixfmts-vflip  |  2 ++
 19 files changed, 203 insertions(+), 4 deletions(-)
 create mode 100644 tests/ref/fate/filter-pi

Re: [FFmpeg-devel] [PATCH 5/7] libavfilter/dnn_backend_tf.c: Fixes ff_dnn_free_model_tf.

2018-08-06 Thread Sergey Lavrushkin

Updated patch.
From 11186187d0b5a4725415a91947f38d5e166e024c Mon Sep 17 00:00:00 2001
From: Sergey Lavrushkin 
Date: Tue, 31 Jul 2018 18:40:24 +0300
Subject: [PATCH 6/9] libavfilter/dnn_backend_tf.c: Fixes ff_dnn_free_model_tf.

---
 libavfilter/dnn_backend_tf.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/libavfilter/dnn_backend_tf.c b/libavfilter/dnn_backend_tf.c
index bd21137a8a..971a914c67 100644
--- a/libavfilter/dnn_backend_tf.c
+++ b/libavfilter/dnn_backend_tf.c
@@ -571,7 +571,9 @@ void ff_dnn_free_model_tf(DNNModel **model)
 if (tf_model->input_tensor){
 TF_DeleteTensor(tf_model->input_tensor);
 }
-av_freep(&tf_model->output_data->data);
+if (tf_model->output_data){
+av_freep(&(tf_model->output_data->data));
+}
 av_freep(&tf_model);
 av_freep(model);
 }
-- 
2.14.1

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH 7/7] libavfilter: Adds proper file descriptions to dnn_srcnn.h and dnn_espcn.h.

2018-08-06 Thread Sergey Lavrushkin

Updated patch.

2018-08-02 21:52 GMT+03:00 Sergey Lavrushkin :

> ---
>  libavfilter/dnn_espcn.h | 3 ++-
>  libavfilter/dnn_srcnn.h | 3 ++-
>  2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/libavfilter/dnn_espcn.h b/libavfilter/dnn_espcn.h
> index 9344aa90fe..e0013fe1dd 100644
> --- a/libavfilter/dnn_espcn.h
> +++ b/libavfilter/dnn_espcn.h
> @@ -20,7 +20,8 @@
>
>  /**
>   * @file
> - * Default cnn weights for x2 upscaling with espcn model.
> + * This file contains CNN weights for ESPCN model (
> https://arxiv.org/abs/1609.05158),
> + * auto generated by scripts provided in the repository:
> https://github.com/HighVoltageRocknRoll/sr.git.
>   */
>
>  #ifndef AVFILTER_DNN_ESPCN_H
> diff --git a/libavfilter/dnn_srcnn.h b/libavfilter/dnn_srcnn.h
> index 4f5332ce18..8bf563bd62 100644
> --- a/libavfilter/dnn_srcnn.h
> +++ b/libavfilter/dnn_srcnn.h
> @@ -20,7 +20,8 @@
>
>  /**
>   * @file
> - * Default cnn weights for x2 upscaling with srcnn model.
> + * This file contains CNN weights for SRCNN model (
> https://arxiv.org/abs/1501.00092),
> + * auto generated by scripts provided in the repository:
> https://github.com/HighVoltageRocknRoll/sr.git.
>   */
>
>  #ifndef AVFILTER_DNN_SRCNN_H
> --
> 2.14.1
>
>
From c2060d992664087fcfffa447768a6ad8f5e38623 Mon Sep 17 00:00:00 2001
From: Sergey Lavrushkin 
Date: Thu, 2 Aug 2018 19:56:23 +0300
Subject: [PATCH 8/9] libavfilter: Adds proper file descriptions to dnn_srcnn.h
 and dnn_espcn.h.

---
 libavfilter/dnn_espcn.h | 3 ++-
 libavfilter/dnn_srcnn.h | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/libavfilter/dnn_espcn.h b/libavfilter/dnn_espcn.h
index 9344aa90fe..e0013fe1dd 100644
--- a/libavfilter/dnn_espcn.h
+++ b/libavfilter/dnn_espcn.h
@@ -20,7 +20,8 @@
 
 /**
  * @file
- * Default cnn weights for x2 upscaling with espcn model.
+ * This file contains CNN weights for ESPCN model (https://arxiv.org/abs/1609.05158),
+ * auto generated by scripts provided in the repository: https://github.com/HighVoltageRocknRoll/sr.git.
  */
 
 #ifndef AVFILTER_DNN_ESPCN_H
diff --git a/libavfilter/dnn_srcnn.h b/libavfilter/dnn_srcnn.h
index 4f5332ce18..8bf563bd62 100644
--- a/libavfilter/dnn_srcnn.h
+++ b/libavfilter/dnn_srcnn.h
@@ -20,7 +20,8 @@
 
 /**
  * @file
- * Default cnn weights for x2 upscaling with srcnn model.
+ * This file contains CNN weights for SRCNN model (https://arxiv.org/abs/1501.00092),
+ * auto generated by scripts provided in the repository: https://github.com/HighVoltageRocknRoll/sr.git.
  */
 
 #ifndef AVFILTER_DNN_SRCNN_H
-- 
2.14.1

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

[FFmpeg-devel] [PATCH] Documentation for sr filter

2018-08-06 Thread Sergey Lavrushkin


From f076c4be5455331958b928fcea6b3dd8da287527 Mon Sep 17 00:00:00 2001
From: Sergey Lavrushkin 
Date: Fri, 3 Aug 2018 17:24:00 +0300
Subject: [PATCH 9/9] doc/filters.texi: Adds documentation for sr filter.

---
 doc/filters.texi | 60 
 1 file changed, 60 insertions(+)

diff --git a/doc/filters.texi b/doc/filters.texi
index 0b0903e5a7..e2436a24e7 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -15394,6 +15394,66 @@ option may cause flicker since the B-Frames have often larger QP. Default is
 @code{0} (not enabled).
 @end table
 
+@section sr
+
+Scale the input by applying one of the super-resolution methods based on
+convolutional neural networks.
+
+Training scripts as well as scripts for model generation are provided in
+the repository @url{https://github.com/HighVoltageRocknRoll/sr.git}.
+
+The filter accepts the following options:
+
+@table @option
+@item model
+Specify what super-resolution model to use. This option accepts the following values:
+
+@table @samp
+@item srcnn
+Super-Resolution Convolutional Neural Network model
+@url{https://arxiv.org/abs/1501.00092}.
+
+@item espcn
+Efficient Sub-Pixel Convolutional Neural Network model
+@url{https://arxiv.org/abs/1609.05158}.
+
+@end table
+
+Default value is @samp{srcnn}.
+
+@item dnn_backend
+Specify what DNN backend to use for model loading and execution. This option accepts
+the following values:
+
+@table @samp
+@item native
+Native implementation of DNN loading and execution.
+
+@item tensorflow
+TensorFlow backend @url{https://www.tensorflow.org/}. To enable this backend you
+need to install the TensorFlow for C library (see
+@url{https://www.tensorflow.org/install/install_c}) and configure FFmpeg with
+@code{--enable-libtensorflow}
+
+@end table
+
+Default value is @samp{native}.
+
+@item scale_factor
+Set scale factor for SRCNN model, for which custom model file was provided.
+Allowed values are @code{2}, @code{3} and @code{4}. Scale factor is neccessary
+for SRCNN model, because it accepts input upscaled using bicubic upscaling with
+proper scale factor.
+
+Default value is @code{2}.
+
+@item model_filename
+Set path to model file specifying network architecture and its parameters.
+Note that different backends use different file format. If path to model
+file is not specified, built-in models for 2x upscaling are used.
+
+@end table
+
 @anchor{subtitles}
 @section subtitles
 
-- 
2.14.1

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH 6/7] libavfilter/vf_sr.c: Removes uint8 -> float and float -> uint8 conversions.

2018-08-06 Thread Sergey Lavrushkin

Updated patch.

2018-08-02 21:52 GMT+03:00 Sergey Lavrushkin :

> This patch removes conversions, declared inside the sr filter, and uses
> libswscale inside
> the filter to perform them for only Y channel of input. The sr filter
> still has uint
> formats as input, as it does not use chroma channels in models and these
> channels are
> upscaled using libswscale, float formats for input would cause unnecessary
> conversions
> during scaling for these channels.
>
> ---
>  libavfilter/vf_sr.c | 134 +++---
> --
>  1 file changed, 48 insertions(+), 86 deletions(-)
>
> diff --git a/libavfilter/vf_sr.c b/libavfilter/vf_sr.c
> index 944a0e28e7..5ad1baa4c0 100644
> --- a/libavfilter/vf_sr.c
> +++ b/libavfilter/vf_sr.c
> @@ -45,8 +45,8 @@ typedef struct SRContext {
>  DNNModel *model;
>  DNNData input, output;
>  int scale_factor;
> -struct SwsContext *sws_context;
> -int sws_slice_h;
> +struct SwsContext *sws_contexts[3];
> +int sws_slice_h, sws_input_linesize, sws_output_linesize;
>  } SRContext;
>
>  #define OFFSET(x) offsetof(SRContext, x)
> @@ -95,6 +95,10 @@ static av_cold int init(AVFilterContext *context)
>  return AVERROR(EIO);
>  }
>
> +sr_context->sws_contexts[0] = NULL;
> +sr_context->sws_contexts[1] = NULL;
> +sr_context->sws_contexts[2] = NULL;
> +
>  return 0;
>  }
>
> @@ -110,6 +114,7 @@ static int query_formats(AVFilterContext *context)
>  av_log(context, AV_LOG_ERROR, "could not create formats list\n");
>  return AVERROR(ENOMEM);
>  }
> +
>  return ff_set_common_formats(context, formats_list);
>  }
>
> @@ -140,21 +145,31 @@ static int config_props(AVFilterLink *inlink)
>  else{
>  outlink->h = sr_context->output.height;
>  outlink->w = sr_context->output.width;
> +sr_context->sws_contexts[1] = sws_getContext(sr_context->input.width,
> sr_context->input.height, AV_PIX_FMT_GRAY8,
> +
>  sr_context->input.width, sr_context->input.height, AV_PIX_FMT_GRAYF32,
> + 0, NULL, NULL, NULL);
> +sr_context->sws_input_linesize = sr_context->input.width << 2;
> +sr_context->sws_contexts[2] = 
> sws_getContext(sr_context->output.width,
> sr_context->output.height, AV_PIX_FMT_GRAYF32,
> +
>  sr_context->output.width, sr_context->output.height, AV_PIX_FMT_GRAY8,
> + 0, NULL, NULL, NULL);
> +sr_context->sws_output_linesize = sr_context->output.width << 2;
> +if (!sr_context->sws_contexts[1] || !sr_context->sws_contexts[2]){
> +av_log(context, AV_LOG_ERROR, "could not create SwsContext
> for conversions\n");
> +return AVERROR(ENOMEM);
> +}
>  switch (sr_context->model_type){
>  case SRCNN:
> -sr_context->sws_context = sws_getContext(inlink->w,
> inlink->h, inlink->format,
> - outlink->w,
> outlink->h, outlink->format, SWS_BICUBIC, NULL, NULL, NULL);
> -if (!sr_context->sws_context){
> -av_log(context, AV_LOG_ERROR, "could not create
> SwsContext\n");
> +sr_context->sws_contexts[0] = sws_getContext(inlink->w,
> inlink->h, inlink->format,
> + outlink->w,
> outlink->h, outlink->format,
> + SWS_BICUBIC,
> NULL, NULL, NULL);
> +if (!sr_context->sws_contexts[0]){
> +av_log(context, AV_LOG_ERROR, "could not create
> SwsContext for scaling\n");
>  return AVERROR(ENOMEM);
>  }
>  sr_context->sws_slice_h = inlink->h;
>  break;
>  case ESPCN:
> -if (inlink->format == AV_PIX_FMT_GRAY8){
> -sr_context->sws_context = NULL;
> -}
> -else{
> +if (inlink->format != AV_PIX_FMT_GRAY8){
>  sws_src_h = sr_context->input.height;
>  sws_src_w = sr_context->input.width;
>  sws_dst_h = sr_context->output.height;
> @@ -184,13 +199,14 @@ static int config_props(AVFilterLink *inlink)
>  sws_dst_w = AV_CEIL_RSHIFT(sws_dst_w, 2);
>  break;
>  default:
> -av_log(context, AV_LOG_ERROR, "could not

Re: [FFmpeg-devel] [PATCH 4/7] Adds gray floating-point pixel formats.

2018-08-09 Thread Sergey Lavrushkin

Here are updated patches with fixes. I updated conversion functions, so
they should
properly work with format for different endianness.

2018-08-08 1:47 GMT+03:00 Michael Niedermayer :

> On Tue, Aug 07, 2018 at 12:17:58AM +0300, Sergey Lavrushkin wrote:
> > I split patch to one for libavutil and another for libswscale,
> > also added LUT for unscaled conversion, added
> > conversions for scaling and updated fate tests.
>
> >  libavutil/pixdesc.c  |   22 ++
> >  libavutil/pixfmt.h   |5 +
> >  libavutil/version.h  |2 +-
> >  tests/ref/fate/sws-pixdesc-query |3 +++
> >  4 files changed, 31 insertions(+), 1 deletion(-)
> > b58f328f5d90954c62957f127b1acbfad5795a4d  0004-libavutil-Adds-gray-
> floating-point-pixel-formats.patch
> > From 8bcc10b49c41612b4d6549e64d90acf3f0b3fc6a Mon Sep 17 00:00:00 2001
> > From: Sergey Lavrushkin 
> > Date: Fri, 3 Aug 2018 18:02:49 +0300
> > Subject: [PATCH 4/9] libavutil: Adds gray floating-point pixel formats.
> >
> > ---
> >  libavutil/pixdesc.c  | 22 ++
> >  libavutil/pixfmt.h   |  5 +
> >  libavutil/version.h  |  2 +-
> >  tests/ref/fate/sws-pixdesc-query |  3 +++
> >  4 files changed, 31 insertions(+), 1 deletion(-)
> >
> > diff --git a/libavutil/pixdesc.c b/libavutil/pixdesc.c
> > index 96e079584a..970a83214c 100644
> > --- a/libavutil/pixdesc.c
> > +++ b/libavutil/pixdesc.c
> > @@ -2206,6 +2206,28 @@ static const AVPixFmtDescriptor
> av_pix_fmt_descriptors[AV_PIX_FMT_NB] = {
> >  .name  = "opencl",
> >  .flags = AV_PIX_FMT_FLAG_HWACCEL,
> >  },
> > +[AV_PIX_FMT_GRAYF32BE] = {
> > +.name = "grayf32be",
> > +.nb_components = 1,
> > +.log2_chroma_w = 0,
> > +.log2_chroma_h = 0,
> > +.comp = {
> > +{ 0, 4, 0, 0, 32, 3, 31, 1 },   /* Y */
> > +},
> > +.flags = AV_PIX_FMT_FLAG_BE | AV_PIX_FMT_FLAG_FLOAT,
> > +.alias = "yf32be",
> > +},
> > +[AV_PIX_FMT_GRAYF32LE] = {
> > +.name = "grayf32le",
> > +.nb_components = 1,
> > +.log2_chroma_w = 0,
> > +.log2_chroma_h = 0,
> > +.comp = {
> > +{ 0, 4, 0, 0, 32, 3, 31, 1 },   /* Y */
> > +},
> > +.flags = AV_PIX_FMT_FLAG_FLOAT,
> > +.alias = "yf32le",
> > +},
> >  };
> >  #if FF_API_PLUS1_MINUS1
> >  FF_ENABLE_DEPRECATION_WARNINGS
> > diff --git a/libavutil/pixfmt.h b/libavutil/pixfmt.h
> > index 2b3307845e..7b254732d8 100644
> > --- a/libavutil/pixfmt.h
> > +++ b/libavutil/pixfmt.h
> > @@ -337,6 +337,9 @@ enum AVPixelFormat {
> >  AV_PIX_FMT_GRAY14BE,   ///<Y, 14bpp, big-endian
> >  AV_PIX_FMT_GRAY14LE,   ///<Y, 14bpp, little-endian
> >
> > +AV_PIX_FMT_GRAYF32BE,  ///< IEEE-754 single precision Y, 32bpp,
> big-endian
> > +AV_PIX_FMT_GRAYF32LE,  ///< IEEE-754 single precision Y, 32bpp,
> little-endian
> > +
> >  AV_PIX_FMT_NB ///< number of pixel formats, DO NOT USE THIS
> if you want to link with shared libav* because the number of formats might
> differ between versions
> >  };
> >
> > @@ -405,6 +408,8 @@ enum AVPixelFormat {
> >  #define AV_PIX_FMT_GBRPF32AV_PIX_FMT_NE(GBRPF32BE,  GBRPF32LE)
> >  #define AV_PIX_FMT_GBRAPF32   AV_PIX_FMT_NE(GBRAPF32BE, GBRAPF32LE)
> >
> > +#define AV_PIX_FMT_GRAYF32 AV_PIX_FMT_NE(GRAYF32BE, GRAYF32LE)
> > +
> >  #define AV_PIX_FMT_YUVA420P9  AV_PIX_FMT_NE(YUVA420P9BE , YUVA420P9LE)
> >  #define AV_PIX_FMT_YUVA422P9  AV_PIX_FMT_NE(YUVA422P9BE , YUVA422P9LE)
> >  #define AV_PIX_FMT_YUVA444P9  AV_PIX_FMT_NE(YUVA444P9BE , YUVA444P9LE)
> > diff --git a/libavutil/version.h b/libavutil/version.h
> > index 44bdebdc93..5205c5bc60 100644
> > --- a/libavutil/version.h
> > +++ b/libavutil/version.h
>
> > @@ -79,7 +79,7 @@
> >   */
> >
> >  #define LIBAVUTIL_VERSION_MAJOR  56
> > -#define LIBAVUTIL_VERSION_MINOR  18
> > +#define LIBAVUTIL_VERSION_MINOR  19
> >  #define LIBAVUTIL_VERSION_MICRO 102
>
> a bump to minor must reset micro to 100
>
> [...]
>
> --
> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> In a rich man's house there is no place to spit but his face.
> -- Diogenes of Sinope
>
> ___
> ffmp

Re: [FFmpeg-devel] [PATCH] Documentation for sr filter

2018-08-09 Thread Sergey Lavrushkin

2018-08-07 13:14 GMT+03:00 Moritz Barsnick :

> On Tue, Aug 07, 2018 at 00:24:29 +0300, Sergey Lavrushkin wrote:
> > +@table @option
> > +@item model
> > +Specify what super-resolution model to use. This option accepts the
> following values:
>^ nit: which
>
> > +Specify what DNN backend to use for model loading and execution. This
> option accepts
> Ditto
>
> > +Allowed values are @code{2}, @code{3} and @code{4}. Scale factor is
> neccessary
>^
> necessary
>
> > +Note that different backends use different file format. If path to model
>^ formats
>

Here is updated patch.
From 1fed1ea07a5727d937228307bffbde13e6727669 Mon Sep 17 00:00:00 2001
From: Sergey Lavrushkin 
Date: Fri, 3 Aug 2018 17:24:00 +0300
Subject: [PATCH 9/9] doc/filters.texi: Adds documentation for sr filter.

---
 doc/filters.texi | 60 
 1 file changed, 60 insertions(+)

diff --git a/doc/filters.texi b/doc/filters.texi
index 0b0903e5a7..e2436a24e7 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -15394,6 +15394,66 @@ option may cause flicker since the B-Frames have often larger QP. Default is
 @code{0} (not enabled).
 @end table
 
+@section sr
+
+Scale the input by applying one of the super-resolution methods based on
+convolutional neural networks.
+
+Training scripts as well as scripts for model generation are provided in
+the repository @url{https://github.com/HighVoltageRocknRoll/sr.git}.
+
+The filter accepts the following options:
+
+@table @option
+@item model
+Specify what super-resolution model to use. This option accepts the following values:
+
+@table @samp
+@item srcnn
+Super-Resolution Convolutional Neural Network model
+@url{https://arxiv.org/abs/1501.00092}.
+
+@item espcn
+Efficient Sub-Pixel Convolutional Neural Network model
+@url{https://arxiv.org/abs/1609.05158}.
+
+@end table
+
+Default value is @samp{srcnn}.
+
+@item dnn_backend
+Specify what DNN backend to use for model loading and execution. This option accepts
+the following values:
+
+@table @samp
+@item native
+Native implementation of DNN loading and execution.
+
+@item tensorflow
+TensorFlow backend @url{https://www.tensorflow.org/}. To enable this backend you
+need to install the TensorFlow for C library (see
+@url{https://www.tensorflow.org/install/install_c}) and configure FFmpeg with
+@code{--enable-libtensorflow}
+
+@end table
+
+Default value is @samp{native}.
+
+@item scale_factor
+Set scale factor for SRCNN model, for which custom model file was provided.
+Allowed values are @code{2}, @code{3} and @code{4}. Scale factor is neccessary
+for SRCNN model, because it accepts input upscaled using bicubic upscaling with
+proper scale factor.
+
+Default value is @code{2}.
+
+@item model_filename
+Set path to model file specifying network architecture and its parameters.
+Note that different backends use different file format. If path to model
+file is not specified, built-in models for 2x upscaling are used.
+
+@end table
+
 @anchor{subtitles}
 @section subtitles
 
-- 
2.14.1

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH] Documentation for sr filter

2018-08-11 Thread Sergey Lavrushkin

Sorry, I accidentally sent previous patch, here is updated version.
From 99afeefe4add5b932140388f48ec4111734aa593 Mon Sep 17 00:00:00 2001
From: Sergey Lavrushkin 
Date: Fri, 3 Aug 2018 17:24:00 +0300
Subject: [PATCH 9/9] doc/filters.texi: Adds documentation for sr filter.

---
 doc/filters.texi | 60 
 1 file changed, 60 insertions(+)

diff --git a/doc/filters.texi b/doc/filters.texi
index 0b0903e5a7..9995ca532b 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -15394,6 +15394,66 @@ option may cause flicker since the B-Frames have often larger QP. Default is
 @code{0} (not enabled).
 @end table
 
+@section sr
+
+Scale the input by applying one of the super-resolution methods based on
+convolutional neural networks.
+
+Training scripts as well as scripts for model generation are provided in
+the repository @url{https://github.com/HighVoltageRocknRoll/sr.git}.
+
+The filter accepts the following options:
+
+@table @option
+@item model
+Specify which super-resolution model to use. This option accepts the following values:
+
+@table @samp
+@item srcnn
+Super-Resolution Convolutional Neural Network model
+@url{https://arxiv.org/abs/1501.00092}.
+
+@item espcn
+Efficient Sub-Pixel Convolutional Neural Network model
+@url{https://arxiv.org/abs/1609.05158}.
+
+@end table
+
+Default value is @samp{srcnn}.
+
+@item dnn_backend
+Specify which DNN backend to use for model loading and execution. This option accepts
+the following values:
+
+@table @samp
+@item native
+Native implementation of DNN loading and execution.
+
+@item tensorflow
+TensorFlow backend @url{https://www.tensorflow.org/}. To enable this backend you
+need to install the TensorFlow for C library (see
+@url{https://www.tensorflow.org/install/install_c}) and configure FFmpeg with
+@code{--enable-libtensorflow}
+
+@end table
+
+Default value is @samp{native}.
+
+@item scale_factor
+Set scale factor for SRCNN model, for which custom model file was provided.
+Allowed values are @code{2}, @code{3} and @code{4}. Scale factor is necessary
+for SRCNN model, because it accepts input upscaled using bicubic upscaling with
+proper scale factor.
+
+Default value is @code{2}.
+
+@item model_filename
+Set path to model file specifying network architecture and its parameters.
+Note that different backends use different file formats. If path to model
+file is not specified, built-in models for 2x upscaling are used.
+
+@end table
+
 @anchor{subtitles}
 @section subtitles
 
-- 
2.14.1

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH 4/7] Adds gray floating-point pixel formats.

2018-08-11 Thread Sergey Lavrushkin

2018-08-10 20:24 GMT+03:00 Michael Niedermayer :

> On Thu, Aug 09, 2018 at 08:15:16PM +0300, Sergey Lavrushkin wrote:
> > Here are updated patches with fixes. I updated conversion functions, so
> > they should
> > properly work with format for different endianness.
> [...]
> > diff --git a/libswscale/input.c b/libswscale/input.c
> > index 3fd3a5d81e..0e016d387f 100644
> > --- a/libswscale/input.c
> > +++ b/libswscale/input.c
> > @@ -942,6 +942,30 @@ static av_always_inline void
> planar_rgb16_to_uv(uint8_t *_dstU, uint8_t *_dstV,
> >  }
> >  #undef rdpx
> >
> > +static av_always_inline void grayf32ToY16_c(uint8_t *_dst, const
> uint8_t *_src, const uint8_t *unused1,
> > +const uint8_t *unused2, int
> width, uint32_t *unused)
> > +{
> > +int i;
> > +const float *src = (const float *)_src;
> > +uint16_t *dst= (uint16_t *)_dst;
> > +
> > +for (i = 0; i < width; ++i){
> > +dst[i] = lrintf(65535.0f * FFMIN(FFMAX(src[i], 0.0f), 1.0f));
> > +}
> > +}
>
> is it faster to clip the float before lrintf() than the integer afterwards
> ?
>

Clipping integers is faster, switched to it.


> [...]
> > diff --git a/libswscale/output.c b/libswscale/output.c
> > index 0af2fffea4..cd408fb285 100644
> > --- a/libswscale/output.c
> > +++ b/libswscale/output.c
> > @@ -208,6 +208,121 @@ static void yuv2p016cX_c(SwsContext *c, const
> int16_t *chrFilter, int chrFilterS
> >  }
> >  }
> >
> > +static av_always_inline void
> > +yuv2plane1_float_c_template(const int32_t *src, float *dest, int dstW)
> > +{
> > +#if HAVE_BIGENDIAN
> > +static const int big_endian = 1;
> > +#else
> > +static const int big_endian = 0;
> > +#endif
>
> you can use HAVE_BIGENDIAN in place of big_endian
> its either 0 or 1 already
> or static const int big_endian = HAVE_BIGENDIAN
>

Ok.

Here is updated patch.
From cf523bcb50537abbf6daf0eb799341d8b706d366 Mon Sep 17 00:00:00 2001
From: Sergey Lavrushkin 
Date: Fri, 3 Aug 2018 18:06:50 +0300
Subject: [PATCH 5/9] libswscale: Adds conversions from/to float gray format.

---
 libswscale/input.c   |  38 +++
 libswscale/output.c  | 105 +++
 libswscale/ppc/swscale_altivec.c |   1 +
 libswscale/swscale_internal.h|   9 +++
 libswscale/swscale_unscaled.c|  54 +++-
 libswscale/utils.c   |  20 +-
 libswscale/x86/swscale_template.c|   3 +-
 tests/ref/fate/filter-pixdesc-grayf32be  |   1 +
 tests/ref/fate/filter-pixdesc-grayf32le  |   1 +
 tests/ref/fate/filter-pixfmts-copy   |   2 +
 tests/ref/fate/filter-pixfmts-crop   |   2 +
 tests/ref/fate/filter-pixfmts-field  |   2 +
 tests/ref/fate/filter-pixfmts-fieldorder |   2 +
 tests/ref/fate/filter-pixfmts-hflip  |   2 +
 tests/ref/fate/filter-pixfmts-il |   2 +
 tests/ref/fate/filter-pixfmts-null   |   2 +
 tests/ref/fate/filter-pixfmts-scale  |   2 +
 tests/ref/fate/filter-pixfmts-transpose  |   2 +
 tests/ref/fate/filter-pixfmts-vflip  |   2 +
 19 files changed, 248 insertions(+), 4 deletions(-)
 create mode 100644 tests/ref/fate/filter-pixdesc-grayf32be
 create mode 100644 tests/ref/fate/filter-pixdesc-grayf32le

diff --git a/libswscale/input.c b/libswscale/input.c
index 3fd3a5d81e..7e45df50ce 100644
--- a/libswscale/input.c
+++ b/libswscale/input.c
@@ -942,6 +942,30 @@ static av_always_inline void planar_rgb16_to_uv(uint8_t *_dstU, uint8_t *_dstV,
 }
 #undef rdpx
 
+static av_always_inline void grayf32ToY16_c(uint8_t *_dst, const uint8_t *_src, const uint8_t *unused1,
+const uint8_t *unused2, int width, uint32_t *unused)
+{
+int i;
+const float *src = (const float *)_src;
+uint16_t *dst= (uint16_t *)_dst;
+
+for (i = 0; i < width; ++i){
+dst[i] = FFMIN(FFMAX(lrintf(65535.0f * src[i]), 0), 65535);
+}
+}
+
+static av_always_inline void grayf32ToY16_bswap_c(uint8_t *_dst, const uint8_t *_src, const uint8_t *unused1,
+  const uint8_t *unused2, int width, uint32_t *unused)
+{
+int i;
+const uint32_t *src = (const uint32_t *)_src;
+uint16_t *dst= (uint16_t *)_dst;
+
+for (i = 0; i < width; ++i){
+dst[i] = FFMIN(FFMAX(lrintf(65535.0f * av_int2float(av_bswap32(src[i]))), 0.0f), 65535);
+}
+}
+
 #define rgb9plus_planar_funcs_endian(nbits, endian_name, endian)\
 static void planar_rgb##nbits##endian_name##_to_y(uint8_t *dst, const uint8_t *src[4],  \
   int w, int32_t *rgb2yuv)

Re: [FFmpeg-devel] [PATCH 4/7] Adds gray floating-point pixel formats.

2018-08-11 Thread Sergey Lavrushkin

2018-08-12 0:45 GMT+03:00 Michael Niedermayer :

> On Sat, Aug 11, 2018 at 05:52:32PM +0300, Sergey Lavrushkin wrote:
> > 2018-08-10 20:24 GMT+03:00 Michael Niedermayer :
> >
> > > On Thu, Aug 09, 2018 at 08:15:16PM +0300, Sergey Lavrushkin wrote:
> > > > Here are updated patches with fixes. I updated conversion functions,
> so
> > > > they should
> > > > properly work with format for different endianness.
> > > [...]
> > > > diff --git a/libswscale/input.c b/libswscale/input.c
> > > > index 3fd3a5d81e..0e016d387f 100644
> > > > --- a/libswscale/input.c
> > > > +++ b/libswscale/input.c
> > > > @@ -942,6 +942,30 @@ static av_always_inline void
> > > planar_rgb16_to_uv(uint8_t *_dstU, uint8_t *_dstV,
> > > >  }
> > > >  #undef rdpx
> > > >
> > > > +static av_always_inline void grayf32ToY16_c(uint8_t *_dst, const
> > > uint8_t *_src, const uint8_t *unused1,
> > > > +const uint8_t *unused2,
> int
> > > width, uint32_t *unused)
> > > > +{
> > > > +int i;
> > > > +const float *src = (const float *)_src;
> > > > +uint16_t *dst= (uint16_t *)_dst;
> > > > +
> > > > +for (i = 0; i < width; ++i){
> > > > +dst[i] = lrintf(65535.0f * FFMIN(FFMAX(src[i], 0.0f),
> 1.0f));
> > > > +}
> > > > +}
> > >
> > > is it faster to clip the float before lrintf() than the integer
> afterwards
> > > ?
> > >
> >
> > Clipping integers is faster, switched to it.
> >
> >
> > > [...]
> > > > diff --git a/libswscale/output.c b/libswscale/output.c
> > > > index 0af2fffea4..cd408fb285 100644
> > > > --- a/libswscale/output.c
> > > > +++ b/libswscale/output.c
> > > > @@ -208,6 +208,121 @@ static void yuv2p016cX_c(SwsContext *c, const
> > > int16_t *chrFilter, int chrFilterS
> > > >  }
> > > >  }
> > > >
> > > > +static av_always_inline void
> > > > +yuv2plane1_float_c_template(const int32_t *src, float *dest, int
> dstW)
> > > > +{
> > > > +#if HAVE_BIGENDIAN
> > > > +static const int big_endian = 1;
> > > > +#else
> > > > +static const int big_endian = 0;
> > > > +#endif
> > >
> > > you can use HAVE_BIGENDIAN in place of big_endian
> > > its either 0 or 1 already
> > > or static const int big_endian = HAVE_BIGENDIAN
> > >
> >
> > Ok.
> >
> > Here is updated patch.
>
> >  libswscale/input.c   |   38 +++
> >  libswscale/output.c  |  105
> +++
> >  libswscale/ppc/swscale_altivec.c |1
> >  libswscale/swscale_internal.h|9 ++
> >  libswscale/swscale_unscaled.c|   54 +++
> >  libswscale/utils.c   |   20 +
> >  libswscale/x86/swscale_template.c|3
> >  tests/ref/fate/filter-pixdesc-grayf32be  |1
> >  tests/ref/fate/filter-pixdesc-grayf32le  |1
> >  tests/ref/fate/filter-pixfmts-copy   |2
> >  tests/ref/fate/filter-pixfmts-crop   |2
> >  tests/ref/fate/filter-pixfmts-field  |2
> >  tests/ref/fate/filter-pixfmts-fieldorder |2
> >  tests/ref/fate/filter-pixfmts-hflip  |2
> >  tests/ref/fate/filter-pixfmts-il |2
> >  tests/ref/fate/filter-pixfmts-null   |2
> >  tests/ref/fate/filter-pixfmts-scale  |2
> >  tests/ref/fate/filter-pixfmts-transpose  |2
> >  tests/ref/fate/filter-pixfmts-vflip  |2
> >  19 files changed, 248 insertions(+), 4 deletions(-)
> > db401051d0e42132f7ce76cb78de584951be704b  0005-libswscale-Adds-
> conversions-from-to-float-gray-forma.patch
> > From cf523bcb50537abbf6daf0eb799341d8b706d366 Mon Sep 17 00:00:00 2001
> > From: Sergey Lavrushkin 
> > Date: Fri, 3 Aug 2018 18:06:50 +0300
> > Subject: [PATCH 5/9] libswscale: Adds conversions from/to float gray
> format.
> >
> > ---
> >  libswscale/input.c   |  38 +++
> >  libswscale/output.c  | 105
> +++
> >  libswscale/ppc/swscale_altivec.c |   1 +
> >  libswscale/swscale_internal.h|   9 +++
> >  libswscale/swscale_unscaled.c|  54 +++-
> >  libswscale/utils.c

Re: [FFmpeg-devel] [PATCH 4/7] Adds gray floating-point pixel formats.

2018-08-13 Thread Sergey Lavrushkin

>
> Just use av_clipf instead of FFMIN/FFMAX.


Changed FFMIN/FFMAX to av_clip_uint16 and av_clip_uint8.
From 210e497d76328947fdf424b169728fa728cc18f2 Mon Sep 17 00:00:00 2001
From: Sergey Lavrushkin 
Date: Fri, 3 Aug 2018 18:06:50 +0300
Subject: [PATCH 5/9] libswscale: Adds conversions from/to float gray format.

---
 libswscale/input.c   |  38 +++
 libswscale/output.c  | 105 +++
 libswscale/ppc/swscale_altivec.c |   1 +
 libswscale/swscale_internal.h|   9 +++
 libswscale/swscale_unscaled.c|  54 +++-
 libswscale/utils.c   |  20 +-
 libswscale/x86/swscale_template.c|   3 +-
 tests/ref/fate/filter-pixdesc-grayf32be  |   1 +
 tests/ref/fate/filter-pixdesc-grayf32le  |   1 +
 tests/ref/fate/filter-pixfmts-copy   |   2 +
 tests/ref/fate/filter-pixfmts-crop   |   2 +
 tests/ref/fate/filter-pixfmts-field  |   2 +
 tests/ref/fate/filter-pixfmts-fieldorder |   2 +
 tests/ref/fate/filter-pixfmts-hflip  |   2 +
 tests/ref/fate/filter-pixfmts-il |   2 +
 tests/ref/fate/filter-pixfmts-null   |   2 +
 tests/ref/fate/filter-pixfmts-scale  |   2 +
 tests/ref/fate/filter-pixfmts-transpose  |   2 +
 tests/ref/fate/filter-pixfmts-vflip  |   2 +
 19 files changed, 248 insertions(+), 4 deletions(-)
 create mode 100644 tests/ref/fate/filter-pixdesc-grayf32be
 create mode 100644 tests/ref/fate/filter-pixdesc-grayf32le

diff --git a/libswscale/input.c b/libswscale/input.c
index 3fd3a5d81e..4099c19c2b 100644
--- a/libswscale/input.c
+++ b/libswscale/input.c
@@ -942,6 +942,30 @@ static av_always_inline void planar_rgb16_to_uv(uint8_t *_dstU, uint8_t *_dstV,
 }
 #undef rdpx
 
+static av_always_inline void grayf32ToY16_c(uint8_t *_dst, const uint8_t *_src, const uint8_t *unused1,
+const uint8_t *unused2, int width, uint32_t *unused)
+{
+int i;
+const float *src = (const float *)_src;
+uint16_t *dst= (uint16_t *)_dst;
+
+for (i = 0; i < width; ++i){
+dst[i] = av_clip_uint16(lrintf(65535.0f * src[i]));
+}
+}
+
+static av_always_inline void grayf32ToY16_bswap_c(uint8_t *_dst, const uint8_t *_src, const uint8_t *unused1,
+  const uint8_t *unused2, int width, uint32_t *unused)
+{
+int i;
+const uint32_t *src = (const uint32_t *)_src;
+uint16_t *dst= (uint16_t *)_dst;
+
+for (i = 0; i < width; ++i){
+dst[i] = av_clip_uint16(lrintf(65535.0f * av_int2float(av_bswap32(src[i];
+}
+}
+
 #define rgb9plus_planar_funcs_endian(nbits, endian_name, endian)\
 static void planar_rgb##nbits##endian_name##_to_y(uint8_t *dst, const uint8_t *src[4],  \
   int w, int32_t *rgb2yuv)  \
@@ -1538,6 +1562,20 @@ av_cold void ff_sws_init_input_funcs(SwsContext *c)
 case AV_PIX_FMT_P010BE:
 c->lumToYV12 = p010BEToY_c;
 break;
+case AV_PIX_FMT_GRAYF32LE:
+#if HAVE_BIGENDIAN
+c->lumToYV12 = grayf32ToY16_bswap_c;
+#else
+c->lumToYV12 = grayf32ToY16_c;
+#endif
+break;
+case AV_PIX_FMT_GRAYF32BE:
+#if HAVE_BIGENDIAN
+c->lumToYV12 = grayf32ToY16_c;
+#else
+c->lumToYV12 = grayf32ToY16_bswap_c;
+#endif
+break;
 }
 if (c->needAlpha) {
 if (is16BPS(srcFormat) || isNBPS(srcFormat)) {
diff --git a/libswscale/output.c b/libswscale/output.c
index 0af2fffea4..de8637aa3b 100644
--- a/libswscale/output.c
+++ b/libswscale/output.c
@@ -208,6 +208,105 @@ static void yuv2p016cX_c(SwsContext *c, const int16_t *chrFilter, int chrFilterS
 }
 }
 
+static av_always_inline void
+yuv2plane1_float_c_template(const int32_t *src, float *dest, int dstW)
+{
+static const int big_endian = HAVE_BIGENDIAN;
+static const int shift = 3;
+static const float float_mult = 1.0f / 65535.0f;
+int i, val;
+uint16_t val_uint;
+
+for (i = 0; i < dstW; ++i){
+val = src[i] + (1 << (shift - 1));
+output_pixel(&val_uint, val, 0, uint);
+dest[i] = float_mult * (float)val_uint;
+}
+}
+
+static av_always_inline void
+yuv2plane1_float_bswap_c_template(const int32_t *src, uint32_t *dest, int dstW)
+{
+static const int big_endian = HAVE_BIGENDIAN;
+static const int shift = 3;
+static const float float_mult = 1.0f / 65535.0f;
+int i, val;
+uint16_t val_uint;
+
+for (i = 0; i < dstW; ++i){
+val = src[i] + (1 << (shift - 1));
+output_pixel(&val_uint, val, 0, uint);
+dest[i] = av_bswap32(av_float2int(float_mult * (float)val_uint));
+}
+}
+
+static av_always_inline void
+yuv2planeX_float_c_template(const int16_t *filter, int filterSize, const int32_t **src,
+float *dest, int ds

Re: [FFmpeg-devel] [PATCH 6/7] libavfilter/vf_sr.c: Removes uint8 -> float and float -> uint8 conversions.

2018-08-15 Thread Sergey Lavrushkin

2018-08-15 1:49 GMT+03:00 Marton Balint :

>
> On Tue, 14 Aug 2018, Pedro Arthur wrote:
>
> 2018-08-14 15:45 GMT-03:00 Rostislav Pehlivanov :
>>
>>> On Thu, 2 Aug 2018 at 20:00, Sergey Lavrushkin 
>>> wrote:
>>>
>>> This patch removes conversions, declared inside the sr filter, and uses
>>>> libswscale inside
>>>> the filter to perform them for only Y channel of input. The sr filter
>>>> still has uint
>>>> formats as input, as it does not use chroma channels in models and these
>>>> channels are
>>>> upscaled using libswscale, float formats for input would cause
>>>> unnecessary
>>>> conversions
>>>> during scaling for these channels.
>>>>
>>>>
> [...]
>
> You are planning to remove *all* conversion still, right? Its still
>>> unacceptable that there *are* conversions.
>>>
>>
>> They are here because it is the most efficient way to do it. The
>> filter works only on luminance channel therefore we only apply
>> conversion to Y channel, and bicubic upscale to chrominance.
>> I can't see how one can achieve the same result, without doing useless
>> computations, if not in this way.
>>
>
> Is there a reason why only the luminance channel is scaled this way? Can't
> you also train scaling chroma planes the same way? This way you could
> really eliminate the internal calls to swscale. If the user prefers to
> scale only one channel, he can always split the planes and scale them
> separately (using different filters) and then merge them.
>

If it is possible, I can then change sr filter to work only for Y channel.
Can you give me some examples of how to split the planes, filter them
separately
and merge them back?
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

[FFmpeg-devel] [PATCH 1/2] doc/filters.texi: Adds documentation for sr filter.

2018-08-15 Thread Sergey Lavrushkin

Resending patch with documentation for sr filter.

---
 doc/filters.texi | 60 
 1 file changed, 60 insertions(+)

diff --git a/doc/filters.texi b/doc/filters.texi
index 267bd04a43..b2a74cb1ce 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -15403,6 +15403,66 @@ option may cause flicker since the B-Frames have often 
larger QP. Default is
 @code{0} (not enabled).
 @end table
 
+@section sr
+
+Scale the input by applying one of the super-resolution methods based on
+convolutional neural networks.
+
+Training scripts as well as scripts for model generation are provided in
+the repository @url{https://github.com/HighVoltageRocknRoll/sr.git}.
+
+The filter accepts the following options:
+
+@table @option
+@item model
+Specify which super-resolution model to use. This option accepts the following 
values:
+
+@table @samp
+@item srcnn
+Super-Resolution Convolutional Neural Network model
+@url{https://arxiv.org/abs/1501.00092}.
+
+@item espcn
+Efficient Sub-Pixel Convolutional Neural Network model
+@url{https://arxiv.org/abs/1609.05158}.
+
+@end table
+
+Default value is @samp{srcnn}.
+
+@item dnn_backend
+Specify which DNN backend to use for model loading and execution. This option 
accepts
+the following values:
+
+@table @samp
+@item native
+Native implementation of DNN loading and execution.
+
+@item tensorflow
+TensorFlow backend @url{https://www.tensorflow.org/}. To enable this backend 
you
+need to install the TensorFlow for C library (see
+@url{https://www.tensorflow.org/install/install_c}) and configure FFmpeg with
+@code{--enable-libtensorflow}
+
+@end table
+
+Default value is @samp{native}.
+
+@item scale_factor
+Set scale factor for SRCNN model, for which custom model file was provided.
+Allowed values are @code{2}, @code{3} and @code{4}. Scale factor is necessary
+for SRCNN model, because it accepts input upscaled using bicubic upscaling with
+proper scale factor.
+
+Default value is @code{2}.
+
+@item model_filename
+Set path to model file specifying network architecture and its parameters.
+Note that different backends use different file formats. If path to model
+file is not specified, built-in models for 2x upscaling are used.
+
+@end table
+
 @anchor{subtitles}
 @section subtitles
 
-- 
2.14.1

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH 1/2] doc/filters.texi: Adds documentation for sr filter.

2018-08-15 Thread Sergey Lavrushkin

2018-08-15 19:59 GMT+03:00 Gyan Doshi :

>
>
> On 15-08-2018 10:05 PM, Sergey Lavrushkin wrote:
>
>> Resending patch with documentation for sr filter.
>>
>
> LGTM. Will apply with some small changes.
>
> I've merged the docs entry in the 2nd part, so remove it from there.
>

This entry corresponds to changes made in the second patch.
Without these changes it is not true.
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH 4/7] Adds gray floating-point pixel formats.

2018-08-17 Thread Sergey Lavrushkin

пт, 17 авг. 2018 г., 6:47 James Almer :

> On 8/14/2018 1:23 PM, Michael Niedermayer wrote:
> > On Mon, Aug 13, 2018 at 04:58:42PM +0300, Sergey Lavrushkin wrote:
> >>>
> >>> Just use av_clipf instead of FFMIN/FFMAX.
> >>
> >>
> >> Changed FFMIN/FFMAX to av_clip_uint16 and av_clip_uint8.
> >
> > will apply
> >
> > thanks
>
> This broke fate on some 32bit hosts. Guess float pixfmts shouldn't be
> tested for bitexact output. The gbrpf32 ones aren't, for example.
>
> http://fate.ffmpeg.org/report.cgi?time=20180816131312&slot=x86_32-debian-kfreebsd-gcc-4.4-cpuflags-mmx


If I am not mistaken, gbrpf32 formats are not supported in libswscale and
not tested because of that.

>
> Was a float gray pixfmt needed for this filter? Gray16 was not an option?
>

All calculations in neural network are done using floats.

What can I do to fix this issue? Can I get a VM image for this host to test
it?

>
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH 2/2] libavfilter: Removes stored DNN models. Adds support for native backend model file format in tf backend.

2018-08-17 Thread Sergey Lavrushkin

2018-08-17 17:46 GMT+03:00 Pedro Arthur :

> Hi,
>
> You did not provided any pre trained model files, so anyone trying to
> test it has to perform the whole training!
> I'm attaching the models I generated, if anyone is interested in testing
> it.
>
> When applying the filter with tf backend there are artifacts in the
> borders, for both srcnn and espcn (out_[srcnn|espcn]_tf.jpg).
> It seems that a few lines in the top row of the image are repeated for
> espcn using native backend (out_srcnn_nt.jpg).
>

I guess, it is because I didn't add any padding to the image and tf fills
borders with 0 for 'SAME' padding in convolutions. I'll add required padding
size calculation and insert padding operation to the graph.


> The model/model_filename options are not coherent, the model type
> should be defined in the file anyway therefore there is no need for
> both options.
> It is also buggy, if you specify the model_filename but not the model
> type it will default to srcnn even if the model file is for espcn, no
> error is generated and the output ofc is buggy.
>

I think, I can remove model type and check if model changes input size.
I think all my switches for model type actually depend on this condition.

If I remove conversions inside the filter and make it to work only for
one plane, it basically will become a filter that executes neural network
for one channel input. But there is a problem with float format - it brokes
fate on some 32 bit hosts, as James stated, and I need first to fix this
issue, or, otherwise, revert to doing conversions in the filter.


> I personally would prefer to use only model=file as it is shorter than
> model_filename=file.
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH 4/7] Adds gray floating-point pixel formats.

2018-08-18 Thread Sergey Lavrushkin

2018-08-17 23:28 GMT+03:00 Michael Niedermayer :

> On Fri, Aug 17, 2018 at 12:46:52AM -0300, James Almer wrote:
> > On 8/14/2018 1:23 PM, Michael Niedermayer wrote:
> > > On Mon, Aug 13, 2018 at 04:58:42PM +0300, Sergey Lavrushkin wrote:
> > >>>
> > >>> Just use av_clipf instead of FFMIN/FFMAX.
> > >>
> > >>
> > >> Changed FFMIN/FFMAX to av_clip_uint16 and av_clip_uint8.
> > >
> > > will apply
> > >
> > > thanks
> >
> > This broke fate on some 32bit hosts. Guess float pixfmts shouldn't be
> > tested for bitexact output. The gbrpf32 ones aren't, for example.
> > http://fate.ffmpeg.org/report.cgi?time=20180816131312&slot=
> x86_32-debian-kfreebsd-gcc-4.4-cpuflags-mmx
>
> h
> i remember i had tested this locally on 32bit
> can something be slightly adjusted (like an offset or factor) to avoid any
> values becoming close to 0.5 and rounding differently on platforms ?

If not the tests should skip float pixel formats or try the nearest
> neighbor scaler
>

Can it really be the problem with scaler? Do all these failed test use
scaling?
Is not it the problem, that different platforms can give slightly different
results for
floating-point operations? Does input for the float format is somehow
generated
for these tests, so the input conversion is tested? Maybe it uses output
conversion first?
If it is the problem of different floating-point operations results on
different platforms,
maybe it is possible to use precomputed LUT for output conversion, so it
will give
the same results? Or is it possible to modify tests for the float format,
so it will
check if pixels of the result are just close to some reference.

> Sergey, can you look into this (its your patch) ? (just asking to make sure
> not eevryone thinks someone else will work on this)
>

Yes, I can, just need to know, what is possible to do to fix this issue,
besides skipping the tests.
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH 4/7] Adds gray floating-point pixel formats.

2018-08-20 Thread Sergey Lavrushkin

2018-08-18 23:20 GMT+03:00 Michael Niedermayer :

> On Sat, Aug 18, 2018 at 02:10:21PM +0300, Sergey Lavrushkin wrote:
> > 2018-08-17 23:28 GMT+03:00 Michael Niedermayer :
> >
> > > On Fri, Aug 17, 2018 at 12:46:52AM -0300, James Almer wrote:
> > > > On 8/14/2018 1:23 PM, Michael Niedermayer wrote:
> > > > > On Mon, Aug 13, 2018 at 04:58:42PM +0300, Sergey Lavrushkin wrote:
> > > > >>>
> > > > >>> Just use av_clipf instead of FFMIN/FFMAX.
> > > > >>
> > > > >>
> > > > >> Changed FFMIN/FFMAX to av_clip_uint16 and av_clip_uint8.
> > > > >
> > > > > will apply
> > > > >
> > > > > thanks
> > > >
> > > > This broke fate on some 32bit hosts. Guess float pixfmts shouldn't be
> > > > tested for bitexact output. The gbrpf32 ones aren't, for example.
> > > > http://fate.ffmpeg.org/report.cgi?time=20180816131312&slot=
> > > x86_32-debian-kfreebsd-gcc-4.4-cpuflags-mmx
> > >
> > > h
> > > i remember i had tested this locally on 32bit
> > > can something be slightly adjusted (like an offset or factor) to avoid
> any
> > > values becoming close to 0.5 and rounding differently on platforms ?
> >
> > If not the tests should skip float pixel formats or try the nearest
> > > neighbor scaler
> > >
> >
> > Can it really be the problem with scaler? Do all these failed test use
> > scaling?
> > Is not it the problem, that different platforms can give slightly
> different
> > results for
> > floating-point operations? Does input for the float format is somehow
> > generated
> > for these tests, so the input conversion is tested? Maybe it uses output
> > conversion first?
> > If it is the problem of different floating-point operations results on
> > different platforms,
>
> > maybe it is possible to use precomputed LUT for output conversion, so it
>
> I dont think we should change the "algorithm" to achive "bitexactness"
> we could of course but it feels like the wrong reason to make such a
> change. How its done should be choosen based on what is fast (and to a
> lesser extend clean, simple and maintainable)
>
>
>
> > will give
> > the same results? Or is it possible to modify tests for the float format,
> > so it will
> > check if pixels of the result are just close to some reference.
>
> Its possible to compare to a reference, we do this in some other tests,
> but thats surely more work than just disabling teh specific tests or trying
> to nudge them a little to see if that makes nothing fall too close to n +
> 0.5
>
> >
> >
> > > Sergey, can you look into this (its your patch) ? (just asking to make
> sure
> > > not eevryone thinks someone else will work on this)
> > >
> >
> > Yes, I can, just need to know, what is possible to do to fix this issue,
> > besides skipping the tests.
>
> most things are possible
>

Hi,

I am having trouble reproducing this error. These tests are fine for 32-bit
VMs on
my computers. So the only thing I can do is to disable these tests for
these formats.
Otherwise, I need to test other changes somehow. Here is the patch, that
skips
pixfmts tests for these formats.
From a92e6965f9c328fcaa18460ac9da975748272e0a Mon Sep 17 00:00:00 2001
From: Sergey Lavrushkin 
Date: Mon, 20 Aug 2018 23:14:07 +0300
Subject: [PATCH] tests: Disables pixfmts tests for float gray formats.

---
 tests/fate-run.sh| 4 ++--
 tests/ref/fate/filter-pixfmts-copy   | 2 --
 tests/ref/fate/filter-pixfmts-crop   | 2 --
 tests/ref/fate/filter-pixfmts-field  | 2 --
 tests/ref/fate/filter-pixfmts-fieldorder | 2 --
 tests/ref/fate/filter-pixfmts-hflip  | 2 --
 tests/ref/fate/filter-pixfmts-il | 2 --
 tests/ref/fate/filter-pixfmts-null   | 2 --
 tests/ref/fate/filter-pixfmts-scale  | 2 --
 tests/ref/fate/filter-pixfmts-transpose  | 2 --
 tests/ref/fate/filter-pixfmts-vflip  | 2 --
 11 files changed, 2 insertions(+), 22 deletions(-)

diff --git a/tests/fate-run.sh b/tests/fate-run.sh
index aece90a01d..e8d71707b0 100755
--- a/tests/fate-run.sh
+++ b/tests/fate-run.sh
@@ -288,8 +288,8 @@ pixfmts(){
 in_fmts=${outfile}_in_fmts
 
 # exclude pixel formats which are not supported as input
-$showfiltfmts scale | awk -F '[ \r]' '/^INPUT/{ fmt=substr($3, 5); print fmt }' | sort >$scale_in_fmts
-$showfiltfmts scale | awk -F '[ \r]' '/^OUTPUT/{ fmt=substr($3, 5); print fmt }' | sort >$scale_out_fmts
+$showfiltfmts scale | awk -F

[FFmpeg-devel] [PATCH] avformat/cafenc: fixed packet_size calculation

2024-02-05 Thread sergey radionov

the problem is the very last packet
can be shorter than default packet_size
so it's required to exclude it from
packet_size calculations.
fixes #10465
---
 libavformat/cafenc.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/libavformat/cafenc.c b/libavformat/cafenc.c
index 67be59806c..fcc4838392 100644
--- a/libavformat/cafenc.c
+++ b/libavformat/cafenc.c
@@ -34,6 +34,8 @@ typedef struct {
 int size_buffer_size;
 int size_entries_used;
 int packets;
+int64_t duration;
+int64_t last_packet_duration;
 } CAFContext;
 
 static uint32_t codec_flags(enum AVCodecID codec_id) {
@@ -238,6 +240,8 @@ static int caf_write_packet(AVFormatContext *s, AVPacket 
*pkt)
 pkt_sizes[caf->size_entries_used++] = 128 | top;
 }
 pkt_sizes[caf->size_entries_used++] = pkt->size & 127;
+caf->duration += pkt->duration;
+caf->last_packet_duration = pkt->duration;
 caf->packets++;
 }
 avio_write(s->pb, pkt->data, pkt->size);
@@ -259,7 +263,11 @@ static int caf_write_trailer(AVFormatContext *s)
 if (!par->block_align) {
 int packet_size = samples_per_packet(par);
 if (!packet_size) {
-packet_size = st->duration / (caf->packets - 1);
+if (caf->duration) {
+packet_size = (caf->duration - caf->last_packet_duration) 
/ (caf->packets - 1);
+} else {
+packet_size = st->duration / (caf->packets - 1);
+}
 avio_seek(pb, FRAME_SIZE_OFFSET, SEEK_SET);
 avio_wb32(pb, packet_size);
 }
-- 
2.40.1

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

60 matches

Mail list logo