Re: [FFmpeg-devel] [PATCH 00/21] New Version

2019-03-27 Thread Pedro Arthur
On Wed, Mar 27, 2019 at 09:23, Andreas Rheinhardt via
ffmpeg-devel wrote:
>
> Carl Eugen Hoyos:
> > 2019-03-27 12:18 GMT+01:00, Andreas Rheinhardt:
> >
> >> I have cced Steve for this (I didn't the first time,
> >> because I thought that he (as a maintainer) would
> >> also be a subscriber to this list).
> >
> > Everybody welcomes reviews by Steve but I don't
> > think he maintains anything within FFmpeg.
> >
> He is listed as maintainer for dxva2 and d3d11va in the maintainers
> file. But this could be wrong/outdated.
> >> Oh, and I did not check with Valgrind that the new
> >> lacing code doesn't read uninitialized data. I don't
> >> even know how to use Valgrind. I just read the
> >> code. If someone more knowledgeable than I
> >> could please test it...
> >
> > Just use "valgrind ./ffmpeg_g ..."
> >
> Thanks for the help, but unfortunately I can't use Valgrind on Windows.
If you're using Windows 10, you can test it using the Windows Subsystem for Linux (WSL).

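For reference, a typical invocation would look something like the
following (a sketch only; the file names are placeholders, not from this
thread, and --track-origins=yes makes reports about uninitialized reads
point at the allocation that produced them, which is exactly the lacing
question above):

    valgrind --track-origins=yes --leak-check=full \
        ./ffmpeg_g -i input.mka -c copy output.mka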

>
> - Andreas

Re: [FFmpeg-devel] [PATCH] avcodec: add Amuse Graphics decoder

2019-03-30 Thread Pedro Arthur
On Thu, Mar 28, 2019 at 18:12, Paul B Mahol wrote:
>
> +static int decode_motion_vectors(AVCodecContext *avctx, GetBitContext *gb)
> +{
> +AGMContext *s = avctx->priv_data;
> +int nb_mvs = ((avctx->height + 15) >> 4) * ((avctx->width + 15) >> 4);
> +int ret, skip = 0, value, end;

Is that line intended?

Re: [FFmpeg-devel] [PATCH] avcodec: add Amuse Graphics decoder

2019-03-30 Thread Pedro Arthur
On Sat, Mar 30, 2019 at 12:43, Paul B Mahol wrote:
>
> On 3/30/19, Pedro Arthur  wrote:
> > > On Thu, Mar 28, 2019 at 18:12, Paul B Mahol wrote:
> >>
> >> +static int decode_motion_vectors(AVCodecContext *avctx, GetBitContext
> >> *gb)
> >> +{
> >> +AGMContext *s = avctx->priv_data;
> >> +int nb_mvs = ((avctx->height + 15) >> 4) * ((avctx->width + 15) >>
> >> 4);
> >> +int ret, skip = 0, value, end;
> >
> > Is that line intended?
>
> Yes, what is wrong with it?
Just seemed unusual.

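For context, the expression is the usual round-up division for counting
16x16 macroblocks; a spelled-out equivalent (not from the patch):

    /* (x + 15) >> 4 == ceil(x / 16): the number of 16x16 macroblocks
     * needed to cover x pixels; e.g. 1920x1080 gives 120 * 68 = 8160 MVs */
    int mb_w   = (avctx->width  + 15) >> 4;
    int mb_h   = (avctx->height + 15) >> 4;
    int nb_mvs = mb_w * mb_h;       /* one motion vector per macroblock */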


Re: [FFmpeg-devel] [PATCH] libavfilter: Add derain filter init version--GSoC Qualification Task.

2019-04-09 Thread Pedro Arthur
Hi,

On Tue, Apr 9, 2019 at 04:15,  wrote:
> +@section derain
> +
> +Remove the rain in the input image/video by applying the derain methods 
> based on
> +convolutional neural networks. Supported models:
> +
> +@itemize
> +@item
> +Efficient Sub-Pixel Convolutional Neural Network model (ESPCN).
> +See @url{https://arxiv.org/abs/1609.05158}.
> +@end itemize

As the doc suggests, you're using the ESPCN model for deraining? If
so, it would be more relevant to link to a paper which justifies this
usage, as the current link seems to suggest you're doing
super-resolution.

In case you are the one proposing this usage, it is worth at least
giving some justification. Is it better than the current methods in
any way?

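For context, with the options documented in the patch the filter would be
invoked along these lines (a sketch; the file names are made up):

    ffmpeg -i rainy.mp4 -vf derain=dnn_backend=native:model=derain.model out.mp4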

> +
> +Training scripts as well as scripts for model generation are provided in
> +the repository at @url{https://github.com/XueweiMeng/derain_filter.git}.
> +
> +The filter accepts the following options:
> +
> +@table @option
> +@item dnn_backend
> +Specify which DNN backend to use for model loading and execution. This 
> option accepts
> +the following values:
> +
> +@table @samp
> +@item native
> +Native implementation of DNN loading and execution.
> +
> +@item tensorflow
> +TensorFlow backend. To enable this backend you
> +need to install the TensorFlow for C library (see
> +@url{https://www.tensorflow.org/install/install_c}) and configure FFmpeg with
> +@code{--enable-libtensorflow}
> +@end table
> +
> +Default value is @samp{native}.
> +
> +@item model
> +Set path to model file specifying network architecture and its parameters.
> +Note that different backends use different file formats. TensorFlow backend
> +can load files for both formats, while native backend can load files for only
> +its format.
> +@end table
> +
>  @section deshake
>
>  Attempt to fix small changes in horizontal and/or vertical shift. This
> diff --git a/libavfilter/Makefile b/libavfilter/Makefile
> index fef6ec5c55..7809bac565 100644
> --- a/libavfilter/Makefile
> +++ b/libavfilter/Makefile
> @@ -194,6 +194,7 @@ OBJS-$(CONFIG_DATASCOPE_FILTER)  += 
> vf_datascope.o
>  OBJS-$(CONFIG_DCTDNOIZ_FILTER)   += vf_dctdnoiz.o
>  OBJS-$(CONFIG_DEBAND_FILTER) += vf_deband.o
>  OBJS-$(CONFIG_DEBLOCK_FILTER)+= vf_deblock.o
> +OBJS-$(CONFIG_DERAIN_FILTER) += vf_derain.o
>  OBJS-$(CONFIG_DECIMATE_FILTER)   += vf_decimate.o
>  OBJS-$(CONFIG_DECONVOLVE_FILTER) += vf_convolve.o framesync.o
>  OBJS-$(CONFIG_DEDOT_FILTER)  += vf_dedot.o
> diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
> index c51ae0f3c7..ee2a5b63e6 100644
> --- a/libavfilter/allfilters.c
> +++ b/libavfilter/allfilters.c
> @@ -182,6 +182,7 @@ extern AVFilter ff_vf_datascope;
>  extern AVFilter ff_vf_dctdnoiz;
>  extern AVFilter ff_vf_deband;
>  extern AVFilter ff_vf_deblock;
> +extern AVFilter ff_vf_derain;
>  extern AVFilter ff_vf_decimate;
>  extern AVFilter ff_vf_deconvolve;
>  extern AVFilter ff_vf_dedot;
> diff --git a/libavfilter/vf_derain.c b/libavfilter/vf_derain.c
> new file mode 100644
> index 00..f72ae1cd3a
> --- /dev/null
> +++ b/libavfilter/vf_derain.c
> @@ -0,0 +1,204 @@
> +/*
> + * Copyright (c) 2019 Xuewei Meng
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 
> USA
> + */
> +
> +/**
> + * @file
> + * Filter implementing image derain filter using deep convolutional networks.
> + * https://arxiv.org/abs/1609.05158
> + * 
> http://openaccess.thecvf.com/content_ECCV_2018/html/Xia_Li_Recurrent_Squeeze-and-Excitation_Context_ECCV_2018_paper.html
> + */
> +
> +#include "libavutil/opt.h"
> +#include "libavformat/avio.h"
> +#include "libswscale/swscale.h"
> +#include "avfilter.h"
> +#include "formats.h"
> +#include "internal.h"
> +#include "dnn_interface.h"
> +
> +typedef struct DRContext {
> +const AVClass *class;
> +
> +char  *model_filename;
> +DNNBackendType backend_type;
> +DNNModule *dnn_module;
> +DNNModel  *model;
> +DNNDatainput;
> +DNNDataoutput;
> +} DRContext;
> +
> +#define OFFSET(x) offsetof(DRContext, x)
> +#define FLAGS AV_OPT_FLAG_FILTERING_PARAM | AV_OPT_FLAG_VIDEO_PARAM
> +static const AVOp

Re: [FFmpeg-devel] [PATCH] libavfilter: Add derain filter init version--GSoC Qualification Task.

2019-04-09 Thread Pedro Arthur
On Tue, Apr 9, 2019 at 04:15,  wrote:
>
> +Training scripts as well as scripts for model generation are provided in
> +the repository at @url{https://github.com/XueweiMeng/derain_filter.git}.

This repository is a copy of a previous year's student project [1],
which is MIT licensed, and therefore you should have included the
original author's copyright notice.
IMO the most polite approach would have been to fork his repository
and make the changes you need, keeping the copyright notices.

[1] - https://github.com/HighVoltageRocknRoll/sr

Re: [FFmpeg-devel] [PATCH] libavfilter: Add derain filter init version--GSoC Qualification Task.

2019-04-10 Thread Pedro Arthur
Hi,
On Tue, Apr 9, 2019 at 22:42,  wrote:
>
> Yes, I use the espcn model for deraining as the initial version as it's a 
> easier way to implement the filter, although the paper proposes it for 
> super-resolution. And the model does have some effect on deraining project. 
> While, it is just the first version. I will use more suitable and more 
> powerful model for derain filter according to the latest models proposed in 
> derain task, and I will upload the new model soon.
>

There is no problem in using the ESPCN infrastructure to start
learning, but I still can't see how using the ESPCN model fits your
purpose. The ESPCN model outputs an image bigger than the input; why
would a derain filter do that?

Also, you did not provide any data regarding the results you might
have obtained, nor the model file and an appropriate input/reference
pair so we can test it.
What are the results you have so far (PSNR before/after applying the filter)?
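
(For reference: PSNR here is the usual 10 * log10(MAX^2 / MSE), computed
between the filter output and the rain-free reference frame, with
MAX = 255 for 8-bit images; higher is better.)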

Re: [FFmpeg-devel] [PATCH] libavfilter: Add derain filter init version--GSoC Qualification Task.

2019-04-11 Thread Pedro Arthur
On Thu, Apr 11, 2019 at 02:55,  wrote:
>
> We made some modifications on the original ESPCN model, such as change the 
> input image from one channel(Y) to three channels(RGB) and remove the 
> up-sampling procedure. The model file has been uploaded in 
> https://github.com/XueweiMeng/derain_filter and you can download the 
> training/testing dataset from 
> http://www.icst.pku.edu.cn/struct/Projects/joint_rain_removal.html. I didn't 
> save the PSNR/SSIM score during the training and evaluating process. So the 
> data will be uploaded later.
>

Indeed the model is not ESPCN anymore, as that would imply the use
of the up-sampling layer. I think it is better to label this network
as a generic convolutional network and just describe its layout
(number of layers and layer dimensions). Using the ESPCN name is
misleading.

Please always include your training results and relevant observations
when sending a patch; otherwise it is hard to evaluate your work if we
do not even know what to expect from the output.

Re: [FFmpeg-devel] native mode in FFmpeg DNN module

2019-04-19 Thread Pedro Arthur
Hi,

On Fri, Apr 19, 2019 at 05:41, Guo, Yejun wrote:
>
> Option 2)
> Write c code in FFmpeg to convert tensorflow file format (format 1) directly 
> into memory representation (format 3), and so we controls everything in 
> ffmpeg community. And the conversion can be extended to import more file 
> formats such as torch, darknet, etc. One example is that OpenCV uses this 
> method.
>
> The in memory representation (format 3) can still be current.
>

Option 2 would be ideal, as it does not introduce any dependency for
using the native backend.
Yet I'm not sure how complex implementing the TF model reader would
be; if I remember correctly, the student said it was not trivial at
the time.

Is the TF model file format stable? If not, it will be a maintenance
burden to keep it working whenever TF releases a new version. This
point makes me think that having control over our own file format is good.
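
For reference, the native format mentioned here is a flat little-endian
dump that the backend parses sequentially; a sketch of the reading
pattern, mirroring the loader quoted later in this thread:

    /* every field is a 32-bit little-endian integer read in order */
    conv_params->activation  = (int32_t)avio_rl32(model_file_context);
    conv_params->input_num   = (int32_t)avio_rl32(model_file_context);
    conv_params->output_num  = (int32_t)avio_rl32(model_file_context);
    conv_params->kernel_size = (int32_t)avio_rl32(model_file_context);
    /* ...followed by input_num * output_num * kernel_size^2 weights
     * and output_num biases, stored as floats */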

Re: [FFmpeg-devel] native mode in FFmpeg DNN module

2019-04-26 Thread Pedro Arthur
On Fri, Apr 26, 2019 at 02:41, Guo, Yejun wrote:
>
>
>
> > -Original Message-
> > From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> > Guo, Yejun
> > Sent: Friday, April 19, 2019 11:22 PM
> > To: FFmpeg development discussions and patches 
> > Subject: Re: [FFmpeg-devel] native mode in FFmpeg DNN module
> >
> >
> >
> > > -Original Message-
> > > From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> > > Pedro Arthur
> > > Sent: Friday, April 19, 2019 10:43 PM
> > > To: FFmpeg development discussions and patches
> > 
> > > Subject: Re: [FFmpeg-devel] native mode in FFmpeg DNN module
> > >
> > > Hi,
> > >
> > > On Fri, Apr 19, 2019 at 05:41, Guo, Yejun wrote:
> > > >
> > > > Option 2)
> > > > Write c code in FFmpeg to convert tensorflow file format (format 1) 
> > > > directly
> > > into memory representation (format 3), and so we controls everything in
> > > ffmpeg community. And the conversion can be extended to import more file
> > > formats such as torch, darknet, etc. One example is that OpenCV uses this
> > > method.
> > > >
> > > > The in memory representation (format 3) can still be current.
> > > >
> > >
> > > Option 2 would be ideal, as it does not introduce any dependency for
> > > using the native backend.
> > > Yet I'm not sure  how complex implementing the tf model reader can be,
> > > If I remember correctly the student said it was not trivial at the
> > > time.
> >
> > yes, it is not easy, but I think it is worthy to do it. Here is a reference 
> > example
> > for the complexity, see
> > https://github.com/opencv/opencv/blob/master/modules/dnn/src/tensorflow/
> > tf_importer.cpp.
> >
> > >
> > > Is the tf model file stable? if not it will be a maintenance burden to
> > > keep it working whenever tf releases a new version. This point makes
> > > me think having control over our file format is good.
> >
> > imho, this issue is always there, no matter which method used, unless our
> > format could be exported by tensorflow (it has little possibility).
> >
> > Whenever tf releases a new version with a new file format, we still have to
> > change the python script in phase 1 (convert tf file model to our format) 
> > which
> > is even an external dependency at
> > https://github.com/HighVoltageRocknRoll/sr,
> >
> > As from effort perspective, the current implementation is better since 
> > python
> > script is simpler. But I think we are still worth implementing option 2 as 
> > the
> > ideal technical direction.
>
> I checked a bit more about https://github.com/HighVoltageRocknRoll/sr, it is 
> actually
> not an converter (from tf model to native model), but hard code for given 
> models.
> And the native model is not exactly the same as tf model, it even changes the 
> behavior
> of pad parameter of conv layer.
>
> If community is open to option 2, I'll try it.
>
Option 2 is fine for me.

Re: [FFmpeg-devel] [PATCH] libavfilter: Add more operation supports in FFmpeg dnn native mode.

2019-04-28 Thread Pedro Arthur
On Sun, Apr 28, 2019 at 23:07, Guo, Yejun wrote:
>
>
>
> > -Original Message-
> > From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> > xwm...@pku.edu.cn
> > Sent: Sunday, April 28, 2019 5:27 PM
> > To: ffmpeg development discussions and patches 
> > Subject: [FFmpeg-devel] [PATCH] libavfilter: Add more operation supports in
> > FFmpeg dnn native mode.
> >
> > This patch is for the support of derain filter project in GSoC. It adds 
> > supports for
> > the following operations:
> >
> >
> >
> >
> >  (1) Conv padding method: "SAME" and "VALID"
> >
> >  (2) Dilation
> >
> >  (3) Activation: "NONE" and "LEAKY_RELU"
>
> how about separate this single patch into 3 patches.
>
> >
> >
> >
> >
> > These operations are all needed in derain filter. And if modify the dnn 
> > native
> > mode in FFmpeg, the generation process of Super Resolution model should be
> > changed accordingly, e.g. add padding method parameter (= 0) and dilation
> > parameter (= 1).
>
> you can create a PR at https://github.com/HighVoltageRocknRoll/sr
>
> >
> >
> >
> >
> > In addition, I have a question about the Super Resulotion implementation. 
> > The
> > model training process of SR uses "VALID" method. According to my
> > understanding of "VALID" mode in tensorflow, the size of output image should
> > be smaller than the current design in SR. Because pixels near the boundary 
> > are
> > not processed in "VALID" mode, however, these unprocessed pixels are filled
> > with adjacent pixels in current dnn native mode. I wonder why to do like 
> > this
> > here.
>
> I have the same concern that why the native model is not exactly the same as 
> tf model,
> the pad layer is missed, and the native model also change the behavior of pad 
> parameter of conv layer.
>
> it is only suitable for vf_sr, and not general for other models.
>
I think for training these filters the preferred method is VALID, as it
uses only the data available (without filling the borders) and gives
the best possible result.
However, for inference one usually expects to output an image with the
same size as the original (imagine the case of chained filters where
each one reduces the image by a few pixels; in the end one may have a
useless output).
Therefore it makes perfect sense to use different padding methods for
training and inference.

The clamp_to_edge padding was introduced before the TF backend, thus it
stayed in the native backend even after the introduction of the TF
backend.
Indeed, clamp_to_edge is simpler than the other padding methods and
also gives a slightly better result. If I remember correctly, the
student who implemented the TF backend did not find an equivalent
padding method in TF; that's why it uses different paddings.

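A minimal sketch (assuming stride 1 and a square kernel of size k, names
mine) of how the padding policies above affect the spatial output size:

    static int conv_out_size(int in, int k, int keeps_size)
    {
        /* VALID drops the border and shrinks the output by k - 1;
         * SAME (zero padding) and clamp_to_edge (edge-pixel padding)
         * keep the input size */
        return keeps_size ? in : in - k + 1;
    }

So a chain of n VALID 3x3 convolutions loses 2*n pixels in each
dimension, which is exactly the chained-filters problem described above.
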
> >
> >
> >
> >
> > From 4d92ef21a5acf064122c51f442d0e2f5437b3343 Mon Sep 17 00:00:00
> > 2001
> > From: Xuewei Meng 
> > Date: Sun, 28 Apr 2019 17:21:35 +0800
> > Subject: [PATCH] Add operation supports in dnn_native
> >
> > Signed-off-by: Xuewei Meng 
> > ---
> >  libavfilter/dnn_backend_native.c | 36 +---
> >  libavfilter/dnn_backend_native.h |  6 +-
> >  2 files changed, 29 insertions(+), 13 deletions(-)
> >
> > diff --git a/libavfilter/dnn_backend_native.c 
> > b/libavfilter/dnn_backend_native.c
> > index 70d857f5f2..0e3ef5d64d 100644
> > --- a/libavfilter/dnn_backend_native.c
> > +++ b/libavfilter/dnn_backend_native.c
> > @@ -157,13 +157,15 @@ DNNModel *ff_dnn_load_model_native(const char
> > *model_filename)
> >  ff_dnn_free_model_native(&model);
> >  return NULL;
> >  }
> > +conv_params->dilation =
> > (int32_t)avio_rl32(model_file_context);
> > +conv_params->padding_method =
> > (int32_t)avio_rl32(model_file_context);
> >  conv_params->activation =
> > (int32_t)avio_rl32(model_file_context);
> >  conv_params->input_num =
> > (int32_t)avio_rl32(model_file_context);
> >  conv_params->output_num =
> > (int32_t)avio_rl32(model_file_context);
> >  conv_params->kernel_size =
> > (int32_t)avio_rl32(model_file_context);
> >  kernel_size = conv_params->input_num *
> > conv_params->output_num *
> >conv_params->kernel_size *
> > conv_params->kernel_size;
> > -dnn_size += 16 + (kernel_size + conv_params->output_num <<
> > 2);
> > +dnn_size += 24 + (kernel_size + conv_params->output_num <<
> > 2);
> >  if (dnn_size > file_size || conv_params->input_num <= 0 ||
> >  conv_params->output_num <= 0 ||
> > conv_params->kernel_size <= 0){
> >  avio_closep(&model_file_context);
> > @@ -221,23 +223,28 @@ DNNModel *ff_dnn_load_model_native(const char
> > *model_filename)
> >
> >  static void convolve(const float *input, float *output, const
> > ConvolutionalParams *conv_params, int width, int height)
> >  {
> > -int y, x, n_filter, ch, kernel_y, kernel_x;
> > 

Re: [FFmpeg-devel] [PATCH] libavfilter: Add more operation supports in FFmpeg dnn native mode.

2019-04-29 Thread Pedro Arthur
On Mon, Apr 29, 2019 at 00:06,  wrote:
>
>
>
>
> > -----Original Message-----
> > From: "Pedro Arthur" 
> > Sent: 2019-04-29 10:42:42 (Monday)
> > To: "FFmpeg development discussions and patches" 
> > Cc:
> > Subject: Re: [FFmpeg-devel] [PATCH] libavfilter: Add more operation supports in 
> > FFmpeg dnn native mode.
> >
> > I think for training these filters the preferred method is VALID as it
> > uses only the data available (without filling the borders) and gives
> > the best possible result.
> > However for inference usually one expects to output an image with the
> > same size of the original (imagine the case of chained filters where
> > each one reduces the image by a few pixels, in the end one may have a
> > useless output).
> > Therefore it makes perfect sense to use different padding methods for
> > training/inference.
> >
> > The clamp_to_edge padding was introduced before the TF backend thus it
> > stayed in the native backend even after the introduction of the TF
> > backend.
> > Indeed the clamp_to_edge is simpler than the other padding methods and
> > also gives a slight better result, If I remember correct the student
> > which implemented the TF backend did not find an equivalent padding
> > method in TF, thats why it uses different paddings.
> >
> Yes, I think clamp_to_edge is a good method to keep the output with the same 
> size as input. However, I don't think "VALID" is the best method giving best 
> possible result. So, for "VALID" mode, maybe we can use the clamp_to_edge 
> method in the current dnn native mode? And then, we should also add "SAME" 
> option to support other filters.
>

I think it is best not to make assumptions like VALID =>
clamp_to_edge, but you can keep it for now.
Ideally the model should have a padding layer which the backend
properly implements. Currently the TF backend, when reading a native
model, adds this padding layer implicitly, so it would be a matter of
changing it to have an explicit padding layer in the model.

Maybe you can assume VALID => clamp_to_edge for now, so you can add
what you need without changing the SR code, and later implement the
explicit padding support and send a PR to the original repo
(https://github.com/HighVoltageRocknRoll/sr) properly modifying the
model.
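
A minimal sketch (names assumed, not the patch's code) of what
clamp_to_edge means for a single convolution tap at (y + dy, x + dx):

    /* out-of-range coordinates are clamped to the nearest edge pixel
     * instead of being zero-filled (SAME) or dropped (VALID) */
    int cy = av_clip(y + dy, 0, height - 1);
    int cx = av_clip(x + dx, 0, width  - 1);
    float sample = input[cy * width + cx];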

Re: [FFmpeg-devel] [PATCH V2 1/7] libavfilter/dnn_backend_tf.c: set layer_add_res for input layer

2019-04-29 Thread Pedro Arthur
On Wed, Apr 24, 2019 at 23:14, Guo, Yejun wrote:
>
> otherwise, the following check will return error if layer_add_res
> is randomly initialized.
>
> Signed-off-by: Guo, Yejun 
> ---
>  libavfilter/dnn_backend_tf.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/libavfilter/dnn_backend_tf.c b/libavfilter/dnn_backend_tf.c
> index 5bc7f06..9e0c127 100644
> --- a/libavfilter/dnn_backend_tf.c
> +++ b/libavfilter/dnn_backend_tf.c
> @@ -440,6 +440,7 @@ static DNNReturnType load_native_model(TFModel *tf_model, 
> const char *model_file
>  for (layer = 0; layer < conv_network->layers_num; ++layer){
>  switch (conv_network->layers[layer].type){
>  case INPUT:
> +layer_add_res = DNN_SUCCESS;
>  break;
>  case CONV:
>  layer_add_res = add_conv_layer(tf_model, transpose_op, &op,
> --
> 2.7.4

LGTM.


Re: [FFmpeg-devel] [PATCH V2 2/7] libavfilter/vf_sr: refine code to remove keyword 'else'

2019-04-29 Thread Pedro Arthur
On Wed, Apr 24, 2019 at 23:14, Guo, Yejun wrote:
>
> remove 'else' since there is always 'return' in 'if' scope,
> so the code will be clean for later maintenance
>
> Signed-off-by: Guo, Yejun 
> ---
>  libavfilter/vf_sr.c | 143 
> ++--
>  1 file changed, 71 insertions(+), 72 deletions(-)
>
> diff --git a/libavfilter/vf_sr.c b/libavfilter/vf_sr.c
> index 6423d2e..9bb0fc5 100644
> --- a/libavfilter/vf_sr.c
> +++ b/libavfilter/vf_sr.c
> @@ -127,88 +127,87 @@ static int config_props(AVFilterLink *inlink)
>  av_log(context, AV_LOG_ERROR, "could not set input and output for 
> the model\n");
>  return AVERROR(EIO);
>  }
> -else{
> -if (sr_context->input.height != sr_context->output.height || 
> sr_context->input.width != sr_context->output.width){
> -sr_context->input.width = inlink->w;
> -sr_context->input.height = inlink->h;
> -result = 
> (sr_context->model->set_input_output)(sr_context->model->model, 
> &sr_context->input, &sr_context->output);
> -if (result != DNN_SUCCESS){
> -av_log(context, AV_LOG_ERROR, "could not set input and 
> output for the model\n");
> -return AVERROR(EIO);
> -}
> -sr_context->scale_factor = 0;
> +
> +if (sr_context->input.height != sr_context->output.height || 
> sr_context->input.width != sr_context->output.width){
> +sr_context->input.width = inlink->w;
> +sr_context->input.height = inlink->h;
> +result = 
> (sr_context->model->set_input_output)(sr_context->model->model, 
> &sr_context->input, &sr_context->output);
> +if (result != DNN_SUCCESS){
> +av_log(context, AV_LOG_ERROR, "could not set input and output 
> for the model\n");
> +return AVERROR(EIO);
>  }
> -outlink->h = sr_context->output.height;
> -outlink->w = sr_context->output.width;
> -sr_context->sws_contexts[1] = 
> sws_getContext(sr_context->input.width, sr_context->input.height, 
> AV_PIX_FMT_GRAY8,
> - 
> sr_context->input.width, sr_context->input.height, AV_PIX_FMT_GRAYF32,
> - 0, NULL, NULL, NULL);
> -sr_context->sws_input_linesize = sr_context->input.width << 2;
> -sr_context->sws_contexts[2] = 
> sws_getContext(sr_context->output.width, sr_context->output.height, 
> AV_PIX_FMT_GRAYF32,
> - 
> sr_context->output.width, sr_context->output.height, AV_PIX_FMT_GRAY8,
> - 0, NULL, NULL, NULL);
> -sr_context->sws_output_linesize = sr_context->output.width << 2;
> -if (!sr_context->sws_contexts[1] || !sr_context->sws_contexts[2]){
> -av_log(context, AV_LOG_ERROR, "could not create SwsContext for 
> conversions\n");
> +sr_context->scale_factor = 0;
> +}
> +outlink->h = sr_context->output.height;
> +outlink->w = sr_context->output.width;
> +sr_context->sws_contexts[1] = sws_getContext(sr_context->input.width, 
> sr_context->input.height, AV_PIX_FMT_GRAY8,
> + sr_context->input.width, 
> sr_context->input.height, AV_PIX_FMT_GRAYF32,
> + 0, NULL, NULL, NULL);
> +sr_context->sws_input_linesize = sr_context->input.width << 2;
> +sr_context->sws_contexts[2] = sws_getContext(sr_context->output.width, 
> sr_context->output.height, AV_PIX_FMT_GRAYF32,
> + sr_context->output.width, 
> sr_context->output.height, AV_PIX_FMT_GRAY8,
> + 0, NULL, NULL, NULL);
> +sr_context->sws_output_linesize = sr_context->output.width << 2;
> +if (!sr_context->sws_contexts[1] || !sr_context->sws_contexts[2]){
> +av_log(context, AV_LOG_ERROR, "could not create SwsContext for 
> conversions\n");
> +return AVERROR(ENOMEM);
> +}
> +if (sr_context->scale_factor){
> +sr_context->sws_contexts[0] = sws_getContext(inlink->w, inlink->h, 
> inlink->format,
> + outlink->w, outlink->h, 
> outlink->format,
> + SWS_BICUBIC, NULL, 
> NULL, NULL);
> +if (!sr_context->sws_contexts[0]){
> +av_log(context, AV_LOG_ERROR, "could not create SwsContext for 
> scaling\n");
>  return AVERROR(ENOMEM);
>  }
> -if (sr_context->scale_factor){
> -sr_context->sws_contexts[0] = sws_getContext(inlink->w, 
> inlink->h, inlink->format,
> - outlink->w, 
> outlink->h, outlink->format,
> +sr_context->sws_slice_h = inlink->h;
> +}
> +else{
> +if (inlink->format 

Re: [FFmpeg-devel] [PATCH V2 3/7] libavfilter/dnn: remove limit for the name of DNN model input/output

2019-04-29 Thread Pedro Arthur
On Wed, Apr 24, 2019 at 23:14, Guo, Yejun wrote:
>
> remove the requirment that the name of DNN model input/output
> should be "x"/"y",
>
> Signed-off-by: Guo, Yejun 
> ---
>  libavfilter/dnn_backend_native.c |  2 +-
>  libavfilter/dnn_backend_tf.c | 10 +-
>  libavfilter/dnn_interface.h  |  2 +-
>  libavfilter/vf_sr.c  |  4 ++--
>  4 files changed, 9 insertions(+), 9 deletions(-)
>
> diff --git a/libavfilter/dnn_backend_native.c 
> b/libavfilter/dnn_backend_native.c
> index 70d857f..fe43116 100644
> --- a/libavfilter/dnn_backend_native.c
> +++ b/libavfilter/dnn_backend_native.c
> @@ -25,7 +25,7 @@
>
>  #include "dnn_backend_native.h"
>
> -static DNNReturnType set_input_output_native(void *model, DNNData *input, 
> DNNData *output)
> +static DNNReturnType set_input_output_native(void *model, DNNData *input, 
> const char *input_name, DNNData *output, const char *output_name)
>  {
>  ConvolutionalNetwork *network = (ConvolutionalNetwork *)model;
>  InputParams *input_params;
> diff --git a/libavfilter/dnn_backend_tf.c b/libavfilter/dnn_backend_tf.c
> index 9e0c127..a838907 100644
> --- a/libavfilter/dnn_backend_tf.c
> +++ b/libavfilter/dnn_backend_tf.c
> @@ -76,7 +76,7 @@ static TF_Buffer *read_graph(const char *model_filename)
>  return graph_buf;
>  }
>
> -static DNNReturnType set_input_output_tf(void *model, DNNData *input, 
> DNNData *output)
> +static DNNReturnType set_input_output_tf(void *model, DNNData *input, const 
> char *input_name, DNNData *output, const char *output_name)
>  {
>  TFModel *tf_model = (TFModel *)model;
>  int64_t input_dims[] = {1, input->height, input->width, input->channels};
> @@ -84,8 +84,8 @@ static DNNReturnType set_input_output_tf(void *model, 
> DNNData *input, DNNData *o
>  const TF_Operation *init_op = TF_GraphOperationByName(tf_model->graph, 
> "init");
>  TF_Tensor *output_tensor;
>
> -// Input operation should be named 'x'
> -tf_model->input.oper = TF_GraphOperationByName(tf_model->graph, "x");
> +// Input operation
> +tf_model->input.oper = TF_GraphOperationByName(tf_model->graph, 
> input_name);
>  if (!tf_model->input.oper){
>  return DNN_ERROR;
>  }
> @@ -100,8 +100,8 @@ static DNNReturnType set_input_output_tf(void *model, 
> DNNData *input, DNNData *o
>  }
>  input->data = (float *)TF_TensorData(tf_model->input_tensor);
>
> -// Output operation should be named 'y'
> -tf_model->output.oper = TF_GraphOperationByName(tf_model->graph, "y");
> +// Output operation
> +tf_model->output.oper = TF_GraphOperationByName(tf_model->graph, 
> output_name);
>  if (!tf_model->output.oper){
>  return DNN_ERROR;
>  }
> diff --git a/libavfilter/dnn_interface.h b/libavfilter/dnn_interface.h
> index e367343..0390e39 100644
> --- a/libavfilter/dnn_interface.h
> +++ b/libavfilter/dnn_interface.h
> @@ -40,7 +40,7 @@ typedef struct DNNModel{
>  void *model;
>  // Sets model input and output, while allocating additional memory for 
> intermediate calculations.
>  // Should be called at least once before model execution.
> -DNNReturnType (*set_input_output)(void *model, DNNData *input, DNNData 
> *output);
> +DNNReturnType (*set_input_output)(void *model, DNNData *input, const 
> char *input_name, DNNData *output, const char *output_name);
>  } DNNModel;
>
>  // Stores pointers to functions for loading, executing, freeing DNN models 
> for one of the backends.
> diff --git a/libavfilter/vf_sr.c b/libavfilter/vf_sr.c
> index 9bb0fc5..085ac19 100644
> --- a/libavfilter/vf_sr.c
> +++ b/libavfilter/vf_sr.c
> @@ -122,7 +122,7 @@ static int config_props(AVFilterLink *inlink)
>  sr_context->input.height = inlink->h * sr_context->scale_factor;
>  sr_context->input.channels = 1;
>
> -result = (sr_context->model->set_input_output)(sr_context->model->model, 
> &sr_context->input, &sr_context->output);
> +result = (sr_context->model->set_input_output)(sr_context->model->model, 
> &sr_context->input, "x", &sr_context->output, "y");
>  if (result != DNN_SUCCESS){
>  av_log(context, AV_LOG_ERROR, "could not set input and output for 
> the model\n");
>  return AVERROR(EIO);
> @@ -131,7 +131,7 @@ static int config_props(AVFilterLink *inlink)
>  if (sr_context->input.height != sr_context->output.height || 
> sr_context->input.width != sr_context->output.width){
>  sr_context->input.width = inlink->w;
>  sr_context->input.height = inlink->h;
> -result = 
> (sr_context->model->set_input_output)(sr_context->model->model, 
> &sr_context->input, &sr_context->output);
> +result = 
> (sr_context->model->set_input_output)(sr_context->model->model, 
> &sr_context->input, "x", &sr_context->output, "y");
>  if (result != DNN_SUCCESS){
>  av_log(context, AV_LOG_ERROR, "could not set input and output 
> for the model\n");
>  return AVERROR(EIO);
> --
> 2.7.4
>


Re: [FFmpeg-devel] [PATCH V2 5/7] libavfilter/dnn: avoid memcpy for tensorflow dnn output

2019-04-29 Thread Pedro Arthur
On Wed, Apr 24, 2019 at 23:14, Guo, Yejun wrote:
>
> use TF_Tensor's cpu address to avoid extra memcpy.
>
> Signed-off-by: Guo, Yejun 
> ---
>  libavfilter/dnn_backend_tf.c | 36 
>  libavfilter/vf_sr.c  |  3 ---
>  2 files changed, 12 insertions(+), 27 deletions(-)
>
> diff --git a/libavfilter/dnn_backend_tf.c b/libavfilter/dnn_backend_tf.c
> index 7bee45c..be8401e 100644
> --- a/libavfilter/dnn_backend_tf.c
> +++ b/libavfilter/dnn_backend_tf.c
> @@ -35,6 +35,7 @@ typedef struct TFModel{
>  TF_Status *status;
>  TF_Output input, output;
>  TF_Tensor *input_tensor;
> +TF_Tensor *output_tensor;
>  } TFModel;
>
>  static void free_buffer(void *data, size_t length)
> @@ -460,13 +461,11 @@ DNNModel *ff_dnn_load_model_tf(const char 
> *model_filename)
>  return NULL;
>  }
>
> -tf_model = av_malloc(sizeof(TFModel));
> +tf_model = av_mallocz(sizeof(TFModel));
>  if (!tf_model){
>  av_freep(&model);
>  return NULL;
>  }
> -tf_model->session = NULL;
> -tf_model->input_tensor = NULL;
>
>  if (load_tf_model(tf_model, model_filename) != DNN_SUCCESS){
>  if (load_native_model(tf_model, model_filename) != DNN_SUCCESS){
> @@ -488,36 +487,22 @@ DNNModel *ff_dnn_load_model_tf(const char 
> *model_filename)
>  DNNReturnType ff_dnn_execute_model_tf(const DNNModel *model, DNNData *output)
>  {
>  TFModel *tf_model = (TFModel *)model->model;
> -TF_Tensor *output_tensor;
> -uint64_t count;
> -uint64_t old_count = output->height * output->width * output->channels * 
> sizeof(float);
> +if (tf_model->output_tensor)
> +TF_DeleteTensor(tf_model->output_tensor);
>
>  TF_SessionRun(tf_model->session, NULL,
>&tf_model->input, &tf_model->input_tensor, 1,
> -  &tf_model->output, &output_tensor, 1,
> +  &tf_model->output, &tf_model->output_tensor, 1,
>NULL, 0, NULL, tf_model->status);
>
>  if (TF_GetCode(tf_model->status) != TF_OK){
>  return DNN_ERROR;
>  }
>
> -output->height = TF_Dim(output_tensor, 1);
> -output->width = TF_Dim(output_tensor, 2);
> -output->channels = TF_Dim(output_tensor, 3);
> -count = output->height * output->width * output->channels * 
> sizeof(float);
> -if (output->data) {
> -if (count > old_count) {
> -av_freep(&output->data);
> -}
> -}
> -if (!output->data) {
> -output->data = av_malloc(count);
> -if (!output->data){
> -return DNN_ERROR;
> -}
> -}
> -memcpy(output->data, TF_TensorData(output_tensor), count);
> -TF_DeleteTensor(output_tensor);
> +output->height = TF_Dim(tf_model->output_tensor, 1);
> +output->width = TF_Dim(tf_model->output_tensor, 2);
> +output->channels = TF_Dim(tf_model->output_tensor, 3);
> +output->data = TF_TensorData(tf_model->output_tensor);
>
>  return DNN_SUCCESS;
>  }
> @@ -541,6 +526,9 @@ void ff_dnn_free_model_tf(DNNModel **model)
>  if (tf_model->input_tensor){
>  TF_DeleteTensor(tf_model->input_tensor);
>  }
> +if (tf_model->output_tensor){
> +TF_DeleteTensor(tf_model->output_tensor);
> +}
>  av_freep(&tf_model);
>  av_freep(model);
>  }
> diff --git a/libavfilter/vf_sr.c b/libavfilter/vf_sr.c
> index 7c92730..53bd8ea 100644
> --- a/libavfilter/vf_sr.c
> +++ b/libavfilter/vf_sr.c
> @@ -277,9 +277,6 @@ static av_cold void uninit(AVFilterContext *context)
>  int i;
>  SRContext *sr_context = context->priv;
>
> -if (sr_context->backend_type == DNN_TF)
> -av_freep(&sr_context->output.data);
> -
>  if (sr_context->dnn_module){
>  (sr_context->dnn_module->free_model)(&sr_context->model);
>  av_freep(&sr_context->dnn_module);
> --
> 2.7.4
>

LGTM.


Re: [FFmpeg-devel] [PATCH V2 4/7] libavfilter/dnn: determine dnn output during execute_model instead of set_input_output

2019-04-29 Thread Pedro Arthur
On Wed, Apr 24, 2019 at 23:14, Guo, Yejun wrote:
>
> Currently, within interface set_input_output, the dims/memory of the 
> tensorflow
> dnn model output is determined by executing the model with zero input,
> actually, the output dims might vary with different input data for networks
> such as object detection models faster-rcnn, ssd and yolo.
>
> This patch moves the logic from set_input_output to execute_model which
> is suitable for all the cases. Since interface changed, and so 
> dnn_backend_native
> also changes.
>
> In vf_sr.c, it knows it's srcnn or espcn by executing the model with zero 
> input,
> so execute_model has to be called in function config_props
>
> Signed-off-by: Guo, Yejun 
> ---
>  libavfilter/dnn_backend_native.c | 14 +-
>  libavfilter/dnn_backend_native.h |  2 +-
>  libavfilter/dnn_backend_tf.c | 56 
> 
>  libavfilter/dnn_backend_tf.h |  2 +-
>  libavfilter/dnn_interface.h  |  6 ++---
>  libavfilter/vf_sr.c  | 20 +++---
>  6 files changed, 51 insertions(+), 49 deletions(-)
>
> diff --git a/libavfilter/dnn_backend_native.c 
> b/libavfilter/dnn_backend_native.c
> index fe43116..18735c0 100644
> --- a/libavfilter/dnn_backend_native.c
> +++ b/libavfilter/dnn_backend_native.c
> @@ -25,7 +25,7 @@
>
>  #include "dnn_backend_native.h"
>
> -static DNNReturnType set_input_output_native(void *model, DNNData *input, 
> const char *input_name, DNNData *output, const char *output_name)
> +static DNNReturnType set_input_output_native(void *model, DNNData *input, 
> const char *input_name, const char *output_name)
>  {
>  ConvolutionalNetwork *network = (ConvolutionalNetwork *)model;
>  InputParams *input_params;
> @@ -81,11 +81,6 @@ static DNNReturnType set_input_output_native(void *model, 
> DNNData *input, const
>  }
>  }
>
> -output->data = network->layers[network->layers_num - 1].output;
> -output->height = cur_height;
> -output->width = cur_width;
> -output->channels = cur_channels;
> -
>  return DNN_SUCCESS;
>  }
>
> @@ -280,7 +275,7 @@ static void depth_to_space(const float *input, float 
> *output, int block_size, in
>  }
>  }
>
> -DNNReturnType ff_dnn_execute_model_native(const DNNModel *model)
> +DNNReturnType ff_dnn_execute_model_native(const DNNModel *model, DNNData 
> *output)
>  {
>  ConvolutionalNetwork *network = (ConvolutionalNetwork *)model->model;
>  int cur_width, cur_height, cur_channels;
> @@ -322,6 +317,11 @@ DNNReturnType ff_dnn_execute_model_native(const DNNModel 
> *model)
>  }
>  }
>
> +output->data = network->layers[network->layers_num - 1].output;
> +output->height = cur_height;
> +output->width = cur_width;
> +output->channels = cur_channels;
> +
>  return DNN_SUCCESS;
>  }
>
> diff --git a/libavfilter/dnn_backend_native.h 
> b/libavfilter/dnn_backend_native.h
> index 51d4cac..adaf4a7 100644
> --- a/libavfilter/dnn_backend_native.h
> +++ b/libavfilter/dnn_backend_native.h
> @@ -63,7 +63,7 @@ typedef struct ConvolutionalNetwork{
>
>  DNNModel *ff_dnn_load_model_native(const char *model_filename);
>
> -DNNReturnType ff_dnn_execute_model_native(const DNNModel *model);
> +DNNReturnType ff_dnn_execute_model_native(const DNNModel *model, DNNData 
> *output);
>
>  void ff_dnn_free_model_native(DNNModel **model);
>
> diff --git a/libavfilter/dnn_backend_tf.c b/libavfilter/dnn_backend_tf.c
> index a838907..7bee45c 100644
> --- a/libavfilter/dnn_backend_tf.c
> +++ b/libavfilter/dnn_backend_tf.c
> @@ -35,7 +35,6 @@ typedef struct TFModel{
>  TF_Status *status;
>  TF_Output input, output;
>  TF_Tensor *input_tensor;
> -DNNData *output_data;
>  } TFModel;
>
>  static void free_buffer(void *data, size_t length)
> @@ -76,13 +75,12 @@ static TF_Buffer *read_graph(const char *model_filename)
>  return graph_buf;
>  }
>
> -static DNNReturnType set_input_output_tf(void *model, DNNData *input, const 
> char *input_name, DNNData *output, const char *output_name)
> +static DNNReturnType set_input_output_tf(void *model, DNNData *input, const 
> char *input_name, const char *output_name)
>  {
>  TFModel *tf_model = (TFModel *)model;
>  int64_t input_dims[] = {1, input->height, input->width, input->channels};
>  TF_SessionOptions *sess_opts;
>  const TF_Operation *init_op = TF_GraphOperationByName(tf_model->graph, 
> "init");
> -TF_Tensor *output_tensor;
>
>  // Input operation
>  tf_model->input.oper = TF_GraphOperationByName(tf_model->graph, 
> input_name);
> @@ -132,26 +130,6 @@ static DNNReturnType set_input_output_tf(void *model, 
> DNNData *input, const char
>  }
>  }
>
> -// Execute network to get output height, width and number of channels
> -TF_SessionRun(tf_model->session, NULL,
> -  &tf_model->input, &tf_model->input_tensor, 1,
> -  &tf_model->output, &output_tensor, 1,
> -  NULL, 0, NULL, tf_mo

Re: [FFmpeg-devel] [PATCH V2 6/7] libavfilter/dnn: support multiple outputs for tensorflow model

2019-04-29 Thread Pedro Arthur
On Wed, Apr 24, 2019 at 23:14, Guo, Yejun wrote:
>
> some models such as ssd, yolo have more than one output.
>
> the clean up code in this patch is a little complex, it is because
> that set_input_output_tf could be called for many times together
> with ff_dnn_execute_model_tf, we have to clean resources for the
> case that the two interfaces are called interleaved.
>
> Signed-off-by: Guo, Yejun 
> ---
>  libavfilter/dnn_backend_native.c | 15 +---
>  libavfilter/dnn_backend_native.h |  2 +-
>  libavfilter/dnn_backend_tf.c | 80 
> 
>  libavfilter/dnn_backend_tf.h |  2 +-
>  libavfilter/dnn_interface.h  |  6 ++-
>  libavfilter/vf_sr.c  | 11 +++---
>  6 files changed, 85 insertions(+), 31 deletions(-)
>
> diff --git a/libavfilter/dnn_backend_native.c 
> b/libavfilter/dnn_backend_native.c
> index 18735c0..8a83c63 100644
> --- a/libavfilter/dnn_backend_native.c
> +++ b/libavfilter/dnn_backend_native.c
> @@ -25,7 +25,7 @@
>
>  #include "dnn_backend_native.h"
>
> -static DNNReturnType set_input_output_native(void *model, DNNData *input, 
> const char *input_name, const char *output_name)
> +static DNNReturnType set_input_output_native(void *model, DNNData *input, 
> const char *input_name, const char **output_names, uint32_t nb_output)
>  {
>  ConvolutionalNetwork *network = (ConvolutionalNetwork *)model;
>  InputParams *input_params;
> @@ -275,7 +275,7 @@ static void depth_to_space(const float *input, float 
> *output, int block_size, in
>  }
>  }
>
> -DNNReturnType ff_dnn_execute_model_native(const DNNModel *model, DNNData 
> *output)
> +DNNReturnType ff_dnn_execute_model_native(const DNNModel *model, DNNData 
> *outputs, uint32_t nb_output)
>  {
>  ConvolutionalNetwork *network = (ConvolutionalNetwork *)model->model;
>  int cur_width, cur_height, cur_channels;
> @@ -317,10 +317,13 @@ DNNReturnType ff_dnn_execute_model_native(const 
> DNNModel *model, DNNData *output
>  }
>  }
>
> -output->data = network->layers[network->layers_num - 1].output;
> -output->height = cur_height;
> -output->width = cur_width;
> -output->channels = cur_channels;
> +// native mode does not support multiple outputs yet
> +if (nb_output > 1)
> +return DNN_ERROR;
> +outputs[0].data = network->layers[network->layers_num - 1].output;
> +outputs[0].height = cur_height;
> +outputs[0].width = cur_width;
> +outputs[0].channels = cur_channels;
>
>  return DNN_SUCCESS;
>  }
> diff --git a/libavfilter/dnn_backend_native.h 
> b/libavfilter/dnn_backend_native.h
> index adaf4a7..e13a68a 100644
> --- a/libavfilter/dnn_backend_native.h
> +++ b/libavfilter/dnn_backend_native.h
> @@ -63,7 +63,7 @@ typedef struct ConvolutionalNetwork{
>
>  DNNModel *ff_dnn_load_model_native(const char *model_filename);
>
> -DNNReturnType ff_dnn_execute_model_native(const DNNModel *model, DNNData 
> *output);
> +DNNReturnType ff_dnn_execute_model_native(const DNNModel *model, DNNData 
> *outputs, uint32_t nb_output);
>
>  void ff_dnn_free_model_native(DNNModel **model);
>
> diff --git a/libavfilter/dnn_backend_tf.c b/libavfilter/dnn_backend_tf.c
> index be8401e..ca6472d 100644
> --- a/libavfilter/dnn_backend_tf.c
> +++ b/libavfilter/dnn_backend_tf.c
> @@ -26,6 +26,7 @@
>  #include "dnn_backend_tf.h"
>  #include "dnn_backend_native.h"
>  #include "libavformat/avio.h"
> +#include "libavutil/avassert.h"
>
>  #include 
>
> @@ -33,9 +34,11 @@ typedef struct TFModel{
>  TF_Graph *graph;
>  TF_Session *session;
>  TF_Status *status;
> -TF_Output input, output;
> +TF_Output input;
>  TF_Tensor *input_tensor;
> -TF_Tensor *output_tensor;
> +TF_Output *outputs;
> +TF_Tensor **output_tensors;
> +uint32_t nb_output;
>  } TFModel;
>
>  static void free_buffer(void *data, size_t length)
> @@ -76,7 +79,7 @@ static TF_Buffer *read_graph(const char *model_filename)
>  return graph_buf;
>  }
>
> -static DNNReturnType set_input_output_tf(void *model, DNNData *input, const 
> char *input_name, const char *output_name)
> +static DNNReturnType set_input_output_tf(void *model, DNNData *input, const 
> char *input_name, const char **output_names, uint32_t nb_output)
>  {
>  TFModel *tf_model = (TFModel *)model;
>  int64_t input_dims[] = {1, input->height, input->width, input->channels};
> @@ -100,11 +103,38 @@ static DNNReturnType set_input_output_tf(void *model, 
> DNNData *input, const char
>  input->data = (float *)TF_TensorData(tf_model->input_tensor);
>
>  // Output operation
> -tf_model->output.oper = TF_GraphOperationByName(tf_model->graph, 
> output_name);
> -if (!tf_model->output.oper){
> +if (nb_output == 0)
> +return DNN_ERROR;
> +
> +av_freep(&tf_model->outputs);
> +tf_model->outputs = av_malloc_array(nb_output, 
> sizeof(*tf_model->outputs));
> +if (!tf_model->outputs)
> +return DNN_ERROR;
> +for (int i = 0; i < nb_outpu

Re: [FFmpeg-devel] [PATCH V2 7/7] libavfilter/dnn: add more data type support for dnn model input

2019-04-29 Thread Pedro Arthur
On Wed, Apr 24, 2019 at 23:15, Guo, Yejun wrote:
>
> currently, only float is supported as model input, actually, there
> are other data types, this patch adds uint8.
>
> Signed-off-by: Guo, Yejun 
> ---
>  libavfilter/dnn_backend_native.c |  4 +++-
>  libavfilter/dnn_backend_tf.c | 28 
>  libavfilter/dnn_interface.h  | 10 +-
>  libavfilter/vf_sr.c  |  4 +++-
>  4 files changed, 39 insertions(+), 7 deletions(-)
>
> diff --git a/libavfilter/dnn_backend_native.c 
> b/libavfilter/dnn_backend_native.c
> index 8a83c63..06fbdf3 100644
> --- a/libavfilter/dnn_backend_native.c
> +++ b/libavfilter/dnn_backend_native.c
> @@ -24,8 +24,9 @@
>   */
>
>  #include "dnn_backend_native.h"
> +#include "libavutil/avassert.h"
>
> -static DNNReturnType set_input_output_native(void *model, DNNData *input, 
> const char *input_name, const char **output_names, uint32_t nb_output)
> +static DNNReturnType set_input_output_native(void *model, DNNInputData 
> *input, const char *input_name, const char **output_names, uint32_t nb_output)
>  {
>  ConvolutionalNetwork *network = (ConvolutionalNetwork *)model;
>  InputParams *input_params;
> @@ -45,6 +46,7 @@ static DNNReturnType set_input_output_native(void *model, 
> DNNData *input, const
>  if (input->data){
>  av_freep(&input->data);
>  }
> +av_assert0(input->dt == DNN_FLOAT);
>  network->layers[0].output = input->data = av_malloc(cur_height * 
> cur_width * cur_channels * sizeof(float));
>  if (!network->layers[0].output){
>  return DNN_ERROR;
> diff --git a/libavfilter/dnn_backend_tf.c b/libavfilter/dnn_backend_tf.c
> index ca6472d..ba959ae 100644
> --- a/libavfilter/dnn_backend_tf.c
> +++ b/libavfilter/dnn_backend_tf.c
> @@ -79,10 +79,31 @@ static TF_Buffer *read_graph(const char *model_filename)
>  return graph_buf;
>  }
>
> -static DNNReturnType set_input_output_tf(void *model, DNNData *input, const 
> char *input_name, const char **output_names, uint32_t nb_output)
> +static TF_Tensor *allocate_input_tensor(const DNNInputData *input)
>  {
> -TFModel *tf_model = (TFModel *)model;
> +TF_DataType dt;
> +size_t size;
>  int64_t input_dims[] = {1, input->height, input->width, input->channels};
> +switch (input->dt) {
> +case DNN_FLOAT:
> +dt = TF_FLOAT;
> +size = sizeof(float);
> +break;
> +case DNN_UINT8:
> +dt = TF_UINT8;
> +size = sizeof(char);
> +break;
> +default:
> +av_assert0(!"should not reach here");
> +}
> +
> +return TF_AllocateTensor(dt, input_dims, 4,
> + input_dims[1] * input_dims[2] * input_dims[3] * 
> size);
> +}
> +
> +static DNNReturnType set_input_output_tf(void *model, DNNInputData *input, 
> const char *input_name, const char **output_names, uint32_t nb_output)
> +{
> +TFModel *tf_model = (TFModel *)model;
>  TF_SessionOptions *sess_opts;
>  const TF_Operation *init_op = TF_GraphOperationByName(tf_model->graph, 
> "init");
>
> @@ -95,8 +116,7 @@ static DNNReturnType set_input_output_tf(void *model, 
> DNNData *input, const char
>  if (tf_model->input_tensor){
>  TF_DeleteTensor(tf_model->input_tensor);
>  }
> -tf_model->input_tensor = TF_AllocateTensor(TF_FLOAT, input_dims, 4,
> -   input_dims[1] * input_dims[2] 
> * input_dims[3] * sizeof(float));
> +tf_model->input_tensor = allocate_input_tensor(input);
>  if (!tf_model->input_tensor){
>  return DNN_ERROR;
>  }
> diff --git a/libavfilter/dnn_interface.h b/libavfilter/dnn_interface.h
> index 73d226e..c24df0e 100644
> --- a/libavfilter/dnn_interface.h
> +++ b/libavfilter/dnn_interface.h
> @@ -32,6 +32,14 @@ typedef enum {DNN_SUCCESS, DNN_ERROR} DNNReturnType;
>
>  typedef enum {DNN_NATIVE, DNN_TF} DNNBackendType;
>
> +typedef enum {DNN_FLOAT, DNN_UINT8} DNNDataType;
> +
> +typedef struct DNNInputData{
> +void *data;
> +DNNDataType dt;
> +int width, height, channels;
> +} DNNInputData;
> +
>  typedef struct DNNData{
>  float *data;
>  int width, height, channels;
> @@ -42,7 +50,7 @@ typedef struct DNNModel{
>  void *model;
>  // Sets model input and output.
>  // Should be called at least once before model execution.
> -DNNReturnType (*set_input_output)(void *model, DNNData *input, const 
> char *input_name, const char **output_names, uint32_t nb_output);
> +DNNReturnType (*set_input_output)(void *model, DNNInputData *input, 
> const char *input_name, const char **output_names, uint32_t nb_output);
>  } DNNModel;
>
>  // Stores pointers to functions for loading, executing, freeing DNN models 
> for one of the backends.
> diff --git a/libavfilter/vf_sr.c b/libavfilter/vf_sr.c
> index b4d4165..c0d7126 100644
> --- a/libavfilter/vf_sr.c
> +++ b/libavfilter/vf_sr.c
> @@ -40,7 +40,8 @@ typedef struct SRContext {
>  DNNB

Re: [FFmpeg-devel] [PATCH V2 7/7] libavfilter/dnn: add more data type support for dnn model input

2019-04-29 Thread Pedro Arthur
On Mon, Apr 29, 2019 at 22:21, Guo, Yejun wrote:
>
>
>
> > -Original Message-
> > From: Pedro Arthur [mailto:bygran...@gmail.com]
> > Sent: Tuesday, April 30, 2019 1:47 AM
> > To: FFmpeg development discussions and patches 
> > Cc: Guo, Yejun 
> > Subject: Re: [FFmpeg-devel] [PATCH V2 7/7] libavfilter/dnn: add more data 
> > type
> > support for dnn model input
> >
> > Em qua, 24 de abr de 2019 às 23:15, Guo, Yejun 
> > escreveu:
> > >
> > > currently, only float is supported as model input, actually, there
> > > are other data types, this patch adds uint8.
> > >
> > > Signed-off-by: Guo, Yejun 
> >
> >
> > LGTM.
> >
> > I think it would be valuable to add a few tests covering the features
> > added by this patch series.
>
> thanks, good point. Do you mean FATE? Is there any previous test for DNN 
> module that I can refer to? thanks. I'll investigate it after my holiday, I'm 
> starting vacation today.

Yes, I mean FATE; unfortunately there aren't any tests for DNN at the moment.

Have a nice vacation!

Re: [FFmpeg-devel] [PATCH V2 7/7] libavfilter/dnn: add more data type support for dnn model input

2019-05-08 Thread Pedro Arthur
On Wed, May 8, 2019 at 05:28, Guo, Yejun wrote:
>
>
>
> > -Original Message-
> > From: Pedro Arthur [mailto:bygran...@gmail.com]
> > Sent: Tuesday, April 30, 2019 1:47 AM
> > To: FFmpeg development discussions and patches 
> > Cc: Guo, Yejun 
> > Subject: Re: [FFmpeg-devel] [PATCH V2 7/7] libavfilter/dnn: add more data 
> > type
> > support for dnn model input
> > > +sr_context->input.dt = DNN_FLOAT;
> > >  sr_context->sws_contexts[0] = NULL;
> > >  sr_context->sws_contexts[1] = NULL;
> > >  sr_context->sws_contexts[2] = NULL;
> > > --
> > > 2.7.4
> > >
> >
> > LGTM.
> >
> > I think it would be valuable to add a few tests covering the features
> > added by this patch series.
>
> I tried a bit to add FATE for dnn module, see basic code in attached file.
> We can only test native mode because FATE does not allow external dependency.
>
> The native mode is still in early stage, and I plan to add support to
> import TF model as native model with ffmpeg c code (as discussed in another 
> thread).
> That's might be a better time to add FATE after the import is finished.
> We can add unit tests for all the native ops at that time.
>
> As for this patch series, it mainly focus on the TF mode, it might not be
> suitable to add FATE for it.
>
> So, how about to push this patch set, and add FATE when the native mode is a 
> little more mature? thanks.
Patch set pushed, sorry for the delay.

Later I'll properly review the unit test patch.

Re: [FFmpeg-devel] [PATCH v2] libavutil: add an FFT & MDCT implementation

2019-05-11 Thread Pedro Arthur
On Sat, May 11, 2019 at 20:26, James Almer wrote:
>
> On 5/11/2019 8:08 PM, Carl Eugen Hoyos wrote:
> > On Sun, May 12, 2019 at 01:00, Lynne wrote:
> >>
> >> May 11, 2019, 11:08 PM by ceffm...@gmail.com:
> >>
> >>> On Sat, May 11, 2019 at 14:41, Lynne <d...@lynne.ee> wrote:
> >>>
> 
>  May 10, 2019, 8:59 PM by ceffm...@gmail.com:
> 
> > On Fri, May 10, 2019 at 19:54, Lynne <d...@lynne.ee> wrote:
> >
> >>
> >> May 10, 2019, 4:14 PM by d...@lynne.ee:
> >>
> >>> Patch updated again.
> >>> Made some more cleanups to the transforms, the tables and the main 
> >>> context.
> >>> API changed again, now the init function populates the function 
> >>> pointer for transform.
> >>> I decided that having a separate function would encourage bad usage 
> >>> (e.g. calling
> >>> the function every time before doing a transform rather than storing 
> >>> the pointer) when
> >>> we're trying to avoid the overhead of function calls.
> >>> Also adjusted file names to match the API.
> >>>
> >>
> >> Forgot to change an include, new patch attached.
> >>
> >
> > If I understand the commit message correctly, some of the code
> > in the new file you are adding comes from other parts of FFmpeg.
> > I am surprised that there is no copyright claim on the top of this
> > new file.
> > Is there none on top of the files you took the code from?
> >
> 
>  The project isn't consistent with updating nor putting copyright headers 
>  on files so
>  I'd rather keep the headers clean. Commit messages and authors are the 
>  only way to
>  know who authored what.
> 
> >>>
> >>> I don't think this is correct, but that is not the question: Copyright
> >>> law is (at least here)
> >>> very clear, if somebody put his name on top of the file, you must not 
> >>> remove it,
> >>> especially not when moving code from one file into another.
> >>>
> >>
> >> "Here"? You're probably referring to some county's laws, those don't apply 
> >> universally,
> >> especially not to the internet.
> >> Either way, that rule hasn't really been respected despite the major 
> >> refactoring that has
> >> happened in the past so I don't see why it has to be respected now.
> >
> > Please point me to the commit you indicate so I can fix this (claimed) 
> > copyright
> > violation.
> >
> >> The only parts I didn't rewrite are the power of two FFT, which I can NIH 
> >> in a week if
> >> necessary, and in fact lately with the research papers I've recently read 
> >> I'm thinking
> >> I should.
> >
> > Why don't you simply copy the copyright statement from the file where you 
> > copied
> > it from instead (if there is one)?
> > Wouldn't that be much quicker than this email exchange?
> >
> > Carl Eugen
>
> The commit message already states it takes parts of lavc's fft
> implementation, and the git story can't be rewritten, so authorship is
> known or can be easily figured out. Could we please focus on technical
> matters instead of wasting time in a back and forth about stuff like this?
>

Not saying this is the case, but if one interprets it as copying the
code, adding more code, removing some parts, and removing the copyright
notices previously present, then it clearly violates the license.
It seems wise to carry the copyright notices from the original code over
into the new file.
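
In practice that just means keeping the existing notice and appending a
new line when code moves, e.g. (author names are placeholders, only to
show the stacked-notice convention already used across the tree):

/*
 * Copyright (c) 2018 Original Author
 * Copyright (c) 2019 Contributor Who Moved Or Extended The Code
 *
 * This file is part of FFmpeg.
 * ...
 */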

> Thanks.

Re: [FFmpeg-devel] [PATCH v2] libavutil: add an FFT & MDCT implementation

2019-05-12 Thread Pedro Arthur
On Sun, May 12, 2019 at 18:11, Hendrik Leppkes wrote:
>
> On Sun, May 12, 2019 at 11:05 PM Carl Eugen Hoyos  wrote:
> >
> > But seriously: We are of course not allowed to remove copyright
> > statements, no matter if we consider them relevant or not.
> >
>
> Please provide a source for such claims.
The GPL license referenced in the file headers states exactly that:

GPL2 [1] - "keep intact all the notices that refer to this License"
GPL3 [2] - "Requiring preservation of specified reasonable legal
notices or author attributions in that material"
MIT [3] (for completeness) - "The above copyright notice and this
permission notice shall be included in all copies or substantial
portions of the Software."

[1] - https://www.gnu.org/licenses/old-licenses/gpl-2.0.html
[2] - http://www.gnu.org/licenses/gpl.html
[3] - https://opensource.org/licenses/MIT

Re: [FFmpeg-devel] [PATCH v2] Add multiple padding method in dnn native

2019-05-15 Thread Pedro Arthur
On Wed, May 15, 2019 at 04:44, Steven Liu wrote:
>
> On Sat, May 11, 2019 at 11:11 AM, Xuewei Meng wrote:
> >
> > ---
> >  libavfilter/dnn_backend_native.c | 52 
> >  libavfilter/dnn_backend_native.h |  3 ++
> >  2 files changed, 43 insertions(+), 12 deletions(-)
> >
> > diff --git a/libavfilter/dnn_backend_native.c 
> > b/libavfilter/dnn_backend_native.c
> > index 06fbdf368b..171a756385 100644
> > --- a/libavfilter/dnn_backend_native.c
> > +++ b/libavfilter/dnn_backend_native.c
> > @@ -61,6 +61,12 @@ static DNNReturnType set_input_output_native(void 
> > *model, DNNInputData *input, c
> >  return DNN_ERROR;
> >  }
> >  cur_channels = conv_params->output_num;
> > +
> > +if(conv_params->padding_method == VALID){
> > +int pad_size = conv_params->kernel_size - 1;
> > +cur_height -= pad_size;
> > +cur_width -= pad_size;
> > +}
> >  break;
> >  case DEPTH_TO_SPACE:
> >  depth_to_space_params = (DepthToSpaceParams 
> > *)network->layers[layer].params;
> > @@ -77,6 +83,10 @@ static DNNReturnType set_input_output_native(void 
> > *model, DNNInputData *input, c
> >  if (network->layers[layer].output){
> >  av_freep(&network->layers[layer].output);
> >  }
> > +
> > +if(cur_height <= 0 || cur_width <= 0)
> > +return DNN_ERROR;
> > +
> >  network->layers[layer].output = av_malloc(cur_height * cur_width * 
> > cur_channels * sizeof(float));
> >  if (!network->layers[layer].output){
> >  return DNN_ERROR;
> > @@ -154,13 +164,14 @@ DNNModel *ff_dnn_load_model_native(const char 
> > *model_filename)
> >  ff_dnn_free_model_native(&model);
> >  return NULL;
> >  }
> > +conv_params->padding_method = 
> > (int32_t)avio_rl32(model_file_context);
> >  conv_params->activation = 
> > (int32_t)avio_rl32(model_file_context);
> >  conv_params->input_num = 
> > (int32_t)avio_rl32(model_file_context);
> >  conv_params->output_num = 
> > (int32_t)avio_rl32(model_file_context);
> >  conv_params->kernel_size = 
> > (int32_t)avio_rl32(model_file_context);
> >  kernel_size = conv_params->input_num * conv_params->output_num 
> > *
> >conv_params->kernel_size * 
> > conv_params->kernel_size;
> > -dnn_size += 16 + (kernel_size + conv_params->output_num << 2);
> > +dnn_size += 20 + (kernel_size + conv_params->output_num << 2);
> >  if (dnn_size > file_size || conv_params->input_num <= 0 ||
> >  conv_params->output_num <= 0 || conv_params->kernel_size 
> > <= 0){
> >  avio_closep(&model_file_context);
> > @@ -218,23 +229,35 @@ DNNModel *ff_dnn_load_model_native(const char 
> > *model_filename)
> >
> >  static void convolve(const float *input, float *output, const 
> > ConvolutionalParams *conv_params, int width, int height)
> >  {
> > -int y, x, n_filter, ch, kernel_y, kernel_x;
> >  int radius = conv_params->kernel_size >> 1;
> >  int src_linesize = width * conv_params->input_num;
> >  int filter_linesize = conv_params->kernel_size * 
> > conv_params->input_num;
> >  int filter_size = conv_params->kernel_size * filter_linesize;
> > +int pad_size = (conv_params->padding_method == VALID) ? 
> > (conv_params->kernel_size - 1) / 2 : 0;
> >
> > -for (y = 0; y < height; ++y){
> > -for (x = 0; x < width; ++x){
> > -for (n_filter = 0; n_filter < conv_params->output_num; 
> > ++n_filter){
> > +for (int y = pad_size; y < height - pad_size; ++y){
> > +for (int x = pad_size; x < width - pad_size; ++x){
> > +for (int n_filter = 0; n_filter < conv_params->output_num; 
> > ++n_filter){
> >  output[n_filter] = conv_params->biases[n_filter];
> > -for (ch = 0; ch < conv_params->input_num; ++ch){
> > -for (kernel_y = 0; kernel_y < 
> > conv_params->kernel_size; ++kernel_y){
> > -for (kernel_x = 0; kernel_x < 
> > conv_params->kernel_size; ++kernel_x){
> > -output[n_filter] += input[CLAMP_TO_EDGE(y + 
> > kernel_y - radius, height) * src_linesize +
> > -  CLAMP_TO_EDGE(x + 
> > kernel_x - radius, width) * conv_params->input_num + ch] *
> > -
> > conv_params->kernel[n_filter * filter_size + kernel_y * filter_linesize +
> > -
> > kernel_x * conv_params->input_num + ch];
> > +
> > +for (int ch = 0; ch < conv_params->input_num; ++ch){
> > +for (int kernel_y = 0; kernel_y < 
> > conv_params->kernel_size; ++kernel_y){
> > +

Re: [FFmpeg-devel] native mode in FFmpeg DNN module

2019-05-24 Thread Pedro Arthur
On Thu, May 23, 2019 at 00:06, Guo, Yejun wrote:
>
>
>
> > > > > > > Option 2)
> > > > > > > Write c code in FFmpeg to convert tensorflow file format (format 
> > > > > > > 1)
> > > directly
> > > > > > into memory representation (format 3), and so we controls 
> > > > > > everything in
> > > > > > ffmpeg community. And the conversion can be extended to import more
> > > file
> > > > > > formats such as torch, darknet, etc. One example is that OpenCV uses
> > > this
> > > > > > method.
> > > > > > >
> > > > > > > The in memory representation (format 3) can still be current.
> > > > > > >
> > > > > >
> > > > > > Option 2 would be ideal, as it does not introduce any dependency for
> > > > > > using the native backend.
> > > > > > Yet I'm not sure  how complex implementing the tf model reader can
> > be,
> > > > > > If I remember correctly the student said it was not trivial at the
> > > > > > time.
> > > > >
> > > > > yes, it is not easy, but I think it is worthy to do it. Here is a 
> > > > > reference
> > > example
> > > > > for the complexity, see
> > > > >
> > >
> > https://github.com/opencv/opencv/blob/master/modules/dnn/src/tensorflow/
> > > > > tf_importer.cpp.
> > > > >
> > > > > >
> > > > > > Is the tf model file stable? if not it will be a maintenance burden 
> > > > > > to
> > > > > > keep it working whenever tf releases a new version. This point makes
> > > > > > me think having control over our file format is good.
> > > > >
> > > > > imho, this issue is always there, no matter which method used, unless 
> > > > > our
> > > > > format could be exported by tensorflow (it has little possibility).
> > > > >
> > > > > Whenever tf releases a new version with a new file format, we still 
> > > > > have
> > to
> > > > > change the python script in phase 1 (convert tf file model to our 
> > > > > format)
> > > which
> > > > > is even an external dependency at
> > > > > https://github.com/HighVoltageRocknRoll/sr,
> > > > >
> > > > > As from effort perspective, the current implementation is better since
> > > python
> > > > > script is simpler. But I think we are still worth implementing option 
> > > > > 2 as
> > the
> > > > > ideal technical direction.
> > > >
> > > > I checked a bit more about https://github.com/HighVoltageRocknRoll/sr, 
> > > > it
> > is
> > > actually
> > > > not an converter (from tf model to native model), but hard code for 
> > > > given
> > > models.
> > > > And the native model is not exactly the same as tf model, it even 
> > > > changes
> > the
> > > behavior
> > > > of pad parameter of conv layer.
> > > >
> > > > If community is open to option 2, I'll try it.
> > > >
> > > Option 2 is fine for me.
> >
> > that's great, :)
>
> looks that option 2 is a bit complex, TF model file is in protocol buffers 
> (protobuf) format and not easy to parse it with simple c code.
>
> Since there is no official c support for protobuf, let's first image how the 
> work can be done via official c++ support.
>
> 1. get protobuf compiler protoc, .h header files and .so library files 
> (download or build from 
> https://github.com/protocolbuffers/protobuf/tree/master/src).
> 2. get tensorflow model's .proto files from 
> https://github.com/tensorflow/tensorflow/tree/master/tensorflow/core/framework.
> 3. generate .cc/.h files from .proto files (see step 2) via protoc (see step 
> 1).
> 4. let the generated .cc/.h files be part of ffmpeg source tree, and build 
> with protobuf header/library files.
> 5. at run time, the protobuf libraries are invoked. It means that the system 
> should have installed protobuf dev package.
>
> furthermore, there is a compatible problem between the protobuf compiler, 
> header files and library files.
> So, as a practice to fix it, the method is to make the protobuf source code 
> be part of ffmpeg source tree. (it is a common practice, so we can many other 
> projects contain the protobuf source code).
>
> I guess the above method is not acceptable in ffmpeg. I would be glad to 
> continue if the community embrace this change. :)
Indeed I think it is not acceptable.
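
For a sense of scale: the protobuf wire format itself is simple, just
tagged fields plus base-128 varints, roughly the sketch below (my own
illustration, not proposed code). The real maintenance burden is GraphDef,
a deep tree of nested message types that a hand-written C parser would
have to track across TensorFlow releases.

#include <inttypes.h>
#include <stdio.h>

/* Minimal base-128 varint decoder, the basic building block of the
 * protobuf wire format (illustration only: no bounds/overflow checks). */
static uint64_t read_varint(const uint8_t **p)
{
    uint64_t v = 0;
    int shift = 0;
    while (**p & 0x80) {
        v |= (uint64_t)(**p & 0x7f) << shift;
        (*p)++;
        shift += 7;
    }
    v |= (uint64_t)**p << shift;
    (*p)++;
    return v;
}

int main(void)
{
    const uint8_t buf[] = { 0xac, 0x02 }; /* wire encoding of 300 */
    const uint8_t *p = buf;
    printf("%"PRIu64"\n", read_varint(&p)); /* prints 300 */
    return 0;
}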

>
> While the current implementation has external dependency, my new suggestion 
> is:
> -  add a python script under .../libavfilter/dnn/  (all other dnn source 
> files will be also moved here later), so ffmpeg has the full control on it.
I'm not sure about the policy on shipping auxiliary scripts alongside the
main code, but another option would be to create a repository controlled
by FFmpeg.
I think this option would also help GSoC students working on DNN, so they
don't have to depend on previous students maintaining independent
repositories.

> -  it is a script to convert tensorflow model file into native model file. 
> (other formats such as caffe, torch can also be supported later if needed)
>
> thanks.

Re: [FFmpeg-devel] [PATCH V2 1/2] libavfilter/dnn: add script to convert TensorFlow model (.pb) to native model (.model)

2019-06-11 Thread Pedro Arthur
Hi,

On Tue, Jun 11, 2019 at 05:00, Guo, Yejun wrote:
>
>
> there are three options for the place to put these .py scripts.
> 1) at libavfilter/dnn/python/
>   the point is to put all the dnn stuffs together
> 2) at tools/python/
>   the point is that there is already a .py script under tools/
> 3) create a new project controlled by ffmpeg
>   the point is that the python scripts should not be part of ffmpeg source 
> tree.
>   (btw, how to apply such sub project?)
>
I think option (2) is better, since tools/ already hosts a Python script,
even if (1) would be more convenient for keeping the DNN pieces together.


> My idea is that the script generates dnn native model file which is loaded by 
> ffmpeg c code,
> it is better to put the script within the ffmpeg source tree, and all the dnn 
> stuffs would be better to put together, thanks.
>
> anyway, I'm open to any option, just to make the progress continue ...
>
> >
> > ping for review, thanks.
> >
> > Here is my rough plan after this patch.
> > - move dnn relative .h/.c from libavfilter to libavfilter/dnn, it is 
> > expected there
> > will be more files for dnn module (code for both model loading and 
> > execution).
> > - add a layer for padding (tf.pad) for native mode and its fate test.
> > - change the script to add tf.pad support, and so the native model and the 
> > tf
> > model of vf_sr will be the same.
> >  in current implementation, the two models have a little difference, it 
> > makes
> > the script not a general solution to convert tf model to native model.
> > - add layer maximum and fate test. This layer appears in tf model, but not 
> > in
> > native model, of vf_sr.
> > - introduce operand concept in native mode (both execution and model), to
> > support data split and merge/concat in the network, such split/concat is 
> > very
> > common.
> >  it also makes possible to reuse memory for the intermediate data as the
> > output of the hidden layers.
> > - tune conv2d layer performance (it is very slow now) or add more layers for
> > native mode.
> >

Re: [FFmpeg-devel] [PATCH V3 1/3] tools/python: add script to convert TensorFlow model (.pb) to native model (.model)

2019-06-24 Thread Pedro Arthur
LGTM.

BTW I think we should have an FFmpeg-controlled repo hosting the scripts
used to train the networks and also some pretrained model files, to ease
testing.

On Wed, Jun 19, 2019 at 21:29, Guo, Yejun wrote:
>
>
>
> > -Original Message-
> > From: Guo, Yejun
> > Sent: Thursday, June 13, 2019 1:31 PM
> > To: ffmpeg-devel@ffmpeg.org
> > Cc: Guo, Yejun 
> > Subject: [PATCH V3 1/3] tools/python: add script to convert TensorFlow model
> > (.pb) to native model (.model)
> >
> > For example, given TensorFlow model file espcn.pb,
> > to generate native model file espcn.model, just run:
> > python convert.py espcn.pb
> >
> > In current implementation, the native model file is generated for
> > specific dnn network with hard-code python scripts maintained out of ffmpeg.
> > For example, srcnn network used by vf_sr is generated with
> > https://github.com/HighVoltageRocknRoll/sr/blob/master/generate_header_a
> > nd_model.py#L85
> >
> > In this patch, the script is designed as a general solution which
> > converts general TensorFlow model .pb file into .model file. The script
> > now has some tricky to be compatible with current implemention, will
> > be refined step by step.
> >
> > The script is also added into ffmpeg source tree. It is expected there
> > will be many more patches and community needs the ownership of it.
> >
> > Another technical direction is to do the conversion in c/c++ code within
> > ffmpeg source tree. While .pb file is organized with protocol buffers,
> > it is not easy to do such work with tiny c/c++ code, see more discussion
> > at http://ffmpeg.org/pipermail/ffmpeg-devel/2019-May/244496.html. So,
> > choose the python script.
> >
> > Signed-off-by: Guo, Yejun 
> > ---
> >  .gitignore  |   1 +
> >  tools/python/convert.py |  52 +
> >  tools/python/convert_from_tensorflow.py | 201
> > 
> >  3 files changed, 254 insertions(+)
> >  create mode 100644 tools/python/convert.py
> >  create mode 100644 tools/python/convert_from_tensorflow.py
>
> this patch set ping for review, thanks.
>
> >
> > diff --git a/.gitignore b/.gitignore
> > index 0e57cb0..2450ee8 100644
> > --- a/.gitignore
> > +++ b/.gitignore
> > @@ -36,3 +36,4 @@
> >  /lcov/
> >  /src
> >  /mapfile
> > +/tools/python/__pycache__/
> > diff --git a/tools/python/convert.py b/tools/python/convert.py
> > new file mode 100644
> > index 000..662b429
> > --- /dev/null
> > +++ b/tools/python/convert.py
> > @@ -0,0 +1,52 @@
> > +# Copyright (c) 2019 Guo Yejun
> > +#
> > +# This file is part of FFmpeg.
> > +#
> > +# FFmpeg is free software; you can redistribute it and/or
> > +# modify it under the terms of the GNU Lesser General Public
> > +# License as published by the Free Software Foundation; either
> > +# version 2.1 of the License, or (at your option) any later version.
> > +#
> > +# FFmpeg is distributed in the hope that it will be useful,
> > +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +# Lesser General Public License for more details.
> > +#
> > +# You should have received a copy of the GNU Lesser General Public
> > +# License along with FFmpeg; if not, write to the Free Software
> > +# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 
> > USA
> > +#
> > 
> > ==
> > +
> > +# verified with Python 3.5.2 on Ubuntu 16.04
> > +import argparse
> > +import os
> > +from convert_from_tensorflow import *
> > +
> > +def get_arguments():
> > +parser = argparse.ArgumentParser(description='generate native mode
> > model with weights from deep learning model')
> > +parser.add_argument('--outdir', type=str, default='./', help='where to 
> > put
> > generated files')
> > +parser.add_argument('--infmt', type=str, default='tensorflow',
> > help='format of the deep learning model')
> > +parser.add_argument('infile', help='path to the deep learning model 
> > with
> > weights')
> > +
> > +return parser.parse_args()
> > +
> > +def main():
> > +args = get_arguments()
> > +
> > +if not os.path.isfile(args.infile):
> > +print('the specified input file %s does not exist' % args.infile)
> > +exit(1)
> > +
> > +if not os.path.exists(args.outdir):
> > +print('create output directory %s' % args.outdir)
> > +os.mkdir(args.outdir)
> > +
> > +basefile = os.path.split(args.infile)[1]
> > +basefile = os.path.splitext(basefile)[0]
> > +outfile = os.path.join(args.outdir, basefile) + '.model'
> > +
> > +if args.infmt == 'tensorflow':
> > +convert_from_tensorflow(args.infile, outfile)
> > +
> > +if __name__ == '__main__':
> > +main()
> > diff --git a/tools/python/convert_from_tensorflow.py
> > b/tools/python/convert_from_tensorflow.py
> > new file mode 100644
> > index 000..37049e5
> > --- /de

Re: [FFmpeg-devel] [PATCH V3 1/3] tools/python: add script to convert TensorFlow model (.pb) to native model (.model)

2019-06-24 Thread Pedro Arthur
On Mon, Jun 24, 2019 at 12:24, Guo, Yejun wrote:
>
> yes, good idea. Do you happen to know how to apply such repo? thanks.
>
I think you should ask Michael.

> >
> > On Wed, Jun 19, 2019 at 21:29, Guo, Yejun wrote:
> > >
> > >
> > >
> > > > -Original Message-
> > > > From: Guo, Yejun
> > > > Sent: Thursday, June 13, 2019 1:31 PM
> > > > To: ffmpeg-devel@ffmpeg.org
> > > > Cc: Guo, Yejun 
> > > > Subject: [PATCH V3 1/3] tools/python: add script to convert TensorFlow
> > model
> > > > (.pb) to native model (.model)
> > > >
> > > > For example, given TensorFlow model file espcn.pb,
> > > > to generate native model file espcn.model, just run:
> > > > python convert.py espcn.pb
> > > >
> > > > In current implementation, the native model file is generated for
> > > > specific dnn network with hard-code python scripts maintained out of
> > ffmpeg.
> > > > For example, srcnn network used by vf_sr is generated with
> > > >
> > https://github.com/HighVoltageRocknRoll/sr/blob/master/generate_header_a
> > > > nd_model.py#L85
> > > >
> > > > In this patch, the script is designed as a general solution which
> > > > converts general TensorFlow model .pb file into .model file. The script
> > > > now has some tricky to be compatible with current implemention, will
> > > > be refined step by step.
> > > >
> > > > The script is also added into ffmpeg source tree. It is expected there
> > > > will be many more patches and community needs the ownership of it.
> > > >
> > > > Another technical direction is to do the conversion in c/c++ code within
> > > > ffmpeg source tree. While .pb file is organized with protocol buffers,
> > > > it is not easy to do such work with tiny c/c++ code, see more discussion
> > > > at http://ffmpeg.org/pipermail/ffmpeg-devel/2019-May/244496.html. So,
> > > > choose the python script.
> > > >
> > > > Signed-off-by: Guo, Yejun 
> > > > ---
> > > >  .gitignore  |   1 +
> > > >  tools/python/convert.py |  52 +
> > > >  tools/python/convert_from_tensorflow.py | 201
> > > > 
> > > >  3 files changed, 254 insertions(+)
> > > >  create mode 100644 tools/python/convert.py
> > > >  create mode 100644 tools/python/convert_from_tensorflow.py
> > >
> > > this patch set ping for review, thanks.
> > >
> > > >
> > > > diff --git a/.gitignore b/.gitignore
> > > > index 0e57cb0..2450ee8 100644
> > > > --- a/.gitignore
> > > > +++ b/.gitignore
> > > > @@ -36,3 +36,4 @@
> > > >  /lcov/
> > > >  /src
> > > >  /mapfile
> > > > +/tools/python/__pycache__/
> > > > diff --git a/tools/python/convert.py b/tools/python/convert.py
> > > > new file mode 100644
> > > > index 000..662b429
> > > > --- /dev/null
> > > > +++ b/tools/python/convert.py
> > > > @@ -0,0 +1,52 @@
> > > > +# Copyright (c) 2019 Guo Yejun
> > > > +#
> > > > +# This file is part of FFmpeg.
> > > > +#
> > > > +# FFmpeg is free software; you can redistribute it and/or
> > > > +# modify it under the terms of the GNU Lesser General Public
> > > > +# License as published by the Free Software Foundation; either
> > > > +# version 2.1 of the License, or (at your option) any later version.
> > > > +#
> > > > +# FFmpeg is distributed in the hope that it will be useful,
> > > > +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > GNU
> > > > +# Lesser General Public License for more details.
> > > > +#
> > > > +# You should have received a copy of the GNU Lesser General Public
> > > > +# License along with FFmpeg; if not, write to the Free Software
> > > > +# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 
> > > > 02110-1301
> > USA
> > > > +#
> > > >
> > 
> > > > ==
> > > > +
> > > > +# verified with Python 3.5.2 on Ubuntu 16.04
> > > > +import argparse
> > > > +import os
> > > > +from convert_from_tensorflow import *
> > > > +
> > > > +def get_arguments():
> > > > +parser = argparse.ArgumentParser(description='generate native
> > mode
> > > > model with weights from deep learning model')
> > > > +parser.add_argument('--outdir', type=str, default='./', 
> > > > help='where to
> > put
> > > > generated files')
> > > > +parser.add_argument('--infmt', type=str, default='tensorflow',
> > > > help='format of the deep learning model')
> > > > +parser.add_argument('infile', help='path to the deep learning model
> > with
> > > > weights')
> > > > +
> > > > +return parser.parse_args()
> > > > +
> > > > +def main():
> > > > +args = get_arguments()
> > > > +
> > > > +if not os.path.isfile(args.infile):
> > > > +print('the specified input file %s does not exist' % 
> > > > args.infile)
> > > > +exit(1)
> > > > +
> > > > +if not os.path.exists(args.outdir):
> > > > +print('create output directory %s' % args.outdir)
> > > > +  

Re: [FFmpeg-devel] [PATCH V3 1/3] tools/python: add script to convert TensorFlow model (.pb) to native model (.model)

2019-07-01 Thread Pedro Arthur
On Mon, Jul 1, 2019 at 05:21, Guo, Yejun wrote:
>
>
>
> > -Original Message-
> > From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> > Pedro Arthur
> > Sent: Monday, June 24, 2019 11:13 PM
> > To: FFmpeg development discussions and patches 
> > Subject: Re: [FFmpeg-devel] [PATCH V3 1/3] tools/python: add script to 
> > convert
> > TensorFlow model (.pb) to native model (.model)
> >
> > LGTM.
>
> this patch set asks for more comments or push, thanks.
>
Pushed.

Re: [FFmpeg-devel] [PATCH 1/2] dnn: add layer pad which is equivalent to tf.pad

2019-07-26 Thread Pedro Arthur
Hi,
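
One note for readers unfamiliar with tf.pad semantics: the SYMMETRIC and
REFLECT "buddy" index computation in this patch is easiest to see with a
tiny standalone demo (my own sketch, not part of the patch):

#include <stdio.h>

int main(void)
{
    /*
     * Map each position in the "before" padding of a 1-D row
     * (pad = 2) back to the source index it copies from.
     * SYMMETRIC repeats the edge sample, REFLECT does not:
     *   input:             a b c d
     *   SYMMETRIC (pad 2): b a | a b c d
     *   REFLECT   (pad 2): c b | a b c d
     */
    int pad = 2;
    for (int i = 0; i < pad; i++) {
        int sym  = pad - 1 - i; /* symmetric buddy, in source coords */
        int refl = pad - i;     /* reflect buddy, in source coords */
        printf("padded pos %d <- symmetric src %d, reflect src %d\n",
               i, sym, refl);
    }
    return 0;
}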

On Mon, Jul 1, 2019 at 05:10, Guo, Yejun wrote:
>
> the reason to add this layer first is that vf_sr uses it in its
> tensorflow model, and the next plan is to update the python script
> to convert tf.pad into native model.
>
> Signed-off-by: Guo, Yejun 
> ---
>  libavfilter/dnn/Makefile   |   1 +
>  libavfilter/dnn/dnn_backend_native_layer_pad.c | 211 
> +
>  libavfilter/dnn/dnn_backend_native_layer_pad.h |  40 +
>  3 files changed, 252 insertions(+)
>  create mode 100644 libavfilter/dnn/dnn_backend_native_layer_pad.c
>  create mode 100644 libavfilter/dnn/dnn_backend_native_layer_pad.h
>
> diff --git a/libavfilter/dnn/Makefile b/libavfilter/dnn/Makefile
> index 1d12ade..83938e5 100644
> --- a/libavfilter/dnn/Makefile
> +++ b/libavfilter/dnn/Makefile
> @@ -1,5 +1,6 @@
>  OBJS-$(CONFIG_DNN)   += dnn/dnn_interface.o
>  OBJS-$(CONFIG_DNN)   += dnn/dnn_backend_native.o
> +OBJS-$(CONFIG_DNN)   += 
> dnn/dnn_backend_native_layer_pad.o
>
>  DNN-OBJS-$(CONFIG_LIBTENSORFLOW) += dnn/dnn_backend_tf.o
>
> diff --git a/libavfilter/dnn/dnn_backend_native_layer_pad.c 
> b/libavfilter/dnn/dnn_backend_native_layer_pad.c
> new file mode 100644
> index 000..aa12f7f
> --- /dev/null
> +++ b/libavfilter/dnn/dnn_backend_native_layer_pad.c
> @@ -0,0 +1,211 @@
> +/*
> + * Copyright (c) 2019 Guo Yejun
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 
> USA
> + */
> +
> +#include 
> +#include "libavutil/avassert.h"
> +#include "dnn_backend_native_layer_pad.h"
> +
> +static int before_get_buddy(int given, int paddings, LayerPadModeParam mode)
> +{
> +if (mode == LPMP_SYMMETRIC) {
> +return (2 * paddings - 1 - given);
> +} else if (mode == LPMP_REFLECT) {
> +return (2 * paddings - given);
> +} else {
> +av_assert0(!"should not reach here");
> +return 0;
> +}
> +}
> +
> +static int after_get_buddy(int given, int border, LayerPadModeParam mode)
> +{
> +if (mode == LPMP_SYMMETRIC) {
> +int offset = given - border;
> +return (border - 1 - offset);
> +} else if (mode == LPMP_REFLECT) {
> +int offset = given - border;
> +return (border - 2 - offset);
> +} else {
> +av_assert0(!"should not reach here");
> +return 0;
> +}
> +}
> +
> +void dnn_execute_layer_pad(const float *input, float *output, const 
> LayerPadParams *params, int number, int height, int width, int channel)
> +{
> +int32_t before_paddings;
> +int32_t after_paddings;
> +
> +// suppose format is 
> +int new_number = number + params->paddings[0][0] + 
> params->paddings[0][1];
> +int new_height = height + params->paddings[1][0] + 
> params->paddings[1][1];
> +int new_width = width + params->paddings[2][0] + params->paddings[2][1];
> +int new_channel = channel + params->paddings[3][0] + 
> params->paddings[3][1];
> +
> +int c_stride = channel;
> +int wc_stride = c_stride * width;
> +int hwc_stride = wc_stride * height;
> +
> +int new_c_stride = new_channel;
> +int new_wc_stride = new_c_stride * new_width;
> +int new_hwc_stride = new_wc_stride * new_height;
> +
> +// copy the original data
> +for (int n = 0; n < number; n++) {
> +for (int h = 0; h < height; h++) {
> +for (int w = 0; w < width; w++) {
> +const float *src = input + n * hwc_stride + h * wc_stride + 
> w * c_stride;
> +float *dst = output + (n + params->paddings[0][0]) * 
> new_hwc_stride
> ++ (h + params->paddings[1][0]) * 
> new_wc_stride
> ++ (w + params->paddings[2][0]) * 
> new_c_stride
> ++ params->paddings[3][0];
> +memcpy(dst, src, channel * sizeof(float));
> +}
> +}
> +}
> +
> +// handle the first dimension
> +before_paddings = params->paddings[0][0];
> +after_paddings = params->paddings[0][1];
> +for (int n = 0; n < before_paddings; n++) {
> +float *dst = output + n * new_hwc_stride;
> +if (params->mode == LPMP_CONSTANT) {
> 

Re: [FFmpeg-devel] [PATCH 1/4] libavfilter/dnn: move dnn files from libavfilter to libavfilter/dnn

2019-07-26 Thread Pedro Arthur
Hi,
It fails the FATE source check for header guards; the guards should be
changed from AVFILTER_DNN_BACKEND_xxx to AVFILTER_DNN_DNN_BACKEND_xxx to
match the new directory.
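
For example, for libavfilter/dnn/dnn_backend_native.h the guard becomes:

#ifndef AVFILTER_DNN_DNN_BACKEND_NATIVE_H
#define AVFILTER_DNN_DNN_BACKEND_NATIVE_H
/* ... declarations ... */
#endif
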
Other than that it LGTM.

On Tue, Jul 16, 2019 at 02:58, Guo, Yejun wrote:
>
> it is expected that there will be more files to support native mode,
> so put all the dnn codes under libavfilter/dnn
>
> The main change of this patch is to move the file location, see below:
> modified:   libavfilter/Makefile
> new file:   libavfilter/dnn/Makefile
> renamed:libavfilter/dnn_backend_native.c -> 
> libavfilter/dnn/dnn_backend_native.c
> renamed:libavfilter/dnn_backend_native.h -> 
> libavfilter/dnn/dnn_backend_native.h
> renamed:libavfilter/dnn_backend_tf.c -> libavfilter/dnn/dnn_backend_tf.c
> renamed:libavfilter/dnn_backend_tf.h -> libavfilter/dnn/dnn_backend_tf.h
> renamed:libavfilter/dnn_interface.c -> libavfilter/dnn/dnn_interface.c
>
> Signed-off-by: Guo, Yejun 
> ---
>  libavfilter/Makefile |   3 +-
>  libavfilter/dnn/Makefile |   6 +
>  libavfilter/dnn/dnn_backend_native.c | 389 ++
>  libavfilter/dnn/dnn_backend_native.h |  74 +
>  libavfilter/dnn/dnn_backend_tf.c | 603 
> +++
>  libavfilter/dnn/dnn_backend_tf.h |  38 +++
>  libavfilter/dnn/dnn_interface.c  |  63 
>  libavfilter/dnn_backend_native.c | 389 --
>  libavfilter/dnn_backend_native.h |  74 -
>  libavfilter/dnn_backend_tf.c | 603 
> ---
>  libavfilter/dnn_backend_tf.h |  38 ---
>  libavfilter/dnn_interface.c  |  63 
>  12 files changed, 1174 insertions(+), 1169 deletions(-)
>  create mode 100644 libavfilter/dnn/Makefile
>  create mode 100644 libavfilter/dnn/dnn_backend_native.c
>  create mode 100644 libavfilter/dnn/dnn_backend_native.h
>  create mode 100644 libavfilter/dnn/dnn_backend_tf.c
>  create mode 100644 libavfilter/dnn/dnn_backend_tf.h
>  create mode 100644 libavfilter/dnn/dnn_interface.c
>  delete mode 100644 libavfilter/dnn_backend_native.c
>  delete mode 100644 libavfilter/dnn_backend_native.h
>  delete mode 100644 libavfilter/dnn_backend_tf.c
>  delete mode 100644 libavfilter/dnn_backend_tf.h
>  delete mode 100644 libavfilter/dnn_interface.c
>
> diff --git a/libavfilter/Makefile b/libavfilter/Makefile
> index 455c809..450d781 100644
> --- a/libavfilter/Makefile
> +++ b/libavfilter/Makefile
> @@ -26,9 +26,8 @@ OBJS-$(HAVE_THREADS) += pthread.o
>
>  # subsystems
>  OBJS-$(CONFIG_QSVVPP)+= qsvvpp.o
> -DNN-OBJS-$(CONFIG_LIBTENSORFLOW) += dnn_backend_tf.o
> -OBJS-$(CONFIG_DNN)   += dnn_interface.o 
> dnn_backend_native.o $(DNN-OBJS-yes)
>  OBJS-$(CONFIG_SCENE_SAD) += scene_sad.o
> +include $(SRC_PATH)/libavfilter/dnn/Makefile
>
>  # audio filters
>  OBJS-$(CONFIG_ABENCH_FILTER) += f_bench.o
> diff --git a/libavfilter/dnn/Makefile b/libavfilter/dnn/Makefile
> new file mode 100644
> index 000..1d12ade
> --- /dev/null
> +++ b/libavfilter/dnn/Makefile
> @@ -0,0 +1,6 @@
> +OBJS-$(CONFIG_DNN)   += dnn/dnn_interface.o
> +OBJS-$(CONFIG_DNN)   += dnn/dnn_backend_native.o
> +
> +DNN-OBJS-$(CONFIG_LIBTENSORFLOW) += dnn/dnn_backend_tf.o
> +
> +OBJS-$(CONFIG_DNN)   += $(DNN-OBJS-yes)
> diff --git a/libavfilter/dnn/dnn_backend_native.c 
> b/libavfilter/dnn/dnn_backend_native.c
> new file mode 100644
> index 000..82e900b
> --- /dev/null
> +++ b/libavfilter/dnn/dnn_backend_native.c
> @@ -0,0 +1,389 @@
> +/*
> + * Copyright (c) 2018 Sergey Lavrushkin
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 
> USA
> + */
> +
> +/**
> + * @file
> + * DNN native backend implementation.
> + */
> +
> +#include "dnn_backend_native.h"
> +#include "libavutil/avassert.h"
> +
> +static DNNReturnType set_input_output_native(void *model, DNNInputData 
> *input, const char *input_name, const char **output_names, uint32_t nb_output)
> +{
> +ConvolutionalNetwork *network = (ConvolutionalNetwork *)model;
> +InputParams *input_params;
> +ConvolutionalParams *conv_params

Re: [FFmpeg-devel] [PATCH 1/4] libavfilter/dnn: move dnn files from libavfilter to libavfilter/dnn

2019-07-26 Thread Pedro Arthur
On Fri, Jul 26, 2019 at 13:02, Pedro Arthur wrote:
>
> Hi,
> It fails the FATE source check for header guards; the guards should be
> changed from AVFILTER_DNN_BACKEND_xxx to AVFILTER_DNN_DNN_BACKEND_xxx to
> match the new directory.
Changed locally and pushed.

> Other than that it LGTM.
>
> On Tue, Jul 16, 2019 at 02:58, Guo, Yejun wrote:
> >
> > it is expected that there will be more files to support native mode,
> > so put all the dnn codes under libavfilter/dnn
> >
> > The main change of this patch is to move the file location, see below:
> > modified:   libavfilter/Makefile
> > new file:   libavfilter/dnn/Makefile
> > renamed:libavfilter/dnn_backend_native.c -> 
> > libavfilter/dnn/dnn_backend_native.c
> > renamed:libavfilter/dnn_backend_native.h -> 
> > libavfilter/dnn/dnn_backend_native.h
> > renamed:libavfilter/dnn_backend_tf.c -> libavfilter/dnn/dnn_backend_tf.c
> > renamed:libavfilter/dnn_backend_tf.h -> libavfilter/dnn/dnn_backend_tf.h
> > renamed:libavfilter/dnn_interface.c -> libavfilter/dnn/dnn_interface.c
> >
> > Signed-off-by: Guo, Yejun 
> > ---
> >  libavfilter/Makefile |   3 +-
> >  libavfilter/dnn/Makefile |   6 +
> >  libavfilter/dnn/dnn_backend_native.c | 389 ++
> >  libavfilter/dnn/dnn_backend_native.h |  74 +
> >  libavfilter/dnn/dnn_backend_tf.c | 603 
> > +++
> >  libavfilter/dnn/dnn_backend_tf.h |  38 +++
> >  libavfilter/dnn/dnn_interface.c  |  63 
> >  libavfilter/dnn_backend_native.c | 389 --
> >  libavfilter/dnn_backend_native.h |  74 -
> >  libavfilter/dnn_backend_tf.c | 603 
> > ---
> >  libavfilter/dnn_backend_tf.h |  38 ---
> >  libavfilter/dnn_interface.c  |  63 
> >  12 files changed, 1174 insertions(+), 1169 deletions(-)
> >  create mode 100644 libavfilter/dnn/Makefile
> >  create mode 100644 libavfilter/dnn/dnn_backend_native.c
> >  create mode 100644 libavfilter/dnn/dnn_backend_native.h
> >  create mode 100644 libavfilter/dnn/dnn_backend_tf.c
> >  create mode 100644 libavfilter/dnn/dnn_backend_tf.h
> >  create mode 100644 libavfilter/dnn/dnn_interface.c
> >  delete mode 100644 libavfilter/dnn_backend_native.c
> >  delete mode 100644 libavfilter/dnn_backend_native.h
> >  delete mode 100644 libavfilter/dnn_backend_tf.c
> >  delete mode 100644 libavfilter/dnn_backend_tf.h
> >  delete mode 100644 libavfilter/dnn_interface.c
> >
> > diff --git a/libavfilter/Makefile b/libavfilter/Makefile
> > index 455c809..450d781 100644
> > --- a/libavfilter/Makefile
> > +++ b/libavfilter/Makefile
> > @@ -26,9 +26,8 @@ OBJS-$(HAVE_THREADS) += pthread.o
> >
> >  # subsystems
> >  OBJS-$(CONFIG_QSVVPP)+= qsvvpp.o
> > -DNN-OBJS-$(CONFIG_LIBTENSORFLOW) += dnn_backend_tf.o
> > -OBJS-$(CONFIG_DNN)   += dnn_interface.o 
> > dnn_backend_native.o $(DNN-OBJS-yes)
> >  OBJS-$(CONFIG_SCENE_SAD) += scene_sad.o
> > +include $(SRC_PATH)/libavfilter/dnn/Makefile
> >
> >  # audio filters
> >  OBJS-$(CONFIG_ABENCH_FILTER) += f_bench.o
> > diff --git a/libavfilter/dnn/Makefile b/libavfilter/dnn/Makefile
> > new file mode 100644
> > index 000..1d12ade
> > --- /dev/null
> > +++ b/libavfilter/dnn/Makefile
> > @@ -0,0 +1,6 @@
> > +OBJS-$(CONFIG_DNN)   += dnn/dnn_interface.o
> > +OBJS-$(CONFIG_DNN)   += dnn/dnn_backend_native.o
> > +
> > +DNN-OBJS-$(CONFIG_LIBTENSORFLOW) += dnn/dnn_backend_tf.o
> > +
> > +OBJS-$(CONFIG_DNN)   += $(DNN-OBJS-yes)
> > diff --git a/libavfilter/dnn/dnn_backend_native.c 
> > b/libavfilter/dnn/dnn_backend_native.c
> > new file mode 100644
> > index 000..82e900b
> > --- /dev/null
> > +++ b/libavfilter/dnn/dnn_backend_native.c
> > @@ -0,0 +1,389 @@
> > +/*
> > + * Copyright (c) 2018 Sergey Lavrushkin
> > + *
> > + * This file is part of FFmpeg.
> > + *
> > + * FFmpeg is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU Lesser General Public
> > + * License as published by the Free Software Foundation; either
> > + * version 2.1 of the License, or (at your option) any later version.
> > + *
> > + * FFmpeg is distributed in the hope that it will be useful,
> > + * but WI

Re: [FFmpeg-devel] [PATCH V2 2/3] fate: add unit test for dnn-layer-pad

2019-07-29 Thread Pedro Arthur
LGTM.
Pushed, thanks!

On Sun, Jul 28, 2019 at 22:59, Guo, Yejun wrote:
>
> 'make fate-dnn-layer-pad' to run the test
>
> Signed-off-by: Guo, Yejun 
> ---
>  tests/Makefile |   5 +-
>  tests/dnn/Makefile |  11 +++
>  tests/dnn/dnn-layer-pad-test.c | 203 
> +
>  tests/fate/dnn.mak |   8 ++
>  4 files changed, 226 insertions(+), 1 deletion(-)
>  create mode 100644 tests/dnn/Makefile
>  create mode 100644 tests/dnn/dnn-layer-pad-test.c
>  create mode 100644 tests/fate/dnn.mak
>
> diff --git a/tests/Makefile b/tests/Makefile
> index 624292d..0ef571b 100644
> --- a/tests/Makefile
> +++ b/tests/Makefile
> @@ -10,7 +10,8 @@ FFMPEG=ffmpeg$(PROGSSUF)$(EXESUF)
>  $(AREF): CMP=
>
>  APITESTSDIR := tests/api
> -FATE_OUTDIRS = tests/data tests/data/fate tests/data/filtergraphs 
> tests/data/lavf tests/data/lavf-fate tests/data/pixfmt tests/vsynth1 
> $(APITESTSDIR)
> +DNNTESTSDIR := tests/dnn
> +FATE_OUTDIRS = tests/data tests/data/fate tests/data/filtergraphs 
> tests/data/lavf tests/data/lavf-fate tests/data/pixfmt tests/vsynth1 
> $(APITESTSDIR) $(DNNTESTSDIR)
>  OUTDIRS += $(FATE_OUTDIRS)
>
>  $(VREF): tests/videogen$(HOSTEXESUF) | tests/vsynth1
> @@ -85,6 +86,7 @@ FILTERDEMDECENCMUX = $(call ALLYES, $(1:%=%_FILTER) 
> $(2)_DEMUXER $(3)_DECODER $(
>  PARSERDEMDEC   = $(call ALLYES, $(1)_PARSER $(2)_DEMUXER $(3)_DECODER)
>
>  include $(SRC_PATH)/$(APITESTSDIR)/Makefile
> +include $(SRC_PATH)/$(DNNTESTSDIR)/Makefile
>
>  include $(SRC_PATH)/tests/fate/acodec.mak
>  include $(SRC_PATH)/tests/fate/vcodec.mak
> @@ -118,6 +120,7 @@ include $(SRC_PATH)/tests/fate/cover-art.mak
>  include $(SRC_PATH)/tests/fate/dca.mak
>  include $(SRC_PATH)/tests/fate/demux.mak
>  include $(SRC_PATH)/tests/fate/dfa.mak
> +include $(SRC_PATH)/tests/fate/dnn.mak
>  include $(SRC_PATH)/tests/fate/dnxhd.mak
>  include $(SRC_PATH)/tests/fate/dpcm.mak
>  include $(SRC_PATH)/tests/fate/ea.mak
> diff --git a/tests/dnn/Makefile b/tests/dnn/Makefile
> new file mode 100644
> index 000..b2e6680
> --- /dev/null
> +++ b/tests/dnn/Makefile
> @@ -0,0 +1,11 @@
> +DNNTESTPROGS += dnn-layer-pad
> +
> +DNNTESTOBJS  := $(DNNTESTOBJS:%=$(DNNTESTSDIR)%) 
> $(DNNTESTPROGS:%=$(DNNTESTSDIR)/%-test.o)
> +DNNTESTPROGS := $(DNNTESTPROGS:%=$(DNNTESTSDIR)/%-test$(EXESUF))
> +-include $(wildcard $(DNNTESTOBJS:.o=.d))
> +
> +$(DNNTESTPROGS): %$(EXESUF): %.o $(FF_DEP_LIBS)
> +   $(LD) $(LDFLAGS) $(LDEXEFLAGS) $(LD_O) $(filter %.o,$^) 
> $(FF_EXTRALIBS) $(ELIBS)
> +
> +testclean::
> +   $(RM) $(addprefix $(DNNTESTSDIR)/,$(CLEANSUFFIXES) *-test$(EXESUF))
> diff --git a/tests/dnn/dnn-layer-pad-test.c b/tests/dnn/dnn-layer-pad-test.c
> new file mode 100644
> index 000..28a49eb
> --- /dev/null
> +++ b/tests/dnn/dnn-layer-pad-test.c
> @@ -0,0 +1,203 @@
> +/*
> + * Copyright (c) 2019 Guo Yejun
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 
> USA
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include "libavfilter/dnn/dnn_backend_native_layer_pad.h"
> +
> +#define EPSON 0.1
> +
> +static int test_with_mode_symmetric(void)
> +{
> +// the input data and expected data are generated with below python code.
> +/*
> +x = tf.placeholder(tf.float32, shape=[1, None, None, 3])
> +y = tf.pad(x, [[0, 0], [2, 3], [3, 2], [0, 0]], 'SYMMETRIC')
> +data = np.arange(48).reshape(1, 4, 4, 3);
> +
> +sess=tf.Session()
> +sess.run(tf.global_variables_initializer())
> +output = sess.run(y, feed_dict={x: data})
> +
> +print(list(data.flatten()))
> +print(list(output.flatten()))
> +print(data.shape)
> +print(output.shape)
> +*/
> +
> +LayerPadParams params;
> +float input[1*4*4*3] = {
> +0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 
> 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 
> 38, 39, 40, 41, 42, 43, 44, 45, 46, 47
> +};
> +float expected_output[1*9*9*3] = {
> +18.0, 19.0, 20.0, 15.0, 16.0, 17.0, 12.0, 13.0, 14.0, 12.0, 13.0, 
> 14.0, 15.0, 16.0, 17.0, 18.0, 19.0, 20.0, 21.0, 22.0, 23.0, 21.0, 22.0, 23.0, 
> 18.0, 19.0, 20.0, 6.0, 7.0, 8.0, 3.0,
> +4.0, 5.0, 0.0, 1.0, 2.0, 0.0, 1.0, 2.0, 3.0, 4.0, 

Re: [FFmpeg-devel] [PATCH V2 3/3] dnn: convert tf.pad to native model in python script, and load/execute it in the c code.

2019-07-29 Thread Pedro Arthur
LGTM.
Pushed, thanks!
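
A quick sanity check of the new shape math (my own arithmetic, matching
the unit test from the previous patch): padding a 1x4x4x3 input with
[[0, 0], [2, 3], [3, 2], [0, 0]] gives cur_height = 4 + 2 + 3 = 9 and
cur_width = 4 + 3 + 2 = 9 with channels unchanged, i.e. a 1x9x9x3 output,
exactly the shape the tf.pad reference in the unit test produces.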

On Sun, Jul 28, 2019 at 23:00, Guo, Yejun wrote:
>
> since tf.pad is enabled, the conv2d(valid) changes back to its original 
> behavior.
>
> Signed-off-by: Guo, Yejun 
> ---
>  libavfilter/dnn/dnn_backend_native.c| 35 
> +
>  libavfilter/dnn/dnn_backend_native.h|  2 +-
>  tools/python/convert_from_tensorflow.py | 23 +-
>  3 files changed, 54 insertions(+), 6 deletions(-)
>
> diff --git a/libavfilter/dnn/dnn_backend_native.c 
> b/libavfilter/dnn/dnn_backend_native.c
> index 82e900b..09c583b 100644
> --- a/libavfilter/dnn/dnn_backend_native.c
> +++ b/libavfilter/dnn/dnn_backend_native.c
> @@ -25,6 +25,7 @@
>
>  #include "dnn_backend_native.h"
>  #include "libavutil/avassert.h"
> +#include "dnn_backend_native_layer_pad.h"
>
>  static DNNReturnType set_input_output_native(void *model, DNNInputData 
> *input, const char *input_name, const char **output_names, uint32_t nb_output)
>  {
> @@ -32,6 +33,7 @@ static DNNReturnType set_input_output_native(void *model, 
> DNNInputData *input, c
>  InputParams *input_params;
>  ConvolutionalParams *conv_params;
>  DepthToSpaceParams *depth_to_space_params;
> +LayerPadParams *pad_params;
>  int cur_width, cur_height, cur_channels;
>  int32_t layer;
>
> @@ -77,6 +79,12 @@ static DNNReturnType set_input_output_native(void *model, 
> DNNInputData *input, c
>  cur_height *= depth_to_space_params->block_size;
>  cur_width *= depth_to_space_params->block_size;
>  break;
> +case MIRROR_PAD:
> +pad_params = (LayerPadParams *)network->layers[layer].params;
> +cur_height = cur_height + pad_params->paddings[1][0] + 
> pad_params->paddings[1][1];
> +cur_width = cur_width + pad_params->paddings[2][0] + 
> pad_params->paddings[2][1];
> +cur_channels = cur_channels + pad_params->paddings[3][0] + 
> pad_params->paddings[3][1];
> +break;
>  default:
>  return DNN_ERROR;
>  }
> @@ -110,6 +118,7 @@ DNNModel *ff_dnn_load_model_native(const char 
> *model_filename)
>  DNNLayerType layer_type;
>  ConvolutionalParams *conv_params;
>  DepthToSpaceParams *depth_to_space_params;
> +LayerPadParams *pad_params;
>
>  model = av_malloc(sizeof(DNNModel));
>  if (!model){
> @@ -207,6 +216,23 @@ DNNModel *ff_dnn_load_model_native(const char 
> *model_filename)
>  network->layers[layer].type = DEPTH_TO_SPACE;
>  network->layers[layer].params = depth_to_space_params;
>  break;
> +case MIRROR_PAD:
> +pad_params = av_malloc(sizeof(LayerPadParams));
> +if (!pad_params){
> +avio_closep(&model_file_context);
> +ff_dnn_free_model_native(&model);
> +return NULL;
> +}
> +pad_params->mode = (int32_t)avio_rl32(model_file_context);
> +dnn_size += 4;
> +for (i = 0; i < 4; ++i) {
> +pad_params->paddings[i][0] = avio_rl32(model_file_context);
> +pad_params->paddings[i][1] = avio_rl32(model_file_context);
> +dnn_size += 8;
> +}
> +network->layers[layer].type = MIRROR_PAD;
> +network->layers[layer].params = pad_params;
> +break;
>  default:
>  avio_closep(&model_file_context);
>  ff_dnn_free_model_native(&model);
> @@ -314,6 +340,7 @@ DNNReturnType ff_dnn_execute_model_native(const DNNModel 
> *model, DNNData *output
>  InputParams *input_params;
>  ConvolutionalParams *conv_params;
>  DepthToSpaceParams *depth_to_space_params;
> +LayerPadParams *pad_params;
>
>  if (network->layers_num <= 0 || network->layers[0].type != INPUT || 
> !network->layers[0].output){
>  return DNN_ERROR;
> @@ -348,6 +375,14 @@ DNNReturnType ff_dnn_execute_model_native(const DNNModel 
> *model, DNNData *output
>  cur_width *= depth_to_space_params->block_size;
>  cur_channels /= depth_to_space_params->block_size * 
> depth_to_space_params->block_size;
>  break;
> +case MIRROR_PAD:
> +pad_params = (LayerPadParams *)network->layers[layer].params;
> +dnn_execute_layer_pad(network->layers[layer - 1].output, 
> network->layers[layer].output,
> +  pad_params, 1, cur_height, cur_width, 
> cur_channels);
> +cur_height = cur_height + pad_params->paddings[1][0] + 
> pad_params->paddings[1][1];
> +cur_width = cur_width + pad_params->paddings[2][0] + 
> pad_params->paddings[2][1];
> +cur_channels = cur_channels + pad_params->paddings[3][0] + 
> pad_params->paddings[3][1];
> +break;
>  case INPUT:
>  return DNN_ERROR;
>  }
> diff --git a/libavfilter/dnn/dnn_backend_native.h 
> b/libavfilter/dnn/dnn_backe

Re: [FFmpeg-devel] [PATCH V2 1/3] dnn: add layer pad which is equivalent to tf.pad

2019-07-29 Thread Pedro Arthur
LGTM.
Pushed, thanks!

On Sun, Jul 28, 2019 at 22:59, Guo, Yejun wrote:
>
> the reason to add this layer first is that vf_sr uses it in its
> tensorflow model, and the next plan is to update the python script
> to convert tf.pad into native model.
>
> Signed-off-by: Guo, Yejun 
> ---
>  libavfilter/dnn/Makefile   |   1 +
>  libavfilter/dnn/dnn_backend_native_layer_pad.c | 211 
> +
>  libavfilter/dnn/dnn_backend_native_layer_pad.h |  40 +
>  3 files changed, 252 insertions(+)
>  create mode 100644 libavfilter/dnn/dnn_backend_native_layer_pad.c
>  create mode 100644 libavfilter/dnn/dnn_backend_native_layer_pad.h
>
> diff --git a/libavfilter/dnn/Makefile b/libavfilter/dnn/Makefile
> index 1d12ade..83938e5 100644
> --- a/libavfilter/dnn/Makefile
> +++ b/libavfilter/dnn/Makefile
> @@ -1,5 +1,6 @@
>  OBJS-$(CONFIG_DNN)   += dnn/dnn_interface.o
>  OBJS-$(CONFIG_DNN)   += dnn/dnn_backend_native.o
> +OBJS-$(CONFIG_DNN)   += 
> dnn/dnn_backend_native_layer_pad.o
>
>  DNN-OBJS-$(CONFIG_LIBTENSORFLOW) += dnn/dnn_backend_tf.o
>
> diff --git a/libavfilter/dnn/dnn_backend_native_layer_pad.c 
> b/libavfilter/dnn/dnn_backend_native_layer_pad.c
> new file mode 100644
> index 000..5417d73
> --- /dev/null
> +++ b/libavfilter/dnn/dnn_backend_native_layer_pad.c
> @@ -0,0 +1,211 @@
> +/*
> + * Copyright (c) 2019 Guo Yejun
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 
> USA
> + */
> +
> +#include 
> +#include "libavutil/avassert.h"
> +#include "dnn_backend_native_layer_pad.h"
> +
> +static int before_get_buddy(int given, int paddings, LayerPadModeParam mode)
> +{
> +if (mode == LPMP_SYMMETRIC) {
> +return (2 * paddings - 1 - given);
> +} else if (mode == LPMP_REFLECT) {
> +return (2 * paddings - given);
> +} else {
> +av_assert0(!"should not reach here");
> +return 0;
> +}
> +}
> +
> +static int after_get_buddy(int given, int border, LayerPadModeParam mode)
> +{
> +if (mode == LPMP_SYMMETRIC) {
> +int offset = given - border;
> +return (border - 1 - offset);
> +} else if (mode == LPMP_REFLECT) {
> +int offset = given - border;
> +return (border - 2 - offset);
> +} else {
> +av_assert0(!"should not reach here");
> +return 0;
> +}
> +}
> +
> +void dnn_execute_layer_pad(const float *input, float *output, const 
> LayerPadParams *params, int number, int height, int width, int channel)
> +{
> +int32_t before_paddings;
> +int32_t after_paddings;
> +
> +// suppose format is 
> +int new_number = number + params->paddings[0][0] + 
> params->paddings[0][1];
> +int new_height = height + params->paddings[1][0] + 
> params->paddings[1][1];
> +int new_width = width + params->paddings[2][0] + params->paddings[2][1];
> +int new_channel = channel + params->paddings[3][0] + 
> params->paddings[3][1];
> +
> +int c_stride = channel;
> +int wc_stride = c_stride * width;
> +int hwc_stride = wc_stride * height;
> +
> +int new_c_stride = new_channel;
> +int new_wc_stride = new_c_stride * new_width;
> +int new_hwc_stride = new_wc_stride * new_height;
> +
> +// copy the original data
> +for (int n = 0; n < number; n++) {
> +for (int h = 0; h < height; h++) {
> +for (int w = 0; w < width; w++) {
> +const float *src = input + n * hwc_stride + h * wc_stride + 
> w * c_stride;
> +float *dst = output + (n + params->paddings[0][0]) * 
> new_hwc_stride
> ++ (h + params->paddings[1][0]) * 
> new_wc_stride
> ++ (w + params->paddings[2][0]) * 
> new_c_stride
> ++ params->paddings[3][0];
> +memcpy(dst, src, channel * sizeof(float));
> +}
> +}
> +}
> +
> +// handle the first dimension
> +before_paddings = params->paddings[0][0];
> +after_paddings = params->paddings[0][1];
> +for (int n = 0; n < before_paddings; n++) {
> +float *dst = output + n * new_hwc_stride;
> +if (params->mode == 

Re: [FFmpeg-devel] [PATCH] dnn: rename function from dnn_execute_layer_pad to avfilter_dnn_execute_layer_pad

2019-08-01 Thread Pedro Arthur
Hi,

On Thu, Aug 1, 2019 at 06:36, Paul B Mahol wrote:
>
> Why test uses internal function, why was this allowed to be committed at
> all?
> Who is reviewing this mess?
>
> Why test does not use normal filtergraph?
>
I was responsible for pushing the patch; thanks for pointing out the issues.
Could you provide more details on the proper way to write the test?

Thanks.


Re: [FFmpeg-devel] [PATCH 2/2] convert_from_tensorflow.py: support conv2d with dilation

2019-08-13 Thread Pedro Arthur
LGTM.
Should push soon.
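
(For context on the commit message below: with dilation d, a k x k kernel
covers an effective window of k + (k - 1) * (d - 1) samples along each
dimension, so a 3x3 kernel at dilation 2 reads from a 5x5 neighborhood.
As far as I can tell, TF 1.x graphs express this as a SpaceToBatchND /
Conv2D / BatchToSpaceND cluster rather than a single node, which is why
the graph suddenly contains tens of nodes.)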

BTW I just noticed that the TensorFlow backend is failing to load SR
filter models.

$ python tools/python/convert.py sr_models/srcnn.pb
$ ./ffmpeg -i input.jpg -vf
sr=model=srcnn.model:dnn_backend=tensorflow out_srcnn_tf.png

The above command fails.
It seems commit ccbab41039af424237eaac5c302c293ab97540f8 is the
problem. I thought I had tested it but clearly I made a mistake
somewhere in the process.
I suppose you have the .pb files to test it, but let me know if you need them.

On Fri, Aug 9, 2019 at 12:25, Guo, Yejun wrote:
>
>
>
> > -Original Message-
> > From: Guo, Yejun
> > Sent: Tuesday, July 30, 2019 9:26 AM
> > To: ffmpeg-devel@ffmpeg.org
> > Cc: Guo, Yejun 
> > Subject: [PATCH 2/2] convert_from_tensorflow.py: support conv2d with 
> > dilation
> >
> > conv2d with dilation > 1 generates tens of nodes in graph, it is not
> > easy to parse each node one by one, so we do special tricks to parse
> > the conv2d layer.
> >
> > Signed-off-by: Guo, Yejun 
> > ---
> >  tools/python/convert_from_tensorflow.py | 80
> > -
> >  1 file changed, 59 insertions(+), 21 deletions(-)
>
> this patch set asks for review, thanks.
>
> I've locally finished more patches to improve dnn module, plan to send more 
> them set by set, since the patches have dependency.
>
> Just in case you are interested in these new patches, I've uploaded to 
> https://github.com/guoyejun/ffmpeg/tree/dnn0809.
> for your convenient, I also copy the oneline log here for each patch (from 
> newer to older) with 4 patch sets.
>
> 7eced90 libavfilter/dnn: support multiple outputs for native mode
> 28a7054 libavfilter/dnn/dnn_backend_native: find the input operand according 
> to input name
>
> 256e657 FATE/dnn: add unit test for layer maximum
> 8c616a0 libavfilter/dnn: add layer maximum for native mode.
>
> 8ec6c0c FATE/dnn: add unit test for dnn depth_to_space layer
> 09ef108 libavfilter/dnn: separate depth_to_space layer from 
> dnn_backend_native.c to a new file
> c65b59d FATE/dnn: add unit test for dnn conv2d layer
> a5d69a7 libavfilter/dnn: separate conv2d layer from dnn_backend_native.c to a 
> new file
>
> 202d323 dnn: export operand info in python script and load in c code
> 3c706a0 dnn: change .model file format to put layer number at the end of file
> 0256731 dnn: introduce dnn operand (in c code) to hold operand infos within 
> network
>
>
> Besides continuous dnn improvement, I also plan to add two generic video
> filters for dnn.
> - a generic filter to process the content of an AVFrame with different dnn
> networks.
> With it, the current specific filters such as vf_sr (some changes needed) and
> vf_derain would no longer be needed, since they can be covered by this
> generic filter. And of course, in practice I'll not remove them.
>
> - a generic filter to analyze the content of an AVFrame to generate some side
> data with different dnn networks. The content of the AVFrame does not change.
> The application, which invokes the filter with a given dnn network, has the
> responsibility/knowledge to parse the side data (analysis result).
>
> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH 2/2] convert_from_tensorflow.py: support conv2d with dilation

2019-08-15 Thread Pedro Arthur
Pushed.

Em qua, 14 de ago de 2019 às 03:37, Guo, Yejun  escreveu:
>
>
>
> > -Original Message-
> > From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> > Pedro Arthur
> > Sent: Wednesday, August 14, 2019 12:09 AM
> > To: FFmpeg development discussions and patches 
> > Subject: Re: [FFmpeg-devel] [PATCH 2/2] convert_from_tensorflow.py: support
> > conv2d with dilation
> >
> > LGTM.
> > Should push soon.
>
> thanks.
>
> >
> > BTW, I just noticed that the tensorflow backend is failing to load SR
> > filter models.
> >
> > $ python tools/python/convert.py sr_models/srcnn.pb
> > $ ./ffmpeg -i input.jpg -vf
> > sr=model=srcnn.model:dnn_backend=tensorflow out_srcnn_tf.png
> >
> > The above command fails.
> > It seems commit ccbab41039af424237eaac5c302c293ab97540f8 is the
> > problem. I thought I had tested it but clearly I made a mistake
> > somewhere in the process.
> > I suppose you have the .pb files to test it, but let me know if you need 
> > them.
>
> Yes, I have the .pb files. I missed sending the patch for that support;
> will refine and send it out soon.
>
>
> >
> > Em sex, 9 de ago de 2019 às 12:25, Guo, Yejun 
> > escreveu:
> > >
> > >
> > >
> > > > -Original Message-
> > > > From: Guo, Yejun
> > > > Sent: Tuesday, July 30, 2019 9:26 AM
> > > > To: ffmpeg-devel@ffmpeg.org
> > > > Cc: Guo, Yejun 
> > > > Subject: [PATCH 2/2] convert_from_tensorflow.py: support conv2d with
> > dilation
> > > >
> > > > conv2d with dilation > 1 generates tens of nodes in the graph; it is not
> > > > easy to parse each node one by one, so we use special tricks to parse
> > > > the conv2d layer.
> > > >
> > > > Signed-off-by: Guo, Yejun 
> > > > ---
> > > >  tools/python/convert_from_tensorflow.py | 80
> > > > -
> > > >  1 file changed, 59 insertions(+), 21 deletions(-)
> > >
> > > this patch set asks for review, thanks.
> > >
> > > I've locally finished more patches to improve the dnn module, and plan
> > > to send them out set by set, since the patches have dependencies.
> > >
> > > Just in case you are interested in these new patches, I've uploaded to
> > > https://github.com/guoyejun/ffmpeg/tree/dnn0809.
> > > for your convenience, I also copied the one-line log here for each patch
> > > (from newer to older), in 4 patch sets.
> > >
> > > 7eced90 libavfilter/dnn: support multiple outputs for native mode
> > > 28a7054 libavfilter/dnn/dnn_backend_native: find the input operand
> > according to input name
> > >
> > > 256e657 FATE/dnn: add unit test for layer maximum
> > > 8c616a0 libavfilter/dnn: add layer maximum for native mode.
> > >
> > > 8ec6c0c FATE/dnn: add unit test for dnn depth_to_space layer
> > > 09ef108 libavfilter/dnn: separate depth_to_space layer from
> > dnn_backend_native.c to a new file
> > > c65b59d FATE/dnn: add unit test for dnn conv2d layer
> > > a5d69a7 libavfilter/dnn: separate conv2d layer from dnn_backend_native.c 
> > > to
> > a new file
> > >
> > > 202d323 dnn: export operand info in python script and load in c code
> > > 3c706a0 dnn: change .model file format to put layer number at the end of 
> > > file
> > > 0256731 dnn: introduce dnn operand (in c code) to hold operand infos 
> > > within
> > network
> > >
> > >
> > > Besides continuous dnn improvement, I also plan to add two generic video
> > > filters for dnn.
> > > - a generic filter to process the content of an AVFrame with different dnn
> > > networks.
> > > With it, the current specific filters such as vf_sr (some changes needed)
> > > and vf_derain would no longer be needed, since they can be covered by this
> > > generic filter. And of course, in practice I'll not remove them.
> > >
> > > - a generic filter to analyze the content of an AVFrame to generate some
> > > side data with different dnn networks. The content of the AVFrame does not
> > > change.
> > > The application, which invokes the filter with a given dnn network, has
> > > the responsibility/knowledge to parse the side data (analysis result).
> > >
> > > ___
> > > ffmpeg-devel mailing list
> > > ffmpeg-devel@ffmpeg.org

Re: [FFmpeg-devel] [PATCH V2 2/2] libavfilter/dnn/dnn_backend_tf: add tf.pad support for tensorflow backend with native model.

2019-08-19 Thread Pedro Arthur
Em qui, 15 de ago de 2019 às 23:41, Guo, Yejun  escreveu:
>
> Signed-off-by: Guo, Yejun 
> ---
>  libavfilter/dnn/dnn_backend_tf.c | 48 
> 
>  1 file changed, 19 insertions(+), 29 deletions(-)
>
> diff --git a/libavfilter/dnn/dnn_backend_tf.c 
> b/libavfilter/dnn/dnn_backend_tf.c
> index ca7434a..626fba9 100644
> --- a/libavfilter/dnn/dnn_backend_tf.c
> +++ b/libavfilter/dnn/dnn_backend_tf.c
> @@ -27,6 +27,7 @@
>  #include "dnn_backend_native.h"
>  #include "libavformat/avio.h"
>  #include "libavutil/avassert.h"
> +#include "dnn_backend_native_layer_pad.h"
>
>  #include 
>
> @@ -347,23 +348,8 @@ static DNNReturnType add_depth_to_space_layer(TFModel 
> *tf_model, TF_Operation **
>  return DNN_SUCCESS;
>  }
>
> -static int calculate_pad(const ConvolutionalNetwork *conv_network)
> -{
> -ConvolutionalParams *params;
> -int32_t layer;
> -int pad = 0;
> -
> -for (layer = 0; layer < conv_network->layers_num; ++layer){
> -if (conv_network->layers[layer].type == CONV){
> -params = (ConvolutionalParams 
> *)conv_network->layers[layer].params;
> -pad += params->kernel_size >> 1;
> -}
> -}
> -
> -return pad;
> -}
> -
> -static DNNReturnType add_pad_op(TFModel *tf_model, TF_Operation **cur_op, 
> const int32_t pad)
> +static DNNReturnType add_pad_layer(TFModel *tf_model, TF_Operation **cur_op,
> +  LayerPadParams *params, const 
> int layer)
>  {
>  TF_Operation *op;
>  TF_Tensor *tensor;
> @@ -372,16 +358,21 @@ static DNNReturnType add_pad_op(TFModel *tf_model, 
> TF_Operation **cur_op, const
>  int32_t *pads;
>  int64_t pads_shape[] = {4, 2};
>
> -input.index = 0;
> +char name_buffer[NAME_BUFFER_SIZE];
> +snprintf(name_buffer, NAME_BUFFER_SIZE, "pad%d", layer);
>
> -op_desc = TF_NewOperation(tf_model->graph, "Const", "pads");
> +op_desc = TF_NewOperation(tf_model->graph, "Const", name_buffer);
>  TF_SetAttrType(op_desc, "dtype", TF_INT32);
>  tensor = TF_AllocateTensor(TF_INT32, pads_shape, 2, 4 * 2 * 
> sizeof(int32_t));
>  pads = (int32_t *)TF_TensorData(tensor);
> -pads[0] = 0;   pads[1] = 0;
> -pads[2] = pad; pads[3] = pad;
> -pads[4] = pad; pads[5] = pad;
> -pads[6] = 0;   pads[7] = 0;
> +pads[0] = params->paddings[0][0];
> +pads[1] = params->paddings[0][1];
> +pads[2] = params->paddings[1][0];
> +pads[3] = params->paddings[1][1];
> +pads[4] = params->paddings[2][0];
> +pads[5] = params->paddings[2][1];
> +pads[6] = params->paddings[3][0];
> +pads[7] = params->paddings[3][1];
>  TF_SetAttrTensor(op_desc, "value", tensor, tf_model->status);
>  if (TF_GetCode(tf_model->status) != TF_OK){
>  return DNN_ERROR;
> @@ -393,6 +384,7 @@ static DNNReturnType add_pad_op(TFModel *tf_model, 
> TF_Operation **cur_op, const
>
>  op_desc = TF_NewOperation(tf_model->graph, "MirrorPad", "mirror_pad");
>  input.oper = *cur_op;
> +input.index = 0;
>  TF_AddInput(op_desc, input);
>  input.oper = op;
>  TF_AddInput(op_desc, input);
> @@ -418,7 +410,6 @@ static DNNReturnType load_native_model(TFModel *tf_model, 
> const char *model_file
>  int32_t *transpose_perm;
>  int64_t transpose_perm_shape[] = {4};
>  int64_t input_shape[] = {1, -1, -1, -1};
> -int32_t pad;
>  DNNReturnType layer_add_res;
>  DNNModel *native_model = NULL;
>  ConvolutionalNetwork *conv_network;
> @@ -429,7 +420,6 @@ static DNNReturnType load_native_model(TFModel *tf_model, 
> const char *model_file
>  }
>
>  conv_network = (ConvolutionalNetwork *)native_model->model;
> -pad = calculate_pad(conv_network);
>  tf_model->graph = TF_NewGraph();
>  tf_model->status = TF_NewStatus();
>
> @@ -448,10 +438,6 @@ static DNNReturnType load_native_model(TFModel 
> *tf_model, const char *model_file
>  CLEANUP_ON_ERROR(tf_model);
>  }
>
> -if (add_pad_op(tf_model, &op, pad) != DNN_SUCCESS){
> -CLEANUP_ON_ERROR(tf_model);
> -}
> -
>  op_desc = TF_NewOperation(tf_model->graph, "Const", "transpose_perm");
>  TF_SetAttrType(op_desc, "dtype", TF_INT32);
>  tensor = TF_AllocateTensor(TF_INT32, transpose_perm_shape, 1, 4 * 
> sizeof(int32_t));
> @@ -479,6 +465,10 @@ static DNNReturnType load_native_model(TFModel 
> *tf_model, const char *model_file
>  layer_add_res = add_depth_to_space_layer(tf_model, &op,
>   (DepthToSpaceParams 
> *)conv_network->layers[layer].params, layer);
>  break;
> +case MIRROR_PAD:
> +layer_add_res = add_pad_layer(tf_model, &op,
> +  (LayerPadParams 
> *)conv_network->layers[layer].params, layer);
> +break;
>  default:
>  CLEANUP_ON_ERROR(tf_model);
>  }
> --
> 2.7.4
>
LGTM.
Pushed, thanks.


Re: [FFmpeg-devel] [PATCH V2 1/2] libavfilter/dnn/dnn_backend_tf: fix typo that variable uninitialized.

2019-08-19 Thread Pedro Arthur
Em qui, 15 de ago de 2019 às 23:40, Guo, Yejun  escreveu:
>
> if it is initialized randomly, the tensorflow lib will report an
> error message such as:
> Attempt to add output -7920 of depth_to_space4 not in range [0, 1) to node 
> with type Identity
>
> Signed-off-by: Guo, Yejun 
> ---
>  libavfilter/dnn/dnn_backend_tf.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/libavfilter/dnn/dnn_backend_tf.c 
> b/libavfilter/dnn/dnn_backend_tf.c
> index ba959ae..ca7434a 100644
> --- a/libavfilter/dnn/dnn_backend_tf.c
> +++ b/libavfilter/dnn/dnn_backend_tf.c
> @@ -490,6 +490,7 @@ static DNNReturnType load_native_model(TFModel *tf_model, 
> const char *model_file
>
>  op_desc = TF_NewOperation(tf_model->graph, "Identity", "y");
>  input.oper = op;
> +input.index = 0;
>  TF_AddInput(op_desc, input);
>  TF_FinishOperation(op_desc, tf_model->status);
>  if (TF_GetCode(tf_model->status) != TF_OK){
> --
> 2.7.4
>
LGTM.
Pushed, thanks.

> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH V2] FATE/dnn: let fate/dnn tests depend on ffmpeg static libraries

2019-08-19 Thread Pedro Arthur
Em sex, 16 de ago de 2019 às 11:20, Li, Zhong  escreveu:
>
> > From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf
> > Of Guo, Yejun
> > Sent: Wednesday, August 7, 2019 10:44 AM
> > To: ffmpeg-devel@ffmpeg.org
> > Cc: Guo, Yejun 
> > Subject: [FFmpeg-devel] [PATCH V2] FATE/dnn: let fate/dnn tests depend on
> > ffmpeg static libraries
> >
> > background:
> > DNN (deep neural network) is a submodule of libavfilter, and FATE/dnn is
> > the unit-test suite for the DNN module, one unit test per dnn layer.
> > The unit tests are not based on the APIs exported by libavfilter; they just
> > call directly into the functions within the DNN submodule.
> >
> > There is an issue when run the following command:
> > build$ ../ffmpeg/configure --disable-static --enable-shared make make
> > fate-dnn-layer-pad
> >
> > And part of error message:
> > tests/dnn/dnn-layer-pad-test.o: In function `test_with_mode_symmetric':
> > /work/media/ffmpeg/build/src/tests/dnn/dnn-layer-pad-test.c:73:
> > undefined reference to `dnn_execute_layer_pad'
> >
> > The root cause is that the function dnn_execute_layer_pad is a LOCAL symbol
> > in libavfilter.so, so the linker could not find it when building
> > dnn-layer-pad-test.
> > To check it, just run: readelf -s libavfilter/libavfilter.so | grep dnn
> >
> > So, make the fate/dnn Makefile depend on the ffmpeg static libraries.
> > This is the same method used in fate/checkasm.
> >
> > Signed-off-by: Guo, Yejun 
> > ---
> >  tests/dnn/Makefile | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/tests/dnn/Makefile b/tests/dnn/Makefile index b2e6680..0e050ea
> > 100644
> > --- a/tests/dnn/Makefile
> > +++ b/tests/dnn/Makefile
> > @@ -4,8 +4,8 @@ DNNTESTOBJS  :=
> > $(DNNTESTOBJS:%=$(DNNTESTSDIR)%)
> > $(DNNTESTPROGS:%=$(DNNTESTSDIR)  DNNTESTPROGS :=
> > $(DNNTESTPROGS:%=$(DNNTESTSDIR)/%-test$(EXESUF))
> >  -include $(wildcard $(DNNTESTOBJS:.o=.d))
> >
> > -$(DNNTESTPROGS): %$(EXESUF): %.o $(FF_DEP_LIBS)
> > - $(LD) $(LDFLAGS) $(LDEXEFLAGS) $(LD_O) $(filter %.o,$^)
> > $(FF_EXTRALIBS) $(ELIBS)
> > +$(DNNTESTPROGS): %$(EXESUF): %.o $(FF_STATIC_DEP_LIBS)
> > + $(LD) $(LDFLAGS) $(LDEXEFLAGS) $(LD_O) $(filter %.o,$^)
> > +$(FF_STATIC_DEP_LIBS) $(ELIBS)
> >
> >  testclean::
> >   $(RM) $(addprefix $(DNNTESTSDIR)/,$(CLEANSUFFIXES)
> > *-test$(EXESUF))
> > --
> > 2.7.4
>
> LGTM && Verified
>
> IMHO this is a high-priority patch, since FATE is currently broken when
> building with dynamic linking

Pushed, thanks.

> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH 1/3] dnn: introduce dnn operand (in c code) to hold operand infos within network

2019-08-27 Thread Pedro Arthur
Hi,


Em ter, 20 de ago de 2019 às 05:54, Guo, Yejun  escreveu:
>
> the info can be saved in the dnn operand object without being regenerated
> again and again, and it is also needed for layer split/merge and for
> memory reuse.
>
> To take things step by step, this patch just focuses on the C code;
> the change to the python script will be added later.
>
> Signed-off-by: Guo, Yejun 
> ---
>  libavfilter/dnn/dnn_backend_native.c   | 226 
> -
>  libavfilter/dnn/dnn_backend_native.h   |  54 +-
>  libavfilter/dnn/dnn_backend_native_layer_pad.c |  24 ++-
>  libavfilter/dnn/dnn_backend_native_layer_pad.h |   4 +-
>  tests/dnn/Makefile |   2 +-
>  tests/dnn/dnn-layer-pad-test.c |  60 +--
>  6 files changed, 236 insertions(+), 134 deletions(-)
>
> diff --git a/libavfilter/dnn/dnn_backend_native.c 
> b/libavfilter/dnn/dnn_backend_native.c
> index d52abc6..78227a5 100644
> --- a/libavfilter/dnn/dnn_backend_native.c
> +++ b/libavfilter/dnn/dnn_backend_native.c
> @@ -30,77 +30,30 @@
>  static DNNReturnType set_input_output_native(void *model, DNNInputData 
> *input, const char *input_name, const char **output_names, uint32_t nb_output)
>  {
>  ConvolutionalNetwork *network = (ConvolutionalNetwork *)model;
> -InputParams *input_params;
> -ConvolutionalParams *conv_params;
> -DepthToSpaceParams *depth_to_space_params;
> -LayerPadParams *pad_params;
> -int cur_width, cur_height, cur_channels;
> -int32_t layer;
>
> -if (network->layers_num <= 0 || network->layers[0].type != INPUT){
> +if (network->layers_num <= 0 || network->operands_num <= 0)
>  return DNN_ERROR;
> -}
> -else{
> -input_params = (InputParams *)network->layers[0].params;
> -input_params->width = cur_width = input->width;
> -input_params->height = cur_height = input->height;
> -input_params->channels = cur_channels = input->channels;
> -if (input->data){
> -av_freep(&input->data);
> -}
> -av_assert0(input->dt == DNN_FLOAT);
> -network->layers[0].output = input->data = av_malloc(cur_height * 
> cur_width * cur_channels * sizeof(float));
> -if (!network->layers[0].output){
> -return DNN_ERROR;
> -}
> -}
> -
> -for (layer = 1; layer < network->layers_num; ++layer){
> -switch (network->layers[layer].type){
> -case CONV:
> -conv_params = (ConvolutionalParams 
> *)network->layers[layer].params;
> -if (conv_params->input_num != cur_channels){
> -return DNN_ERROR;
> -}
> -cur_channels = conv_params->output_num;
> -
> -if (conv_params->padding_method == VALID) {
> -int pad_size = (conv_params->kernel_size - 1) * 
> conv_params->dilation;
> -cur_height -= pad_size;
> -cur_width -= pad_size;
> -}
> -break;
> -case DEPTH_TO_SPACE:
> -depth_to_space_params = (DepthToSpaceParams 
> *)network->layers[layer].params;
> -if (cur_channels % (depth_to_space_params->block_size * 
> depth_to_space_params->block_size) != 0){
> -return DNN_ERROR;
> -}
> -cur_channels = cur_channels / (depth_to_space_params->block_size 
> * depth_to_space_params->block_size);
> -cur_height *= depth_to_space_params->block_size;
> -cur_width *= depth_to_space_params->block_size;
> -break;
> -case MIRROR_PAD:
> -pad_params = (LayerPadParams *)network->layers[layer].params;
> -cur_height = cur_height + pad_params->paddings[1][0] + 
> pad_params->paddings[1][1];
> -cur_width = cur_width + pad_params->paddings[2][0] + 
> pad_params->paddings[2][1];
> -cur_channels = cur_channels + pad_params->paddings[3][0] + 
> pad_params->paddings[3][1];
> -break;
> -default:
> -return DNN_ERROR;
> -}
> -if (network->layers[layer].output){
> -av_freep(&network->layers[layer].output);
> -}
> -
> -if (cur_height <= 0 || cur_width <= 0)
> -return DNN_ERROR;
>
> -network->layers[layer].output = av_malloc(cur_height * cur_width * 
> cur_channels * sizeof(float));
> -if (!network->layers[layer].output){
> -return DNN_ERROR;
> -}
> -}
> +av_assert0(input->dt == DNN_FLOAT);
> +
> +/**
> + * as the first step, suppose network->operands[0] is the input operand.
> + */
> +network->operands[0].dims[0] = 1;
> +network->operands[0].dims[1] = input->height;
> +network->operands[0].dims[2] = input->width;
> +network->operands[0].dims[3] = input->channels;
> +network->operands[0].type = DOT_INPUT;
> +network->operands[0].data_type = DNN_FLOAT;
> +network->operands[0].isNHWC = 1;
> +
> +av_freep(&netw

Re: [FFmpeg-devel] [PATCH 3/3] dnn: export operand info in python script and load in c code

2019-08-27 Thread Pedro Arthur
Hi,

Em ter, 20 de ago de 2019 às 05:54, Guo, Yejun  escreveu:
>
> Signed-off-by: Guo, Yejun 
> ---
>  libavfilter/dnn/dnn_backend_native.c|  49 +++---
>  libavfilter/dnn/dnn_backend_native.h|   2 +-
>  libavfilter/dnn_interface.h |   2 +-
>  tools/python/convert_from_tensorflow.py | 111 
> +---
>  4 files changed, 142 insertions(+), 22 deletions(-)
>
> diff --git a/libavfilter/dnn/dnn_backend_native.c 
> b/libavfilter/dnn/dnn_backend_native.c
> index 0ba4e44..eeae711 100644
> --- a/libavfilter/dnn/dnn_backend_native.c
> +++ b/libavfilter/dnn/dnn_backend_native.c
> @@ -72,7 +72,6 @@ DNNModel *ff_dnn_load_model_native(const char 
> *model_filename)
>  ConvolutionalParams *conv_params;
>  DepthToSpaceParams *depth_to_space_params;
>  LayerPadParams *pad_params;
> -int32_t operand_index = 0;
>
>  model = av_malloc(sizeof(DNNModel));
>  if (!model){
> @@ -93,9 +92,10 @@ DNNModel *ff_dnn_load_model_native(const char 
> *model_filename)
>  }
>  model->model = (void *)network;
>
> -avio_seek(model_file_context, file_size - 4, SEEK_SET);
> +avio_seek(model_file_context, file_size - 8, SEEK_SET);
>  network->layers_num = (int32_t)avio_rl32(model_file_context);
> -dnn_size = 4;
> +network->operands_num = (int32_t)avio_rl32(model_file_context);
> +dnn_size = 8;
>  avio_seek(model_file_context, 0, SEEK_SET);
>
I think it is worth adding some means to assert that the input file is
indeed a dnn file; the code as is may allocate an undefined amount of
memory if the file passed is malformed or corrupted.
Maybe add a magic number + the file size (or something else) at the
beginning of the file and skip parsing early if they do not match?
However, that may require two passes to generate the file, which goes
against your previous patch.

Otherwise I can push it as is, as this behavior was already there
before the patch.
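
Something like the following is what I have in mind; just a sketch, the tag
and header layout are made up:

    /* hypothetical header check, run before any parsing or allocation */
    #define NATIVE_MODEL_MAGIC MKTAG('F','F','N','N')

    static int check_model_header(AVIOContext *ctx, int64_t file_size)
    {
        uint32_t magic, stored_size;

        avio_seek(ctx, 0, SEEK_SET);
        magic       = avio_rl32(ctx);
        stored_size = avio_rl32(ctx);

        /* reject malformed/corrupted files before allocating anything */
        if (magic != NATIVE_MODEL_MAGIC || stored_size != (uint32_t)file_size)
            return AVERROR_INVALIDDATA;
        return 0;
    }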

>  network->layers = av_mallocz(network->layers_num * sizeof(Layer));
> @@ -105,11 +105,6 @@ DNNModel *ff_dnn_load_model_native(const char 
> *model_filename)
>  return NULL;
>  }
>
> -/**
> - * Operands should be read from model file, the whole change will be 
> huge.
> - * to make things step by step, we first mock the operands, instead of 
> reading from model file.
> - */
> -network->operands_num = network->layers_num + 1;
>  network->operands = av_mallocz(network->operands_num * 
> sizeof(DnnOperand));
>  if (!network->operands){
>  avio_closep(&model_file_context);
> @@ -120,8 +115,6 @@ DNNModel *ff_dnn_load_model_native(const char 
> *model_filename)
>  for (layer = 0; layer < network->layers_num; ++layer){
>  layer_type = (int32_t)avio_rl32(model_file_context);
>  dnn_size += 4;
> -network->layers[layer].input_operand_indexes[0] = operand_index++;
> -network->layers[layer].output_operand_index = operand_index;
>  switch (layer_type){
>  case CONV:
>  conv_params = av_malloc(sizeof(ConvolutionalParams));
> @@ -162,6 +155,9 @@ DNNModel *ff_dnn_load_model_native(const char 
> *model_filename)
>  for (i = 0; i < conv_params->output_num; ++i){
>  conv_params->biases[i] = 
> av_int2float(avio_rl32(model_file_context));
>  }
> +network->layers[layer].input_operand_indexes[0] = 
> (int32_t)avio_rl32(model_file_context);
> +network->layers[layer].output_operand_index = 
> (int32_t)avio_rl32(model_file_context);
> +dnn_size += 8;
>  network->layers[layer].type = CONV;
>  network->layers[layer].params = conv_params;
>  break;
> @@ -174,6 +170,9 @@ DNNModel *ff_dnn_load_model_native(const char 
> *model_filename)
>  }
>  depth_to_space_params->block_size = 
> (int32_t)avio_rl32(model_file_context);
>  dnn_size += 4;
> +network->layers[layer].input_operand_indexes[0] = 
> (int32_t)avio_rl32(model_file_context);
> +network->layers[layer].output_operand_index = 
> (int32_t)avio_rl32(model_file_context);
> +dnn_size += 8;
>  network->layers[layer].type = DEPTH_TO_SPACE;
>  network->layers[layer].params = depth_to_space_params;
>  break;
> @@ -191,6 +190,9 @@ DNNModel *ff_dnn_load_model_native(const char 
> *model_filename)
>  pad_params->paddings[i][1] = avio_rl32(model_file_context);
>  dnn_size += 8;
>  }
> +network->layers[layer].input_operand_indexes[0] = 
> (int32_t)avio_rl32(model_file_context);
> +network->layers[layer].output_operand_index = 
> (int32_t)avio_rl32(model_file_context);
> +dnn_size += 8;
>  network->layers[layer].type = MIRROR_PAD;
>  network->layers[layer].params = pad_params;
>  break;
> @@ -201,6 +203,33 @@ DNNModel *ff_dnn_load_model_native

Re: [FFmpeg-devel] [PATCH] libavfilter: Removes stored DNN models. Adds support for native backend model file format in tf backend. Removes scaling and conversion with libswscale and replaces input fo

2018-09-02 Thread Pedro Arthur
Hi

2018-09-01 16:27 GMT-03:00 Sergey Lavrushkin :
> Hello,
>
> Resending patch with fixes of sr filter and dnn module for review.
Thanks for your work.

I think it would be better if you split this patch: one patch removing the
stored data and one adding support for the native model file format in the
tf backend.

Regarding the removal of swscale from the filter: given how it harms
usability by raising the complexity for the user, it should not be
removed.

In the original patch thread I discussed it with Gyan Doshi, who agreed on
keeping it. I also made sure the other, supposedly interested, devs were
aware of the decision via IRC, which no one cared to discuss further.
Thus I consider the sws removal subject closed, and for sure I'll not
ask you to redo it. I apologize for requiring this extra work from you,
but I think it will benefit everyone interested in using the sr filter.

>
> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH] libavfilter: Removes stored DNN models. Adds support for native backend model file format in tf backend. Removes scaling and conversion with libswscale and replaces input fo

2018-09-10 Thread Pedro Arthur
2018-09-06 8:44 GMT-03:00 Sergey Lavrushkin :

> Here is the patch with the sws removal changes reverted. I didn't split the
> patch into two patches, because the code that supports the native model file
> format in tf partially comes from the default model construction code, which
> is removed together with the default models and stored data.
>
Ok.



The scale_factor option is not necessary; it should be stored in the model
file. As the weights are trained for a specific factor, using anything
different from that will give bad results.
It seems the native depth-to-space conversion is buggy: a few lines at the
top of the output image are duplicated; the tf backend is ok. BTW, this bug
was not introduced by this patch.

Other than that LGTM; the above fixes can be done in separate patches.
I may push it by Friday.
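
For reference, depth_to_space only redistributes channels into a
block_size x block_size spatial neighborhood, so an off-by-one in the index
math shows up exactly as duplicated or shifted rows. A sketch of the intended
mapping for NHWC float data (illustrative, not the filter's actual code):

    /* in:  h x w x (c * block * block),  out: (h*block) x (w*block) x c */
    static void depth_to_space(float *dst, const float *src,
                               int h, int w, int c, int block)
    {
        int in_c = c * block * block;
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
                for (int by = 0; by < block; by++)
                    for (int bx = 0; bx < block; bx++)
                        for (int k = 0; k < c; k++)
                            dst[((y * block + by) * (w * block) +
                                  x * block + bx) * c + k] =
                                src[(y * w + x) * in_c +
                                    (by * block + bx) * c + k];
    }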
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH] libavfilter: Removes stored DNN models. Adds support for native backend model file format in tf backend. Removes scaling and conversion with libswscale and replaces input fo

2018-09-16 Thread Pedro Arthur
2018-09-16 15:20 GMT-03:00 Paul B Mahol :
>
> When will this be pushed?
>
Yes, I did not have time to push it Friday.
I'll do it Monday, or you could push it if you don't mind.
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH] libavfilter: Removes stored DNN models. Adds support for native backend model file format in tf backend. Removes scaling and conversion with libswscale and replaces input fo

2018-09-17 Thread Pedro Arthur
Pushed.

2018-09-16 16:21 GMT-03:00 Pedro Arthur :

>
>
> 2018-09-16 15:20 GMT-03:00 Paul B Mahol :
>>
>> When will this be pushed?
>>
> Yes, I did not have time to push it Friday.
> I'll do it Monday, or you could push it if you don't mind.
>
>
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH] avfilter/vf_sr: fix read out of bounds

2018-09-18 Thread Pedro Arthur
Hi,

2018-09-17 0:43 GMT-03:00 Zhao Zhili :

> Ping for review.
>
> On 2018年09月13日 15:58, Zhao Zhili wrote:
>
>> ---
>>   libavfilter/vf_sr.c | 9 ++---
>>   1 file changed, 6 insertions(+), 3 deletions(-)
>>
>> diff --git a/libavfilter/vf_sr.c b/libavfilter/vf_sr.c
>> index 5ad1baa..bc9d186 100644
>> --- a/libavfilter/vf_sr.c
>> +++ b/libavfilter/vf_sr.c
>> @@ -239,7 +239,8 @@ static int filter_frame(AVFilterLink *inlink, AVFrame
>> *in)
>> 0, sr_context->sws_slice_h, out->data, out->linesize);
>> sws_scale(sr_context->sws_contexts[1], (const uint8_t
>> **)out->data, out->linesize,
>> -  0, out->height, (uint8_t *
>> const*)(&sr_context->input.data), &sr_context->sws_input_linesize);
>> +  0, out->height, (uint8_t *
>> const*)(&sr_context->input.data),
>> +  (const int [4]){sr_context->sws_input_linesize, 0, 0,
>> 0});
>>   break;
>>   case ESPCN:
>>   if (sr_context->sws_contexts[0]){
>> @@ -250,7 +251,8 @@ static int filter_frame(AVFilterLink *inlink, AVFrame
>> *in)
>>   }
>> sws_scale(sr_context->sws_contexts[1], (const uint8_t
>> **)in->data, in->linesize,
>> -  0, in->height, (uint8_t *
>> const*)(&sr_context->input.data), &sr_context->sws_input_linesize);
>> +  0, in->height, (uint8_t *
>> const*)(&sr_context->input.data),
>> +  (const int [4]){sr_context->sws_input_linesize, 0, 0,
>> 0});
>>   }
>>   av_frame_free(&in);
>>   @@ -260,7 +262,8 @@ static int filter_frame(AVFilterLink *inlink,
>> AVFrame *in)
>>   return AVERROR(EIO);
>>   }
>>   -sws_scale(sr_context->sws_contexts[2], (const uint8_t
>> **)(&sr_context->output.data), &sr_context->sws_output_linesize,
>> +sws_scale(sr_context->sws_contexts[2], (const uint8_t
>> **)(&sr_context->output.data),
>> +  (const int [4]){sr_context->sws_output_linesize, 0, 0, 0},
>> 0, out->height, (uint8_t * const*)out->data,
>> out->linesize);
>> return ff_filter_frame(outlink, out);
>>
>

The patch does not apply against head, but the fix is correct.
Could you make a new patch?
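
For the archives, the root of the overread: as far as I can tell, sws_scale()
treats its stride arguments as arrays with one entry per plane and may read
all four entries, so taking the address of a single int lets it read past the
variable. Widening to a zero-padded 4-element compound literal is the whole
fix; a sketch with simplified names:

    float *dnn_input;   /* assume this points at the single-plane input */
    const uint8_t *src[4] = { (const uint8_t *)dnn_input, NULL, NULL, NULL };

    sws_scale(sws_ctx, src,
              (const int[4]){ input_linesize, 0, 0, 0 }, /* not &input_linesize */
              0, height, out->data, out->linesize);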

Thanks,
Pedro.
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH 3/3] avfilter/dnn_backend_native: fix memleak

2018-09-18 Thread Pedro Arthur
Hi,

2018-09-13 4:49 GMT-03:00 Zhao Zhili :

> ---
>  libavfilter/dnn_backend_native.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/libavfilter/dnn_backend_native.c b/libavfilter/dnn_backend_
> native.c
> index 7ed155d..3108185 100644
> --- a/libavfilter/dnn_backend_native.c
> +++ b/libavfilter/dnn_backend_native.c
> @@ -489,6 +489,7 @@ void ff_dnn_free_model_native(DNNModel **model)
>  }
>  av_freep(&network->layers[layer].params);
>  }
> +av_freep(&network->layers);
>  av_freep(&network);
>  av_freep(model);
>  }
> --
>

The patch does not apply; could you rebase it?
Thanks!
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH 2/3] avfilter/dnn_backend_native: fix invalid free

2018-09-18 Thread Pedro Arthur
Hi,

2018-09-13 4:49 GMT-03:00 Zhao Zhili :

> ---
>  libavfilter/dnn_backend_native.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/libavfilter/dnn_backend_native.c b/libavfilter/dnn_backend_
> native.c
> index baefea7..7ed155d 100644
> --- a/libavfilter/dnn_backend_native.c
> +++ b/libavfilter/dnn_backend_native.c
> @@ -489,7 +489,7 @@ void ff_dnn_free_model_native(DNNModel **model)
>  }
>  av_freep(&network->layers[layer].params);
>  }
> -av_freep(network);
> +av_freep(&network);
>  av_freep(model);
>  }
>  }
> --
>

The patch does not apply; could you rebase it?
Thanks!
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [V2 PATCH 3/3] avfilter/dnn_backend_native: fix memleak

2018-09-19 Thread Pedro Arthur
Pushed, thanks!

2018-09-18 23:55 GMT-03:00 Zhao Zhili :

> ---
>  libavfilter/dnn_backend_native.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/libavfilter/dnn_backend_native.c b/libavfilter/dnn_backend_
> native.c
> index 7dd35d46..70d857f 100644
> --- a/libavfilter/dnn_backend_native.c
> +++ b/libavfilter/dnn_backend_native.c
> @@ -343,6 +343,7 @@ void ff_dnn_free_model_native(DNNModel **model)
>  }
>  av_freep(&network->layers[layer].params);
>  }
> +av_freep(&network->layers);
>  av_freep(&network);
>  av_freep(model);
>  }
> --
> 2.9.5
>
>
>
> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [V2 PATCH 1/3] avfilter/vf_sr: fix read out of bounds

2018-09-19 Thread Pedro Arthur
Pushed, Thanks!

2018-09-18 23:55 GMT-03:00 Zhao Zhili :

> ---
>  libavfilter/vf_sr.c | 9 ++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/libavfilter/vf_sr.c b/libavfilter/vf_sr.c
> index 8a77a1d..c1ae6c5 100644
> --- a/libavfilter/vf_sr.c
> +++ b/libavfilter/vf_sr.c
> @@ -227,7 +227,8 @@ static int filter_frame(AVFilterLink *inlink, AVFrame
> *in)
>0, sr_context->sws_slice_h, out->data, out->linesize);
>
>  sws_scale(sr_context->sws_contexts[1], (const uint8_t
> **)out->data, out->linesize,
> -  0, out->height, (uint8_t * 
> const*)(&sr_context->input.data),
> &sr_context->sws_input_linesize);
> +  0, out->height, (uint8_t * const*)(&sr_context->input.
> data),
> +  (const int [4]){sr_context->sws_input_linesize, 0, 0,
> 0});
>  }
>  else{
>  if (sr_context->sws_contexts[0]){
> @@ -238,7 +239,8 @@ static int filter_frame(AVFilterLink *inlink, AVFrame
> *in)
>  }
>
>  sws_scale(sr_context->sws_contexts[1], (const uint8_t
> **)in->data, in->linesize,
> -  0, in->height, (uint8_t * const*)(&sr_context->input.data),
> &sr_context->sws_input_linesize);
> +  0, in->height, (uint8_t * const*)(&sr_context->input.
> data),
> +  (const int [4]){sr_context->sws_input_linesize, 0, 0,
> 0});
>  }
>  av_frame_free(&in);
>
> @@ -248,7 +250,8 @@ static int filter_frame(AVFilterLink *inlink, AVFrame
> *in)
>  return AVERROR(EIO);
>  }
>
> -sws_scale(sr_context->sws_contexts[2], (const uint8_t
> **)(&sr_context->output.data), &sr_context->sws_output_linesize,
> +sws_scale(sr_context->sws_contexts[2], (const uint8_t
> **)(&sr_context->output.data),
> +  (const int[4]){sr_context->sws_output_linesize, 0, 0, 0},
>0, out->height, (uint8_t * const*)out->data, out->linesize);
>
>  return ff_filter_frame(outlink, out);
> --
> 2.9.5
>
>
>
> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [V2 PATCH 2/3] avfilter/dnn_backend_native: fix invalid free

2018-09-19 Thread Pedro Arthur
Pushed,
Thanks!

2018-09-18 23:55 GMT-03:00 Zhao Zhili :

> ---
>  libavfilter/dnn_backend_native.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/libavfilter/dnn_backend_native.c b/libavfilter/dnn_backend_
> native.c
> index 184fe54..7dd35d46 100644
> --- a/libavfilter/dnn_backend_native.c
> +++ b/libavfilter/dnn_backend_native.c
> @@ -343,7 +343,7 @@ void ff_dnn_free_model_native(DNNModel **model)
>  }
>  av_freep(&network->layers[layer].params);
>  }
> -av_freep(network);
> +av_freep(&network);
>  av_freep(model);
>  }
>  }
> --
> 2.9.5
>
>
>
> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


[FFmpeg-devel] [PATCH] avfilter/vf_sr: Fix coverity CID 1439584

2018-09-20 Thread Pedro Arthur
Hi,
This patch fixes coverity CID 1439584.
From a724bece3c18df551f874673c268aa01702b4576 Mon Sep 17 00:00:00 2001
From: Pedro Arthur 
Date: Thu, 20 Sep 2018 11:48:20 -0300
Subject: [PATCH] avfilter/vf_sr: Fix coverity CID 1439584

---
 libavfilter/vf_sr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavfilter/vf_sr.c b/libavfilter/vf_sr.c
index c1ae6c5ff2..077ccc799c 100644
--- a/libavfilter/vf_sr.c
+++ b/libavfilter/vf_sr.c
@@ -250,7 +250,7 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *in)
 return AVERROR(EIO);
 }
 
-sws_scale(sr_context->sws_contexts[2], (const uint8_t **)(&sr_context->output.data),
+sws_scale(sr_context->sws_contexts[2], (const uint8_t *[4]){(const uint8_t *)sr_context->output.data, 0, 0, 0},
   (const int[4]){sr_context->sws_output_linesize, 0, 0, 0},
   0, out->height, (uint8_t * const*)out->data, out->linesize);
 
-- 
2.17.1

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH] avfilter/sr: process and output message when load_model is NULL

2018-09-24 Thread Pedro Arthur
2018-09-24 0:35 GMT-03:00 Steven Liu :

> fix ticket: 7455
>
> Signed-off-by: Steven Liu 
> ---
>  libavfilter/dnn_interface.c | 4 
>  libavfilter/vf_sr.c | 7 ++-
>  2 files changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/libavfilter/dnn_interface.c b/libavfilter/dnn_interface.c
> index 78d7c5cf22..792c280c53 100644
> --- a/libavfilter/dnn_interface.c
> +++ b/libavfilter/dnn_interface.c
> @@ -52,6 +52,10 @@ DNNModule *ff_get_dnn_module(DNNBackendType
> backend_type)
>  av_freep(&dnn_module);
>  return NULL;
>  #endif
> +default:
> +av_log(NULL, AV_LOG_ERROR, "Module backend_type is not native or
> tensorflow\n");
> +av_freep(&dnn_module);
> +return NULL;
>  }
>
It is missing a break in the DNN_TF case; the rest looks good.
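
i.e. with the break in place, the switch would read roughly like this (a
paraphrased sketch, not the exact code):

    switch (backend_type) {
    case DNN_NATIVE:
        /* ... set up the native backend ... */
        break;
    case DNN_TF:
    #if CONFIG_LIBTENSORFLOW
        /* ... set up the tensorflow backend ... */
        break;  /* the missing break; without it we fall into default */
    #else
        av_freep(&dnn_module);
        return NULL;
    #endif
    default:
        av_log(NULL, AV_LOG_ERROR, "Module backend_type is not native or tensorflow\n");
        av_freep(&dnn_module);
        return NULL;
    }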

Thanks.
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] Mentoring project: music test source

2018-10-04 Thread Pedro Arthur
Hi,
Em dom, 30 de set de 2018 às 14:41, Nicolas George 
escreveu:

> Hi.
>
> For the next rounds of sponsored internships, I would like to propose
> the following project, that I would mentor:
>
> A music-like audio lavfi source for testing purposes.
>
> That means a deterministic pseudo-random stream of notes with varied
> frequencies, with a structure that looks like music and would trigger
> the same pathways in filters and codecs.
>
Lately I did some research on fractal music generation using L-systems.
I'm providing a reference [1] in case anyone finds it interesting (just
searching for it on Google gives lots of examples).

It basically consists of defining a grammar and applying it repeatedly to an
initial symbol. The resulting string is then interpreted to play a note, move
the pitch up/down, etc.; see the toy sketch after the reference below.

[1] -
https://pdfs.semanticscholar.org/8c2f/caaf3153779ec3e838b416cd6e6d7feecdb9.pdf
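
To make it concrete, here is a toy C sketch (grammar and note mapping invented
for illustration): two rewrite rules are applied for a few generations, then
the resulting string is walked to emit note frequencies.

    #include <stdio.h>
    #include <string.h>
    #include <math.h>

    /* toy rules: F -> F+G, G -> F-G; '+'/'-' shift pitch by a semitone,
     * letters emit a note at the current pitch */
    int main(void)
    {
        char buf[2][4096] = { { 'F', 0 } };
        int cur = 0;

        for (int it = 0; it < 6; it++) {  /* rewrite for 6 generations */
            char *dst = buf[1 - cur];
            dst[0] = 0;
            for (const char *p = buf[cur]; *p; p++) {
                if      (*p == 'F') strcat(dst, "F+G");
                else if (*p == 'G') strcat(dst, "F-G");
                else                strncat(dst, p, 1);
            }
            cur = 1 - cur;
        }

        int semitone = 0;                 /* interpret the final string */
        for (const char *p = buf[cur]; *p; p++) {
            if      (*p == '+') semitone++;
            else if (*p == '-') semitone--;
            else printf("note %.2f Hz\n", 440.0 * pow(2.0, semitone / 12.0));
        }
        return 0;
    }

Since the expansion is deterministic, a lavfi source built on this idea would
produce the same pseudo-random-looking note stream on every run, which is
exactly what a test source needs.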
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH 6/7] libavfilter/vf_sr.c: Removes uint8 -> float and float -> uint8 conversions.

2018-10-09 Thread Pedro Arthur
Hi,
Em seg, 8 de out de 2018 às 23:59, Liu Steven  escreveu:
>
>
>
> > 在 2018年8月15日,上午2:37,Pedro Arthur  写道:
> >
> > Patch pushed.
>
> How should i test it?
If you already performed the training (train_srcnn.sh/train_espcn.sh),
you can generate the model files using the script
'generate_header_and_model.py' provided in the repo. If not, I'm
attaching my generated models.
Then
./ffmpeg -i img -vf sr=model=model_file_name output
or if you have TF
./ffmpeg -i img -vf sr=model=model_file_name:dnn_backend=tensorflow output


>
>
> bash generate_datasets.sh
> (py3k) [root@onvideo sr]# ls 
> logdir/srcnn_batch_32_lr_1e-3_decay_adam/train/model_100*
> logdir/srcnn_batch_32_lr_1e-3_decay_adam/train/model_100.ckpt.data-0-of-1
>   logdir/srcnn_batch_32_lr_1e-3_decay_adam/train/model_100.ckpt.index  
> logdir/srcnn_batch_32_lr_1e-3_decay_adam/train/model_100.ckpt.meta
> (py3k) [root@onvideo sr]#
>
> [root@onvideo nvidia]# ./ffmpeg
> ffmpeg version N-91943-g1b98bfb Copyright (c) 2000-2018 the FFmpeg developers
>   built with gcc 4.8.5 (GCC) 20150623 (Red Hat 4.8.5-28)
>   configuration: --enable-ffnvcodec --enable-libtensorflow 
> --extra-ldflags=-L/data/liuqi/tensorflow/bazel-bin/tensorflow/
>   libavutil  56. 19.101 / 56. 19.101
>   libavcodec 58. 30.100 / 58. 30.100
>   libavformat58. 18.100 / 58. 18.100
>   libavdevice58.  4.103 / 58.  4.103
>   libavfilter 7. 31.100 /  7. 31.100
>   libswscale  5.  2.100 /  5.  2.100
>   libswresample   3.  2.100 /  3.  2.100
> Hyper fast Audio and Video encoder
> usage: ffmpeg [options] [[infile options] -i infile]... {[outfile options] 
> outfile}...
>
> Use -h to get full help or, even better, run 'man ffmpeg'
> [root@onvideo nvidia]# pwd
> /data/liuqi/ffmpeg/nvidia
> [root@onvideo nvidia]#
>
>
> BTW, the GitHub link looks like nobody is maintaining it.
> https://github.com/HighVoltageRocknRoll/sr
Is there anything that is not working?

>
>
> Thanks
> > ___
> > ffmpeg-devel mailing list
> > ffmpeg-devel@ffmpeg.org
> > http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
>
>
> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH V2] Add a filter implementing HDR image reconstruction from a single exposure using deep CNNs

2018-10-17 Thread Pedro Arthur
Hi,

How hard is it to support the native backend? Which operations are
missing, or are there any other limitations?

Em qua, 17 de out de 2018 às 05:47, Guo, Yejun  escreveu:
>
> see the algorithm's paper and code below.
>
> the filter's parameter looks like:
> sdr2hdr=model_filename=/path_to_tensorflow_graph.pb:out_fmt=gbrp10le
>
> The input of the deep CNN model is RGB24, while the output is float
> for each color channel. Accordingly, the filter's default behavior is to
> output the gbrpf32le format. gbrp10le is also supported as the
> output, so we can see the rendering result in a player, as a reference.
>
> To generate the model file, we need to modify the original script a little.
> - set name='y' for y_final within script at
> https://github.com/gabrieleilertsen/hdrcnn/blob/master/network.py
> - add the following code to the script at
> https://github.com/gabrieleilertsen/hdrcnn/blob/master/hdrcnn_predict.py
>
> graph = tf.graph_util.convert_variables_to_constants(sess, sess.graph_def, 
> ["y"])
> tf.train.write_graph(graph, '.', 'graph.pb', as_text=False)
>
> The filter only works when tensorflow C api is supported in the system,
> native backend is not supported since there are some different types of
> layers in the deep CNN model, besides CONV and DEPTH_TO_SPACE.
>
> https://arxiv.org/pdf/1710.07480.pdf:
>   author   = "Eilertsen, Gabriel and Kronander, Joel, and Denes, Gyorgy 
> and Mantiuk, Rafał and Unger, Jonas",
>   title= "HDR image reconstruction from a single exposure using deep 
> CNNs",
>   journal  = "ACM Transactions on Graphics (TOG)",
>   number   = "6",
>   volume   = "36",
>   articleno= "178",
>   year = "2017"
>
> https://github.com/gabrieleilertsen/hdrcnn
>
> btw, as a whole solution, metadata should also be generated from
> the sdr video, so it can be encoded as an HDR video. Not supported yet.
> This patch just focuses on this paper.
>
> v2: use AV_OPT_TYPE_PIXEL_FMT for filter option
> remove some unnecessary code
> Use in->linesize[0] and FFMAX/FFMIN
> remove flag AVFILTER_FLAG_SLICE_THREADS
> add av_log message when error
>
> Signed-off-by: Guo, Yejun 
> ---
>  libavfilter/Makefile |   1 +
>  libavfilter/allfilters.c |   1 +
>  libavfilter/vf_sdr2hdr.c | 266 
> +++
>  3 files changed, 268 insertions(+)
>  create mode 100644 libavfilter/vf_sdr2hdr.c
>
> diff --git a/libavfilter/Makefile b/libavfilter/Makefile
> index 62cc2f5..88e7da6 100644
> --- a/libavfilter/Makefile
> +++ b/libavfilter/Makefile
> @@ -360,6 +360,7 @@ OBJS-$(CONFIG_SOBEL_OPENCL_FILTER)   += 
> vf_convolution_opencl.o opencl.o
>  OBJS-$(CONFIG_SPLIT_FILTER)  += split.o
>  OBJS-$(CONFIG_SPP_FILTER)+= vf_spp.o
>  OBJS-$(CONFIG_SR_FILTER) += vf_sr.o
> +OBJS-$(CONFIG_SDR2HDR_FILTER)+= vf_sdr2hdr.o
>  OBJS-$(CONFIG_SSIM_FILTER)   += vf_ssim.o framesync.o
>  OBJS-$(CONFIG_STEREO3D_FILTER)   += vf_stereo3d.o
>  OBJS-$(CONFIG_STREAMSELECT_FILTER)   += f_streamselect.o framesync.o
> diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
> index 5e72803..1645c0f 100644
> --- a/libavfilter/allfilters.c
> +++ b/libavfilter/allfilters.c
> @@ -319,6 +319,7 @@ extern AVFilter ff_vf_scale_npp;
>  extern AVFilter ff_vf_scale_qsv;
>  extern AVFilter ff_vf_scale_vaapi;
>  extern AVFilter ff_vf_scale2ref;
> +extern AVFilter ff_vf_sdr2hdr;
>  extern AVFilter ff_vf_select;
>  extern AVFilter ff_vf_selectivecolor;
>  extern AVFilter ff_vf_sendcmd;
> diff --git a/libavfilter/vf_sdr2hdr.c b/libavfilter/vf_sdr2hdr.c
> new file mode 100644
> index 000..fa61bfa
> --- /dev/null
> +++ b/libavfilter/vf_sdr2hdr.c
> @@ -0,0 +1,266 @@
> +/*
> + * Copyright (c) 2018 Guo Yejun
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 
> USA
> + */
> +
> +/**
> + * @file
> + * Filter implementing HDR image reconstruction from a single exposure using 
> deep CNNs.
> + * https://arxiv.org/pdf/1710.07480.pdf
> + */
> +
> +#include "avfilter.h"
> +#include "formats.h"
> +#include "internal.h"
> +#include "libavutil/opt.h"
> +#include "libavutil/qsort.h"
> +#include "libavformat/avio.h"
> +#include "libswscale/s

Re: [FFmpeg-devel] GSoC 2018 mentor summit review

2018-10-22 Thread Pedro Arthur
Hi!

It was a great experience; as you said, there were lots of interesting
projects and lots of passionate devs.

Em sáb, 20 de out de 2018 às 12:23, Thilo Borgmann
 escreveu:
> It was my pleasure to come to know Pedro in person - maybe he has some more 
> comments about the summit and his impressions.
It was very nice to meet you and all the multimedia guys too!

Best regards,
Pedro.
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


[FFmpeg-devel] Reimbursement request

2018-11-05 Thread Pedro Arthur
Hi,
I'm requesting the reimbursement of travel expenses for the Google
Mentor Summit.
I mentored the Super Resolution project, more details can be found in [1].


Flight           (BRL)  R$ 3271.03
One night Airbnb (BRL)  R$  235.70
Transportation   (USD)   $   23.00

Total            (USD)   $  971.46

BRL to USD conversion done according to the Brazilian central bank [2]
on 1st November (1 USD = 3.6973 BRL).

[1] - https://trac.ffmpeg.org/wiki/SponsoringPrograms/GSoC/2018/Results
[2] - https://www4.bcb.gov.br/pec/taxas/port/ptaxnpesq.asp?id=txcotacao


Thanks,
Pedro.
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] Regarding GSoC 2020 project proposal

2020-03-03 Thread Pedro Arthur
Hi

Em ter., 3 de mar. de 2020 às 09:24, YATENDRA SINGH
 escreveu:
>
> Hi,
> I am a third year CSE student at the Indian Institute of Technology Bhilai,
> and would like to contribute to ffmpeg this year. I have
> relevant experience with Machine Learning and would like to work on
> improving the video frame interpolation already implemented. With such a
> plethora of great Machine Learning Algorithms being published every year at
> prestigious conferences I would aim to read the relevant academic papers
> and implement the best suited technique for the task. For example, Depth
> Aware Video Frame Interpolation (DAIN CVPR-2019) is supposedly the state of
> the art method on Vimeo90k and MiddleBury
> <https://paperswithcode.com/task/video-frame-interpolation>, but at the same
> time Frame Interpolation with Generative Adversarial Networks (FIGAN) uses
> not a CNN but multi-scale synthesis (MS) to get higher speeds.
> Looking forward to hearing from you soon.
>
> Yatendra Singh
> Frame Interpolation with Multi-Scale Deep Loss Functions and Generative
> Adversarial Networks

I suppose this project is your own idea, as it is not listed on the
projects page, right?

I think it would be good to add your idea under the "Your Own Project Idea"
section in [1], adding as much information as possible so that we can
evaluate your idea and possibly assign a mentor / backup mentor.
A few things I think are important to evaluate your project are:
*have a well-defined "expected result": will it be a filter, or
something else? We already have a dnn module and a dnn_processing
filter; will your project be using them?

*what is the amount of work that will be done during the project; this is
more or less related to the above "expected result"

*define a qualification task; we can discuss it after the above is defined

*sell your idea (not strictly necessary but may help in evaluating your
project): why is it a useful feature to have, what improvements does it
bring, etc.

[1] - 
https://trac.ffmpeg.org/wiki/SponsoringPrograms/GSoC/2020#YourOwnProjectIdea
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH V2 3/3] avfilter/vf_dnn_processing.c: add frame size change support for planar yuv format

2020-03-06 Thread Pedro Arthur
Em qui., 5 de mar. de 2020 às 20:57, Guo, Yejun  escreveu:
>
>
>
> > -Original Message-
> > From: Guo, Yejun
> > Sent: Tuesday, February 25, 2020 5:15 PM
> > To: ffmpeg-devel@ffmpeg.org
> > Cc: Guo, Yejun 
> > Subject: [PATCH V2 3/3] avfilter/vf_dnn_processing.c: add frame size change
> > support for planar yuv format
> >
> > The Y channel is handled by dnn, and also resized by dnn. The UV channels
> > are resized with swscale.
> >
> > The command to use espcn.pb (see vf_sr) looks like:
> > ./ffmpeg -i 480p.jpg -vf
> > format=yuv420p,dnn_processing=dnn_backend=tensorflow:model=espcn.pb:in
> > put=x:output=y -y tmp.espcn.jpg
> >
> > Signed-off-by: Guo, Yejun 
> > ---
> >  doc/filters.texi|  9 +
> >  libavfilter/vf_dnn_processing.c | 37 ++---
> >  2 files changed, 39 insertions(+), 7 deletions(-)
>
> this patch set asks for review, thanks.
I'll not be able to test it in the near future, but code-wise LGTM.

> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH V2 3/3] avfilter/vf_dnn_processing.c: add frame size change support for planar yuv format

2020-03-06 Thread Pedro Arthur
Em sex., 6 de mar. de 2020 às 00:52, myp...@gmail.com
 escreveu:
>
> On Tue, Feb 25, 2020 at 5:24 PM Guo, Yejun  wrote:
> >
> > The Y channel is handled by dnn, and also resized by dnn. The UV channels
> > are resized with swscale.
> For me, it is a little weird to resize Y with the dnn backend but resize
> the UV channels with FFmpeg swscale; does it use the same scaling algorithm?
Complementing Yejun's response: usually the luminance plane contains
most of the high-frequency content in "natural" images; therefore most
super-resolution methods are applied only to the Y channel, which is cheaper
than processing all channels and yields almost as good results.
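
In code terms the whole trick is just the following (a condensed sketch with
simplified names; the actual patch is below):

    /* Y goes through the network, which also does the upscaling;
     * U and V are plain bicubic resizes of the subsampled planes */
    struct SwsContext *uv_scaler =
        sws_getContext(src_uv_w, src_uv_h, AV_PIX_FMT_GRAY8,
                       dst_uv_w, dst_uv_h, AV_PIX_FMT_GRAY8,
                       SWS_BICUBIC, NULL, NULL, NULL);

    for (int i = 1; i < 3; i++)  /* planes 1 and 2 are U and V */
        sws_scale(uv_scaler, (const uint8_t * const *)(in->data + i),
                  in->linesize + i, 0, src_uv_h,
                  out->data + i, out->linesize + i);

Each chroma plane is treated as a single gray8 surface, so one gray8 scaler
handles both, and only the luma pays for the network.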

>
> > The command to use espcn.pb (see vf_sr) looks like:
> > ./ffmpeg -i 480p.jpg -vf 
> > format=yuv420p,dnn_processing=dnn_backend=tensorflow:model=espcn.pb:input=x:output=y
> >  -y tmp.espcn.jpg
> >
> > Signed-off-by: Guo, Yejun 
> > ---
> >  doc/filters.texi|  9 +
> >  libavfilter/vf_dnn_processing.c | 37 ++---
> >  2 files changed, 39 insertions(+), 7 deletions(-)
> >
> > diff --git a/doc/filters.texi b/doc/filters.texi
> > index 33b7857..e3df8f9 100644
> > --- a/doc/filters.texi
> > +++ b/doc/filters.texi
> > @@ -9155,6 +9155,7 @@ ffmpeg -i INPUT -f lavfi -i 
> > nullsrc=hd720,geq='r=128+80*(sin(sqrt((X-W/2)*(X-W/2
> >  @end example
> >  @end itemize
> >
> > +@anchor{dnn_processing}
> >  @section dnn_processing
> >
> >  Do image processing with deep neural networks. It works together with 
> > another filter
> > @@ -9216,6 +9217,12 @@ Handle the Y channel with srcnn.pb (see @ref{sr} 
> > filter) for frame with yuv420p
> >  ./ffmpeg -i 480p.jpg -vf 
> > format=yuv420p,scale=w=iw*2:h=ih*2,dnn_processing=dnn_backend=tensorflow:model=srcnn.pb:input=x:output=y
> >  -y srcnn.jpg
> >  @end example
> >
> > +@item
> > +Handle the Y channel with espcn.pb (see @ref{sr} filter), which changes 
> > frame size, for format yuv420p (planar YUV formats supported):
> > +@example
> > +./ffmpeg -i 480p.jpg -vf 
> > format=yuv420p,dnn_processing=dnn_backend=tensorflow:model=espcn.pb:input=x:output=y
> >  -y tmp.espcn.jpg
> > +@end example
> > +
> >  @end itemize
> >
> >  @section drawbox
> > @@ -17369,6 +17376,8 @@ Default value is @code{2}. Scale factor is 
> > necessary for SRCNN model, because it
> >  input upscaled using bicubic upscaling with proper scale factor.
> >  @end table
> >
> > +This feature can also be finished with @ref{dnn_processing} filter.
> > +
> >  @section ssim
> >
> >  Obtain the SSIM (Structural SImilarity Metric) between two input videos.
> > diff --git a/libavfilter/vf_dnn_processing.c 
> > b/libavfilter/vf_dnn_processing.c
> > index f9458f0..7f40f85 100644
> > --- a/libavfilter/vf_dnn_processing.c
> > +++ b/libavfilter/vf_dnn_processing.c
> > @@ -51,6 +51,8 @@ typedef struct DnnProcessingContext {
> >
> >  struct SwsContext *sws_gray8_to_grayf32;
> >  struct SwsContext *sws_grayf32_to_gray8;
> > +struct SwsContext *sws_uv_scale;
> > +int sws_uv_height;
> >  } DnnProcessingContext;
> >
> >  #define OFFSET(x) offsetof(DnnProcessingContext, x)
> > @@ -274,6 +276,18 @@ static int prepare_sws_context(AVFilterLink *outlink)
> > outlink->h,
> > AV_PIX_FMT_GRAY8,
> > 0, NULL, NULL, NULL);
> > +
> > +if (inlink->w != outlink->w || inlink->h != outlink->h) {
> > +const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(fmt);
> > +int sws_src_h = AV_CEIL_RSHIFT(inlink->h, desc->log2_chroma_h);
> > +int sws_src_w = AV_CEIL_RSHIFT(inlink->w, desc->log2_chroma_w);
> > +int sws_dst_h = AV_CEIL_RSHIFT(outlink->h, 
> > desc->log2_chroma_h);
> > +int sws_dst_w = AV_CEIL_RSHIFT(outlink->w, 
> > desc->log2_chroma_w);
> > +ctx->sws_uv_scale = sws_getContext(sws_src_w, sws_src_h, 
> > AV_PIX_FMT_GRAY8,
> > +   sws_dst_w, sws_dst_h, 
> > AV_PIX_FMT_GRAY8,
> > +   SWS_BICUBIC, NULL, NULL, 
> > NULL);
> > +ctx->sws_uv_height = sws_src_h;
> > +}
> >  return 0;
> >  default:
> >  //do nothing
> > @@ -404,13 +418,21 @@ static av_always_inline int isPlanarYUV(enum 
> > AVPixelFormat pix_fmt)
> >
> >  static int copy_uv_planes(DnnProcessingContext *ctx, AVFrame *out, const 
> > AVFrame *in)
> >  {
> > -const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(in->format);
> > -int uv_height = AV_CEIL_RSHIFT(in->height, desc->log2_chroma_h);
> > -for (int i = 1; i < 3; ++i) {
> > -int bytewidth = av_image_get_linesize(in->format, in->width, i);
> > -av_image_copy_plane(out->data[i], out->linesize[i],
> > -in->data[i], in->linesize[i],
> > -bytewidth, uv_height);
> > +if (!ctx->sws

Re: [FFmpeg-devel] Regarding GSoC 2020 project proposal

2020-03-16 Thread Pedro Arthur
Hi,

Em qua., 4 de mar. de 2020 às 09:57, YATENDRA SINGH
 escreveu:
>
> Thank you for explaining the procedure.
> I have posted my own project proposal on the page you had instructed me to.
> Looking forward to the feedback.


Have you contacted any possible mentor? If not, I would suggest you
make your project idea less generic.
For example: which dnn models are you planning to use (both for the
qualification task and the project)?
Are they already supported by our dnn infrastructure? Our dnn module
has two backends: native, which is CPU-only, and tensorflow. Which one
are you going to support, or both?

>
>
> Regards,
> Yatendra Singh.
>
> On Tue, Mar 3, 2020 at 10:19 PM Pedro Arthur  wrote:
>
> > Hi
> >
> > Em ter., 3 de mar. de 2020 às 09:24, YATENDRA SINGH
> >  escreveu:
> > >
> > > Hi,
> > > I am a third year CSE student at the Indian Institute of Technology
> > Bhilai,
> > > and would like to contribute to ffmpeg this year. I have
> > > relevant experience with Machine Learning and would like to work on
> > > improving the video frame interpolation already implemented. With such a
> > > plethora of great Machine Learning Algorithms being published every year
> > at
> > > prestigious conferences I would aim to read the relevant academic papers
> > > and implement the best suited technique for the task. For example, Depth
> > > Aware Video Frame Interpolation (DAIN CVPR-2019) is supposedly the state
> > of
> > > the art method on Vimeo90k and MiddleBury
> > > <https://paperswithcode.com/task/video-frame-interpolation>, but at the
> > > same time Frame Interpolation with Generative Adversarial Networks (FIGAN)
> > > uses not a CNN but multi-scale synthesis (MS) to get higher speeds.
> > > Looking forward to hearing from you soon.
> > >
> > > Yatendra Singh
> > > Frame Interpolation with Multi-Scale Deep Loss Functions and Generative
> > > Adversarial Networks
> >
> > I suppose this project is your own idea as it is not listed on the
> > projects page, right?
> >
> > I think it would be good to add your idea under the "Your Own Project Idea"
> > section in [1], adding as much information as possible so that we can
> > evaluate your idea and possibly assign a mentor / backup mentor.
> > A few things I think are important to evaluate your project are:
> > *have a well-defined "expected result": will it be a filter or
> > something else? We already have a dnn module and a dnn_processing
> > filter; will your project be using it?
> >
> > *what is the amount of work that will be done during the project; more
> > or less this is related to the above "expected result"
> >
> > *define a qualification task; we can discuss it after the above is defined
> >
> > *sell your idea (not strictly necessary but may help in evaluating your
> > project): why it is a useful feature to have, what improvements it
> > brings, etc.
> >
> > [1] -
> > https://trac.ffmpeg.org/wiki/SponsoringPrograms/GSoC/2020#YourOwnProjectIdea
> > ___
> > ffmpeg-devel mailing list
> > ffmpeg-devel@ffmpeg.org
> > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> >
> > To unsubscribe, visit link above, or email
> > ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH 19/23] dnn/dnn_backend_native_layer_conv2d: Check allocation

2021-03-11 Thread Pedro Arthur
On Thu, Mar 11, 2021 at 04:29, Guo, Yejun wrote:
>
>
>
> > -Original Message-
> > From: ffmpeg-devel  On Behalf Of
> > Andreas Rheinhardt
> > Sent: March 11, 2021 5:55
> > To: ffmpeg-devel@ffmpeg.org
> > Cc: Andreas Rheinhardt 
> > Subject: [FFmpeg-devel] [PATCH 19/23]
> > dnn/dnn_backend_native_layer_conv2d: Check allocation
> >
> > Signed-off-by: Andreas Rheinhardt 
> > ---
> > Why does DNN actually not use the ordinary error codes?
>
> DNN_ERROR/DNN_SUCCESS was introduced at the very beginning.
> @Pedro, any comment on whether we need to revisit the error codes? thanks.

I believe it was used for dnn-specific errors, and at some point they
got mixed with ordinary errors.
I agree we should use ordinary error codes for ordinary errors.
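
As a minimal sketch, assuming the callers are updated to propagate plain int return values, the allocation check below would then read:

    thread_param = av_calloc(thread_num, sizeof(*thread_param));
    if (!thread_param)
        return AVERROR(ENOMEM); /* ordinary error code instead of DNN_ERROR */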

>
> >
> >  libavfilter/dnn/dnn_backend_native_layer_conv2d.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/libavfilter/dnn/dnn_backend_native_layer_conv2d.c
> > b/libavfilter/dnn/dnn_backend_native_layer_conv2d.c
> > index 94a07c1fdb..941330c895 100644
> > --- a/libavfilter/dnn/dnn_backend_native_layer_conv2d.c
> > +++ b/libavfilter/dnn/dnn_backend_native_layer_conv2d.c
> > @@ -228,6 +228,8 @@ int ff_dnn_execute_layer_conv2d(DnnOperand
> > *operands, const int32_t *input_opera
> >
> >  #if HAVE_PTHREAD_CANCEL
> >  thread_param = av_calloc(thread_num, sizeof(*thread_param));
> > +if (!thread_param)
> > +return DNN_ERROR;
>
> LGTM, thanks.
>
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH V7 4/6] lavu: add side data AV_FRAME_DATA_BOUNDING_BOXES

2021-04-09 Thread Pedro Arthur
On Fri, Apr 9, 2021 at 01:13, Guo, Yejun wrote:
>
>
>
> > -Original Message-
> > From: ffmpeg-devel  On Behalf Of Lynne
> > Sent: April 9, 2021 0:57
> > To: FFmpeg development discussions and patches 
> > Subject: Re: [FFmpeg-devel] [PATCH V7 4/6] lavu: add side data
> > AV_FRAME_DATA_BOUNDING_BOXES
> >
>
> First of all, thanks for the quick replies. I see that all the
> discussions/comments are meant to make this patch better, thank you.
>
> > >> >
> > >> >> >> > +
> > >> >> >> > +typedef struct AVBoundingBoxHeader {
> > >> >> >> > +/**
> > >> >> >> > + * Information about how the bounding box is generated.
> > >> >> >> > + * for example, the DNN model name.
> > >> >> >> > + */
> > >> >> >> > +char source[128];
> > >> >> >> > +
> > >> >> >> > +/**
> > >> >> >> > + * The size of frame when it is detected.
> > >> >> >> > + */
> > >> >> >> > +int frame_width;
> > >> >> >> > +int frame_height;
> > >> >> >> >
> > >> >> >>
> > >> >> >> Why? This side data is attached to AVFrames only, where we
> > >> >> >> already have width and height.
> > >> >> >>
> > >> >> >
> > >> >> > The detection result will be used by other filters, for example,
> > >> >> > dnn_classify (see https://github.com/guoyejun/ffmpeg/tree/classify).
> > >> >> >
> > >> >> > The filter dnn_detect detects all the objects (cat, dog, person 
> > >> >> > ...) in a
> > >> >> > frame, while dnn_classify classifies one detected object (for 
> > >> >> > example,
> > >> person)
> > >> >> > for its attribute (for example, emotion, etc.)
> > >> >> >
> > >> >> > The filter dnn_classify have to check if the frame size is changed 
> > >> >> > after
> > >> >> > it is detected, to handle the below filter chain:
> > >> >> > dnn_detect -> scale -> dnn_classify
> > >> >> >
> > >> >>
> > >> >> This doesn't look good. Why is dnn_classify needing to know
> > >> >> the original frame size at all?
> > >> >>
> > >> >
> > >> > For example, the original size of the frame is 100*100, and dnn_detect
> > >> > detects a face at place (10, 10) -> (30, 40), such data will be saved 
> > >> > in
> > >> > AVBoundingBox.top/left/right/bottom.
> > >> >
> > >> > Then, the frame is scaled into 50*50.
> > >> >
> > >> > Then, dnn_classify is used to analyze the emotion of the face, it 
> > >> > needs to
> > >> > know the frame size (100*100) when it is detected, otherwise, it does 
> > >> > not
> > >> > work with just (10,10), (30,40) and 50*50.
> > >> >
> > >>
> > >> Why can't the scale filter also rescale this side data as well?
> > >>
> > >
> > > I'm afraid that we could not make sure all such filters (including 
> > > filters in the
> > > future) to do the rescale. And in the previous discussion, I got to know 
> > > that
> > > 'many other existing side-data types are invalidated by scaling'.
> > >
> > > So, we need frame_width and frame_height here.
> > >
> >
> > No, you don't. You just need to make sure filters which change resolution
> > or do cropping also change the side data parameters.
> > It's called maintainership. As-is, this won't even work with cropping,
> > only with basic aspect ratio preserving scaling.
> > For the lack of a better term, this is a hack.
>
> As discussed in previous email, for the frame size change case, dnn_classify
> (and other filters which use the detection result, for example drawbox) can
> just output a warning message to tell user what happens, and don't do the
> classification, otherwise, it will give a wrong/weird result which makes the
> user confused.
>
> >
> > I would accept just specifying that if the frame dimensions are
> > altered in any way, the side-data is no longer valid and it's up
> > to users to figure that out by out of bound coordinates.
> > This is what we currently do with video_enc_params.
>
> frame_width/frame_height is not perfect (for the cases such as: scale down
> + crop + scale up to the same size), but it provides more info than the 
> checking
> of 'out of bound coordinates'. There are many other possible issues when the
> coordinates are within the frame.
>
> If we think we'd better not let user get more info from the warning message,
> I'm ok to remove them.
>
> I'll remove them if there's another comment supporting the removal, and
> there's no objection.
>
> >
> >
> > >> >> >> > diff --git a/libavutil/frame.h b/libavutil/frame.h
> > >> >> >> > index a5ed91b20a..41e22de02a 100644
> > >> >> >> > --- a/libavutil/frame.h
> > >> >> >> > +++ b/libavutil/frame.h
> > >> >> >> > @@ -198,6 +198,13 @@ enum AVFrameSideDataType {
> > >> >> >> >  * Must be present for every frame which should have film grain
> > >> applied.
> > >> >> >> >  */
> > >> >> >> >  AV_FRAME_DATA_FILM_GRAIN_PARAMS,
> > >> >> >> > +
> > >> >> >> > +/**
> > >> >> >> > + * Bounding boxes for object detection and classification, 
> > >> >> >> > the
> > >> data is
> > >> >> a
> > >> >> >> AVBoundingBoxHeader
> > >> >> >> > + * followed with an array of AVBoudingBox, and
> > >> >> >> AVBoundingBoxHe
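
For reference, the rescaling being debated above is a plain proportional mapping; a minimal sketch (hypothetical helper, assuming the top/left/right/bottom pixel fields of the proposed AVBoundingBox):

    static void rescale_bounding_box(AVBoundingBox *box,
                                     int src_w, int src_h, int dst_w, int dst_h)
    {
        /* map coordinates from the frame size at detection time
         * to the current frame size */
        box->left   = box->left   * dst_w / src_w;
        box->right  = box->right  * dst_w / src_w;
        box->top    = box->top    * dst_h / src_h;
        box->bottom = box->bottom * dst_h / src_h;
    }

With the quoted numbers, a face at (10,10)-(30,40) in a 100x100 frame maps to (5,5)-(15,20) once the frame is scaled to 50x50.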

Re: [FFmpeg-devel] [PATCH V7 4/6] lavu: add side data AV_FRAME_DATA_BOUNDING_BOXES

2021-04-09 Thread Pedro Arthur
On Fri, Apr 9, 2021 at 12:15, Lynne wrote:
>
> Apr 9, 2021, 16:35 by bygran...@gmail.com:
>
> > On Fri, Apr 9, 2021 at 01:13, Guo, Yejun wrote:
> >
> >>
> >>
> >>
> >> > -Original Message-
> >> > From: ffmpeg-devel  On Behalf Of Lynne
> >> > Sent: April 9, 2021 0:57
> >> > To: FFmpeg development discussions and patches 
> >> > Subject: Re: [FFmpeg-devel] [PATCH V7 4/6] lavu: add side data
> >> > AV_FRAME_DATA_BOUNDING_BOXES
> >> >
> >>
> >> First of all, thanks for the quick replies. I see that all the
> >> discussions/comments are meant to make this patch better, thank you.
> >>
> >> > >> >
> >> > >> >> >> > +
> >> > >> >> >> > +typedef struct AVBoundingBoxHeader {
> >> > >> >> >> > +/**
> >> > >> >> >> > + * Information about how the bounding box is generated.
> >> > >> >> >> > + * for example, the DNN model name.
> >> > >> >> >> > + */
> >> > >> >> >> > +char source[128];
> >> > >> >> >> > +
> >> > >> >> >> > +/**
> >> > >> >> >> > + * The size of frame when it is detected.
> >> > >> >> >> > + */
> >> > >> >> >> > +int frame_width;
> >> > >> >> >> > +int frame_height;
> >> > >> >> >> >
> >> > >> >> >>
> >> > >> >> >> Why? This side data is attached to AVFrames only, where we
> >> > >> >> >> already have width and height.
> >> > >> >> >>
> >> > >> >> >
> >> > >> >> > The detection result will be used by other filters, for example,
> >> > >> >> > dnn_classify (see 
> >> > >> >> > https://github.com/guoyejun/ffmpeg/tree/classify).
> >> > >> >> >
> >> > >> >> > The filter dnn_detect detects all the objects (cat, dog, person 
> >> > >> >> > ...) in a
> >> > >> >> > frame, while dnn_classify classifies one detected object (for 
> >> > >> >> > example,
> >> > >> person)
> >> > >> >> > for its attribute (for example, emotion, etc.)
> >> > >> >> >
> >> > >> >> > The filter dnn_classify have to check if the frame size is 
> >> > >> >> > changed after
> >> > >> >> > it is detected, to handle the below filter chain:
> >> > >> >> > dnn_detect -> scale -> dnn_classify
> >> > >> >> >
> >> > >> >>
> >> > >> >> This doesn't look good. Why is dnn_classify needing to know
> >> > >> >> the original frame size at all?
> >> > >> >>
> >> > >> >
> >> > >> > For example, the original size of the frame is 100*100, and 
> >> > >> > dnn_detect
> >> > >> > detects a face at place (10, 10) -> (30, 40), such data will be 
> >> > >> > saved in
> >> > >> > AVBoundingBox.top/left/right/bottom.
> >> > >> >
> >> > >> > Then, the frame is scaled into 50*50.
> >> > >> >
> >> > >> > Then, dnn_classify is used to analyze the emotion of the face, it 
> >> > >> > needs to
> >> > >> > know the frame size (100*100) when it is detected, otherwise, it 
> >> > >> > does not
> >> > >> > work with just (10,10), (30,40) and 50*50.
> >> > >> >
> >> > >>
> >> > >> Why can't the scale filter also rescale this side data as well?
> >> > >>
> >> > >
> >> > > I'm afraid that we could not make sure all such filters (including 
> >> > > filters in the
> >> > > future) to do the rescale. And in the previous discussion, I got to 
> >> > > know that
> >> > > 'many other existing side-data types are invalidated by scaling'.
> >> > >
> >> > > So, we need frame_width and frame_height here.
> >> > >
> >> >
> >> > No, you don't. You just need to make sure filters which change resolution
> >> > or do cropping also change the side data parameters.
> >> > It's called maintainership. As-is, this won't even work with cropping,
> >> > only with basic aspect ratio preserving scaling.
> >> > For the lack of a better term, this is a hack.
> >>
> >> As discussed in previous email, for the frame size change case, 
> >> dnn_classify
> >> (and other filters which use the detection result, for example drawbox) can
> >> just output a warning message to tell user what happens, and don't do the
> >> classification, otherwise, it will give a wrong/weird result which makes 
> >> the
> >> user confused.
> >>
> >> >
> >> > I would accept just specifying that if the frame dimensions are
> >> > altered in any way, the side-data is no longer valid and it's up
> >> > to users to figure that out by out of bound coordinates.
> >> > This is what we currently do with video_enc_params.
> >>
> >> frame_width/frame_height is not perfect (for the cases such as: scale down
> >> + crop + scale up to the same size), but it provides more info than the 
> >> checking
> >> of 'out of bound coordinates'. There are many other possible issues when 
> >> the
> >> coordinates are within the frame.
> >>
> >> If we think we'd better not let user get more info from the warning 
> >> message,
> >> I'm ok to remove them.
> >>
> >> I'll remove them if there's another comment supporting the removal, and
> >> there's no objection.
> >>
> >> >
> >> >
> >> > >> >> >> > diff --git a/libavutil/frame.h b/libavutil/frame.h
> >> > >> >> >> > index a5ed91b20a..41e22de02a 100644
> >> > >> >> >> > --- a/libavutil/frame.h
> >> > >> >> >> > +++ b/libavutil/frame.h
>

Re: [FFmpeg-devel] [PATCH V7 4/6] lavu: add side data AV_FRAME_DATA_BOUNDING_BOXES

2021-04-11 Thread Pedro Arthur
On Sun, Apr 11, 2021 at 14:53, Nicolas George wrote:
>
> Anton Khirnov (12021-04-11):
> > We are a generic multimedia framework. "the field" for us is multimedia
> > in general, so we should use names meaningful in general multimedia
> > context.
> > I mostly agree with Lynne, "bounding box" is confusing and misleading
> > when this structure is built around object detection and classification.
>
> I agree with both of you. When faced with this kind of choice, we must
> choose the wording that will be as clear and not confusing as possible
> for people familiar with general concepts of video encoding but not
> necessarily familiar with the jargon of any particular sub-field. The
> specialists of the particular subfield are supposed to be more capable
> of adjusting.
>
Personally, "bounding box" is very clear to me, I might be biased.
It seems we are bikeshedding over a naming. I think it is more
constructive if we propose a better name as the only other option
proposed until now seems worst to me.

I think something like "AV_SIDE_DATA_DETECTION_BOUNDING_BOX" is
reasonable, as it conveys what it is and what it is used for.


> Also, I have observed that the jargon of a narrow field is frequently
> made of awful misnomers kept more as cargo cult than for any good
> reason.
>
> Regards,
>
> --
>   Nicolas George
> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH 1/2] dnn: add openvino as one of dnn backend

2020-05-31 Thread Pedro Arthur
Hi,


On Mon, May 25, 2020 at 22:56, Guo, Yejun wrote:
>
> OpenVINO is a Deep Learning Deployment Toolkit at
> https://github.com/openvinotoolkit/openvino, it supports CPU, GPU
> and heterogeneous plugins to accelerate deep learning inferencing.
>
> Please refer to 
> https://github.com/openvinotoolkit/openvino/blob/master/build-instruction.md
> to build openvino (c library is built at the same time). Please add
> option -DENABLE_MKL_DNN=ON for cmake to enable CPU path. The header
> files and libraries are installed to 
> /usr/local/deployment_tools/inference_engine/
> with default options on my system.
>
> To build FFmpeg with openvino, taking my system as an example, run with:
> $ ../ffmpeg/configure --enable-libopenvino 
> --extra-cflags=-I/usr/local/deployment_tools/inference_engine/include/ 
> --extra-ldflags=-L/usr/local/deployment_tools/inference_engine/lib/intel64
> $ make
>
> As dnn module maintainer, I do want to see it utilized by customers,
> so the dnn module can be improved in the right direction with
> developers/customers

I agree with you, yet it is not clear to me what the right direction is.
Currently we have the native and tensorflow backends; does OpenVINO
bring something our current backends lack?

Reading the docs I see a few points (and questions) that may speak for
OpenVINO. If you can confirm them, I think it would be worth adding
another backend.
* It has a dedicated inference engine and it can optimize a model for
inference, thus speeding it up
* It can convert from various common model formats
* It supports CPU and GPU out of the box; TF also supports GPU, but only
CUDA-capable ones, and it needs different installations of the library
(one for CPU and another for GPU)
Does the OpenVINO CPU backend run well on non-Intel CPUs? I mean it does
not need to be equally good, but at least decent.
Does GPU support run on non-Intel GPUs? I think this is a really
important point; it seems it is using OpenCL, so if it can run on any
OpenCL-capable GPU it would be a great upgrade over TF.


>
> collaboration, but I seldom receive feedback.
>
> On the other hand, I know that there are video analytics projects
> accepted by customers based on FFmpeg + openvino, see more detail
Being used is a good point, but I think there must be some improvement
over our current backends to justify it; otherwise one may ask why not
add any other dnn library from the huge list of 'yet another dnn
library'.
In short, I think it is a good addition if you can confirm the above points.

> at https://github.com/VCDP/FFmpeg-patch, but the code bypasses the
> dnn interface layer and could not be upstreamed directly.
>
> So, I introduce openvino as one of the dnn backend as a preparation
> for later usage.
>
> Signed-off-by: Guo, Yejun 
> ---
>  configure  |   6 +-
>  libavfilter/dnn/Makefile   |   1 +
>  libavfilter/dnn/dnn_backend_openvino.c | 261 
> +
>  libavfilter/dnn/dnn_backend_openvino.h |  38 +
>  libavfilter/dnn/dnn_interface.c|  11 ++
>  libavfilter/dnn_interface.h|   2 +-
>  6 files changed, 317 insertions(+), 2 deletions(-)
>  create mode 100644 libavfilter/dnn/dnn_backend_openvino.c
>  create mode 100644 libavfilter/dnn/dnn_backend_openvino.h
>
> diff --git a/configure b/configure
> index f97cad0..6a50351 100755
> --- a/configure
> +++ b/configure
> @@ -253,6 +253,8 @@ External library support:
>--enable-libopenh264 enable H.264 encoding via OpenH264 [no]
>--enable-libopenjpeg enable JPEG 2000 de/encoding via OpenJPEG [no]
>--enable-libopenmpt  enable decoding tracked files via libopenmpt [no]
> +  --enable-libopenvino enable OpenVINO as a DNN module backend
> +   for DNN based filters like dnn_processing [no]
>--enable-libopus enable Opus de/encoding via libopus [no]
>--enable-libpulseenable Pulseaudio input via libpulse [no]
>--enable-librabbitmq enable RabbitMQ library [no]
> @@ -1790,6 +1792,7 @@ EXTERNAL_LIBRARY_LIST="
>  libopenh264
>  libopenjpeg
>  libopenmpt
> +libopenvino
>  libopus
>  libpulse
>  librabbitmq
> @@ -2620,7 +2623,7 @@ cbs_mpeg2_select="cbs"
>  cbs_vp9_select="cbs"
>  dct_select="rdft"
>  dirac_parse_select="golomb"
> -dnn_suggest="libtensorflow"
> +dnn_suggest="libtensorflow libopenvino"
>  error_resilience_select="me_cmp"
>  faandct_deps="faan"
>  faandct_select="fdctdsp"
> @@ -6346,6 +6349,7 @@ enabled libopenh264   && require_pkg_config 
> libopenh264 openh264 wels/codec_
>  enabled libopenjpeg   && { check_pkg_config libopenjpeg "libopenjp2 >= 
> 2.1.0" openjpeg.h opj_version ||
> { require_pkg_config libopenjpeg "libopenjp2 
> >= 2.1.0" openjpeg.h opj_version -DOPJ_STATIC && add_cppflags -DOPJ_STATIC; } 
> }
>  enabled libopenmpt&& require_pkg_config libopenmpt "libopenmpt >= 
> 0.2.6557" libopenmpt/libopenmpt.h 

Re: [FFmpeg-devel] [PATCH V2 2/2] vf_dnn_processing.c: add dnn backend openvino

2020-06-28 Thread Pedro Arthur
Hi,

On Wed, Jun 24, 2020 at 03:40, Guo, Yejun wrote:

>
>
> > -Original Message-
> > From: Guo, Yejun 
> > Sent: June 11, 2020 21:01
> > To: ffmpeg-devel@ffmpeg.org
> > Cc: Guo, Yejun 
> > Subject: [PATCH V2 2/2] vf_dnn_processing.c: add dnn backend openvino
> >
> > We can try with the srcnn model from sr filter.
> > 1) get srcnn.pb model file, see filter sr
> > 2) convert srcnn.pb into openvino model with command:
> > python mo_tf.py --input_model srcnn.pb --data_type=FP32 --input_shape
> > [1,960,1440,1] --keep_shape_ops
> >
> > See the script at
> > https://github.com/openvinotoolkit/openvino/tree/master/model-optimizer
> > We'll see srcnn.xml and srcnn.bin at current path, copy them to the
> directory
> > where ffmpeg is.
> >
> > I have also uploaded the model files at
> > https://github.com/guoyejun/dnn_processing/tree/master/models
> >
> > 3) run with openvino backend:
> > ffmpeg -i input.jpg -vf
> > format=yuv420p,scale=w=iw*2:h=ih*2,dnn_processing=dnn_backend=openvino
> > :model=srcnn.xml:input=x:output=srcnn/Maximum -y srcnn.ov.jpg (The
> > input.jpg resolution is 720*480)
> >
> > Signed-off-by: Guo, Yejun 
> > ---
> >  doc/filters.texi| 10 +-
> >  libavfilter/vf_dnn_processing.c |  5 -
> >  2 files changed, 13 insertions(+), 2 deletions(-)
>
> any comment for this patch set? thanks.
>
It would be nice if you included some benchmark numbers comparing it with
the other backends.
Rest LGTM, thanks!
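
For example, something along the lines of the following, reusing the command from the commit message (-benchmark prints utime/rtime at the end of the run):

    ffmpeg -benchmark -i input.jpg -vf format=yuv420p,scale=w=iw*2:h=ih*2,dnn_processing=dnn_backend=openvino:model=srcnn.xml:input=x:output=srcnn/Maximum -f null -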

> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH V2 2/3] dnn: change .model file format to put layer number at the end of file

2019-08-30 Thread Pedro Arthur
On Thu, Aug 29, 2019 at 02:58, Guo, Yejun wrote:
>
> currently, the layer number is at the beginning of the .model file,
> so we have to scan twice in the python script, the first scan to get the
> layer number. Only one scan is needed after putting the layer number at the
> end of the .model file.
>
> Signed-off-by: Guo, Yejun 
> ---
>  libavfilter/dnn/dnn_backend_native.c|  2 ++
>  tools/python/convert_from_tensorflow.py | 12 +---
>  2 files changed, 3 insertions(+), 11 deletions(-)
>
> diff --git a/libavfilter/dnn/dnn_backend_native.c 
> b/libavfilter/dnn/dnn_backend_native.c
> index daa4f50..5d39353 100644
> --- a/libavfilter/dnn/dnn_backend_native.c
> +++ b/libavfilter/dnn/dnn_backend_native.c
> @@ -93,8 +93,10 @@ DNNModel *ff_dnn_load_model_native(const char 
> *model_filename)
>  }
>  model->model = (void *)network;
>
> +avio_seek(model_file_context, file_size - 4, SEEK_SET);
>  network->layers_num = (int32_t)avio_rl32(model_file_context);
>  dnn_size = 4;
> +avio_seek(model_file_context, 0, SEEK_SET);
>
>  network->layers = av_mallocz(network->layers_num * sizeof(Layer));
>  if (!network->layers){
> diff --git a/tools/python/convert_from_tensorflow.py 
> b/tools/python/convert_from_tensorflow.py
> index 34454b8..cbc76a9 100644
> --- a/tools/python/convert_from_tensorflow.py
> +++ b/tools/python/convert_from_tensorflow.py
> @@ -129,15 +129,6 @@ class TFConverter:
>  self.converted_nodes.add(node.name)
>
>
> -def generate_layer_number(self):
> -# in current hard code implementation, the layer number is the first 
> data written to the native model file
> -# it is not easy to know it at the beginning time in the general 
> converter, so first do a dry run for compatibility
> -# will be refined later.
> -with open('/tmp/tmp.model', 'wb') as f:
> -self.dump_layers_to_file(f)
> -self.converted_nodes.clear()
> -
> -
>  def dump_layers_to_file(self, f):
>  for node in self.nodes:
>  if node.name in self.converted_nodes:
> @@ -157,10 +148,9 @@ class TFConverter:
>
>
>  def dump_to_file(self):
> -self.generate_layer_number()
>  with open(self.outfile, 'wb') as f:
> -np.array([self.layer_number], dtype=np.uint32).tofile(f)
>  self.dump_layers_to_file(f)
> +np.array([self.layer_number], dtype=np.uint32).tofile(f)
>
>
>  def generate_name_node_dict(self):
> --
> 2.7.4

Pushed, thanks!

>
> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH V2 1/3] dnn: introduce dnn operand (in c code) to hold operand infos within network

2019-08-30 Thread Pedro Arthur
On Thu, Aug 29, 2019 at 02:57, Guo, Yejun wrote:
>
> the info can be saved in the dnn operand object without regenerating it
> again and again,
> and it is also needed for layer split/merge, and for memory reuse.
>
> to take things step by step, this patch just focuses on the c code;
> the change within the python script will be added later.
>
> Signed-off-by: Guo, Yejun 
> ---
>  libavfilter/dnn/dnn_backend_native.c   | 226 
> -
>  libavfilter/dnn/dnn_backend_native.h   |  54 +-
>  libavfilter/dnn/dnn_backend_native_layer_pad.c |  24 ++-
>  libavfilter/dnn/dnn_backend_native_layer_pad.h |   4 +-
>  tests/dnn/Makefile |   2 +-
>  tests/dnn/dnn-layer-pad-test.c |  60 +--
>  6 files changed, 236 insertions(+), 134 deletions(-)
>
> diff --git a/libavfilter/dnn/dnn_backend_native.c 
> b/libavfilter/dnn/dnn_backend_native.c
> index d52abc6..daa4f50 100644
> --- a/libavfilter/dnn/dnn_backend_native.c
> +++ b/libavfilter/dnn/dnn_backend_native.c
> @@ -30,77 +30,30 @@
>  static DNNReturnType set_input_output_native(void *model, DNNInputData 
> *input, const char *input_name, const char **output_names, uint32_t nb_output)
>  {
>  ConvolutionalNetwork *network = (ConvolutionalNetwork *)model;
> -InputParams *input_params;
> -ConvolutionalParams *conv_params;
> -DepthToSpaceParams *depth_to_space_params;
> -LayerPadParams *pad_params;
> -int cur_width, cur_height, cur_channels;
> -int32_t layer;
>
> -if (network->layers_num <= 0 || network->layers[0].type != INPUT){
> +if (network->layers_num <= 0 || network->operands_num <= 0)
>  return DNN_ERROR;
> -}
> -else{
> -input_params = (InputParams *)network->layers[0].params;
> -input_params->width = cur_width = input->width;
> -input_params->height = cur_height = input->height;
> -input_params->channels = cur_channels = input->channels;
> -if (input->data){
> -av_freep(&input->data);
> -}
> -av_assert0(input->dt == DNN_FLOAT);
> -network->layers[0].output = input->data = av_malloc(cur_height * 
> cur_width * cur_channels * sizeof(float));
> -if (!network->layers[0].output){
> -return DNN_ERROR;
> -}
> -}
> -
> -for (layer = 1; layer < network->layers_num; ++layer){
> -switch (network->layers[layer].type){
> -case CONV:
> -conv_params = (ConvolutionalParams 
> *)network->layers[layer].params;
> -if (conv_params->input_num != cur_channels){
> -return DNN_ERROR;
> -}
> -cur_channels = conv_params->output_num;
> -
> -if (conv_params->padding_method == VALID) {
> -int pad_size = (conv_params->kernel_size - 1) * 
> conv_params->dilation;
> -cur_height -= pad_size;
> -cur_width -= pad_size;
> -}
> -break;
> -case DEPTH_TO_SPACE:
> -depth_to_space_params = (DepthToSpaceParams 
> *)network->layers[layer].params;
> -if (cur_channels % (depth_to_space_params->block_size * 
> depth_to_space_params->block_size) != 0){
> -return DNN_ERROR;
> -}
> -cur_channels = cur_channels / (depth_to_space_params->block_size 
> * depth_to_space_params->block_size);
> -cur_height *= depth_to_space_params->block_size;
> -cur_width *= depth_to_space_params->block_size;
> -break;
> -case MIRROR_PAD:
> -pad_params = (LayerPadParams *)network->layers[layer].params;
> -cur_height = cur_height + pad_params->paddings[1][0] + 
> pad_params->paddings[1][1];
> -cur_width = cur_width + pad_params->paddings[2][0] + 
> pad_params->paddings[2][1];
> -cur_channels = cur_channels + pad_params->paddings[3][0] + 
> pad_params->paddings[3][1];
> -break;
> -default:
> -return DNN_ERROR;
> -}
> -if (network->layers[layer].output){
> -av_freep(&network->layers[layer].output);
> -}
> -
> -if (cur_height <= 0 || cur_width <= 0)
> -return DNN_ERROR;
>
> -network->layers[layer].output = av_malloc(cur_height * cur_width * 
> cur_channels * sizeof(float));
> -if (!network->layers[layer].output){
> -return DNN_ERROR;
> -}
> -}
> +av_assert0(input->dt == DNN_FLOAT);
> +
> +/**
> + * as the first step, suppose network->operands[0] is the input operand.
> + */
> +network->operands[0].dims[0] = 1;
> +network->operands[0].dims[1] = input->height;
> +network->operands[0].dims[2] = input->width;
> +network->operands[0].dims[3] = input->channels;
> +network->operands[0].type = DOT_INPUT;
> +network->operands[0].data_type = DNN_FLOAT;
> +network->operands[0].isNHWC = 1;
> +
> +av_freep(&network->o
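
For reference, with the NHWC layout set up above, the operand buffer size is just the product of the four dims; a sketch of what calculate_operand_data_length() amounts to, assuming float data as asserted:

    /* number * height * width * channels floats */
    length = operand->dims[0] * operand->dims[1] *
             operand->dims[2] * operand->dims[3] * sizeof(float);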

Re: [FFmpeg-devel] [PATCH 3/3] dnn: export operand info in python script and load in c code

2019-08-30 Thread Pedro Arthur
On Thu, Aug 29, 2019 at 02:42, Guo, Yejun wrote:
>
>
>
> > -Original Message-
> > From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> > Pedro Arthur
> > Sent: Tuesday, August 27, 2019 10:46 PM
> > To: FFmpeg development discussions and patches 
> > Subject: Re: [FFmpeg-devel] [PATCH 3/3] dnn: export operand info in python
> > script and load in c code
> >
> > hi,
> >
> > On Tue, Aug 20, 2019 at 05:54, Guo, Yejun wrote:
> > >
> > > Signed-off-by: Guo, Yejun 
> > > ---
> > >  libavfilter/dnn/dnn_backend_native.c|  49 +++---
> > >  libavfilter/dnn/dnn_backend_native.h|   2 +-
> > >  libavfilter/dnn_interface.h |   2 +-
> > >  tools/python/convert_from_tensorflow.py | 111
> > +---
> > >  4 files changed, 142 insertions(+), 22 deletions(-)
> > >
> > > diff --git a/libavfilter/dnn/dnn_backend_native.c
> > b/libavfilter/dnn/dnn_backend_native.c
> > > index 0ba4e44..eeae711 100644
> > > --- a/libavfilter/dnn/dnn_backend_native.c
> > > +++ b/libavfilter/dnn/dnn_backend_native.c
> > > @@ -72,7 +72,6 @@ DNNModel *ff_dnn_load_model_native(const char
> > *model_filename)
> > >  ConvolutionalParams *conv_params;
> > >  DepthToSpaceParams *depth_to_space_params;
> > >  LayerPadParams *pad_params;
> > > -int32_t operand_index = 0;
> > >
> > >  model = av_malloc(sizeof(DNNModel));
> > >  if (!model){
> > > @@ -93,9 +92,10 @@ DNNModel *ff_dnn_load_model_native(const char
> > *model_filename)
> > >  }
> > >  model->model = (void *)network;
> > >
> > > -avio_seek(model_file_context, file_size - 4, SEEK_SET);
> > > +avio_seek(model_file_context, file_size - 8, SEEK_SET);
> > >  network->layers_num = (int32_t)avio_rl32(model_file_context);
> > > -dnn_size = 4;
> > > +network->operands_num = (int32_t)avio_rl32(model_file_context);
> > > +dnn_size = 8;
> > >  avio_seek(model_file_context, 0, SEEK_SET);
> > >
> > I think it is worth adding some means to assert the input file is
> > indeed a dnn file; the code as is may alloc an undefined amount of
> > memory if the file passed is malformed or corrupted.
> > Maybe add a magic number + the file size (or something else) at the
> > beginning of the file and skip parsing early if it does not match?
> > However, it may require two passes to generate the file, which goes
> > against your previous patch.
> >
> > Otherwise I can push it as is, as this behavior was already there
> > before the patch.
>
> good point, how about adding "FFMPEGDNNNATIVE" + version_number at the
> beginning of the file,
> or we can use another magic number instead of "FFMPEGDNNNATIVE". Once we
> change the model file
> format, the version_number should be increased. I can send a new patch after
> this patch set is pushed.
>
I was thinking of using a single dword but anything will do.


Patch pushed, thanks!
> I think it doesn't matter whether the info is put at the beginning or at the
> end of the file; avio_seek
> does not alloc memory. And the layers_num and operands_num serve a similar
> purpose to file_size.
>
> >
> > >  network->layers = av_mallocz(network->layers_num * sizeof(Layer));
> > > @@ -105,11 +105,6 @@ DNNModel *ff_dnn_load_model_native(const char
> > *model_filename)
> > >  return NULL;
> > >  }
> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH] libavfilter/dnn: add header into native model file

2019-09-04 Thread Pedro Arthur
LGTM

Pushed, thanks!

On Mon, Sep 2, 2019 at 01:40, Guo, Yejun wrote:
>
> Signed-off-by: Guo, Yejun 
> ---
>  libavfilter/dnn/dnn_backend_native.c| 43 
> +++--
>  tools/python/convert_from_tensorflow.py |  3 +++
>  tools/python/convert_header.py  | 26 
>  3 files changed, 70 insertions(+), 2 deletions(-)
>  create mode 100644 tools/python/convert_header.py
>
> diff --git a/libavfilter/dnn/dnn_backend_native.c 
> b/libavfilter/dnn/dnn_backend_native.c
> index 8b05bec..f56cd81 100644
> --- a/libavfilter/dnn/dnn_backend_native.c
> +++ b/libavfilter/dnn/dnn_backend_native.c
> @@ -64,6 +64,10 @@ static DNNReturnType set_input_output_native(void *model, 
> DNNInputData *input, c
>  DNNModel *ff_dnn_load_model_native(const char *model_filename)
>  {
>  DNNModel *model = NULL;
> +char header_expected[] = "FFMPEGDNNNATIVE";
> +char *buf;
> +size_t size;
> +int version, header_size, major_version_expected = 0;
>  ConvolutionalNetwork *network = NULL;
>  AVIOContext *model_file_context;
>  int file_size, dnn_size, kernel_size, i;
> @@ -84,6 +88,41 @@ DNNModel *ff_dnn_load_model_native(const char 
> *model_filename)
>  }
>  file_size = avio_size(model_file_context);
>
> +/**
> + * check file header with string and version
> + */
> +size = sizeof(header_expected);
> +buf = av_malloc(size);
> +if (!buf) {
> +avio_closep(&model_file_context);
> +av_freep(&model);
> +return NULL;
> +}
> +
> +// size - 1 to skip the ending '\0' which is not saved in file
> +avio_get_str(model_file_context, size - 1, buf, size);
> +dnn_size = size - 1;
> +if (strncmp(buf, header_expected, size) != 0) {
> +av_freep(&buf);
> +avio_closep(&model_file_context);
> +av_freep(&model);
> +return NULL;
> +}
> +av_freep(&buf);
> +
> +version = (int32_t)avio_rl32(model_file_context);
> +dnn_size += 4;
> +if (version != major_version_expected) {
> +avio_closep(&model_file_context);
> +av_freep(&model);
> +return NULL;
> +}
> +
> +// currently no need to check minor version
> +version = (int32_t)avio_rl32(model_file_context);
> +dnn_size += 4;
> +header_size = dnn_size;
> +
>  network = av_mallocz(sizeof(ConvolutionalNetwork));
>  if (!network){
>  avio_closep(&model_file_context);
> @@ -95,8 +134,8 @@ DNNModel *ff_dnn_load_model_native(const char 
> *model_filename)
>  avio_seek(model_file_context, file_size - 8, SEEK_SET);
>  network->layers_num = (int32_t)avio_rl32(model_file_context);
>  network->operands_num = (int32_t)avio_rl32(model_file_context);
> -dnn_size = 8;
> -avio_seek(model_file_context, 0, SEEK_SET);
> +dnn_size += 8;
> +avio_seek(model_file_context, header_size, SEEK_SET);
>
>  network->layers = av_mallocz(network->layers_num * sizeof(Layer));
>  if (!network->layers){
> diff --git a/tools/python/convert_from_tensorflow.py 
> b/tools/python/convert_from_tensorflow.py
> index bab11a5..1437ad3 100644
> --- a/tools/python/convert_from_tensorflow.py
> +++ b/tools/python/convert_from_tensorflow.py
> @@ -20,6 +20,7 @@
>  import tensorflow as tf
>  import numpy as np
>  import sys, struct
> +import convert_header as header
>
>  __all__ = ['convert_from_tensorflow']
>
> @@ -229,6 +230,8 @@ class TFConverter:
>
>  def dump_to_file(self):
>  with open(self.outfile, 'wb') as f:
> +f.write(header.str.encode('utf-8'))
> +np.array([header.major, header.minor], dtype=np.uint32).tofile(f)
>  self.dump_layers_to_file(f)
>  self.dump_operands_to_file(f)
>  np.array([self.layer_number, len(self.name_operand_dict)], 
> dtype=np.uint32).tofile(f)
> diff --git a/tools/python/convert_header.py b/tools/python/convert_header.py
> new file mode 100644
> index 000..6a7e4af
> --- /dev/null
> +++ b/tools/python/convert_header.py
> @@ -0,0 +1,26 @@
> +# Copyright (c) 2019
> +#
> +# This file is part of FFmpeg.
> +#
> +# FFmpeg is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU Lesser General Public
> +# License as published by the Free Software Foundation; either
> +# version 2.1 of the License, or (at your option) any later version.
> +#
> +# FFmpeg is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +# Lesser General Public License for more details.
> +#
> +# You should have received a copy of the GNU Lesser General Public
> +# License along with FFmpeg; if not, write to the Free Software
> +# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 
> USA
> +# 
> ==
> +
> +str = 'FFMPEGDNNNATIVE'
> +
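
Putting this together with the earlier trailer change, the native model file layout becomes, sketched:

    'FFMPEGDNNNATIVE'            15 bytes, no trailing '\0'
    major, minor                 2 x uint32, little-endian
    layers..., operands...       model body
    layers_num, operands_num     2 x uint32 in the last 8 bytes of the file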

Re: [FFmpeg-devel] [PATCH 3/4] libavfilter/dnn: separate depth_to_space layer from dnn_backend_native.c to a new file

2019-09-19 Thread Pedro Arthur
LGTM

Pushed, thanks!

On Thu, Sep 5, 2019 at 03:05, Guo, Yejun wrote:
>
> the logic is to have one layer per separate source file, to keep
> the source files simple to maintain.
>
> Signed-off-by: Guo, Yejun 
> ---
>  libavfilter/dnn/Makefile   |  1 +
>  libavfilter/dnn/dnn_backend_native.c   | 44 +-
>  libavfilter/dnn/dnn_backend_native.h   |  4 --
>  .../dnn/dnn_backend_native_layer_depth2space.c | 71 
> ++
>  .../dnn/dnn_backend_native_layer_depth2space.h | 39 
>  libavfilter/dnn/dnn_backend_tf.c   |  1 +
>  6 files changed, 113 insertions(+), 47 deletions(-)
>  create mode 100644 libavfilter/dnn/dnn_backend_native_layer_depth2space.c
>  create mode 100644 libavfilter/dnn/dnn_backend_native_layer_depth2space.h
>
> diff --git a/libavfilter/dnn/Makefile b/libavfilter/dnn/Makefile
> index 40b848b..63a35e7 100644
> --- a/libavfilter/dnn/Makefile
> +++ b/libavfilter/dnn/Makefile
> @@ -2,6 +2,7 @@ OBJS-$(CONFIG_DNN)   += 
> dnn/dnn_interface.o
>  OBJS-$(CONFIG_DNN)   += dnn/dnn_backend_native.o
>  OBJS-$(CONFIG_DNN)   += 
> dnn/dnn_backend_native_layer_pad.o
>  OBJS-$(CONFIG_DNN)   += 
> dnn/dnn_backend_native_layer_conv2d.o
> +OBJS-$(CONFIG_DNN)   += 
> dnn/dnn_backend_native_layer_depth2space.o
>
>  DNN-OBJS-$(CONFIG_LIBTENSORFLOW) += dnn/dnn_backend_tf.o
>
> diff --git a/libavfilter/dnn/dnn_backend_native.c 
> b/libavfilter/dnn/dnn_backend_native.c
> index 5dabd15..be548c6 100644
> --- a/libavfilter/dnn/dnn_backend_native.c
> +++ b/libavfilter/dnn/dnn_backend_native.c
> @@ -27,6 +27,7 @@
>  #include "libavutil/avassert.h"
>  #include "dnn_backend_native_layer_pad.h"
>  #include "dnn_backend_native_layer_conv2d.h"
> +#include "dnn_backend_native_layer_depth2space.h"
>
>  static DNNReturnType set_input_output_native(void *model, DNNInputData 
> *input, const char *input_name, const char **output_names, uint32_t nb_output)
>  {
> @@ -282,49 +283,6 @@ DNNModel *ff_dnn_load_model_native(const char 
> *model_filename)
>  return model;
>  }
>
> -static int depth_to_space(DnnOperand *operands, const int32_t 
> *input_operand_indexes, int32_t output_operand_index, int block_size)
> -{
> -float *output;
> -int32_t input_operand_index = input_operand_indexes[0];
> -int number = operands[input_operand_index].dims[0];
> -int height = operands[input_operand_index].dims[1];
> -int width = operands[input_operand_index].dims[2];
> -int channels = operands[input_operand_index].dims[3];
> -const float *input = operands[input_operand_index].data;
> -
> -int y, x, by, bx, ch;
> -int new_channels = channels / (block_size * block_size);
> -int output_linesize = width * channels;
> -int by_linesize = output_linesize / block_size;
> -int x_linesize = new_channels * block_size;
> -
> -DnnOperand *output_operand = &operands[output_operand_index];
> -output_operand->dims[0] = number;
> -output_operand->dims[1] = height * block_size;
> -output_operand->dims[2] = width * block_size;
> -output_operand->dims[3] = new_channels;
> -output_operand->length = calculate_operand_data_length(output_operand);
> -output_operand->data = av_realloc(output_operand->data, 
> output_operand->length);
> -if (!output_operand->data)
> -return -1;
> -output = output_operand->data;
> -
> -for (y = 0; y < height; ++y){
> -for (x = 0; x < width; ++x){
> -for (by = 0; by < block_size; ++by){
> -for (bx = 0; bx < block_size; ++bx){
> -for (ch = 0; ch < new_channels; ++ch){
> -output[by * by_linesize + x * x_linesize + bx * 
> new_channels + ch] = input[ch];
> -}
> -input += new_channels;
> -}
> -}
> -}
> -output += output_linesize;
> -}
> -return 0;
> -}
> -
>  DNNReturnType ff_dnn_execute_model_native(const DNNModel *model, DNNData 
> *outputs, uint32_t nb_output)
>  {
>  ConvolutionalNetwork *network = (ConvolutionalNetwork *)model->model;
> diff --git a/libavfilter/dnn/dnn_backend_native.h 
> b/libavfilter/dnn/dnn_backend_native.h
> index aa5..a74d138 100644
> --- a/libavfilter/dnn/dnn_backend_native.h
> +++ b/libavfilter/dnn/dnn_backend_native.h
> @@ -90,10 +90,6 @@ typedef struct InputParams{
>  int height, width, channels;
>  } InputParams;
>
> -typedef struct DepthToSpaceParams{
> -int block_size;
> -} DepthToSpaceParams;
> -
>  // Represents simple feed-forward convolutional network.
>  typedef struct ConvolutionalNetwork{
>  Layer *layers;
> diff --git a/libavfilter/dnn/dnn_backend_native_layer_depth2space.c 
> b/libavfilter/dnn/dnn_backend_native_layer_depth2space.c
> new file mode 100644
> index 000..a2
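
For a concrete shape: with block_size 2 the layer maps a 1x5x3x4 NHWC tensor to 1x10x6x1 (H and W doubled, channels divided by block_size^2), which is exactly what the FATE test in patch 4/4 exercises.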

Re: [FFmpeg-devel] [PATCH 1/4] libavfilter/dnn: separate conv2d layer from dnn_backend_native.c to a new file

2019-09-19 Thread Pedro Arthur
On Thu, Sep 5, 2019 at 03:05, Guo, Yejun wrote:
>
> the logic is to have one layer per separate source file, to keep
> the source files simple to maintain.
>
> Signed-off-by: Guo, Yejun 
> ---
>  libavfilter/dnn/Makefile  |   1 +
>  libavfilter/dnn/dnn_backend_native.c  |  80 +
>  libavfilter/dnn/dnn_backend_native.h  |  13 ---
>  libavfilter/dnn/dnn_backend_native_layer_conv2d.c | 101 
> ++
>  libavfilter/dnn/dnn_backend_native_layer_conv2d.h |  39 +
>  libavfilter/dnn/dnn_backend_tf.c  |   1 +
>  6 files changed, 143 insertions(+), 92 deletions(-)
>  create mode 100644 libavfilter/dnn/dnn_backend_native_layer_conv2d.c
>  create mode 100644 libavfilter/dnn/dnn_backend_native_layer_conv2d.h
>
> diff --git a/libavfilter/dnn/Makefile b/libavfilter/dnn/Makefile
> index 83938e5..40b848b 100644
> --- a/libavfilter/dnn/Makefile
> +++ b/libavfilter/dnn/Makefile
> @@ -1,6 +1,7 @@
>  OBJS-$(CONFIG_DNN)   += dnn/dnn_interface.o
>  OBJS-$(CONFIG_DNN)   += dnn/dnn_backend_native.o
>  OBJS-$(CONFIG_DNN)   += 
> dnn/dnn_backend_native_layer_pad.o
> +OBJS-$(CONFIG_DNN)   += 
> dnn/dnn_backend_native_layer_conv2d.o
>
>  DNN-OBJS-$(CONFIG_LIBTENSORFLOW) += dnn/dnn_backend_tf.o
>
> diff --git a/libavfilter/dnn/dnn_backend_native.c 
> b/libavfilter/dnn/dnn_backend_native.c
> index f56cd81..5dabd15 100644
> --- a/libavfilter/dnn/dnn_backend_native.c
> +++ b/libavfilter/dnn/dnn_backend_native.c
> @@ -26,6 +26,7 @@
>  #include "dnn_backend_native.h"
>  #include "libavutil/avassert.h"
>  #include "dnn_backend_native_layer_pad.h"
> +#include "dnn_backend_native_layer_conv2d.h"
>
>  static DNNReturnType set_input_output_native(void *model, DNNInputData 
> *input, const char *input_name, const char **output_names, uint32_t nb_output)
>  {
> @@ -281,85 +282,6 @@ DNNModel *ff_dnn_load_model_native(const char 
> *model_filename)
>  return model;
>  }
>
> -#define CLAMP_TO_EDGE(x, w) ((x) < 0 ? 0 : ((x) >= (w) ? (w - 1) : (x)))
> -
> -static int convolve(DnnOperand *operands, const int32_t 
> *input_operand_indexes, int32_t output_operand_index, const 
> ConvolutionalParams *conv_params)
> -{
> -float *output;
> -int32_t input_operand_index = input_operand_indexes[0];
> -int number = operands[input_operand_index].dims[0];
> -int height = operands[input_operand_index].dims[1];
> -int width = operands[input_operand_index].dims[2];
> -int channel = operands[input_operand_index].dims[3];
> -const float *input = operands[input_operand_index].data;
> -
> -int radius = conv_params->kernel_size >> 1;
> -int src_linesize = width * conv_params->input_num;
> -int filter_linesize = conv_params->kernel_size * conv_params->input_num;
> -int filter_size = conv_params->kernel_size * filter_linesize;
> -int pad_size = (conv_params->padding_method == VALID) ? 
> (conv_params->kernel_size - 1) / 2 * conv_params->dilation : 0;
> -
> -DnnOperand *output_operand = &operands[output_operand_index];
> -output_operand->dims[0] = number;
> -output_operand->dims[1] = height - pad_size * 2;
> -output_operand->dims[2] = width - pad_size * 2;
> -output_operand->dims[3] = conv_params->output_num;
> -output_operand->length = calculate_operand_data_length(output_operand);
> -output_operand->data = av_realloc(output_operand->data, 
> output_operand->length);
> -if (!output_operand->data)
> -return -1;
> -output = output_operand->data;
> -
> -av_assert0(channel == conv_params->input_num);
> -
> -for (int y = pad_size; y < height - pad_size; ++y) {
> -for (int x = pad_size; x < width - pad_size; ++x) {
> -for (int n_filter = 0; n_filter < conv_params->output_num; 
> ++n_filter) {
> -output[n_filter] = conv_params->biases[n_filter];
> -
> -for (int ch = 0; ch < conv_params->input_num; ++ch) {
> -for (int kernel_y = 0; kernel_y < 
> conv_params->kernel_size; ++kernel_y) {
> -for (int kernel_x = 0; kernel_x < 
> conv_params->kernel_size; ++kernel_x) {
> -float input_pel;
> -if (conv_params->padding_method == 
> SAME_CLAMP_TO_EDGE) {
> -int y_pos = CLAMP_TO_EDGE(y + (kernel_y - 
> radius) * conv_params->dilation, height);
> -int x_pos = CLAMP_TO_EDGE(x + (kernel_x - 
> radius) * conv_params->dilation, width);
> -input_pel = input[y_pos * src_linesize + 
> x_pos * conv_params->input_num + ch];
> -} else {
> -int y_pos = y + (kernel_y - radius) * 
> conv_params->dilation;
> -int x_pos = x + (kernel_x - radius) * 
> co
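
Worked through, the VALID-padding arithmetic above gives pad_size = (3 - 1) / 2 * 1 = 1 for a 3x3 kernel with dilation 1, so a 5x6 input shrinks to a 3x4 output, while the SAME variants keep pad_size at 0 and preserve the spatial dims.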

Re: [FFmpeg-devel] [PATCH 2/4] FATE/dnn: add unit test for dnn conv2d layer

2019-09-19 Thread Pedro Arthur
On Thu, Sep 5, 2019 at 03:05, Guo, Yejun wrote:
>
> 'make fate-dnn-layer-conv2d' to run the test
>
> Signed-off-by: Guo, Yejun 
> ---
>  tests/dnn/Makefile|   1 +
>  tests/dnn/dnn-layer-conv2d-test.c | 238 
> ++
>  tests/fate/dnn.mak|   5 +
>  3 files changed, 244 insertions(+)
>  create mode 100644 tests/dnn/dnn-layer-conv2d-test.c
>
> diff --git a/tests/dnn/Makefile b/tests/dnn/Makefile
> index fabed75..3adefe8 100644
> --- a/tests/dnn/Makefile
> +++ b/tests/dnn/Makefile
> @@ -1,4 +1,5 @@
>  DNNTESTPROGS += dnn-layer-pad
> +DNNTESTPROGS += dnn-layer-conv2d
>
>  DNNTESTOBJS  := $(DNNTESTOBJS:%=$(DNNTESTSDIR)%) 
> $(DNNTESTPROGS:%=$(DNNTESTSDIR)/%-test.o)
>  DNNTESTPROGS := $(DNNTESTPROGS:%=$(DNNTESTSDIR)/%-test$(EXESUF))
> diff --git a/tests/dnn/dnn-layer-conv2d-test.c 
> b/tests/dnn/dnn-layer-conv2d-test.c
> new file mode 100644
> index 000..afc5391
> --- /dev/null
> +++ b/tests/dnn/dnn-layer-conv2d-test.c
> @@ -0,0 +1,238 @@
> +/*
> + * Copyright (c) 2019 Guo Yejun
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 
> USA
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include "libavfilter/dnn/dnn_backend_native_layer_conv2d.h"
> +
> +#define EPSON 0.1
> +
> +static int test_with_same_dilate(void)
> +{
> +// the input data and expected data are generated with below python code.
> +/*
> +x = tf.placeholder(tf.float32, shape=[1, None, None, 3])
> +y = tf.layers.conv2d(x, 2, 3, activation=tf.nn.tanh, padding='same', 
> dilation_rate=(2, 2), bias_initializer=tf.keras.initializers.he_normal())
> +data = np.random.rand(1, 5, 6, 3);
> +
> +sess=tf.Session()
> +sess.run(tf.global_variables_initializer())
> +
> +weights = dict([(var.name, sess.run(var)) for var in 
> tf.trainable_variables()])
> +kernel = weights['conv2d/kernel:0']
> +kernel = np.transpose(kernel, [3, 0, 1, 2])
> +print("kernel:")
> +print(kernel.shape)
> +print(list(kernel.flatten()))
> +
> +bias = weights['conv2d/bias:0']
> +print("bias:")
> +print(bias.shape)
> +print(list(bias.flatten()))
> +
> +output = sess.run(y, feed_dict={x: data})
> +
> +print("input:")
> +print(data.shape)
> +print(list(data.flatten()))
> +
> +print("output:")
> +print(output.shape)
> +print(list(output.flatten()))
> +*/
> +
> +ConvolutionalParams params;
> +DnnOperand operands[2];
> +int32_t input_indexes[1];
> +float input[1*5*6*3] = {
> +0.7012556460308194, 0.4233847954643357, 0.19515900664313612, 
> 0.16343083004926495, 0.5758261611052848, 0.9510767434014871, 
> 0.11014085055947687,
> +0.906327053637727, 0.8136794715542507, 0.45371764543639526, 
> 0.5768443343523952, 0.19543668786046986, 0.15648326047898609, 
> 0.2099500241141279,
> +0.17658777090552413, 0.059335724777169196, 0.1729991838469117, 
> 0.8150514704819208, 0.4435535466703049, 0.3752188477566878, 0.749936650421431,
> +0.6823494635284907, 0.10776389679424747, 0.34247481674596836, 
> 0.5147867256244629, 0.9063709728129032, 0.12423605800856818, 
> 0.6064872945412728,
> +0.5891681538551459, 0.9865836236466314, 0.9002163879294677, 
> 0.003968273184274618, 0.8628374809643967, 0.1327176268279583, 
> 0.8449799925703798,
> +0.1937671869354366, 0.41524410152707425, 0.02038786604756837, 
> 0.49792466069597496, 0.8881874553848784, 0.9683921035597336, 
> 0.4122972568010813,
> +0.843553550993252, 0.9588482762501964, 0.5190350762645546, 
> 0.4283584264145317, 0.09781496073714646, 0.9501058833776156, 
> 0.8665541760152776,
> +0.31669272550095806, 0.07133074675453632, 0.606438007334886, 
> 0.7007157020538224, 0.4827996264130444, 0.5167615606392761, 
> 0.6385043039312651,
> +0.23069664707810555, 0.058233497329354456, 0.06323892961591071, 
> 0.24816458893245974, 0.8646369065257812, 0.24742185893094837, 
> 0.09991225948167437,
> +0.625700606979606, 0.7678541502111257, 0.6215834594679912, 
> 0.5623003956582483, 0.07389123942681242, 0.7659100715711249, 
> 0.486061471642225,
> +0.9947455699829012, 0.9094911797643259, 0.7644355876253265, 
> 0.05384315321492239, 0.13565394382783613, 0.9810628204953316

Re: [FFmpeg-devel] [PATCH 4/4] FATE/dnn: add unit test for dnn depth_to_space layer

2019-09-19 Thread Pedro Arthur
LGTM

Pushed, thanks!

On Thu, Sep 5, 2019 at 03:05, Guo, Yejun wrote:
>
> 'make fate-dnn-layer-depth2space' to run the test
>
> Signed-off-by: Guo, Yejun 
> ---
>  tests/dnn/Makefile |   1 +
>  tests/dnn/dnn-layer-depth2space-test.c | 100 
> +
>  tests/fate/dnn.mak |   5 ++
>  3 files changed, 106 insertions(+)
>  create mode 100644 tests/dnn/dnn-layer-depth2space-test.c
>
> diff --git a/tests/dnn/Makefile b/tests/dnn/Makefile
> index 3adefe8..3cb5f6d 100644
> --- a/tests/dnn/Makefile
> +++ b/tests/dnn/Makefile
> @@ -1,5 +1,6 @@
>  DNNTESTPROGS += dnn-layer-pad
>  DNNTESTPROGS += dnn-layer-conv2d
> +DNNTESTPROGS += dnn-layer-depth2space
>
>  DNNTESTOBJS  := $(DNNTESTOBJS:%=$(DNNTESTSDIR)%) 
> $(DNNTESTPROGS:%=$(DNNTESTSDIR)/%-test.o)
>  DNNTESTPROGS := $(DNNTESTPROGS:%=$(DNNTESTSDIR)/%-test$(EXESUF))
> diff --git a/tests/dnn/dnn-layer-depth2space-test.c 
> b/tests/dnn/dnn-layer-depth2space-test.c
> new file mode 100644
> index 000..87118de
> --- /dev/null
> +++ b/tests/dnn/dnn-layer-depth2space-test.c
> @@ -0,0 +1,100 @@
> +/*
> + * Copyright (c) 2019 Guo Yejun
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 
> USA
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include "libavfilter/dnn/dnn_backend_native.h"
> +#include "libavfilter/dnn/dnn_backend_native_layer_depth2space.h"
> +
> +#define EPSON 0.1
Just a note: I think you mean EPSILON, right? Using EPSILON or EPS is
more consistent with the usual notation.

> +
> +static int test(void)
> +{
> +// the input data and expected data are generated with below python code.
> +/*
> +x = tf.placeholder(tf.float32, shape=[1, None, None, 4])
> +y = tf.depth_to_space(x, 2)
> +data = np.random.rand(1, 5, 3, 4);
> +
> +sess=tf.Session()
> +sess.run(tf.global_variables_initializer())
> +
> +output = sess.run(y, feed_dict={x: data})
> +
> +print("input:")
> +print(data.shape)
> +print(list(data.flatten()))
> +
> +print("output:")
> +print(output.shape)
> +print(list(output.flatten()))
> +*/
> +
> +DnnOperand operands[2];
> +int32_t input_indexes[1];
> +float input[1*5*3*4] = {
> +0.09771065121566602, 0.6336807372403175, 0.5142416549709786, 
> 0.8027206567330333, 0.2154276025069397, 0.12112878462616772, 
> 0.913936596765778,
> +0.38881443647542646, 0.5850447615898835, 0.9311499327398275, 
> 0.3613660929428246, 0.5420722002125493, 0.6002131190230359, 
> 0.44800665702299525,
> +0.7271322557896777, 0.3869293511885826, 0.5144404769364138, 
> 0.6910844856987723, 0.6142102742269762, 0.6249991371621018, 
> 0.45663376215836626,
> +0.19523477129943423, 0.2483895888532045, 0.64326768256278, 
> 0.5485877602998981, 0.45442067849873546, 0.529374943304256, 
> 0.30439850391811885,
> +0.11961343361340993, 0.2909643484561082, 0.9810970344127848, 
> 0.8886928489786549, 0.6112237084436409, 0.8852482695156674, 
> 0.9110868043114374,
> +0.21242780027585217, 0.7101536973207572, 0.9709717457443375, 
> 0.2702666770969332, 0.7718295953780221, 0.3957005164588574, 
> 0.24383544252475453,
> +0.040143453532367035, 0.26358051835323115, 0.013130251443791319, 
> 0.3016550481482074, 0.03582340459943956, 0.718025513612361, 
> 0.09844204177633753,
> +0.04433767496953056, 0.6221895044119757, 0.6190414032940228, 
> 0.8963550834625371, 0.5642449700064629, 0.2482982014723497, 
> 0.17824909294583013,
> +0.024401882408643272, 0.21742800875253465, 0.6794724473181843, 
> 0.4814830479242237
> +};
> +float expected_output[1*10*6*1] = {
> +0.097710654, 0.63368076, 0.2154276, 0.12112878, 0.58504474, 
> 0.93114996, 0.51424164, 0.80272067, 0.9139366, 0.38881445,
> +0.3613661, 0.5420722, 0.6002131, 0.44800666, 0.5144405, 0.6910845, 
> 0.45663378, 0.19523478, 0.72713226, 0.38692936,
> +0.61421025, 0.62499917, 0.24838959, 0.6432677, 0.54858774, 
> 0.4544207, 0.11961343, 0.29096434, 0.6112237, 0.88524824,
> +0.52937496, 0.3043985, 0.98109704, 0.88869286, 0.9110868, 0.2124278, 
> 0.7101537, 0.97097176, 0.3957005, 0.24383545,
> +0.013130251, 0.30165505, 0.27026668, 0.7718296, 0.040143453, 
> 0.2635805
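
For readers unfamiliar with the layer under test: tf.depth_to_space rearranges
data from the channel dimension into spatial blocks. A minimal reference
implementation in C (a sketch assuming NHWC layout and a hypothetical helper
name, not the code under review):

static void depth_to_space_ref(const float *in, float *out,
                               int h, int w, int c, int bs)
{
    /* out has shape (h*bs) x (w*bs) x (c/(bs*bs)), NHWC layout */
    int new_c = c / (bs * bs);
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++)
            for (int dy = 0; dy < bs; dy++)
                for (int dx = 0; dx < bs; dx++)
                    for (int k = 0; k < new_c; k++)
                        out[((y * bs + dy) * (w * bs) + (x * bs + dx)) * new_c + k] =
                            in[(y * w + x) * c + (dy * bs + dx) * new_c + k];
}

With the test's 1x5x3x4 input and block size 2, this reproduces the 1x10x6x1
expected_output above.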

Re: [FFmpeg-devel] [PATCH 1/4] libavfilter/dnn: add layer maximum for native mode.

2019-09-20 Thread Pedro Arthur
Hi,

On Fri, Sep 20, 2019 at 01:00, Guo, Yejun wrote:
>
> The reason to add this layer is that it is used by srcnn in vf_sr.
> This layer is currently ignored in native mode. After this patch,
> we can add multiple outputs support for native mode.
>
I did not quite understand the commit message. Where does srcnn need a
max layer?
What is the relation between the max layer and supporting multiple outputs?

> Signed-off-by: Guo, Yejun 
> ---
>  libavfilter/dnn/Makefile   |  1 +
>  libavfilter/dnn/dnn_backend_native.c   | 36 ++-
>  libavfilter/dnn/dnn_backend_native.h   |  6 +--
>  libavfilter/dnn/dnn_backend_native_layer_maximum.c | 54 
> ++
>  libavfilter/dnn/dnn_backend_native_layer_maximum.h | 42 +
>  libavfilter/dnn/dnn_backend_tf.c   | 47 +++
>  tools/python/convert_from_tensorflow.py| 17 ++-
>  tools/python/convert_header.py |  2 +-
>  8 files changed, 198 insertions(+), 7 deletions(-)
>  create mode 100644 libavfilter/dnn/dnn_backend_native_layer_maximum.c
>  create mode 100644 libavfilter/dnn/dnn_backend_native_layer_maximum.h
>
> diff --git a/libavfilter/dnn/Makefile b/libavfilter/dnn/Makefile
> index 63a35e7..721094d 100644
> --- a/libavfilter/dnn/Makefile
> +++ b/libavfilter/dnn/Makefile
> @@ -3,6 +3,7 @@ OBJS-$(CONFIG_DNN)   += 
> dnn/dnn_backend_native.o
>  OBJS-$(CONFIG_DNN)   += 
> dnn/dnn_backend_native_layer_pad.o
>  OBJS-$(CONFIG_DNN)   += 
> dnn/dnn_backend_native_layer_conv2d.o
>  OBJS-$(CONFIG_DNN)   += 
> dnn/dnn_backend_native_layer_depth2space.o
> +OBJS-$(CONFIG_DNN)   += 
> dnn/dnn_backend_native_layer_maximum.o
>
>  DNN-OBJS-$(CONFIG_LIBTENSORFLOW) += dnn/dnn_backend_tf.o
>
> diff --git a/libavfilter/dnn/dnn_backend_native.c 
> b/libavfilter/dnn/dnn_backend_native.c
> index be548c6..22a9a33 100644
> --- a/libavfilter/dnn/dnn_backend_native.c
> +++ b/libavfilter/dnn/dnn_backend_native.c
> @@ -28,6 +28,7 @@
>  #include "dnn_backend_native_layer_pad.h"
>  #include "dnn_backend_native_layer_conv2d.h"
>  #include "dnn_backend_native_layer_depth2space.h"
> +#include "dnn_backend_native_layer_maximum.h"
>
>  static DNNReturnType set_input_output_native(void *model, DNNInputData 
> *input, const char *input_name, const char **output_names, uint32_t nb_output)
>  {
> @@ -78,6 +79,7 @@ DNNModel *ff_dnn_load_model_native(const char 
> *model_filename)
>  ConvolutionalParams *conv_params;
>  DepthToSpaceParams *depth_to_space_params;
>  LayerPadParams *pad_params;
> +DnnLayerMaximumParams *maximum_params;
>
>  model = av_malloc(sizeof(DNNModel));
>  if (!model){
> @@ -237,6 +239,21 @@ DNNModel *ff_dnn_load_model_native(const char 
> *model_filename)
>  network->layers[layer].type = MIRROR_PAD;
>  network->layers[layer].params = pad_params;
>  break;
> +case MAXIMUM:
> +maximum_params = av_malloc(sizeof(*maximum_params));
> +if (!maximum_params){
> +avio_closep(&model_file_context);
> +ff_dnn_free_model_native(&model);
> +return NULL;
> +}
> +maximum_params->val.u32 = avio_rl32(model_file_context);
> +dnn_size += 4;
> +network->layers[layer].type = MAXIMUM;
> +network->layers[layer].params = maximum_params;
> +network->layers[layer].input_operand_indexes[0] = 
> (int32_t)avio_rl32(model_file_context);
> +network->layers[layer].output_operand_index = 
> (int32_t)avio_rl32(model_file_context);
> +dnn_size += 8;
> +break;
>  default:
>  avio_closep(&model_file_context);
>  ff_dnn_free_model_native(&model);
> @@ -290,6 +307,7 @@ DNNReturnType ff_dnn_execute_model_native(const DNNModel 
> *model, DNNData *output
>  ConvolutionalParams *conv_params;
>  DepthToSpaceParams *depth_to_space_params;
>  LayerPadParams *pad_params;
> +DnnLayerMaximumParams *maximum_params;
>
>  if (network->layers_num <= 0 || network->operands_num <= 0)
>  return DNN_ERROR;
> @@ -313,6 +331,11 @@ DNNReturnType ff_dnn_execute_model_native(const DNNModel 
> *model, DNNData *output
>  dnn_execute_layer_pad(network->operands, 
> network->layers[layer].input_operand_indexes,
>
> network->layers[layer].output_operand_index, pad_params);
>  break;
> +case MAXIMUM:
> +maximum_params = (DnnLayerMaximumParams 
> *)network->layers[layer].params;
> +dnn_execute_layer_maximum(network->operands, 
> network->layers[layer].input_operand_indexes,
> +  
> network->layers[layer].output_operand_index, maximum_params);
> +   

Re: [FFmpeg-devel] [PATCH 1/4] libavfilter/dnn: add layer maximum for native mode.

2019-09-20 Thread Pedro Arthur
On Fri, Sep 20, 2019 at 11:50, Guo, Yejun wrote:
>
>
>
> > -Original Message-
> > From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> > Pedro Arthur
> > Sent: Friday, September 20, 2019 10:17 PM
> > To: FFmpeg development discussions and patches 
> > Subject: Re: [FFmpeg-devel] [PATCH 1/4] libavfilter/dnn: add layer maximum 
> > for
> > native mode.
> >
> > Hi,
> >
> > Em sex, 20 de set de 2019 às 01:00, Guo, Yejun 
> > escreveu:
> > >
> > > The reason to add this layer is that it is used by srcnn in vf_sr.
> > > This layer is currently ignored in native mode. After this patch,
> > > we can add multiple outputs support for native mode.
> > >
> > I did not quite understand the commit message. Where does srcnn needs
> > max a layer?
>
> see 
> https://github.com/HighVoltageRocknRoll/sr/blob/master/models/model_srcnn.py#L39
>  ,
> the maximum layer is the last layer of the model.
I see. Indeed, if I'm not missing something, this max layer is
superfluous, as the ReLU activation already does this, right?
What we have to guarantee is that the output is in the range [0, 1];
that means we should have had a min(y, 1) layer instead of the max, or
guaranteed that the conversion from float to integer properly saturates
y > 1.
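
A minimal sketch of the saturating conversion I mean (a hypothetical helper,
not code from the patch; for the integer case FFmpeg already has
av_clip_uint8()):

#include <stdint.h>

static uint8_t float_to_pixel_sat(float y)
{
    /* scale [0, 1] to [0, 255] and clamp instead of letting the cast wrap */
    float v = y * 255.0f;
    if (v < 0.0f)
        return 0;
    if (v > 255.0f)
        return 255;
    return (uint8_t)(v + 0.5f); /* round to nearest */
}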

>
> > What is the relation between max layer and supporting multiple outputs?
>
> Thanks, I did not describe it explicitly; more detail below.
>
> The direct relation is between the max layer and the model output name;
> multiple outputs can then be supported once output-name matching is supported.
>
> Suppose the output name of srcnn is 'y'; that means the output name of the max
> layer is also 'y', since the max layer is the last layer. And suppose the input
> name of the max layer is 'z'; the network looks like:
> ... -> 'z' -> (max layer) -> 'y'
>
> In the current implementation, the max layer is ignored in native mode, which
> means that 'y' is also discarded in native mode. The output name of the native
> model becomes 'z', and so we could not find the correct output operand named 'y'.
>
> The reason the current implementation still works is that we simply take the
> last operand as the model output, ignoring name matching.
>
> To support multiple outputs, we have to recognize output operands by name. To
> support output lookup by name, we must add 'y' back to srcnn (that is, handle
> the max layer), so that vf_sr works in both TF mode and native mode.
>
Thanks. In any case the patch is useful; I should push it soon.

>
> >
> > > Signed-off-by: Guo, Yejun 
> > > ---
> > >  libavfilter/dnn/Makefile   |  1 +
> > >  libavfilter/dnn/dnn_backend_native.c   | 36
> > ++-
> > >  libavfilter/dnn/dnn_backend_native.h   |  6 +--
> > >  libavfilter/dnn/dnn_backend_native_layer_maximum.c | 54
> > ++
> > >  libavfilter/dnn/dnn_backend_native_layer_maximum.h | 42
> > +
> > >  libavfilter/dnn/dnn_backend_tf.c   | 47
> > +++
> > >  tools/python/convert_from_tensorflow.py| 17 ++-
> > >  tools/python/convert_header.py |  2 +-
> > >  8 files changed, 198 insertions(+), 7 deletions(-)
> > >  create mode 100644
> > libavfilter/dnn/dnn_backend_native_layer_maximum.c
> > >  create mode 100644
> > libavfilter/dnn/dnn_backend_native_layer_maximum.h
> > >
> > > diff --git a/libavfilter/dnn/Makefile b/libavfilter/dnn/Makefile
> > > index 63a35e7..721094d 100644
> > > --- a/libavfilter/dnn/Makefile
> > > +++ b/libavfilter/dnn/Makefile
> > > @@ -3,6 +3,7 @@ OBJS-$(CONFIG_DNN)
> > += dnn/dnn_backend_native.o
> > >  OBJS-$(CONFIG_DNN)   +=
> > dnn/dnn_backend_native_layer_pad.o
> > >  OBJS-$(CONFIG_DNN)   +=
> > dnn/dnn_backend_native_layer_conv2d.o
> > >  OBJS-$(CONFIG_DNN)   +=
> > dnn/dnn_backend_native_layer_depth2space.o
> > > +OBJS-$(CONFIG_DNN)   +=
> > dnn/dnn_backend_native_layer_maximum.o
> > >
> > >  DNN-OBJS-$(CONFIG_LIBTENSORFLOW) +=
> > dnn/dnn_backend_tf.o
> > >
> > > diff --git a/libavfilter/dnn/dnn_backend_native.c
> > b/libavfilter/dnn/dnn_backend_native.c
> > > index b

Re: [FFmpeg-devel] [PATCH 2/4] FATE/dnn: add unit test for layer maximum

2019-09-20 Thread Pedro Arthur
On Fri, Sep 20, 2019 at 01:01, Guo, Yejun wrote:
>
> Signed-off-by: Guo, Yejun 
> ---
>  tests/dnn/Makefile |  1 +
>  tests/dnn/dnn-layer-maximum-test.c | 71 
> ++
>  tests/fate/dnn.mak |  5 +++
>  3 files changed, 77 insertions(+)
>  create mode 100644 tests/dnn/dnn-layer-maximum-test.c
>
> diff --git a/tests/dnn/Makefile b/tests/dnn/Makefile
> index 3cb5f6d..e1bfe3f 100644
> --- a/tests/dnn/Makefile
> +++ b/tests/dnn/Makefile
> @@ -1,6 +1,7 @@
>  DNNTESTPROGS += dnn-layer-pad
>  DNNTESTPROGS += dnn-layer-conv2d
>  DNNTESTPROGS += dnn-layer-depth2space
> +DNNTESTPROGS += dnn-layer-maximum
>
>  DNNTESTOBJS  := $(DNNTESTOBJS:%=$(DNNTESTSDIR)%) 
> $(DNNTESTPROGS:%=$(DNNTESTSDIR)/%-test.o)
>  DNNTESTPROGS := $(DNNTESTPROGS:%=$(DNNTESTSDIR)/%-test$(EXESUF))
> diff --git a/tests/dnn/dnn-layer-maximum-test.c 
> b/tests/dnn/dnn-layer-maximum-test.c
> new file mode 100644
> index 000..06daf64
> --- /dev/null
> +++ b/tests/dnn/dnn-layer-maximum-test.c
> @@ -0,0 +1,71 @@
> +/*
> + * Copyright (c) 2019 Guo Yejun
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 
> USA
> + */
> +
> +#include <stdio.h>
> +#include <string.h>
> +#include <math.h>
> +#include "libavfilter/dnn/dnn_backend_native_layer_maximum.h"
> +
> +#define EPSON 0.1
> +
> +static int test(void)
> +{
> +DnnLayerMaximumParams params;
> +DnnOperand operands[2];
> +int32_t input_indexes[1];
> +float input[1*1*2*3] = {
> +-3, 2.5, 2, -2.1, 7.8, 100
> +};
> +float *output;
> +
> +params.val.y = 2.3;
> +
> +operands[0].data = input;
> +operands[0].dims[0] = 1;
> +operands[0].dims[1] = 1;
> +operands[0].dims[2] = 2;
> +operands[0].dims[3] = 3;
> +operands[1].data = NULL;
> +
> +input_indexes[0] = 0;
> +dnn_execute_layer_maximum(operands, input_indexes, 1, ¶ms);
> +
> +output = operands[1].data;
> +for (int i = 0; i < sizeof(input) / sizeof(float); i++) {
> +float expected_output = input[i] > params.val.y ? input[i] : 
> params.val.y;
> +if (fabs(output[i] - expected_output) > EPSON) {
> +printf("at index %d, output: %f, expected_output: %f\n", i, 
> output[i], expected_output);
> +av_freep(&output);
> +return 1;
> +}
> +}
> +
> +av_freep(&output);
> +return 0;
> +
> +}
> +
> +int main(int argc, char **argv)
> +{
> +if (test())
> +return 1;
> +
> +return 0;
> +}
> diff --git a/tests/fate/dnn.mak b/tests/fate/dnn.mak
> index 99578e0..ec60b07 100644
> --- a/tests/fate/dnn.mak
> +++ b/tests/fate/dnn.mak
> @@ -13,6 +13,11 @@ fate-dnn-layer-depth2space: 
> $(DNNTESTSDIR)/dnn-layer-depth2space-test$(EXESUF)
>  fate-dnn-layer-depth2space: CMD = run 
> $(DNNTESTSDIR)/dnn-layer-depth2space-test$(EXESUF)
>  fate-dnn-layer-depth2space: CMP = null
>
> +FATE_DNN += fate-dnn-layer-maximum
> +fate-dnn-layer-maximum: $(DNNTESTSDIR)/dnn-layer-maximum-test$(EXESUF)
> +fate-dnn-layer-maximum: CMD = run 
> $(DNNTESTSDIR)/dnn-layer-maximum-test$(EXESUF)
> +fate-dnn-layer-maximum: CMP = null
> +
>  FATE-yes += $(FATE_DNN)
>
>  fate-dnn: $(FATE_DNN)
> --
> 2.7.4
>
LGTM, pushed.


Re: [FFmpeg-devel] [PATCH 4/4] libavfilter/dnn: support multiple outputs for native mode

2019-09-20 Thread Pedro Arthur
On Fri, Sep 20, 2019 at 01:01, Guo, Yejun wrote:
>
> Signed-off-by: Guo, Yejun 
> ---
>  libavfilter/dnn/dnn_backend_native.c | 43 
> +++-
>  libavfilter/dnn/dnn_backend_native.h |  2 ++
>  2 files changed, 34 insertions(+), 11 deletions(-)
>
> diff --git a/libavfilter/dnn/dnn_backend_native.c 
> b/libavfilter/dnn/dnn_backend_native.c
> index 1b0aea2..68fca50 100644
> --- a/libavfilter/dnn/dnn_backend_native.c
> +++ b/libavfilter/dnn/dnn_backend_native.c
> @@ -38,6 +38,7 @@ static DNNReturnType set_input_output_native(void *model, 
> DNNInputData *input, c
>  if (network->layers_num <= 0 || network->operands_num <= 0)
>  return DNN_ERROR;
>
> +/* inputs */
>  av_assert0(input->dt == DNN_FLOAT);
>  for (int i = 0; i < network->operands_num; ++i) {
>  oprd = &network->operands[i];
> @@ -64,6 +65,28 @@ static DNNReturnType set_input_output_native(void *model, 
> DNNInputData *input, c
>  return DNN_ERROR;
>
>  input->data = oprd->data;
> +
> +/* outputs */
> +network->nb_output = 0;
> +av_freep(&network->output_indexes);
> +network->output_indexes = av_mallocz_array(nb_output, 
> sizeof(*network->output_indexes));
> +if (!network->output_indexes)
> +return DNN_ERROR;
> +
> +for (uint32_t i = 0; i < nb_output; ++i) {
> +const char *output_name = output_names[i];
> +for (int j = 0; j < network->operands_num; ++j) {
> +oprd = &network->operands[j];
> +if (strcmp(oprd->name, output_name) == 0) {
> +network->output_indexes[network->nb_output++] = j;
> +break;
> +}
> +}
> +}
> +
> +if (network->nb_output != nb_output)
> +return DNN_ERROR;
> +
>  return DNN_SUCCESS;
>  }
>
> @@ -315,6 +338,7 @@ DNNReturnType ff_dnn_execute_model_native(const DNNModel 
> *model, DNNData *output
>  DepthToSpaceParams *depth_to_space_params;
>  LayerPadParams *pad_params;
>  DnnLayerMaximumParams *maximum_params;
> +uint32_t nb = FFMIN(nb_output, network->nb_output);
>
>  if (network->layers_num <= 0 || network->operands_num <= 0)
>  return DNN_ERROR;
> @@ -348,17 +372,13 @@ DNNReturnType ff_dnn_execute_model_native(const 
> DNNModel *model, DNNData *output
>  }
>  }
>
> -// native mode does not support multiple outputs yet
> -if (nb_output > 1)
> -return DNN_ERROR;
> -
> -/**
> - * as the first step, suppose network->operands[network->operands_num - 
> 1] is the output operand.
> - */
> -outputs[0].data = network->operands[network->operands_num - 1].data;
> -outputs[0].height = network->operands[network->operands_num - 1].dims[1];
> -outputs[0].width = network->operands[network->operands_num - 1].dims[2];
> -outputs[0].channels = network->operands[network->operands_num - 
> 1].dims[3];
> +for (uint32_t i = 0; i < nb; ++i) {
> +DnnOperand *oprd = &network->operands[network->output_indexes[i]];
> +outputs[i].data = oprd->data;
> +outputs[i].height = oprd->dims[1];
> +outputs[i].width = oprd->dims[2];
> +outputs[i].channels = oprd->dims[3];
> +}
>
>  return DNN_SUCCESS;
>  }
> @@ -401,6 +421,7 @@ void ff_dnn_free_model_native(DNNModel **model)
>  av_freep(&network->operands[operand].data);
>  av_freep(&network->operands);
>
> +av_freep(&network->output_indexes);
>  av_freep(&network);
>  av_freep(model);
>  }
> diff --git a/libavfilter/dnn/dnn_backend_native.h 
> b/libavfilter/dnn/dnn_backend_native.h
> index b238d18..3f2840c 100644
> --- a/libavfilter/dnn/dnn_backend_native.h
> +++ b/libavfilter/dnn/dnn_backend_native.h
> @@ -96,6 +96,8 @@ typedef struct ConvolutionalNetwork{
>  int32_t layers_num;
>  DnnOperand *operands;
>  int32_t operands_num;
> +int32_t *output_indexes;
> +uint32_t nb_output;
>  } ConvolutionalNetwork;
>
>  DNNModel *ff_dnn_load_model_native(const char *model_filename);
> --
> 2.7.4
>
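For context, a minimal caller sketch of the new multiple-output path
(hypothetical tensor names; assumes the set_input_output() /
ff_dnn_execute_model_native() interface as of this series, with a
DNNInputData input prepared earlier):

const char *names[] = { "out0", "out1" }; /* hypothetical output names */
DNNData outputs[2];
if (model->set_input_output(model->model, &input, "x", names, 2) != DNN_SUCCESS)
    return DNN_ERROR;
if (ff_dnn_execute_model_native(model, outputs, 2) != DNN_SUCCESS)
    return DNN_ERROR;
/* outputs[i] now describes the operand whose name matches names[i] */
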
LGTM, pushed.


Re: [FFmpeg-devel] [PATCH 3/4] libavfilter/dnn/dnn_backend_native: find the input operand according to input name

2019-09-20 Thread Pedro Arthur
On Fri, Sep 20, 2019 at 01:01, Guo, Yejun wrote:
>
> Signed-off-by: Guo, Yejun 
> ---
>  libavfilter/dnn/dnn_backend_native.c | 39 
> +---
>  1 file changed, 23 insertions(+), 16 deletions(-)
>
> diff --git a/libavfilter/dnn/dnn_backend_native.c 
> b/libavfilter/dnn/dnn_backend_native.c
> index 22a9a33..1b0aea2 100644
> --- a/libavfilter/dnn/dnn_backend_native.c
> +++ b/libavfilter/dnn/dnn_backend_native.c
> @@ -33,30 +33,37 @@
>  static DNNReturnType set_input_output_native(void *model, DNNInputData 
> *input, const char *input_name, const char **output_names, uint32_t nb_output)
>  {
>  ConvolutionalNetwork *network = (ConvolutionalNetwork *)model;
> +DnnOperand *oprd = NULL;
>
>  if (network->layers_num <= 0 || network->operands_num <= 0)
>  return DNN_ERROR;
>
>  av_assert0(input->dt == DNN_FLOAT);
> +for (int i = 0; i < network->operands_num; ++i) {
> +oprd = &network->operands[i];
> +if (strcmp(oprd->name, input_name) == 0) {
> +if (oprd->type != DOT_INPUT)
> +return DNN_ERROR;
> +break;
> +}
> +oprd = NULL;
> +}
>
> -/**
> - * as the first step, suppose network->operands[0] is the input operand.
> - */
> -network->operands[0].dims[0] = 1;
> -network->operands[0].dims[1] = input->height;
> -network->operands[0].dims[2] = input->width;
> -network->operands[0].dims[3] = input->channels;
> -network->operands[0].type = DOT_INPUT;
> -network->operands[0].data_type = DNN_FLOAT;
> -network->operands[0].isNHWC = 1;
> -
> -av_freep(&network->operands[0].data);
> -network->operands[0].length = 
> calculate_operand_data_length(&network->operands[0]);
> -network->operands[0].data = av_malloc(network->operands[0].length);
> -if (!network->operands[0].data)
> +if (!oprd)
> +return DNN_ERROR;
> +
> +oprd->dims[0] = 1;
> +oprd->dims[1] = input->height;
> +oprd->dims[2] = input->width;
> +oprd->dims[3] = input->channels;
> +
> +av_freep(&oprd->data);
> +oprd->length = calculate_operand_data_length(oprd);
> +oprd->data = av_malloc(oprd->length);
> +if (!oprd->data)
>  return DNN_ERROR;
>
> -input->data = network->operands[0].data;
> +input->data = oprd->data;
>  return DNN_SUCCESS;
>  }
>
> --
> 2.7.4
>
LGTM, pushed.


Re: [FFmpeg-devel] [PATCH] FATE/dnn: fix stack buffer overflow

2019-10-04 Thread Pedro Arthur
On Fri, Oct 4, 2019 at 09:11, Guo, Yejun wrote:

>
>
> > -Original Message-
> > From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> > quinkbl...@foxmail.com
> > Sent: Tuesday, October 01, 2019 2:37 PM
> > To: ffmpeg-devel@ffmpeg.org
> > Cc: Zhao Zhili 
> > Subject: [FFmpeg-devel] [PATCH] FATE/dnn: fix stack buffer overflow
> >
> > From: Zhao Zhili 
> >
> > ---
> >  tests/dnn/dnn-layer-pad-test.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/tests/dnn/dnn-layer-pad-test.c
> b/tests/dnn/dnn-layer-pad-test.c
> > index 1fb2be1590..ea8c824d1e 100644
> > --- a/tests/dnn/dnn-layer-pad-test.c
> > +++ b/tests/dnn/dnn-layer-pad-test.c
> > @@ -203,7 +203,7 @@ static int test_with_mode_constant(void)
> >  params.paddings[3][1] = 2;
> >
> >  operands[0].data = input;
> > -operands[0].dims[0] = 3;
> > +operands[0].dims[0] = 1;
>
> nice catch, LGTM, thanks.
>
Pushed.


> >  operands[0].dims[1] = 2;
> >  operands[0].dims[2] = 2;
> >  operands[0].dims[3] = 3;
> > --
> > 2.17.1
> >
> >
> >

Re: [FFmpeg-devel] [PATCH V2 2/2] avfilter/dnn: unify the layer execution function in native mode

2019-10-04 Thread Pedro Arthur
Hi,

On Fri, Oct 4, 2019 at 09:59, Guo, Yejun wrote:

>
>
> > -Original Message-
> > From: Guo, Yejun
> > Sent: Tuesday, September 24, 2019 1:34 PM
> > To: ffmpeg-devel@ffmpeg.org
> > Cc: Guo, Yejun 
> > Subject: [PATCH V2 2/2] avfilter/dnn: unify the layer execution function
> in native
> > mode
> >
> > with this change, the code will be simpler when more layers are supported.
> >
> > Signed-off-by: Guo, Yejun 
> > ---
> >  libavfilter/dnn/dnn_backend_native.c   | 38
> > --
> >  libavfilter/dnn/dnn_backend_native_layer_conv2d.c  |  4 ++-
> >  libavfilter/dnn/dnn_backend_native_layer_conv2d.h  |  3 +-
> >  .../dnn/dnn_backend_native_layer_depth2space.c |  5 ++-
> >  .../dnn/dnn_backend_native_layer_depth2space.h |  3 +-
> >  libavfilter/dnn/dnn_backend_native_layer_maximum.c |  4 ++-
> >  libavfilter/dnn/dnn_backend_native_layer_maximum.h |  3 +-
> >  libavfilter/dnn/dnn_backend_native_layer_pad.c |  5 +--
> >  libavfilter/dnn/dnn_backend_native_layer_pad.h |  4 +--
> >  libavfilter/dnn/dnn_backend_native_layers.hxx  |  4 +++
>
I don't think this naming pattern is used in the code base (at least I
could not find anything similar).
Also I think the code would be much cleaner without macros: for example,
just define an array of function pointers and index it with the enum
type, as sketched below.
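
Something like this is what I have in mind (an untested sketch; it assumes the
execute functions get a uniform signature taking const void *parameters, as
this patch already does, and that the layer-type enum spells CONV and
DEPTH_TO_SPACE; MIRROR_PAD and MAXIMUM appear in the code above):

typedef int (*LAYER_EXEC_FUNC)(DnnOperand *operands,
                               const int32_t *input_operand_indexes,
                               int32_t output_operand_index,
                               const void *parameters);

static const LAYER_EXEC_FUNC layer_funcs[] = {
    [CONV]           = dnn_execute_layer_conv2d,
    [DEPTH_TO_SPACE] = dnn_execute_layer_depth2space,
    [MIRROR_PAD]     = dnn_execute_layer_pad,
    [MAXIMUM]        = dnn_execute_layer_maximum,
};

/* then, in ff_dnn_execute_model_native(): */
layer_funcs[network->layers[layer].type](network->operands,
                                         network->layers[layer].input_operand_indexes,
                                         network->layers[layer].output_operand_index,
                                         network->layers[layer].params);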

rest looks good.

> >  tests/dnn/dnn-layer-conv2d-test.c  |  4 +--
> >  tests/dnn/dnn-layer-depth2space-test.c |  4 ++-
> >  tests/dnn/dnn-layer-maximum-test.c |  2 +-
> >  tests/dnn/dnn-layer-pad-test.c |  6 ++--
> >  14 files changed, 48 insertions(+), 41 deletions(-)
> >  create mode 100644 libavfilter/dnn/dnn_backend_native_layers.hxx
> >
>
> this patch set asks for review, thanks.
>
> I'll add another patch to unify the layer load function after the holiday.
>
> I might have several students help me to support more native layers in C
> code, so they can each focus on their own separate dnn_backend_layer_xxx.h/c
> files.
>
>
> btw, my other patches for vf_dnn_rgb_processing are still in
> https://github.com/guoyejun/ffmpeg/tree/dnn0927

Re: [FFmpeg-devel] [PATCH V3 3/3] avfilter/dnn: unify the layer load function in native mode

2019-10-15 Thread Pedro Arthur
On Tue, Oct 15, 2019 at 02:29, Guo, Yejun wrote:

>
>
> > -Original Message-
> > From: Guo, Yejun
> > Sent: Wednesday, October 09, 2019 10:08 PM
> > To: ffmpeg-devel@ffmpeg.org
> > Cc: Guo, Yejun 
> > Subject: [PATCH V3 3/3] avfilter/dnn: unify the layer load function in
> native
> > mode
> >
> > Signed-off-by: Guo, Yejun 
> > ---
> >  libavfilter/dnn/dnn_backend_native.c   | 114
> +++--
> >  libavfilter/dnn/dnn_backend_native.h   |   2 +-
> >  libavfilter/dnn/dnn_backend_native_layer_conv2d.c  |  46 +
> >  libavfilter/dnn/dnn_backend_native_layer_conv2d.h  |   1 +
> >  .../dnn/dnn_backend_native_layer_depth2space.c |  18 
> >  .../dnn/dnn_backend_native_layer_depth2space.h |   1 +
> >  libavfilter/dnn/dnn_backend_native_layer_maximum.c |  18 
> >  libavfilter/dnn/dnn_backend_native_layer_maximum.h |   1 +
> >  libavfilter/dnn/dnn_backend_native_layer_pad.c |  23 +
> >  libavfilter/dnn/dnn_backend_native_layer_pad.h |   1 +
> >  libavfilter/dnn/dnn_backend_native_layers.c|  12 +--
> >  libavfilter/dnn/dnn_backend_native_layers.h|   8 +-
> >  12 files changed, 135 insertions(+), 110 deletions(-)
> >
> this patch set asks for review, thanks.
>

Patch set LGTM.
Please make sure your source files end in a newline, otherwise they can't be
pushed; I've fixed it locally and pushed, thanks!




Re: [FFmpeg-devel] [PATCH V2 1/4] dnn: add tf.nn.conv2d support for native model

2019-10-30 Thread Pedro Arthur
On Mon, Oct 21, 2019 at 09:44, Guo, Yejun wrote:

> Unlike other tf.*.conv2d layers, tf.nn.conv2d does not create many
> nodes (within a scope) in the graph; it just acts like other layers.
> tf.nn.conv2d only creates one node in the graph, and no internal
> nodes such as 'kernel' are created.
>
> The format of the native model file is also changed: a flag named
> has_bias is added, so the version number is bumped.
>
> Signed-off-by: Guo, Yejun 
> ---
>  libavfilter/dnn/dnn_backend_native.c  |  2 +-
>  libavfilter/dnn/dnn_backend_native_layer_conv2d.c | 37 +++-
>  libavfilter/dnn/dnn_backend_native_layer_conv2d.h |  1 +
>  tests/dnn/dnn-layer-conv2d-test.c |  2 +
>  tools/python/convert_from_tensorflow.py   | 54
> ---
>  tools/python/convert_header.py|  4 +-
>  6 files changed, 82 insertions(+), 18 deletions(-)
>
> diff --git a/libavfilter/dnn/dnn_backend_native.c
> b/libavfilter/dnn/dnn_backend_native.c
> index 06b010d..ff280b5 100644
> --- a/libavfilter/dnn/dnn_backend_native.c
> +++ b/libavfilter/dnn/dnn_backend_native.c
> @@ -98,7 +98,7 @@ DNNModel *ff_dnn_load_model_native(const char
> *model_filename)
>  char header_expected[] = "FFMPEGDNNNATIVE";
>  char *buf;
>  size_t size;
> -int version, header_size, major_version_expected = 0;
> +int version, header_size, major_version_expected = 1;
>  ConvolutionalNetwork *network = NULL;
>  AVIOContext *model_file_context;
>  int file_size, dnn_size, parsed_size;
> diff --git a/libavfilter/dnn/dnn_backend_native_layer_conv2d.c
> b/libavfilter/dnn/dnn_backend_native_layer_conv2d.c
> index 0de8902..6ec0fa7 100644
> --- a/libavfilter/dnn/dnn_backend_native_layer_conv2d.c
> +++ b/libavfilter/dnn/dnn_backend_native_layer_conv2d.c
> @@ -38,27 +38,41 @@ int dnn_load_layer_conv2d(Layer *layer, AVIOContext
> *model_file_context, int fil
>  conv_params->input_num = (int32_t)avio_rl32(model_file_context);
>  conv_params->output_num = (int32_t)avio_rl32(model_file_context);
>  conv_params->kernel_size = (int32_t)avio_rl32(model_file_context);
> +conv_params->has_bias = (int32_t)avio_rl32(model_file_context);
> +dnn_size += 28;
> +
>  kernel_size = conv_params->input_num * conv_params->output_num *
> -  conv_params->kernel_size * conv_params->kernel_size;
> -dnn_size += 24 + (kernel_size + conv_params->output_num << 2);
> +  conv_params->kernel_size * conv_params->kernel_size;
> +dnn_size += kernel_size * 4;
> +if (conv_params->has_bias)
> +dnn_size += conv_params->output_num * 4;
> +
>  if (dnn_size > file_size || conv_params->input_num <= 0 ||
>  conv_params->output_num <= 0 || conv_params->kernel_size <= 0){
>  av_freep(&conv_params);
>  return 0;
>  }
> +
>  conv_params->kernel = av_malloc(kernel_size * sizeof(float));
> -conv_params->biases = av_malloc(conv_params->output_num *
> sizeof(float));
> -if (!conv_params->kernel || !conv_params->biases){
> -av_freep(&conv_params->kernel);
> -av_freep(&conv_params->biases);
> +if (!conv_params->kernel) {
>  av_freep(&conv_params);
>  return 0;
>  }
> -for (int i = 0; i < kernel_size; ++i){
> +for (int i = 0; i < kernel_size; ++i) {
>  conv_params->kernel[i] =
> av_int2float(avio_rl32(model_file_context));
>  }
> -for (int i = 0; i < conv_params->output_num; ++i){
> -conv_params->biases[i] =
> av_int2float(avio_rl32(model_file_context));
> +
> +conv_params->biases = NULL;
> +if (conv_params->has_bias) {
> +conv_params->biases = av_malloc(conv_params->output_num *
> sizeof(float));
> +if (!conv_params->biases){
> +av_freep(&conv_params->kernel);
> +av_freep(&conv_params);
> +return 0;
> +}
> +for (int i = 0; i < conv_params->output_num; ++i){
> +conv_params->biases[i] =
> av_int2float(avio_rl32(model_file_context));
> +}
>  }
>
>  layer->params = conv_params;
> @@ -103,7 +117,10 @@ int dnn_execute_layer_conv2d(DnnOperand *operands,
> const int32_t *input_operand_
>  for (int y = pad_size; y < height - pad_size; ++y) {
>  for (int x = pad_size; x < width - pad_size; ++x) {
>  for (int n_filter = 0; n_filter < conv_params->output_num;
> ++n_filter) {
> -output[n_filter] = conv_params->biases[n_filter];
> +if (conv_params->has_bias)
> +output[n_filter] = conv_params->biases[n_filter];
> +else
> +output[n_filter] = 0.f;
>
>  for (int ch = 0; ch < conv_params->input_num; ++ch) {
>  for (int kernel_y = 0; kernel_y <
> conv_params->kernel_size; ++kernel_y) {
> diff --git a/libavfilter/dnn/dnn_backend_native_layer_conv2d.h
> b/libavfilter/dnn/dnn_backend_native_layer_conv2d.

Re: [FFmpeg-devel] [PATCH V2 2/4] avfilter/dnn: get the data type of network output from dnn execution result

2019-10-30 Thread Pedro Arthur
On Mon, Oct 21, 2019 at 09:44, Guo, Yejun wrote:

> so we can make a filter more general, accepting different network
> models, by adding a data type conversion after getting data from the network.
>
> After we add dt field into struct DNNData, it becomes the same as
> DNNInputData, so merge them with one struct: DNNData.
>
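As an illustration, the consuming filter could branch on the reported type
when copying the result back into the frame; a rough sketch (with a
hypothetical uint8_t *dst destination, not code from this patch):

if (output.dt == DNN_FLOAT) {
    for (int i = 0; i < output.height * output.width * output.channels; i++)
        dst[i] = av_clip_uint8((int)(((float *)output.data)[i] * 255.0f + 0.5f));
} else { /* DNN_UINT8 */
    memcpy(dst, output.data, output.height * output.width * output.channels);
}
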
> Signed-off-by: Guo, Yejun 
> ---
>  libavfilter/dnn/dnn_backend_native.c   | 3 ++-
>  libavfilter/dnn/dnn_backend_native_layer_conv2d.c  | 1 +
>  libavfilter/dnn/dnn_backend_native_layer_depth2space.c | 1 +
>  libavfilter/dnn/dnn_backend_native_layer_pad.c | 1 +
>  libavfilter/dnn/dnn_backend_tf.c   | 5 +++--
>  libavfilter/dnn_interface.h| 9 ++---
>  libavfilter/vf_derain.c| 4 ++--
>  libavfilter/vf_sr.c| 2 +-
>  8 files changed, 13 insertions(+), 13 deletions(-)
>
> diff --git a/libavfilter/dnn/dnn_backend_native.c
> b/libavfilter/dnn/dnn_backend_native.c
> index ff280b5..add1db4 100644
> --- a/libavfilter/dnn/dnn_backend_native.c
> +++ b/libavfilter/dnn/dnn_backend_native.c
> @@ -28,7 +28,7 @@
>  #include "dnn_backend_native_layer_conv2d.h"
>  #include "dnn_backend_native_layers.h"
>
> -static DNNReturnType set_input_output_native(void *model, DNNInputData
> *input, const char *input_name, const char **output_names, uint32_t
> nb_output)
> +static DNNReturnType set_input_output_native(void *model, DNNData *input,
> const char *input_name, const char **output_names, uint32_t nb_output)
>  {
>  ConvolutionalNetwork *network = (ConvolutionalNetwork *)model;
>  DnnOperand *oprd = NULL;
> @@ -263,6 +263,7 @@ DNNReturnType ff_dnn_execute_model_native(const
> DNNModel *model, DNNData *output
>  outputs[i].height = oprd->dims[1];
>  outputs[i].width = oprd->dims[2];
>  outputs[i].channels = oprd->dims[3];
> +outputs[i].dt = oprd->data_type;
>  }
>
>  return DNN_SUCCESS;
> diff --git a/libavfilter/dnn/dnn_backend_native_layer_conv2d.c
> b/libavfilter/dnn/dnn_backend_native_layer_conv2d.c
> index 6ec0fa7..7b29697 100644
> --- a/libavfilter/dnn/dnn_backend_native_layer_conv2d.c
> +++ b/libavfilter/dnn/dnn_backend_native_layer_conv2d.c
> @@ -106,6 +106,7 @@ int dnn_execute_layer_conv2d(DnnOperand *operands,
> const int32_t *input_operand_
>  output_operand->dims[1] = height - pad_size * 2;
>  output_operand->dims[2] = width - pad_size * 2;
>  output_operand->dims[3] = conv_params->output_num;
> +output_operand->data_type = operands[input_operand_index].data_type;
>  output_operand->length =
> calculate_operand_data_length(output_operand);
>  output_operand->data = av_realloc(output_operand->data,
> output_operand->length);
>  if (!output_operand->data)
> diff --git a/libavfilter/dnn/dnn_backend_native_layer_depth2space.c
> b/libavfilter/dnn/dnn_backend_native_layer_depth2space.c
> index 174676e..7dab19d 100644
> --- a/libavfilter/dnn/dnn_backend_native_layer_depth2space.c
> +++ b/libavfilter/dnn/dnn_backend_native_layer_depth2space.c
> @@ -69,6 +69,7 @@ int dnn_execute_layer_depth2space(DnnOperand *operands,
> const int32_t *input_ope
>  output_operand->dims[1] = height * block_size;
>  output_operand->dims[2] = width * block_size;
>  output_operand->dims[3] = new_channels;
> +output_operand->data_type = operands[input_operand_index].data_type;
>  output_operand->length =
> calculate_operand_data_length(output_operand);
>  output_operand->data = av_realloc(output_operand->data,
> output_operand->length);
>  if (!output_operand->data)
> diff --git a/libavfilter/dnn/dnn_backend_native_layer_pad.c
> b/libavfilter/dnn/dnn_backend_native_layer_pad.c
> index 8fa35de..8e5959b 100644
> --- a/libavfilter/dnn/dnn_backend_native_layer_pad.c
> +++ b/libavfilter/dnn/dnn_backend_native_layer_pad.c
> @@ -105,6 +105,7 @@ int dnn_execute_layer_pad(DnnOperand *operands, const
> int32_t *input_operand_ind
>  output_operand->dims[1] = new_height;
>  output_operand->dims[2] = new_width;
>  output_operand->dims[3] = new_channel;
> +output_operand->data_type = operands[input_operand_index].data_type;
>  output_operand->length =
> calculate_operand_data_length(output_operand);
>  output_operand->data = av_realloc(output_operand->data,
> output_operand->length);
>  if (!output_operand->data)
> diff --git a/libavfilter/dnn/dnn_backend_tf.c
> b/libavfilter/dnn/dnn_backend_tf.c
> index c8dff51..ed91d05 100644
> --- a/libavfilter/dnn/dnn_backend_tf.c
> +++ b/libavfilter/dnn/dnn_backend_tf.c
> @@ -83,7 +83,7 @@ static TF_Buffer *read_graph(const char *model_filename)
>  return graph_buf;
>  }
>
> -static TF_Tensor *allocate_input_tensor(const DNNInputData *input)
> +static TF_Tensor *allocate_input_tensor(const DNNData *input)
>  {
>  TF_DataType dt;
>  size_t size;
> @@ -105,7 +105,7 @@ static TF_Tensor *allocate_input_tensor(con

Re: [FFmpeg-devel] [PATCH V2 3/4] avfilter/dnn: add a new interface to query dnn model's input info

2019-10-30 Thread Pedro Arthur
On Mon, Oct 21, 2019 at 09:44, Guo, Yejun wrote:

> To support dnn networks more generally, we need to know the input info
> of the dnn model.
>
> background:
> The data type of dnn model's input could be float32, uint8 or fp16, etc.
> And the w/h of input image could be fixed or variable.
>
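The intended usage on the filter side is roughly (a sketch; "dnn_in" is a
placeholder input name):

DNNData model_input;
if (model->get_input(model->model, &model_input, "dnn_in") != DNN_SUCCESS)
    return AVERROR(EIO);
/* model_input.dt reports the element type (e.g. DNN_FLOAT or DNN_UINT8);
   model_input.width/height may be -1 when the model accepts variable sizes */
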
> Signed-off-by: Guo, Yejun 
> ---
>  libavfilter/dnn/dnn_backend_native.c | 24 +++-
>  libavfilter/dnn/dnn_backend_tf.c | 32 
>  libavfilter/dnn_interface.h  |  3 +++
>  3 files changed, 58 insertions(+), 1 deletion(-)
>
> diff --git a/libavfilter/dnn/dnn_backend_native.c
> b/libavfilter/dnn/dnn_backend_native.c
> index add1db4..94634b3 100644
> --- a/libavfilter/dnn/dnn_backend_native.c
> +++ b/libavfilter/dnn/dnn_backend_native.c
> @@ -28,6 +28,28 @@
>  #include "dnn_backend_native_layer_conv2d.h"
>  #include "dnn_backend_native_layers.h"
>
> +static DNNReturnType get_input_native(void *model, DNNData *input, const
> char *input_name)
> +{
> +ConvolutionalNetwork *network = (ConvolutionalNetwork *)model;
> +
> +for (int i = 0; i < network->operands_num; ++i) {
> +DnnOperand *oprd = &network->operands[i];
> +if (strcmp(oprd->name, input_name) == 0) {
> +if (oprd->type != DOT_INPUT)
> +return DNN_ERROR;
> +input->dt = oprd->data_type;
> +av_assert0(oprd->dims[0] == 1);
> +input->height = oprd->dims[1];
> +input->width = oprd->dims[2];
> +input->channels = oprd->dims[3];
> +return DNN_SUCCESS;
> +}
> +}
> +
> +// do not find the input operand
> +return DNN_ERROR;
> +}
> +
>  static DNNReturnType set_input_output_native(void *model, DNNData *input,
> const char *input_name, const char **output_names, uint32_t nb_output)
>  {
>  ConvolutionalNetwork *network = (ConvolutionalNetwork *)model;
> @@ -37,7 +59,6 @@ static DNNReturnType set_input_output_native(void
> *model, DNNData *input, const
>  return DNN_ERROR;
>
>  /* inputs */
> -av_assert0(input->dt == DNN_FLOAT);
>  for (int i = 0; i < network->operands_num; ++i) {
>  oprd = &network->operands[i];
>  if (strcmp(oprd->name, input_name) == 0) {
> @@ -234,6 +255,7 @@ DNNModel *ff_dnn_load_model_native(const char
> *model_filename)
>  }
>
>  model->set_input_output = &set_input_output_native;
> +model->get_input = &get_input_native;
>
>  return model;
>  }
> diff --git a/libavfilter/dnn/dnn_backend_tf.c
> b/libavfilter/dnn/dnn_backend_tf.c
> index ed91d05..a921667 100644
> --- a/libavfilter/dnn/dnn_backend_tf.c
> +++ b/libavfilter/dnn/dnn_backend_tf.c
> @@ -105,6 +105,37 @@ static TF_Tensor *allocate_input_tensor(const DNNData
> *input)
>   input_dims[1] * input_dims[2] *
> input_dims[3] * size);
>  }
>
> +static DNNReturnType get_input_tf(void *model, DNNData *input, const char
> *input_name)
> +{
> +TFModel *tf_model = (TFModel *)model;
> +TF_Status *status;
> +int64_t dims[4];
> +
> +TF_Output tf_output;
> +tf_output.oper = TF_GraphOperationByName(tf_model->graph, input_name);
> +if (!tf_output.oper)
> +return DNN_ERROR;
> +
> +tf_output.index = 0;
> +input->dt = TF_OperationOutputType(tf_output);
> +
> +status = TF_NewStatus();
> +TF_GraphGetTensorShape(tf_model->graph, tf_output, dims, 4, status);
> +if (TF_GetCode(status) != TF_OK){
> +TF_DeleteStatus(status);
> +return DNN_ERROR;
> +}
> +TF_DeleteStatus(status);
> +
> +// currently only NHWC is supported
> +av_assert0(dims[0] == 1);
> +input->height = dims[1];
> +input->width = dims[2];
> +input->channels = dims[3];
> +
> +return DNN_SUCCESS;
> +}
> +
>  static DNNReturnType set_input_output_tf(void *model, DNNData *input,
> const char *input_name, const char **output_names, uint32_t nb_output)
>  {
>  TFModel *tf_model = (TFModel *)model;
> @@ -568,6 +599,7 @@ DNNModel *ff_dnn_load_model_tf(const char
> *model_filename)
>
>  model->model = (void *)tf_model;
>  model->set_input_output = &set_input_output_tf;
> +model->get_input = &get_input_tf;
>
>  return model;
>  }
> diff --git a/libavfilter/dnn_interface.h b/libavfilter/dnn_interface.h
> index fdefcb7..b20e5c8 100644
> --- a/libavfilter/dnn_interface.h
> +++ b/libavfilter/dnn_interface.h
> @@ -43,6 +43,9 @@ typedef struct DNNData{
>  typedef struct DNNModel{
>  // Stores model that can be different for different backends.
>  void *model;
> +// Gets model input information
> +// Just reuse struct DNNData here, actually the DNNData.data field is
> not needed.
> +DNNReturnType (*get_input)(void *model, DNNData *input, const char
> *input_name);
>  // Sets model input and output.
>  // Should be called at least once before model execution.
>  DNNReturnType (*set_input_output)(void *model, DNNData *input, const
> char *inp

Re: [FFmpeg-devel] [PATCH] avfilter/vf_sr: correct flags since the filter changes frame w/h

2019-10-30 Thread Pedro Arthur
Pushed, thanks.

On Mon, Oct 28, 2019 at 10:24, Paul B Mahol wrote:

> LGTM
>
> On 10/28/19, Guo, Yejun  wrote:
> > If a filter changes the frame w/h, AVFILTER_FLAG_SUPPORT_TIMELINE_GENERIC
> > cannot be supported.
> >
> > Signed-off-by: Guo, Yejun 
> > ---
> >  libavfilter/vf_sr.c | 1 -
> >  1 file changed, 1 deletion(-)
> >
> > diff --git a/libavfilter/vf_sr.c b/libavfilter/vf_sr.c
> > index 0433246..b90643c 100644
> > --- a/libavfilter/vf_sr.c
> > +++ b/libavfilter/vf_sr.c
> > @@ -317,5 +317,4 @@ AVFilter ff_vf_sr = {
> >  .inputs= sr_inputs,
> >  .outputs   = sr_outputs,
> >  .priv_class= &sr_class,
> > -.flags = AVFILTER_FLAG_SUPPORT_TIMELINE_GENERIC,
> >  };
> > --
> > 2.7.4
> >

Re: [FFmpeg-devel] [PATCH V3] avfilter/vf_dnn_processing: add a generic filter for image proccessing with dnn networks

2019-11-06 Thread Pedro Arthur
Hi,

On Thu, Oct 31, 2019 at 05:39, Guo, Yejun wrote:

> This filter accepts all the dnn networks which do image processing.
> Currently, frame with formats rgb24 and bgr24 are supported. Other
> formats such as gray and YUV will be supported next. The dnn network
> can accept data in float32 or uint8 format. And the dnn network can
> change frame size.
>
> The following is a python script to halve the value of the first
> channel of the pixel. It demos how to setup and execute dnn model
> with python+tensorflow. It also generates .pb file which will be
> used by ffmpeg.
>
> import tensorflow as tf
> import numpy as np
> import scipy.misc
> in_img = scipy.misc.imread('in.bmp')
> in_img = in_img.astype(np.float32)/255.0
> in_data = in_img[np.newaxis, :]
> filter_data = np.array([0.5, 0, 0, 0, 1., 0, 0, 0,
> 1.]).reshape(1,1,3,3).astype(np.float32)
> filter = tf.Variable(filter_data)
> x = tf.placeholder(tf.float32, shape=[1, None, None, 3], name='dnn_in')
> y = tf.nn.conv2d(x, filter, strides=[1, 1, 1, 1], padding='VALID',
> name='dnn_out')
> sess=tf.Session()
> sess.run(tf.global_variables_initializer())
> output = sess.run(y, feed_dict={x: in_data})
> graph_def = tf.graph_util.convert_variables_to_constants(sess,
> sess.graph_def, ['dnn_out'])
> tf.train.write_graph(graph_def, '.', 'halve_first_channel.pb',
> as_text=False)
> output = output * 255.0
> output = output.astype(np.uint8)
> scipy.misc.imsave("out.bmp", np.squeeze(output))
>
> To do the same thing with ffmpeg:
> - generate halve_first_channel.pb with the above script
> - generate halve_first_channel.model with tools/python/convert.py
> - try with following commands
>   ./ffmpeg -i input.jpg -vf
> dnn_processing=model=halve_first_channel.model:input=dnn_in:output=dnn_out:fmt=rgb24:dnn_backend=native
> -y out.native.png
>   ./ffmpeg -i input.jpg -vf
> dnn_processing=model=halve_first_channel.pb:input=dnn_in:output=dnn_out:fmt=rgb24:dnn_backend=tensorflow
> -y out.tf.png
>
It would be great if you could turn the above steps into a FATE test;
that way one can automatically ensure the filter keeps working properly.


>
> Signed-off-by: Guo, Yejun 
> ---
>  configure   |   1 +
>  doc/filters.texi|  44 ++
>  libavfilter/Makefile|   1 +
>  libavfilter/allfilters.c|   1 +
>  libavfilter/vf_dnn_processing.c | 331
> 
>  5 files changed, 378 insertions(+)
>  create mode 100644 libavfilter/vf_dnn_processing.c
>
> diff --git a/configure b/configure
> index 875b77f..4b3964d 100755
> --- a/configure
> +++ b/configure
> @@ -3463,6 +3463,7 @@ derain_filter_select="dnn"
>  deshake_filter_select="pixelutils"
>  deshake_opencl_filter_deps="opencl"
>  dilation_opencl_filter_deps="opencl"
> +dnn_processing_filter_select="dnn"
>  drawtext_filter_deps="libfreetype"
>  drawtext_filter_suggest="libfontconfig libfribidi"
>  elbg_filter_deps="avcodec"
> diff --git a/doc/filters.texi b/doc/filters.texi
> index 9d387be..15771ab 100644
> --- a/doc/filters.texi
> +++ b/doc/filters.texi
> @@ -8928,6 +8928,50 @@ ffmpeg -i INPUT -f lavfi -i
> nullsrc=hd720,geq='r=128+80*(sin(sqrt((X-W/2)*(X-W/2
>  @end example
>  @end itemize
>
> +@section dnn_processing
> +
> +Do image processing with deep neural networks. Currently only AVFrame
> with RGB24
> +and BGR24 are supported, more formats will be added later.
> +
> +The filter accepts the following options:
> +
> +@table @option
> +@item dnn_backend
> +Specify which DNN backend to use for model loading and execution. This
> option accepts
> +the following values:
> +
> +@table @samp
> +@item native
> +Native implementation of DNN loading and execution.
> +
> +@item tensorflow
> +TensorFlow backend. To enable this backend you
> +need to install the TensorFlow for C library (see
> +@url{https://www.tensorflow.org/install/install_c}) and configure FFmpeg
> with
> +@code{--enable-libtensorflow}
> +@end table
> +
> +Default value is @samp{native}.
> +
> +@item model
> +Set path to model file specifying network architecture and its parameters.
> +Note that different backends use different file formats. TensorFlow and
> native
> +backend can load files for only its format.
> +
> +Native model file (.model) can be generated from TensorFlow model file
> (.pb) by using tools/python/convert.py
> +
> +@item input
> +Set the input name of the dnn network.
> +
> +@item output
> +Set the output name of the dnn network.
> +
> +@item fmt
> +Set the pixel format for the Frame. Allowed values are
> @code{AV_PIX_FMT_RGB24}, and @code{AV_PIX_FMT_BGR24}.
> +Default value is @code{AV_PIX_FMT_RGB24}.
> +
> +@end table
> +
>  @section drawbox
>
>  Draw a colored box on the input image.
> diff --git a/libavfilter/Makefile b/libavfilter/Makefile
> index 2080eed..3eff398 100644
> --- a/libavfilter/Makefile
> +++ b/libavfilter/Makefile
> @@ -223,6 +223,7 @@ OBJS-$(CONFIG_DILATION_FILTER)   +=
> vf_neighbor.o
>  OBJS-$(CONFIG_DILATION_OP

Re: [FFmpeg-devel] [PATCH V3] avfilter/vf_dnn_processing: add a generic filter for image proccessing with dnn networks

2019-11-07 Thread Pedro Arthur
On Thu, Nov 7, 2019 at 13:17, Guo, Yejun wrote:

>
> > > From: Pedro Arthur [mailto:bygran...@gmail.com]
> > > Sent: Thursday, November 07, 2019 1:18 AM
> > > To: FFmpeg development discussions and patches
> > 
> > > Cc: Guo, Yejun 
> > > Subject: Re: [FFmpeg-devel] [PATCH V3] avfilter/vf_dnn_processing: add
> a
> > > generic filter for image proccessing with dnn networks
> > >
> > > Hi,
> > >
> > > On Thu, Oct 31, 2019 at 05:39, Guo, Yejun wrote:
> > > This filter accepts all the dnn networks which do image processing.
> > > Currently, frame with formats rgb24 and bgr24 are supported. Other
> > > formats such as gray and YUV will be supported next. The dnn network
> > > can accept data in float32 or uint8 format. And the dnn network can
> > > change frame size.
> > >
> > > The following is a python script to halve the value of the first
> > > channel of the pixel. It demos how to setup and execute dnn model
> > > with python+tensorflow. It also generates .pb file which will be
> > > used by ffmpeg.
> > >
> > > import tensorflow as tf
> > > import numpy as np
> > > import scipy.misc
> > > in_img = scipy.misc.imread('in.bmp')
> > > in_img = in_img.astype(np.float32)/255.0
> > > in_data = in_img[np.newaxis, :]
> > > filter_data = np.array([0.5, 0, 0, 0, 1., 0, 0, 0,
> > > 1.]).reshape(1,1,3,3).astype(np.float32)
> > > filter = tf.Variable(filter_data)
> > > x = tf.placeholder(tf.float32, shape=[1, None, None, 3], name='dnn_in')
> > > y = tf.nn.conv2d(x, filter, strides=[1, 1, 1, 1], padding='VALID',
> > name='dnn_out')
> > > sess=tf.Session()
> > > sess.run(tf.global_variables_initializer())
> > > output = sess.run(y, feed_dict={x: in_data})
> > > graph_def = tf.graph_util.convert_variables_to_constants(sess,
> > > sess.graph_def, ['dnn_out'])
> > > tf.train.write_graph(graph_def, '.', 'halve_first_channel.pb',
> as_text=False)
> > > output = output * 255.0
> > > output = output.astype(np.uint8)
> > > scipy.misc.imsave("out.bmp", np.squeeze(output))
> > >
> > > To do the same thing with ffmpeg:
> > > - generate halve_first_channel.pb with the above script
> > > - generate halve_first_channel.model with tools/python/convert.py
> > > - try with following commands
> > >   ./ffmpeg -i input.jpg -vf
> > >
> > dnn_processing=model=halve_first_channel.model:input=dnn_in:output=dnn_
> > > out:fmt=rgb24:dnn_backend=native -y out.native.png
> > >   ./ffmpeg -i input.jpg -vf
> > >
> > dnn_processing=model=halve_first_channel.pb:input=dnn_in:output=dnn_out:f
> > > mt=rgb24:dnn_backend=tensorflow -y out.tf.png
> > It would be great if you could turn the above steps into a FATE test; that
> > way one can automatically ensure the filter keeps working properly.
>
> sure, I'll add a fate test to test this filter with halve_first_channel.model.
> There will be no test for the tensorflow part, since a fate test must not
> require external dependencies.
>
> furthermore, more industry-famous models can be added into this fate test
> after we support them by adding more layers to native mode, and after we
> optimize the conv2d layer, which is now very, very slow.
>
> > > +};
> > > +
> > > +AVFilter ff_vf_dnn_processing = {
> > > +.name  = "dnn_processing",
> > > +.description   = NULL_IF_CONFIG_SMALL("Apply DNN processing
> > filter
> > > to the input."),
> > > +.priv_size = sizeof(DnnProcessingContext),
> > > +.init  = init,
> > > +.uninit= uninit,
> > > +.query_formats = query_formats,
> > > +.inputs= dnn_processing_inputs,
> > > +.outputs   = dnn_processing_outputs,
> > > +.priv_class= &dnn_processing_class,
> > > +};
> > > --
> > > 2.7.4
> > rest LGTM.
>
> thanks, could we first push this patch?
>
patch pushed, thanks.
I slightly edited the commit message, changing "scipy.misc" to "imageio", as
the former is deprecated and not present in newer versions.


> I plan to add two more changes for this filter next:
> - add gray8 and gray32 support
> - add y_from_yuv support, in other words, the network only handles the Y
> channel,
> and uv parts ar

Re: [FFmpeg-devel] [PATCH] MAINTAINERS: add myself to libavfilter/dnn

2019-12-02 Thread Pedro Arthur
LGTM.

On Mon, Dec 2, 2019, 06:17, Steven Liu wrote:

>
>
> > On Nov 30, 2019, at 12:24, Guo, Yejun wrote:
> >
> > Signed-off-by: Guo, Yejun 
> > ---
> > MAINTAINERS | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 7f60ef0..5d02520 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -369,6 +369,8 @@ Filters:
> > Sources:
> >   vsrc_mandelbrot.c Michael Niedermayer
> >
> > +dnn Yejun Guo
> > +
> > libavformat
> > ===
> >
> > --
> > 2.7.4
> >
> The patch will be applied after 24 hours if there are no objections,
> because Yejun Guo is interested in this part and continuously focuses on
> optimizing the dnn code.
>
> Thanks
> Steven

Re: [FFmpeg-devel] [PATCH 3/4] avfilter/vf_dnn_processing: add format GRAY8 and GRAYF32 support

2019-12-12 Thread Pedro Arthur
Hi,

how should I test this patch?

On Fri, Nov 22, 2019 at 04:57, Guo, Yejun wrote:

> Signed-off-by: Guo, Yejun 
> ---
>  doc/filters.texi|   8 ++-
>  libavfilter/vf_dnn_processing.c | 147
> ++--
>  2 files changed, 118 insertions(+), 37 deletions(-)
>
> diff --git a/doc/filters.texi b/doc/filters.texi
> index 1f86ae1..c3f7997 100644
> --- a/doc/filters.texi
> +++ b/doc/filters.texi
> @@ -8992,7 +8992,13 @@ Set the input name of the dnn network.
>  Set the output name of the dnn network.
>
>  @item fmt
> -Set the pixel format for the Frame. Allowed values are
> @code{AV_PIX_FMT_RGB24}, and @code{AV_PIX_FMT_BGR24}.
> +Set the pixel format for the Frame, the value is determined by the input
> of the dnn network model.
>
This sentence is a bit confusing; also, I think this property should be
removed (I will explain below).

> +
> +If the model handles RGB (or BGR) image and the data type of model input
> is uint8, fmt must be @code{AV_PIX_FMT_RGB24} (or @code{AV_PIX_FMT_BGR24}.
> +If the model handles RGB (or BGR) image and the data type of model input
> is float, fmt must be @code{AV_PIX_FMT_RGB24} (or @code{AV_PIX_FMT_BGR24},
> and this filter will do data type conversion internally.
> +If the model handles GRAY image and the data type of model input is
> uint8, fmt must be @code{AV_PIX_FMT_GRAY8}.
> +If the model handles GRAY image and the data type of model input is
> float, fmt must be @code{AV_PIX_FMT_GRAYF32}.
> +
>  Default value is @code{AV_PIX_FMT_RGB24}.
>
>  @end table
> diff --git a/libavfilter/vf_dnn_processing.c
> b/libavfilter/vf_dnn_processing.c
> index ce976ec..963dd5e 100644
> --- a/libavfilter/vf_dnn_processing.c
> +++ b/libavfilter/vf_dnn_processing.c
> @@ -70,10 +70,12 @@ static av_cold int init(AVFilterContext *context)
>  {
>  DnnProcessingContext *ctx = context->priv;
>  int supported = 0;
> -// as the first step, only rgb24 and bgr24 are supported
> +// to support more formats
>  const enum AVPixelFormat supported_pixel_fmts[] = {
>  AV_PIX_FMT_RGB24,
>  AV_PIX_FMT_BGR24,
> +AV_PIX_FMT_GRAY8,
> +AV_PIX_FMT_GRAYF32,
>  };
>  for (int i = 0; i < sizeof(supported_pixel_fmts) / sizeof(enum
> AVPixelFormat); ++i) {
>  if (supported_pixel_fmts[i] == ctx->fmt) {
> @@ -156,14 +158,38 @@ static int config_input(AVFilterLink *inlink)
>  return AVERROR(EIO);
>  }
>
I think the filter should not check formats manually in the init function
(unless I'm missing something); it would be best to query for all the
supported formats above in query_formats, and later, in config_input, make
sure the expected model format matches the frame format.
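
Something along these lines (a sketch using the usual libavfilter pattern):

static int query_formats(AVFilterContext *ctx)
{
    static const enum AVPixelFormat pix_fmts[] = {
        AV_PIX_FMT_RGB24, AV_PIX_FMT_BGR24,
        AV_PIX_FMT_GRAY8, AV_PIX_FMT_GRAYF32,
        AV_PIX_FMT_NONE
    };
    AVFilterFormats *fmts_list = ff_make_format_list(pix_fmts);
    if (!fmts_list)
        return AVERROR(ENOMEM);
    return ff_set_common_formats(ctx, fmts_list);
}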


> -if (model_input.channels != 3) {
> -av_log(ctx, AV_LOG_ERROR, "the model requires input channels
> %d\n",
> -   model_input.channels);
> -return AVERROR(EIO);
> -}
> -if (model_input.dt != DNN_FLOAT && model_input.dt != DNN_UINT8) {
> -av_log(ctx, AV_LOG_ERROR, "only support dnn models with input
> data type as float32 and uint8.\n");
> -return AVERROR(EIO);
> +if (ctx->fmt == AV_PIX_FMT_RGB24 || ctx->fmt == AV_PIX_FMT_BGR24) {
> +if (model_input.channels != 3) {
> +av_log(ctx, AV_LOG_ERROR, "channel number 3 is required, but
> the actual channel number is %d\n",
> +   model_input.channels);
> +return AVERROR(EIO);
> +}
> +if (model_input.dt != DNN_FLOAT && model_input.dt != DNN_UINT8) {
> +av_log(ctx, AV_LOG_ERROR, "only support dnn models with input
> data type as float32 and uint8.\n");
> +return AVERROR(EIO);
> +}
> +} else if (ctx->fmt == AV_PIX_FMT_GRAY8) {
> +if (model_input.channels != 1) {
> +av_log(ctx, AV_LOG_ERROR, "channel number 1 is required, but
> the actual channel number is %d\n",
> +   model_input.channels);
> +return AVERROR(EIO);
> +}
> +if (model_input.dt != DNN_UINT8) {
> +av_log(ctx, AV_LOG_ERROR, "only support dnn models with input
> data type as uint8.\n");
> +return AVERROR(EIO);
> +}
> +} else if (ctx->fmt == AV_PIX_FMT_GRAYF32) {
> +if (model_input.channels != 1) {
> +av_log(ctx, AV_LOG_ERROR, "channel number 1 is required, but
> the actual channel number is %d\n",
> +   model_input.channels);
> +return AVERROR(EIO);
> +}
> +if (model_input.dt != DNN_FLOAT) {
> +av_log(ctx, AV_LOG_ERROR, "only support dnn models with input
> data type as float.\n");
> +return AVERROR(EIO);
> +}
> +} else {
> +av_assert0(!"should not reach here.");
>  }
>
General comment on the above and the following chained ifs testing pixel
formats: personally, I would use a switch on the pixel format here,
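e.g. a sketch of the shape I mean (keeping your checks; error logging omitted
for brevity):

switch (ctx->fmt) {
case AV_PIX_FMT_RGB24:
case AV_PIX_FMT_BGR24:
    if (model_input.channels != 3)
        return AVERROR(EIO);
    break;
case AV_PIX_FMT_GRAY8:
    if (model_input.channels != 1 || model_input.dt != DNN_UINT8)
        return AVERROR(EIO);
    break;
case AV_PIX_FMT_GRAYF32:
    if (model_input.channels != 1 || model_input.dt != DNN_FLOAT)
        return AVERROR(EIO);
    break;
default:
    av_assert0(!"should not reach here");
}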

Re: [FFmpeg-devel] [PATCH 2/4] convert_from_tensorflow.py: add support when kernel size is 1*1 with one input/output channel (gray image)

2019-12-12 Thread Pedro Arthur
On Fri, Nov 22, 2019 at 04:57, Guo, Yejun wrote:
>
> Signed-off-by: Guo, Yejun 
> ---
>  tools/python/convert_from_tensorflow.py | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/tools/python/convert_from_tensorflow.py 
> b/tools/python/convert_from_tensorflow.py
> index 605158a..5e87e22 100644
> --- a/tools/python/convert_from_tensorflow.py
> +++ b/tools/python/convert_from_tensorflow.py
> @@ -193,7 +193,10 @@ class TFConverter:
>  filter_width = ktensor.tensor_shape.dim[1].size
>  in_channels = ktensor.tensor_shape.dim[2].size
>  out_channels = ktensor.tensor_shape.dim[3].size
> -kernel = np.frombuffer(ktensor.tensor_content, dtype=np.float32)
> +if filter_height * filter_width * in_channels * out_channels == 1:
> +kernel = np.float32(ktensor.float_val[0])
> +else:
> +kernel = np.frombuffer(ktensor.tensor_content, dtype=np.float32)
>  kernel = kernel.reshape(filter_height, filter_width, in_channels, 
> out_channels)
>  kernel = np.transpose(kernel, [3, 0, 1, 2])
>
> --
> 2.7.4
LGTM, should push soon.
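
For context on why this 1*1 case shows up at all: with one input and one
output channel the convolution degenerates to a per-pixel scalar
multiply-add, which makes such models convenient minimal tests (the
halve_gray_float script posted later on this list is exactly this, with
weight 0.5 and no bias). In C terms, a sketch only, not the backend's
actual code:

    /* 1x1 convolution, 1 input / 1 output channel:
     * y[i] = weight * x[i] + bias for every pixel */
    static void conv1x1_1ch(float *dst, const float *src, int n,
                            float weight, float bias)
    {
        for (int i = 0; i < n; i++)
            dst[i] = weight * src[i] + bias;
    }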


Re: [FFmpeg-devel] [PATCH 1/4] avfilter/vf_dnn_processing: refine code for better naming

2019-12-12 Thread Pedro Arthur
On Fri, Nov 22, 2019 at 04:56, Guo, Yejun wrote:
>
> Signed-off-by: Guo, Yejun 
> ---
>  libavfilter/vf_dnn_processing.c | 90 
> -
>  1 file changed, 45 insertions(+), 45 deletions(-)
>
> diff --git a/libavfilter/vf_dnn_processing.c b/libavfilter/vf_dnn_processing.c
> index f59cfb0..ce976ec 100644
> --- a/libavfilter/vf_dnn_processing.c
> +++ b/libavfilter/vf_dnn_processing.c
> @@ -136,40 +136,40 @@ static int config_input(AVFilterLink *inlink)
>  AVFilterContext *context = inlink->dst;
>  DnnProcessingContext *ctx = context->priv;
>  DNNReturnType result;
> -DNNData dnn_data;
> +DNNData model_input;
>
> -result = ctx->model->get_input(ctx->model->model, &dnn_data, 
> ctx->model_inputname);
> +result = ctx->model->get_input(ctx->model->model, &model_input, 
> ctx->model_inputname);
>  if (result != DNN_SUCCESS) {
>  av_log(ctx, AV_LOG_ERROR, "could not get input from the model\n");
>  return AVERROR(EIO);
>  }
>
>  // the design is to add explicit scale filter before this filter
> -if (dnn_data.height != -1 && dnn_data.height != inlink->h) {
> +if (model_input.height != -1 && model_input.height != inlink->h) {
>  av_log(ctx, AV_LOG_ERROR, "the model requires frame height %d but 
> got %d\n",
> -   dnn_data.height, inlink->h);
> +   model_input.height, inlink->h);
>  return AVERROR(EIO);
>  }
> -if (dnn_data.width != -1 && dnn_data.width != inlink->w) {
> +if (model_input.width != -1 && model_input.width != inlink->w) {
>  av_log(ctx, AV_LOG_ERROR, "the model requires frame width %d but got 
> %d\n",
> -   dnn_data.width, inlink->w);
> +   model_input.width, inlink->w);
>  return AVERROR(EIO);
>  }
>
> -if (dnn_data.channels != 3) {
> +if (model_input.channels != 3) {
>  av_log(ctx, AV_LOG_ERROR, "the model requires input channels %d\n",
> -   dnn_data.channels);
> +   model_input.channels);
>  return AVERROR(EIO);
>  }
> -if (dnn_data.dt != DNN_FLOAT && dnn_data.dt != DNN_UINT8) {
> +if (model_input.dt != DNN_FLOAT && model_input.dt != DNN_UINT8) {
>  av_log(ctx, AV_LOG_ERROR, "only support dnn models with input data 
> type as float32 and uint8.\n");
>  return AVERROR(EIO);
>  }
>
>  ctx->input.width= inlink->w;
>  ctx->input.height   = inlink->h;
> -ctx->input.channels = dnn_data.channels;
> -ctx->input.dt = dnn_data.dt;
> +ctx->input.channels = model_input.channels;
> +ctx->input.dt = model_input.dt;
>
>  result = (ctx->model->set_input_output)(ctx->model->model,
>  &ctx->input, ctx->model_inputname,
> @@ -201,28 +201,28 @@ static int config_output(AVFilterLink *outlink)
>  return 0;
>  }
>
> -static int copy_from_frame_to_dnn(DNNData *dnn_data, const AVFrame *in)
> +static int copy_from_frame_to_dnn(DNNData *dnn_input, const AVFrame *frame)
>  {
>  // extend this function to support more formats
> -av_assert0(in->format == AV_PIX_FMT_RGB24 || in->format == 
> AV_PIX_FMT_BGR24);
> -
> -if (dnn_data->dt == DNN_FLOAT) {
> -float *dnn_input = dnn_data->data;
> -for (int i = 0; i < in->height; i++) {
> -for(int j = 0; j < in->width * 3; j++) {
> -int k = i * in->linesize[0] + j;
> -int t = i * in->width * 3 + j;
> -dnn_input[t] = in->data[0][k] / 255.0f;
> +av_assert0(frame->format == AV_PIX_FMT_RGB24 || frame->format == 
> AV_PIX_FMT_BGR24);
> +
> +if (dnn_input->dt == DNN_FLOAT) {
> +float *dnn_input_data = dnn_input->data;
> +for (int i = 0; i < frame->height; i++) {
> +for(int j = 0; j < frame->width * 3; j++) {
> +int k = i * frame->linesize[0] + j;
> +int t = i * frame->width * 3 + j;
> +dnn_input_data[t] = frame->data[0][k] / 255.0f;
>  }
>  }
>  } else {
> -uint8_t *dnn_input = dnn_data->data;
> -av_assert0(dnn_data->dt == DNN_UINT8);
> -for (int i = 0; i < in->height; i++) {
> -for(int j = 0; j < in->width * 3; j++) {
> -int k = i * in->linesize[0] + j;
> -int t = i * in->width * 3 + j;
> -dnn_input[t] = in->data[0][k];
> +uint8_t *dnn_input_data = dnn_input->data;
> +av_assert0(dnn_input->dt == DNN_UINT8);
> +for (int i = 0; i < frame->height; i++) {
> +for(int j = 0; j < frame->width * 3; j++) {
> +int k = i * frame->linesize[0] + j;
> +int t = i * frame->width * 3 + j;
> +dnn_input_data[t] = frame->data[0][k];
>  }
>  }
>  }
> @@ -23
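
One note on these copy loops: they index with frame->linesize[0] rather
than treating the plane as contiguous because AVFrame rows may be padded.
For the uint8 case the inner loop is equivalent to a per-row memcpy
(sketch; av_image_copy_plane() in libavutil does the same job):

    // copy width*3 meaningful bytes per row, skipping any padding
    // that linesize adds at the end of each AVFrame row
    for (int i = 0; i < frame->height; i++)
        memcpy(dnn_input_data + i * frame->width * 3,
               frame->data[0]  + i * frame->linesize[0],
               frame->width * 3);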

Re: [FFmpeg-devel] [PATCH 3/4] avfilter/vf_dnn_processing: add format GRAY8 and GRAYF32 support

2019-12-13 Thread Pedro Arthur
On Fri, Dec 13, 2019 at 08:23, Guo, Yejun wrote:
>
> > From: Pedro Arthur [mailto:bygran...@gmail.com]
> > Sent: Friday, December 13, 2019 12:45 AM
> > To: FFmpeg development discussions and patches 
> > Cc: Guo, Yejun 
> > Subject: Re: [FFmpeg-devel] [PATCH 3/4] avfilter/vf_dnn_processing: add
> > format GRAY8 and GRAYF32 support
> > Hi,
> >
> > how should I test this patch?
>
> the fourth patch of this patch set is the fate test for this feature, so I
> omitted the test description here.
> I'll add the test descriptions back in v2.
>
> >
> > On Fri, Nov 22, 2019 at 04:57, Guo, Yejun wrote:
> >
> > > Signed-off-by: Guo, Yejun 
> > > ---
> > >  doc/filters.texi|   8 ++-
> > >  libavfilter/vf_dnn_processing.c | 147
> > > ++--
> > >  2 files changed, 118 insertions(+), 37 deletions(-)
> > >
> > > diff --git a/doc/filters.texi b/doc/filters.texi
> > > index 1f86ae1..c3f7997 100644
> > > --- a/doc/filters.texi
> > > +++ b/doc/filters.texi
> > > @@ -8992,7 +8992,13 @@ Set the input name of the dnn network.
> > >  Set the output name of the dnn network.
> > >
> > >  @item fmt
> > > -Set the pixel format for the Frame. Allowed values are
> > > @code{AV_PIX_FMT_RGB24}, and @code{AV_PIX_FMT_BGR24}.
> > > +Set the pixel format for the Frame, the value is determined by the input
> > > of the dnn network model.
> > >
> > This sentence is a bit confusing; also, I think this property should be
> > removed (I will explain below).
>
> sure, no problem.
>
> >
> > +
> > > +If the model handles RGB (or BGR) image and the data type of model input is uint8, fmt must be @code{AV_PIX_FMT_RGB24} (or @code{AV_PIX_FMT_BGR24}).
> > > +If the model handles RGB (or BGR) image and the data type of model input is float, fmt must be @code{AV_PIX_FMT_RGB24} (or @code{AV_PIX_FMT_BGR24}), and this filter will do data type conversion internally.
> > > +If the model handles GRAY image and the data type of model input is uint8, fmt must be @code{AV_PIX_FMT_GRAY8}.
> > > +If the model handles GRAY image and the data type of model input is float, fmt must be @code{AV_PIX_FMT_GRAYF32}.
> > > +
> > >  Default value is @code{AV_PIX_FMT_RGB24}.
> > >
> > >  @end table
> > > diff --git a/libavfilter/vf_dnn_processing.c
> > > b/libavfilter/vf_dnn_processing.c
> > > index ce976ec..963dd5e 100644
> > > --- a/libavfilter/vf_dnn_processing.c
> > > +++ b/libavfilter/vf_dnn_processing.c
> > > @@ -70,10 +70,12 @@ static av_cold int init(AVFilterContext *context)
> > >  {
> > >  DnnProcessingContext *ctx = context->priv;
> > >  int supported = 0;
> > > -// as the first step, only rgb24 and bgr24 are supported
> > > +// to support more formats
> > >  const enum AVPixelFormat supported_pixel_fmts[] = {
> > >  AV_PIX_FMT_RGB24,
> > >  AV_PIX_FMT_BGR24,
> > > +AV_PIX_FMT_GRAY8,
> > > +AV_PIX_FMT_GRAYF32,
> > >  };
> > >  for (int i = 0; i < sizeof(supported_pixel_fmts) / sizeof(enum
> > > AVPixelFormat); ++i) {
> > >  if (supported_pixel_fmts[i] == ctx->fmt) {
> > > @@ -156,14 +158,38 @@ static int config_input(AVFilterLink *inlink)
> > >  return AVERROR(EIO);
> > >  }
> > >
> > I think the filter should not check formats manually in the init function
> > (unless I'm missing something); it would be best to query all the supported
> > formats above in query_formats and then, in config_input, make sure the
> > expected model format matches the frame format.
>
> I'm afraid it is too late if we find the format mismatch in function 
> config_input.
>
> For example, the dnn module is designed to accept BGR24 data, but the actual
> format that comes into config_input is RGB24 or YUV420P (we'll add yuv formats
> later to the supported pixel fmts) or something else such as GRAY8. We have two choices:
>
> - return an error, and the application ends.
>   This is not what we want.
> - return no error, and do the format conversion at the beginning of
>   filter_frame.
>   That makes this filter complex, and our implementation of the conversion
>   might not be the best optimized.
> My idea is to keep this filter simple, and the users can convert the format
> with an explicit format filter inserted before this one.
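
For reference, a rough sketch of what that second option would pull into
filter_frame (assuming libswscale; model_fmt and the scratch frame tmp are
made-up names for illustration):

    // option 2 (rejected): convert whatever format arrives into the
    // format the model expects, inside the filter itself
    struct SwsContext *sws = sws_getContext(in->width, in->height, in->format,
                                            in->width, in->height, model_fmt,
                                            SWS_BILINEAR, NULL, NULL, NULL);
    if (!sws)
        return AVERROR(EINVAL);
    sws_scale(sws, (const uint8_t * const *)in->data, in->linesize,
              0, in->height, tmp->data, tmp->linesize);
    sws_freeContext(sws);

Keeping the conversion in a dedicated format/scale filter avoids all of
this inside dnn_processing.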

Re: [FFmpeg-devel] [PATCH 3/4] avfilter/vf_dnn_processing: add format GRAY8 and GRAYF32 support

2019-12-16 Thread Pedro Arthur
On Mon, Dec 16, 2019 at 09:39, Guo, Yejun wrote:
>
>
>
> > -Original Message-
> > From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> > myp...@gmail.com
> > Sent: Monday, December 16, 2019 7:43 PM
> > To: FFmpeg development discussions and patches 
> > Cc: Pedro Arthur 
> > Subject: Re: [FFmpeg-devel] [PATCH 3/4] avfilter/vf_dnn_processing: add
> > format GRAY8 and GRAYF32 support
> >
> > On Mon, Dec 16, 2019 at 7:18 PM Guo, Yejun  wrote:
> > >
> > >
> > >
> > > > -Original Message-
> > > > From: Pedro Arthur [mailto:bygran...@gmail.com]
> > > > Sent: Friday, December 13, 2019 10:40 PM
> > > > To: Guo, Yejun 
> > > > Cc: ffmpeg-devel@ffmpeg.org
> > > > Subject: Re: [FFmpeg-devel] [PATCH 3/4] avfilter/vf_dnn_processing: add
> > > > format GRAY8 and GRAYF32 support
> > > >
> > > > On Fri, Dec 13, 2019 at 08:23, Guo, Yejun wrote:
> > > > >
> > > > > > From: Pedro Arthur [mailto:bygran...@gmail.com]
> > > > > > Sent: Friday, December 13, 2019 12:45 AM
> > > > > > To: FFmpeg development discussions and patches
> > > > 
> > > > > > Cc: Guo, Yejun 
> > > > > > Subject: Re: [FFmpeg-devel] [PATCH 3/4] avfilter/vf_dnn_processing: 
> > > > > > add
> > > > > > format GRAY8 and GRAYF32 support
> > > > > > Hi,
> > > > > >
> > > > > > how should I test this patch?
> > > > >
> > > > > the fourth patch of this patch set is the fate test for this feature, 
> > > > > so I
> > ignored
> > > > comments here.
> > > > > I'll add the test descriptions back in v2.
> > > > >
> > > > > >
> > > > > > On Fri, Nov 22, 2019 at 04:57, Guo, Yejun wrote:
> > > > > >
> > > > > > > Signed-off-by: Guo, Yejun 
> > > > > > > ---
> > > > > > >  doc/filters.texi|   8 ++-
> > > > > > >  libavfilter/vf_dnn_processing.c | 147
> > > > > > > ++--
> > > > > > >  2 files changed, 118 insertions(+), 37 deletions(-)
> > > > > > >
> > > > > > > diff --git a/doc/filters.texi b/doc/filters.texi
> > > > > > > index 1f86ae1..c3f7997 100644
> > > > > > > --- a/doc/filters.texi
> > > > > > > +++ b/doc/filters.texi
> > > > > > > @@ -8992,7 +8992,13 @@ Set the input name of the dnn network.
> > > > > > >  Set the output name of the dnn network.
> > > > > > >
> > > > > > >  @item fmt
> > > > > > > -Set the pixel format for the Frame. Allowed values are
> > > > > > > @code{AV_PIX_FMT_RGB24}, and @code{AV_PIX_FMT_BGR24}.
> > > > > > > +Set the pixel format for the Frame, the value is determined by 
> > > > > > > the
> > input
> > > > > > > of the dnn network model.
> > > > > > >
> > > > > > This sentence is a bit confusing; also, I think this property should
> > > > > > be removed (I will explain below).
> > > > >
> > > > > sure, no problem.
> > > > >
> > > > > >
> > > > > > +
> > > > > > > +If the model handles RGB (or BGR) image and the data type of model input is uint8, fmt must be @code{AV_PIX_FMT_RGB24} (or @code{AV_PIX_FMT_BGR24}).
> > > > > > > +If the model handles RGB (or BGR) image and the data type of model input is float, fmt must be @code{AV_PIX_FMT_RGB24} (or @code{AV_PIX_FMT_BGR24}), and this filter will do data type conversion internally.
> > > > > > > +If the model handles GRAY image and the data type of model input is uint8, fmt must be @code{AV_PIX_FMT_GRAY8}.
> > > > > > > +If the model handles GRAY image and the data type of model input is float, fmt must be @code{AV_PIX_FMT_GRAYF32}.

Re: [FFmpeg-devel] [PATCH 2/2] vf_dnn_processing: add support for more formats gray8 and grayf32

2020-01-07 Thread Pedro Arthur
On Fri, Dec 27, 2019 at 05:42, Guo, Yejun wrote:
>
> The following is a python script to halve the value of the gray
> image. It demos how to set up and execute a dnn model with python+tensorflow.
> It also generates the .pb file which will be used by ffmpeg.
>
> import tensorflow as tf
> import numpy as np
> from skimage import color
> from skimage import io
> in_img = io.imread('input.jpg')
> in_img = color.rgb2gray(in_img)
> io.imsave('ori_gray.jpg', np.squeeze(in_img))
> in_data = np.expand_dims(in_img, axis=0)
> in_data = np.expand_dims(in_data, axis=3)
> filter_data = np.array([0.5]).reshape(1,1,1,1).astype(np.float32)
> filter = tf.Variable(filter_data)
> x = tf.placeholder(tf.float32, shape=[1, None, None, 1], name='dnn_in')
> y = tf.nn.conv2d(x, filter, strides=[1, 1, 1, 1], padding='VALID', 
> name='dnn_out')
> sess=tf.Session()
> sess.run(tf.global_variables_initializer())
> graph_def = tf.graph_util.convert_variables_to_constants(sess, 
> sess.graph_def, ['dnn_out'])
> tf.train.write_graph(graph_def, '.', 'halve_gray_float.pb', as_text=False)
> print("halve_gray_float.pb generated, please use \
> path_to_ffmpeg/tools/python/convert.py to generate halve_gray_float.model\n")
> output = sess.run(y, feed_dict={x: in_data})
> output = output * 255.0
> output = output.astype(np.uint8)
> io.imsave("out.jpg", np.squeeze(output))
>
> To do the same thing with ffmpeg:
> - generate halve_gray_float.pb with the above script
> - generate halve_gray_float.model with tools/python/convert.py
> - try with following commands
>   ./ffmpeg -i input.jpg -vf 
> format=grayf32,dnn_processing=model=halve_gray_float.model:input=dnn_in:output=dnn_out:dnn_backend=native
>  out.native.png
>   ./ffmpeg -i input.jpg -vf 
> format=grayf32,dnn_processing=model=halve_gray_float.pb:input=dnn_in:output=dnn_out:dnn_backend=tensorflow
>  out.tf.png
>
> Signed-off-by: Guo, Yejun 
> ---
>  doc/filters.texi|   6 ++
>  libavfilter/vf_dnn_processing.c | 168 
> ++--
>  2 files changed, 132 insertions(+), 42 deletions(-)
>
> diff --git a/doc/filters.texi b/doc/filters.texi
> index f467378..57a129d 100644
> --- a/doc/filters.texi
> +++ b/doc/filters.texi
> @@ -9075,6 +9075,12 @@ Halve the red channel of the frame with format rgb24:
>  ffmpeg -i input.jpg -vf 
> format=rgb24,dnn_processing=model=halve_first_channel.model:input=dnn_in:output=dnn_out:dnn_backend=native
>  out.native.png
>  @end example
>
> +@item
> +Halve the pixel value of the frame with format grayf32:
> +@example
> +ffmpeg -i input.jpg -vf 
> format=grayf32,dnn_processing=model=halve_gray_float.model:input=dnn_in:output=dnn_out:dnn_backend=native
>  -y out.native.png
> +@end example
> +
>  @end itemize
>
>  @section drawbox
> diff --git a/libavfilter/vf_dnn_processing.c b/libavfilter/vf_dnn_processing.c
> index 4a6b900..13273f2 100644
> --- a/libavfilter/vf_dnn_processing.c
> +++ b/libavfilter/vf_dnn_processing.c
> @@ -104,12 +104,20 @@ static int query_formats(AVFilterContext *context)
>  {
>  static const enum AVPixelFormat pix_fmts[] = {
>  AV_PIX_FMT_RGB24, AV_PIX_FMT_BGR24,
> +AV_PIX_FMT_GRAY8, AV_PIX_FMT_GRAYF32,
>  AV_PIX_FMT_NONE
>  };
>  AVFilterFormats *fmts_list = ff_make_format_list(pix_fmts);
>  return ff_set_common_formats(context, fmts_list);
>  }
>
> +#define LOG_FORMAT_CHANNEL_MISMATCH()   \
> +av_log(ctx, AV_LOG_ERROR,   \
> +   "the frame's format %s does not match "  \
> +   "the model input channel %d\n",  \
> +   av_get_pix_fmt_name(fmt),\
> +   model_input->channels);
> +
>  static int check_modelinput_inlink(const DNNData *model_input, const 
> AVFilterLink *inlink)
>  {
>  AVFilterContext *ctx   = inlink->dst;
> @@ -131,17 +139,34 @@ static int check_modelinput_inlink(const DNNData 
> *model_input, const AVFilterLin
>  case AV_PIX_FMT_RGB24:
>  case AV_PIX_FMT_BGR24:
>  if (model_input->channels != 3) {
> -av_log(ctx, AV_LOG_ERROR, "the frame's input format %s does not 
> match "
> -   "the model input channels %d\n",
> -   av_get_pix_fmt_name(fmt),
> -   model_input->channels);
> +LOG_FORMAT_CHANNEL_MISMATCH();
>  return AVERROR(EIO);
>  }
>  if (model_input->dt != DNN_FLOAT && model_input->dt != DNN_UINT8) {
>  av_log(ctx, AV_LOG_ERROR, "only support dnn models with input 
> data type as float32 and uint8.\n");
>  return AVERROR(EIO);
>  }
> -break;
> +return 0;
> +case AV_PIX_FMT_GRAY8:
> +if (model_input->channels != 1) {
> +LOG_FORMAT_CHANNEL_MISMATCH();
> +return AVERROR(EIO);
> +}
> +if (model_input->dt != DNN_UINT8) {
> +av_log(c
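
For the GRAYF32 path this patch also needs a frame-to-model copy; since
both sides are float there is no scaling involved, just a row-wise memcpy
that honors linesize. A sketch (variable names borrowed from
copy_from_frame_to_dnn; not necessarily the patch's exact code):

    float *dnn_input_data = dnn_input->data;
    for (int i = 0; i < frame->height; i++)
        memcpy(dnn_input_data + i * frame->width,
               frame->data[0] + i * frame->linesize[0],
               frame->width * sizeof(float));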

Re: [FFmpeg-devel] [PATCH 1/2] vf_dnn_processing: remove parameter 'fmt'

2020-01-07 Thread Pedro Arthur
On Fri, Dec 27, 2019 at 05:42, Guo, Yejun wrote:
>
> do not request AVFrame's format in vf_dnn_processing with 'fmt',
> but add another filter for the format.
>
> command examples:
> ./ffmpeg -i input.jpg -vf 
> format=bgr24,dnn_processing=model=halve_first_channel.model:input=dnn_in:output=dnn_out:dnn_backend=native
>  -y out.native.png
> ./ffmpeg -i input.jpg -vf 
> format=rgb24,dnn_processing=model=halve_first_channel.model:input=dnn_in:output=dnn_out:dnn_backend=native
>  -y out.native.png
>
> Signed-off-by: Guo, Yejun 
> ---
>  doc/filters.texi| 17 +---
>  libavfilter/vf_dnn_processing.c | 95 
> +
>  2 files changed, 60 insertions(+), 52 deletions(-)
>
> diff --git a/doc/filters.texi b/doc/filters.texi
> index 8c5d3a5..f467378 100644
> --- a/doc/filters.texi
> +++ b/doc/filters.texi
> @@ -9030,8 +9030,8 @@ ffmpeg -i INPUT -f lavfi -i 
> nullsrc=hd720,geq='r=128+80*(sin(sqrt((X-W/2)*(X-W/2
>
>  @section dnn_processing
>
> -Do image processing with deep neural networks. Currently only AVFrame with 
> RGB24
> -and BGR24 are supported, more formats will be added later.
> +Do image processing with deep neural networks. It works together with 
> another filter
> +which converts the pixel format of the Frame to what the dnn network 
> requires.
>
>  The filter accepts the following options:
>
> @@ -9066,12 +9066,17 @@ Set the input name of the dnn network.
>  @item output
>  Set the output name of the dnn network.
>
> -@item fmt
> -Set the pixel format for the Frame. Allowed values are 
> @code{AV_PIX_FMT_RGB24}, and @code{AV_PIX_FMT_BGR24}.
> -Default value is @code{AV_PIX_FMT_RGB24}.
> -
>  @end table
>
> +@itemize
> +@item
> +Halve the red channel of the frame with format rgb24:
> +@example
> +ffmpeg -i input.jpg -vf 
> format=rgb24,dnn_processing=model=halve_first_channel.model:input=dnn_in:output=dnn_out:dnn_backend=native
>  out.native.png
> +@end example
> +
> +@end itemize
> +
>  @section drawbox
>
>  Draw a colored box on the input image.
> diff --git a/libavfilter/vf_dnn_processing.c b/libavfilter/vf_dnn_processing.c
> index ce976ec..4a6b900 100644
> --- a/libavfilter/vf_dnn_processing.c
> +++ b/libavfilter/vf_dnn_processing.c
> @@ -37,7 +37,6 @@ typedef struct DnnProcessingContext {
>
>  char *model_filename;
>  DNNBackendType backend_type;
> -enum AVPixelFormat fmt;
>  char *model_inputname;
>  char *model_outputname;
>
> @@ -60,7 +59,6 @@ static const AVOption dnn_processing_options[] = {
>  { "model",   "path to model file", OFFSET(model_filename),   
> AV_OPT_TYPE_STRING,{ .str = NULL }, 0, 0, FLAGS },
>  { "input",   "input name of the model",OFFSET(model_inputname),  
> AV_OPT_TYPE_STRING,{ .str = NULL }, 0, 0, FLAGS },
>  { "output",  "output name of the model",   OFFSET(model_outputname), 
> AV_OPT_TYPE_STRING,{ .str = NULL }, 0, 0, FLAGS },
> -{ "fmt", "AVPixelFormat of the frame", OFFSET(fmt),  
> AV_OPT_TYPE_PIXEL_FMT, { .i64=AV_PIX_FMT_RGB24 }, AV_PIX_FMT_NONE, 
> AV_PIX_FMT_NB - 1, FLAGS },
>  { NULL }
>  };
>
> @@ -69,23 +67,6 @@ AVFILTER_DEFINE_CLASS(dnn_processing);
>  static av_cold int init(AVFilterContext *context)
>  {
>  DnnProcessingContext *ctx = context->priv;
> -int supported = 0;
> -// as the first step, only rgb24 and bgr24 are supported
> -const enum AVPixelFormat supported_pixel_fmts[] = {
> -AV_PIX_FMT_RGB24,
> -AV_PIX_FMT_BGR24,
> -};
> -for (int i = 0; i < sizeof(supported_pixel_fmts) / sizeof(enum 
> AVPixelFormat); ++i) {
> -if (supported_pixel_fmts[i] == ctx->fmt) {
> -supported = 1;
> -break;
> -}
> -}
> -if (!supported) {
> -av_log(context, AV_LOG_ERROR, "pixel fmt %s not supported yet\n",
> -   av_get_pix_fmt_name(ctx->fmt));
> -return AVERROR(AVERROR_INVALIDDATA);
> -}
>
>  if (!ctx->model_filename) {
>  av_log(ctx, AV_LOG_ERROR, "model file for network is not 
> specified\n");
> @@ -121,14 +102,52 @@ static av_cold int init(AVFilterContext *context)
>
>  static int query_formats(AVFilterContext *context)
>  {
> -AVFilterFormats *formats;
> -DnnProcessingContext *ctx = context->priv;
> -enum AVPixelFormat pixel_fmts[2];
> -pixel_fmts[0] = ctx->fmt;
> -pixel_fmts[1] = AV_PIX_FMT_NONE;
> +static const enum AVPixelFormat pix_fmts[] = {
> +AV_PIX_FMT_RGB24, AV_PIX_FMT_BGR24,
> +AV_PIX_FMT_NONE
> +};
> +AVFilterFormats *fmts_list = ff_make_format_list(pix_fmts);
> +return ff_set_common_formats(context, fmts_list);
> +}
> +
> +static int check_modelinput_inlink(const DNNData *model_input, const 
> AVFilterLink *inlink)
> +{
> +AVFilterContext *ctx   = inlink->dst;
> +enum AVPixelFormat fmt = inlink->format;
> +
> +// the design is to add explicit scale filter before this filter
> + 

Re: [FFmpeg-devel] [PATCH 3/8] avfilter/vf_dnn_processing: remove access of AV_PIX_FMT_NB

2020-01-07 Thread Pedro Arthur
Hi,

On Mon, Dec 30, 2019 at 10:55,  wrote:
>
> From: Zhao Zhili 
>
> ---
>  libavfilter/vf_dnn_processing.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/libavfilter/vf_dnn_processing.c b/libavfilter/vf_dnn_processing.c
> index ce976ec3bd..afb7275a38 100644
> --- a/libavfilter/vf_dnn_processing.c
> +++ b/libavfilter/vf_dnn_processing.c
> @@ -60,7 +60,7 @@ static const AVOption dnn_processing_options[] = {
>  { "model",   "path to model file", OFFSET(model_filename),   
> AV_OPT_TYPE_STRING,{ .str = NULL }, 0, 0, FLAGS },
>  { "input",   "input name of the model",OFFSET(model_inputname),  
> AV_OPT_TYPE_STRING,{ .str = NULL }, 0, 0, FLAGS },
>  { "output",  "output name of the model",   OFFSET(model_outputname), 
> AV_OPT_TYPE_STRING,{ .str = NULL }, 0, 0, FLAGS },
> -{ "fmt", "AVPixelFormat of the frame", OFFSET(fmt),  
> AV_OPT_TYPE_PIXEL_FMT, { .i64=AV_PIX_FMT_RGB24 }, AV_PIX_FMT_NONE, 
> AV_PIX_FMT_NB - 1, FLAGS },
> +{ "fmt", "AVPixelFormat of the frame", OFFSET(fmt),  
> AV_OPT_TYPE_PIXEL_FMT, { .i64=AV_PIX_FMT_RGB24 }, AV_PIX_FMT_NONE, INT_MAX, 
> FLAGS },
>  { NULL }
>  };
>
> --
> 2.22.0
The fmt parameter was removed in [1].


[1] - https://git.ffmpeg.org/gitweb/ffmpeg.git/commit/04e6f8a143dc8bcec385e94a653b89c67cbaaca1

  1   2   3   >