Em sex., 13 de dez. de 2019 às 08:23, Guo, Yejun <yejun....@intel.com> escreveu: > > > From: Pedro Arthur [mailto:bygran...@gmail.com] > > Sent: Friday, December 13, 2019 12:45 AM > > To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org> > > Cc: Guo, Yejun <yejun....@intel.com> > > Subject: Re: [FFmpeg-devel] [PATCH 3/4] avfilter/vf_dnn_processing: add > > format GRAY8 and GRAYF32 support > > Hi, > > > > how should I test this patch? > > the fourth patch of this patch set is the fate test for this feature, so I > ignored comments here. > I'll add the test descriptions back in v2. > > > > > Em sex., 22 de nov. de 2019 às 04:57, Guo, Yejun <yejun.guo at intel.com> > > escreveu: > > > > > Signed-off-by: Guo, Yejun <yejun.guo at intel.com> > > > --- > > > doc/filters.texi | 8 ++- > > > libavfilter/vf_dnn_processing.c | 147 > > > ++++++++++++++++++++++++++++++---------- > > > 2 files changed, 118 insertions(+), 37 deletions(-) > > > > > > diff --git a/doc/filters.texi b/doc/filters.texi > > > index 1f86ae1..c3f7997 100644 > > > --- a/doc/filters.texi > > > +++ b/doc/filters.texi > > > @@ -8992,7 +8992,13 @@ Set the input name of the dnn network. > > > Set the output name of the dnn network. > > > > > > @item fmt > > > -Set the pixel format for the Frame. Allowed values are > > > @code{AV_PIX_FMT_RGB24}, and @code{AV_PIX_FMT_BGR24}. > > > +Set the pixel format for the Frame, the value is determined by the input > > > of the dnn network model. > > > > > This sentence is a bit confusing, also I think this property should be > > removed. (I will explain bellow). > > sure, no problem. > > > > > + > > > +If the model handles RGB (or BGR) image and the data type of model input > > > is uint8, fmt must be @code{AV_PIX_FMT_RGB24} (or > > @code{AV_PIX_FMT_BGR24}. > > > +If the model handles RGB (or BGR) image and the data type of model input > > > is float, fmt must be @code{AV_PIX_FMT_RGB24} (or > > @code{AV_PIX_FMT_BGR24}, > > > and this filter will do data type conversion internally. > > > +If the model handles GRAY image and the data type of model input is > > > uint8, fmt must be @code{AV_PIX_FMT_GRAY8}. > > > +If the model handles GRAY image and the data type of model input is > > > float, fmt must be @code{AV_PIX_FMT_GRAYF32}. > > > + > > > Default value is @code{AV_PIX_FMT_RGB24}. > > > > > > @end table > > > diff --git a/libavfilter/vf_dnn_processing.c > > > b/libavfilter/vf_dnn_processing.c > > > index ce976ec..963dd5e 100644 > > > --- a/libavfilter/vf_dnn_processing.c > > > +++ b/libavfilter/vf_dnn_processing.c > > > @@ -70,10 +70,12 @@ static av_cold int init(AVFilterContext *context) > > > { > > > DnnProcessingContext *ctx = context->priv; > > > int supported = 0; > > > - // as the first step, only rgb24 and bgr24 are supported > > > + // to support more formats > > > const enum AVPixelFormat supported_pixel_fmts[] = { > > > AV_PIX_FMT_RGB24, > > > AV_PIX_FMT_BGR24, > > > + AV_PIX_FMT_GRAY8, > > > + AV_PIX_FMT_GRAYF32, > > > }; > > > for (int i = 0; i < sizeof(supported_pixel_fmts) / sizeof(enum > > > AVPixelFormat); ++i) { > > > if (supported_pixel_fmts[i] == ctx->fmt) { > > > @@ -156,14 +158,38 @@ static int config_input(AVFilterLink *inlink) > > > return AVERROR(EIO); > > > } > > > > > I think the filter should not check formats manually in the init function > > (unless I'm missing something), it would be best if you query for all the > > above supported formats in query_formats and later in config_input you make > > sure the expected model format matches the frame format. > > I'm afraid it is too late if we find the format mismatch in function > config_input. > > For example, the dnn module is designed to accept BGR24 data, and the actual > format comes into config_input is RGB24 or YUV420P (we'll add yuv formats > later in > supported pixel fmts) or something else such as GRAY8. We have two choices: > > - return error, and the application ends. > This is not what we want. > - return no_error, and do format conversion at the beginning of function > filter_frame. > It makes this filter complex, and our implementation for the conversion > might not be the best optimized. > My idea is to keep this filter simple. And the users can choose what they > want to do conversion in the filters graph. > Indeed if the filter receives the format already converted is much better, but if you already have to specify the format the model expects as a parameter one could simply use the -pix_fmt to set the format instead of having one more filter parameter. Is there any downside if it uses "-pix_fmt" that "fmt" avoids?
> Regarding the below description in doc/filters.texi, the reason is that we do > not have format such as rgbf32 and bgrf32, we have to do conversion within > this filter. > "If the model handles RGB (or BGR) image and the data type of model input is > float, fmt must be @code{AV_PIX_FMT_RGB24} (or @code{AV_PIX > _FMT_BGR24}, and this filter will do data type conversion internally." > > > Consider the filter vf_dnn_analytic in my plan, I think the conversion is > necessary since mostly a scale down is needed. > > > > > > > > - if (model_input.channels != 3) { > > > - av_log(ctx, AV_LOG_ERROR, "the model requires input channels > > > %d\n", > > > - model_input.channels); > > > - return AVERROR(EIO); > > > - } > > > - if (model_input.dt != DNN_FLOAT && model_input.dt != DNN_UINT8) { > > > - av_log(ctx, AV_LOG_ERROR, "only support dnn models with input > > > data type as float32 and uint8.\n"); > > > - return AVERROR(EIO); > > > + if (ctx->fmt == AV_PIX_FMT_RGB24 || ctx->fmt == AV_PIX_FMT_BGR24) > > { > > > + if (model_input.channels != 3) { > > > + av_log(ctx, AV_LOG_ERROR, "channel number 3 is required, > > but > > > the actual channel number is %d\n", > > > + model_input.channels); > > > + return AVERROR(EIO); > > > + } > > > + if (model_input.dt != DNN_FLOAT && model_input.dt != > > DNN_UINT8) { > > > + av_log(ctx, AV_LOG_ERROR, "only support dnn models with > > input > > > data type as float32 and uint8.\n"); > > > + return AVERROR(EIO); > > > + } > > > + } else if (ctx->fmt == AV_PIX_FMT_GRAY8) { > > > + if (model_input.channels != 1) { > > > + av_log(ctx, AV_LOG_ERROR, "channel number 1 is required, > > but > > > the actual channel number is %d\n", > > > + model_input.channels); > > > + return AVERROR(EIO); > > > + } > > > + if (model_input.dt != DNN_UINT8) { > > > + av_log(ctx, AV_LOG_ERROR, "only support dnn models with > > input > > > data type as uint8.\n"); > > > + return AVERROR(EIO); > > > + } > > > + } else if (ctx->fmt == AV_PIX_FMT_GRAYF32) { > > > + if (model_input.channels != 1) { > > > + av_log(ctx, AV_LOG_ERROR, "channel number 1 is required, > > but > > > the actual channel number is %d\n", > > > + model_input.channels); > > > + return AVERROR(EIO); > > > + } > > > + if (model_input.dt != DNN_FLOAT) { > > > + av_log(ctx, AV_LOG_ERROR, "only support dnn models with > > input > > > data type as float.\n"); > > > + return AVERROR(EIO); > > > + } > > > + } else { > > > + av_assert0(!"should not reach here."); > > > } > > > > > General comment on the above and following chained ifs testing pixel > > formats, personally, using switch(fmt) seems more readable. > > good point, I'll refine the code. _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".