> On Nov 5, 2018, at 3:42 PM, Guo, Yejun <yejun....@intel.com> wrote:
>
> ask for comment or merge, thanks. Will push after 24 hours if there are no objections.
>
>> -----Original Message-----
>> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf
>> Of Guo, Yejun
>> Sent: Monday, October 29, 2018 11:19 AM
>> To: ffmpeg-devel@ffmpeg.org
>> Subject: Re: [FFmpeg-devel] [PATCH V4] Add a filter implementing HDR
>> image generation from a single exposure using deep CNNs
>>
>> any more comments? thanks.
>>
>>> -----Original Message-----
>>> From: Guo, Yejun
>>> Sent: Tuesday, October 23, 2018 6:46 AM
>>> To: ffmpeg-devel@ffmpeg.org
>>> Cc: Guo, Yejun <yejun....@intel.com>; Guo
>>> Subject: [PATCH V4] Add a filter implementing HDR image generation
>>> from a single exposure using deep CNNs
>>>
>>> see the algorithm's paper and code below.
>>>
>>> the filter's parameters look like:
>>> sdr2hdr=model_filename=/path_to_tensorflow_graph.pb:out_fmt=gbrp10le
>>>
>>> The input of the deep CNN model is RGB24, while the output is a float
>>> for each color channel, so the filter outputs the gbrpf32le format by
>>> default. gbrp10le is also supported as the output, so we can see the
>>> rendering result in a player, as a reference.
>>>
>>> To generate the model file, we need to modify the original scripts a
>>> little (a consolidated sketch follows at the end of this message):
>>> - set name='y' for y_final within the script at
>>>   https://github.com/gabrieleilertsen/hdrcnn/blob/master/network.py
>>> - add the following code to the script at
>>>   https://github.com/gabrieleilertsen/hdrcnn/blob/master/hdrcnn_predict.py
>>>
>>>   graph = tf.graph_util.convert_variables_to_constants(sess, sess.graph_def, ["y"])
>>>   tf.train.write_graph(graph, '.', 'graph.pb', as_text=False)
>>>
>>> The filter only works when the TensorFlow C API is available on the
>>> system; the native backend is not supported, since the deep CNN model
>>> contains some layer types besides CONV and DEPTH_TO_SPACE.
>>>
>>> https://arxiv.org/pdf/1710.07480.pdf:
>>>   author = "Eilertsen, Gabriel and Kronander, Joel and Denes, Gyorgy
>>>             and Mantiuk, Rafał and Unger, Jonas",
>>>   title = "HDR image reconstruction from a single exposure using deep CNNs",
>>>   journal = "ACM Transactions on Graphics (TOG)",
>>>   number = "6",
>>>   volume = "36",
>>>   articleno = "178",
>>>   year = "2017"
>>>
>>> https://github.com/gabrieleilertsen/hdrcnn
>>>
>>> btw, as a whole solution, metadata should also be generated from the
>>> SDR video, so that it can be encoded as an HDR video. That is not
>>> supported yet; this patch just focuses on this paper.
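>>>
>>> As a reference, here is a minimal consolidated sketch of the export
>>> step described above. It is only illustrative: it assumes `sess` is
>>> the tf.Session used by hdrcnn_predict.py after the trained weights
>>> have been restored, and that y_final was created with name='y' as
>>> described.
>>>
>>>   import tensorflow as tf
>>>
>>>   # freeze the restored variables into constants, keeping only the
>>>   # subgraph that produces the output tensor named "y"
>>>   graph = tf.graph_util.convert_variables_to_constants(
>>>       sess, sess.graph_def, ["y"])
>>>
>>>   # write the frozen graph as a binary .pb file; this is the file
>>>   # that is passed to the filter via the model_filename option
>>>   tf.train.write_graph(graph, '.', 'graph.pb', as_text=False)
>>>
>>> The result can then be checked in a player, e.g. (file names here are
>>> placeholders; the input must be 1920x1080, see config_props below):
>>>
>>>   ffplay -vf sdr2hdr=model_filename=./graph.pb:out_fmt=gbrp10le input.mp4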
>>>
>>> Signed-off-by: Guo, Yejun <yejun....@intel.com>
>>> ---
>>>  configure                |   1 +
>>>  doc/filters.texi         |  35 +++++++
>>>  libavfilter/Makefile     |   1 +
>>>  libavfilter/allfilters.c |   1 +
>>>  libavfilter/vf_sdr2hdr.c | 268 +++++++++++++++++++++++++++++++++++++++++++++++
>>>  5 files changed, 306 insertions(+)
>>>  create mode 100644 libavfilter/vf_sdr2hdr.c
>>>
>>> diff --git a/configure b/configure
>>> index 85d5dd5..5e2efba 100755
>>> --- a/configure
>>> +++ b/configure
>>> @@ -3438,6 +3438,7 @@ scale2ref_filter_deps="swscale"
>>>  scale_filter_deps="swscale"
>>>  scale_qsv_filter_deps="libmfx"
>>> +sdr2hdr_filter_deps="libtensorflow"
>>>  select_filter_select="pixelutils"
>>>  sharpness_vaapi_filter_deps="vaapi"
>>>  showcqt_filter_deps="avcodec avformat swscale"
>>>  showcqt_filter_suggest="libfontconfig libfreetype"
>>> diff --git a/doc/filters.texi b/doc/filters.texi
>>> index 17e2549..bba9f87 100644
>>> --- a/doc/filters.texi
>>> +++ b/doc/filters.texi
>>> @@ -14672,6 +14672,41 @@ Scale a subtitle stream (b) to match the main video (a) in size before overlayin
>>>  @end example
>>>  @end itemize
>>>
>>> +@section sdr2hdr
>>> +
>>> +HDR image generation from a single exposure using deep CNNs, based on the TensorFlow C library.
>>> +
>>> +@itemize
>>> +@item
>>> +paper: see @url{https://arxiv.org/pdf/1710.07480.pdf}
>>> +
>>> +@item
>>> +code with model and trained parameters: see
>>> +@url{https://github.com/gabrieleilertsen/hdrcnn}
>>> +@end itemize
>>> +
>>> +The filter accepts the following options:
>>> +
>>> +@table @option
>>> +
>>> +@item model_filename
>>> +Set the path to the model file specifying the network architecture and its parameters.
>>> +
>>> +@item out_fmt
>>> +The data format of the filter's output.
>>> +
>>> +It accepts the following values:
>>> +@table @samp
>>> +@item gbrpf32le
>>> +force gbrpf32le output
>>> +
>>> +@item gbrp10le
>>> +force gbrp10le output
>>> +@end table
>>> +
>>> +Default value is @samp{gbrpf32le}.
>>> +
>>> +@end table
>>> +
>>>  @anchor{selectivecolor}
>>>  @section selectivecolor
>>>
>>> diff --git a/libavfilter/Makefile b/libavfilter/Makefile
>>> index 62cc2f5..88e7da6 100644
>>> --- a/libavfilter/Makefile
>>> +++ b/libavfilter/Makefile
>>> @@ -360,6 +360,7 @@ OBJS-$(CONFIG_SOBEL_OPENCL_FILTER)      += vf_convolution_opencl.o opencl.o
>>> +OBJS-$(CONFIG_SDR2HDR_FILTER)           += vf_sdr2hdr.o
>>>  OBJS-$(CONFIG_SPLIT_FILTER)             += split.o
>>>  OBJS-$(CONFIG_SPP_FILTER)               += vf_spp.o
>>>  OBJS-$(CONFIG_SR_FILTER)                += vf_sr.o
>>>  OBJS-$(CONFIG_SSIM_FILTER)              += vf_ssim.o framesync.o
>>>  OBJS-$(CONFIG_STEREO3D_FILTER)          += vf_stereo3d.o
>>>  OBJS-$(CONFIG_STREAMSELECT_FILTER)      += f_streamselect.o framesync.o
>>> diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
>>> index 5e72803..1645c0f 100644
>>> --- a/libavfilter/allfilters.c
>>> +++ b/libavfilter/allfilters.c
>>> @@ -319,6 +319,7 @@ extern AVFilter ff_vf_scale_npp;
>>>  extern AVFilter ff_vf_scale_qsv;
>>>  extern AVFilter ff_vf_scale_vaapi;
>>>  extern AVFilter ff_vf_scale2ref;
>>> +extern AVFilter ff_vf_sdr2hdr;
>>>  extern AVFilter ff_vf_select;
>>>  extern AVFilter ff_vf_selectivecolor;
>>>  extern AVFilter ff_vf_sendcmd;
>>> diff --git a/libavfilter/vf_sdr2hdr.c b/libavfilter/vf_sdr2hdr.c
>>> new file mode 100644
>>> index 0000000..109b907
>>> --- /dev/null
>>> +++ b/libavfilter/vf_sdr2hdr.c
>>> @@ -0,0 +1,268 @@
>>> +/*
>>> + * Copyright (c) 2018 Guo Yejun
>>> + *
>>> + * This file is part of FFmpeg.
>>> + *
>>> + * FFmpeg is free software; you can redistribute it and/or
>>> + * modify it under the terms of the GNU Lesser General Public
>>> + * License as published by the Free Software Foundation; either
>>> + * version 2.1 of the License, or (at your option) any later version.
>>> + *
>>> + * FFmpeg is distributed in the hope that it will be useful,
>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>> + * Lesser General Public License for more details.
>>> + *
>>> + * You should have received a copy of the GNU Lesser General Public
>>> + * License along with FFmpeg; if not, write to the Free Software
>>> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
>>> + */
>>> +
>>> +/**
>>> + * @file
>>> + * Filter implementing HDR image generation from a single exposure using deep CNNs.
>>> + * https://arxiv.org/pdf/1710.07480.pdf
>>> + */
>>> +
>>> +#include "avfilter.h"
>>> +#include "formats.h"
>>> +#include "internal.h"
>>> +#include "libavutil/opt.h"
>>> +#include "libavutil/qsort.h"
>>> +#include "libavformat/avio.h"
>>> +#include "libswscale/swscale.h"
>>> +#include "dnn_interface.h"
>>> +#include <math.h>
>>> +
>>> +typedef struct SDR2HDRContext {
>>> +    const AVClass *class;
>>> +
>>> +    char* model_filename;
>>> +    enum AVPixelFormat out_fmt;
>>> +    DNNModule* dnn_module;
>>> +    DNNModel* model;
>>> +    DNNData input, output;
>>> +} SDR2HDRContext;
>>> +
>>> +#define OFFSET(x) offsetof(SDR2HDRContext, x)
>>> +#define FLAGS AV_OPT_FLAG_FILTERING_PARAM | AV_OPT_FLAG_VIDEO_PARAM
>>> +static const AVOption sdr2hdr_options[] = {
>>> +    { "model_filename", "path to model file specifying network architecture and its parameters", OFFSET(model_filename), AV_OPT_TYPE_STRING, {.str=NULL}, 0, 0, FLAGS },
>>> +    { "out_fmt", "the data format of the filter's output, it could be gbrpf32le [default] or gbrp10le", OFFSET(out_fmt), AV_OPT_TYPE_PIXEL_FMT, {.i64=AV_PIX_FMT_GBRPF32LE}, AV_PIX_FMT_NONE, AV_PIX_FMT_NB, FLAGS },
>>> +    { NULL }
>>> +};
>>> +
>>> +AVFILTER_DEFINE_CLASS(sdr2hdr);
>>> +
>>> +static av_cold int init(AVFilterContext* context)
>>> +{
>>> +    SDR2HDRContext* ctx = context->priv;
>>> +
>>> +    if (ctx->out_fmt != AV_PIX_FMT_GBRPF32LE && ctx->out_fmt != AV_PIX_FMT_GBRP10LE) {
>>> +        av_log(context, AV_LOG_ERROR, "unsupported output format\n");
>>> +        return AVERROR(ENOSYS);
>>> +    }
>>> +
>>> +    ctx->dnn_module = ff_get_dnn_module(DNN_TF);
>>> +    if (!ctx->dnn_module) {
>>> +        av_log(context, AV_LOG_ERROR, "could not create DNN module for tensorflow backend\n");
>>> +        return AVERROR(ENOMEM);
>>> +    }
>>> +    if (!ctx->model_filename) {
>>> +        av_log(context, AV_LOG_ERROR, "model file for network was not specified\n");
>>> +        return AVERROR(EIO);
>>> +    }
>>> +    if (!ctx->dnn_module->load_model) {
>>> +        av_log(context, AV_LOG_ERROR, "load_model for network was not specified\n");
>>> +        return AVERROR(EIO);
>>> +    }
>>> +    ctx->model = (ctx->dnn_module->load_model)(ctx->model_filename);
>>> +    if (!ctx->model) {
>>> +        av_log(context, AV_LOG_ERROR, "could not load DNN model\n");
>>> +        return AVERROR(EIO);
>>> +    }
>>> +    return 0;
>>> +}
>>> +
>>> +static int query_formats(AVFilterContext* context)
>>> +{
>>> +    const enum AVPixelFormat in_formats[] = {AV_PIX_FMT_RGB24, AV_PIX_FMT_NONE};
>>> +    enum AVPixelFormat out_formats[2];
>>> +    SDR2HDRContext* ctx = context->priv;
>>> +    AVFilterFormats* formats_list;
>>> +    int ret = 0;
>>> +
>>> +    formats_list = ff_make_format_list(in_formats);
>>> +    if ((ret = ff_formats_ref(formats_list, &context->inputs[0]->out_formats)) < 0)
>>> +        return ret;
>>> +
>>> +    out_formats[0] = ctx->out_fmt;
>>> +    out_formats[1] = AV_PIX_FMT_NONE;
>>> +    formats_list = ff_make_format_list(out_formats);
>>> +    if ((ret = ff_formats_ref(formats_list, &context->outputs[0]->in_formats)) < 0)
>>> +        return ret;
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +static int config_props(AVFilterLink* inlink)
>>> +{
>>> +    AVFilterContext* context = inlink->dst;
>>> +    SDR2HDRContext* ctx = context->priv;
>>> +    AVFilterLink* outlink = context->outputs[0];
>>> +    DNNReturnType result;
>>> +
>>> +    // the dnn model is tied to the resolution, due to the deconv layer of tensorflow;
>>> +    // for now just support 1920*1080, hence the magic numbers within this file
>>> +    if (inlink->w != 1920 || inlink->h != 1080) {
>>> +        av_log(context, AV_LOG_ERROR, "only frame size 1920*1080 is supported\n");
>>> +        return AVERROR(ENOSYS);
>>> +    }
>>> +
>>> +    ctx->input.width = 1920;
>>> +    ctx->input.height = 1088; // the model requires the height to be a multiple of 32
>>> +    ctx->input.channels = 3;
>>> +
>>> +    result = (ctx->model->set_input_output)(ctx->model->model, &ctx->input, &ctx->output);
>>> +    if (result != DNN_SUCCESS) {
>>> +        av_log(context, AV_LOG_ERROR, "could not set input and output for the model\n");
>>> +        return AVERROR(EIO);
>>> +    }
>>> +
>>> +    memset(ctx->input.data, 0, ctx->input.channels * ctx->input.width * ctx->input.height * sizeof(float));
>>> +    outlink->h = 1080;
>>> +    outlink->w = 1920;
>>> +    return 0;
>>> +}
>>> +
>>> +static float qsort_comparison_function_float(const void *a, const void *b)
>>> +{
>>> +    return *(const float *)a - *(const float *)b;
>>> +}
>>> +
>>> +static int filter_frame(AVFilterLink* inlink, AVFrame* in)
>>> +{
>>> +    DNNReturnType dnn_result = DNN_SUCCESS;
>>> +    AVFilterContext* context = inlink->dst;
>>> +    SDR2HDRContext* ctx = context->priv;
>>> +    AVFilterLink* outlink = context->outputs[0];
>>> +    AVFrame* out = ff_get_video_buffer(outlink, outlink->w, outlink->h);
>>> +    int total_pixels = in->height * in->width;
>>> +
>>> +    if (!out) {
>>> +        av_log(context, AV_LOG_ERROR, "could not allocate memory for output frame\n");
>>> +        av_frame_free(&in);
>>> +        return AVERROR(ENOMEM);
>>> +    }
>>> +
>>> +    av_frame_copy_props(out, in);
>>> +
>>> +    // normalize the packed RGB24 input to [0, 1] floats, copying row by
>>> +    // row so that any linesize padding does not shift the pixels
>>> +    for (int y = 0; y < in->height; ++y) {
>>> +        for (int x = 0; x < in->width * 3; ++x) {
>>> +            ctx->input.data[y * in->width * 3 + x] = in->data[0][y * in->linesize[0] + x] / 255.0f;
>>> +        }
>>> +    }
>>> +
>>> +    dnn_result = (ctx->dnn_module->execute_model)(ctx->model);
>>> +    if (dnn_result != DNN_SUCCESS) {
>>> +        av_log(context, AV_LOG_ERROR, "failed to execute loaded model\n");
>>> +        av_frame_free(&in);
>>> +        av_frame_free(&out);
>>> +        return AVERROR(EIO);
>>> +    }
>>> +
>>> +    if (ctx->out_fmt == AV_PIX_FMT_GBRPF32LE) {
>>> +        float* outg = (float*)out->data[0];
>>> +        float* outb = (float*)out->data[1];
>>> +        float* outr = (float*)out->data[2];
>>> +        for (int i = 0; i < total_pixels; ++i) {
>>> +            float r = ctx->output.data[i*3];
>>> +            float g = ctx->output.data[i*3+1];
>>> +            float b = ctx->output.data[i*3+2];
>>> +            outr[i] = r;
>>> +            outg[i] = g;
>>> +            outb[i] = b;
>>> +        }
>>> +    } else {
>>> +        // here, we just use a rough mapping to the 10bit contents
>>> +        // metadata generation for HDR video encoding is not supported yet
>>> +        float* converted_data = (float*)av_malloc(total_pixels * 3 * sizeof(float));
>>> +        int16_t* outg = (int16_t*)out->data[0];
>>> +        int16_t* outb = (int16_t*)out->data[1];
>>> +        int16_t* outr = (int16_t*)out->data[2];
>>> +
>>> +        if (!converted_data) {
>>> +            av_frame_free(&in);
>>> +            av_frame_free(&out);
>>> +            return AVERROR(ENOMEM);
>>> +        }
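>>> +
>>> +        // The mapping below: sqrt() serves as a rough gamma-2.0 transfer
>>> +        // on the model's linear-light output; if any value still exceeds
>>> +        // 1.0, the 99.5th percentile becomes the white point (the brightest
>>> +        // 0.5% of samples are clipped to it), and the result is scaled
>>> +        // into the 10-bit range [0, 1023].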
>>> +
>>> +        float max = 1.0f;
>>> +        for (int i = 0; i < total_pixels * 3; ++i) {
>>> +            float d = ctx->output.data[i];
>>> +            d = sqrt(d);
>>> +            converted_data[i] = d;
>>> +            max = FFMAX(d, max);
>>> +        }
>>> +
>>> +        if (max > 1.0f) {
>>> +            AV_QSORT(converted_data, total_pixels * 3, float, qsort_comparison_function_float);
>>> +            // 0.5% of the pixels are clipped
>>> +            max = converted_data[(int)(total_pixels * 3 * 0.995)];
>>> +            max = FFMAX(max, 1.0f);
>>> +
>>> +            for (int i = 0; i < total_pixels * 3; ++i) {
>>> +                float d = ctx->output.data[i];
>>> +                d = sqrt(d);
>>> +                d = FFMIN(d, max);
>>> +                converted_data[i] = d;
>>> +            }
>>> +        }
>>> +
>>> +        for (int i = 0; i < total_pixels; ++i) {
>>> +            float r = converted_data[i*3];
>>> +            float g = converted_data[i*3+1];
>>> +            float b = converted_data[i*3+2];
>>> +            outr[i] = r / max * 1023;
>>> +            outg[i] = g / max * 1023;
>>> +            outb[i] = b / max * 1023;
>>> +        }
>>> +
>>> +        av_free(converted_data);
>>> +    }
>>> +
>>> +    av_frame_free(&in);
>>> +    return ff_filter_frame(outlink, out);
>>> +}
>>> +
>>> +static av_cold void uninit(AVFilterContext* context)
>>> +{
>>> +    SDR2HDRContext* ctx = context->priv;
>>> +
>>> +    if (ctx->dnn_module) {
>>> +        (ctx->dnn_module->free_model)(&ctx->model);
>>> +        av_freep(&ctx->dnn_module);
>>> +    }
>>> +}
>>> +
>>> +static const AVFilterPad sdr2hdr_inputs[] = {
>>> +    {
>>> +        .name         = "default",
>>> +        .type         = AVMEDIA_TYPE_VIDEO,
>>> +        .config_props = config_props,
>>> +        .filter_frame = filter_frame,
>>> +    },
>>> +    { NULL }
>>> +};
>>> +
>>> +static const AVFilterPad sdr2hdr_outputs[] = {
>>> +    {
>>> +        .name = "default",
>>> +        .type = AVMEDIA_TYPE_VIDEO,
>>> +    },
>>> +    { NULL }
>>> +};
>>> +
>>> +AVFilter ff_vf_sdr2hdr = {
>>> +    .name          = "sdr2hdr",
>>> +    .description   = NULL_IF_CONFIG_SMALL("HDR image generation from a single exposure using deep CNNs."),
>>> +    .priv_size     = sizeof(SDR2HDRContext),
>>> +    .init          = init,
>>> +    .uninit        = uninit,
>>> +    .query_formats = query_formats,
>>> +    .inputs        = sdr2hdr_inputs,
>>> +    .outputs       = sdr2hdr_outputs,
>>> +    .priv_class    = &sdr2hdr_class,
>>> +    .flags         = AVFILTER_FLAG_SUPPORT_TIMELINE_GENERIC,
>>> +};
>>> --
>>> 2.7.4
>>
>> _______________________________________________
>> ffmpeg-devel mailing list
>> ffmpeg-devel@ffmpeg.org
>> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel