On 12/30/18, Nicolas George <geo...@nsup.org> wrote: > Uwe Freese (2018-12-29): >> The downside is that the term "power" (or "strength") would be more >> generic >> and would probably be better reusable for other modes. >> >> What do you think? Is "exponent" now ok, or should I change it back to >> "power", "strength" or something alike? > > I would say: do not bother. This can be decided when a new mode is > added, and options can have several names. > > By the way, you would not be interested in developing a logo-detection > filter, per chance? Something that detects is a certain rectangle is > likely to contain a logo and computes its bounding box. > >> Again, I would be glad about comments and reviews. - Let me know what >> should >> be changed and I'll create a new patch in some days. > > Here is a more in-depth review from me: > >> >From 83e79abb3311eb2dd92c63e8e15e0476af2f2891 Mon Sep 17 00:00:00 2001 >> From: breaker27 <m...@uwe-freese.de> >> Date: Wed, 26 Dec 2018 18:16:48 +0100 >> Subject: [PATCH] avfilter/vf_delogo: add uglarm interpolation mode >> >> --- >> doc/filters.texi | 28 +++++++ >> libavfilter/vf_delogo.c | 189 >> +++++++++++++++++++++++++++++++++++++++++++++--- >> 2 files changed, 208 insertions(+), 9 deletions(-) >> >> diff --git a/doc/filters.texi b/doc/filters.texi >> index 65ce25bc18..792560ad79 100644 >> --- a/doc/filters.texi >> +++ b/doc/filters.texi >> @@ -7952,6 +7952,34 @@ Specify the thickness of the fuzzy edge of the >> rectangle (added to >> deprecated, setting higher values should no longer be necessary and >> is not recommended. >> >> +@item mode >> +Specify the interpolation mode used to remove the logo. >> +It accepts the following values: >> +@table @option >> +@item xy >> +Only the border pixels in straight x and y direction are considered >> +to replace any logo pixel. This mode tends to flicker and to show >> +horizontal and vertical lines. >> +@item uglarm >> +Consider the whole border for every logo pixel to replace. This mode >> +uses the distance raised to the power of the given @var{exponent} as >> +weight that decides to which amount every border pixel is taken into >> +account. This results in a more blurred area, which tends to be less >> +distracting. The uglarm mode was first implemented in VirtualDub's >> +LogoAway filter and is also known from ffdshow and is the >> +abbreviation for "Uwe's Great LogoAway Remove Mode". >> +@end table >> +The default value is @code{xy}. >> + >> +@item exponent >> +Specify the exponent used to calculate the weight out of the >> +distance in @code{uglarm} mode (weight = distance ^ @var{exponent}). >> +The value @code{0} results in a logo area which has >> +the same average color everywhere. The higher the value, the more >> +relevant a near border pixel will get, meaning that the borders of >> +the logo area are more similar to the surrounding pixels. The default >> +value is @code{3}. >> + >> @item show >> When set to 1, a green rectangle is drawn on the screen to simplify >> finding the right @var{x}, @var{y}, @var{w}, and @var{h} parameters. >> diff --git a/libavfilter/vf_delogo.c b/libavfilter/vf_delogo.c >> index 065d093641..dcb0366e63 100644 >> --- a/libavfilter/vf_delogo.c >> +++ b/libavfilter/vf_delogo.c >> @@ -2,6 +2,7 @@ >> * Copyright (c) 2002 Jindrich Makovicka <makov...@gmail.com> >> * Copyright (c) 2011 Stefano Sabatini >> * Copyright (c) 2013, 2015 Jean Delvare <jdelv...@suse.com> >> + * Copyright (c) 2018 Uwe Freese <m...@uwe-freese.de> >> * >> * This file is part of FFmpeg. >> * >> @@ -25,12 +26,16 @@ >> * A very simple tv station logo remover >> * Originally imported from MPlayer libmpcodecs/vf_delogo.c, >> * the algorithm was later improved. >> + * The "UGLARM" mode was first implemented 2001 by Uwe Freese for >> Virtual >> + * Dub's LogoAway filter (by Krzysztof Wojdon), taken over into >> ffdshow's >> + * logoaway filter (by Milan Cutka), from where it was ported to ffmpeg. >> */ >> >> #include "libavutil/common.h" >> #include "libavutil/imgutils.h" >> #include "libavutil/opt.h" >> #include "libavutil/pixdesc.h" > >> +#include "libavutil/avassert.h" > > We try to keep the includes in alphabetical order. > >> #include "avfilter.h" >> #include "formats.h" >> #include "internal.h" >> @@ -50,6 +55,10 @@ >> * @param logo_w width of the logo >> * @param logo_h height of the logo >> * @param band the size of the band around the processed area >> + * @param *uglarmtable pointer to weight table in UGLARM >> interpolation mode, >> + * zero when x-y mode is used >> + * @param *uglarmweightsum pointer to weight sum table in UGLARM >> interpolation mode, >> + * zero when x-y mode is used >> * @param show show a rectangle around the processed area, useful for >> * parameters tweaking >> * @param direct if non-zero perform in-place processing >> @@ -58,7 +67,8 @@ static void apply_delogo(uint8_t *dst, int >> dst_linesize, >> uint8_t *src, int src_linesize, >> int w, int h, AVRational sar, >> int logo_x, int logo_y, int logo_w, int logo_h, >> - unsigned int band, int show, int direct) >> + unsigned int band, double *uglarmtable, >> + double *uglarmweightsum, int show, int direct) >> { >> int x, y; >> uint64_t interp, weightl, weightr, weightt, weightb, weight; >> @@ -89,6 +99,7 @@ static void apply_delogo(uint8_t *dst, int >> dst_linesize, >> dst += (logo_y1 + 1) * dst_linesize; >> src += (logo_y1 + 1) * src_linesize; >> >> + if (!uglarmtable) { >> for (y = logo_y1+1; y < logo_y2; y++) { >> left_sample = topleft[src_linesize*(y-logo_y1)] + >> topleft[src_linesize*(y-logo_y1-1)] + >> @@ -151,12 +162,125 @@ static void apply_delogo(uint8_t *dst, int >> dst_linesize, >> dst += dst_linesize; >> src += src_linesize; >> } >> + } else { >> + int bx, by; >> + double interpd; >> + > >> + for (y = logo_y1 + 1; y < logo_y2; y++) { >> + for (x = logo_x1 + 1, >> + xdst = dst + logo_x1 + 1, >> + xsrc = src + logo_x1 + 1; x < logo_x2; x++, xdst++, >> xsrc++) { > > I think a line break after the semicolons would make this easier to > understand. > > Also, since x and y always used subtracted to logo_x1 or logo_y1, I > think changing the bounds of the loop would make the rest of the code > more readable: > > for (x = 1; x < logo_w_minus_one; x++) > >> + >> + if (show && (y == logo_y1 + 1 || y == logo_y2 - 1 || >> + x == logo_x1 + 1 || x == logo_x2 - 1)) { > > Nit: the alignment is off. > >> + *xdst = 0; >> + continue; >> + } >> + >> + interpd = 0; >> + > >> + for (bx = 0; bx < logo_w; bx++) { >> + interpd += topleft[bx] * >> + uglarmtable[abs(bx - (x - logo_x1)) + (y - >> logo_y1) * (logo_w - 1)]; >> + interpd += botleft[bx] * >> + uglarmtable[abs(bx - (x - logo_x1)) + (logo_h - >> (y - logo_y1) - 1) * (logo_w - 1)]; >> + } > > This can be optimized, and since it is the inner loop of the filter, I > think it is worth it. You can declare a pointer that will stay the same > for the whole y loop: > > double *xweight = uglarmtable + (y - logo_y1) * (logo_w - 1); > > and then use it in that loop: > > interpd += ... * xweight[abs(bx - (x - logo_x1))]; > > It avoids a lot of multiplications. > >> + >> + for (by = 1; by < logo_h - 1; by++) { >> + interpd += topleft[by * src_linesize] * >> + uglarmtable[(x - logo_x1) + abs(by - (y - >> logo_y1)) * (logo_w - 1)]; >> + interpd += topleft[by * src_linesize + (logo_w - 1)] >> * >> + uglarmtable[logo_w - (x - logo_x1) - 1 + abs(by - >> (y - logo_y1)) * (logo_w - 1)]; >> + } > > This one is more tricky to optimize, because of the abs() within the > multiplication, but it can be done by splitting the loop in two: > > for (by = 1; by < y; by++) { > interpd += ... * yweight[x - logo_x1]; > yweight -= logo_w_minus_one; > } > for (; by < logo_h_minus_one; by++) { > interpd += ... * yweight[x - logo_x1]; > yweight += logo_w_minus_one; > } > av_assert2(yweight == the_correct_value); > > In fact, I think even the x loop would be a little more readable if it > was split in two like that. > >> + >> + interp = (uint64_t)(interpd / >> + uglarmweightsum[(x - logo_x1) - 1 + (y - logo_y1 - 1) >> * (logo_w - 2)]); > > The cast to uint64_t seems suspicious. Can you explain? > >> + *xdst = interp; >> + } >> + >> + dst += dst_linesize; >> + src += src_linesize; >> + } >> + } >> +} >> + >> +/** >> + * Calculate the lookup tables to be used in UGLARM interpolation mode. >> + * >> + * @param *uglarmtable Pointer to table containing weights for each >> possible >> + * diagonal distance between a border pixel and >> an inner >> + * logo pixel. >> + * @param *uglarmweightsum Pointer to a table containing the weight sum >> to divide >> + * by for each pixel within the logo area. >> + * @param sar The sar to take into account when calculating >> lookup >> + * tables. >> + * @param logo_w width of the logo >> + * @param logo_h height of the logo >> + * @param exponent exponent used in uglarm interpolation >> + */ >> +static void calc_uglarm_tables(double *uglarmtable, double >> *uglarmweightsum, >> + AVRational sar, int logo_w, int logo_h, >> + float exponent) >> +{ > >> + double aspect = (double)sar.num / sar.den; > > Tiny optimization: > > double aspect2 = aspect * aspect; > > for use later. > >> + int x, y; > > >> + >> + /* uglarmtable will contain a weight for each possible diagonal >> distance >> + * between a border pixel and an inner logo pixel. The maximum >> distance in >> + * each direction between border and an inner pixel can be logo_w - >> 1. The >> + * weight of a border pixel which is x,y pixels away is stored at >> position >> + * x + y * (logo_w - 1). */ >> + for (y = 0; y < logo_h - 1; y++) >> + for (x = 0; x < logo_w - 1; x++) { >> + if (x + y != 0) { > >> + double d = pow(sqrt(x * x * aspect * aspect + y * y), >> exponent); > > pow(sqrt(x * x * aspect2 + y * y), exponent / 2); > >> + uglarmtable[x + y * (logo_w - 1)] = 1.0 / d; >> + } else { >> + uglarmtable[x + y * (logo_w - 1)] = 1.0; >> + } >> + } >> + >> + /* uglarmweightsum will contain the sum of all weights which is used >> when >> + * an inner pixel of the logo at position x,y is calculated out of >> the >> + * border pixels. The aggregated value has to be divided by that. The >> value >> + * to use for the inner 1-based logo position x,y is stored at >> + * (x - 1) + (y - 1) * (logo_w - 2). */ >> + for (y = 1; y < logo_h - 1; y++) >> + for (x = 1; x < logo_w - 1; x++) { >> + double weightsum = 0; >> + > >> + for (int bx = 0; bx < logo_w; bx++) { > > Our coding style forbids declaring variables on the fly like that. Same > below.
This is simply not more true. Look at developer guidelines for this project. > >> + /* top border */ >> + weightsum += uglarmtable[abs(bx - x) + y * (logo_w - >> 1)]; >> + /* bottom border */ >> + weightsum += uglarmtable[abs(bx - x) + (logo_h - y - 1) * >> (logo_w - 1)]; >> + } >> + >> + for (int by = 1; by < logo_h - 1; by++) { >> + /* left border */ >> + weightsum += uglarmtable[x + abs(by - y) * (logo_w - >> 1)]; >> + /* right border */ >> + weightsum += uglarmtable[(logo_w - x - 1) + abs(by - y) * >> (logo_w - 1)]; >> + } >> + >> + uglarmweightsum[(x - 1) + (y - 1) * (logo_w - 2)] = >> weightsum; >> + } >> } >> >> +enum mode { >> + MODE_XY, >> + MODE_UGLARM, >> + MODE_NB >> +}; >> + > >> +#define MAX_PLANES 10 > > You could use FF_ARRAY_ELEMS(AVPixFmtDescriptor.comp), and the > consistency would be guaranteed. Well, that syntax is not valid, but it > can be expressed: > > #define MAX_PLANES FF_ARRAY_ELEMS(((AVPixFmtDescriptor *)0)->comp) > > But I have a better suggestion: > > #define MAX_SUB 2 > > double uglarmtable[MAX_SUB + 1][MAX_SUB + 1] > > and use uglarmtable[hsub][vsub]. That way, for YUVA420P for example, the > table will be computed only once for Y and A and once for U and V. > > The assert is still needed with that solution, though. > >> + >> typedef struct DelogoContext { >> const AVClass *class; >> - int x, y, w, h, band, show; >> -} DelogoContext; >> + int x, y, w, h, band, mode, show; >> + float exponent; >> + double *uglarmtable[MAX_PLANES], *uglarmweightsum[MAX_PLANES]; >> +} DelogoContext; >> >> #define OFFSET(x) offsetof(DelogoContext, x) >> #define FLAGS AV_OPT_FLAG_FILTERING_PARAM|AV_OPT_FLAG_VIDEO_PARAM >> @@ -171,6 +295,10 @@ static const AVOption delogo_options[]= { >> { "band", "set delogo area band size", OFFSET(band), AV_OPT_TYPE_INT, >> { .i64 = 0 }, 0, INT_MAX, FLAGS }, >> { "t", "set delogo area band size", OFFSET(band), AV_OPT_TYPE_INT, >> { .i64 = 0 }, 0, INT_MAX, FLAGS }, >> #endif >> + { "mode", "set the interpolation mode",OFFSET(mode), AV_OPT_TYPE_INT, >> { .i64 = MODE_XY }, MODE_XY, MODE_NB-1, FLAGS, "mode" }, >> + { "xy", "use pixels in straight x and y direction", >> OFFSET(mode), AV_OPT_TYPE_CONST, { .i64 = MODE_XY }, 0, 0, FLAGS, >> "mode" }, >> + { "uglarm", "UGLARM mode, use full border", >> OFFSET(mode), AV_OPT_TYPE_CONST, { .i64 = MODE_UGLARM }, 0, 0, FLAGS, >> "mode" }, >> + { "exponent","exponent used for UGLARM interpolation", >> OFFSET(exponent), AV_OPT_TYPE_FLOAT, { .dbl = 3.0 }, 0, 6, FLAGS }, >> { "show", "show delogo area", OFFSET(show), >> AV_OPT_TYPE_BOOL,{ .i64 = 0 }, 0, 1, FLAGS }, >> { NULL } >> }; >> @@ -215,8 +343,8 @@ static av_cold int init(AVFilterContext *ctx) >> #else >> s->band = 1; >> #endif >> - av_log(ctx, AV_LOG_VERBOSE, "x:%d y:%d, w:%d h:%d band:%d >> show:%d\n", >> - s->x, s->y, s->w, s->h, s->band, s->show); >> + av_log(ctx, AV_LOG_VERBOSE, "x:%d y:%d, w:%d h:%d band:%d mode:%d >> exponent:%f show:%d\n", >> + s->x, s->y, s->w, s->h, s->band, s->mode, s->exponent, >> s->show); >> >> s->w += s->band*2; >> s->h += s->band*2; >> @@ -226,6 +354,18 @@ static av_cold int init(AVFilterContext *ctx) >> return 0; >> } >> >> +static av_cold void uninit(AVFilterContext *ctx) >> +{ >> + DelogoContext *s = ctx->priv; >> + > >> + if (s->mode == MODE_UGLARM) { > > No need to test, we can free anyway. > >> + for (int plane = 0; plane < MAX_PLANES; plane++) { > > Forbidden variable declaration. > >> + av_freep(&s->uglarmtable[plane]); >> + av_freep(&s->uglarmweightsum[plane]); >> + } >> + } >> +} >> + >> static int config_input(AVFilterLink *inlink) >> { >> DelogoContext *s = inlink->dst->priv; >> @@ -270,20 +410,50 @@ static int filter_frame(AVFilterLink *inlink, >> AVFrame *in) >> if (!sar.num) >> sar.num = sar.den = 1; >> >> + if (s->mode == MODE_UGLARM) >> + av_assert0(desc->nb_components <= MAX_PLANES); >> + >> for (plane = 0; plane < desc->nb_components; plane++) { >> int hsub = plane == 1 || plane == 2 ? hsub0 : 0; >> int vsub = plane == 1 || plane == 2 ? vsub0 : 0; >> > >> + /* Up and left borders were rounded down, inject lost bits >> + * into width and height to avoid error accumulation */ > > Not: the alignment is off. > >> + int logo_w = AV_CEIL_RSHIFT(s->w + (s->x & ((1<<hsub)-1)), >> hsub); >> + int logo_h = AV_CEIL_RSHIFT(s->h + (s->y & ((1<<vsub)-1)), >> vsub); >> + >> + /* Init lookup tables once */ >> + if (s->mode == MODE_UGLARM) { >> + if (!s->uglarmtable[plane]) { >> + s->uglarmtable[plane] = > >> + (double*)av_malloc((logo_w - 1) * (logo_h - 1) * >> sizeof(double)); > > No casting of malloc returns in C, this is a c++ thing. Also, use > av_malloc_array(). > >> + >> + if (!s->uglarmtable[plane]) { >> + return AVERROR(ENOMEM); >> + } > > The braces could be dropped. > >> + >> + s->uglarmweightsum[plane] = >> + (double*)av_malloc((logo_w - 2) * (logo_h - 2) * >> sizeof(double)); >> + >> + if (!s->uglarmweightsum[plane]) { >> + return AVERROR(ENOMEM); >> + } > > Same as above, of course. > >> + >> + calc_uglarm_tables(s->uglarmtable[plane], >> + s->uglarmweightsum[plane], >> + sar, logo_w, logo_h, s->exponent); >> + } >> + } >> + > > I think this whole block could and should be moved into config_input(). > >> apply_delogo(out->data[plane], out->linesize[plane], >> in ->data[plane], in ->linesize[plane], >> AV_CEIL_RSHIFT(inlink->w, hsub), >> AV_CEIL_RSHIFT(inlink->h, vsub), >> sar, s->x>>hsub, s->y>>vsub, >> - /* Up and left borders were rounded down, inject >> lost bits >> - * into width and height to avoid error accumulation >> */ >> - AV_CEIL_RSHIFT(s->w + (s->x & ((1<<hsub)-1)), >> hsub), >> - AV_CEIL_RSHIFT(s->h + (s->y & ((1<<vsub)-1)), >> vsub), >> + logo_w, logo_h, >> s->band>>FFMIN(hsub, vsub), >> + s->uglarmtable[plane], >> + s->uglarmweightsum[plane], >> s->show, direct); >> } >> >> @@ -317,6 +487,7 @@ AVFilter ff_vf_delogo = { >> .priv_size = sizeof(DelogoContext), >> .priv_class = &delogo_class, >> .init = init, >> + .uninit = uninit, >> .query_formats = query_formats, >> .inputs = avfilter_vf_delogo_inputs, >> .outputs = avfilter_vf_delogo_outputs, > > Regards, > > -- > Nicolas George > _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel