On Mon, Apr 11, 2016 at 02:30:37PM +0200, Gerion Entrup wrote:
> On Monday, 11 April 2016 12:57:17 CEST Michael Niedermayer wrote:
> > On Mon, Apr 11, 2016 at 04:25:28AM +0200, Gerion Entrup wrote:
> > > On Thursday, 7 April 2016 00:35:25 CEST Michael Niedermayer wrote:
> > > > On Wed, Mar 30, 2016 at 11:02:36PM +0200, Gerion Entrup wrote:
> > > > > On Wednesday, 30 March 2016 22:57:47 CEST Gerion Entrup wrote:
> > > > > > Add improved patch.
> > > > >
> > > > > Rebased to master.
> > > > >
> > > >
> > > > >  Changelog                      |    1
> > > > >  configure                      |    1
> > > > >  doc/filters.texi               |   70 +++
> > > > >  libavfilter/Makefile           |    1
> > > > >  libavfilter/allfilters.c       |    1
> > > > >  libavfilter/signature.h        |  554 ++++++++++++++++++++++++++++++
> > > > >  libavfilter/signature_lookup.c |  550 ++++++++++++++++++++++++++++++
> > > > >  libavfilter/version.h          |    4
> > > > >  libavfilter/vf_signature.c     |  741 +++++++++++++++++++++++++++++++++++++++++
> > > > >  9 files changed, 1921 insertions(+), 2 deletions(-)
> > > > > 9192f27ded45c607996b4e266b6746f807c9a7fd  0001-add-signature-filter-for-MPEG7-video-signature.patch
> > > > > From 9646ed6f0cf78356cf2914a60705c98d8f21fe8a Mon Sep 17 00:00:00 2001
> > > > > From: Gerion Entrup <gerion.ent...@flump.de>
> > > > > Date: Sun, 20 Mar 2016 11:10:31 +0100
> > > > > Subject: [PATCH] add signature filter for MPEG7 video signature
> > > > >
> > > > > This filter does not implement all features of MPEG7.
> > > > > Missing features:
> > > > > - compression of signature files
> > > > > - work only on (cropped) parts of the video
> > > > > ---
> > > > >  Changelog                      |   1 +
> > > > >  configure                      |   1 +
> > > > >  doc/filters.texi               |  70 ++++
> > > > >  libavfilter/Makefile           |   1 +
> > > > >  libavfilter/allfilters.c       |   1 +
> > > > >  libavfilter/signature.h        | 554 ++++++++++++++++++++++++++++++
> > > > >  libavfilter/signature_lookup.c | 550 ++++++++++++++++++++++++++++++
> > > > >  libavfilter/version.h          |   4 +-
> > > > >  libavfilter/vf_signature.c     | 741 +++++++++++++++++++++++++++++++++++++++++
> > > > >  9 files changed, 1921 insertions(+), 2 deletions(-)
> > > > >  create mode 100644 libavfilter/signature.h
> > > > >  create mode 100644 libavfilter/signature_lookup.c
> > > > >  create mode 100644 libavfilter/vf_signature.c
> > > > >
> > > > > diff --git a/Changelog b/Changelog
> > > > > index 7b0187d..8a2b7fd 100644
> > > > > --- a/Changelog
> > > > > +++ b/Changelog
> > > > > @@ -18,6 +18,7 @@ version <next>:
> > > > >  - coreimage filter (GPU based image filtering on OSX)
> > > > >  - libdcadec removed
> > > > >  - bitstream filter for extracting DTS core
> > > > > +- MPEG-7 Video Signature filter
> > > > >
> > > > >  version 3.0:
> > > > >  - Common Encryption (CENC) MP4 encoding and decoding support
> > > > > diff --git a/configure b/configure
> > > > > index e550547..fe29827 100755
> > > > > --- a/configure
> > > > > +++ b/configure
> > > > > @@ -2979,6 +2979,7 @@ showspectrum_filter_deps="avcodec"
> > > > >  showspectrum_filter_select="fft"
> > > > >  showspectrumpic_filter_deps="avcodec"
> > > > >  showspectrumpic_filter_select="fft"
> > > > > +signature_filter_deps="gpl avcodec avformat"
> > > > >  smartblur_filter_deps="gpl swscale"
> > > > >  sofalizer_filter_deps="netcdf avcodec"
> > > > >  sofalizer_filter_select="fft"
> > > > > diff --git a/doc/filters.texi b/doc/filters.texi
> > > > > index 5d6cf52..a95f5a7 100644
> > > > > --- a/doc/filters.texi
> > > > > +++ b/doc/filters.texi
> > > > > @@ -11559,6 +11559,76 @@ saturation maximum: %@{metadata:lavfi.signalstats.SATMAX@}
> > > > >  @end example
> > > > >  @end itemize
> > > > >
> > > > > +@anchor{signature}
> > > > > +@section signature
> > > > > +
> > > > > +Calculates the MPEG-7 Video Signature. The filter could handle more than one
> > > > > +input. In this case the matching between the inputs could be calculated. The
> > > > > +filter passthrough the first input. The output is written in XML.
> > > > > +
> > > > > +It accepts the following options:
> > > > > +
> > > > > +@table @option
> > > > > +@item mode
> > > >
> > > > > +Enable the calculation of the matching. The option value must be 0 (to disable
> > > > > +or 1 (to enable). Optionally you can set the mode to 2. Then the detection ends,
> > > > > +if the first matching sequence it reached. This should be slightly faster.
> > > > > +Per default the detection is disabled.
> > > >
> > > > these should probably support named identifiers not (only) 0/1/2
> > > done
> > > > it should use AV_OPT_TYPE_INT and AV_OPT_TYPE_CONST not a string
> > > >
> > > > > +
> > > > > +@item nb_inputs
> > > > > +Set the number of inputs. The option value must be a non negative interger.
> > > > > +Default value is 1.
> > > > > +
> > > > > +@item filename
> > > > > +Set the path to witch the output is written. If there is more than one input,
> > > > > +the path must be a prototype, i.e. must contain %d or %0nd (where n is a positive
> > > > > +integer), that will be replaced with the input number. If no filename is
> > > > > +specified, no output will be written. This is the default.
> > > > > +
> > > >
> > > > > +@item xml
> > > > > +Choose the output format. If set to 1 the filter will write XML, if set to 0
> > > > > +the filter will write binary output. The default is 0.
> > > >
> > > > format=xml/bin/whatever
> > > > seems better as it's more extensible
> > > done
> > > >
> > > > > +
> > > > > +@item th_d
> > > > > +Set threshold to detect one word as similar. The option value must be an integer
> > > > > +greater than zero. The default value is 9000.
> > > > > +
> > > > > +@item th_dc
> > > > > +Set threshold to detect all words as similar. The option value must be an integer
> > > > > +greater than zero. The default value is 60000.
> > > > > +
> > > > > +@item th_xh
> > > > > +Set threshold to detect frames as similar. The option value must be an integer
> > > > > +greater than zero. The default value is 116.
> > > > > +
> > > > > +@item th_di
> > > > > +Set the minimum length of a sequence in frames to recognize it as matching
> > > > > +sequence. The option value must be a non negative integer value.
> > > > > +The default value is 0.
> > > > > +
> > > > > +@item th_it
> > > > > +Set the minimum relation, that matching frames to all frames must have.
> > > > > +The option value must be a double value between 0 and 1. The default value is 0.5.
> > > > > +@end table
> > > > > +
> > > > > +@subsection Examples
> > > > > +
> > > > > +@itemize
> > > > > +@item
> > > > > +To calculate the signature of an input video and store it in signature.xml:
> > > > > +@example
> > > > > +ffmpeg -i input.mkv -vf signature=filename=signature.xml -map 0:v -c rawvideo -f null -
> > > > > +@end example
> > > >
> > > > the output seems to differ between 32 and 64bit x86
> > > > this would make any regression testing rather difficult
> > > > why is there a difference ? can this be avoided or would that result in
> > > > some disadvantage ?
> > > This is due to this line:
> > > sum -= ((double) blocksum)/(blocksize * denum);
> > >
> > > sum was a double. It seems the difference leads to different results in 32 and 64 bit
> > > (the 5th decimal place).
> > > I have reworked the filter part so it does not use double at all.
> > > This also leads to somewhat fewer divisions, but the numbers get really big.
> > > The relevant parts use int64_t.
> > >
> > > If the videos get really big, the numbers could overflow. Can I restrict
> > > this in some way?
> > >
> > > An upper bound can be found with:
> > > 255 * BLOCK_LCM * (width/32+1)^2 * (height/32+1)^2 < 2^63
> > > I tested it with 4K (UHD) input. This does not give any problems, but it
> > > is near the limit.
> > > (As a note: 4K in particular is a certain amount under the limit, because
> > > the width 3840 is divisible by 32, so the square in the above formula can
> > > be dropped.)
> > >
> > > The filter should now generate the same signatures on 32 and 64 bit as it
> > > did on 64 bit before.
> >
> > if you really need more than 64bit ints you can take a look at
> > libavutil/integer.h
> > it would be better if the operations can be reshuffled to keep using
> > intXY_t
> This depends. IMHO 4K UHD is enough for now, and given that you can simply
> rescale a higher resolution to somewhat below it without changing the function
> of the signature, I would simply add a check in config_input or so that throws
> an error if the resolution is too high. Would this be ok?
not optimal but ok i guess

[...]

[...]
> > >
> > > Then I added a few TODOs in the code, about parts I don't know. It would be
> > > nice if you commented there, too.
> > >
> > > I attached the new (complete) patch, the diff to the last time and the
> > > updated check script.
> >
> > looks like the old patch + diff to the new
> Yes. I thought you could see the differences to the already reviewed patch much
> faster.

the new patch + diff is better than the old + diff for testing that is

> > >
> > [...]
> > > +static int filter_frame(AVFilterLink *inlink, AVFrame *picref)
> > > +{
> > > +    AVFilterContext *ctx = inlink->dst;
> > > +    SignatureContext *sic = ctx->priv;
> > > +    StreamContext *sc = &(sic->streamcontexts[FF_INLINK_IDX(inlink)]);
> > > +    FineSignature* fs;
> > > +
> > > +    static const uint8_t pot3[5] = { 3*3*3*3, 3*3*3, 3*3, 3, 1 };
> > > +    /* indexes of words : 210,217,219,274,334 44,175,233,270,273 57,70,103,237,269 100,285,295,337,354 101,102,111,275,296
> > > +       s2usw = sorted to unsorted wordvec: 44 is at index 5, 57 at index 10...
> > > +    */
> > > +    static const unsigned int wordvec[25] = {44,57,70,100,101,102,103,111,175,210,217,219,233,237,269,270,273,274,275,285,295,296,334,337,354};
> > > +    static const uint8_t s2usw[25] = { 5,10,11, 15, 20, 21, 12, 22, 6, 0, 1, 2, 7, 13, 14, 8, 9, 3, 23, 16, 17, 24, 4, 18, 19};
> > > +
> > > +    uint8_t wordt2b[5] = { 0, 0, 0, 0, 0 }; /* word ternary to binary */
> > > +    uint64_t intpic[32][32];
> > > +    uint64_t rowcount;
> > > +    uint8_t *p = picref->data[0];
> > > +    int inti, intj;
> > > +    int *intjlut;
> > > +
> > > +    double conflist[DIFFELEM_SIZE];
> > > +    int f = 0, g = 0, w = 0;
> > > +    int dh1 = 1, dh2 = 1, dw1 = 1, dw2 = 1, denum, a, b;
> > > +    int i,j,k,ternary;
> > > +    uint64_t blocksum;
> > > +    int blocksize;
> > > +    double th; /* threshold */
> > > +    double sum;
> > > +
> > > +    /* initialize fs */
> > > +    if(sc->curfinesig){
> > > +        fs = av_mallocz(sizeof(FineSignature));
> > > +        if (!fs)
> > > +            return AVERROR(ENOMEM);
> > > +        sc->curfinesig->next = fs;
> > > +        fs->prev = sc->curfinesig;
> > > +        sc->curfinesig = fs;
> > > +    }else{
> > > +        fs = sc->curfinesig = sc->finesiglist;
> > > +        sc->curcoursesig1->first = fs;
> > > +    }
> > > +
> > > +    fs->pts = picref->pts;
> > > +    fs->index = sc->lastindex++;
> > > +
> > > +    memset(intpic, 0, sizeof(uint64_t)*32*32);
> > > +    intjlut = av_malloc(inlink->w * sizeof(int));
> > > +    if (!intjlut)
> > > +        return AVERROR(ENOMEM);
> > > +    for (i=0; i < inlink->w; i++){
> > > +        intjlut[i] = (i<<5)/inlink->w;
> > > +    }
> > > +
> > > +    for (i=0; i < inlink->h; i++){
> > > +        inti = (i<<5)/inlink->h;
> > > +        for (j=0; j< inlink->w; j++){
> > > +            intj = intjlut[j];
> > > +            intpic[inti][intj] += p[j];
> > > +        }
> > > +        p += picref->linesize[0];
> > > +    }
> > > +    av_free(intjlut);
> > > +
> > > +    /* The following calculate a summed area table (intpic) and brings the numbers
> > > +     * in intpic to to the same denuminator.
> > > +     * So you only have to handle the numinator in the following sections.
> > > +     */
> > > +    dh1 = inlink->h/32;
> > > +    if (inlink->h%32)
> > > +        dh2 = dh1 + 1;
> > > +    dw1 = inlink->w/32;
> > > +    if (inlink->w%32)
> > > +        dw2 = dw1 + 1;
> >
> > > +    denum = dh1 * dh2 * dw1 * dw2;
> >
> > this will overflow if w and h are not multiples of 32 and large
> > the multiplication is done in 32bit not 64
> Don't get it. All of these are 32 bit integers. Given the input is
> 3842x2160 (nearly 4K), this would lead to a denum of:
> 120 * 121 * 67 * 68 = 66153120
>
> This is far below the 32 bit maximum.

it will overflow with higher resolution

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

I do not agree with what you have to say, but I'll defend to the death your
right to say it. -- Voltaire
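
The overflow point raised above can be illustrated with a small stand-alone sketch. It is not part of the patch: the helper names denum64 and denum_fits_int and the 65535 size limit are assumptions made for illustration only. It computes the same product as the quoted code (dh1 * dh2 * dw1 * dw2, with dh2 and dw2 left at 1 when the dimension is a multiple of 32), but in 64-bit arithmetic, so the caller can test whether the result would still fit into a 32-bit int before using it.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper (not from the patch): computes
 * denum = dh1 * dh2 * dw1 * dw2 as in the quoted filter_frame(),
 * but with 64-bit operands so the multiplication itself cannot wrap.
 * The 65535 limit is an assumed sanity bound; with it the product
 * stays far below 2^63. Returns -1 for sizes outside that bound. */
int64_t denum64(int w, int h)
{
    int64_t dh1, dh2 = 1, dw1, dw2 = 1;

    if (w <= 0 || h <= 0 || w > 65535 || h > 65535)
        return -1;

    dh1 = h / 32;
    if (h % 32)
        dh2 = dh1 + 1;
    dw1 = w / 32;
    if (w % 32)
        dw2 = dw1 + 1;

    return dh1 * dh2 * dw1 * dw2;
}

/* Hypothetical check of the kind that could sit in config_input:
 * returns 1 if the product fits into a 32-bit int, 0 otherwise
 * (including sizes rejected by denum64). */
int denum_fits_int(int w, int h)
{
    int64_t d = denum64(w, h);
    return d >= 0 && d <= INT_MAX;
}
```

For the 3842x2160 case discussed above, denum64 gives 66153120 and the check passes. For a made-up 8001x8001 input the product is 250 * 251 * 250 * 251 = 3937562500, which no longer fits into a signed 32-bit int, so a check along these lines would have to reject it or the computation would have to stay in 64 bit.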