On Thursday, 17 July 2014, 13:00:13, Clément Bœsch wrote:
> On Thu, Jul 17, 2014 at 12:33:41PM +0200, Gerion Entrup wrote:
> > Good day,
> >
> > I'm currently working on a video signature filter for ffmpeg. This
> > allows you to fingerprint videos.
>
> Oh, nice.
>
> > This fingerprint is built up of 9mb/s of bits or 2-3 mb/s bits compressed.

Argh, fail, sorry. I meant: 9 mb per hour of video (and 2-3 mb per hour
compressed).

> > In this context a few questions come to my mind:
> > - Should I print this whole bitstream to stdout/stderr at the end? Or
> >   is it maybe a better choice to make an own stream out of this? But
> >   which type of stream would that be?
>
> What does the fingerprint look like? Could it make sense as a gray video
> output, a fractal, or maybe some kind of audio signal?

There are fine signatures per frame and coarse signatures per 90 fine
signatures. A coarse signature is a binarized histogram (0 or 1 possible as
count). A fine signature is mainly a vector of 380 difference values
between -128 and 127, which are ternarized into 0, 1 or 2. (See the MPEG-7
standard for more details.)
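Roughly, the ternarization step looks like this in code (a simplified
sketch only; the real MPEG-7 procedure defines the thresholds per element):

    /* Simplified sketch: map one fine-signature difference value in
     * [-128, 127] to 0, 1 or 2. A single fixed threshold is assumed here
     * for illustration; the standard uses per-element thresholds. */
    static int ternarize(int diff, int threshold)
    {
        if (diff < -threshold)
            return 0;
        if (diff > threshold)
            return 2;
        return 1;
    }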
I doubt this is a good video or audio stream. Interpreting it as video
definitely makes sense in some way, but metadata looks more useful.

> Also, you still have the string metadata possibility (git grep SET_META
> libavfilter).

Hmm, thank you, I will take a look at it. If I see it right, it is used to
fill a dictionary per frame with some kind of data?

> > (btw, the video signature algorithm needs 90 consecutive frames, so I
> > can theoretically write something somewhere every 90 frames.)
>
> Do you cache all these frames or just update your caches/stats & drop
> them?

ATM I don't cache the frames, but the whole signature. As said above, the
coarse signatures (the part which needs the 90 frames) are calculated only
from the fine signatures (and the fine signatures are cached anyway).

> > - If I print the whole bitstream to stdout/stderr (my current
> >   implementation), is there a possibility to use this later in an
> >   external program? The only other globally analyzing filter I found
> >   is volumedetect. That filter prints the calculated results to the
> >   console via print_stats at the end. Is there a possibility within
> >   the API for an external program to use these values, or do I have to
> >   grep the output?
>
> stdout/stderr really isn't a good thing. Using metadata is way better
> because you can output it from ffprobe, and parse it according to
> various output formats (XML, CSV, JSON, ...).

Sounds good…

> Another solution I can now think of is to simply pass an output file as
> an option to the filter. That's typically how we do the 2-pass thing
> with the vidstab filter.

I don't like output files. If you want to write a program that looks up
signatures stored somewhere in a database, and this program uses ffmpeg
internally, it would always have to write a file and read it again; that's
not very elegant. (btw, an example for such a program is MusicBrainz
Picard, but for AcoustID ;))

> [...]
> > Another thing that came to my mind: can a filter force other filters
> > to go into the filterchain? I see that when I force GREY_8 only in my
> > filter, it automatically enables the scale filter, too.
>
> Some filters are inserted automatically for conversion & constraints,
> but that's not decided by the filters but by the framework itself.

> > The reason I asked is the lookup for my filter. Currently my filter
> > analyzes a video and then produces a lot of numbers. To compare two
> > videos and decide whether they match or not, these numbers have to be
> > compared. I see three possibilities:
> > 1. Write a VV->V filter. Reimplement (copy) the code from the V->V
> >    signature filter and give a boolean as output (match or no match).
> > 2. Take the V->V filter and write a Python (or whatever) script that
> >    fetches the output and then calculates the rest.
> > 3. Write a VV->V filter, but enforce that the normal signature filter
> >    is executed first on both streams, use the result and then
> >    calculate the matching type. Unfortunately I have no idea how to do
> >    this, and whether it is possible at all. Can you give me advice?
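(Coming back to the metadata idea from above: I imagine the per-frame
dictionary fill looks roughly like this. A minimal sketch only; the key
name "lavfi.signature.fine" is made up here, and the SET_META helpers in
libavfilter are thin wrappers around av_dict_set().)

    #include <libavutil/dict.h>
    #include <libavutil/frame.h>

    /* Minimal sketch: attach a serialized per-frame signature string to
     * the frame's metadata dictionary, so ffprobe can print it as XML,
     * CSV, JSON, ... The key name is hypothetical. */
    static int set_fine_signature_meta(AVFrame *frame, const char *serialized)
    {
        return av_dict_set(&frame->metadata, "lavfi.signature.fine",
                           serialized, 0);
    }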
> So if you output a file in the filter itself:
>
>   ffmpeg -i video -vf fingerprint=video.sig -f null -
>   ffmpeg -i another -vf fingerprint=video.sig:check=1 -f null -
>
> Or if you save the signature "stream" in a video (in gray8 for instance):
>
>   ffmpeg -i video -vf fingerprint -c:v ffv1 sig.nut
>   ffmpeg -i another -i sig.nut -vf '[0][1] fingerprint=mode=check' -f null -
>
> The 2nd method is "better" because it doesn't require file handling in
> the library, and it also allows stuff like using a diff filter (if you
> also apply fingerprint - not with mode=check - on `another`).
>
> Am I understanding your wondering right?

No ;), but anyway thanks for your answer. In your 2nd method your filter is
a VV->V filter? Am I right that this filter then can also take only one
stream? Said another way: can a VV->V filter also behave as a V->V filter?
My original thinking was something like this (view it monospaced):

  in1 ---> fingerprint1 ---.
                           |---> fingerprintcombo ---> out
  in2 ---> fingerprint2 ---`

fingerprintcombo could somehow force the framework to insert fingerprint1
and fingerprint2 into the filterchain, then use their output to calculate
the matching. Your second proposal is better :) (if it works as V->V, too).

> > The last possibility would also allow something like two-pass volume
> > normalisation. Currently there are a volumedetect and a volume filter.
> > To normalize, one could run volumedetect, then fetch the output and
> > put the values into the volume filter, but I currently don't see a way
> > to do this automatically, directly in ffmpeg.
>
> Check tools/normalize.py, it's using ebur128 and the metadata system.

That's what I mean. Someone has to write an external script which calls
ffmpeg/ffprobe two times, parses stdout of the first call and passes it to
the filter options of the second call. As I see it, there is no direct way.
Something like:

  ffmpeg -i foo -af volume=mode=autodetect normalized.opus

Internally:
- the volume filter recognizes: I need a value from ebur128, volumedetect,
  etc. and tells the framework: I cannot work, I need input from ...
- the framework inserts volumedetect and says: here is your input, do what
  you want
- volumedetect tells volume: here is my output, do your work
- volume tells the framework: I could work now, but I would need the first
  sample again.

The normalize script e.g. has the disadvantage that it takes only one
sound stream (if I see it correctly).

Anyway, thank you for your answers, they have helped a lot so far.

> > (Once the filter is in a good state, I will try to bring it upstream.)
>
> Cool

> > Best,
> > Gerion
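PS: for reference, the metadata hand-off that tools/normalize.py relies on
looks roughly like this from the consumer side (a minimal sketch;
"lavfi.r128.I" is the integrated-loudness key the ebur128 filter writes,
and the 0.0 fallback is my assumption):

    #include <stdlib.h>
    #include <libavutil/dict.h>
    #include <libavutil/frame.h>

    /* Minimal sketch: read the integrated loudness that ebur128 stores
     * in the per-frame metadata dictionary. normalize.py reads the same
     * key through ffprobe instead of the C API. */
    static double get_integrated_loudness(const AVFrame *frame)
    {
        AVDictionaryEntry *e = av_dict_get(frame->metadata, "lavfi.r128.I",
                                           NULL, 0);
        return e ? atof(e->value) : 0.0; /* fallback is an assumption */
    }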