On duodi, 2 Brumaire, year CCXXIII, Clement Boesch wrote:
> More still-standing problems while we are at it:
> 
> 1.7. Metadata
> 
> Metadata are not available at "graph" level, or at least filter
> level, only at frame level. We also need to define how they can be
> injected and fetched by the users (think "rotate" metadata).

That is an interesting issue. At graph level, it is easy, but it would
be mostly useless (the rotate filter changes the rotate metadata). At
filter level, it is harder, because it requires all filters to forward
the metadata at init time, and therefore extra code in a lot of
filters. Furthermore, since our graphs are not constructed in order,
and can even theoretically contain cycles, it requires another pass
over the graph to ensure stabilization. The whole query_formats() /
config_props() mechanism is already too complex IMHO.

Actually, I believe I can propose a simple solution: inject the stream
metadata as frame metadata on dummy frames. Filters that need it are
changed to examine the dummy frames; filters that do not need it just
ignore them and let the framework forward them.
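To make the idea concrete, here is a minimal sketch of the injection
side, using the existing AVFrame / AVDictionary API; the function name,
the "rotate" key and the surrounding plumbing are made up for the
example, nothing here is committed code:

    #include "libavutil/dict.h"
    #include "libavutil/frame.h"
    #include "avfilter.h"
    #include "internal.h"   /* ff_filter_frame() */

    /* Sketch: attach stream-level metadata to the metadata dictionary
     * of a (possibly dummy) frame, so that interested downstream
     * filters can read it while the others just forward the frame. */
    static int forward_with_stream_metadata(AVFilterLink *inlink,
                                            AVFrame *frame,
                                            const char *rotate_value)
    {
        int ret = av_dict_set(&frame->metadata, "rotate", rotate_value, 0);
        if (ret < 0) {
            av_frame_free(&frame);
            return ret;
        }
        return ff_filter_frame(inlink->dst->outputs[0], frame);
    }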
(Of course, the whole metadata system can never work perfectly: the
scale filter does not update any "dpi" metadata; the crop filter would
need to update the "aperture" metadata for photos, and if the crop is
not centered, I am not even sure that makes sense; etc. If someone adds
"xmin", "xmax", "xscl" (no, not this one, bad work habit), "ymin",
"ymax" to the frames produced by vsrc_mandelbrot, or the geographic
equivalent to satellite images, how is the rotate filter supposed to
handle that? The best answer is probably "we do not care much".)

> 1.8. Seeking
> 
> Way more troublesome: being able to request an exact frame in the past.
> This currently limits a lot the scope of the filters.
> 
> The thumbnail filter is a good example of this problem: the filter
> doesn't need to keep all the frames it analyzes in memory, it just
> needs statistics about them, and then fetches the best in the batch.
> Currently, it needs to keep them all because we are in a forward,
> stream-based logic. This model is kind of common and quite a pain to
> implement currently.
> 
> I don't think the compression you propose at the end would really
> solve that.

You raise an interesting point. Unlimited FIFOs (with or without
external storage or compression: they are just means of handling larger
FIFOs with smaller hardware) can be of some help in that case, but not
much.

In the particular example you give, I can imagine a solution with two
filters: thumbnail-detect outputs just pseudo-frame metadata with the
timestamps of the selected thumbnail images, and thumbnail-select uses
that metadata from one input, reading the actual frames from its second
input connected to a large FIFO. But that is outright ugly.

For actual seeking, I suppose we would need a mechanism to send
messages backward in the graph. As for the actual implementation, I
suppose that a filter that supports seeking would be required to
advertise so on its output ("I can seek back to pts=42"), and a filter
that requires seeking from its input would give forewarning ("I may
need to seek back to pts=12"), so that the framework can buffer all
frames from 12 to 42. That requires thinking.

> 1.9. Automatic I/O count
> 
> "... [a] split [b][c] ..." should guess there are 2 outputs.
> "... [a][b][c] concat [d] ..." as well.

I believe this one is pretty easy, design-wise, in fact: just decide on
a standard name for the options that give the number of inputs and
outputs, maybe just nb_inputs and nb_outputs, and then it is only a
matter of tweaking the graph parser to set them when possible and
necessary.
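Roughly, the parser-side tweak could look like the sketch below; the
av_opt_*() calls are the real API, but the option name, the helper and
the label counting are purely hypothetical:

    #include "libavutil/opt.h"
    #include "avfilter.h"

    /* Hypothetical helper for the graph parser: if the filter
     * declares an option with the agreed standard name, set it to
     * the number of output labels counted while parsing. */
    static void set_auto_output_count(AVFilterContext *filter,
                                      unsigned nb_output_labels)
    {
        if (nb_output_labels &&
            av_opt_find(filter->priv, "nb_outputs", NULL, 0, 0))
            av_opt_set_int(filter->priv, "nb_outputs",
                           nb_output_labels, 0);
    }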
> Did you already start some development? Do you need help?
> 
> I'm asking because it looks like it could be split into small,
> relatively easy tasks on the Trac, which would help introduce
> newcomers (and also track the progress if some people assign
> themselves to these tickets).

I have not started writing code: for a large redesign, I would not risk
someone telling me "this is stupid, you can do the same thing ten times
simpler like that". You are right, some of the points I raise are
mostly stand-alone tasks.

> > AVFilterLink.pts: current timestamp of the link, i.e. the end
> > timestamp of the last forwarded frame, assuming the duration was
> > correct. This is somewhat redundant with the fields in AVFrame,
> > but can carry the information even when there is no actual frame.
> 
> The timeline system seems to be able to work around this. How is this
> going to help?

I do not see how this is related. When the timeline system is invoked,
there is a frame, with a timestamp. The timestamp may be NOPTS, but
that is just a matter for the enable expression to handle correctly.

The issue I am trying to address is the one raised in this example:
suppose overlay detects EOF on its secondary input; the last secondary
frames were at PTS 40, 41, 42, and now here comes a main frame at PTS
42.04: should overlay slap the last secondary frame on it or not? In
this particular case, it is pretty obvious that EOF happens at PTS 43
(the secondary frames arrive one PTS unit apart), but teaching a
program to see the obvious is not easy, and it may actually be wrong:
the previous filters, or the application (through the demuxer), may
have more accurate information.

What I propose for that issue is to have something like
AVFilterLink.head_pts that records the PTS of the last activity on a
link. When a frame is passed on the link, it is updated to frame.pts +
frame.duration, but it may be updated by other circumstances too. The
core idea is to have as much information as possible directly available
to filters without requiring them to work for it. A filter could always
update head_pts in its own private context, but then, if a new way of
updating it is added, all filters would need to be updated.

> Well, I believe it should be handled by the framework transparently
> somehow.

Yes, exactly.

> Users can already fix the timestamps themselves with [a]setpts
> filters, but it's often not exactly obvious why they need to.

Hum, I do not think that setpts is suitable in this case: you can not
use it to set frame.duration = next_frame.pts - frame.pts, because
next_frame is not available. Even for the cases it can handle, it
requires complex formulas (with escaping; I still have to look at
Michael's patch for balanced escaping); we do not want users to
copy-paste half-broken expressions found in obsolete examples on the
web. Plus, it uses floats. A dedicated filter seems more correct:
fixpts=delta2duration=1, for example.

> It doesn't need to be a filter and can be part of the framework itself.

I believe the framework should do the work, but also expose it as an
internal API to be used by the fixpts filter when explicit handling and
user-settable options are necessary.
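For the head_pts proposal above, the framework-side update could be as
simple as the following sketch, following the frame.pts +
frame.duration rule described earlier; head_pts is the proposed field,
it does not exist yet, and the helper name is made up:

    /* Sketch: run by the framework whenever a frame is passed on a
     * link; head_pts records the PTS of the last activity, here the
     * end of the frame just forwarded.  Other events, e.g. EOF hints
     * from upstream, could update it through the same path. */
    static void link_update_head_pts(AVFilterLink *link,
                                     const AVFrame *frame)
    {
        if (frame->pts != AV_NOPTS_VALUE && frame->duration > 0)
            link->head_pts = frame->pts + frame->duration;
    }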
Regards,

-- 
  Nicolas George