Let me respond on two levels. Before exploring the design space of a separation of libavdevice and libavformat below, I think it is important to first comment on the current state (and whether the AVDevice Capabilities part of my patch series should be blocked by this discussion).
Importantly, I would suppose that any reorganization of libavdevice and libavformat and any redesign of the libavdevice API must aim to offer at least the same functionality as the current API. That is, an avdevice should be queryable for what devices it offers (get_device_list), should for each device provide information about what formats it accepts or can provide (create_device_capabilities/free_device_capabilities), and should be controllable through the API (control_message). Perhaps these take different forms, but the same functionality should be offered.

As such, having the AVDevice Capabilities API implemented for one of the devices should help, not hamper, redesign efforts, because it shows how this API would actually be used in practice. Fundamental changes such as a new avdevice API will be backwards incompatible no matter what, so having one more bit of important functionality (create_device_capabilities/free_device_capabilities) implemented doesn't raise the threshold for starting such a redesign effort. Instead, it forces all the current API functionality to be thought through during the redesign, so that nothing is forgotten. I thus argue that it's a good thing to bring back the AVDevice Capabilities API, since it helps, not hinders, the redesign effort. And let's not forget that it offers users of the current API (me at least) functionality they need now, not at some indeterminate point in the future.

On Wed, Jun 9, 2021 at 10:33 PM Anton Khirnov <an...@khirnov.net> wrote:
> Look through the threads
> [...]

Thanks for the pointers!

> The problem is that libavdevice is a separate library from libavformat,
> but fundamentally depends on accessing libavformat internals.

Ah ok, so this is in the first instance about cleanup/separation, not necessarily about adding new functionality (I do see Mark's list of opportunities that a new API offers, copied below).
I see Nicolas argue that this entanglement of internals is not a problem in practice, and I suppose there is a certain amount of taste involved here. Nothing wrong with that. Personally, I find it a little funky to have to add or change things in AVFormat when changing the AVDevice API, and it may be good to look at disentangling them in the longer term. I will get back to that below, in response to some quotes from Mark's messages last January.

Mark's (non-exhaustive) list of opportunities a libavdevice API redesign offers (numbered by me):

On 20/01/2021 12:41, Mark Thompson wrote:
> 1. Handle frames as well as packets.
> 1a. Including hardware frames - DRM objects from KMS/V4L2, D3D surfaces from Windows desktop duplication (which doesn't currently exist but should).
> 2. Clear core option set - currently almost everything is set by inconsistent private options; things like pixel/sample format, frame/sample rate, geometry and hardware device should be common options to all.
> 3. Asynchronicity - a big annoyance in current recording scenarios with the ffmpeg utility is that both audio and video capture block, and do so on the same thread which results in skipped frames.
> 4. Capability probing - the existing method of options which log the capabilities are not very useful for API users.

1 and 3 I cannot speak to, but 4 is indeed what I ran into: the current state of most avdevices is not useful at all for an API user like me when it comes to capability probing. That is not a reason to get rid of the whole API, though, but a reason to wonder why it wasn't implemented. While nobody apparently bothered to do it before me, I think more people than just me will actually use it. Currently I'd have to issue device-specific options on a not-yet-opened device, listen to the log output, parse it, etc. But the current API already solves this, if only it were implemented.

A clear core option set would be nice indeed.
And the AVDevice Capabilities API actually offers a start at that, since it lists a bunch of options that should be relevant to query (and set) for each device, in the form of ff_device_capabilities (in my patchset), or av_device_capabilities before Andreas' patch removed it in January. I don't think it's complete, but it's a good starting point.

Mark Thompson (2021-01-25):
> * Many of those are using it via the ffmpeg utility, but not all.

Indeed, I am an (aspiring) API user, of the dshow device specifically, and possibly v4l2 later (but my project is Windows-only right now). I am currently hampered by some APIs not being implemented for dshow, hence my patch set.

> * The libavdevice API is the libavformat API because it was originally
>   split out from libavformat, and it has the nice property that devices
>   and files end up being interchangable in some contexts.

I can't underline enough how nice this is. My situation is simple: devices such as webcams (but plenty of others too) may deliver video in various formats, including encoded ones. I would have to decode those to use them, so output provided by the devices has to go through much the same pipeline as data from video files. I already had code for reading video files, so the changes needed to also support webcams were absolutely minimal. However, I needed some APIs implemented to really round things off and make things both convenient (already the case) and flexible (my patch set).

> * The libavdevice API, being the libavformat API for files, is not
>   particularly well-suited in other contexts, because devices may not
>   have the same properties as files.

Yeah, not every field in the AVFormatxxx structs is relevant for an AVDevice. And some are a bit funkily named (like url, into which I stuff the device name of my webcam). But are there specific fields one would wish to provide for an avdevice that are currently not available?

> * Some odd things like the completely-unused capabilities API and the
>   almost-never-used message API are hacked on top of that to try to
>   avoid some libavformat issues, but are not actually useful to anyone
>   (hence the lack of use).

They certainly are useful! As are the avdevices themselves. I was surprised that these APIs are not, or hardly, implemented. My patch set makes my webcam much more useful: I am now able to pause and restart capture (so no buffer fills up while I am not interested in the output!), and I can discover what devices the user has attached and what formats these expose, so I can make a proper UI (like e.g. OBS Studio has). And making this UI is minimal effort, as I do not first have to learn how to work with DirectShow or add yet another dependency to my application (again, ffmpeg would be needed anyway, as I'd need to decode incoming video). It makes ffmpeg a tool that allows you to move fast, something you can really build upon, without losing out on device-specific config/access.

> * To implement devices as AVInputFormat/AVOutputFormat instances,
>   libavdevice currently needs access to the internals of libavformat.
> * Many developers want to get rid of that dependency on libavformat
>   internals, because it creates a corresponding ugliness on the
>   libavformat side which has to leave those parts exposed in an
>   ABI-constrained way.

What specific internals does libavdevice depend on? Is it only the various function pointers in AVInputFormat and AVOutputFormat that are specific to devices rather than shared by all formats? Or is there more? I also understand that avdevices need to implement some of the other function pointers to be functional (e.g. read_header, read_packet and read_close), but that seems unavoidable if we want avdevices to be usable where avformats are (and again: that's a huge plus in my view). I also understand that the AVDevice API being exposed in libavformat makes it harder to evolve the AVDevice API.
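
To make concrete why implementing read_header, read_packet and read_close is what lets a device be used wherever a format is, here is a minimal, self-contained sketch. Nothing in it is actual libavformat code: demo_input_format, demo_device_state and all demo_* functions are hypothetical names I made up for illustration, and the real AVInputFormat has many more members and different signatures. The only point is that the consumer (demo_read_all) sees just a table of callbacks and cannot tell whether a file or a device sits behind it.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the callback table a demuxer or device fills in.
 * NOT the real AVInputFormat; it only models the three callbacks named above. */
typedef struct demo_input_format {
    const char *name;
    int (*read_header)(void *priv, const char *url);
    int (*read_packet)(void *priv, char *buf, size_t buf_size);
    int (*read_close)(void *priv);
} demo_input_format;

/* Private state of a made-up capture device. */
typedef struct demo_device_state {
    char device_name[64];
    int  frames_left;
    int  is_open;
} demo_device_state;

static int demo_dev_read_header(void *priv, const char *url)
{
    demo_device_state *s = priv;
    /* The "url" carries the device name, e.g. "video=Integrated Webcam". */
    if (!url || strncmp(url, "video=", 6) != 0)
        return -1;
    snprintf(s->device_name, sizeof(s->device_name), "%s", url + 6);
    s->frames_left = 3; /* pretend the device delivers three frames */
    s->is_open = 1;
    return 0;
}

static int demo_dev_read_packet(void *priv, char *buf, size_t buf_size)
{
    demo_device_state *s = priv;
    int n;
    if (!s->is_open || s->frames_left <= 0)
        return -1; /* nothing more to deliver */
    n = snprintf(buf, buf_size, "frame from %s", s->device_name);
    if (n < 0 || (size_t)n >= buf_size)
        return -1; /* formatting failed or output was truncated */
    s->frames_left--;
    return n;
}

static int demo_dev_read_close(void *priv)
{
    demo_device_state *s = priv;
    s->is_open = 0;
    return 0;
}

/* The device exposes itself through the same table a file demuxer would. */
static const demo_input_format demo_device = {
    .name        = "demo_device",
    .read_header = demo_dev_read_header,
    .read_packet = demo_dev_read_packet,
    .read_close  = demo_dev_read_close,
};

/* Generic consumer: knows only the callback table.
 * Returns the number of packets read, or -1 if opening failed. */
static int demo_read_all(const demo_input_format *fmt, void *priv, const char *url)
{
    char buf[128];
    int count = 0;
    if (fmt->read_header(priv, url) < 0)
        return -1;
    while (fmt->read_packet(priv, buf, sizeof(buf)) >= 0)
        count++;
    fmt->read_close(priv);
    return count;
}
```

A caller would zero-initialize a demo_device_state and pass it, together with &demo_device and a "video=..." string, to demo_read_all. Swapping in a file-backed table with the same three callbacks would need no change on the consumer side, which is the property I would not want to lose.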

Let me make an observation though: if we do not want to lose the possibility of using avdevices as drop-in replacements for AVFormats, some kind of component that has access to the internals of both seems unavoidable. To me, the logical way to keep AVDevices interchangeable with AVFormats while separating out the AVDevice API would be to provide some kind of generic avdevice wrapper/adapter format that translates between the AVFormat and AVDevice APIs. This wrapper would presumably be an AVFormat, but for it to work it would need access to AVDevice internals (if only to remap function pointers). If it instead lives in the avdevice library, it would need access to AVFormat internals. So some entanglement involving internals is unavoidable, and a bullet that has to be bitten. Agreed?

Anyway, out of Mark's options I'd vote for a separate new AVDevice API, and an adapter component to expose/plug in AVDevices as formats. This general adapter can expose the generic options (and device-specific options as child options), handle any threading as needed, map device names to the url field, etc. The workflow could then be something like this (rough proposal to get this started):

    AVDeviceContext* dev_ctx = avdevice_alloc_context();
    AVInputDevice* dev_inp_ctx = av_find_input_device("dshow"); // or av_input_device_next(AV_DEVICE_VIDEO) or av_device_next(AV_DEVICE_VIDEO | AV_DEVICE_INPUT) for any
    avdevice_open_input(dev_ctx, dev_inp_ctx, options);
    // or:
    AVDeviceContext* dev_ctx = avdevice_alloc_input_context(AVInputDevice* device, const char* dev_name); // e.g. dev_name="dshow"
    avdevice_open_input(dev_ctx, NULL, options);
    // and similar for output.

    // to start capture, discovers stream parameters if not yet known
    avdevice_start();
    // to just discover stream parameters without starting
    avdevice_probe(); // after open
    // NB: need to provide a way for devices to provide multiple streams
    // (e.g. dshow can provide video and audio simultaneously). Should
    // AVDeviceContexts have AVStreams? Then you introduce a bunch of extra
    // entanglement again...

    // then
    AVFormatContext* fmt_ctx = avformat_adapt_avdevice(dev_ctx);
    // and use format like usual (except it's already opened!)

What this does not offer is av_find_input_format being able to find devices (some user code may depend on that!), which is a nice part of the current situation as well. Code like

    AVFormatContext* fmt_ctx = NULL;
    AVInputFormat* fmt = av_find_input_format("something");
    avformat_open_input(&fmt_ctx, "url", fmt, &opts);

works for devices as well: you just need to call avdevice_register_all() first, use a device name like "dshow" in av_find_input_format, and use a special url such as "video=Integrated Webcam". Without such functionality you'd need a bunch of special cases in your app to allow users to use devices as well. Perhaps this can still be provided as is currently the case. We should then also implement an avformat_get_avdevice() function to get the avdevice from the avdevice adapter format.

As seen above, and as argued earlier, complete separation appears impossible to me without losing most of the benefits of having avdevices in the first place, and their current ease of use. But a happy middle ground that allows an advanced and flexible libavdevice API together with a cleaned-up libavformat API does seem possible. There is a sweet spot there.

All that said, let's not stop work on the current avdevice component (my patch set) while figuring out the way forward.

Cheers,
Dee
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".