On 08.01.2019 11:24, Daniel Vetter wrote: > On Tue, Jan 8, 2019 at 10:22 AM Andrzej Hajda <a.ha...@samsung.com> wrote: >> On 08.01.2019 09:47, Daniel Vetter wrote: >>> On Tue, Jan 8, 2019 at 9:35 AM Andrzej Hajda <a.ha...@samsung.com> wrote: >>>> On 07.01.2019 22:56, Daniel Vetter wrote: >>>>> On Mon, Jan 7, 2019 at 5:27 PM Andrzej Hajda <a.ha...@samsung.com> wrote: >>>>>> On 07.01.2019 17:08, Daniel Vetter wrote: >>>>>>> On Mon, Jan 07, 2019 at 12:26:58PM +0100, Andrzej Hajda wrote: >>>>>>>> On 07.01.2019 11:45, Daniel Vetter wrote: >>>>>>>>> On Thu, Jan 03, 2019 at 01:11:47PM +0000, Russell King - ARM Linux >>>>>>>>> wrote: >>>>>>>>>> On Thu, Jan 03, 2019 at 10:47:27AM +0100, Lubomir Rintel wrote: >>>>>>>>>>> Hello, >>>>>>>>>>> >>>>>>>>>>> lately I've been trying to make the Himax HX8837 chip that drives >>>>>>>>>>> the OLPC >>>>>>>>>>> LCD display work with Armada DRM driver. I've been advised to >>>>>>>>>>> create a >>>>>>>>>>> bridge driver and not an encoder driver since the silicon is >>>>>>>>>>> separate from >>>>>>>>>>> the LCDC. >>>>>>>>>>> >>>>>>>>>>> The Armada DRM driver (and, I think, the i.MX one) creates the >>>>>>>>>>> drm_device >>>>>>>>>>> once the component infrastructure sees the necessary sub-devices >>>>>>>>>>> appear. >>>>>>>>>>> The sub-devices being the LCDCs and the encoders (not bridges) that >>>>>>>>>>> it >>>>>>>>>>> expects to be created externally. >>>>>>>>>>> >>>>>>>>>>> Currently, it seems, the only driver that can actually work with >>>>>>>>>>> this (that >>>>>>>>>>> is -- creates a drm_encoder for a drm_device when the component is >>>>>>>>>>> bound) >>>>>>>>>>> is the tda998x. All other similar drivers create a drm_bridge >>>>>>>>>>> instead and >>>>>>>>>>> not use the component infrastructure at all. (In fact, tilcdc driver >>>>>>>>>>> contains a hack to handle tda998x specially.) >>>>>>>>>>> >>>>>>>>>>> I'm wondering how to reconcile the two? >>>>>>>>>>> >>>>>>>>>>> * The tda998x driver has recently been modified to create a bridge >>>>>>>>>>> on probe >>>>>>>>>>> and eventually encoder on component bind. Is this an okay thing >>>>>>>>>>> to do in >>>>>>>>>>> a new driver? (this probably means the tilcdc hack can be >>>>>>>>>>> removed...) >>>>>>>>>>> >>>>>>>>>>> * If a non-componentized bridge were to be used (along with a dummy >>>>>>>>>>> encoder), at what point would it make sense to look for the >>>>>>>>>>> bridge? >>>>>>>>>>> Would it be a good idea to defer the probe of crtc until a bridge >>>>>>>>>>> can be >>>>>>>>>>> looked up and the attach it on component bind? What if the >>>>>>>>>>> bridge goes >>>>>>>>>>> away (a module is unloaded, etc.) in between? >>>>>>>>>>> >>>>>>>>>>> I'd be thankful for opintions and advice before I move ahead with >>>>>>>>>>> this. >>>>>>>>>> This is the long-standing problem with the conflict between bridge >>>>>>>>>> support and component support, and I'm not sure that there is really >>>>>>>>>> any answer to it. >>>>>>>>>> >>>>>>>>>> I've gone into the details of the two several times on the list, >>>>>>>>>> particularly about the short-comings of the bridge approach, but it >>>>>>>>>> seems no one cares to fix those short-comings. >>>>>>>>>> >>>>>>>>>> You are re-identifying some of the issues that I've already pointed >>>>>>>>>> out - such as what happens to DRM drives when the bridge driver is >>>>>>>>>> unbound (it's really not about modules being unloaded, and the >>>>>>>>>> problem >>>>>>>>>> can't be solved by taking a module reference count - all that the >>>>>>>>>> module reference count does is ensure that the module doesn't go >>>>>>>>>> away unexpected, there is no way to ensure that the device isn't >>>>>>>>>> unbound.) >>>>>>>>>> >>>>>>>>>> The issue of unbinding is precisely the issue which the component >>>>>>>>>> support was created to solve - but everyone seems to prefer the buggy >>>>>>>>>> bridge approach, and no one seems willing to do anything about the >>>>>>>>>> bugs or even acknowledge that it's a problem. It's strange - if one >>>>>>>>>> identifies bugs that result in kernel oops in other kernel >>>>>>>>>> subsystems, >>>>>>>>>> one is generally taken seriously and the problem is solved. >>>>>>>>> Unbinding is really not the most important feature, especially for >>>>>>>>> SoC. If >>>>>>>>> you feel different, working together with others, getting some >>>>>>>>> agreement, >>>>>>>>> getting the patches reviewed and finding someone to get them merged is >>>>>>>>> very much appreciated. But just complaining won't move this forward. >>>>>>>>> >>>>>>>>>> The issue about the encoders is something that I've tried to discuss, >>>>>>>>>> and I've pointed out that moving it into the DRM driver adds >>>>>>>>>> additional >>>>>>>>>> complexity there, and I'd hoped that my patch set I posted would've >>>>>>>>>> generated discussion, but alas not. >>>>>>>>>> >>>>>>>>>> What I'm not prepared to do is to introduce _known_ bugs into any >>>>>>>>>> driver that I maintain. >>>>>>>>> I thought last time around the idea was to use device links (and teach >>>>>>>>> drm_bridge about them), so that the unloading works correctly. >>>>>>>> With current device_links implementation registering links in probe of >>>>>>>> the consumer is just incorrect - it can happen that neither consumer >>>>>>>> neither provider is fully probed and creating device links in such >>>>>>>> state >>>>>>>> is wrong according to docs, and my experiments. See [1] for discussion >>>>>>>> and [2] for docs. >>>>>>> We could set up the device link only at drm_dev_register time. At that >>>>>>> point >>>>>>> we know that driver loading has fully succeeded. We do have a list of >>>>>>> bridges at hand, so that's not going to be an issue. >>>>>>> >>>>>>> The only case where this breaks is if a driver is still using the >>>>>>> deprecated ->load callback. But that ->load callback doesn't mix well >>>>>>> with >>>>>>> EDEFER_PROBE/component framework anyway, so I think not going to be a >>>>>>> problem hopefully? >>>>>> drm_dev_register usually is called from bind callback, which is called >>>>>> from probe callback of one of the components or master (depending on >>>>>> particular probe order). If you want to register device link in this >>>>>> function it is possible that the bad scenario will happen - there will >>>>>> be attempt to create device link between not-yet-probed consumer and >>>>>> not-yet-probed provider. >>>>> If you call drm_dev_register before you have all the components >>>>> assembled (i.e. anywhere else than in master bind) that sounds like a >>>>> driver bug. >>>> This is how components work, every component calls component_add usually >>>> in probe, and this function checks if all components are present, if yes >>>> (ie. during probe of the last component) master bind is called and it >>>> creates drm device and then registers it. If you add device link at this >>>> moment, it is still before end of probe of the last component. >>>> >>>> On the other side provider is registered (drm_bridge_add in case of >>>> bridges) during its probe so it becomes available to the consumers >>>> BEFORE end of its probe and again it can happen that device link will be >>>> added to early. >>> Hm I read that thread again. Is the reason why we can't add the device >>> link only the exynos behaviour to allow drm_panel to be >>> hot-added/removed? >> >> Not only. What I have described above is common to all drivers it has >> nothing specific to Exynos. > I'm pretty sure that most drivers do not hot-add/remove drm things > after drm_dev_register (except drm_connector for dp mst support). It > would be buggy. Do you have more examples? I haven't reviewed them all > in detail, and things might have changed, so could very well be > there's exceptions.
Issues with device links have nothing to do with hotplugging, they are generic - lifetime of the objects (drm_bridge, drm_panel) is just slightly different of lifetime of device links, and this is racy even if you do not want hotplugging. Moreover since drm_dev is not device (has no associated struct device) assuming we can reuse its parent to create device link results in circular dependencies. > >> Of course it was tested on Exynos as this is HW I work on. Linus Walleij >> has also reported in this thread that device links have different issue >> - circular dependencies (last few emails in this lengthy thread). My >> response explains it in detail. >> >> >>> Note that for bridge allowing this is 100% buggy: drm core does not >>> allow you to hotplug/hotunplug bridges. >> >> What part of drm core disallows it? As I remember discussions about >> drm_bridge design there were voices that they should be >> hot(un)pluggable, and they are IMO, of course if they are not active. > There's no locking at all to handle bridges appearing/disappearing > while the drm_device can be accessed. There's also no lifetime > management/refcounting. So it "works" but it's racy like mad. I have not recently examined the code, but as I remember core uses the bridge only if it is attached to encoder which is attached to pipeline and the connector is in connected state (or in transition phase to/from this state). > >>> In theory drm_panel can be hotplugged (because we can hotplug >>> drm_connector), but I'm pretty sure exynos gets it all wrong since >>> hotplugging panels is no easy thing to do. I don't think this is >>> something we want to support, forcing the entire driver to be unload >>> sounds like what we want to do here. >> >> I do not know if it is 'all wrong', but tests shows it is >> hot(un)pluggable. In both cases of panel and bridge :) > There's no drm_connector_get/put in exynos (except the one in > hdmi_mode_fixup, which looks rather fishy tbh). And drm_connector are > the only things in drm you can hotplug/unplug (except the overall > drm_device ofc). So again it maybe "works", but you're guaranteed to > have lots of races all over the place. Personally I have implemented only panel hotplug support and I have it tested/analyzed quite thoroughly - it does not play with connector removal it just makes connector disconnected in case panel is removed, much simpler case. In case of bridge I have helped in design but I have not tested it thoroughly, and as you pointed it out there are fishy places, but I guess the design should be OK. The design is as follows (just bridge removal scenario and tc358764 bridge): 1. User requests bridge unbind - tc358764_remove is called. 2. tc358764_remove calls mipi_dsi_detach - encoder can perform pipeline cleanup, including connector removal. 3. Now the bridge is detached, so it can be removed - tc358764_remove calls drm_bridge_remove. As I see in the code it looks like there are missing pieces/bugs but it is just something which can be fixed. BTW the approach that last element of the pipeline should create connector complicates things a lot, as Laurent (and me) argued multiple times moving connector creation out of bridges would simplify things. Regards Andrzej > -Daniel > >> >> Regards >> >> Andrzej >> >> >> >>> -Daniel >>> >>>> Regards >>>> >>>> Andrzej >>>> >>>> >>>>> drm_dev_register publishes the drm device instance to the >>>>> world, if you're not ready to handle userspace requests at that point >>>>> (because not everything is loaded yet) then things will go boom in >>>>> very colorful ways. And from my (admittedly very rough) understanding >>>>> we should be able to register the the device links as the very last >>>>> step in the master bind function (and drm_dev_register should be about >>>>> the last thing you do in the master bind). >>>>> >>>>> So not clear on why this won't work? >>>> >>>> >>>>> -Daniel >>>>> -- >>>>> Daniel Vetter >>>>> Software Engineer, Intel Corporation >>>>> +41 (0) 79 365 57 48 - http://blog.ffwll.ch >>>>> >>>>> > _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel