On Mon, May 20, 2019 at 10:59:35PM +0200, Daniel Vetter wrote:
> On Mon, May 20, 2019 at 10:51 PM Imre Deak <imre.d...@intel.com> wrote:
> >
> > On Mon, May 20, 2019 at 10:15:20PM +0200, Daniel Vetter wrote:
> > > On Mon, May 20, 2019 at 10:06 PM Imre Deak <imre.d...@intel.com> wrote:
> > > > On Mon, May 20, 2019 at 09:23:00PM +0200, Daniel Vetter wrote:
> > > > > On Mon, May 20, 2019 at 10:09:24PM +0300, Imre Deak wrote:
> > > > > > On Mon, May 20, 2019 at 08:37:46PM +0200, Daniel Vetter wrote:
> > > > > > > On Mon, May 20, 2019 at 08:41:09PM +0300, Imre Deak wrote:
> > > > > > > > We allowed modesetting an unregistered connector only in the 
> > > > > > > > case the
> > > > > > > > mode is getting disabled on the connector.
> > > > > > > >
> > > > > > > > The reason for this check was the lack of proper refcounting 
> > > > > > > > for the
> > > > > > > > backing memory objects. That problem has been solved meanwhile 
> > > > > > > > so there
> > > > > > > > is no reason any more to reject the modesetting in general.
> > > > > > >
> > > > > > > I'm not parsing this at all ... maybe references to the commits 
> > > > > > > that fix
> > > > > > > this? Or do you mean the refcounting work for all the things 
> > > > > > > hanging of
> > > > > > > connectors, including the entire mst tree?
> > > > > >
> > > > > > Yes the check was added to solve the issue related to the removal 
> > > > > > of MST
> > > > > > connectors that could happen asynchronously wrt. a modeset 
> > > > > > referring to
> > > > > > that MST connector.  That could happen since the MST core doesn't 
> > > > > > hold
> > > > > > any locks (for instance the connection_mutex) during removing an MST
> > > > > > connector that would prevent doing a modeset at the same time.
> > > > > >
> > > > > > Adding the refcounting for such MST connectors (via the
> > > > > > drm_connector_get()/drm_connector_put()) got rid of the above 
> > > > > > problem.
> > > > >
> > > > > We added the check way after that stuff landed. Before all the 
> > > > > connector
> > > > > reworking connectors were forcefully disabled by the kernel. The idea
> > > > > behind this check is to make sure that that userspace notices a 
> > > > > connector
> > > > > is gone (only thing that's not allowed is enabling it, you can keep
> > > > > pageflipping). I think we've always had behaviour like ever since mst 
> > > > > (all
> > > > > userspace has some "oops mst connector probably gone" failure catching
> > > > > around modesets).
> > > >
> > > > Right, pageflipping works.
> > > >
> > > > > So no idea what all blows up if we stop catching userspace this way.
> > > > >
> > > > > Now very much possible I'm getting all this wrong again or missing
> > > > > something, this stuff is often way over my head. But I'm really vary 
> > > > > of
> > > > > breaking userspace here. E.g. just the drm_connector_get/put lifetime
> > > > > changes results in some userspace breaking if you unplug/replug fast
> > > > > enough, because then it doesn't notice the connector change anymore. I
> > > > > couldn't figure out a way to paper over that regression without
> > > > > reintroduce the rather broken and oops-prone old connector lifetime
> > > > > management.
> > > >
> > > > Yes, but what is the the actual description of the failing scenario? I
> > > > can't see how anything can go wrong without this check. The resume time
> > > > restoration modeset may have to act in the same way on an old connector.
> > >
> > > That's what the !state->duplicated thing is meant to check for btw.
> > >
> > > > I don't understand how userspace would not notice the connector change.
> > > > It will get a hotplug uevent in response to which it would have to do a
> > > > detect which returns to it the updated information about the new MST
> > > > connector tree.
> > >
> > > uevent handling can take positively forever, at which point the new
> > > connector could already be plugged in, and then userspace makes a mess
> > > aliasing the two since the path property matches. Just needs a wobbly
> > > cable. This is the issue with the "fixed" lifetime management, and I'm
> > > wary of breaking more stuff.
> >
> > Yea, processing can be delayed arbitrarily, the corresponding
> > GETRESOURCES/GETCONNECTOR should still return to the userspace the
> > correct information to do right thing, even if the path property
> > matches: the connector ID for the old and new connector will be
> > different and the GETCONNECTOR for the old ID will return properly that
> > this old connector is already disconnected (see intel_dp_mst_detect()).
> > So I don't see how userspace could mess up things. Obviously if user
> > space is buggy, then well, it is just buggy and then that user space bug
> > would need to be fixed instead of papering over such problems in the
> > kernel.
> 
> Uh that's not how this "kernel change caused a regression" thing
> works. Dave just dropped a rant about this and i915 recently ...

I still don't see what would depend on the current behavior instead of
acting on the hotplug uevents properly. 

> > > Other way round: What do we gain if userspace is allowed to turn on a
> > > connector again which doesn't exist and will not ever show any pixels?
> > > I'm not seeing a benefit here in allowing that. And history says we
> > > change something around mst handling, it'll break something somewhere.
> >
> > Let's then describe the actual reason for this check, that description
> > is missing.
> >
> > We should remove this check to simplify things if there isn't an actual
> > need for it. Userspace can do already a modeset on connectors in general
> > that are not in a connected state. So this special casing - for MST
> > connectors only - is a bad idea if there is no reason for special casing
> > it. If it's there to paper over some user space bug that user space bug
> > should be fixed instead of adding this special casing which adds
> > complexity to the driver.
> >
> > The only justification for a check in the driver if something could blow
> > up in the kernel without that check. Nothing can blow up in the kernel
> > without this check. Every other failure scenario should be handled/fixed
> > in userspace.
> 
> So I thought you need this because it's a pain on resume, but that
> should be handled with the state->duplicated check. I guess I still
> don't understand why you want to nuke this check, it seemed like it's
> getting in the way somewhere.
> 
> If it's not getting in the way then imo let's not touch stuff, I'm too
> much burned child in this area.

It's a pain to maintain such papering overs in the kernel, that nobody
knows what purpose they serve. What I'm asking here is to preserve this
check here only if there is a purpose for it along with a precise
description of its purpose. If noone can give an explanation for it -
or how it can lead to a regression - then yes, let's remove it to
simplify things.

> -Daniel
> 
> > > -Daniel
> > >
> > > >
> > > > > > > > The check
> > > > > > > > for that also makes driver internal modesets more cumbersome 
> > > > > > > > where we
> > > > > > > > need to add exemptions for the cases where we do need to allow 
> > > > > > > > the
> > > > > > > > modeset even for unregistered connectors. One such case is the
> > > > > > > > restoration of the mode during resume.
> > > > > > >
> > > > > > > Yeah this one actually makes sense to me. We could still keep 
> > > > > > > this check
> > > > > > > here, but for the atomic ioctl only when called from userspace. 
> > > > > > > But iirc
> > > > > > > Lyude also said she has some plans here, so no idea whether that 
> > > > > > > all fits.
> > > > > > >
> > > > > > > > Simplify things by removing the unneeded check. I can't see how
> > > > > > > > modesetting an unregistered connector can cause any problem and 
> > > > > > > > the race
> > > > > > > > (described in the code comment) can anyway result in such a 
> > > > > > > > modeset (if
> > > > > > > > the connector is unregistered right after the check).
> > > > > > >
> > > > > > > Not saying we don't need this, but there's fairly enormous 
> > > > > > > amounts of
> > > > > > > history behind all this stuff, and lots of discussions. Would be 
> > > > > > > good to
> > > > > > > at least reference those, so we have a good story for when this 
> > > > > > > then all
> > > > > > > goes wrong again.
> > > > > >
> > > > > > I still don't see why this check is needed. There is no 
> > > > > > justification
> > > > > > for it - besides the original reason for it as discussed above 
> > > > > > about the
> > > > > > refcounting problem, which is solved now - so I think we should 
> > > > > > remove
> > > > > > it, instead of just making it a special case for the user space 
> > > > > > modeset.
> > > > > >
> > > > > > As I wrote a user space modeset can end up anyway doing a modeset 
> > > > > > on an
> > > > > > unregistered connector when the unregistering - by MST core - 
> > > > > > happens just
> > > > > > right after the check.
> > > > >
> > > > > Yup. Always been like that.
> > > > > -Daniel
> > > > >
> > > > > >
> > > > > > > -Daniel
> > > > > > >
> > > > > > > >
> > > > > > > > Cc: Lyude Paul <ly...@redhat.com>
> > > > > > > > Cc: Daniel Vetter <daniel.vet...@ffwll.ch>
> > > > > > > > Cc: Ville Syrjälä <ville.syrj...@linux.intel.com>
> > > > > > > > Signed-off-by: Imre Deak <imre.d...@intel.com>
> > > > > > > > ---
> > > > > > > >  drivers/gpu/drm/drm_atomic_helper.c | 29 
> > > > > > > > ++---------------------------
> > > > > > > >  1 file changed, 2 insertions(+), 27 deletions(-)
> > > > > > > >
> > > > > > > > diff --git a/drivers/gpu/drm/drm_atomic_helper.c 
> > > > > > > > b/drivers/gpu/drm/drm_atomic_helper.c
> > > > > > > > index 2e0cb4246cbd..e94e69483498 100644
> > > > > > > > --- a/drivers/gpu/drm/drm_atomic_helper.c
> > > > > > > > +++ b/drivers/gpu/drm/drm_atomic_helper.c
> > > > > > > > @@ -319,33 +319,6 @@ update_connector_routing(struct 
> > > > > > > > drm_atomic_state *state,
> > > > > > > >                 return 0;
> > > > > > > >         }
> > > > > > > >
> > > > > > > > -       crtc_state = drm_atomic_get_new_crtc_state(state,
> > > > > > > > -                                                  
> > > > > > > > new_connector_state->crtc);
> > > > > > > > -       /*
> > > > > > > > -        * For compatibility with legacy users, we want to make 
> > > > > > > > sure that
> > > > > > > > -        * we allow DPMS On->Off modesets on unregistered 
> > > > > > > > connectors. Modesets
> > > > > > > > -        * which would result in anything else must be 
> > > > > > > > considered invalid, to
> > > > > > > > -        * avoid turning on new displays on dead connectors.
> > > > > > > > -        *
> > > > > > > > -        * Since the connector can be unregistered at any point 
> > > > > > > > during an
> > > > > > > > -        * atomic check or commit, this is racy. But that's OK: 
> > > > > > > > all we care
> > > > > > > > -        * about is ensuring that userspace can't do anything 
> > > > > > > > but shut off the
> > > > > > > > -        * display on a connector that was destroyed after it's 
> > > > > > > > been notified,
> > > > > > > > -        * not before.
> > > > > > > > -        *
> > > > > > > > -        * Additionally, we also want to ignore connector 
> > > > > > > > registration when
> > > > > > > > -        * we're trying to restore an atomic state during 
> > > > > > > > system resume since
> > > > > > > > -        * there's a chance the connector may have been 
> > > > > > > > destroyed during the
> > > > > > > > -        * process, but it's better to ignore that then cause
> > > > > > > > -        * drm_atomic_helper_resume() to fail.
> > > > > > > > -        */
> > > > > > > > -       if (!state->duplicated && 
> > > > > > > > drm_connector_is_unregistered(connector) &&
> > > > > > > > -           crtc_state->active) {
> > > > > > > > -               DRM_DEBUG_ATOMIC("[CONNECTOR:%d:%s] is not 
> > > > > > > > registered\n",
> > > > > > > > -                                connector->base.id, 
> > > > > > > > connector->name);
> > > > > > > > -               return -EINVAL;
> > > > > > > > -       }
> > > > > > > > -
> > > > > > > >         funcs = connector->helper_private;
> > > > > > > >
> > > > > > > >         if (funcs->atomic_best_encoder)
> > > > > > > > @@ -390,6 +363,8 @@ update_connector_routing(struct 
> > > > > > > > drm_atomic_state *state,
> > > > > > > >
> > > > > > > >         set_best_encoder(state, new_connector_state, 
> > > > > > > > new_encoder);
> > > > > > > >
> > > > > > > > +       crtc_state = drm_atomic_get_new_crtc_state(state,
> > > > > > > > +                                                  
> > > > > > > > new_connector_state->crtc);
> > > > > > > >         crtc_state->connectors_changed = true;
> > > > > > > >
> > > > > > > >         DRM_DEBUG_ATOMIC("[CONNECTOR:%d:%s] using 
> > > > > > > > [ENCODER:%d:%s] on [CRTC:%d:%s]\n",
> > > > > > > > --
> > > > > > > > 2.17.1
> > > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Daniel Vetter
> > > > > > > Software Engineer, Intel Corporation
> > > > > > > http://blog.ffwll.ch
> > > > >
> > > > > --
> > > > > Daniel Vetter
> > > > > Software Engineer, Intel Corporation
> > > > > http://blog.ffwll.ch
> > >
> > >
> > >
> > > --
> > > Daniel Vetter
> > > Software Engineer, Intel Corporation
> > > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
> 
> 
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Reply via email to