On Wed, Aug 06, 2025 at 12:02:02PM +0000, nicusor.huhu...@siemens.com wrote:
> 
> 
> >>-----Original Message-----
> >>From: Imre Deak <imre.d...@intel.com>
> >>Sent: Tuesday, August 5, 2025 6:20 PM
> >>To: Dmitry Baryshkov <dmitry.barysh...@oss.qualcomm.com>
> >>Cc: Huhulea, Nicusor Liviu (FT FDS CES LX PBU 1)
> >><nicusor.huhu...@siemens.com>; sta...@vger.kernel.org; dri-
> >>de...@lists.freedesktop.org; intel-...@lists.freedesktop.org; 
> >>cip-...@lists.cip-
> >>project.org; shradhagu...@linux.microsoft.com; jouni.hogan...@intel.com;
> >>neil.armstr...@linaro.org; jani.nik...@linux.intel.com;
> >>maarten.lankho...@linux.intel.com; mrip...@kernel.org; tzimmerm...@suse.de;
> >>airl...@gmail.com; dan...@ffwll.ch; joonas.lahti...@linux.intel.com;
> >>rodrigo.v...@intel.com; laurentiu.pa...@oss.nxp.com; Hombourger, Cedric (FT
> >>FDS CES LX) <cedric.hombour...@siemens.com>; Bobade, Shrikant Krishnarao
> >>(FT FDS CES LX PBU 1) <shrikant.bob...@siemens.com>
> >>Subject: Re: [PATCH] drm/probe-helper: fix output polling not resuming 
> >>after HPD
> >>IRQ storm
> >>
> >>On Tue, Aug 05, 2025 at 01:46:51PM +0300, Dmitry Baryshkov wrote:
> >>> On Mon, Aug 04, 2025 at 11:13:59PM +0300, Nicusor Huhulea wrote:
> >>> > A regression in output polling was introduced by commit
> >>> > 4ad8d57d902fbc7c82507cfc1b031f3a07c3de6e
> >>> > ("drm: Check output polling initialized before disabling") in the 6.1.y 
> >>> > stable
> >>tree.
> >>> > As a result, when the i915 driver detects an HPD IRQ storm and
> >>> > attempts to switch from IRQ-based hotplug detection to polling, output 
> >>> > polling
> >>fails to resume.
> >>> >
> >>> > The root cause is the use of dev->mode_config.poll_running. Once
> >>> > poll_running is set (during the first connector detection) the calls
> >>> > to drm_kms_helper_poll_enable(), such as
> >>> > intel_hpd_irq_storm_switch_to_polling() fails to schedule 
> >>> > output_poll_work as
> >>expected.
> >>> > Therefore, after an IRQ storm disables HPD IRQs, polling does not start,
> >>breaking hotplug detection.
> >>>
> >>> Why doesn't disable path use drm_kms_helper_poll_disable() ?
> >>
> >>In general i915 doesn't disable polling as a whole after an IRQ storm and a 
> >>2
> >>minute delay (or runtime resuming), since on some other connectors the 
> >>polling
> >>may be still required.
> >>
> >>Also, in the 6.1.y stable tree drm_kms_helper_poll_disable() is:
> >>
> >>        if (drm_WARN_ON(dev, !dev->mode_config.poll_enabled))
> >>                return;
> >>
> >>        cancel_delayed_work_sync(&dev->mode_config.output_poll_work);
> >>
> >>so calling that wouldn't really fix the problem, which is clearly just an 
> >>incorrect
> >>backport of the upstream commit 5abffb66d12bcac8 ("drm:
> >>Check output polling initialized before disabling") to the v6.1.y stable 
> >>tree by
> >>commit 4ad8d57d902f ("drm: Check output polling initialized before 
> >>disabling") in
> >>v6.1.y.
> >>
> >>The upstream commit did not add the check for
> >>dev->mode_config.poll_running in drm_kms_helper_poll_enable(), the
> >>condition was only part of the diff context in the commit. Hence adding the
> >>condition in the 4ad8d57d902f backport commit was incorrect.
> >>
> >>> > The fix is to remove the dev->mode_config.poll_running in the check
> >>> > condition, ensuring polling is always scheduled as requested.
> I'm not a full-time kernel developer, but I spent some time trying to really 
> understand both the rationale and the effects of commit "Check output polling 
> initialized before disabling."
> Here's how I see the issue:
> That commit was introduced mainly as a defensive measure, to protect
> drivers such as hyperv-drm that do not initialize connector polling.
> In those drivers, calling drm_kms_helper_poll_disable() without proper
> polling setup could trigger warnings or even stack traces, since there
> are no outputs to poll and the polling helpers don't apply.  From what
> I understand, the poll_running variable was added to prevent cases
> where polling was accidentally disabled for drivers that were never
> set up for it. However, this fix ended up creating a new class of
> breakage, which I have observed and describe here.


It was added to implement the very simple logic: If something isn't
initialized, enabling or disabling it is an error. If something isn't
enabled, disabling it is an error. If something isn't disabled, enabling
it is an error.

> 
> To me, the core logic should be simple: if polling is needed, then we should 
> allow it to proceed (regardless of whether it's been scheduled previously).
> 
> Looking at the  drm_kms_helper_poll_disable()
> if (drm_WARN_ON(dev, !dev->mode_config.poll_enabled))
>     return;
> 
> If the driver never enabled polling (that is, drm_kms_helper_poll_enable() 
> was never called), then the disable operation is effectively a no-op-totally 
> safe for drivers like hyperv-drm.
> 
> And in the enable function drm_kms_helper_poll_enable(..):
>         if (drm_WARN_ON_ONCE(dev, !dev->mode_config.poll_enabled) ||
> -           !drm_kms_helper_poll || dev->mode_config.poll_running)
> +           !drm_kms_helper_poll)
>                 return;

Why?

> The main thing being guarded here is whether polling has actually been 
> initialized:
> 1.For polling drivers like i915, removing poll_running from the enable path 
> is both safe and necessary: it fixes the regression with HPD polling after 
> IRQ storms

I believe in paired calls. If you want to use
drm_kms_helper_poll_enable(), it should be previously
drm_kms_helper_poll_disable()d. If you have manually disabled the IRQ,
it should also be manually enabled.

Pairing the calls makes life much much easier.

> 2.For non-polling drivers like hyperv-drm, the existing checks on 
> poll_enabled in both enable and disable functions are sufficient. Removing 
> poll_running doesn't affect these drivers-they don't go through the polling 
> code paths, so no polling gets scheduled or canceled by mistake
> 
> Therefore based on my understanding and testing removing poll_running guard 
> not only fixes a real bug but also does not introduce new regressions for 
> drivers that don't use output polling.

-- 
With best wishes
Dmitry

Reply via email to