During system resume we depended on pci_enable_device() also putting the
device into PCI D0 state. This won't work if the PCI device was already
enabled but still in D3 state. This is because pci_enable_device() is
refcounted and will not change the HW state if called with a non-zero
refcount. Leaving the device in D3 will make all subsequent device
accesses fail.

This didn't cause a problem most of the time, since we resumed with an
enable refcount of 0. But it fails at least after module reload because
after that we also happen to leak a PCI device enable reference: During
probing we call drm_get_pci_dev() which will enable the PCI device, but
during device removal drm_put_dev() won't disable it. This is a bug of
its own in DRM core, but without much harm as it only leaves the PCI
device enabled. Fixing it is also a bit more involved, due to DRM
mid-layering and because it affects non-i915 drivers too. The fix in
this patch is valid regardless of the problem in DRM core.

v2:
- Add a code comment about the relation of this fix to the freeze/thaw
  vs. the suspend/resume phases. (Ville)
- Add a code comment about the inconsistent ordering of set power state
  and device enable calls. (Chris)

CC: Ville Syrjälä <ville.syrj...@linux.intel.com>
CC: Chris Wilson <ch...@chris-wilson.co.uk>
CC: sta...@vger.kernel.org
Signed-off-by: Imre Deak <imre.d...@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c | 32 +++++++++++++++++++++++++++++++-
 1 file changed, 31 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index d550ae2..3b79e97 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -803,7 +803,7 @@ static int i915_drm_resume(struct drm_device *dev)
 static int i915_drm_resume_early(struct drm_device *dev)
 {
        struct drm_i915_private *dev_priv = dev->dev_private;
-       int ret = 0;
+       int ret;
 
        /*
         * We have a resume ordering issue with the snd-hda driver also
@@ -814,6 +814,36 @@ static int i915_drm_resume_early(struct drm_device *dev)
         * FIXME: This should be solved with a special hdmi sink device or
         * similar so that power domains can be employed.
         */
+
+       /*
+        * Note that we need to set the power state explicitly, since we
+        * powered off the device during freeze and the PCI core won't power
+        * it back up for us during thaw. Powering off the device during
+        * freeze is not a hard requirement though, and during the
+        * suspend/resume phases the PCI core makes sure we get here with the
+        * device powered on. So in case we change our freeze logic and keep
+        * the device powered we can also remove the following set power state
+        * call.
+        */
+       ret = pci_set_power_state(dev->pdev, PCI_D0);
+       if (ret) {
+               DRM_ERROR("failed to set PCI D0 power state (%d)\n", ret);
+               goto out;
+       }
+
+       /*
+        * Note that pci_enable_device() first enables any parent bridge
+        * device and only then sets the power state for this device. The
+        * bridge enabling is a nop though, since bridge devices are resumed
+        * first. The order of enabling power and enabling the device is
+        * imposed by the PCI core as described above, so here we preserve the
+        * same order for the freeze/thaw phases.
+        *
+        * TODO: eventually we should remove pci_disable_device() /
+        * pci_enable_enable_device() from suspend/resume. Due to how they
+        * depend on the device enable refcount we can't anyway depend on them
+        * disabling/enabling the device.
+        */
        if (pci_enable_device(dev->pdev)) {
                ret = -EIO;
                goto out;
-- 
2.5.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Reply via email to