This is more of RFC than a proper patch. I was looking at https://bugzilla.kernel.org/show_bug.cgi?id=15673 and I wrote a script to show where we call drm_gem_object_unreference() without holding struct_mutex and it showed these two places.
There call tree to here is: i915_load_modeset_init() => intel_modeset_init() => intel_setup_overlay() "reg_bo" is a local variable so we could probably leave that as drm_gem_object_unreference() although I changed it in this patch. The i915_gem_object_pin() function calls i915_gem_object_bind_to_gtt() where we start unreferencing stuff if we're low on space. So possibly that lock is needed. The call tree is: i915_gem_object_pin() => i915_gem_object_bind_to_gtt() => i915_gem_evict_something() => i915_gem_retire_requests() => i915_gem_retire_request() => drm_gem_object_unreference() On the other hand, this is all in the module init so it seems like locking shouldn't be necessary, and certainly I can't see how it would cause the suspend bug. I don't have this hardware btw, so I can't actually test this patch. Signed-off-by: Dan Carpenter <error27 at gmail.com> diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c index 6d524a1..a103582 100644 --- a/drivers/gpu/drm/i915/intel_overlay.c +++ b/drivers/gpu/drm/i915/intel_overlay.c @@ -1347,7 +1347,9 @@ void intel_setup_overlay(struct drm_device *dev) overlay->reg_bo = to_intel_bo(reg_bo); if (OVERLAY_NONPHYSICAL(dev)) { + mutex_lock(&dev->struct_mutex); ret = i915_gem_object_pin(reg_bo, PAGE_SIZE); + mutex_unlock(&dev->struct_mutex); if (ret) { DRM_ERROR("failed to pin overlay register bo\n"); goto out_free_bo; @@ -1385,7 +1387,7 @@ void intel_setup_overlay(struct drm_device *dev) return; out_free_bo: - drm_gem_object_unreference(reg_bo); + drm_gem_object_unreference_unlocked(reg_bo); out_free: kfree(overlay); return;