On Mon, 2025-05-05 at 21:03 +0300, Tomi Valkeinen wrote: Hello, > Hi, > > On 05/05/2025 20:47, Vitor Soares wrote: > > On Mon, 2025-05-05 at 18:30 +0300, Tomi Valkeinen wrote: > > > Hi, > > > > > > On 05/05/2025 17:45, Vitor Soares wrote: > > > > On Tue, 2025-04-29 at 09:32 +0300, Tomi Valkeinen wrote: > > > > > Hi, > > > > > > > > > > On 28/04/2025 12:40, Vitor Soares wrote: > > > > > > From: Vitor Soares <vitor.soa...@toradex.com> > > > > > > > > > > > > The deprecated UNIVERSAL_DEV_PM_OPS() macro uses the provided > > > > > > callbacks > > > > > > for both runtime PM and system sleep. This causes the DSI clocks to > > > > > > be > > > > > > disabled twice: once during runtime suspend and again during system > > > > > > suspend, resulting in a WARN message from the clock framework when > > > > > > attempting to disable already-disabled clocks. > > > > > > > > > > > > [ 84.384540] clk:231:5 already disabled > > > > > > [ 84.388314] WARNING: CPU: 2 PID: 531 at /drivers/clk/clk.c:1181 > > > > > > clk_core_disable+0xa4/0xac > > > > > > ... > > > > > > [ 84.579183] Call trace: > > > > > > [ 84.581624] clk_core_disable+0xa4/0xac > > > > > > [ 84.585457] clk_disable+0x30/0x4c > > > > > > [ 84.588857] cdns_dsi_suspend+0x20/0x58 [cdns_dsi] > > > > > > [ 84.593651] pm_generic_suspend+0x2c/0x44 > > > > > > [ 84.597661] ti_sci_pd_suspend+0xbc/0x15c > > > > > > [ 84.601670] dpm_run_callback+0x8c/0x14c > > > > > > [ 84.605588] __device_suspend+0x1a0/0x56c > > > > > > [ 84.609594] dpm_suspend+0x17c/0x21c > > > > > > [ 84.613165] dpm_suspend_start+0xa0/0xa8 > > > > > > [ 84.617083] suspend_devices_and_enter+0x12c/0x634 > > > > > > [ 84.621872] pm_suspend+0x1fc/0x368 > > > > > > > > > > > > To address this issue, replace UNIVERSAL_DEV_PM_OPS() with > > > > > > DEFINE_RUNTIME_DEV_PM_OPS(), which avoids redundant suspend/resume > > > > > > calls > > > > > > by checking if the device is already runtime suspended. > > > > > > > > > > > > Cc: <sta...@vger.kernel.org> # 6.1.x > > > > > > Fixes: e19233955d9e ("drm/bridge: Add Cadence DSI driver") > > > > > > Signed-off-by: Vitor Soares <vitor.soa...@toradex.com> > > > > > > --- > > > > > > drivers/gpu/drm/bridge/cadence/cdns-dsi-core.c | 10 +++++----- > > > > > > 1 file changed, 5 insertions(+), 5 deletions(-) > > > > > > > > > > > > diff --git a/drivers/gpu/drm/bridge/cadence/cdns-dsi-core.c > > > > > > b/drivers/gpu/drm/bridge/cadence/cdns-dsi-core.c > > > > > > index b022dd6e6b6e..62179e55e032 100644 > > > > > > --- a/drivers/gpu/drm/bridge/cadence/cdns-dsi-core.c > > > > > > +++ b/drivers/gpu/drm/bridge/cadence/cdns-dsi-core.c > > > > > > @@ -1258,7 +1258,7 @@ static const struct mipi_dsi_host_ops > > > > > > cdns_dsi_ops > > > > > > = { > > > > > > .transfer = cdns_dsi_transfer, > > > > > > }; > > > > > > > > > > > > -static int __maybe_unused cdns_dsi_resume(struct device *dev) > > > > > > +static int cdns_dsi_resume(struct device *dev) > > > > > > { > > > > > > struct cdns_dsi *dsi = dev_get_drvdata(dev); > > > > > > > > > > > > @@ -1269,7 +1269,7 @@ static int __maybe_unused > > > > > > cdns_dsi_resume(struct > > > > > > device *dev) > > > > > > return 0; > > > > > > } > > > > > > > > > > > > -static int __maybe_unused cdns_dsi_suspend(struct device *dev) > > > > > > +static int cdns_dsi_suspend(struct device *dev) > > > > > > { > > > > > > struct cdns_dsi *dsi = dev_get_drvdata(dev); > > > > > > > > > > > > @@ -1279,8 +1279,8 @@ static int __maybe_unused > > > > > > cdns_dsi_suspend(struct > > > > > > device *dev) > > > > > > return 0; > > > > > > } > > > > > > > > > > > > -static UNIVERSAL_DEV_PM_OPS(cdns_dsi_pm_ops, cdns_dsi_suspend, > > > > > > cdns_dsi_resume, > > > > > > - NULL); > > > > > > +static DEFINE_RUNTIME_DEV_PM_OPS(cdns_dsi_pm_ops, cdns_dsi_suspend, > > > > > > + cdns_dsi_resume, NULL); > > > > > > > > > > I'm not sure if this, or the UNIVERSAL_DEV_PM_OPS, is right here. When > > > > > the system is suspended, the bridge drivers will get a call to the > > > > > *_disable() hook, which then disables the device. If the bridge driver > > > > > would additionally do something in its system suspend hook, it would > > > > > conflict with normal disable path. > > > > > > > > > > I think bridges/panels should only deal with runtime PM. > > > > > > > > > > Tomi > > > > > > > > > > > > > In the proposed change, we make use of pm_runtime_force_suspend() during > > > > system-wide suspend. If the device is already suspended, this call is a > > > > no-op and disables runtime PM to prevent spurious wakeups during the > > > > suspend period. Otherwise, it triggers the device’s runtime_suspend() > > > > callback. > > > > > > > > I briefly reviewed other bridge drivers, and those that implement > > > > runtime > > > > PM appear to follow a similar approach, relying solely on runtime PM > > > > callbacks and using pm_runtime_force_suspend()/resume() to handle > > > > system-wide transitions. > > > > > > Yes, I see such a solution in some of the bridge and panel drivers. I'm > > > probably missing something here, as I don't think it's correct. > > > > > > Why do we need to set the system suspend/resume hooks? What is the > > > scenario where those will be called, and the > > > pm_runtime_force_suspend()/resume() do something that's not already done > > > via the normal DRM pipeline enable/disable? > > > > > > Tomi > > > > > > > I'm not a DRM expert, but my understanding is that there might be edge cases > > where the system suspend sequence occurs without the DRM core properly > > disabling > > the bridge — for example, due to a bug or if the bridge is not bound to an > > active pipeline. In such cases, having suspend/resume callbacks ensures that > > the > > device is still properly suspended and resumed. > > > > Additionally, pm_runtime_force_suspend() disables runtime PM for the device > > during system suspend, preventing unintended wakeups (e.g., via IRQs, > > delayed > > work, or sysfs access) until pm_runtime_force_resume() is invoked. > > > > From my perspective, the use of pm_runtime_force_suspend() and > > pm_runtime_force_resume() serves as a safety mechanism to guarantee a well- > > defined and race-free state during system suspend. > > But then we must be sure that the suspend sequence is just right. > > At least in tidss's case, tidss_drv.c has tidss_suspend() which calls > drm_mode_config_helper_suspend(), which, if I recall right, will then > disable the pipeline. This must happen before the bridge's system > suspend call, otherwise the bridge might go to suspend while the > pipeline is still running, which might cause errors on the still-running > pipeline entities, and probably crash the bridge's disable() call. If a > bridge is a platform device, I don't think there's any ordering between > the tidss's and the bridge's suspend calls. > > If the bridge is not bound to a pipeline, why would it be enabled in the > first place? > > For the bug case... We're in random territory, then. If the driver is > bugging, are you sure it's safe and useful to suspend it? Or would it be > better to not do anything... > > I'm not nacking the patch, as this approach seems to be used in multiple > drivers. It just rings multiple alarm bells here, and I don't understand > how exactly it's supposed to work. That said, the driver is using > UNIVERSAL_DEV_PM_OPS(), so I think switching to > DEFINE_RUNTIME_DEV_PM_OPS() is at least not worse (well, I can't be > quite sure even about that =). > > Tomi >
I conducted further tests based on your concerns, specifically regarding the suspend ordering between the tidss_suspend() and the bridge suspend. Here are my observations: - The bridge (controlled via TI SCI PD) suspends after tidss_suspend(), which uses device-specific PM operations (platform_pm_suspend). - I attempted to influence the probe/suspend order via DT node placement and delays, but that had no effect on suspend sequencing. - I added some debug prints and the pm_runtime_force_suspend() is invoked before cdns_dsi_suspend(). However, I did not observe any misbehavior during suspend/resume. - I also tested with only runtime PM support in the driver (without pm_runtime_force_suspend/resume()), and I couldn't detect any functional difference nor the issue originally addressed in this patch. Given that, I will send a v2 of the patch implementing only runtime PM support. If issues arise in the future due to the lack of explicit pm_runtime_force_suspend/resume() handling, we can revisit and address them at that time with clearer justification. Best regards, Vitor Soares