On Wed, May 08, 2019 at 08:28:30PM +0000, mario.limoncie...@dell.com wrote: > You might think this would be adding runtime_suspend/runtime_resume > callbacks, but those also get called actually at runtime which is not > the goal here. At runtime, these types of disks should rely on APST which > should calculate the appropriate latencies around the different power states. > > This code path is only applicable in the suspend to idle state, which /does/ > call suspend/resume functions associated with dev_pm_ops. There isn't > a dedicated function in there for use only in suspend to idle, which is > why pm_suspend_via_s2idle() needs to get called.
The problem is that it also gets called for others paths: #ifdef CONFIG_PM_SLEEP #define SET_SYSTEM_SLEEP_PM_OPS(suspend_fn, resume_fn) \ .suspend = suspend_fn, \ .resume = resume_fn, \ .freeze = suspend_fn, \ .thaw = resume_fn, \ .poweroff = suspend_fn, \ .restore = resume_fn, #else else #define SET_SYSTEM_SLEEP_PM_OPS(suspend_fn, resume_fn) #endif #define SIMPLE_DEV_PM_OPS(name, suspend_fn, resume_fn) \ const struct dev_pm_ops name = { \ SET_SYSTEM_SLEEP_PM_OPS(suspend_fn, resume_fn) \ } And at least for poweroff this new code seems completely wrong, even for freeze it looks rather borderline. And more to the points - if these "modern MS standby" systems are becoming common, which it looks they are, we need support in the PM core for those instead of working around the decisions in low-level drivers. > SIMPLE_DEV_PM_OPS normally sets the same function for suspend and > freeze (hibernate), so to avoid any changes to the hibernate case it seems > to me that there needs to be a new nvme_freeze() that calls into the existing > nvme_dev_disable for the freeze pm op and nvme_thaw() that calls into the > existing nvme_reset_ctrl for the thaw pm op. At least, yes. > > > enterprise class NVMe devices > > that don't do APST and don't really do different power states at > > all in many cases. > > Enterprise class NVMe devices that don't do APST - do they typically > have a non-zero value for ndev->ctrl.npss? > > If not, they wouldn't enter this new codepath even if the server entered into > S2I. No, devices that do set NPSS will have at least some power states per definition, although they might not be too useful. I suspect checking APSTA might be safer, but if we don't want to rely on APST we should check for a power state supporting the condition that the MS document quoted in the original document supports.