On Tuesday, 25 February 2025 08:55:50 Central European Summer Time Tomeu Vizoso wrote: > This initial version supports the NPU as shipped in the RK3588 SoC and > described in the first part of its TRM, in Chapter 36. > > This NPU contains 3 independent cores that the driver can submit jobs > to. > > This commit adds just hardware initialization and power management. > > v2: > - Split cores and IOMMUs as independent devices (Sebastian Reichel) > - Add some documentation (Jeffrey Hugo) > - Be more explicit in the Kconfig documentation (Jeffrey Hugo) > - Remove resets, as these haven't been found useful so far (Zenghui Yu) > - Repack structs (Jeffrey Hugo) > - Use DEFINE_DRM_ACCEL_FOPS (Jeffrey Hugo) > - Use devm_drm_dev_alloc (Jeffrey Hugo) > - Use probe log helper (Jeffrey Hugo) > - Introduce UABI header in a later patch (Jeffrey Hugo) > > Signed-off-by: Tomeu Vizoso <to...@tomeuvizoso.net> > --- > Documentation/accel/index.rst | 1 + > Documentation/accel/rocket/index.rst | 19 + > MAINTAINERS | 8 + > drivers/accel/Kconfig | 1 + > drivers/accel/Makefile | 1 + > drivers/accel/rocket/Kconfig | 25 + > drivers/accel/rocket/Makefile | 8 + > drivers/accel/rocket/rocket_core.c | 71 + > drivers/accel/rocket/rocket_core.h | 29 + > drivers/accel/rocket/rocket_device.c | 29 + > drivers/accel/rocket/rocket_device.h | 29 + > drivers/accel/rocket/rocket_drv.c | 273 ++ > drivers/accel/rocket/rocket_drv.h | 13 + > drivers/accel/rocket/rocket_registers.h | 4425 > +++++++++++++++++++++++++++++++ > 14 files changed, 4932 insertions(+)
Hi Tomeu, I've got some more comments on the driver, this time specific to some power management stuff I've noticed. > +++ b/drivers/accel/rocket/rocket_core.c > > [...] > > +int rocket_core_init(struct rocket_core *core) > +{ > + struct device *dev = core->dev; > + struct platform_device *pdev = to_platform_device(dev); > + uint32_t version; > + int err = 0; > + > + err = rocket_clk_init(core); > + if (err) { > + dev_err(dev, "clk init failed %d\n", err); > + return err; > + } > + > + core->iomem = devm_platform_ioremap_resource(pdev, 0); > + if (IS_ERR(core->iomem)) > + return PTR_ERR(core->iomem); > + > + pm_runtime_use_autosuspend(dev); We're enabling autosuspend here, but don't use pm_runtime_dont_use_autosuspend(core->dev); in rocket_core_fini. dont_use_autosuspend is only handled for us automagically on driver unload if we use devm wrappers for pm_runtime_enable, so this is most definitely an oversight. > + pm_runtime_set_autosuspend_delay(dev, 50); /* ~3 frames */ The 50 = 3 frames thing here seems suspect. 3 frames of what, and why? If it's 3 frames of something the hardware processed, then doesn't that depend on the specific hardware and its clock rate, which may change? Plus, shouldn't auto- suspend be blocked anyway when there's still a job processing? The RK3588 TRM doesn't make a mention of "frame" in the RKNN section, so if this refers to a specific workload then that will be another parameter. > + pm_runtime_enable(dev); > + > + err = pm_runtime_get_sync(dev); No error checking done here, so if a clock fails to enable, we just hang on the read later if it was the register's clock. Though that particular error case is never passed out from the runtime resume callback, which should probably be fixed as well. > + > + version = rocket_read(core, REG_PC_VERSION); > + version += rocket_read(core, REG_PC_VERSION_NUM) & 0xffff; > + > + pm_runtime_mark_last_busy(dev); > + pm_runtime_put_autosuspend(dev); > + > + dev_info(dev, "Rockchip NPU core %d version: %d\n", core->index, > version); > + > + return 0; > +} > + > +void rocket_core_fini(struct rocket_core *core) > +{ > + pm_runtime_disable(core->dev); > +} > > [...] > > diff --git a/drivers/accel/rocket/rocket_drv.c > b/drivers/accel/rocket/rocket_drv.c > new file mode 100644 > index > 0000000000000000000000000000000000000000..c22d965f20f1239a36b1d823d5fe5f372713555d > --- /dev/null > +++ b/drivers/accel/rocket/rocket_drv.c > > [...] > > +static int rocket_device_runtime_resume(struct device *dev) > +{ > + struct rocket_device *rdev = dev_get_drvdata(dev); > + > + for (unsigned int core = 0; core < rdev->num_cores; core++) { > + if (dev != rdev->cores[core].dev) > + continue; > + > + if (core == 0) { > + clk_prepare_enable(rdev->clk_npu); > + clk_prepare_enable(rdev->pclk); > + } > + > + clk_prepare_enable(rdev->cores[core].a_clk); > + clk_prepare_enable(rdev->cores[core].h_clk); > + } > + > + return 0; > +} Here is where we will probably want to check the return code of each clk_prepare_enable, and potentially do quite ugly "always return hardware to known state" handling if any of them fails to enable, i.e. unwind the enables in the function exit before returning the error code. Seems pointless because if a clock fails to enable it's a nuclear meltdown type situation anyway, but especially when people are writing DTSes or porting things to new SoCs, it can be nicer to have the driver fail rather than the whole SoC. I do wish we had cleanup.h helpers for clock enables though... > + > +static int rocket_device_runtime_suspend(struct device *dev) > +{ > + struct rocket_device *rdev = dev_get_drvdata(dev); > + > + for (unsigned int core = 0; core < rdev->num_cores; core++) { > + if (dev != rdev->cores[core].dev) > + continue; > + > + clk_disable_unprepare(rdev->cores[core].a_clk); > + clk_disable_unprepare(rdev->cores[core].h_clk); > + > + if (core == 0) { > + clk_disable_unprepare(rdev->pclk); > + clk_disable_unprepare(rdev->clk_npu); > + } > + } > + > + return 0; > +} > + > +EXPORT_GPL_DEV_PM_OPS(rocket_pm_ops) = { > + RUNTIME_PM_OPS(rocket_device_runtime_suspend, > rocket_device_runtime_resume, NULL) > + SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume) > +}; > + > +static struct platform_driver rocket_driver = { > + .probe = rocket_probe, > + .remove = rocket_remove, > + .driver = { > + .name = "rocket", > + .pm = pm_ptr(&rocket_pm_ops), > + .of_match_table = dt_match, > + }, > +}; > +module_platform_driver(rocket_driver); > + > +MODULE_LICENSE("GPL"); > +MODULE_DESCRIPTION("DRM driver for the Rockchip NPU IP"); > +MODULE_AUTHOR("Tomeu Vizoso"); > > [...] I'll send a second reply with PM comments on the job stuff in the patch that adds it, since I found something peculiar there while experimenting on RK3576. Kind regards, Nicolas Frattaroli