Hi Ulf,

On Tue, Jun 23, 2015 at 2:50 PM, Ulf Hansson <ulf.hans...@linaro.org> wrote:
> On 22 June 2015 at 09:31, Geert Uytterhoeven <geert+rene...@glider.be> wrote:
>> If pm_genpd_{add,remove}_device() keeps on failing with -EAGAIN, we end
>> up with an infinite loop in genpd_dev_pm_{at,de}tach().
>>
>> This may happen due to a genpd.prepared_count imbalance. This is a bug
>> elsewhere, but it will result in a system lock up, possibly during
>> reboot of an otherwise functioning system.
>>
>> To avoid this, put a limit on the maximum number of loop iterations,
>> including a simple back-off mechanism. If the limit is reached, the
>> operation will just fail. An error message is already printed.
>>
>> Signed-off-by: Geert Uytterhoeven <geert+rene...@glider.be>
>> ---
>>  drivers/base/power/domain.c | 16 ++++++++++++++--
>>  1 file changed, 14 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
>> index cdd547bd67df8218..60e0309dd8dd0264 100644
>> --- a/drivers/base/power/domain.c
>> +++ b/drivers/base/power/domain.c
>> @@ -6,6 +6,7 @@
>>   * This file is released under the GPLv2.
>>   */
>>
>> +#include <linux/delay.h>
>>  #include <linux/kernel.h>
>>  #include <linux/io.h>
>>  #include <linux/platform_device.h>
>> @@ -19,6 +20,9 @@
>>  #include <linux/suspend.h>
>>  #include <linux/export.h>
>>
>> +#define GENPD_RETRIES	20
>> +#define GENPD_DELAY_US	10
>> +
>>  #define GENPD_DEV_CALLBACK(genpd, type, callback, dev)		\
>>  ({								\
>>  	type (*__routine)(struct device *__d);			\
>> @@ -2131,6 +2135,7 @@ EXPORT_SYMBOL_GPL(of_genpd_get_from_provider);
>>  static void genpd_dev_pm_detach(struct device *dev, bool power_off)
>>  {
>>  	struct generic_pm_domain *pd;
>> +	unsigned int i;
>>  	int ret = 0;
>>
>>  	pd = pm_genpd_lookup_dev(dev);
>> @@ -2139,10 +2144,13 @@ static void genpd_dev_pm_detach(struct device *dev, bool power_off)
>>
>>  	dev_dbg(dev, "removing from PM domain %s\n", pd->name);
>>
>> -	while (1) {
>> +	for (i = 0; i < GENPD_RETRIES; i++) {
>>  		ret = pm_genpd_remove_device(pd, dev);
>>  		if (ret != -EAGAIN)
>>  			break;
>> +
>> +		if (i > GENPD_RETRIES / 2)
>> +			udelay(GENPD_DELAY_US);
>>  		cond_resched();
>>  	}
>>
>> @@ -2183,6 +2191,7 @@ int genpd_dev_pm_attach(struct device *dev)
>>  {
>>  	struct of_phandle_args pd_args;
>>  	struct generic_pm_domain *pd;
>> +	unsigned int i;
>>  	int ret;
>>
>>  	if (!dev->of_node)
>> @@ -2218,10 +2227,13 @@ int genpd_dev_pm_attach(struct device *dev)
>>
>>  	dev_dbg(dev, "adding to PM domain %s\n", pd->name);
>>
>> -	while (1) {
>> +	for (i = 0; i < GENPD_RETRIES; i++) {
>>  		ret = pm_genpd_add_device(pd, dev);
>>  		if (ret != -EAGAIN)
>>  			break;
>> +
>> +		if (i > GENPD_RETRIES / 2)
>> +			udelay(GENPD_DELAY_US);
>
> In this execution path, we retry when getting -EAGAIN while believing
> the reason to the error are only *temporary* as we are soon waiting
> for all devices in the genpd to be system PM resumed. At least that's
> my understanding to why we want to deal with -EAGAIN here, but I might
> be wrong.
>
> In this regards, I wonder whether it could be better to re-try only a
> few times but with a far longer interval time than a couple us. What
> do you think?
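
To put numbers on the current back-off, here is a minimal user-space sketch
of the loop shape from the patch. fake_add_device(), busy_calls and
bounded_attach() are hypothetical stand-ins I made up for illustration, not
kernel functions, and the delay is only summed up instead of being slept:

```c
#include <errno.h>

#define GENPD_RETRIES	20
#define GENPD_DELAY_US	10

/* Hypothetical: number of calls that will still fail with -EAGAIN. */
int busy_calls;

/* Hypothetical stand-in for pm_genpd_add_device(). */
int fake_add_device(void)
{
	if (busy_calls > 0) {
		busy_calls--;
		return -EAGAIN;
	}
	return 0;
}

/*
 * Same loop shape as in the patch. Reports how many attempts were made
 * and how much delay would have been spent in udelay().
 */
int bounded_attach(unsigned int *attempts, unsigned int *delay_us)
{
	unsigned int i;
	int ret = -EAGAIN;

	*attempts = 0;
	*delay_us = 0;

	for (i = 0; i < GENPD_RETRIES; i++) {
		ret = fake_add_device();
		(*attempts)++;
		if (ret != -EAGAIN)
			break;

		/* Back off only in the second half of the retries. */
		if (i > GENPD_RETRIES / 2)
			*delay_us += GENPD_DELAY_US;
	}

	return ret;
}
```

With a callee that never stops returning -EAGAIN, this gives up after 20
attempts, of which only the last 9 add a delay, so the total back-off is
about 9 * 10 us = 90 us. A condition that clears after three failures
succeeds on the fourth attempt without any delay at all.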

That's indeed viable. I have no idea how long this temporary state can
last.

> However, what if the reason to why we get -EAGAIN isn't *temporary*,
> because we are about to enter system PM suspend state. Then the caller
> of this function which comes via some bus' ->probe(), will hang until
> the a system PM resume is completed. Is that really going to work? So,
> for this case your limited re-try approach will affect this scenario
> as well, have you considered that?

There's a limit on the number of retries, so it won't hang indefinitely.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds