On Mon, Aug 10, 2020 at 06:31:36PM +0100, Mark Brown wrote: > On Mon, Aug 10, 2020 at 06:09:36PM +0200, Michał Mirosław wrote: > > On Mon, Aug 10, 2020 at 04:39:28PM +0100, Mark Brown wrote: > > > On Mon, Aug 10, 2020 at 12:25:37AM +0200, Michał Mirosław wrote: > > > > regulator_lock_dependent() starts by taking regulator_list_mutex, The > > > > same mutex covers eg. regulator initialization, including memory > > > > allocations > > > > that happen there. This will deadlock when you have filesystem on eg. > > > > eMMC > > > > (which uses a regulator to control module voltages) and you register > > > > a new regulator (hotplug a device?) when under memory pressure. > > > OK, that's very much a corner case, it only applies in the case of > > > coupled regulators. The most obvious thing here would be to move the > > > allocations on registration out of the locked region, we really only > > > need this in the regulator_find_coupler() call I think. If the > > > regulator isn't coupled we don't need to take the lock at all. > > Currently, regulator_lock_dependent() is called by eg. regulator_enable() > > and > > regulator_get_voltage(), so actually any regulator can deadlock this way. > The initialization cases that are the trigger are only done for coupled > regulators though AFAICT, otherwise we're not doing allocations with the > lock held and should be able to progress.
I caught a few lockdep complaints that suggest otherwise, but I'm still looking into that. > > > > Basically, we have a BKL for regulator_enable() and we're using ww_mutex > > > > as a recursive mutex with no deadlock prevention whatsoever. The locks > > > > also seem to cover way to much (eg. initialization even before making > > > > the > > > > regulator visible to the system). > > > Could you be more specific about what you're looking at here? There's > > > nothing too obvious jumping out from the code here other than the bit > > > around the coupling allocation, otherwise it looks like we're locking > > > list walks. > > When you look at the regulator API (regulator_enable() and friends), > > then in their implementation we always start by .._lock_dependent(), > > which takes regulator_list_mutex around its work. This mutex is what > > makes the code deadlock-prone vs memory allocations. I have a feeling > > that this lock is a workaround for historical requirements (recursive > > locking of regulator_dev) that might be no longer needed or is just > > too defensive programming. Hence my other patches and this inquiry. > We need to both walk up the tree for operations that involve the parent > and rope in any peers that are coupled, the idea is basically to avoid > changes to the coupling while we're trying to figure that out. This part I understand and this is what ww_mutex should take care of. But there is another lock that's taken around ww_mutex locking transaction that causes the trouble. I would expect that the rdev->mutex should be enough to guard against changes in coupled regulators list, but this might need a fix in the registration phase to guarantee that (it probably should do the same ww_mutex locking dance as .._lock_dependent() does now). Best Regards, Michał Mirosław