Hi Maxime, On Mon, 29 Sep 2025 14:45:10 +0200 Maxime Ripard <[email protected]> wrote:
> On Fri, Sep 26, 2025 at 05:59:43PM +0200, Luca Ceresoli wrote: > > drm_encoder_cleanup() modifies the encoder chain by removing bridges via > > drm_bridge_detach(). Protect this whole operation by taking the mutex, so > > any users iterating over the chain will not access it during the change. > > > > Signed-off-by: Luca Ceresoli <[email protected]> > > --- > > drivers/gpu/drm/drm_encoder.c | 2 ++ > > 1 file changed, 2 insertions(+) > > > > diff --git a/drivers/gpu/drm/drm_encoder.c b/drivers/gpu/drm/drm_encoder.c > > index > > 3261f142baea30c516499d23dbf8d0acf5952cd6..3a04bedf9b759acd6826864b7f2cc9b861a8f170 > > 100644 > > --- a/drivers/gpu/drm/drm_encoder.c > > +++ b/drivers/gpu/drm/drm_encoder.c > > @@ -195,9 +195,11 @@ void drm_encoder_cleanup(struct drm_encoder *encoder) > > * the indices on the drm_encoder after us in the encoder_list. > > */ > > > > + mutex_lock(&encoder->bridge_chain_mutex); > > list_for_each_entry_safe(bridge, next, &encoder->bridge_chain, > > chain_node) > > drm_bridge_detach(bridge); > > + mutex_unlock(&encoder->bridge_chain_mutex); > > You were claiming that the mutex was to prevent issues with concurrent > iteration and removal of the list members. list_for_each_entry_safe() is > explicitly made to protect against that. Why do we need both? You're right saying we don't need both. With a mutex preventing the list from any change, we can actually simpify code a bit to use the non-safe list macro: - struct drm_bridge *bridge, *next; + struct drm_bridge *bridge; ... + mutex_lock(&encoder->bridge_chain_mutex); - list_for_each_entry_safe(bridge, next, &encoder->bridge_chain, + list_for_each_entry(bridge, &encoder->bridge_chain, chain_node) drm_bridge_detach(bridge); + mutex_unlock(&encoder->bridge_chain_mutex); But it's not fully correct that list_for_each_entry_safe() protects against concurrent removal. As I see it, all the _safe variants of list_for_each*() macros protect against one specific and frequent use case: list_for_each_entry_safe(curs, n, some_list, membername) { ... list_del(&curs->membername); ... } So the _safe variant protect from removing the element at the current iteration (*curs). It does so by taking the following element pointer in advance, in the auxiliary variable @n. But _concurrent_ removal (the topic of this series) means the element being removed could just be the following one. Consider this sequence: 1. start loop: list_for_each_entry_safe() sets; pos = list_first_entry() = <bridge 1> n = list_next_entry(pos) = <bridge 2> 2. enter the loop 1st time, do something with *pos (bridge 1) 3. in the meanwhile bridge 2 is hot-unplugged -> another thread removes item 2 -> drm_bridge_detach() -> list_del() sets bridge->next = LIST_POISON1 4. loop iteration 1 finishes, list_for_each_entry_safe() sets: pos = n (previously set to bridge 2) n = (bridge 2)->next = LIST_POISON1 5. enter the loop 2nd time, do something with *pos (bridge 2) 6. loop iteration 2 finishes, list_for_each_entry_safe() sets: pos = n (previously set to LIST_POISON1) -> bug Do you think this explains the need for this series? If it does, I should probably add this to the cover letter and maybe patch 1. Luca -- Luca Ceresoli, Bootlin Embedded Linux and Kernel engineering https://bootlin.com
