Michael Ellerman <m...@ellerman.id.au> writes: > Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> writes: > >> Calling arch_update_cpu_topology from a CPU hotplug state machine callback >> hits a deadlock because the function tries to get a read lock on >> cpu_hotplug_lock while the state machine still holds a write lock on it. >> >> Since all callers of arch_update_cpu_topology except rtasd already hold >> cpu_hotplug_lock, this patch changes the function to use >> stop_machine_cpuslocked and creates a separate function for rtasd which >> still tries to obtain the lock. >> >> Michael Bringmann investigated the bug and provided a detailed analysis >> of the deadlock on this previous RFC for an alternate solution: >> >> https://patchwork.ozlabs.org/patch/771293/ > > Do we know when this broke? Or has it never worked?
It's been broken since at least v4.4, I think. I don't know about earlier versions. > Should it go to stable? (can't in its current form AFAICS) It's not hard to backport both this patch and commit fe5595c07400 ("stop_machine: Provide stop_machine_cpuslocked()") from branch smp/hotplug in tip.git for stable. Since rtasd only started calling arch_update_cpu_topology since v4.11, for earlier versions this patch can be simplified to making that function call stop_machine_cpuslocked unconditionally instead of defining a separate function. >> Signed-off-by: Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> >> --- >> >> Notes: >> This patch applies on tip/smp/hotplug, it should probably be carried >> there. > > stop_machine_cpuslocked() doesn't exist in mainline so I think it has to > be carried there right? Yes. I said "probably" because I don't know if you want to wait until that branch is merged so that you can carry this patch in your tree. -- Thiago Jung Bauermann IBM Linux Technology Center