On 08/09, Bart Van Assche wrote:
>
> Hello Oleg,
>
> Something that puzzles me is that removing the "else" keyword from
> abort_exclusive_wait() is sufficient to avoid the hang.

Yes, we need to understand this.

> If there would
> be code that clears PG_locked without calling wake_up() this hang
> probably would also be triggered by workloads that do not wake up
> lock_page_killable() with a signal.

Yes, and I already have another debugging patch to test this... it simply
turns lock_page_killable() into lock_page(). But let's check
__ClearPageLocked() first (the patch I sent a minute ago).

> BTW, the
> WARN_ONCE(!list_empty(&wait->task_list) && waitqueue_active(q), "mode =
> %#x\n", mode) statement that I added in abort_exclusive_wait() just
> produced the following call stack:

This condition is fine, and the trace is clear. This means that
lock_page_killable() was interrupted and wake_bit_function() was not
called. We do not need another wakeup in this case but somehow it helps.
Again, I think because the necessary wakeup was already lost/missed.

Oleg.
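
To make the "lost wakeup" hypothesis concrete, here is a minimal user-space sketch. It is not kernel code: all toy_* names are hypothetical, and the drop_else flag stands in for removing the "else" keyword. It assumes that abort_exclusive_wait() dequeues a waiter that is still queued and otherwise passes its wakeup on to the next waiter, which is what the WARN_ONCE() condition quoted above suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_WAITERS 8

/* Toy stand-in for a wait queue of exclusive, auto-removing waiters. */
struct toy_waitqueue {
    bool queued[MAX_WAITERS];  /* waiter i is still on the queue */
    bool woken[MAX_WAITERS];   /* waiter i has received a wakeup */
    size_t nr;
};

void toy_init(struct toy_waitqueue *q, size_t nr)
{
    assert(nr <= MAX_WAITERS);
    q->nr = nr;
    for (size_t i = 0; i < nr; i++) {
        q->queued[i] = true;
        q->woken[i] = false;
    }
}

/* True if at least one waiter is still queued. */
bool toy_waitqueue_active(const struct toy_waitqueue *q)
{
    for (size_t i = 0; i < q->nr; i++)
        if (q->queued[i])
            return true;
    return false;
}

/* Wake the first queued waiter and dequeue it (exclusive + autoremove). */
void toy_wake_up(struct toy_waitqueue *q)
{
    for (size_t i = 0; i < q->nr; i++) {
        if (q->queued[i]) {
            q->queued[i] = false;
            q->woken[i] = true;
            return;
        }
    }
}

/* Assumed shape of the abort path for the interrupted waiter `self`.
 * drop_else == true models the variant without the "else" keyword. */
void toy_abort_exclusive_wait(struct toy_waitqueue *q, size_t self,
                              bool drop_else)
{
    if (q->queued[self]) {
        /* Never woken: just leave the queue. */
        q->queued[self] = false;
        if (drop_else && toy_waitqueue_active(q))
            toy_wake_up(q);  /* the extra, normally unneeded wakeup */
    } else if (toy_waitqueue_active(q)) {
        /* We consumed a wakeup but gave up: hand it to the next waiter. */
        toy_wake_up(q);
    }
}

/* Scenario: the lock is released but the wakeup is lost, then waiter 0
 * is interrupted by a signal. Reports whether waiter 1 gets woken. */
bool toy_second_waiter_woken(bool drop_else)
{
    struct toy_waitqueue q;

    toy_init(&q, 2);
    /* Lock released here with no toy_wake_up(): the lost wakeup. */
    toy_abort_exclusive_wait(&q, 0, drop_else);
    return q.woken[1];
}
```

In this model, toy_second_waiter_woken(false) leaves waiter 1 asleep, while toy_second_waiter_woken(true) wakes it. That is consistent with the extra wakeup only papering over a wakeup that was already missed elsewhere, but the model says nothing about where the real one is lost.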