On 08/08, Bart Van Assche wrote:
>
> This is the sequence of which I think that it leads to the missed wakeup:
>
> Task 1          Task 2                   Task 3                  Task 4
>
> lock_page()
> ...
>                 lock_page_killable()
>                 __lock_page_killable()
>                 __wait_on_bit_lock()
>                 bit_wait_io()
>                 io_schedule()
>                 ...
>                                                                  lock_page()
>                                                                  __lock_page()
>                                                                  __wait_on_bit_lock()
>                                                                  bit_wait_io()
>                                                                  io_schedule()
>                                                                  ...
>                                          (signal delivery to task 2)
>                                          try_to_wake_up(task2, ..., ...)
>                                          (try_to_wake_up() returns 1)
> unlock_page()
> wake_up_page()
> __wake_up_bit()
> __wake_up(wq, TASK_NORMAL, 1, &key)
> __wake_up_common(wq, mode=TASK_NORMAL, nr_exclusive=1, 0, key)
> wake_bit_function()
> autoremove_wake_function()
> default_wake_function()
> try_to_wake_up() <- skips task 2 because task 3 already changed
>                     the task state of task 2
> (autoremove_wake_function() does not do
>  list_del_init(&wait->task_list))

Yes. But since it skips task2, __wake_up_common() doesn't decrement
nr_exclusive, doesn't stop. It continues the list_for_each_entry_safe()
loop, and finds the sleeping task4, and wakes it up,

>                 bit_wait_io() returns -EINTR
>                 abort_exclusive_wait() is called by __wait_on_bit_lock()
>
> In the above sequence task 1 does not remove task 2 from the waitqueue
> because task 3 had already woken up task 2. The result is that when task 2
> calls abort_exclusive_wait() that task 2 is still on the waitqueue.

Yes, but this is fine,

> With the current implementation of abort_exclusive_wait() in the above
> scenario task 4 is not woken up although it should be woken up.

See above, it must be already woken by __wake_up_common().

So far _I think_ that the bug is somewhere else... Say, someone clears
PG_locked without wake_up(). Then SIGKILL sent to the task sleeping in
sys_read() "adds" the necessary wakeup...

Do you use external modules during the testing?

Oleg.