Oleg Nesterov <o...@redhat.com> writes: > sorry for noise, but after I read the changelog I have a minor nit, > feel free to ignore... > > On 02/12, Eric W. Biederman wrote: >> >> Skipping past dequeue_signal when we know a fatal signal has already >> been delivered resulted in SIGKILL remaining pending and >> TIF_SIGPENDING remaining set. This in turn caused the >> scheduler to not sleep in PTACE_EVENT_EXIT as it figured >> a fatal signal was pending. > > Yes, but the status of TIF_SIGPENDING doesn't matter. However I agree > with recalc_sigpending() added by this patch, simply because this is what > the "normal" dequeue_signal() paths do. > >> This also caused ptrace_freeze_traced >> in ptrace_check_attach to fail because it left a per thread >> SIGKILL pending which is what fatal_signal_pending tests for. > > this is possible too, but in the likely case ptrace_check_attach() won't > be even called exactly because the tracee won't stop and thus waitpid() > won't report WIFSTOPPED. And even if waitpid() "wins" the race and debugger > calls ptrace(), most probably ptrace_freeze_traced() will fail because > task_is_traced() will be false. > > I think this part of the changelog looks a bit confusing. It doesn't matter > why ptrace_check_attach() fails, it must fail if the tracee didn't stop. > > PTACE_EVENT_EXIT won't stop and thus this event won't be reported, that is > all.
There might be a better way to say it. What I meant to convey is that in my testing I could get PTRACE_EVENT_EXIT to stop by clearing TIF_SIGPENDING (and leaving SIGKILL in pending). That was insufficient to fix the bug. Without SIGKILL in pending ptrace_check_attach would fail when the process was stopped in PTRACE_EVENT_EXIT. Or in short it really bugs me that we have to have signal_group_exit and fatal_signal_pending not in agreement after in do_exit to make the code work. It bothers me because fatal_signal_pending and signal_group_exit are in all other cases just different ways to ask the same question. Eric