On 2018/10/25 20:13, Oleg Nesterov wrote: > On 10/25, Tetsuo Handa wrote: >> >> Oleg Nesterov wrote: >>> On 10/22, Tetsuo Handa wrote: >>>>> And again, I do not know how/if yama ensures that child is rcu-protected, >>>>> perhaps >>>>> task_is_descendant() needs to check pid_alive(child) right after >>>>> rcu_read_lock() ? >>>> >>>> Since the caller (ptrace() path) called get_task_struct(child), child >>>> itself can't be >>>> released. Do we still need pid_alive(child) ? >>> >>> get_task_struct(child) can only ensure that this task_struct can't be freed. >> >> The report says that it is a use-after-free read at >> >> walker = rcu_dereference(walker->real_parent); >> >> which means that walker was already released. > > quite possibly I missed something, but I am not sure I understand your > concerns... > > So again, suppose that "child" is already dead. Its task_struct can't be > freed, > but child->real_parent can point to the already freed memory.
Yes. But if child->real_parent is pointing to the already freed memory, why does pid_alive(child) == true help? > > This means that the 1st walker = rcu_dereference(walker->real_parent) is fine, > this simply reads the child->real_parent pointer, Yes. > but on the second iteration > > walker = rcu_dereference(walker->real_parent); > > reads the alredy freed memory. Yes. > >> I wonder whether pid_alive() test helps. >> >> We can get >> >> [ 40.620318] parent or walker is dead. >> [ 40.624146] tracee is dead. >> >> messages using below patch and reproducer. > > again, I do not understand, this all looks correct... > >> ---------- >> diff --git a/kernel/ptrace.c b/kernel/ptrace.c >> index 99cfddd..0d9d786 100644 >> --- a/kernel/ptrace.c >> +++ b/kernel/ptrace.c >> @@ -385,6 +385,7 @@ static int ptrace_attach(struct task_struct *task, long >> request, >> if (mutex_lock_interruptible(&task->signal->cred_guard_mutex)) >> goto out; >> >> + schedule_timeout_killable(HZ); >> task_lock(task); >> retval = __ptrace_may_access(task, PTRACE_MODE_ATTACH_REALCREDS); >> task_unlock(task); >> diff --git a/security/yama/yama_lsm.c b/security/yama/yama_lsm.c >> index ffda91a..a231ec6 100644 >> --- a/security/yama/yama_lsm.c >> +++ b/security/yama/yama_lsm.c >> @@ -283,6 +283,11 @@ static int task_is_descendant(struct task_struct >> *parent, >> return 0; >> >> rcu_read_lock(); >> + if (!pid_alive(parent) || !pid_alive(walker)) { >> + rcu_read_unlock(); >> + printk("parent or walker is dead.\n"); > > This is what we need to do, except I think we should change > yama_ptrace_access_check(). > And iiuc parent == current, pid_alive(parent) looks unnecessary. Although we > need to > check ptracer_exception_found(), may be it needs some changes too. There are two task_is_descendant() callers, and one of them is not passing current. > > And yes, task_is_descendant() can hit the dead child, if nothing else it can > be killed. This can explain the kasan report. The kasan is reporting that child->real_parent (or maybe child->real_parent->real_parent or child->real_parent->real_parent->real_parent ...) was pointing to already freed memory, isn't it? How can we check that that pointer is pointing to already freed memory? As soon as walker = rcu_dereference(walker->real_parent); is executed, task_alive(walker) will try to read from already freed memory...