On Tue, Aug 27, 2013 at 02:35:11PM -0700, Eric W. Biederman wrote: > Peter Zijlstra <pet...@infradead.org> writes: > > > On Mon, Aug 26, 2013 at 10:37:22PM -0400, Richard Guy Briggs wrote: > >> On Fri, Aug 23, 2013 at 08:36:21AM +0200, Peter Zijlstra wrote: > >> > Except that's not the case, with namespaces there's a clear hierarchy > >> > and the task_struct::pid is the one true value aka. root namespace. > >> > >> Peter, I agonized over the access efficiency of dropping this one or the > >> duplicate in task_struct::pids and this one was far easier to drop in > >> terms of somehow always forcing > >> task->pids[PIDTYPE_PID].pid->numbers[0].nr to point to task->pid. > > > > You mean there's more than 1 site that sets task_struct::pid? I thought > > we only assign that thing once in fork.c someplace. > > No there is not and that is not a concern. > > Now I had thought given how the perf subsystem was constructed that only > the god like root was even allowed to use the code. But it turns out > there is sysctl_perf_event_paranoid that can bet twiddled that in some > circumstance that unprivileged users are allowed to use perf.
Even without poking at that, a user is always allowed to use perf on his own tasks. > Which > ultimately means perf will return the wrong data. How so, perf uses pid-namespaces, have a look at kernel/events/core.c:perf_event_[pt]id(). We store the namespace of the task creating the event (which is also -- hopefully -- the consumer of the data it generates). If you create an event and then switch namespaces you've bigger issues I suppose. > Meaning that perf is broken by design and perf has no excuse to be using > task->pid. It doesn't. > Similarly every other place in the kernel that has made the > same mistake. I mention perf explicitly for two reasons. perf gets the > namespace handling horribly wrong, Do tell. > perf is the only place in the kernel > where we are accessing pids frequently enough for an extra cache line > miss to be a concern. > > When really pids in the kernel what we care about is not some stupid > number but the stuct pid which points to that tasks, process groups, and > sessions. You know the object that a pid is the name for. > > So yes as a long term direction task->pid and task->tgid need to be > killed because we keep getting subsystems like perf that return the > wrong data to userspace, or perform the wrong checks, because the > current structure makes it seem like it is ok to do the wrong thing. I think you should have a look at code before you start raving.. makes you looks silly. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/