On Mon, Mar 25, 2019 at 10:45:29AM -0700, Linus Torvalds wrote: > On Fri, Mar 22, 2019 at 11:34 AM Michael Tirado <mtirado...@gmail.com> wrote: > > > > On Wed, Mar 20, 2019 at 8:08 PM Alexey Dobriyan <adobri...@gmail.com> wrote: > > > > > > pidfd code should be backed out immediately. Forget about /proc. > > > > Seems like Torvalds just merges this sort of "stuff" without reading > > it now, or there's something that auto accepted pull request to RC tree? > > There is no auto-accept. > > But there also didn't seem to be any valid arguments against it, and > the android people had arguments for it. > > Arguing against it based on "I don't like /proc" is pointless. The > fact is, /proc is our system interface for a lot of things. > > Arguing against it based on "I worry about the _other_ > non-signal-sending things that could be done with this" is also > pointless. What other things? The only thing that got merged was the > signalling.
To back Linus defense up with a glimpse into the future. We will not be to rely on dirfds from proc to do general process management. That is even in the commit message for the pidfd_send_signal syscall, that we intend to decouple this from procfs, i.e. decouple process management from process metadata reading. We have an ongoing discussion and what a lot of people agree upon is that pidfds will be anon inode file descriptors that stash a reference to struct pid in their private_data member. They can be pollable if ever need be and they are just conceptually cleaner and way simpler and mirror what will happen in the new mount api as well. The idea is to translate these pidfds e.g. via a simple ioctl() interface that takes a pidfd and gives back - with standard permissions applied as are today - a corresponding /proc/<pid> fd that can be used to read metadata of a process (see the suggestion by Andy and Jann [1]). The advantage is that this means that pidfd_clone() or something similar can simply return a pidfd and does not need to care about what procfs the process is supposed to be located in/reference and is in general way safer. But there is absolutely nothing wrong with allowing users to use /proc/<pid> to signal processes. One of the reasons why I did this is that it is so intuitive to users that non-kernel people have requested this be possible over and over. As mentioned in the orignal patchset the future was always to decouple this from procfs (see the references in there) and this is what the new pidctl() syscall is for that transparently translates between the pid-based api and the pidfd-based api. [1]: https://lore.kernel.org/lkml/CAG48ez3VMjLJBC_F3BxC2sc2s-28NdsrUduR=jx66xh0w2o...@mail.gmail.com/ > > Now, arguing that signalling should use the open-time credentials > might make sense, but this isn't read/write. You can't fool some suid > program to do magic randon system calls for you, and if you can, then > arguing about pidfd is kind of pointless. > > So the model of using a file descriptor instead of a 'pid' for signal > handling is actually very unix-like. Maybe that's how pid's should > have worked to begin with. Remember that whole "everything is a file" > thing? > > Now, the fact that fork() and clone() return a pid obviously means > that pidfd isn't the primary model (not to decades of just history), > but that doesn't make pidfd wrong. > > And namespace issues etc are all also kind of irrelevant. If you open > random files in /proc and randomly do pidfd_send_signal() on those, > you get random results. If that worries you, then DON'T DO THAT THEN, > for chrissake! That's not a sane model to begin with, but it's not the > usage model for this, so it's another completely specious argument. > > So yes, I thought about the pidfd pull (which was why it happened at > the very end of the merge window), and I found the arguments against > it bad. > > Linus