On Thu, Apr 18, 2019 at 10:26 AM Christian Brauner <christ...@brauner.io> wrote:
>
> On April 18, 2019 7:23:38 PM GMT+02:00, Jann Horn <ja...@google.com> wrote:
> >On Wed, Apr 17, 2019 at 3:09 PM Oleg Nesterov <o...@redhat.com> wrote:
> >> On 04/16, Joel Fernandes wrote:
> >> > On Tue, Apr 16, 2019 at 02:04:31PM +0200, Oleg Nesterov wrote:
> >> > >
> >> > > Could you explain when it should return POLLIN? When the whole process exits?
> >> >
> >> > It returns POLLIN when the task is dead or doesn't exist anymore, or when it
> >> > is in a zombie state and there's no other thread in the thread group.
> >>
> >> IOW, when the whole thread group exits, so it can't be used to monitor sub-threads.
> >>
> >> just in case... speaking of this patch it doesn't modify proc_tid_base_operations,
> >> so you can't poll("/proc/sub-thread-tid") anyway, but iiuc you are going to use
> >> the anonymous file returned by CLONE_PIDFD ?
> >
> >I don't think procfs works that way. /proc/sub-thread-tid has
> >proc_tgid_base_operations despite not being a thread group leader.

Huh. That seems very weird. Is it too late to change now? It feels like a bug.

> >(Yes, that's kinda weird.) AFAICS the WARN_ON_ONCE() in this code can
> >be hit trivially, and then the code will misbehave.
> >
> >@Joel: I think you'll have to either rewrite this to explicitly bail
> >out if you're dealing with a thread group leader

If you're _not_ dealing with a leader, right?

> , or make the code
> >work for threads, too.
> The latter case probably being preferred if this API is supposed to be
> useable for thread management in userspace.

IMHO, focusing on the thread group case for now might be best. We can
always support thread management in future work.

Besides: I'm not sure that we need kernel support for thread
monitoring. Can't libc provide a pollable FD for a thread internally?
libc can always run code just before thread exit, and it can signal an
eventfd at that point. Directly terminating individual threads without
going through userland breaks the process anyway: it's legal and normal
to SIGKILL a process as a whole, but if an individual thread terminates
without going through libc, the process is likely to be fatally broken.
(What if it's holding the heap lock?)

I know that some tools want to wait for termination of individual
threads in an external monitored process, but couldn't these tools
cooperate with libc to get these per-thread eventfds?

Is there a use case I'm missing?
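
To make the libc idea a bit more concrete, here is a rough userspace sketch of
a per-thread eventfd that becomes readable when the thread exits. It is only an
illustration, not a proposal for an actual libc interface: the names
monitored_thread, monitored_thread_create() and monitored_thread_wait() are
hypothetical, and a real libc would presumably keep the fd in its internal
thread descriptor rather than in a caller-visible struct. It relies on a
pthread cleanup handler, so it covers normal return, pthread_exit() and
cancellation, but by design not a thread that dies without going through libc.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <poll.h>
#include <pthread.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical per-thread record (not an existing libc type). */
struct monitored_thread {
    pthread_t tid;
    int exit_efd;            /* becomes readable once the thread has exited */
    void *(*fn)(void *);     /* user-supplied thread body */
    void *arg;
};

/* Cleanup handler: runs on return, pthread_exit() and cancellation. */
static void signal_exit(void *p)
{
    struct monitored_thread *t = p;
    uint64_t one = 1;

    /* Bump the eventfd counter so that pollers see POLLIN. */
    if (write(t->exit_efd, &one, sizeof(one)) != (ssize_t)sizeof(one)) {
        /* Nothing sensible to do here in a sketch. */
    }
}

/* Wraps the user function so the exit notification always fires. */
static void *trampoline(void *p)
{
    struct monitored_thread *t = p;
    void *ret = NULL;

    pthread_cleanup_push(signal_exit, t);
    ret = t->fn(t->arg);
    pthread_cleanup_pop(1);  /* 1 = run the handler on normal return too */
    return ret;
}

/* Returns 0 on success, -1 on failure. */
static int monitored_thread_create(struct monitored_thread *t,
                                   void *(*fn)(void *), void *arg)
{
    assert(t != NULL && fn != NULL);
    t->fn = fn;
    t->arg = arg;
    t->exit_efd = eventfd(0, EFD_CLOEXEC);
    if (t->exit_efd < 0)
        return -1;
    if (pthread_create(&t->tid, NULL, trampoline, t) != 0) {
        close(t->exit_efd);
        return -1;
    }
    return 0;
}

/* Returns 1 if the thread has exited within timeout_ms, 0 otherwise. */
static int monitored_thread_wait(struct monitored_thread *t, int timeout_ms)
{
    struct pollfd pfd = { .fd = t->exit_efd, .events = POLLIN };
    int n = poll(&pfd, 1, timeout_ms);

    return n == 1 && (pfd.revents & POLLIN);
}

/* Trivial thread body used only to exercise the sketch. */
static void *sample_worker(void *arg)
{
    return arg;
}
```

A caller would create the thread with monitored_thread_create(), add exit_efd
to whatever poll/epoll loop it already runs, and pthread_join() and close() the
fd once it reports POLLIN. An external monitoring tool would additionally need
some way to obtain the fd from the target process (for example fd passing over
a Unix socket), which this sketch does not attempt.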