Hi Andy,

 ---- On Tue, 09 Jun 2026 08:01:57 +0800 Andy Lutomirski <[email protected]> wrote ---
 > On Thu, May 28, 2026 at 4:05 AM Christian Brauner <[email protected]> wrote:
 > >
 > > On Thu, May 28, 2026 at 05:52:21PM +0800, Li Chen wrote:
 > > > Hi,
 > > >
 > > > This is an early RFC for an idea that is probably still rough in both the
 > > > UAPI and implementation details. Sorry for the rough edges; I am sending
 > > > it now to check whether this direction is worth pursuing and to get
 > > > feedback on the kernel/userspace boundary.
 > >
 > > The idea of having a builder api for exec isn't all that crazy. But it
 > > should simply be built on top of pidfds and thus pidfs itself instead.
 > > It has all the basic infrastructure in place already. Any implementation
 > > should also allow userspace to implement posix_spawn() on top of it.
 > >
 > > fd = pidfd_open(0, PIDFD_EMPTY /* or better name */)
 > >
 > > pidfd_config(fd, ...) // modeled similar to fsconfig()
 > >
 > 
 > After contemplating this for a bit... why pidfd?  Doesn't a pidfd
 > refer to an actual process that is, or at least was, running?  This
 > new thing is a process that we are contemplating spawning.  I can
 > imagine that basically all pidfd APIs would be a bit confused by the
 > nonexistence of the process in question.
 > 

Yes, I think that is a real concern.

In my current local WIP I tried to keep that distinction explicit.
pidfd_spawn_open() returns a pidfs-backed builder fd, not a normal pidfd
referring to a process. The builder fd is allocated as an anonymous pidfs
file with builder-specific file operations:

    file = pidfs_alloc_anon_file("[pidfd_spawn]",
                                 &pidfd_spawn_builder_fops, builder,
                                 O_RDWR);

and the normal pidfd helpers still reject it because it does not use the
ordinary pidfd file operations:

    struct pid *pidfd_pid(const struct file *file)
    {
        if (file->f_op != &pidfs_file_operations)
            return ERR_PTR(-EBADF);
        return file_inode(file)->i_private;
    }
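
As a rough illustration of why that check is enough, the f_op comparison
can be modeled in plain userspace C. This is not the WIP code: the struct
layouts below are simplified stand-ins, and model_pidfd_pid() and
model_pidfd_pid_check() are hypothetical names used only for this sketch.
The real helper returns ERR_PTR(-EBADF); the model returns NULL and sets
errno instead.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; only the fields used here. */
struct file_operations {
    const char *name;
};

struct file {
    const struct file_operations *f_op;
    void *private_data;   /* stands in for file_inode(file)->i_private */
};

/* Two distinct ops tables; the identity of the table is what is checked. */
const struct file_operations pidfs_file_operations = { "pidfs" };
const struct file_operations pidfd_spawn_builder_fops = { "pidfd_spawn" };

/*
 * Model of pidfd_pid(): returns the payload for an ordinary pidfd file,
 * or NULL with errno set to EBADF for any other kind of file.
 */
void *model_pidfd_pid(const struct file *file)
{
    if (file->f_op != &pidfs_file_operations) {
        errno = EBADF;
        return NULL;
    }
    return file->private_data;
}

/* Small self-check: a pidfd file resolves, a builder file does not. */
void model_pidfd_pid_check(void)
{
    int fake_pid = 1234;
    struct file pidfd_file = { &pidfs_file_operations, &fake_pid };
    struct file builder_file = { &pidfd_spawn_builder_fops, NULL };

    assert(model_pidfd_pid(&pidfd_file) == &fake_pid);
    errno = 0;
    assert(model_pidfd_pid(&builder_file) == NULL);
    assert(errno == EBADF);
}
```

The point of the model is that the builder file never needs a special
"empty" flag: having a different ops table is what keeps it out of every
helper that goes through pidfd_pid().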
                                                 
So the current split is:

    builder_fd = pidfd_spawn_open(...);       /* builder object */
    pidfd_config(builder_fd, ...);
    child_pidfd = pidfd_spawn_run(builder_fd, ...); /* real pidfd */

Only the last fd is a normal pidfd for an actual child process. The
builder fd is only accepted by the builder operations.
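
The three-step split can also be sketched as a tiny userspace state model.
Everything here is hypothetical and only mirrors the intent described
above: the model_* names, the struct fields, and the choice of -EBADF for
every mismatch are assumptions of this sketch, not the RFC's UAPI.

```c
#include <errno.h>

/* Hypothetical model: every fd is either a builder or a real pidfd. */
enum model_kind { MODEL_BUILDER, MODEL_PIDFD };

struct model_fd {
    enum model_kind kind;
    int nr_config;   /* config calls applied (meaningful for builders) */
    int pid;         /* child pid (meaningful for real pidfds) */
};

/* Models pidfd_spawn_open(): a builder object, no process exists yet. */
struct model_fd model_spawn_open(void)
{
    struct model_fd fd = { MODEL_BUILDER, 0, 0 };
    return fd;
}

/* Models pidfd_config(): a builder operation, assumed to reject pidfds. */
int model_config(struct model_fd *fd)
{
    if (fd->kind != MODEL_BUILDER)
        return -EBADF;
    fd->nr_config++;
    return 0;
}

/* Models pidfd_spawn_run(): takes a builder, fills in a real pidfd. */
int model_spawn_run(const struct model_fd *builder, int pid,
                    struct model_fd *child)
{
    if (builder->kind != MODEL_BUILDER)
        return -EBADF;
    child->kind = MODEL_PIDFD;
    child->nr_config = 0;
    child->pid = pid;
    return 0;
}

/* Stand-in for waitid(P_PIDFD), pidfd_send_signal(), and similar. */
int model_pidfd_op(const struct model_fd *fd)
{
    return fd->kind == MODEL_PIDFD ? 0 : -EBADF;
}
```

In this model a process-facing operation on the builder fails before any
process exists, and a builder operation on the child pidfd fails as well,
so the two kinds of fd never need a shared "not yet spawned" state.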
                       
This avoids having to define what waitid(P_PIDFD), pidfd_send_signal(),
pidfd_getfd(), poll(), etc. mean before the process exists. The downside
is that it adds a separate open-style entry point and is less uniform than
the pidfd_open(0, PIDFD_EMPTY) spelling Christian sketched.
                                                 
If people think there is a better way to represent the pre-spawn builder
state, or if the preference is to integrate it directly into pidfd_open()
with an explicit empty/future-pidfd state, I would be happy to discuss
that.

Regards,
Li

