Hi Andy,

 ---- On Tue, 09 Jun 2026 08:01:57 +0800 Andy Lutomirski <[email protected]> wrote ---
 > On Thu, May 28, 2026 at 4:05 AM Christian Brauner <[email protected]> wrote:
 > >
 > > On Thu, May 28, 2026 at 05:52:21PM +0800, Li Chen wrote:
 > > > Hi,
 > > >
 > > > This is an early RFC for an idea that is probably still rough in both the
 > > > UAPI and implementation details. Sorry for the rough edges; I am sending
 > > > it now to check whether this direction is worth pursuing and to get
 > > > feedback on the kernel/userspace boundary.
 > >
 > > The idea of having a builder api for exec isn't all that crazy. But it
 > > should simply be built on top of pidfds and thus pidfs itself instead.
 > > It has all the basic infrastructure in place already. Any implementation
 > > should also allow userspace to implement posix_spawn() on top of it.
 > >
 > > fd = pidfd_open(0, PIDFD_EMPTY /* or better name */)
 > >
 > > pidfd_config(fd, ...) // modeled similar to fsconfig()
 > >
 >
 > After contemplating this for a bit... why pidfd? Doesn't a pidfd
 > refer to an actual process that is, or at least was, running? This
 > new thing is a process that we are contemplating spawning. I can
 > imagine that basically all pidfd APIs would be a bit confused by the
 > nonexistence of the process in question.
 >

Yes, I think that is a real concern.

In my current local WIP I tried to keep that distinction explicit.
pidfd_spawn_open() returns a pidfs-backed builder fd, not a normal pidfd
referring to a process. The builder fd is allocated as an anonymous pidfs
file with builder-specific file operations:

    file = pidfs_alloc_anon_file("[pidfd_spawn]",
                                 &pidfd_spawn_builder_fops, builder,
                                 O_RDWR);

and the normal pidfd helpers still reject it because it does not use the
ordinary pidfd file operations:

    struct pid *pidfd_pid(const struct file *file)
    {
            if (file->f_op != &pidfs_file_operations)
                    return ERR_PTR(-EBADF);
            return file_inode(file)->i_private;
    }

So the current split is:

    builder_fd = pidfd_spawn_open(...);              /* builder object */
    pidfd_config(builder_fd, ...);
    child_pidfd = pidfd_spawn_run(builder_fd, ...);  /* real pidfd */

Only the last fd is a normal pidfd for an actual child process. The
builder fd is only accepted by the builder operations.
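
To make the two-way rejection concrete, here is a small userspace mock of
the f_op check. It is only a sketch and not the WIP code: every mock_* name
is hypothetical, the pid is a plain pointer, and returning -EBADF from the
builder operations is my assumption (only pidfd_pid() is shown returning it
above).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace mock only; every mock_* name is hypothetical. */
struct mock_fops { const char *name; };

struct mock_file {
	const struct mock_fops *f_op;	/* tells the two fd kinds apart */
	void *private_data;		/* builder state or pid stand-in */
};

const struct mock_fops mock_pidfs_file_operations = { "pidfd" };
const struct mock_fops mock_pidfd_spawn_builder_fops = { "[pidfd_spawn]" };

/* Mirrors the pidfd_pid() check: only ordinary pidfds yield a pid. */
int mock_pidfd_pid(const struct mock_file *file, void **pid)
{
	if (file->f_op != &mock_pidfs_file_operations)
		return -EBADF;
	*pid = file->private_data;
	return 0;
}

/* Stand-in for a pidfd consumer such as pidfd_send_signal(). */
int mock_pidfd_send_signal(const struct mock_file *file)
{
	void *pid;

	return mock_pidfd_pid(file, &pid);
}

/* Builder operations apply the inverse check (error code assumed). */
int mock_pidfd_config(const struct mock_file *file)
{
	if (file->f_op != &mock_pidfd_spawn_builder_fops)
		return -EBADF;
	return 0;
}

/* Running the builder is the only step that yields a normal pidfd. */
int mock_pidfd_spawn_run(const struct mock_file *builder,
			 struct mock_file *child, void *new_pid)
{
	if (builder->f_op != &mock_pidfd_spawn_builder_fops)
		return -EBADF;
	child->f_op = &mock_pidfs_file_operations;
	child->private_data = new_pid;
	return 0;
}

/* Walks the open -> config -> run split and checks both rejections. */
int mock_self_check(void)
{
	int fake_pid = 1234;
	struct mock_file builder = { &mock_pidfd_spawn_builder_fops, NULL };
	struct mock_file child = { NULL, NULL };

	assert(mock_pidfd_config(&builder) == 0);
	assert(mock_pidfd_send_signal(&builder) == -EBADF);
	assert(mock_pidfd_spawn_run(&builder, &child, &fake_pid) == 0);
	assert(mock_pidfd_send_signal(&child) == 0);
	assert(mock_pidfd_config(&child) == -EBADF);
	return 0;
}
```

Calling mock_self_check() from a test main() exercises the whole sequence:
the builder is accepted by the builder operations and rejected by the pidfd
consumer, and the reverse holds for the child.
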

This avoids having to define what waitid(P_PIDFD), pidfd_send_signal(),
pidfd_getfd(), poll(), etc. mean before the process exists. The downside
is that it adds a separate open-style entry point and is less uniform than
the pidfd_open(0, PIDFD_EMPTY) spelling Christian sketched.

If people think there is a better way to represent the pre-spawn builder
state, or if the preference is to integrate it directly into pidfd_open()
with an explicit empty/future-pidfd state, I would be happy to discuss
that.

Regards,
Li

