On Tue, 31 Mar 2026 21:13:34 +0200 "Arnd Bergmann" <[email protected]> wrote:
> On Tue, Mar 31, 2026, at 19:19, Jori Koolstra wrote:
> > Currently there is no way to race-freely create and open a directory.
> > For regular files we have open(O_CREAT) for creating a new file inode,
> > and returning a pinning fd to it. The lack of such functionality for
> > directories means that when populating a directory tree there's always
> > a race involved: the inodes first need to be created, and then opened
> > to adjust their permissions/ownership/labels/timestamps/acls/xattrs/...,
> > but in the time window between the creation and the opening they might
> > be replaced by something else.
> >
> > Addressing this race without proper APIs is possible (by immediately
> > fstat()ing what was opened, to verify that it has the right inode type),
> > but difficult to get right. Hence, mkdirat_fd() that creates a directory
> > and returns an O_DIRECTORY fd is useful.
> >
> > This feature idea (and description) is taken from the UAPI group:
> > https://github.com/uapi-group/kernel-features?tab=readme-ov-file#race-free-creation-and-opening-of-non-file-inodes
> >
> > Signed-off-by: Jori Koolstra <[email protected]>
>
> I checked that the calling conventions are fine, i.e. this will work
> as expected across all architectures. I assume you are also aware
> that the non-RFC patch will need to add the syscall number to all
> .tbl files.
>
> The hardest problem here does seem to be the naming of the
> new syscall, and I'm sorry to not be able to offer any solution
> either, just two observations:
>
> - mkdirat/mkdirat_fd sounds similar to the existing
>   quotactl/quotactl_fd pair, but quotactl_fd() takes a file
>   descriptor argument rather than returning it, which makes
>   this addition quite confusing.
>
> - the nicest interface IMO would have been a variation of
>      openat(dfd, filename, O_CREAT | O_DIRECTORY, mode)
>   but that is a minefield of incompatible implementations[1],
>   so we can't do that without changing the behavior for
>   existing callers that currently run into an error.

Just require O_TMPFILE to be set as well :-)
You know you'll never regret it once Apr-1 is over.

Can something be done with the flags to openat2()?
That might save allocating an extra system call.

	David

>
>      Arnd
>
> [1] https://lwn.net/Articles/926782/
>

