Hi, On Sun, 2025-01-12 at 14:07 -0600, Glenn Washburn wrote: > On Fri, 10 Jan 2025 17:13:05 +0100 > Benjamin Berg <benja...@sipsolutions.net> wrote: > > > From: Benjamin Berg <benjamin.b...@intel.com> > > > > The stub execution uses the somewhat new close_range and execveat > > syscalls. Of these two, the execveat call is essential, but the > > close_range call is more about stub process hygiene rather than safety > > (and its result is ignored). > > > > Replace both calls with a raw syscall as older machines might not have a > > recent enough kernel for close_range (with CLOSE_RANGE_CLOEXEC) or a > > libc that does not yet expose both of the syscalls. > > > > Fixes: 32e8eaf263d9 ("um: use execveat to create userspace MMs") > > This change fixes the immediate issue, allowing the compile to complete > successfully. > > > Reported-by: Glenn Washburn <developm...@efficientek.com> > > Closes: https://lore.kernel.org/20250108022404.05e0de1e@crass-HP-ZBook-15-G2 > > Signed-off-by: Benjamin Berg <benjamin.b...@intel.com> > > --- > > arch/um/os-Linux/skas/process.c | 16 +++++++++++++--- > > 1 file changed, 13 insertions(+), 3 deletions(-) > > > > diff --git a/arch/um/os-Linux/skas/process.c > > b/arch/um/os-Linux/skas/process.c > > index f683cfc9e51a..64e89921bc7f 100644 > > --- a/arch/um/os-Linux/skas/process.c > > +++ b/arch/um/os-Linux/skas/process.c > > @@ -181,6 +181,10 @@ extern char __syscall_stub_start[]; > > > > static int stub_exe_fd; > > > > +#ifndef CLOSE_RANGE_CLOEXEC > > +#define CLOSE_RANGE_CLOEXEC (1U << 2) > > +#endif > > + > > static int userspace_tramp(void *stack) > > { > > char *const argv[] = { "uml-userspace", NULL }; > > @@ -202,8 +206,12 @@ static int userspace_tramp(void *stack) > > init_data.stub_data_fd = phys_mapping(uml_to_phys(stack), &offset); > > init_data.stub_data_offset = MMAP_OFFSET(offset); > > > > - /* Set CLOEXEC on all FDs and then unset on all memory related FDs */ > > - close_range(0, ~0U, CLOSE_RANGE_CLOEXEC); > > + /* > > + * Avoid leaking unneeded FDs to the stub by setting CLOEXEC on all FDs > > + * and then unsetting it on all memory related FDs. > > + * This is not strictly necessary from a safety perspective. > > + */ > > + syscall(__NR_close_range, 1, ~0U, CLOSE_RANGE_CLOEXEC); > > Was this intentional to change the fd parameter from 0 to 1?
Ah, no, that was an accident. It is fine (zero FD is replaced anyway later), but it should be zero. > > fcntl(init_data.stub_data_fd, F_SETFD, 0); > > for (iomem = iomem_regions; iomem; iomem = iomem->next) > > @@ -224,7 +232,9 @@ static int userspace_tramp(void *stack) > > if (ret != sizeof(init_data)) > > exit(4); > > > > - execveat(stub_exe_fd, "", argv, NULL, AT_EMPTY_PATH); > > + /* Raw execveat for compatibility with older libc versions */ > > + syscall(__NR_execveat, stub_exe_fd, (unsigned long)"", > > + (unsigned long)argv, NULL, AT_EMPTY_PATH); > > I think it would look nicer to leave the call unchanged and define a stub > function like libc does, but only if we detect that the stubs are > undefined. I have a glibc specific patch for this, and then realized > that it should be more generic to cover other libcs. So this patch here > is more general. I think to have what I'd like, I'd need to add and run > a test binaries that check for these functions, like autotools does. > I'm not aware that this is really done in the kernel build system, > though I do know that there are binaries built during build for > generating binaries to be included in the kernel. So in theory I don't > think it should be too much trouble to do. Basically the idea would be > to have a system for testing host libc and outputting a config.h which > will define for instance, HAVE_EXECVEAT, if support is detected in the > linked libc for execveat. And in process.c define a stub around the > syscall() for execveat if not defined HAVE_EXECVEAT. > > So perhaps for now, this is the best solution, but ultimately it would > be nice to have the above and ultimately reverse these changes. Exactly, I am not aware of any configure infrastructure that would make it easy to do a test compile for discovery. I doubt it makes sense to add such an infrastructure just for this. Benjamin