On Mon, 13 Jan 2025 10:41:07 +0100 Benjamin Berg <benja...@sipsolutions.net> wrote:
> From: Benjamin Berg <benjamin.b...@intel.com>
>
> The stub execution uses the somewhat new close_range and execveat
> syscalls. Of these two, the execveat call is essential, but the
> close_range call is more about stub process hygiene rather than safety
> (and its result is ignored).
>
> Replace both calls with a raw syscall as older machines might not have a
> recent enough kernel for close_range (with CLOSE_RANGE_CLOEXEC) or a
> libc that does not yet expose both of the syscalls.
>
> Fixes: 32e8eaf263d9 ("um: use execveat to create userspace MMs")
> Reported-by: Glenn Washburn <developm...@efficientek.com>
> Closes: https://lore.kernel.org/20250108022404.05e0de1e@crass-HP-ZBook-15-G2
> Signed-off-by: Benjamin Berg <benjamin.b...@intel.com>

Tested-by: Glenn Washburn <developm...@efficientek.com>

>
> ---
>
> v2:
> - Fix regression in previous version to not close FD 0
> ---
>  arch/um/os-Linux/skas/process.c | 16 +++++++++++++---
>  1 file changed, 13 insertions(+), 3 deletions(-)
>
> diff --git a/arch/um/os-Linux/skas/process.c b/arch/um/os-Linux/skas/process.c
> index f683cfc9e51a..e2f8f156402f 100644
> --- a/arch/um/os-Linux/skas/process.c
> +++ b/arch/um/os-Linux/skas/process.c
> @@ -181,6 +181,10 @@ extern char __syscall_stub_start[];
>
>  static int stub_exe_fd;
>
> +#ifndef CLOSE_RANGE_CLOEXEC
> +#define CLOSE_RANGE_CLOEXEC	(1U << 2)
> +#endif
> +
>  static int userspace_tramp(void *stack)
>  {
>  	char *const argv[] = { "uml-userspace", NULL };
> @@ -202,8 +206,12 @@ static int userspace_tramp(void *stack)
>  	init_data.stub_data_fd = phys_mapping(uml_to_phys(stack), &offset);
>  	init_data.stub_data_offset = MMAP_OFFSET(offset);
>
> -	/* Set CLOEXEC on all FDs and then unset on all memory related FDs */
> -	close_range(0, ~0U, CLOSE_RANGE_CLOEXEC);
> +	/*
> +	 * Avoid leaking unneeded FDs to the stub by setting CLOEXEC on all FDs
> +	 * and then unsetting it on all memory related FDs.
> +	 * This is not strictly necessary from a safety perspective.
> +	 */
> +	syscall(__NR_close_range, 0, ~0U, CLOSE_RANGE_CLOEXEC);
>
>  	fcntl(init_data.stub_data_fd, F_SETFD, 0);
>  	for (iomem = iomem_regions; iomem; iomem = iomem->next)
> @@ -224,7 +232,9 @@ static int userspace_tramp(void *stack)
>  	if (ret != sizeof(init_data))
>  		exit(4);
>
> -	execveat(stub_exe_fd, "", argv, NULL, AT_EMPTY_PATH);
> +	/* Raw execveat for compatibility with older libc versions */
> +	syscall(__NR_execveat, stub_exe_fd, (unsigned long)"",
> +		(unsigned long)argv, NULL, AT_EMPTY_PATH);
>
>  	exit(5);
>  }
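
For anyone who wants to check the close_range part outside of UML, here is a
minimal standalone sketch of the same raw-syscall approach. The helper names
cloexec_from() and fd_is_cloexec() are hypothetical and not part of the patch.
It assumes a Linux host, and that a kernel without close_range or without
CLOSE_RANGE_CLOEXEC support fails the call with ENOSYS or EINVAL, which the
caller may ignore just as the patch does.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Older libc headers may not define the flag; same fallback as the patch. */
#ifndef CLOSE_RANGE_CLOEXEC
#define CLOSE_RANGE_CLOEXEC	(1U << 2)
#endif

/*
 * Hypothetical helper: set close-on-exec on every FD >= first using the
 * raw syscall, so no libc wrapper for close_range is required.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int cloexec_from(unsigned int first)
{
#ifdef __NR_close_range
	return (int)syscall(__NR_close_range, (unsigned long)first,
			    (unsigned long)~0U,
			    (unsigned long)CLOSE_RANGE_CLOEXEC);
#else
	/* Assumption: headers too old to know the syscall number at all. */
	errno = ENOSYS;
	return -1;
#endif
}

/*
 * Hypothetical helper: returns 1 if fd has FD_CLOEXEC set, 0 if it does
 * not, and -1 if fd is not a valid descriptor.
 */
static int fd_is_cloexec(int fd)
{
	int flags = fcntl(fd, F_GETFD);

	if (flags < 0)
		return -1;
	return (flags & FD_CLOEXEC) ? 1 : 0;
}
```

A caller that needs some FDs to survive the exec would then clear the flag
again on exactly those, e.g. with fcntl(fd, F_SETFD, 0), as the patch does for
the memory related FDs.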