[PATCH 2/2] um: use proper care when taking mmap lock during segfault

2025-04-08 Thread Benjamin Berg
From: Benjamin Berg Segfaults can occur at times where the mmap lock cannot be taken. If that happens the segfault handler may not be able to take the mmap lock. Fix the code to use the same approach as most other architectures. Unfortunately, this requires copying code from mm/memory.c and

[PATCH 1/2] um: do not send SIGALRM to userspace in time-travel mode

2025-04-08 Thread Benjamin Berg
From: Benjamin Berg We send a SIGALRM to userspace processes to interrupt them. Really, doing so is only needed if they are actually executing at the time (to ensure we return to kernelspace). Unfortunately, we do not have that information readily available. We can however be sure that this is

Re: [PATCH 1/2] um: mark rodata read-only and implement _nofault accesses

2025-04-03 Thread Benjamin Berg
L) { show_regs(container_of(regs, struct pt_regs, regs)); panic("Segfault with no mm"); } On Wed, 2025-04-02 at 15:12 -0700, Nathan Chancellor wrote: > Hi Benjamin and Johannes, > > On Mon, Feb 10, 2025 at 05:09:25PM +0100, Benjamin Berg wrote:

Re: [PATCH v2 1/4] um: Add pthread-based helper support

2025-03-25 Thread Benjamin Berg
On Tue, 2025-03-18 at 22:55 +0800, Tiwei Bie wrote: > On 2025/3/18 21:16, Johannes Berg wrote: > > On Tue, 2025-03-18 at 14:06 +0100, Johannes Berg wrote: > > > On Thu, 2025-03-06 at 23:07 +0800, Tiwei Bie wrote: > > > > Introduce a new set of utility functions that can be used to create > > > > pt

[PATCH] um: work around sched_yield not yielding in time-travel mode

2025-03-15 Thread Benjamin Berg
From: Benjamin Berg sched_yield by userspace may not actually cause scheduling in time-travel mode as no time has passed. In the case seen it appears to be a badly implemented userspace spinlock in ASAN. Unfortunately, with time-travel it causes an extreme slowdown or even deadlock depending

Re: [PATCH 7/9] um: Implement kernel side of SECCOMP based process handling

2025-03-07 Thread Benjamin Berg
Hi, On Fri, 2025-03-07 at 16:04 +0900, Hajime Tazaki wrote: > thanks for the update; was waiting for this. > > On Tue, 25 Feb 2025 03:18:25 +0900, > Benjamin Berg wrote: > > > > This adds the kernel side of the seccomp based process handling. > > > > Co-au

Re: [systemd-devel] Limit resources of a group of users

2025-03-02 Thread Benjamin Berg
Hi, On Fri, 2025-02-28 at 13:38 -0800, Seva Epsteyn wrote: > I am trying to find a way to limit the combined resources of some, > but not all, users. For example all non root users should be limited > to 90% of memory. > > I can drop in config via user.slice.d which limits all users > combined, o

Re: [PATCH 3/3] x86: avoid copying dynamic FP state from init_task

2025-02-26 Thread Benjamin Berg
On Wed, 2025-02-26 at 14:08 +0100, Ingo Molnar wrote: > > * Benjamin Berg wrote: > > > From: Benjamin Berg > > > > The init_task instance of struct task_struct is statically allocated and > > may not contain the full FP state for userspace. As such, limit

Re: [PATCH v7 5/7] mseal, system mappings: enable uml architecture

2025-02-25 Thread Benjamin Berg
't really any cost to enabling the feature. That said, the only possible real-life use case I can see is doing MM subsystem testing using UML. We certainly do not need the feature to run our UML based wireless stack and driver tests. Benjamin > > > > > Benjamin > > >

[PATCH 4/9] um: Add helper functions to get/set state for SECCOMP

2025-02-24 Thread Benjamin Berg
: Benjamin Berg Signed-off-by: Benjamin Berg --- RFCv2: - Proper FP register handling --- arch/x86/um/os-Linux/mcontext.c | 220 ++- arch/x86/um/ptrace.c | 76 ++--- arch/x86/um/shared/sysdep/mcontext.h | 9 ++ 3 files changed, 285 insertions

[PATCH 3/9] um: Add stub side of SECCOMP/futex based process handling

2025-02-24 Thread Benjamin Berg
syscall. Co-authored-by: Johannes Berg Signed-off-by: Benjamin Berg Signed-off-by: Benjamin Berg --- v1: - Cleanup futex EINTR/EAGAIN handling RFCv2: - Add include guards into new architecture specific header file --- arch/um/include/shared/common-offsets.h | 2 + arch/um/include/shared/skas

[PATCH 9/9] um: Add UML_SECCOMP configuration option

2025-02-24 Thread Benjamin Berg
Add the UML_SECCOMP configuration options. Signed-off-by: Benjamin Berg --- v1: - Move to the end RFCv2: - Remove "default n" --- arch/um/Kconfig | 19 +++ 1 file changed, 19 insertions(+) diff --git a/arch/um/Kconfig b/arch/um/Kconfig index 18051b1cfce0..11ed44225

[PATCH 6/9] um: Track userspace children dying in SECCOMP mode

2025-02-24 Thread Benjamin Berg
userspace process. This should be safe and SECCOMP requires the IRQ in case the process does not come up properly. Signed-off-by: Benjamin Berg Signed-off-by: Benjamin Berg --- v1: - Permit IRQs during startup to enable detection there RFCv2: - Use "struct list_head" for the list by placi

[PATCH 7/9] um: Implement kernel side of SECCOMP based process handling

2025-02-24 Thread Benjamin Berg
This adds the kernel side of the seccomp based process handling. Co-authored-by: Johannes Berg Signed-off-by: Benjamin Berg Signed-off-by: Benjamin Berg --- v1: - Fix FUTEX_WAIT EINTR handling - Don't send fatal_sigsegv when waiting during child startup --- arch/um/include/shared/c

[PATCH 8/9] um: pass FD for memory operations when needed

2025-02-24 Thread Benjamin Berg
From: Benjamin Berg Instead of always sharing the FDs with the userspace process, only hand over the FDs needed for mmap when required. The idea is that userspace might be able to force the stub into executing an mmap syscall, however, it will not be able to manipulate the control flow

[PATCH 1/9] um: Store full CSGSFS and SS register from mcontext

2025-02-24 Thread Benjamin Berg
perfectly fine for ptrace. Signed-off-by: Benjamin Berg --- arch/x86/um/os-Linux/mcontext.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/x86/um/os-Linux/mcontext.c b/arch/x86/um/os-Linux/mcontext.c index e80ab7d28117..1b0d95328b2c 100644 --- a/arch/x86/um/os-Linux

[PATCH 5/9] um: Add SECCOMP support detection and initialization

2025-02-24 Thread Benjamin Berg
This detects seccomp support, sets the global using_seccomp variable and initializes the exec registers. Signed-off-by: Benjamin Berg Signed-off-by: Benjamin Berg --- arch/um/include/shared/skas/skas.h | 5 + arch/um/os-Linux/registers.c | 4 +- arch/um/os-Linux/skas/process.c

[PATCH 2/9] um: Move faultinfo extraction into userspace routine

2025-02-24 Thread Benjamin Berg
, but I do not know why this difference exists. And, passing NULL can even result in a crash. Signed-off-by: Benjamin Berg --- arch/um/os-Linux/skas/process.c | 17 ++--- 1 file changed, 6 insertions(+), 11 deletions(-) diff --git a/arch/um/os-Linux/skas/process.c b/arch/um/os-Linux

[PATCH 0/9] SECCOMP based userspace for UML

2025-02-24 Thread Benjamin Berg
From: Benjamin Berg Hi all, another version of the SECCOMP patchset. I think that this should now be good enough for general consumption. Compared to the last RFC version there is an important bugfix that caused a SIGSEGV loop and various other small bugfixes and cleanups. The patchset adds a

[PATCH] um: hostfs: avoid issues on inode number reuse by host

2025-02-14 Thread Benjamin Berg
From: Benjamin Berg Some file systems (e.g. ext4) may reuse inode numbers once the inode is not in use anymore. Usually hostfs will keep an FD open for each inode, but this is not always the case. In the case of sockets, this cannot even be done properly. As such, the following sequence of

[PATCH 1/2] um: mark rodata read-only and implement _nofault accesses

2025-02-10 Thread Benjamin Berg
&&label extension, but at least in one attempt I made the && caused the compiler to not load -EFAULT into the register in case of jumping to the &&label from the fault handler. So leave it like this for now. Co-developed-by: Benjamin Berg Signed-off-by: Johannes Berg Signed

[PATCH 0/2] Remove incorrect host mincore call and add rodata handling

2025-02-10 Thread Benjamin Berg
From: Benjamin Berg Hi, using mincore() to check whether a page is owned by UML is not correct as it returns whether the page is resident in memory and not whether something has been mapped at the address. This means that UML could get spurious failures in *_nofault functions like
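The difference between "resident" and "mapped" described here can be observed from plain userspace. The following is a minimal sketch, assuming a Linux host; `page_resident` is a hypothetical helper written for illustration and is not taken from the patch. `mincore()` fails (typically with ENOMEM) when nothing is mapped at the address, and for a mapped range it reports only whether each page is currently in memory.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* for mincore() and MAP_ANONYMOUS on glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only):
 *   returns  1 if the page at addr is resident in memory,
 *            0 if it is mapped but not resident,
 *           -1 if mincore() fails, e.g. because nothing is mapped there.
 * addr must be page aligned.
 */
static int page_resident(void *addr)
{
	unsigned char vec = 0;
	size_t page = (size_t)sysconf(_SC_PAGESIZE);

	if (mincore(addr, page, &vec) != 0)
		return -1;

	/* Only the least significant bit of each vector byte is defined. */
	return vec & 1;
}
```

A freshly mapped anonymous page that has never been touched will usually report 0 here even though it is a valid mapping, which is why a residency check is a poor stand-in for "is this address accessible".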

[PATCH 2/2] um: remove copy_from_kernel_nofault_allowed

2025-02-10 Thread Benjamin Berg
From: Benjamin Berg There is no need to override the default version of this function anymore as UML now has proper _nofault memory access functions. Doing this also fixes the fact that the implementation was incorrect as using mincore() will incorrectly flag pages as inaccessible if they were

Re: UML failing at "Failed to initialize default registers" on kernel 5.10

2025-01-20 Thread Benjamin Berg
On Mon, 2025-01-20 at 07:07 +0100, Thomas Meyer wrote: > > > Am 20. Januar 2025 00:25:35 MEZ schrieb Glenn Washburn > : > > Hi Benjamin, > > > > After applying the close_range patch, I'm now getting a failure at > > runtime where the last line printed from UML is "Failed to > > initialize > > de

Re: [PATCH v4 1/1] exec: seal system mappings

2025-01-16 Thread Benjamin Berg
Hi Lorenzo, On Thu, 2025-01-16 at 15:48 +, Lorenzo Stoakes wrote: > On Wed, Jan 15, 2025 at 12:20:59PM -0800, Jeff Xu wrote: > > On Wed, Jan 15, 2025 at 11:46 AM Lorenzo Stoakes > > wrote: > > [SNIP] > > > > > I've made it abundantly clear that this (NACKed) series cannot allow the > > > ke

Re: [PATCH v6 00/13] nommu UML

2025-01-15 Thread Benjamin Berg
Hi, On Wed, 2025-01-15 at 09:25 +0900, Hajime Tazaki wrote: > On Wed, 15 Jan 2025 03:53:36 +0900, > Benjamin Berg wrote: > > > > On Tue, 2025-01-14 at 20:30 +0900, Hajime Tazaki wrote: > > > This patchset is another spin of nommu mode addition to UML.  It doesn

Re: [PATCH v6 00/13] nommu UML

2025-01-14 Thread Benjamin Berg
Hi, On Tue, 2025-01-14 at 20:30 +0900, Hajime Tazaki wrote: > This patchset is another spin of nommu mode addition to UML.  It doesn't > change a lot since the last version (v5), but contain clean ups.  It would > be nice to hear about your opinions on that. > > There are still several limitation

[PATCH v2] um: fix execve stub execution on old host OSs

2025-01-13 Thread Benjamin Berg
From: Benjamin Berg The stub execution uses the somewhat new close_range and execveat syscalls. Of these two, the execveat call is essential, but the close_range call is more about stub process hygiene rather than safety (and its result is ignored). Replace both calls with a raw syscall as
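As a rough userspace illustration of calling close_range without relying on a libc wrapper or on new kernel headers, here is a minimal sketch. The helper name `close_range_raw` and the fallback syscall number are assumptions made for this example; the patch itself works inside the UML stub and may use different wrappers.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for syscall() on glibc */
#endif
#include <sys/syscall.h>
#include <unistd.h>

#ifndef __NR_close_range
/* Assumption: 436 is the number on x86 and in the generic syscall table. */
#define __NR_close_range 436
#endif

/*
 * Hypothetical helper (illustration only): close every FD in [first, last]
 * through a raw syscall. On hosts without close_range the call fails with
 * ENOSYS; a caller that only wants FD hygiene can ignore the result.
 */
static long close_range_raw(unsigned int first, unsigned int last)
{
	return syscall(__NR_close_range, first, last, 0U);
}
```

Because the syscall number is supplied at build time, this compiles against older headers, and the ENOSYS case on older host kernels is handled at run time.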

Re: [PATCH] um: fix execve stub execution on old host OSs

2025-01-12 Thread Benjamin Berg
Hi, On Sun, 2025-01-12 at 14:07 -0600, Glenn Washburn wrote: > On Fri, 10 Jan 2025 17:13:05 +0100 > Benjamin Berg wrote: > > > From: Benjamin Berg > > > > The stub execution uses the somewhat new close_range and execveat > > syscalls. Of these two, the ex

[PATCH] um: fix execve stub execution on old host OSs

2025-01-10 Thread Benjamin Berg
From: Benjamin Berg The stub execution uses the somewhat new close_range and execveat syscalls. Of these two, the execveat call is essential, but the close_range call is more about stub process hygiene rather than safety (and its result is ignored). Replace both calls with a raw syscall as

Re: close_range is not available on older systems, preventing the building of a UML kernel

2025-01-10 Thread Benjamin Berg
Hi, On Wed, 2025-01-08 at 12:13 +0100, Geert Uytterhoeven wrote: > Hi Benjamin, > > On Wed, Jan 8, 2025 at 11:58 AM Benjamin Berg > wrote: > > On Wed, 2025-01-08 at 02:24 -0600, Glenn Washburn wrote: > > > I'm wanting to build a UML kernel on Debian bullseye,

Re: close_range is not available on older systems, preventing the building of a UML kernel

2025-01-08 Thread Benjamin Berg
Hi, On Wed, 2025-01-08 at 02:24 -0600, Glenn Washburn wrote: > I'm wanting to build a UML kernel on Debian bullseye, which is at > kernel version 5.10, which does not support close_range(). It appears > as though close_range() support is required on the host since commit > 32e8eaf263d ("um: use ex

[PATCH] um: properly align signal stack on x86_64

2025-01-07 Thread Benjamin Berg
From: Benjamin Berg The stack needs to be properly aligned so 16 byte memory accesses on the stack are correct. This was broken when introducing the dynamic math register sizing as the rounding was not moved appropriately. Fixes: 3f17fed21491 ("um: switch to regset API and depend on X
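The ordering issue mentioned here (rounding has to happen after space for a variably sized register area is reserved) can be sketched with plain integer arithmetic. The helper names below are hypothetical, and the sketch deliberately ignores any additional ABI-specific offsets that the real signal frame setup applies.

```c
#include <stdint.h>

/* Round a stack pointer down to a 16-byte boundary (the stack grows down). */
static uintptr_t align_down_16(uintptr_t sp)
{
	return sp & ~(uintptr_t)15;
}

/*
 * Hypothetical helper (illustration only): reserve fp_size bytes below sp
 * for a dynamically sized register area. Aligning after the subtraction
 * keeps the result aligned even when fp_size is not a multiple of 16;
 * aligning before it would not.
 */
static uintptr_t reserve_aligned(uintptr_t sp, uintptr_t fp_size)
{
	return align_down_16(sp - fp_size);
}
```

For example, reserving 0x345 bytes below 0x2000 gives 0x1cbb before rounding and 0x1cb0 after it.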

[PATCH] um: rtc: use RTC time when calculating the alarm

2024-12-17 Thread Benjamin Berg
From: Benjamin Berg The kernel realtime and the current RTC time may have a (small) offset. Should the kernel time be slightly in the future, then the timeout is zero. This is problematic in time-travel mode, as a zero timeout can be correctly configured and time never advances. Replace the

[PATCH 1/3] vmlinux.lds.h: remove entry to place init_task onto init_stack

2024-12-17 Thread Benjamin Berg
From: Benjamin Berg Since commit 0eb5085c3874 ("arch: remove ARCH_TASK_STRUCT_ON_STACK") there is no option that would allow placing task_struct on the stack. Remove the unused linker script entry. Signed-off-by: Benjamin Berg --- include/asm-generic/vmlinux.lds.h | 1 - 1 file

[PATCH 2/3] um: avoid copying FP state from init_task

2024-12-17 Thread Benjamin Berg
From: Benjamin Berg The init_task instance of struct task_struct is statically allocated and does not contain the dynamic area for the userspace FP registers. As such, limit the copy to the valid area of init_task and fill the rest with zero. Note that the FP state is only needed for userspace

[PATCH 3/3] x86: avoid copying dynamic FP state from init_task

2024-12-17 Thread Benjamin Berg
From: Benjamin Berg The init_task instance of struct task_struct is statically allocated and may not contain the full FP state for userspace. As such, limit the copy to the valid area of init_task and fill the rest with zero. Note that the FP state is only needed for userspace, and as such it

[PATCH 0/3] KASAN fix for arch_dup_task_struct (x86, um)

2024-12-17 Thread Benjamin Berg
From: Benjamin Berg On the x86 and um architectures struct task_struct is dynamically sized depending on the size required to store the floating point registers. After adding this feature to UML it sometimes triggered KASAN errors as the memcpy in arch_dup_task_struct read past init_task. In my

Re: [PATCH v5] um: switch to regset API and depend on XSTATE

2024-12-14 Thread Benjamin Berg
Hi, On Sat, 2024-12-14 at 00:08 +0100, Benjamin Berg wrote: > Ouch. It is doing a memcpy of init_task. Now, struct task_struct is > variably sized, but init_task is statically allocated, which could > explain why the memcpy is not permitted to read the larger memory (for > the

Re: [PATCH v5] um: switch to regset API and depend on XSTATE

2024-12-13 Thread Benjamin Berg
;) may be part of a correct fix here. Benjamin On Fri, 2024-12-13 at 12:00 -0800, Brian Norris wrote: > Hi Benjamin, > > On Wed, Oct 23, 2024 at 11:41:20AM +0200, Benjamin Berg wrote: > > From: Benjamin Berg > > > > The PTRACE_GETREGSET API has now existed since Linux 2.6

Re: [PATCH v4 1/1] exec: seal system mappings

2024-12-04 Thread Benjamin Berg
Hi, On Wed, 2024-12-04 at 09:43 -0800, Jeff Xu wrote: > On Wed, Dec 4, 2024 at 6:04 AM Benjamin Berg > wrote: > > On Mon, 2024-11-25 at 20:20 +, jef...@chromium.org wrote: > > > From: Jeff Xu > > > > > > Seal vdso, vvar, sigpage, uprobes and vs

Re: [PATCH v4 1/1] exec: seal system mappings

2024-12-04 Thread Benjamin Berg
Hi, On Wed, 2024-12-04 at 09:43 -0800, Jeff Xu wrote: > On Wed, Dec 4, 2024 at 6:04 AM Benjamin Berg > wrote: > > On Mon, 2024-11-25 at 20:20 +, jef...@chromium.org wrote: > > > From: Jeff Xu > > > > > > Seal vdso, vvar, sigpage, uprobes and vs

Re: [PATCH v4 1/1] exec: seal system mappings

2024-12-04 Thread Benjamin Berg
Hi, On Mon, 2024-11-25 at 20:20 +, jef...@chromium.org wrote: > From: Jeff Xu > > Seal vdso, vvar, sigpage, uprobes and vsyscall. > > Those mappings are readonly or executable only, sealing can protect > them from ever changing or unmapped during the life time of the process. > For complete

Re: [PATCH v4 1/1] exec: seal system mappings

2024-12-04 Thread Benjamin Berg
Hi, On Mon, 2024-11-25 at 20:20 +, jef...@chromium.org wrote: > From: Jeff Xu > > Seal vdso, vvar, sigpage, uprobes and vsyscall. > > Those mappings are readonly or executable only, sealing can protect > them from ever changing or unmapped during the life time of the process. > For complete

[PATCH] um: add back support for FXSAVE registers

2024-12-03 Thread Benjamin Berg
From: Benjamin Berg It was reported that qemu may not enable the XSTATE CPU extension, which is a requirement after commit 3f17fed21491 ("um: switch to regset API and depend on XSTATE"). Add a fallback to use FXSAVE (FP registers on x86_64 and XFP on i386) which is just a shorter vers

Re: [PATCH v5] um: switch to regset API and depend on XSTATE

2024-12-03 Thread Benjamin Berg
On Tue, 2024-12-03 at 07:56 -0800, SeongJae Park wrote: > On Tue, 03 Dec 2024 07:01:09 SeongJae Park wrote: > > > On Tue, 03 Dec 2024 09:40:34 +0100 Benjamin Berg > > wrote: > > > > > Hi, > > > > > > that probably means the size detection

Re: [PATCH v5] um: switch to regset API and depend on XSTATE

2024-12-03 Thread Benjamin Berg
: > Hello, > > > On Wed, 23 Oct 2024 11:41:20 +0200 Benjamin Berg > wrote: > > > From: Benjamin Berg > > > > The PTRACE_GETREGSET API has now existed since Linux 2.6.33. The XSAVE > > CPU feature should also be sufficiently common to be able to rely on it.

Re: [RFC PATCH v2 10/13] x86/um: nommu: signal handling

2024-11-28 Thread Benjamin Berg
Hi, On Mon, 2024-11-11 at 15:27 +0900, Hajime Tazaki wrote: > This commit updates the behavior of signal handling under !MMU > environment. 1) the stack preparation for the signal handlers and > 2) restoration of stack after rt_sigreturn(2) syscall.  Those are needed > as the stack usage on vfork(

Re: [RFC PATCH v2 09/13] x86/um/vdso: nommu: vdso memory update

2024-11-27 Thread Benjamin Berg
Hi, On Mon, 2024-11-11 at 15:27 +0900, Hajime Tazaki wrote: > On !MMU mode, the address of vdso is accessible from userspace.  This > commit implements the entry point by pointing a block of page address. > > This commit also add memory permission configuration of vdso page to be > executable. >

Re: [RFC PATCH v2 08/13] um: nommu: configure fs register on host syscall invocation

2024-11-27 Thread Benjamin Berg
Hi, On Mon, 2024-11-11 at 15:27 +0900, Hajime Tazaki wrote: > As userspace on UML/!MMU also need to configure %fs register when it is > running to correctly access thread structure, host syscalls implemented > in os-Linux drivers may be puzzled when they are called.  Thus it has to > configure %fs

[PATCH v3] um: move thread info into task

2024-11-11 Thread Benjamin Berg
From: Benjamin Berg This selects the THREAD_INFO_IN_TASK option for UM and changes the way that the current task is discovered. This is trivial though, as UML already tracks the current task in cpu_tasks[] and this can be used to retrieve it. Also remove the signal handler code that copies the

[PATCH] um: move thread info into task

2024-11-08 Thread Benjamin Berg
From: Benjamin Berg This selects the THREAD_INFO_IN_TASK option for UM and changes the way that the current task is discovered. This is trivial though, as UML already tracks the current task in cpu_tasks[] and this can be used to retrieve it. Also remove the signal handler code that copies the

Re: UML mount failure with Linux 6.11

2024-11-06 Thread Benjamin Berg
bably need to pass it differently for older kernels. Benjamin On Wed, 2024-11-06 at 17:22 +0530, Ritesh Raj Sarraf wrote: > Hello Benjamin, > > On Thu, 2024-10-31 at 11:07 +0100, Benjamin Berg wrote: > > Hi, > > > > Newer kernels have become more picky about that wi

[PATCH 4/4] um: virtio_uml: query the number of vqs if supported

2024-11-03 Thread Benjamin Berg
From: Benjamin Berg When the VHOST_USER_PROTOCOL_F_MQ protocol feature flag is set, we can query the maximum number of virtual queues. Do so when supported and extend the check to verify that we are not trying to allocate more queues. Signed-off-by: Benjamin Berg --- arch/um/drivers

[PATCH 3/4] um: virtio_uml: fix call_fd IRQ allocation

2024-11-03 Thread Benjamin Berg
From: Benjamin Berg If the device does not support slave requests, then the IRQ will not yet be allocated. So initialize the IRQ to UM_IRQ_ALLOC so that it will be allocated if none has been assigned yet and store it slightly later when we know that it will not be immediately unregistered again

[PATCH 0/4] Enable virtio-fs and virtio-snd in UML

2024-11-03 Thread Benjamin Berg
From: Benjamin Berg With these changes, both virtio-fs and virtio-snd seem to work fine. The feature to query the number of vqs is not really needed, but I already had the code. You can for example boot a machine that has both by executing something like the following: $ virtiofsd --sandbox

[PATCH 2/4] um: virtio_uml: use smaller virtqueue sizes for VIRTIO_ID_SOUND

2024-11-03 Thread Benjamin Berg
From: Benjamin Berg It appears that the different vhost device implementations use different sizes of the virtual queues. Add device specific limitations (for now, only for sound), to ensure that we do not get disconnected unexpectedly. Signed-off-by: Benjamin Berg --- arch/um/drivers

[PATCH 1/4] um: virtio_uml: send SET_MEM_TABLE message with the exact size

2024-11-03 Thread Benjamin Berg
From: Benjamin Berg The rust based userspace vhost devices are very strict and will not accept the message if it is longer than required. So, only include the data for the first memory region. Signed-off-by: Benjamin Berg --- arch/um/drivers/virtio_uml.c | 2 +- 1 file changed, 1 insertion

[PATCH 1/5] um: set DONTDUMP and DONTFORK flags on KASAN shadow memory

2024-11-03 Thread Benjamin Berg
From: Benjamin Berg There is no point in either dumping the KASAN shadow memory or doing copy-on-write after a fork on these memory regions. This considerably speeds up coredump generation. --- arch/um/os-Linux/mem.c | 12 1 file changed, 12 insertions(+) diff --git a/arch/um/os

[PATCH 2/5] um: always include kconfig.h and compiler-version.h

2024-11-03 Thread Benjamin Berg
From: Benjamin Berg Since commit a95b37e20db9 ("kbuild: get <linux/compiler_types.h> out of <linux/kconfig.h>") we can safely include these files in userspace code. Doing so simplifies matters as options do not need to be exported via asm-offsets.h anymore. Signed-off-by: Benjamin Berg --- arch/u

[PATCH v2 5/5] um: remove broken double fault detection

2024-11-03 Thread Benjamin Berg
From: Benjamin Berg The show_stack function had some code to detect double faults. However, the logic is wrong and it would e.g. trigger if a WARNING happened inside an IRQ. Remove it without trying to add a new logic. The current behaviour, which will just fault repeatedly until the IRQ stack

[PATCH 5/5] um: remove broken double fault detection

2024-11-03 Thread Benjamin Berg
From: Benjamin Berg The show_stack function had some code to detect double faults. However, the logic is wrong and it would e.g. trigger if a WARNING happened inside an IRQ. Remove it without trying to add a new logic. The current behaviour, which will just fault repeatedly until the IRQ stack

[PATCH v2 1/5] um: set DONTDUMP and DONTFORK flags on KASAN shadow memory

2024-11-03 Thread Benjamin Berg
From: Benjamin Berg There is no point in either dumping the KASAN shadow memory or doing copy-on-write after a fork on these memory regions. This considerably speeds up coredump generation. Signed-off-by: Benjamin Berg --- arch/um/os-Linux/mem.c | 12 1 file changed, 12

[PATCH 3/5] um: remove file sync for stub data

2024-11-03 Thread Benjamin Berg
From: Benjamin Berg There is no need to sync the stub code to "disk" for the other process to see the correct memory. Drop the fsync there and remove the helper function. --- arch/um/include/shared/os.h | 1 - arch/um/kernel/physmem.c| 1 - arch/um/os-Linux/file.c | 6 -

[PATCH v2 3/5] um: remove file sync for stub data

2024-11-03 Thread Benjamin Berg
From: Benjamin Berg There is no need to sync the stub code to "disk" for the other process to see the correct memory. Drop the fsync there and remove the helper function. Signed-off-by: Benjamin Berg --- arch/um/include/shared/os.h | 1 - arch/um/kernel/physmem.c| 1 - arch/u

[PATCH] um: move thread info into task

2024-11-03 Thread Benjamin Berg
From: Benjamin Berg This selects the THREAD_INFO_IN_TASK option for UM and changes the way that the current task is discovered. This is trivial though, as UML already tracks the current task in cpu_tasks[] and this can be used to retrieve it. Also remove the signal handler code that copies the

[PATCH v2 2/5] um: always include kconfig.h and compiler-version.h

2024-11-03 Thread Benjamin Berg
From: Benjamin Berg Since commit a95b37e20db9 ("kbuild: get <linux/compiler_types.h> out of <linux/kconfig.h>") we can safely include these files in userspace code. Doing so simplifies matters as options do not need to be exported via asm-offsets.h anymore. Signed-off-by: Benjamin Berg --- arch/u

[PATCH v2 4/5] um: remove duplicate UM_NSEC_PER_SEC definition

2024-11-03 Thread Benjamin Berg
From: Benjamin Berg Just remove the first entry as there is a second later on. Signed-off-by: Benjamin Berg --- arch/um/include/shared/common-offsets.h | 1 - 1 file changed, 1 deletion(-) diff --git a/arch/um/include/shared/common-offsets.h b/arch/um/include/shared/common-offsets.h index

[PATCH 4/5] um: remove duplicate UM_NSEC_PER_SEC definition

2024-11-03 Thread Benjamin Berg
From: Benjamin Berg Just remove the first entry as there is a second later on. --- arch/um/include/shared/common-offsets.h | 1 - 1 file changed, 1 deletion(-) diff --git a/arch/um/include/shared/common-offsets.h b/arch/um/include/shared/common-offsets.h index 1d00fc6b6e92..73f3a4792ed8

[PATCH 2/2] um: fix sparse warnings in signal code

2024-10-31 Thread Benjamin Berg
From: Benjamin Berg sparse reports that various places were missing the __user tag in casts. In addition, one location was using 0 instead of NULL. Signed-off-by: Benjamin Berg --- arch/x86/um/signal.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/arch/x86

[PATCH 1/2] um: fix sparse warnings from regset refactor

2024-10-31 Thread Benjamin Berg
From: Benjamin Berg Some variables were not tagged with __user and another was not marked as static even though it should be. Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202410280655.golefwdg-...@intel.com/ Closes: https://lore.kernel.org/oe-kbuild-all

Re: UML mount failure with Linux 6.11

2024-10-31 Thread Benjamin Berg
Hi, Newer kernels have become more picky about that with the new mount API. This is relevant, see the discussion about "Unknown options": https://lwn.net/Articles/979166/ We only use hostfs for the root file system and in that case it works well if you pass the path using "hostfs=/path" on the

Re: [RFC PATCH 00/13] nommu UML

2024-10-28 Thread Benjamin Berg
Hello Hajime, On Sun, 2024-10-27 at 18:10 +0900, Hajime Tazaki wrote: > thank you for your time looking at this. > > On Sat, 26 Oct 2024 19:19:08 +0900, > Benjamin Berg wrote: > > > > - a crash on userspace programs crashes a UML kernel, not signaling > > &

Re: [RFC PATCH 8/9] um: Implement kernel side of SECCOMP based process handling

2024-10-26 Thread Benjamin Berg
Hi, On Thu, 2024-10-10 at 14:12 +0200, Johannes Berg wrote: > > +++ b/arch/um/os-Linux/skas/process.c > > @@ -1,9 +1,11 @@ > >   // SPDX-License-Identifier: GPL-2.0 > >   /* > > + * Copyright (C) 2021 Benjamin Berg > >    * Copyright (C) 2015 Thomas Meyer (th

Re: [RFC PATCH v2 9/9] um: pass FD for memory operations when needed

2024-10-26 Thread Benjamin Berg
Hi, On Thu, 2024-10-24 at 21:52 +0800, Tiwei Bie wrote: > On 2024/10/23 22:08, Benjamin Berg wrote: > [...] > > > It looks the memcpy could trigger a crash when UML_SECCOMP is > enabled: > > [...] > > It can be fixed with changes like below on my machine: >

Re: [RFC PATCH 00/13] nommu UML

2024-10-26 Thread Benjamin Berg
Hi, On Thu, 2024-10-24 at 21:09 +0900, Hajime Tazaki wrote: > This is a series of patches of nommu arch addition to UML.  It would > be nice to ask comments/opinions on this. > > There are several limitations/issues which we already found; here is > the list of those issues. > > - prompt configu

[RFC PATCH v2 9/9] um: pass FD for memory operations when needed

2024-10-23 Thread Benjamin Berg
From: Benjamin Berg Instead of always sharing the FDs with the userspace process, only hand over the FDs needed for mmap when required. The idea is that userspace might be able to force the stub into executing an mmap syscall, however, it will not be able to manipulate the control flow

[RFC PATCH v2 8/9] um: Implement kernel side of SECCOMP based process handling

2024-10-23 Thread Benjamin Berg
This adds the kernel side of the seccomp based process handling. Co-authored-by: Johannes Berg Signed-off-by: Benjamin Berg Signed-off-by: Benjamin Berg --- arch/um/include/shared/common-offsets.h| 2 + arch/um/include/shared/os.h| 2 +- arch/um/include/shared/skas

[RFC PATCH v2 6/9] um: Add SECCOMP support detection and initialization

2024-10-23 Thread Benjamin Berg
This detects seccomp support, sets the global using_seccomp variable and initializes the exec registers. For now, the implementation simply falls through to the ptrace startup code, meaning that it is unused. Signed-off-by: Benjamin Berg Signed-off-by: Benjamin Berg --- arch/um/include/shared

[RFC PATCH v2 7/9] um: Track userspace children dying in SECCOMP mode

2024-10-23 Thread Benjamin Berg
the IRQ handler, find the affected MM and set its PID to -1 as well as the futex variable to FUTEX_IN_KERN. This, together with futex returning -EINTR after the signal is sufficient to implement a race-free detection of a child dying. Signed-off-by: Benjamin Berg Signed-off-by: Benjamin Berg

[RFC PATCH v2 2/9] um: Move faultinfo extraction into userspace routine

2024-10-23 Thread Benjamin Berg
, but I do not know why this difference exists. And, passing NULL can even result in a crash. Signed-off-by: Benjamin Berg --- arch/um/os-Linux/skas/process.c | 17 ++--- 1 file changed, 6 insertions(+), 11 deletions(-) diff --git a/arch/um/os-Linux/skas/process.c b/arch/um/os-Linux

[RFC PATCH v2 4/9] um: Add stub side of SECCOMP/futex based process handling

2024-10-23 Thread Benjamin Berg
This adds the stub side for the new seccomp process management code. In this case we do register save/restore through the signal handler mcontext. For the FS_BASE/GS_BASE register we need special handling. Co-authored-by: Johannes Berg Signed-off-by: Benjamin Berg Signed-off-by: Benjamin Berg

[RFC PATCH v2 1/9] um: Store full CSGSFS and SS register from mcontext

2024-10-23 Thread Benjamin Berg
perfectly fine for ptrace. Signed-off-by: Benjamin Berg --- arch/x86/um/os-Linux/mcontext.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/x86/um/os-Linux/mcontext.c b/arch/x86/um/os-Linux/mcontext.c index e80ab7d28117..1b0d95328b2c 100644 --- a/arch/x86/um/os-Linux

[RFC PATCH v2 0/9] SECCOMP based userspace for UML

2024-10-23 Thread Benjamin Berg
From: Benjamin Berg Hi all, here is an updated version of the SECCOMP patchset. The main improvement to the previous RFC version is that now FP registers will work correctly on 32 bit. I hope it is in a relatively good state overall, but I expect we will not merge this into 6.13. The patchset

[RFC PATCH v2 5/9] um: Add helper functions to get/set state for SECCOMP

2024-10-23 Thread Benjamin Berg
: Benjamin Berg Signed-off-by: Benjamin Berg --- RFCv2: - Proper FP register handling --- arch/x86/um/os-Linux/mcontext.c | 220 ++- arch/x86/um/ptrace.c | 76 ++--- arch/x86/um/shared/sysdep/mcontext.h | 10 ++ 3 files changed, 286 insertions

[RFC PATCH v2 3/9] um: Add UML_SECCOMP configuration option

2024-10-23 Thread Benjamin Berg
Add the UML_SECCOMP configuration options. The next commits will add the support itself in smaller chunks. Only x86_64 will be supported for now. Signed-off-by: Benjamin Berg --- RFCv2: - Remove "default n" --- arch/um/Kconfig | 19 +++ 1 file changed, 19 insertion

[PATCH v5] um: switch to regset API and depend on XSTATE

2024-10-23 Thread Benjamin Berg
From: Benjamin Berg The PTRACE_GETREGSET API has now existed since Linux 2.6.33. The XSAVE CPU feature should also be sufficiently common to be able to rely on it. With this, define our internal FP state to be the host's XSAVE data. Add discovery for the host's XSAVE size and place the FP

[PATCH v4] um: switch to regset API and depend on XSTATE

2024-10-23 Thread Benjamin Berg
From: Benjamin Berg The PTRACE_GETREGSET API has now existed since Linux 2.6.33. The XSAVE CPU feature should also be sufficiently common to be able to rely on it. With this, define our internal FP state to be the host's XSAVE data. Add discovery for the host's XSAVE size and place the FP

[PATCH v3] um: switch to regset API and depend on XSTATE

2024-10-23 Thread Benjamin Berg
From: Benjamin Berg The PTRACE_GETREGSET API has now existed since Linux 2.6.33. The XSAVE CPU feature should also be sufficiently common to be able to rely on it. With this, define our internal FP state to be the host's XSAVE data. Add discovery for the host's XSAVE size and place the FP

Re: [PATCH] um: restore process name

2024-10-23 Thread Benjamin Berg
Hi, I just noticed that this is not completely correct. readlink() does not append a NUL byte, so you'll probably want to make the buffer one byte longer and either initialize it or set buf[ret] = '\0' (after the truncation check). Benjamin On Thu, 2024-10-10 at 16:14 +0200, Johannes Berg wrote
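
The readlink() point in the reply above can be illustrated with a minimal sketch. It is not the code from the patch under review: the helper name read_link_string and its return convention are assumptions made only for this example. It reserves one byte for the terminator, rejects a possibly truncated result, and only then writes the terminator at buf[ret].

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): read the target of the
 * symlink at `path` into `buf` as a terminated C string.
 * Returns 0 on success, -1 on error or possible truncation.
 */
static int read_link_string(const char *path, char *buf, size_t size)
{
	ssize_t ret;

	if (size < 2)
		return -1;

	/* Reserve the last byte of the buffer for the terminator. */
	ret = readlink(path, buf, size - 1);
	if (ret < 0)
		return -1;

	/*
	 * readlink() truncates silently, so a completely filled
	 * buffer may hold only part of the target.
	 */
	if ((size_t)ret == size - 1)
		return -1;

	/* readlink() does not terminate the string itself. */
	buf[ret] = '\0';
	return 0;
}
```

Treating a completely filled buffer as an error is a conservative choice: it can reject a target that happens to fit exactly, but it never returns a silently shortened name.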

Re: [PATCH v9 02/10] um: use execveat to create userspace MMs

2024-10-17 Thread Benjamin Berg
Hi, On Thu, 2024-10-17 at 10:18 +0200, Johannes Berg wrote: > [SNIP] > > I wonder now if the SSE instructions generated are memset() and that > goes away with the patches that Nathan just sent to not have the memset > (which was due to -ftrivial-auto-var-init) in the first place? I am guessing i

Re: [PATCH v9 02/10] um: use execveat to create userspace MMs

2024-10-17 Thread Benjamin Berg
Hi, On Thu, 2024-10-17 at 15:17 +0800, David Gow wrote: > On Thu, 19 Sept 2024 at 20:45, Benjamin Berg > wrote: > > > > [SNIP] > > It turns out that this breaks the KUnit user alloc helpers on x86_64, > at least on my machine. > > This can be reproduced with:

Re: [PATCH] um: Abandon the _PAGE_NEWPROT bit

2024-10-11 Thread Benjamin Berg
t might make it more clear how everything ties together. Anyway, the change looks good to me. Benjamin Reviewed-by: Benjamin Berg > Signed-off-by: Tiwei Bie > --- >  arch/um/include/asm/pgtable.h   | 40 --- >  arch/um/include/shared/os.h |  2 - >

[PATCH v2] um: insert scheduler ticks when userspace does not yield

2024-10-10 Thread Benjamin Berg
From: Benjamin Berg In time-travel mode userspace can do a lot of work without any time passing. Unfortunately, this can result in OOM situations as the RCU core code will never be run. Work around this by keeping track of userspace processes that do not yield for a lot of operations. When this

Re: [RFC PATCH 4/9] um: Add stub side of SECCOMP/futex based process handling

2024-10-10 Thread Benjamin Berg
On Thu, 2024-10-10 at 13:51 +0200, Johannes Berg wrote: > On Wed, 2024-09-25 at 22:32 +0200, Benjamin Berg wrote: > > > > --- /dev/null > > +++ b/arch/x86/um/shared/sysdep/stub-data.h > > @@ -0,0 +1,18 @@ > > +/* SPDX-License-Identifier: GPL-2.0 */ > > Th

Re: [RFC PATCH 8/9] um: Implement kernel side of SECCOMP based process handling

2024-10-10 Thread Benjamin Berg
On Thu, 2024-10-10 at 14:12 +0200, Johannes Berg wrote: > On Wed, 2024-09-25 at 22:32 +0200, Benjamin Berg wrote: > > > > + /* > > +* If in seccomp mode, install the SECCOMP filter and > > trigger a syscall. > > +* Otherwise set PTRACE_TRACEME and do

[PATCH v2] um: switch to regset API and depend on XSTATE

2024-10-10 Thread Benjamin Berg
From: Benjamin Berg The PTRACE_GETREGSET API has now existed since Linux 2.6.33. The XSAVE CPU feature should also be sufficiently common to be able to rely on it. With this, define our internal FP state to be the host's XSAVE data. Add discovery for the host's XSAVE size and place the FP

[PATCH] um: switch to regset API and depend on XSTATE

2024-10-07 Thread Benjamin Berg
From: Benjamin Berg The PTRACE_GETREGSET API has now existed since Linux 2.6.33. The XSAVE CPU feature should also be sufficiently common to be able to rely on it. With this, define our internal FP state to be the host's XSAVE data. Add discovery for the host's XSAVE size and place the FP

[RFC PATCH] um: switch to regset API and depend on XSTATE

2024-10-04 Thread Benjamin Berg
From: Benjamin Berg The PTRACE_GETREGSET API has now existed since Linux 2.6.33. The XSAVE CPU feature should also be sufficiently common to be able to rely on it. With this, define our internal FP state to be the host's XSAVE data. Add discovery for the host's XSAVE register size and place the

[PATCH] um: remove auxiliary FP registers

2024-10-04 Thread Benjamin Berg
From: Benjamin Berg We do not need the extra save/restore of the FP registers when getting the fault information. This was originally added in commit 2f56debd77a8 ("uml: fix FP register corruption") but at that time the code was not saving/restoring the FP registers when switching to
