[PATCH v2 0/6] remove tlb_remove_page_ptdesc()

2025-02-24 Thread Qi Zheng
Changes in v2: - add [PATCH v2 2/6] (Peter Zijlstra) - remove [PATCH 4/5] and add [PATCH v2 5/6] - rebase onto the next-20250224 Hi all, As suggested by Peter Zijlstra below [1], this series aims to remove tlb_remove_page_ptdesc(). : Fundamentally tlb_remove_page() is about removing *pages

[PATCH v2 3/6] mm: pgtable: convert some architectures to use tlb_remove_ptdesc()

2025-02-24 Thread Qi Zheng
Now, the nine architectures of csky, hexagon, loongarch, m68k, mips, nios2, openrisc, sh and um do not select CONFIG_MMU_GATHER_RCU_TABLE_FREE, and just call pagetable_dtor() + tlb_remove_page_ptdesc() (the wrapper of tlb_remove_page()). This is the same as the implementation of tlb_remove_{ptdesc|

[PATCH v2 1/6] mm: pgtable: make generic tlb_remove_table() use struct ptdesc

2025-02-24 Thread Qi Zheng
Now only arm will call tlb_remove_ptdesc()/tlb_remove_table() when CONFIG_MMU_GATHER_TABLE_FREE is disabled. In this case, the type of the table parameter is actually struct ptdesc * instead of struct page *. Since struct ptdesc still overlaps with struct page and has not been separated from it, f

[PATCH v2 2/6] mm: pgtable: change pt parameter of tlb_remove_ptdesc() to struct ptdesc *

2025-02-24 Thread Qi Zheng
All callers of tlb_remove_ptdesc() pass it a pointer to struct ptdesc, so let's change the pt parameter from void * to struct ptdesc * to perform a type safety check. Signed-off-by: Qi Zheng Originally-by: Peter Zijlstra (Intel) --- include/asm-generic/tlb.h | 2 +- 1 file changed, 1 insertion(
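
As a rough illustration of the type-safety argument in this patch (not the actual kernel code), the sketch below contrasts a void * parameter with a struct ptdesc * parameter. The two struct definitions and both function names are hypothetical stand-ins invented for this example; the real struct ptdesc and tlb_remove_ptdesc() live in the kernel headers and look different.

```c
#include <assert.h>

/* Hypothetical, heavily simplified stand-ins for the kernel types. */
struct page   { unsigned long flags; };
struct ptdesc { unsigned long pt_flags; int freed; };

/*
 * Old shape (assumed for illustration): a void * parameter accepts any
 * object pointer, so passing a struct page * by mistake compiles silently.
 */
static inline void demo_remove_untyped(void *pt)
{
	(void)pt;
}

/*
 * New shape (assumed for illustration): with struct ptdesc *, passing a
 * struct page * draws an incompatible-pointer-type diagnostic.
 */
static inline int demo_remove_typed(struct ptdesc *pt)
{
	pt->freed = 1;	/* mark the descriptor as handed off */
	return pt->freed;
}

/* Small self-check exercising both variants. */
static inline void demo_selfcheck(void)
{
	struct page pg = { 0 };
	struct ptdesc pt = { 0, 0 };

	demo_remove_untyped(&pg);	/* accepted, although it is the wrong type */
	assert(demo_remove_typed(&pt) == 1);
	/* demo_remove_typed(&pg) would be flagged by the compiler. */
}
```

The point of the sketch is only that the compiler, rather than a reviewer, catches a caller that passes the wrong descriptor type.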

[PATCH v2 5/6] x86: pgtable: convert to use tlb_remove_ptdesc()

2025-02-24 Thread Qi Zheng
The x86 has already been converted to use struct ptdesc, so convert it to use tlb_remove_ptdesc() instead of tlb_remove_table(). Signed-off-by: Qi Zheng --- arch/x86/mm/pgtable.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtab

[PATCH v2 6/6] mm: pgtable: remove tlb_remove_page_ptdesc()

2025-02-24 Thread Qi Zheng
The tlb_remove_ptdesc()/tlb_remove_table() is specially designed for page table pages, and now all architectures have been converted to use it to remove page table pages. So let's remove tlb_remove_page_ptdesc(), it currently has no users and should not be used for page table pages. Signed-off-by:

[PATCH v2 4/6] riscv: pgtable: unconditionally use tlb_remove_ptdesc()

2025-02-24 Thread Qi Zheng
To support fast gup, the commit 69be3fb111e7 ("riscv: enable MMU_GATHER_RCU_TABLE_FREE for SMP && MMU") did the following: 1) use tlb_remove_page_ptdesc() for those platforms which use IPI to perform TLB shootdown 2) use tlb_remove_ptdesc() for those platforms which use SBI to perform TLB s

[PATCH 8/9] um: pass FD for memory operations when needed

2025-02-24 Thread Benjamin Berg
From: Benjamin Berg Instead of always sharing the FDs with the userspace process, only hand over the FDs needed for mmap when required. The idea is that userspace might be able to force the stub into executing an mmap syscall, however, it will not be able to manipulate the control flow sufficient

[PATCH 7/9] um: Implement kernel side of SECCOMP based process handling

2025-02-24 Thread Benjamin Berg
This adds the kernel side of the seccomp based process handling. Co-authored-by: Johannes Berg Signed-off-by: Benjamin Berg Signed-off-by: Benjamin Berg --- v1: - Fix FUTEX_WAIT EINTR handling - Don't send fatal_sigsegv when waiting during child startup --- arch/um/include/shared/common-offs

[PATCH 6/9] um: Track userspace children dying in SECCOMP mode

2025-02-24 Thread Benjamin Berg
When in seccomp mode, we would hang forever on the futex if a child has died unexpectedly. In contrast, ptrace mode will notice it and kill the corresponding thread when it fails to run it. Fix this issue using a new IRQ that is fired after a SIGCHLD and keeping an (internal) list of all MMs. In t

[PATCH 0/9] SECCOMP based userspace for UML

2025-02-24 Thread Benjamin Berg
From: Benjamin Berg Hi all, another version of the SECCOMP patchset. I think that this should now be good enough for general consumption. Compared to the last RFC version there is an important bugfix that caused a SIGSEGV loop and various other small bugfixes and cleanups. The patchset adds a n

[PATCH 1/9] um: Store full CSGSFS and SS register from mcontext

2025-02-24 Thread Benjamin Berg
Doing this allows using registers as retrieved from an mcontext to be pushed to a process using PTRACE_SETREGS. It is not entirely clear to me why CSGSFS was masked. Doing so creates issues when using the mcontext as process state in seccomp and simply copying the register appears to work perfectl

[PATCH 5/9] um: Add SECCOMP support detection and initialization

2025-02-24 Thread Benjamin Berg
This detects seccomp support, sets the global using_seccomp variable and initializes the exec registers. Signed-off-by: Benjamin Berg Signed-off-by: Benjamin Berg --- arch/um/include/shared/skas/skas.h | 5 + arch/um/os-Linux/registers.c | 4 +- arch/um/os-Linux/skas/process.c| 3

[PATCH 9/9] um: Add UML_SECCOMP configuration option

2025-02-24 Thread Benjamin Berg
Add the UML_SECCOMP configuration options. Signed-off-by: Benjamin Berg --- v1: - Move to the end RFCv2: - Remove "default n" --- arch/um/Kconfig | 19 +++ 1 file changed, 19 insertions(+) diff --git a/arch/um/Kconfig b/arch/um/Kconfig index 18051b1cfce0..11ed4422593c 100644 -

[PATCH 2/9] um: Move faultinfo extraction into userspace routine

2025-02-24 Thread Benjamin Berg
The segv handler is called slightly differently depending on whether PTRACE_FULL_FAULTINFO is set or not (32bit vs. 64bit). The only difference is that we don't try to pass the registers and instruction pointer to the segv handler. It would be good to either document or remove the difference, but

[PATCH 3/9] um: Add stub side of SECCOMP/futex based process handling

2025-02-24 Thread Benjamin Berg
This adds the stub side for the new seccomp process management code. In this case we do register save/restore through the signal handler mcontext. Add special code for handling TLS, which for x86_64 means setting the FS_BASE/GS_BASE registers while for i386 it means calling the set_thread_area sys

[PATCH 4/9] um: Add helper functions to get/set state for SECCOMP

2025-02-24 Thread Benjamin Berg
When not using ptrace, we need to both save and restore registers through the mcontext as provided by the host kernel to our signal handlers. Add corresponding functions to store the state to an mcontext and helpers to access the mcontext of the subprocess through the stub data. Signed-off-by: Be

RE: [PATCH 3/6] ceph: return the correct dentry on mkdir

2025-02-24 Thread Viacheslav Dubeyko
On Mon, 2025-02-24 at 13:15 +1100, NeilBrown wrote: > On Fri, 21 Feb 2025, Viacheslav Dubeyko wrote: > > On Fri, 2025-02-21 at 10:36 +1100, NeilBrown wrote: > > > ceph already splices the correct dentry (in splice_dentry()) from the > > > result of mkdir but does nothing more with it. > > > > > >

RE: [PATCH 3/6] ceph: return the correct dentry on mkdir

2025-02-24 Thread NeilBrown
On Tue, 25 Feb 2025, Viacheslav Dubeyko wrote: > On Mon, 2025-02-24 at 13:15 +1100, NeilBrown wrote: > > On Fri, 21 Feb 2025, Viacheslav Dubeyko wrote: > > > On Fri, 2025-02-21 at 10:36 +1100, NeilBrown wrote: > > > > ceph already splices the correct dentry (in splice_dentry()) from the > > > > res

Re: [PATCH 3/6] ceph: return the correct dentry on mkdir

2025-02-24 Thread Jeff Layton
On Mon, 2025-02-24 at 22:09 +, Viacheslav Dubeyko wrote: > On Mon, 2025-02-24 at 13:15 +1100, NeilBrown wrote: > > On Fri, 21 Feb 2025, Viacheslav Dubeyko wrote: > > > On Fri, 2025-02-21 at 10:36 +1100, NeilBrown wrote: > > > > ceph already splices the correct dentry (in splice_dentry()) from t

Re: [PATCH 6/6] VFS: Change vfs_mkdir() to return the dentry.

2025-02-24 Thread Chuck Lever
On 2/23/25 9:51 PM, NeilBrown wrote: > On Sat, 22 Feb 2025, Chuck Lever wrote: >> On 2/20/25 6:36 PM, NeilBrown wrote: > ... >>> + dchild = vfs_mkdir(&nop_mnt_idmap, dirp, dchild, iap->ia_mode); >>> + if (IS_ERR(dchild)) { >>> + host_err = PTR_ERR(dchild); >>>
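
The quoted hunk relies on the kernel's error-pointer idiom: a function that returns a pointer encodes a negative errno in the pointer value, and the caller tests it with IS_ERR() and decodes it with PTR_ERR(). Below is a minimal userspace sketch of that idiom. The three helpers are simplified re-implementations modelled on include/linux/err.h, and struct dentry, demo_mkdir() and demo_create() are hypothetical names for this example only; they are not the vfs_mkdir() code under review.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified userspace versions of the kernel helpers. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* The top MAX_ERRNO addresses are reserved for encoded errors. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for a dentry. */
struct dentry { const char *name; };

/* Hypothetical mkdir-like call: returns the dentry or an encoded error. */
static inline struct dentry *demo_mkdir(struct dentry *child, int fail)
{
	if (fail)
		return ERR_PTR(-EEXIST);
	return child;
}

/* Caller pattern mirroring the quoted hunk: check, then decode. */
static inline long demo_create(struct dentry *child, int fail)
{
	struct dentry *res = demo_mkdir(child, fail);

	if (IS_ERR(res))
		return PTR_ERR(res);
	return 0;
}
```

Under these assumptions, demo_create() returns 0 on success and -EEXIST on failure, without needing a separate output parameter for the resulting dentry.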

Re: [PATCH 1/6] Change inode_operations.mkdir to return struct dentry *

2025-02-24 Thread Trond Myklebust
On Mon, 2025-02-24 at 14:09 +1100, NeilBrown wrote: > On Mon, 24 Feb 2025, Al Viro wrote: > > On Mon, Feb 24, 2025 at 12:34:06PM +1100, NeilBrown wrote: > > > On Sat, 22 Feb 2025, Al Viro wrote: > > > > On Fri, Feb 21, 2025 at 10:36:30AM +1100, NeilBrown wrote: > > > > > > > > > +In general, files