On Sat, 26 Oct 2024 00:20:49 +0900, Johannes Berg wrote:
>
> On Fri, 2024-10-25 at 21:58 +0900, Hajime Tazaki wrote:
> >
> > > > +	if (down_write_killable(&mm->mmap_lock)) {
> > > > +		err = -EINTR;
> > > > +		return err;
> > >
> > > ?
> >
> > the lock isn't needed actually so, will remove it.
>
> Oh, I was just looking at the weird handling of the err variable :)
Ah, now I see. I'd revert the lock part with `return -EINTR` instead.

> > > What happens if the binary JITs some code and you don't find it? I don't
> > > remember from your talk - there you seemed to say this was fine just
> > > slow, but that was zpoline in a different context (container)?
> >
> > instructions loaded after execve family (like JIT generated code,
> > loaded with dlopen, etc) isn't going to be translated. we can
> > translated it by tweaking the userspace loader (ld.so w/ LD_PRELOAD)
> > or hook mprotect(2) syscall before executing JIT generated code.
> > generic description is written in the document ([12/13]).
>
> Guess I should've read that, sorry.

No, no. This part is a completely new feature, and I'd like to explain
any unclear points to help understanding, so any input is always
welcome.

# btw, the talk at the last netdev was not in a container-specific
# context; it focused more on the syscall hook mechanism itself, so I
# didn't go into much detail at that time.

> > > Perhaps UML could additionally install a seccomp filter or something on
> > > itself while running a userspace program? Hmm.
> >
> > I'm trying to understand the purpose of seccomp filter you suggested
> > here; is it for preventing executed by untranslated code ?
>
> Yeah, that's what I was wondering.
>
> Obviously you have to be able to get rid of the seccomp filter again so
> it's not foolproof, but perhaps not _that_ bad?
>
> I'm not worried about security or so, it's clear this isn't even _meant_
> to have security. But I do wonder about really hard to debug issues if
> userspace suddenly makes syscalls to the host, that'd be ... difficult
> to understand?

I totally understand; I faced similar situations while developing this
patchset.
Originally our patchset had a whitelist-based seccomp filter (w/
SCMP_ACT_ALLOW), but I dropped it from this RFC because I found that
1) it is not a !MMU-specific feature (it can be applied generally to
all UML use cases), and 2) we cannot prevent a syscall (e.g., ioctl(2))
from userspace if it is white-listed in our seccomp filter, so the
newly introduced filter may not be perfect.

The maintenance of the whitelist is also not easy; a syscall used in
one version is renamed at some point in the future (what I faced is
that SCMP_SYS(open) had to be renamed to SCMP_SYS(openat)).

-- Hajime
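The whitelist maintenance problem can be illustrated with a minimal
sketch. This is not the filter from the patchset: it assumes raw
seccomp-BPF installed via prctl(2) rather than libseccomp (so the SCMP_*
names do not appear), it fails unlisted syscalls with EPERM, and it
skips the architecture check a real filter should have. The names
install_allowlist() and openat_denied_by_stale_allowlist() are
hypothetical, chosen only for this example.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_ALLOWED 32

/* Hypothetical helper: allow only the listed syscall numbers; every
 * other syscall fails with EPERM. No AUDIT_ARCH check (sketch only). */
static int install_allowlist(const int *nrs, size_t n)
{
	struct sock_filter prog[MAX_ALLOWED + 3];
	struct sock_fprog fprog;
	size_t i, k = 0;

	if (n > MAX_ALLOWED) {
		errno = E2BIG;
		return -1;
	}

	/* Load the syscall number into the accumulator. */
	prog[k++] = (struct sock_filter)BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
			offsetof(struct seccomp_data, nr));
	/* On a match, skip the remaining compares and the EPERM return. */
	for (i = 0; i < n; i++)
		prog[k++] = (struct sock_filter)BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K,
				(unsigned int)nrs[i], (unsigned char)(n - i), 0);
	prog[k++] = (struct sock_filter)BPF_STMT(BPF_RET | BPF_K,
			SECCOMP_RET_ERRNO | (EPERM & SECCOMP_RET_DATA));
	prog[k++] = (struct sock_filter)BPF_STMT(BPF_RET | BPF_K,
			SECCOMP_RET_ALLOW);

	fprog.len = (unsigned short)k;
	fprog.filter = prog;

	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) < 0)
		return -1;
	return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &fprog);
}

/* Returns 1 if openat(2) was denied under a list that only knows
 * open(2), 0 if it went through, -1 if the filter could not be set up. */
static int openat_denied_by_stale_allowlist(void)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		const int allowed[] = {
			SYS_exit_group,
#ifdef SYS_open
			SYS_open,	/* the older name; absent on some arches */
#endif
		};
		long fd;

		if (install_allowlist(allowed,
				sizeof(allowed) / sizeof(allowed[0])) < 0)
			_exit(2);
		/* Issue openat directly so the result is libc-independent. */
		fd = syscall(SYS_openat, AT_FDCWD, "/dev/null", O_RDONLY);
		_exit(fd < 0 && errno == EPERM ? 0 : 1);
	}
	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return -1;
	switch (WEXITSTATUS(status)) {
	case 0:
		return 1;
	case 1:
		return 0;
	default:
		return -1;
	}
}
```

Calling openat_denied_by_stale_allowlist() from a test program forks a
child so that the filter never affects the caller. The list has to be
updated by hand whenever the caller (or its libc) switches to a
different syscall, which is the maintenance cost mentioned above.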