On Thu, 2024-10-24 at 21:09 +0900, Hajime Tazaki wrote:
> This commit adds a mechanism to hook syscalls for unmodified userspace
> programs used under UML in !MMU mode. The mechanism, called zpoline,
> translates syscall/sysenter instructions with `call *%rax`, which can be
> processed by a trampoline code also installed upon an initcall during
> boot. The translation is triggered by elf_arch_finalize_exec(), an arch
> hook introduced by another commit.
> 
> All syscalls issued by userspace thus redirected to a speicific function,

typo: "specific"

> +     if (down_write_killable(&mm->mmap_lock)) {
> +             err = -EINTR;
> +             return err;

? Why not just return -EINTR directly?


What happens if the binary JITs some code containing a syscall
instruction and you don't find it? I don't remember from your talk -
there you seemed to say this was fine, just slow, but that was zpoline
in a different context (container)?

Perhaps UML could additionally install a seccomp filter or something on
itself while running a userspace program? Hmm.
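
To make the JIT concern concrete, here is a minimal sketch of a
load-time rewrite pass. It is not the code from the patch:
rewrite_syscalls() is a made-up name, and the plain byte scan is an
assumption for brevity (a real pass has to decode instructions so it
does not patch immediates or data that happen to contain 0f 05).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not from the patch: replace every 2-byte
 * syscall (0f 05) or sysenter (0f 34) opcode in buf with
 * `call *%rax` (ff d0). Returns the number of sites rewritten.
 *
 * Naive byte scan for illustration only; it would also patch
 * operands or data that merely contain these byte values.
 */
static size_t rewrite_syscalls(uint8_t *buf, size_t len)
{
	size_t i = 0, count = 0;

	assert(buf != NULL || len == 0);

	while (i + 1 < len) {
		if (buf[i] == 0x0f &&
		    (buf[i + 1] == 0x05 || buf[i + 1] == 0x34)) {
			buf[i] = 0xff;		/* call *%rax, byte 1 */
			buf[i + 1] = 0xd0;	/* call *%rax, byte 2 */
			count++;
			i += 2;
		} else {
			i++;
		}
	}
	return count;
}
```

A pass like this only sees the bytes that exist when it runs. Anything
the program generates afterwards still contains the raw syscall
instruction, so it needs either a second pass or some other catch.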


> +/**
> + * setup trampoline code for syscall hooks
> + *
> + * the trampoline code guides to call hooked function, __kernel_vsyscall
> + * in this case, via nop slides at the memory address zero (thus, zpoline).
> + *
> + * loaded binary by exec(2) is translated to call the function.
> + */
> +static int __init setup_zpoline_trampoline(void)
> +{
> +     int i, ret;
> +     int ptr;
> +
> +     /* zpoline: map area of trampoline code started from addr 0x0 */
> +     __zpoline_start = 0x0;
> +
> +     ret = os_map_memory((void *) 0, -1, 0, 0x1000, 1, 1, 1);

(UM_)PAGE_SIZE instead of the hard-coded 0x1000?

> +     /**
> +      * FIXME: shit red zone area to properly handle the case

"shift"? :)

> +      */
> +
> +     /**
> +      * put code for jumping to __kernel_vsyscall.
> +      *
> +      * here we embed the following code.
> +      *
> +      * movabs [$addr],%r11
> +      * jmpq   *%r11
> +      *
> +      */
> +     ptr = NR_syscalls;
> +     /* 49 bb [64-bit addr (8-byte)]    movabs [64-bit addr (8-byte)],%r11 */
> +     __zpoline_start[ptr++] = 0x49;
> +     __zpoline_start[ptr++] = 0xbb;
> +     __zpoline_start[ptr++] = ((uint64_t)
> +                               __kernel_vsyscall >> (8 * 0)) & 0xff;

&0xff seems pointless with a u8 array?
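
For illustration, a minimal sketch of the same byte sequence written
with a loop and no mask. emit_jump() and TRAMPOLINE_LEN are made-up
names, not from the patch; the only assumption is that the destination
is a uint8_t buffer, so the conversion keeps just the low byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TRAMPOLINE_LEN 13	/* 2 (opcode) + 8 (imm64) + 3 (jmp) */

/*
 * Hypothetical helper, not from the patch: write
 *   movabs $target,%r11
 *   jmpq   *%r11
 * at buf and return the number of bytes written.
 */
static size_t emit_jump(uint8_t *buf, uint64_t target)
{
	size_t n = 0;
	int i;

	assert(buf != NULL);

	buf[n++] = 0x49;	/* REX.W + REX.B */
	buf[n++] = 0xbb;	/* mov imm64 into r11 */
	for (i = 0; i < 8; i++)	/* imm64, little endian */
		buf[n++] = (uint8_t)(target >> (8 * i));
	buf[n++] = 0x41;	/* REX.B */
	buf[n++] = 0xff;	/* jmp r/m64 */
	buf[n++] = 0xe3;	/* ModRM: /4, r11 */
	return n;
}
```

The conversion to uint8_t already discards the upper bits, so the
stored value is the same with or without the mask.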

> +     /* permission: XOM (PROT_EXEC only) */
> +     ret = os_protect_memory(0, 0x1000, 0, 0, 1);

(UM_)PAGE_SIZE here as well?

johannes
