Matthew Wilcox <wi...@infradead.org> writes:

> On Thu, Jul 09, 2020 at 12:38:40AM -0400, Gabriel Krisman Bertazi wrote:
>> The proposed interface looks like this:
>> 
>>   prctl(PR_SET_SYSCALL_USER_DISPATCH, <op>, <dispatcher>, [selector])
>> 
>> Dispatcher is the address of a syscall instruction that is allowed to
>> by-pass the blockage, such that in fast paths you don't need to disable
>> the trap nor check the selector.  This is essential to return from
>> SIGSYS to a blocked area without triggering another SIGSYS from the
>> rt_sigreturn.
>
> Should <dispatcher> be a single pointer or should the interface specify
> a range from which syscalls may be made without being redirected?  eg,
> one could specify the whole of libc.
>
> prctl(PR_SET_SYSCALL_USER_DISPATCH, <op>, <start>, <inclusive-end>,
> [selector])

I like this suggestion a lot: the user can still pass a single address
to get the original interface, and a range lets us avoid paying the cost
of __get_user on more paths.  I will add it to v3.
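
As a rough userspace sketch of the range check discussed above (not the
v3 code; ip_in_allowed_range() and its parameter names are hypothetical),
a syscall instruction would skip the selector read when its address
falls inside the inclusive range.  Passing start == end gives back the
single-address behaviour:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the patch: returns true when the
 * syscall instruction at 'ip' lies inside the inclusive range
 * [start, end], i.e. when it may bypass dispatch without the selector
 * being read.
 */
static bool ip_in_allowed_range(uintptr_t ip, uintptr_t start, uintptr_t end)
{
	return ip >= start && ip <= end;
}
```

With an inclusive end, a range covering the whole of libc and a range
covering one instruction go through the same comparison.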

>
>> +++ b/include/linux/syscall_user_dispatch.h
>> @@ -0,0 +1,45 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +#ifndef _SYSCALL_USER_DISPATCH_H
>> +#define _SYSCALL_USER_DISPATCH_H
>> +
>> +struct task_struct;
>> +static void clear_tsk_thread_flag(struct task_struct *tsk, int flag);
>> +
>> +#ifdef CONFIG_SYSCALL_USER_DISPATCH
>> +struct syscall_user_dispatch {
>> +    int __user *selector;
>> +    unsigned long __user dispatcher;
>
> The __user annotation is on the pointer, not the value.  ie, it's
>
>       unsigned long foo;
>       unsigned long __user *p;
>
>       get_user(foo, p)
>
>> +++ b/include/uapi/asm-generic/siginfo.h
>> @@ -285,6 +285,7 @@ typedef struct siginfo {
>>   */
>>  #define SYS_SECCOMP 1       /* seccomp triggered */
>>  #define NSIGSYS             1
>> +#define SYS_USER_REDIRECT 2
>
> I'd suggest that SYS_USER_REDIRECT should be moved up by one line.
>
>> +int set_syscall_user_dispatch(int mode, unsigned long __user dispatcher,
>> +                          int __user *selector)
>> +{
>> +    switch (mode) {
>> +    case PR_SYSCALL_DISPATCH_DISABLE:
>> +            if (dispatcher || selector)
>> +                    return -EINVAL;
>> +            break;
>> +    case PR_SYSCALL_DISPATCH_ENABLE:
>> +            break;
>> +    default:
>> +            return -EINVAL;
>> +    }
>> +
>> +    if (selector) {
>> +            if (!access_ok(selector, sizeof(int)))
>> +                    return -EFAULT;
>> +    }
>
> You're not enforcing the alignment requirement here.  
>
>> +    spin_lock_irq(&current->sighand->siglock);
>> +
>> +    current->syscall_dispatch.selector = selector;
>> +    current->syscall_dispatch.dispatcher = dispatcher;
>> +
>> +    /* make sure fastlock is committed before setting the flag. */
>
> fastlock?  ;-)

Gee, keeping comments in sync with variable renames is hard; the
compiler won't catch those. :)

> I don't think you actually need this.  You're setting per-thread state on
> yourself, so what's the race that you're concerned about?

Good point.  I was assuming this state could be modified from under the
thread, but that is not the case.
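
On the alignment remark about the selector check further up, here is a
minimal userspace sketch of what such a test might look like.  The
helper name selector_is_aligned() is hypothetical and this is only an
illustration of the arithmetic; in the kernel one would more likely
reach for the existing IS_ALIGNED() macro:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the patch: returns true when
 * 'addr' is a multiple of the alignment required for an int, which is
 * the type the selector pointer refers to.
 */
static bool selector_is_aligned(uintptr_t addr)
{
	return (addr % _Alignof(int)) == 0;
}
```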

>
>> +    smp_mb__before_atomic();
>> +
>> +    if (mode == PR_SYSCALL_DISPATCH_ENABLE)
>> +            set_tsk_thread_flag(current, TIF_SYSCALL_USER_DISPATCH);
>> +    else
>> +            clear_tsk_thread_flag(current, TIF_SYSCALL_USER_DISPATCH);
>> +
>> +    spin_unlock_irq(&current->sighand->siglock);
>> +
>> +    return 0;
>> +}
>> -- 
>> 2.27.0
>> 

-- 
Gabriel Krisman Bertazi
