On May 11, 2016 1:35 PM, "Mateusz Guzik" <mgu...@redhat.com> wrote:
>
> On Tue, May 10, 2016 at 01:58:24PM -0700, Andy Lutomirski wrote:
> > On Tue, May 10, 2016 at 1:56 PM, Mateusz Guzik <mgu...@redhat.com> wrote:
> > > This fixes 731e33e39a5b95ad770 "Remove FSBASE/GSBASE < 4G optimization"
> >
> > Indeed.  How did that survive lockdep?
> >
>
> lockdep_sys_exit only checks actual locks.
>
> In the common path, after return from a particular syscall, interrupts get
> blindly disabled (as opposed to first checking that they are enabled).
> The preemption count is not checked in the fast path at all; it is only
> checked elsewhere as a side effect of calls to e.g. schedule().
>
> How about a hack along these lines (note I don't claim this is
> committable as it is, but it should work):
>
> diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
> index ec138e5..5887bc7 100644
> --- a/arch/x86/entry/common.c
> +++ b/arch/x86/entry/common.c
> @@ -303,6 +303,24 @@ static void syscall_slow_exit_work(struct pt_regs *regs, u32 cached_flags)
>                 tracehook_report_syscall_exit(regs, step);
>  }
>
> +#ifdef CONFIG_PROVE_LOCKING
> +/*
> + * Called after syscall handlers return.
> + */
> +__visible void syscall_assert_exit(struct pt_regs *regs)
> +{
> +       if (in_atomic() || irqs_disabled()) {
> +               printk(KERN_ERR "invalid state on exit from syscall %ld: "
> +                       "in_atomic(): %d, irqs_disabled(): %d, pid: %d, "
> +                       "name: %s\n", regs->orig_ax, in_atomic(),
> +                       irqs_disabled(), current->pid, current->comm);
> +       }
> +
> +       if (irqs_disabled())
> +               local_irq_enable();
> +}
> +#endif
> +
>  /*
>   * Called with IRQs on and fully valid regs.  Returns with IRQs off in a
>   * state such that we can immediately switch to user mode.
> @@ -314,9 +332,7 @@ __visible inline void syscall_return_slowpath(struct pt_regs *regs)
>
>         CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
>
> -       if (IS_ENABLED(CONFIG_PROVE_LOCKING) &&
> -           WARN(irqs_disabled(), "syscall %ld left IRQs disabled", regs->orig_ax))
> -               local_irq_enable();
> +       syscall_assert_exit(regs);
>
>         /*
>          * First do one-time work.  If these work items are enabled, we
> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> index 9ee0da1..6c5cc23 100644
> --- a/arch/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -210,6 +210,12 @@ entry_SYSCALL_64_fastpath:
>         movq    %rax, RAX(%rsp)
>  1:
>
> +#ifdef CONFIG_PROVE_LOCKING
> +       /*
> +        * We want to validate a bunch of stuff, which will clobber registers.
> +        */
> +       jmp     2f
> +#endif
>         /*
>          * If we get here, then we know that pt_regs is clean for SYSRET64.
>          * If we see that no exit work is required (which we are required
> @@ -236,6 +242,7 @@ entry_SYSCALL_64_fastpath:
>          */
>         TRACE_IRQS_ON
>         ENABLE_INTERRUPTS(CLBR_NONE)
> +2:
>         SAVE_EXTRA_REGS
>         movq    %rsp, %rdi
>         call    syscall_return_slowpath /* returns with IRQs disabled */

It would be nice to do this in a cross-arch way.  Maybe we could
extend lockdep_sys_exit?  Ingo, do you think that would be reasonable?
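
For concreteness, a minimal sketch of what a cross-arch check might look
like (the name lockdep_syscall_exit_check() is made up here, and a real
version would presumably be folded into lockdep_sys_exit() itself rather
than added alongside it):

#include <linux/bug.h>
#include <linux/irqflags.h>
#include <linux/preempt.h>
#include <linux/sched.h>

/*
 * Sketch only: a generic syscall-exit sanity check, callable from each
 * architecture's exit path.  It flags leaked irq-off / preempt-off state,
 * which lockdep_sys_exit() (held locks only) currently does not.
 */
void lockdep_syscall_exit_check(void)
{
	WARN_ONCE(irqs_disabled() || in_atomic(),
		  "syscall exit with irqs_disabled(): %d, in_atomic(): %d (%s/%d)\n",
		  irqs_disabled(), in_atomic(),
		  current->comm, current->pid);
}

The check itself is arch-independent; only the call site in each
architecture's entry code would differ.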

--Andy

>
> --
> Mateusz Guzik
