On Fri, 2018-09-07 at 12:49 -0700, Dave Hansen wrote:
> From: Dave Hansen <dave.han...@linux.intel.com>
> 
> The comments here are wrong.  They are too absolute about where
> faults can occur when running in the kernel.  The comments are
> also a bit hard to match up with the code.
> 
> Trim down the comments, and make them more precise.
> 
> Also add a comment explaining why we are doing the
> bad_area_nosemaphore() path here.
> 
> Signed-off-by: Dave Hansen <dave.han...@linux.intel.com>
> Cc: Sean Christopherson <sean.j.christopher...@intel.com>
> Cc: "Peter Zijlstra (Intel)" <pet...@infradead.org>
> Cc: Thomas Gleixner <t...@linutronix.de>
> Cc: x...@kernel.org
> Cc: Andy Lutomirski <l...@kernel.org>
> ---
> 
>  b/arch/x86/mm/fault.c |   27 ++++++++++++++-------------
>  1 file changed, 14 insertions(+), 13 deletions(-)
> 
> diff -puN arch/x86/mm/fault.c~pkeys-fault-warnings-03 arch/x86/mm/fault.c
> --- a/arch/x86/mm/fault.c~pkeys-fault-warnings-03     2018-09-07 
> 11:21:47.696751898 -0700
> +++ b/arch/x86/mm/fault.c     2018-09-07 11:21:47.700751898 -0700
> @@ -1349,24 +1349,25 @@ void do_user_addr_space_fault(struct pt_
>               flags |= FAULT_FLAG_INSTRUCTION;
>  
>       /*
> -      * When running in the kernel we expect faults to occur only to
> -      * addresses in user space.  All other faults represent errors in
> -      * the kernel and should generate an OOPS.  Unfortunately, in the
> -      * case of an erroneous fault occurring in a code path which already
> -      * holds mmap_sem we will deadlock attempting to validate the fault
> -      * against the address space.  Luckily the kernel only validly
> -      * references user space from well defined areas of code, which are
> -      * listed in the exceptions table.
> +      * Kernel-mode access to the user address space should only occur
> +      * inside well-defined areas of code listed in the exception
> +      * tables.  But, an erroneous kernel fault occurring outside one of
> +      * those areas which also holds mmap_sem might deadlock attempting
> +      * to validate the fault against the address space.
>        *
> -      * As the vast majority of faults will be valid we will only perform
> -      * the source reference check when there is a possibility of a
> -      * deadlock. Attempt to lock the address space, if we cannot we then
> -      * validate the source. If this is invalid we can skip the address
> -      * space check, thus avoiding the deadlock:
> +      * Only do the expensive exception table search when we might be at
> +      * risk of a deadlock:
> +      * 1. We failed to acquire mmap_sem, and
> +      * 2. The access was an explicit kernel-mode access
> +      *    (X86_PF_USER=0).

Might be worth reminding the reader that X86_PF_USER will be set in
sw_error_code for implicit accesses.  I saw "explicit" and my mind
immediately jumped to hw_error_code for whatever reason.  E.g.:

        * 2. The access was an explicit kernel-mode access (we set X86_PF_USER
        *    in sw_error_code for implicit kernel-mode accesses).

>        */
>       if (unlikely(!down_read_trylock(&mm->mmap_sem))) {
>               if (!(sw_error_code & X86_PF_USER) &&
>                   !search_exception_tables(regs->ip)) {
> +                     /*
> +                      * Fault from code in kernel from
> +                      * which we do not expect faults.
> +                      */
>                       bad_area_nosemaphore(regs, sw_error_code, address, 
> NULL);
>                       return;
>               }
> _

Reply via email to