On Mon, 2018-06-18 at 14:14 +0200, Frederic Barrat wrote:
> If a process exits without doing proper cleanup, there's a window
> where an opencapi device can try to access the memory of the dying
> process and may trigger a page fault. That's an expected scenario and
> the ocxl driver holds a reference on the mm_struct of the process
> until the opencapi device is notified of the process exiting.
> However, if mm_users is already at 0, i.e. the address space of the
> process has already been destroyed, the driver shouldn't try
> resolving
> the page fault, as it will fail, but it can also try accessing
> already
> freed data.
> 
> It is fixed by only calling the bottom half of the page fault handler
> if mm_users is greater than 0 and get a reference on mm_users instead
> of mm_count. Otherwise, we can safely return a translation fault to
> the device, as its associated memory context is being removed. The
> opencapi device will be properly cleaned up shortly after when
> closing
> the file descriptors.
> 
> Fixes: 5ef3166e8a32 ("ocxl: Driver code for 'generic' opencapi
> devices")
> Cc: sta...@vger.kernel.org # v4.16+
> Signed-off-by: Frederic Barrat <fbar...@linux.ibm.com>
> ---
>  drivers/misc/ocxl/link.c | 24 +++++++++++++++---------
>  1 file changed, 15 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/misc/ocxl/link.c b/drivers/misc/ocxl/link.c
> index f30790582dc0..2aeaf34e7eda 100644
> --- a/drivers/misc/ocxl/link.c
> +++ b/drivers/misc/ocxl/link.c
> @@ -136,7 +136,7 @@ static void xsl_fault_handler_bh(struct
> work_struct *fault_work)
>       int rc;
>  
>       /*
> -      * We need to release a reference on the mm whenever exiting
> this
> +      * We must release a reference on mm_users whenever exiting
> this
>        * function (taken in the memory fault interrupt handler)
>        */
>       rc = copro_handle_mm_fault(fault->pe_data.mm, fault->dar,
> fault->dsisr,
> @@ -172,7 +172,7 @@ static void xsl_fault_handler_bh(struct
> work_struct *fault_work)
>       }
>       r = RESTART;
>  ack:
> -     mmdrop(fault->pe_data.mm);
> +     mmput(fault->pe_data.mm);
>       ack_irq(spa, r);
>  }
>  
> @@ -184,6 +184,7 @@ static irqreturn_t xsl_fault_handler(int irq,
> void *data)
>       struct pe_data *pe_data;
>       struct ocxl_process_element *pe;
>       int lpid, pid, tid;
> +     bool schedule = false;
>  
>       read_irq(spa, &dsisr, &dar, &pe_handle);
>       trace_ocxl_fault(spa->spa_mem, pe_handle, dsisr, dar, -1);
> @@ -226,14 +227,19 @@ static irqreturn_t xsl_fault_handler(int irq,
> void *data)
>       }
>       WARN_ON(pe_data->mm->context.id != pid);
>  
> -     spa->xsl_fault.pe = pe_handle;
> -     spa->xsl_fault.dar = dar;
> -     spa->xsl_fault.dsisr = dsisr;
> -     spa->xsl_fault.pe_data = *pe_data;
> -     mmgrab(pe_data->mm); /* mm count is released by bottom half
> */
> -
> +     if (mmget_not_zero(pe_data->mm)) {
> +                     spa->xsl_fault.pe = pe_handle;
> +                     spa->xsl_fault.dar = dar;
> +                     spa->xsl_fault.dsisr = dsisr;
> +                     spa->xsl_fault.pe_data = *pe_data;
> +                     schedule = true;
> +                     /* mm_users count released by bottom half */
> +     }
>       rcu_read_unlock();
> -     schedule_work(&spa->xsl_fault.fault_work);
> +     if (schedule)
> +             schedule_work(&spa->xsl_fault.fault_work);
> +     else
> +             ack_irq(spa, ADDRESS_ERROR);
>       return IRQ_HANDLED;
>  }
>  

Reviewed-By: Alastair D'Silva <alast...@d-silva.org>

-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819

Reply via email to