papr_scm: Force a scm-unbind if initial scm-bind fails

Oliver O'Halloran Wed, 26 Jun 2019 18:42:11 -0700

On Thu, Jun 27, 2019 at 2:58 AM Aneesh Kumar K.V
<aneesh.ku...@linux.ibm.com> wrote:
>
> Vaibhav Jain <vaib...@linux.ibm.com> writes:
> > *snip*
> > +     /* If phyp says drc memory still bound then force unbound and retry */
> > +     if (rc == -EBUSY) {
> > +             dev_warn(&pdev->dev, "Retrying bind after unbinding\n");
> > +             drc_pmem_unbind(p);
> This should only be caused by kexec right?


We should only ever hit this path if there's an unclean shutdown, so
kdump or fadump. For a normal kexec the previous kernel should have
torn down the binding for us.

> And considering neither the kernel nor the
> hypervisor will change device binding details, can you check switching
> this to H_SCM_QUERY_BLOCK_MEM_BINDING?

I thought about using the QUERY_BLOCK_MEM_BINDING call, but I'm not
sure it's a good idea. It bakes in assumptions about what the
*previous* kernel did with the SCM volume that might not be valid. A
panic while unbinding a region would result in a partially-bound
region which might break the query call. Also, it's possible that we
might have SCM drivers in the future that do something other than just
binding the volume in one contiguous chunk. UNBIND_ALL is robust
against all of these and robustness is what you want out of an error
handling mechanism.
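
To make the control flow concrete, here is a rough userspace sketch of the retry-after-unbind logic from the hunk above. Everything named sketch_* is made up for illustration: the struct and the simulated bind/unbind stand in for drc_pmem_bind()/drc_pmem_unbind() and the h-calls behind them, and the "stale binding" flag only models a binding left behind by a crashed kernel.

```c
#include <errno.h>

/* Hypothetical stand-in for the driver's private data; not the real struct. */
struct scm_priv_sketch {
	int stale_binding;	/* models a binding left by a crashed kernel */
	int bind_calls;		/* how many times bind was attempted */
	int unbind_calls;	/* how many times unbind-all was issued */
};

/* Simulated bind: fails with -EBUSY while a stale binding exists. */
static int sketch_bind(struct scm_priv_sketch *p)
{
	p->bind_calls++;
	return p->stale_binding ? -EBUSY : 0;
}

/*
 * Simulated unbind-all: clears whatever binding state exists without
 * needing to know how the previous kernel laid it out.
 */
static void sketch_unbind_all(struct scm_priv_sketch *p)
{
	p->unbind_calls++;
	p->stale_binding = 0;
}

/* Bind once; on -EBUSY force an unbind and retry exactly once. */
static int sketch_bind_with_retry(struct scm_priv_sketch *p)
{
	int rc = sketch_bind(p);

	if (rc == -EBUSY) {
		sketch_unbind_all(p);
		rc = sketch_bind(p);
	}
	return rc;
}
```

The point of the sketch is that the recovery step never inspects the old binding, so it does not care whether the previous kernel left it complete, partial, or in some layout we haven't thought of yet.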

> Will that result in faster boot?

As I said in the comments on v1, do we have any actual numbers on how
long the bind step takes? From memory, you could bind ~32GB in a
single bind h-call before phyp would hit its time limit of 250us and
return a continue token. Assuming that holds we'll be saving a few
dozen milliseconds at best.
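
For a back-of-the-envelope check, the estimate can be written out as below. The two constants are just the from-memory figures above (~32GB per h-call, 250us per call), not measured values, and the helper names are invented for this sketch. It also assumes every call runs for its full time slice, so it is an upper bound under those assumptions.

```c
#include <stdint.h>

/* Assumed figures, taken from the from-memory estimate; not measured. */
#define SKETCH_GB_PER_HCALL	32ULL	/* ~32GB bound per h-call */
#define SKETCH_US_PER_HCALL	250ULL	/* phyp time limit per h-call */

/* Number of bind h-calls needed for a volume, rounding up. */
static uint64_t sketch_bind_calls(uint64_t volume_gb)
{
	return (volume_gb + SKETCH_GB_PER_HCALL - 1) / SKETCH_GB_PER_HCALL;
}

/* Estimated total bind time in microseconds under those assumptions. */
static uint64_t sketch_bind_time_us(uint64_t volume_gb)
{
	return sketch_bind_calls(volume_gb) * SKETCH_US_PER_HCALL;
}
```

With these numbers a hypothetical 4TB volume needs 128 calls, i.e. 32000us or about 32ms, which is where the "few dozen milliseconds" comes from.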

> > +             rc = drc_pmem_bind(p);
> > +     }
> > +
> >       if (rc)
> >               goto err;
> >
>
> I am also not sure about the module reference count here. Should we
> increment the module reference count after a bind so that we can track
> failures in unbind and fail the module unload?

I don't really get what you're concerned about here. The error
handling path calls drc_pmem_unbind() so if there's a bind error we
should never leave probe with memory still bound.
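
Roughly, the shape I have in mind is the usual goto-err pattern. This is a simplified userspace sketch and not the actual probe function: the sketch_* names, the fail flags, and the error codes are all hypothetical, and the "later step" stands in for whatever setup follows the bind.

```c
#include <errno.h>

/* Hypothetical driver state; 'bound' models memory bound by the hypervisor. */
struct scm_probe_sketch {
	int bound;
};

/* Simulated bind: a failure may still leave the region (partially) bound. */
static int sketch_bind(struct scm_probe_sketch *p, int fail)
{
	p->bound = 1;
	return fail ? -EIO : 0;
}

/* Simulated unbind: always clears the binding. */
static void sketch_unbind(struct scm_probe_sketch *p)
{
	p->bound = 0;
}

/* Simulated later setup step that may fail after a successful bind. */
static int sketch_later_step(int fail)
{
	return fail ? -ENODEV : 0;
}

/* Every failure after the bind attempt funnels through the unbind. */
static int sketch_probe(struct scm_probe_sketch *p, int fail_bind, int fail_later)
{
	int rc;

	rc = sketch_bind(p, fail_bind);
	if (rc)
		goto err;

	rc = sketch_later_step(fail_later);
	if (rc)
		goto err;

	return 0;

err:
	sketch_unbind(p);
	return rc;
}
```

In this model the only way to leave probe with the region bound is the success path, which is why I don't see what a module reference would add.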

> -aneesh
>

Re: [PATCH v3 3/3] powerpc/papr_scm: Force a scm-unbind if initial scm-bind fails
