On Thu, Aug 29, 2019 at 4:34 PM Aneesh Kumar K.V
<aneesh.ku...@linux.ibm.com> wrote:
>
> Right now we force an unbind of SCM memory at drcindex on H_OVERLAP error.
> This really slows down operations like kexec where we get the H_OVERLAP
> error because we don't go through a full hypervisor re init.

Maybe we should be unbinding it on a kexec().

> H_OVERLAP error for a H_SCM_BIND_MEM hcall indicates that SCM memory at
> drc index is already bound. Since we don't specify a logical memory
> address for bind hcall, we can use the H_SCM_QUERY hcall to query
> the already bound logical address.

This is a little sketchy since we might have crashed during the initial
bind. Checking if the last block is bound to where we expect it to be
might be a good idea. If it's not where we expect it to be, then an
unbind->bind cycle is the only sane thing to do.

> Boot time difference with and without patch is:
>
> [ 5.583617] IOMMU table initialized, virtual merging enabled
> [ 5.603041] papr_scm ibm,persistent-memory:ibm,pmemory@44104001: Retrying bind after unbinding
> [ 301.514221] papr_scm ibm,persistent-memory:ibm,pmemory@44108001: Retrying bind after unbinding
> [ 340.057238] hv-24x7: read 1530 catalog entries, created 537 event attrs (0 failures), 275 descs

Is the unbind significantly slower than a bind? Or is the region here
just massive?

> after fix
>
> [ 5.101572] IOMMU table initialized, virtual merging enabled
> [ 5.116984] papr_scm ibm,persistent-memory:ibm,pmemory@44104001: Querying SCM details
> [ 5.117223] papr_scm ibm,persistent-memory:ibm,pmemory@44108001: Querying SCM details
> [ 5.120530] hv-24x7: read 1530 catalog entries, created 537 event attrs (0 failures), 275 descs
>
> Signed-off-by: Aneesh Kumar K.V <aneesh.ku...@linux.ibm.com>
> ---
>  arch/powerpc/platforms/pseries/papr_scm.c | 26 ++++++++++++++++++++---
>  1 file changed, 23 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
> index 220e595cb579..4b74cfe7b334 100644
> --- a/arch/powerpc/platforms/pseries/papr_scm.c
> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
> @@ -110,6 +110,27 @@ static void drc_pmem_unbind(struct papr_scm_priv *p)
>         return;
>  }
>
> +static int drc_pmem_query(struct papr_scm_priv *p)
> +{
> +       unsigned long ret[PLPAR_HCALL_BUFSIZE];
> +       int64_t rc;
> +
> +
> +       rc = plpar_hcall(H_SCM_QUERY_BLOCK_MEM_BINDING, ret,
> +                        p->drc_index, 0);
> +
> +       if (rc) {
> +               dev_err(&p->pdev->dev, "Failed to bind SCM");
> +               return rc;
> +       }
> +
> +       p->bound_addr = ret[0];
> +       dev_dbg(&p->pdev->dev, "bound drc 0x%x to %pR\n", p->drc_index, &p->res);
> +
> +       return 0;
> +}
> +
> +
>  static int papr_scm_meta_get(struct papr_scm_priv *p,
>                              struct nd_cmd_get_config_data_hdr *hdr)
>  {
> @@ -431,9 +452,8 @@ static int papr_scm_probe(struct platform_device *pdev)
>
>         /* If phyp says drc memory still bound then force unbound and retry */
>         if (rc == H_OVERLAP) {
> -               dev_warn(&pdev->dev, "Retrying bind after unbinding\n");
> -               drc_pmem_unbind(p);
> -               rc = drc_pmem_bind(p);
> +               dev_warn(&pdev->dev, "Querying SCM details\n");

That's a pretty vague message. If we're going to treat leaving the region
bound over kexec() as normal then you might want to bump it down to
pr_info() or so.

> +               rc = drc_pmem_query(p);
>         }
>
>         if (rc != H_SUCCESS) {
> --
> 2.21.0
>
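
To make the last-block check suggested above a bit more concrete, here is a
rough sketch in plain userspace C so it can be compiled on its own. It is not
papr_scm.c code and has not been run against a real hypervisor. Everything in
it is hypothetical: check_existing_binding(), the query_fn callback (a
stand-in for issuing H_SCM_QUERY_BLOCK_MEM_BINDING with a block index, which
the 0 argument in the patch hints is possible) and fake_query() are made up
for illustration. It also assumes the blocks of a region are bound
contiguously, so the last block is expected at first + (blocks - 1) *
block_size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* What the probe path should do after the bind hcall returned H_OVERLAP. */
enum rebind_action { REUSE_BINDING, UNBIND_AND_REBIND };

/*
 * Hypothetical stand-in for the query hcall: returns 0 and stores the
 * logical address of block 'idx' in *addr, or non-zero if it is not bound.
 */
typedef int (*query_fn)(void *ctx, uint64_t idx, uint64_t *addr);

/*
 * Decide whether an existing binding can be trusted. Assumption: a complete
 * binding is contiguous, so the last block must sit exactly
 * (blocks - 1) * block_size bytes after the first one.
 */
enum rebind_action check_existing_binding(query_fn query, void *ctx,
                                          uint64_t blocks,
                                          uint64_t block_size,
                                          uint64_t *start)
{
        uint64_t first, last;

        assert(query != NULL && start != NULL);

        if (blocks == 0)
                return UNBIND_AND_REBIND;

        /* First block not bound at all: nothing to reuse. */
        if (query(ctx, 0, &first) != 0)
                return UNBIND_AND_REBIND;

        /* Last block missing: the earlier bind probably died half way. */
        if (query(ctx, blocks - 1, &last) != 0)
                return UNBIND_AND_REBIND;

        /* Last block bound, but not where a full bind would have put it. */
        if (last != first + (blocks - 1) * block_size)
                return UNBIND_AND_REBIND;

        *start = first;
        return REUSE_BINDING;
}

/* Fake hypervisor state for trying the logic out, illustration only. */
struct fake_hv {
        uint64_t base;          /* logical address of block 0 */
        uint64_t bound_blocks;  /* how many blocks actually got bound */
        uint64_t block_size;
};

/* Fake query: blocks beyond bound_blocks report "not bound". */
int fake_query(void *ctx, uint64_t idx, uint64_t *addr)
{
        struct fake_hv *hv = ctx;

        if (idx >= hv->bound_blocks)
                return -1;

        *addr = hv->base + idx * hv->block_size;
        return 0;
}
```

With a fake region where all 8 blocks are bound this returns REUSE_BINDING and
reports the base address; with only 5 of 8 bound (a crash mid-bind) it returns
UNBIND_AND_REBIND, which is the case where falling back to the old
unbind->bind path still makes sense.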