On 8/29/19 1:29 PM, Oliver O'Halloran wrote:
On Thu, Aug 29, 2019 at 4:34 PM Aneesh Kumar K.V
<aneesh.ku...@linux.ibm.com> wrote:
Right now we force an unbind of SCM memory at drcindex on H_OVERLAP error.
This really slows down operations like kexec where we get the H_OVERLAP
error because we don't go through a full hypervisor re-init.
Maybe we should be unbinding it on a kexec().
shouldn't ?
H_OVERLAP error for a H_SCM_BIND_MEM hcall indicates that SCM memory at
drc index is already bound. Since we don't specify a logical memory
address for bind hcall, we can use the H_SCM_QUERY hcall to query
the already bound logical address.
This is a little sketchy since we might have crashed during the
initial bind. Checking if the last block is bound to where we expect
it to be might be a good idea. If it's not where we expect it to be,
then an unbind->bind cycle is the only sane thing to do.
I would have expected the hypervisor not to mark the drc index as bound
if the previous BIND request failed.
I can query the start block and last block logical addresses and check
whether the full range of blocks is indeed mapped.
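A rough sketch of the first/last block check discussed above, written as plain C that builds outside the kernel. Everything here is an assumption for illustration: struct fake_binding, query_block_addr(), binding_is_intact() and the block_size parameter are hypothetical names, and the lookup table stands in for whatever the query hcall would return for a given block. It is not the patch's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Marker for a block the (simulated) hypervisor never bound. */
#define UNBOUND UINT64_MAX

/* Hypothetical stand-in for the hypervisor's view of one DRC:
 * the logical address of each block, or UNBOUND. */
struct fake_binding {
	const uint64_t *block_addr;
	uint64_t nr_blocks;
};

/* Hypothetical stand-in for a per-block query hcall.
 * Returns 0 and fills *addr on success, -1 if the block is not bound. */
static int query_block_addr(const struct fake_binding *b, uint64_t block,
			    uint64_t *addr)
{
	if (block >= b->nr_blocks || b->block_addr[block] == UNBOUND)
		return -1;
	*addr = b->block_addr[block];
	return 0;
}

/* Query the first and last block and accept the existing binding only if
 * the last block sits exactly where a contiguous mapping would put it.
 * On success *bound_addr holds the start address; on failure the caller
 * would fall back to an unbind followed by a fresh bind. */
static bool binding_is_intact(const struct fake_binding *b,
			      uint64_t block_size, uint64_t *bound_addr)
{
	uint64_t start, end;

	assert(bound_addr != NULL);

	if (b->nr_blocks == 0)
		return false;
	if (query_block_addr(b, 0, &start))
		return false;
	if (query_block_addr(b, b->nr_blocks - 1, &end))
		return false;
	if (end != start + (b->nr_blocks - 1) * block_size)
		return false;

	*bound_addr = start;
	return true;
}
```

Checking only the two end blocks keeps the cost at two queries regardless of region size, at the price of trusting that the blocks in between are mapped contiguously.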
Boot time difference with and without patch is:
[ 5.583617] IOMMU table initialized, virtual merging enabled
[ 5.603041] papr_scm ibm,persistent-memory:ibm,pmemory@44104001: Retrying bind after unbinding
[ 301.514221] papr_scm ibm,persistent-memory:ibm,pmemory@44108001: Retrying bind after unbinding
[ 340.057238] hv-24x7: read 1530 catalog entries, created 537 event attrs (0 failures), 275 descs
Is the unbind significantly slower than a bind? Or is the region here
just massive?
On unbind. We got two regions, one of 60G and the other of 10G.
after fix
[ 5.101572] IOMMU table initialized, virtual merging enabled
[ 5.116984] papr_scm ibm,persistent-memory:ibm,pmemory@44104001: Querying SCM details
[ 5.117223] papr_scm ibm,persistent-memory:ibm,pmemory@44108001: Querying SCM details
[ 5.120530] hv-24x7: read 1530 catalog entries, created 537 event attrs (0 failures), 275 descs
Signed-off-by: Aneesh Kumar K.V <aneesh.ku...@linux.ibm.com>
---
arch/powerpc/platforms/pseries/papr_scm.c | 26 ++++++++++++++++++++---
1 file changed, 23 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
index 220e595cb579..4b74cfe7b334 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -110,6 +110,27 @@ static void drc_pmem_unbind(struct papr_scm_priv *p)
return;
}
+static int drc_pmem_query(struct papr_scm_priv *p)
+{
+ unsigned long ret[PLPAR_HCALL_BUFSIZE];
+ int64_t rc;
+
+
+ rc = plpar_hcall(H_SCM_QUERY_BLOCK_MEM_BINDING, ret,
+ p->drc_index, 0);
+
+ if (rc) {
+ dev_err(&p->pdev->dev, "Failed to bind SCM");
+ return rc;
+ }
+
+ p->bound_addr = ret[0];
+ dev_dbg(&p->pdev->dev, "bound drc 0x%x to %pR\n", p->drc_index, &p->res);
+
+ return 0;
+}
+
+
static int papr_scm_meta_get(struct papr_scm_priv *p,
struct nd_cmd_get_config_data_hdr *hdr)
{
@@ -431,9 +452,8 @@ static int papr_scm_probe(struct platform_device *pdev)
/* If phyp says drc memory still bound then force unbound and retry */
if (rc == H_OVERLAP) {
- dev_warn(&pdev->dev, "Retrying bind after unbinding\n");
- drc_pmem_unbind(p);
- rc = drc_pmem_bind(p);
+ dev_warn(&pdev->dev, "Querying SCM details\n");
That's a pretty vague message. If we're going to treat leaving the
region bound over kexec() as normal then you might want to bump it
down to pr_info() or so.
sure.
+ rc = drc_pmem_query(p);
}
if (rc != H_SUCCESS) {
--
2.21.0