On Fri, Jan 18, 2019 at 04:47:23PM -0800, Dan Williams wrote:
>In recent days, 2 engineers, including the original author of
>nd_pfn_init(), overlooked the internal call to nd_pfn_validate() and the
>implications to memory allocation.
>
>Clarify this situation to help anyone that reads through this code in
>the future.
>
>Reported-by: Wei Yang <richardw.y...@linux.intel.com>
>Signed-off-by: Dan Williams <dan.j.willi...@intel.com>
>---
> drivers/nvdimm/btt_devs.c |    5 +++++
> drivers/nvdimm/dax_devs.c |    5 +++++
> drivers/nvdimm/pfn_devs.c |   21 +++++++++++++++++++++
> 3 files changed, 31 insertions(+)
>
>diff --git a/drivers/nvdimm/btt_devs.c b/drivers/nvdimm/btt_devs.c
>index 795ad4ff35ca..e0a6f2491e57 100644
>--- a/drivers/nvdimm/btt_devs.c
>+++ b/drivers/nvdimm/btt_devs.c
>@@ -354,6 +354,11 @@ int nd_btt_probe(struct device *dev, struct nd_namespace_common *ndns)
>               put_device(btt_dev);
>       }
> 
>+      /*
>+       * Successful probe indicates to the caller that an nd_btt
>+       * personality device has been registered and the caller can
>+       * fail the probe of the baseline namespace device.
>+       */
>       return rc;
> }
> EXPORT_SYMBOL(nd_btt_probe);
>diff --git a/drivers/nvdimm/dax_devs.c b/drivers/nvdimm/dax_devs.c
>index 0453f49dc708..65010d955fa7 100644
>--- a/drivers/nvdimm/dax_devs.c
>+++ b/drivers/nvdimm/dax_devs.c
>@@ -136,6 +136,11 @@ int nd_dax_probe(struct device *dev, struct nd_namespace_common *ndns)
>       } else
>               __nd_device_register(dax_dev);
> 
>+      /*
>+       * Successful probe indicates to the caller that a device-dax
>+       * personality device has been registered and the caller can
>+       * fail the probe of the baseline namespace device.
>+       */
>       return rc;
> }
> EXPORT_SYMBOL(nd_dax_probe);
>diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c
>index 6f22272e8d80..a8783b5a76ba 100644
>--- a/drivers/nvdimm/pfn_devs.c
>+++ b/drivers/nvdimm/pfn_devs.c
>@@ -576,6 +576,11 @@ int nd_pfn_probe(struct device *dev, struct nd_namespace_common *ndns)
>       } else
>               __nd_device_register(pfn_dev);
> 
>+      /*
>+       * Successful probe indicates to the caller that an nd_pfn
>+       * personality device has been registered and the caller can
>+       * fail the probe of the baseline namespace device.
>+       */
>       return rc;
> }
> EXPORT_SYMBOL(nd_pfn_probe);
>@@ -706,6 +711,22 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn)
>               sig = DAX_SIG;
>       else
>               sig = PFN_SIG;
>+
>+      /*
>+       * Check for an existing 'pfn' superblock before writing a new
>+       * one. The intended flow is that on the first probe of an
>+       * nd_{pfn,dax} device the superblock is calculated and written
>+       * to the namespace. In this case nd_pfn_validate() returns
>+       * -ENODEV because no valid superblock exists currently.
>+       *
>+       * On subsequent probes nd_pfn_validate() will find a valid
>+       * superblock and return 0.
>+       *
>+       * If an assumption of the superblock has been violated, like a
>+       * change to the physical alignment of the namespace,
>+       * nd_pfn_validate() will return an error other than -ENODEV to
>+       * fail probing.
>+       */

Let me reply in this thread. Sorry for my poor understanding; I still don't
get this clearly.

To be honest, the structure is a little complicated. If my understanding is
not correct, please forgive me.

Below is a code flow. To simplify the analysis, I set up the memmap kernel
parameter to emulate persistent memory and configured one namespace in devdax
mode, so that we have the same starting point for the code flow.

Let's start with nd_region_driver:

    nd_region_probe
        nd_region_register_namespaces
            create_namespaces
        nd_region->btt_seed = nd_btt_create(nd_region);
        nd_region->pfn_seed = nd_pfn_create(nd_region);
        nd_region->dax_seed = nd_dax_create(nd_region);

After this, 4 devices have been created:

        namespace0.0, btt0.0, pfn0.0, dax0.0

There are two drivers related to these devices. The relationship between
devices and drivers is:

        nd_pmem_driver: namespace0.0, btt0.0, pfn0.0
        dax_pmem_driver: dax0.0

Only the probe function on namespace0.0 succeeds. Even though the others get
-ENODEV, those devices themselves are not released.

Then let's look at the probe on namespace0.0:

    nd_pmem_probe
        nd_btt_probe
        nd_pfn_probe
        nd_dax_probe

When the namespace is configured as devdax, only nd_dax_probe does any real
work.
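
To make sure I read this dispatch the same way you do, here is a small
userspace model of it, going by the comments added in the patch above. It is
only a sketch: the sketch_* names and the enum are made up for illustration
and are not kernel code, and the -ENXIO return for the baseline namespace is
my assumption about what "fail the probe" means here.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for how the namespace is configured. */
enum sketch_mode { SKETCH_RAW, SKETCH_BTT, SKETCH_PFN, SKETCH_DAX };

/*
 * Models nd_btt_probe/nd_pfn_probe/nd_dax_probe: return 0 only when the
 * namespace carries the matching personality, -ENODEV otherwise.
 */
static int sketch_personality_probe(enum sketch_mode have,
                                    enum sketch_mode want)
{
    return have == want ? 0 : -ENODEV;
}

/*
 * Models the probe of the baseline namespace device: if any personality
 * probe succeeds, a personality device was registered and the baseline
 * probe fails (assumed -ENXIO). Otherwise the raw device is attached.
 */
static int sketch_namespace_probe(enum sketch_mode mode,
                                  bool *raw_disk_attached)
{
    *raw_disk_attached = false;

    if (sketch_personality_probe(mode, SKETCH_BTT) == 0 ||
        sketch_personality_probe(mode, SKETCH_PFN) == 0 ||
        sketch_personality_probe(mode, SKETCH_DAX) == 0)
        return -ENXIO;

    *raw_disk_attached = true;
    return 0;
}
```

In this model a devdax namespace makes only the dax personality probe return
0, and the baseline probe then reports failure without attaching a raw disk.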

Then I see some behavior that differs from your description.

    * nd_dax_probe->nd_pfn_validate() returns 0 instead of -ENODEV.
    * so device dax0.1 is created
    * dax_pmem_probe is called on dax0.1 and nd_pfn_validate() returns 0 too

This means pfn_sb is created twice, in the following functions:

    * nd_dax_probe
    * dax_pmem_probe

Also, I am confused by what you mean by "two probes".

If the two probes are:

    * for dax%d.%d: 1. nd_dax_probe 2. dax_pmem_probe   
    * for pfn%d.%d: 1. nd_pfn_probe 2. nd_pmem_probe    

Then, if the first probe fails, the device itself would be destroyed. How does
the second probe do its job?

>       rc = nd_pfn_validate(nd_pfn, sig);
>       if (rc != -ENODEV)
>               return rc;
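
For reference, this is how I read the three cases described in the new
comment, written as a small userspace model. It is a sketch under my own
assumptions: the sketch_* names and the state enum are hypothetical, and
-EINVAL only stands in for "an error other than -ENODEV".

```c
#include <errno.h>

/* Hypothetical superblock states; not kernel types. */
enum sketch_sb_state {
    SKETCH_SB_MISSING,   /* no valid superblock on the namespace yet */
    SKETCH_SB_VALID,     /* superblock present and consistent */
    SKETCH_SB_MISMATCH   /* superblock present, an assumption violated */
};

/* Models the three outcomes of nd_pfn_validate() from the comment. */
static int sketch_validate(enum sketch_sb_state state)
{
    switch (state) {
    case SKETCH_SB_MISSING:
        return -ENODEV;
    case SKETCH_SB_VALID:
        return 0;
    default:
        return -EINVAL; /* placeholder for any non-ENODEV error */
    }
}

/*
 * Models the start of nd_pfn_init(): only -ENODEV falls through to
 * calculating and writing a new superblock. 0 and all other errors are
 * returned to the caller unchanged.
 */
static int sketch_init(enum sketch_sb_state *state, int *writes)
{
    int rc = sketch_validate(*state);

    if (rc != -ENODEV)
        return rc;

    (*writes)++;               /* superblock is written exactly here */
    *state = SKETCH_SB_VALID;
    return 0;
}
```

In this model the first call on a namespace without a superblock writes one,
a second call finds it valid and writes nothing, and a mismatch fails without
writing.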

-- 
Wei Yang
Help you, Help me
