Hi Stefano,

> -----Original Message-----
> From: Stefano Stabellini <sstabell...@kernel.org>
> Sent: 28 August 2021 9:06
> To: Wei Chen <wei.c...@arm.com>
> Cc: xen-devel@lists.xenproject.org; sstabell...@kernel.org; jul...@xen.org;
> jbeul...@suse.com; Bertrand Marquis <bertrand.marq...@arm.com>
> Subject: Re: [XEN RFC PATCH 23/40] xen/arm: introduce a helper to parse
> device tree memory node
> 
> On Wed, 11 Aug 2021, Wei Chen wrote:
> > The NUMA ID of a memory block is stored in the device tree's memory
> > nodes as the "numa-node-id" property. We need a new helper to parse
> > and verify this ID from memory nodes.
> >
> > To support memory affinity later, the valid memory ranges and NUMA
> > IDs will be saved to tables.
> >
> > Signed-off-by: Wei Chen <wei.c...@arm.com>
> > ---
> >  xen/arch/arm/numa_device_tree.c | 130 ++++++++++++++++++++++++++++++++
> >  1 file changed, 130 insertions(+)
> >
> > diff --git a/xen/arch/arm/numa_device_tree.c b/xen/arch/arm/numa_device_tree.c
> > index 37cc56acf3..bbe081dcd1 100644
> > --- a/xen/arch/arm/numa_device_tree.c
> > +++ b/xen/arch/arm/numa_device_tree.c
> > @@ -20,11 +20,13 @@
> >  #include <xen/init.h>
> >  #include <xen/nodemask.h>
> >  #include <xen/numa.h>
> > +#include <xen/libfdt/libfdt.h>
> >  #include <xen/device_tree.h>
> >  #include <asm/setup.h>
> >
> >  s8 device_tree_numa = 0;
> >  static nodemask_t processor_nodes_parsed __initdata;
> > +static nodemask_t memory_nodes_parsed __initdata;
> >
> >  static int srat_disabled(void)
> >  {
> > @@ -55,6 +57,79 @@ static int __init dtb_numa_processor_affinity_init(nodeid_t node)
> >      return 0;
> >  }
> >
> > +/* Callback for parsing of the memory regions affinity */
> > +static int __init dtb_numa_memory_affinity_init(nodeid_t node,
> > +                                paddr_t start, paddr_t size)
> > +{
> > +    struct node *nd;
> > +    paddr_t end;
> > +    int i;
> > +
> > +    if ( srat_disabled() )
> > +        return -EINVAL;
> > +
> > +    end = start + size;
> > +    if ( num_node_memblks >= NR_NODE_MEMBLKS )
> > +    {
> > +        dprintk(XENLOG_WARNING,
> > +                "Too many numa entry, try bigger NR_NODE_MEMBLKS \n");
> > +        bad_srat();
> > +        return -EINVAL;
> > +    }
> > +
> > +    /* It is fine to add this area to the nodes data; it will be used later */
> > +    i = conflicting_memblks(start, end);
> > +    /* No conflicting memory block, we can save it for later usage */
> > +    if ( i < 0 )
> > +        goto save_memblk;
> > +
> > +    if ( memblk_nodeid[i] == node ) {
> > +        /*
> > +         * Overlaps with another memblk in the same node; warn here.
> > +         * This memblk will be merged with the conflicting memblk later.
> > +         */
> > +        printk(XENLOG_WARNING
> > +               "DT: NUMA NODE %u (%"PRIx64
> > +               "-%"PRIx64") overlaps with itself (%"PRIx64"-
> %"PRIx64")\n",
> > +               node, start, end,
> > +               node_memblk_range[i].start, node_memblk_range[i].end);
> > +    } else {
> > +        /*
> > +         * Conflicts with a memblk in another node; this is an error.
> > +         * The NUMA information is invalid, so NUMA will be turned off.
> > +         */
> > +        printk(XENLOG_ERR
> > +               "DT: NUMA NODE %u (%"PRIx64"-%"
> > +               PRIx64") overlaps with NODE %u (%"PRIx64"-%"PRIx64")\n",
> > +               node, start, end, memblk_nodeid[i],
> > +               node_memblk_range[i].start, node_memblk_range[i].end);
> > +        bad_srat();
> > +        return -EINVAL;
> > +    }
> > +
> > +save_memblk:
> > +    nd = &nodes[node];
> > +    if ( !node_test_and_set(node, memory_nodes_parsed) ) {
> > +        nd->start = start;
> > +        nd->end = end;
> > +    } else {
> > +        if ( start < nd->start )
> > +            nd->start = start;
> > +        if ( nd->end < end )
> > +            nd->end = end;
> > +    }
> > +
> > +    printk(XENLOG_INFO "DT: NUMA node %u %"PRIx64"-%"PRIx64"\n",
> > +           node, start, end);
> > +
> > +    node_memblk_range[num_node_memblks].start = start;
> > +    node_memblk_range[num_node_memblks].end = end;
> > +    memblk_nodeid[num_node_memblks] = node;
> > +    num_node_memblks++;
> 
> 
> Is it possible to have non-contiguous ranges of memory for a single NUMA
> node?
> 
> Looking at the DT bindings and Linux implementation, it seems possible.
> Here, it seems that node_memblk_range/memblk_nodeid could handle it,
> but nodes couldn't.

Yes, you're right. I copied this code from x86 ACPI NUMA. Does ACPI allow
non-contiguous ranges of memory for a single NUMA node too? If it does, this
will affect x86 ACPI NUMA as well. In the next version, we plan to merge
dtb_numa_memory_affinity_init and acpi_numa_memory_affinity_init into a
firmware-neutral function, so we can fix both at the same time, roughly as
sketched below.
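Something like this (numa_update_node_memblk is just a placeholder name, not
a final API; both parsers would do their firmware-specific validation first
and then call the common helper to record the range):

/*
 * Placeholder sketch only: record one memory range for a node in the
 * common tables, shared by the DT and ACPI parsers.
 */
static int __init numa_update_node_memblk(nodeid_t node,
                                          paddr_t start, paddr_t end)
{
    /* Common tables are fixed-size, so reject overflow here */
    if ( num_node_memblks >= NR_NODE_MEMBLKS )
        return -ENOSPC;

    node_memblk_range[num_node_memblks].start = start;
    node_memblk_range[num_node_memblks].end = end;
    memblk_nodeid[num_node_memblks] = node;
    num_node_memblks++;

    return 0;
}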

If not, we may have to keep separate implementations for DT and ACPI here.
Anyway, thanks for pointing this out; I will look into the latest Linux
implementation.
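
For reference, the Linux NUMA binding does allow one node to own several
disjoint ranges. A made-up example (addresses are illustrative only, assuming
#address-cells = <2> and #size-cells = <2>):

memory@0 {
    device_type = "memory";
    reg = <0x0 0x00000000 0x0 0x40000000>;  /* 1 GiB at 0 */
    numa-node-id = <0>;
};

memory@100000000 {
    device_type = "memory";
    reg = <0x1 0x00000000 0x0 0x40000000>;  /* 1 GiB at 4 GiB */
    numa-node-id = <0>;                     /* same node, not contiguous */
};

node_memblk_range[]/memblk_nodeid[] can hold both entries, but a single
nodes[0].start/end pair would wrongly cover the hole between them.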

Cheers,
Wei Chen

