On 21.06.2014 [13:06:53 +1000], Alexey Kardashevskiy wrote: > On 06/21/2014 08:55 AM, Nishanth Aravamudan wrote: > > On 16.06.2014 [17:53:49 +1000], Alexey Kardashevskiy wrote: > >> Current QEMU does not support memoryless NUMA nodes. > >> This prepares SPAPR for that. > >> > >> This moves 2 calls of spapr_populate_memory_node() into > >> the existing loop which handles nodes other than than > >> the first one. > >> > >> Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru> > >> --- > >> hw/ppc/spapr.c | 31 +++++++++++-------------------- > >> 1 file changed, 11 insertions(+), 20 deletions(-) > >> > >> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c > >> index cb3a10a..666b676 100644 > >> --- a/hw/ppc/spapr.c > >> +++ b/hw/ppc/spapr.c > >> @@ -689,28 +689,13 @@ static void spapr_populate_memory_node(void *fdt, > >> int nodeid, hwaddr start, > >> > >> static int spapr_populate_memory(sPAPREnvironment *spapr, void *fdt) > >> { > >> - hwaddr node0_size, mem_start, node_size; > >> + hwaddr mem_start, node_size; > >> int i; > >> > >> - /* memory node(s) */ > >> - if (nb_numa_nodes > 1 && node_mem[0] < ram_size) { > >> - node0_size = node_mem[0]; > >> - } else { > >> - node0_size = ram_size; > >> - } > >> - > >> - /* RMA */ > >> - spapr_populate_memory_node(fdt, 0, 0, spapr->rma_size); > >> - > >> - /* RAM: Node 0 */ > >> - if (node0_size > spapr->rma_size) { > >> - spapr_populate_memory_node(fdt, 0, spapr->rma_size, > >> - node0_size - spapr->rma_size); > >> - } > >> - > >> - /* RAM: Node 1 and beyond */ > >> - mem_start = node0_size; > >> - for (i = 1; i < nb_numa_nodes; i++) { > >> + for (i = 0, mem_start = 0; i < nb_numa_nodes; ++i) { > >> + if (!node_mem[i]) { > >> + continue; > >> + } > > > > Doesn't this skip memoryless nodes? What actually puts the memoryless > > node in the device-tree? > > It does skip. > > > And if you were to put them in, wouldn't spapr_populate_memory_node() > > fail because we'd be creating two nodes with memory@XXX where XXX is the > > same (starting address) for both? > > I cannot do this now - there is no way to tell from the command line > where I want NUMA node memory start from so I'll end up with multiple > nodes with the same name and QEMU won't start. When NUMA fixes reach > upstream, I'll try to work out something on top of that.
So in mst's tree, which I've rebased your patches, we have a struct defining each NUMA node, which has a size (and the index is the nodeid). I've got patches working that allow for sparse indexing, but I'm curious what you think we should do for the naming. I can send out the patches, with the caveat that architectures still need to fix the remaining issues for memoryless nodes? Thanks, Nish