On 22.04.2014 [19:27:51 +1000], Benjamin Herrenschmidt wrote:
> On Tue, 2014-04-22 at 19:12 +1000, Alexey Kardashevskiy wrote:
> > I already have a plan to fix non-power-of-two sized memory nodes, so
> > I will fix this too.
> >
> > What exactly is the point in having NUMA memoryless nodes now? We do
> > not support memory hotplug yet, and balloon is not memory hotplug at
> > all.
>
> It exists on real HW and it's always causing funny bugs, so being able
> to simulate it in qemu would make it easier to test & debug.
Yes, my primary purpose here is to help test & debug issues I see in
other environments.

> > And I still fail to see how the patch is wrong. Maybe the idea of
> > having DT memory nodes be the same thing as NUMA memory nodes is
> > not the best one, but the patch is not changing that.

No, but it does make it an error for the RMA not to be in node 0:

+    if (spapr->rma_size > node0_size) {
+        fprintf(stderr, "Error: Numa node 0 has to span the RMA (%#08"HWADDR_PRIx")\n",
+                spapr->rma_size);
+        exit(1);
+    }

which will never be the case if node 0 has no memory? I'm fine with the
change being left in, I guess; I just want to make sure the semantics
are intended.

FWIW, if one instead tries to specify node 1 as memoryless, and nodes 0
and 2 as having memory, this code:

        sprintf(mem_name, "memory@" TARGET_FMT_lx, mem_start);
        off = fdt_add_subnode(fdt, 0, mem_name);
        _FDT(off);

ends up getting a duplicate error from _FDT, because we're trying to
create memory@<end of node 0> twice: once for node 1 and once for node
2. I'm not actually sure what we're supposed to do in that situation.

Looking at a PowerVM LPAR with the following topology [1]:

numactl --hardware
available: 3 nodes (1-3)
node 1 cpus: 0 1 2 3 4 5 6 7 8 9 10 11
node 1 size: 0 MB
node 1 free: 0 MB
node 2 cpus:
node 2 size: 7935 MB
node 2 free: 7302 MB
node 3 cpus:
node 3 size: 8396 MB
node 3 free: 8338 MB
node distances:
node   1   2   3
  1:  10  20  20
  2:  20  10  20
  3:  20  20  10

I only see /proc/device-tree/memory@0. Perhaps the memory in nodes 2
and 3 is described via ibm,dynamic-reconfiguration-memory?

Thanks,
Nish

[1] Running a hacked kernel that doesn't require node 0 to be present
in the first place.
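
P.S. To make the duplicate-node failure above concrete, here is a rough
sketch of the loop shape I have in mind. This is only my assumption
about how the loop around that snippet looks in hw/ppc/spapr.c
(nb_numa_nodes and node_mem[] are my guesses at the variables in play),
not a tested patch:

    /* Sketch only: emit one memory@<addr> DT node per NUMA node.
     * For a memoryless node, mem_len is 0, so mem_start does not
     * advance; without the skip, the next iteration computes the
     * same "memory@..." name and fdt_add_subnode() returns
     * -FDT_ERR_EXISTS, which _FDT() treats as fatal.
     */
    for (i = 0; i < nb_numa_nodes; i++) {
        mem_len = node_mem[i];   /* assumed per-node size array */
        if (!mem_len) {
            continue;            /* skip memoryless nodes entirely? */
        }
        sprintf(mem_name, "memory@" TARGET_FMT_lx, mem_start);
        off = fdt_add_subnode(fdt, 0, mem_name);
        _FDT(off);
        /* ... reg/ibm,associativity properties elided ... */
        mem_start += mem_len;
    }

Whether skipping is actually the right semantics for a memoryless node,
versus describing it some other way in the device tree, is exactly the
question above.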