Re: [Qemu-devel] Regression (?) due to c4177479 ('spapr: make sure RMA is in first mode of first memory node')

Alexey Kardashevskiy Fri, 25 Apr 2014 07:18:35 -0700

On 04/23/2014 05:04 AM, Nishanth Aravamudan wrote:
> On 22.04.2014 [19:27:51 +1000], Benjamin Herrenschmidt wrote:
>> On Tue, 2014-04-22 at 19:12 +1000, Alexey Kardashevskiy wrote:
>>> I already plan to fix non-power-of-two sized memory nodes, so I will
>>> fix this too.
>>>
>>> What exactly is the point in having NUMA memoryless nodes now? We do
>>> not support memory hotplug yet and balloon is not memory hotplug at
>>> all.
>>
>> It exists on real HW and it's always causing funny bugs, so being able to
>> simulate it in qemu would make it easier to test & debug.
> 
> Yes, my primary purpose here is to help test & debug issues I see in
> other environments.
> 
>>> And I still fail to see how the patch is wrong. Maybe the idea of
>>> having DT memory nodes the same thing as NUMA memory nodes is not
>>> the best one, but the patch is not changing that.
> 
> No, but it does make it an error for the RMA not to be in node0.
> 
> +    if (spapr->rma_size > node0_size) {
> +        fprintf(stderr, "Error: Numa node 0 has to span the RMA (%#08"HWADDR_PR
> +                spapr->rma_size);
> +        exit(1);
> +    }
> 
> which will never be the case if node0 has no memory?
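
To make the case being discussed concrete, here is a minimal standalone sketch of the comparison from the quoted hunk. Only the `rma_size > node0_size` test comes from the patch; the helper name `rma_fits_node0` and the plain `uint64_t` parameters are assumptions made for illustration and are not QEMU code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the quoted check: the RMA is accepted
 * only when it is no larger than the memory assigned to node 0.
 * The patch exits with an error in the opposite case.
 */
static bool rma_fits_node0(uint64_t rma_size, uint64_t node0_size)
{
    return rma_size <= node0_size;
}
```

With `node0_size == 0`, any non-zero `rma_size` is rejected, which is the memoryless node 0 situation raised above.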



Apart from reproducing "funny" bugs seen on real hardware, is there any other
point in memoryless nodes?


> I'm fine with the change being left in, I guess, I just want to make
> sure the semantics are intended.


It is intended. What makes you think it is not, taking into account that
_now_ DT memory nodes are the same as NUMA nodes in QEMU's sPAPR machine?

I am really confused.


> 
> FWIW, if one instead tries to specify node 1 as memoryless, and nodes 0
> and 2 as having memory:
> 
>       sprintf(mem_name, "memory@" TARGET_FMT_lx, mem_start);
>       off = fdt_add_subnode(fdt, 0, mem_name);
>       _FDT(off);
> 
> ends up getting a duplicate error from _FDT because we're trying
> to create memory@<end of node 0> twice, once for node 1 and once for
> node 2. I'm not actually sure what we're supposed to do in that
> situation. Looking at a PowerVM LPAR with the following topology [1]:
> 
> numactl --hardware
> available: 3 nodes (1-3)
> node 1 cpus: 0 1 2 3 4 5 6 7 8 9 10 11
> node 1 size: 0 MB
> node 1 free: 0 MB
> node 2 cpus:
> node 2 size: 7935 MB
> node 2 free: 7302 MB
> node 3 cpus:
> node 3 size: 8396 MB
> node 3 free: 8338 MB
> node distances:
> node   1   2   3 
>   1:  10  20  20 
>   2:  20  10  20 
>   3:  20  20  10 
> 
> I only see /proc/device-tree/memory@0. Perhaps the node 2 and node 3 are
> from ibm,dynamic-reconfiguration-memory?


It would help if you told us exactly how you run QEMU.
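
For reference, the duplicate memory@ name described in the quoted text can be reproduced outside QEMU. The sketch below shows one possible way to avoid it, by not emitting a name for a node with no memory. It is an assumption-laden illustration, not the spapr code: `build_memory_node_names`, the fixed 32-byte name buffers, and the unpadded `PRIx64` format (instead of `TARGET_FMT_lx`) are all hypothetical.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define NAME_LEN 32

/*
 * Hypothetical helper: walk the per-node memory sizes and build one
 * "memory@<start>" name per node that actually has memory. A node of
 * size 0 does not advance mem_start, so naming it would repeat the
 * start address of the next node; skipping it avoids the duplicate.
 * Returns the number of names written (at most max_names).
 */
static int build_memory_node_names(const uint64_t *node_size, int nb_nodes,
                                   char names[][NAME_LEN], int max_names)
{
    uint64_t mem_start = 0;
    int count = 0;

    for (int i = 0; i < nb_nodes; i++) {
        if (node_size[i] == 0) {
            continue; /* memoryless node: no memory@ subnode */
        }
        if (count < max_names) {
            snprintf(names[count], NAME_LEN, "memory@%" PRIx64, mem_start);
            count++;
        }
        mem_start += node_size[i];
    }
    return count;
}
```

For nodes 0 and 2 with 1 GiB each and an empty node 1, this yields two distinct names rather than two subnodes at the end of node 0. Whether skipping is the right device-tree representation for a memoryless node is exactly the open question in this thread.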


> 
> Thanks,
> Nish
> 
> [1] Running a hacked kernel that doesn't require node 0 to be present in
> the first place.
> 


-- 
Alexey
