On Tue, Jun 22, 2021 at 06:53:41PM +1000, Gavin Shan wrote: > Hi Drew, > > On 6/3/21 2:48 PM, Gavin Shan wrote: > > On 6/2/21 9:36 PM, Andrew Jones wrote: > > > On Wed, Jun 02, 2021 at 11:09:32AM +1000, Gavin Shan wrote: > > > > On 6/1/21 5:50 PM, Andrew Jones wrote: > > > > > On Tue, Jun 01, 2021 at 03:30:04PM +0800, Gavin Shan wrote: > > > > > > We possibly populate empty nodes where memory isn't included and > > > > > > might > > > > > > be hot added at late time. The FDT memory nodes can't be created due > > > > > > to conflicts on their names if multiple empty nodes are specified. > > > > > > For example, the VM fails to start with the following error > > > > > > messages. > > > > > > > > > > > > /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 > > > > > > \ > > > > > > -accel kvm -machine virt,gic-version=host > > > > > > \ > > > > > > -cpu host -smp 4,sockets=2,cores=2,threads=1 -m > > > > > > 1024M,maxmem=64G \ > > > > > > -object memory-backend-ram,id=mem0,size=512M > > > > > > \ > > > > > > -object memory-backend-ram,id=mem1,size=512M > > > > > > \ > > > > > > -numa node,nodeid=0,cpus=0-1,memdev=mem0 > > > > > > \ > > > > > > -numa node,nodeid=1,cpus=2-3,memdev=mem1 > > > > > > \ > > > > > > -numa node,nodeid=2 > > > > > > \ > > > > > > -numa node,nodeid=3 > > > > > > \ > > > > > > : > > > > > > -device virtio-balloon-pci,id=balloon0,free-page-reporting=yes > > > > > > > > > > > > qemu-system-aarch64: FDT: Failed to create subnode > > > > > > /memory@80000000: \ > > > > > > FDT_ERR_EXISTS > > > > > > > > > > > > This fixes the issue by using NUMA node ID or zero in the memory > > > > > > node > > > > > > name to avoid the conflicting memory node names. With this applied, > > > > > > the > > > > > > VM can boot successfully with above command lines. > > > > > > > > > > > > Signed-off-by: Gavin Shan <gs...@redhat.com> > > > > > > --- > > > > > > hw/arm/boot.c | 7 ++++++- > > > > > > 1 file changed, 6 insertions(+), 1 deletion(-) > > > > > > > > > > > > diff --git a/hw/arm/boot.c b/hw/arm/boot.c > > > > > > index d7b059225e..3169bdf595 100644 > > > > > > --- a/hw/arm/boot.c > > > > > > +++ b/hw/arm/boot.c > > > > > > @@ -432,7 +432,12 @@ static int fdt_add_memory_node(void *fdt, > > > > > > uint32_t acells, hwaddr mem_base, > > > > > > char *nodename; > > > > > > int ret; > > > > > > - nodename = g_strdup_printf("/memory@%" PRIx64, mem_base); > > > > > > + if (numa_node_id >= 0) { > > > > > > + nodename = g_strdup_printf("/memory@%d", numa_node_id); > > > > > > + } else { > > > > > > + nodename = g_strdup("/memory@0"); > > > > > > + } > > > > > > + > > > > > > qemu_fdt_add_subnode(fdt, nodename); > > > > > > qemu_fdt_setprop_string(fdt, nodename, "device_type", > > > > > > "memory"); > > > > > > ret = qemu_fdt_setprop_sized_cells(fdt, nodename, "reg", > > > > > > acells, mem_base, > > [...] > > > > > I've sent one separate mail to check with Rob Herring. Hopefully he have > > ideas as he is maintaining linux FDT subsystem. You have been included to > > that thread. I didn't find something meaningful to this thread after doing > > some google search either. > > > > Yes, I agree with you we need to follow the specification strictly. It seems > > it's uncertain about the 'physical memory map' bus binding requirements. > > > > I didn't get expected answers from device-tree experts.
What response did you get? Can you please provide a link to the discussion? > After rethinking about it, > I plan to fix this like this way, but please let me know if it sounds sensible > to you. > > The idea is to assign a (not overlapped) dummy base address to each memory > node in the device-tree. The dummy is (last_valid_memory_address + NUMA ID). > The 'length' of the 'reg' property in the device-tree nodes, corresponding > to empty NUMA nodes, is still zero. This ensures the nodes are still invalid > until memory is added to these nodes. > > I had the temporary patch for the implementation. It works fine and VM can > boot up successfully. I would like to be sure that the kernel developers for NUMA, memory hotplug, and devicetree specifications are all happy with the approach before adding it to QEMU. Thanks, drew