In implementations of ARM64 architecture, at most there could be a CPU topology hierarchy like "sockets/dies/clusters/cores/threads" defined. For example, some ARM64 server chip Kunpeng 920 totally has 2 sockets, 2 NUMA nodes (also represent CPU dies range) in each socket, 6 clusters in each NUMA node, 4 CPU cores in each cluster.
Clusters within the same NUMA share the L3 cache data and cores within the same cluster share a L2 cache and a L3 cache tag. Given that designing a vCPU topology with cluster level for the guest can gain scheduling performance improvement, let's support this new parameter on ARM virt machines. After this, we can define a 4-level CPU topology hierarchy like: cpus=*,maxcpus=*,sockets=*,clusters=*,cores=*,threads=*. Signed-off-by: Yanan Wang <wangyana...@huawei.com> --- hw/arm/virt.c | 1 + qemu-options.hx | 10 ++++++++++ 2 files changed, 11 insertions(+) diff --git a/hw/arm/virt.c b/hw/arm/virt.c index 6bce595aba..f413e146d9 100644 --- a/hw/arm/virt.c +++ b/hw/arm/virt.c @@ -2700,6 +2700,7 @@ static void virt_machine_class_init(ObjectClass *oc, void *data) hc->unplug_request = virt_machine_device_unplug_request_cb; hc->unplug = virt_machine_device_unplug_cb; mc->nvdimm_supported = true; + mc->smp_props.clusters_supported = true; mc->auto_enable_numa_with_memhp = true; mc->auto_enable_numa_with_memdev = true; mc->default_ram_id = "mach-virt.ram"; diff --git a/qemu-options.hx b/qemu-options.hx index fd1f8135fb..69ef1cdb85 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -277,6 +277,16 @@ SRST -smp 16,sockets=2,dies=2,cores=2,threads=2,maxcpus=16 + The following sub-option defines a CPU topology hierarchy (2 sockets + totally on the machine, 2 clusters per socket, 2 cores per cluster, + 2 threads per core) for ARM virt machines which support sockets/clusters + /cores/threads. Some members of the option can be omitted but their values + will be automatically computed: + + :: + + -smp 16,sockets=2,clusters=2,cores=2,threads=2,maxcpus=16 + Historically preference was given to the coarsest topology parameters when computing missing values (ie sockets preferred over cores, which were preferred over threads), however, this behaviour is considered -- 2.27.0