Hi,

Using a QEMU pseries guest with the following SMP topology and a
single NUMA node:


(...) -smp 32,threads=4,cores=4,sockets=2, (...)

This is the output of lscpu with a guest running v5.12-rc5:

[root@localhost ~]# lscpu
Architecture:        ppc64le
Byte Order:          Little Endian
CPU(s):              32
On-line CPU(s) list: 0-31
Thread(s) per core:  4
Core(s) per socket:  8
Socket(s):           1
NUMA node(s):        1
Model:               2.2 (pvr 004e 1202)
Model name:          POWER9 (architected), altivec supported
Hypervisor vendor:   KVM
Virtualization type: para
L1d cache:           32K
L1i cache:           32K
NUMA node0 CPU(s):   0-31
[root@localhost ~]#


The cpu_core_mask changes made the reported sockets match the NUMA nodes.
In this case, given that we have a single NUMA node, the SMP topology got
adjusted to 8 cores per socket instead of 4 so that a single socket is
reported as well.

Although sockets matching NUMA nodes is true for Power hardware, QEMU doesn't
have this constraint, and users expect sockets and NUMA nodes to be mostly
independent, however impractical that would be with real hardware.
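For reference, the arithmetic behind the two lscpu outputs can be sketched
as follows. This is a standalone illustration using only the values from the
command line above; it is not kernel or QEMU code:

```python
# Sketch: how the -smp parameters map to what lscpu should report.
# Values are taken from "-smp 32,threads=4,cores=4,sockets=2".

def expected_topology(cpus, threads, cores, sockets):
    """Return (threads/core, cores/socket, sockets) lscpu should show."""
    assert cpus == threads * cores * sockets
    return threads, cores, sockets

# What the QEMU user asked for:
print(expected_topology(32, 4, 4, 2))   # (4, 4, 2)

# The pre-series kernel instead folded both sockets into the single
# NUMA node, reporting 1 socket with 8 cores:
folded_sockets = 1                       # one socket per NUMA node
folded_cores = 32 // (4 * folded_sockets)
print((4, folded_cores, folded_sockets))  # (4, 8, 1)
```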


The same guest running a kernel with this series applied:


[root@localhost ~]# lscpu
Architecture:        ppc64le
Byte Order:          Little Endian
CPU(s):              32
On-line CPU(s) list: 0-31
Thread(s) per core:  4
Core(s) per socket:  4
Socket(s):           2
NUMA node(s):        1
Model:               2.2 (pvr 004e 1202)
Model name:          POWER9 (architected), altivec supported
Hypervisor vendor:   KVM
Virtualization type: para
L1d cache:           32K
L1i cache:           32K
NUMA node0 CPU(s):   0-31


The sockets and NUMA nodes are now represented independently, as requested
via the QEMU command line.


Thanks for looking into this, Srikar. For all patches:


Tested-by: Daniel Henrique Barboza <danielhb...@gmail.com>



On 4/15/21 9:09 AM, Srikar Dronamraju wrote:
Daniel had reported that QEMU is now unable to see requested topologies
in a multi-socket, single NUMA node configuration, e.g.:

-smp 8,maxcpus=8,cores=2,threads=2,sockets=2

This patchset reintroduces cpu_core_mask so that users can see requested
topologies while still maintaining the boot time of very large system
configurations.

It includes caching the chip_id, as suggested by Michael Ellerman.

4 Threads/Core; 4 cores/Socket; 4 Sockets/Node, 2 Nodes in System
   -numa node,nodeid=0,memdev=m0 \
   -numa node,nodeid=1,memdev=m1 \
   -smp 128,sockets=8,threads=4,maxcpus=128  \
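The per-node CPU lists in the lscpu outputs below follow from the sockets
being spread alternately across the two NUMA nodes. A quick sketch of that
layout, assuming 16 CPUs per socket (4 cores x 4 threads) and alternating
node assignment; this is only an illustration of the expected mapping, not
QEMU's actual placement algorithm:

```python
# Sketch of the CPU -> NUMA node layout for the command line above:
# 128 CPUs, 8 sockets of 16 CPUs each, sockets alternating between
# the 2 nodes (socket 0 -> node 0, socket 1 -> node 1, ...).

cpus_per_socket = 4 * 4           # threads=4, cores/socket=4
nodes = {0: [], 1: []}
for socket in range(8):
    node = socket % 2             # alternate sockets between nodes
    first = socket * cpus_per_socket
    nodes[node].extend(range(first, first + cpus_per_socket))

def ranges(cpus):
    """Collapse a sorted CPU list into lscpu-style 'a-b' ranges."""
    out, start = [], cpus[0]
    for prev, cur in zip(cpus, cpus[1:]):
        if cur != prev + 1:
            out.append(f"{start}-{prev}")
            start = cur
    out.append(f"{start}-{cpus[-1]}")
    return ",".join(out)

print("node0:", ranges(nodes[0]))  # 0-15,32-47,64-79,96-111
print("node1:", ranges(nodes[1]))  # 16-31,48-63,80-95,112-127
```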

5.12.0-rc5 (or any kernel with commit 4ca234a9cbd7)
---------------------------------------------------
srikar@cloudy:~$ lscpu
Architecture:                    ppc64le
Byte Order:                      Little Endian
CPU(s):                          128
On-line CPU(s) list:             0-127
Thread(s) per core:              4
Core(s) per socket:              16
Socket(s):                       2                 <<<<<-----
NUMA node(s):                    2
Model:                           2.3 (pvr 004e 1203)
Model name:                      POWER9 (architected), altivec supported
Hypervisor vendor:               KVM
Virtualization type:             para
L1d cache:                       1 MiB
L1i cache:                       1 MiB
NUMA node0 CPU(s):               0-15,32-47,64-79,96-111
NUMA node1 CPU(s):               16-31,48-63,80-95,112-127
--
srikar@cloudy:~$ dmesg |grep smp
[    0.010658] smp: Bringing up secondary CPUs ...
[    0.424681] smp: Brought up 2 nodes, 128 CPUs
--

5.12.0-rc5 + 3 patches
----------------------
srikar@cloudy:~$ lscpu
Architecture:                    ppc64le
Byte Order:                      Little Endian
CPU(s):                          128
On-line CPU(s) list:             0-127
Thread(s) per core:              4
Core(s) per socket:              4
Socket(s):                       8    <<<<-----
NUMA node(s):                    2
Model:                           2.3 (pvr 004e 1203)
Model name:                      POWER9 (architected), altivec supported
Hypervisor vendor:               KVM
Virtualization type:             para
L1d cache:                       1 MiB
L1i cache:                       1 MiB
NUMA node0 CPU(s):               0-15,32-47,64-79,96-111
NUMA node1 CPU(s):               16-31,48-63,80-95,112-127
--
srikar@cloudy:~$ dmesg |grep smp
[    0.010372] smp: Bringing up secondary CPUs ...
[    0.417892] smp: Brought up 2 nodes, 128 CPUs

5.12.0-rc5
----------
srikar@cloudy:~$  lscpu
Architecture:                    ppc64le
Byte Order:                      Little Endian
CPU(s):                          1024
On-line CPU(s) list:             0-1023
Thread(s) per core:              8
Core(s) per socket:              128
Socket(s):                       1
NUMA node(s):                    1
Model:                           2.3 (pvr 004e 1203)
Model name:                      POWER9 (architected), altivec supported
Hypervisor vendor:               KVM
Virtualization type:             para
L1d cache:                       4 MiB
L1i cache:                       4 MiB
NUMA node0 CPU(s):               0-1023
srikar@cloudy:~$ dmesg | grep smp
[    0.027753 ] smp: Bringing up secondary CPUs ...
[    2.315193 ] smp: Brought up 1 node, 1024 CPUs

5.12.0-rc5 + 3 patches
----------------------
srikar@cloudy:~$ dmesg | grep smp
[    0.027659 ] smp: Bringing up secondary CPUs ...
[    2.532739 ] smp: Brought up 1 node, 1024 CPUs

I have also booted and tested the kernels on PowerVM and PowerNV, and
even there I see only a negligible increase in the time to bring up
secondary CPUs.

Srikar Dronamraju (3):
   powerpc/smp: Reintroduce cpu_core_mask
   Revert "powerpc/topology: Update topology_core_cpumask"
   powerpc/smp: Cache CPU to chip lookup

  arch/powerpc/include/asm/smp.h      |  6 ++++
  arch/powerpc/include/asm/topology.h |  2 +-
  arch/powerpc/kernel/prom.c          | 19 +++++++---
  arch/powerpc/kernel/smp.c           | 56 +++++++++++++++++++++++++----
  4 files changed, 71 insertions(+), 12 deletions(-)
