>>In our configuration `numa` is just a boolean, not a count like in the
>>above example, so IMO if no topology is defined but numa enabled we
>>should just let qemu do its thing, which is the behavior we used to have
>>before hugepages.
>>
>>So in order to restore the old behavior I'd like to apply the following
>>patch, note that the very same check still exists in the `numaX` entry
>>loop further up in the code.
Ok, I thought we already had that check before hugepages. (I really don't
know the behaviour (performance) of auto NUMA balancing on the host when
the guest has more NUMA nodes.)

But I think we still need the check if hugepages are enabled, something
like:

- die "host NUMA node$i doesn't exist\n" if ! -d "/sys/devices/system/node/node$i/";
+ die "host NUMA node$i doesn't exist\n" if ! -d "/sys/devices/system/node/node$i/" && $conf->{hugepages};

----- Original Message -----
From: "Wolfgang Bumiller" <w.bumil...@proxmox.com>
To: "aderumier" <aderum...@odiso.com>
Cc: "pve-devel" <pve-devel@pve.proxmox.com>
Sent: Thursday, 28 July 2016 09:24:58
Subject: Re: [pve-devel] 3 numa topology issues

On Thu, Jul 28, 2016 at 08:44:47AM +0200, Alexandre DERUMIER wrote:
> I'm looking at the openstack implementation
>
> https://specs.openstack.org/openstack/nova-specs/specs/juno/implemented/virt-driver-numa-placement.html
>
> and it seems that they check whether host NUMA nodes exist too:
>
> "hw:numa_nodes=NN - number of NUMA nodes to expose to the guest.
> The most common case will be that the admin only sets ‘hw:numa_nodes’ and
> then the flavor vCPUs and RAM will be divided equally across the NUMA nodes."
>
> This is what we are doing with numa:1. (we use sockets to know how many
> NUMA nodes we need)
>
> "So, given an example config:
>
> vcpus=8
> mem=4
> hw:numa_nodes=2 - number of NUMA nodes to expose to the guest.
> hw:numa_cpus.0=0,1,2,3,4,5
> hw:numa_cpus.1=6,7
> hw:numa_mem.0=3072
> hw:numa_mem.1=1024
>
> The scheduler will look for a host with 2 NUMA nodes with the ability to
> run 6 CPUs + 3 GB of RAM on one node, and 2 CPUs + 1 GB of RAM on another
> node. If a host has a single NUMA node with the capability to run 8 CPUs
> and 4 GB of RAM it will not be considered a valid match."
>
> So, if the host doesn't have enough NUMA nodes, it's invalid.

This is the equivalent for a custom topology; there it's perfectly fine to
throw an error, and that's a different `die` statement from the one I want
to remove in our code, too.

In our configuration `numa` is just a boolean, not a count like in the
above example, so IMO if no topology is defined but numa enabled we
should just let qemu do its thing, which is the behavior we used to have
before hugepages.

So in order to restore the old behavior I'd like to apply the following
patch, note that the very same check still exists in the `numaX` entry
loop further up in the code.

From da9b76607c5dbb12477976117c6f91cbc127f992 Mon Sep 17 00:00:00 2001
From: Wolfgang Bumiller <w.bumil...@proxmox.com>
Date: Wed, 27 Jul 2016 09:05:57 +0200
Subject: [PATCH qemu-server] memory: don't restrict sockets to the number of
 host numa nodes

Removes an error for when there is no custom numa topology defined and
there are more virtual sockets defined than host numa nodes available.
---
 PVE/QemuServer/Memory.pm | 2 --
 1 file changed, 2 deletions(-)

diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index 047ddad..fec447a 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -263,8 +263,6 @@ sub config {
 	my $numa_memory = ($static_memory / $sockets);
 
 	for (my $i = 0; $i < $sockets; $i++) {
-	    die "host NUMA node$i doesn't exist\n" if ! -d "/sys/devices/system/node/node$i/";
-
 	    my $cpustart = ($cores * $i);
 	    my $cpuend = ($cpustart + $cores - 1) if $cores && $cores > 1;
 	    my $cpus = $cpustart;
-- 
2.1.4
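For illustration, a rough sketch of how the hugepages-guarded check suggested
at the top of this mail could look inside the socket loop of
PVE/QemuServer/Memory.pm, reusing the loop context shown in the patch above.
It assumes $conf is in scope at that point (as it is in the `numaX` loop);
this is a sketch of the suggestion, not a tested patch:

	my $numa_memory = ($static_memory / $sockets);

	for (my $i = 0; $i < $sockets; $i++) {
	    # Only insist on a matching host node when hugepages are configured,
	    # since hugepage allocation has to bind guest NUMA nodes to real host
	    # nodes; otherwise let qemu lay out the topology on its own.
	    die "host NUMA node$i doesn't exist\n"
		if ! -d "/sys/devices/system/node/node$i/" && $conf->{hugepages};

	    my $cpustart = ($cores * $i);
	    my $cpuend = ($cpustart + $cores - 1) if $cores && $cores > 1;
	    my $cpus = $cpustart;
	    # ... rest of the loop unchanged
	}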