On Thu, 19 Mar 2015 17:05:13 +0100 Andreas Färber <afaer...@suse.de> wrote:
> On 18.03.2015 at 17:38, Igor Mammedov wrote:
> > since commit
> >    dd0247e0 pc: acpi: mark all possible CPUs as enabled in SRAT
> > Linux kernel actually tries to use CPU to Node mapping from
> > QEMU provided SRAT table instead of discarding it, and that
> > in some cases breaks build_sched_domains() which expects
> > sane mapping where cores/threads belonging to the same socket
> > are on the same NUMA node.
> >
> > With current default round-robin mapping of VCPUs to nodes
> > guest ends up with cores/threads belonging to the same socket
> > being on different NUMA nodes.
> >
> > For example with following CLI:
> >    qemu-kvm -m 4G -smp 5,sockets=2,cores=4,threads=1,maxcpus=8 \
> >             -numa node,nodeid=0 -numa node,nodeid=1
> > 2.6.32 based kernels will hang on boot due to incorrectly built
> > sched_group-s list in update_sd_lb_stats()
> > so comment in QEMU justifying dumb default mapping:
> > "
> >  guest OSes must cope with this anyway, because there are BIOSes
> >  out there in real machines which also use this scheme.
> > "
> > isn't really valid.
> >
> > Replacing default mapping with a manual one, where VCPUs belonging to
> > the same socket are on the same NUMA node, fixes issue for
> > guests which can't handle nonsense topology i.e. changing CLI to:
> >   -numa node,nodeid=0,cpus=0-3 -numa node,nodeid=1,cpus=4-7
> >
> > So instead of simply scattering VCPUs around nodes, map
> > the same socket VCPUs to the same NUMA node, which is what
> > guest would expect from a sane hardware/BIOS.
> >
> > Signed-off-by: Igor Mammedov <imamm...@redhat.com>
> > ---
> > v2:
> >   - add machine callback cpu_index_to_socket_id() and use it
> >     instead of stub approach
> > ---
> >  hw/i386/pc.c          |  9 +++++++++
> >  include/hw/boards.h   |  5 +++++
> >  include/sysemu/numa.h |  3 ++-
> >  numa.c                | 18 +++++++++++++-----
> >  vl.c                  |  2 +-
> >  5 files changed, 30 insertions(+), 7 deletions(-)
>
> Looks great to me now, the hook name with _socket_id is perfect,
>
> Reviewed-by: Andreas Färber <afaer...@suse.de>
>
> but can we do that in three steps please? "machine:" adding callback and
> default implementation, "numa:" switching to use it and "pc:" overriding
> the new callback - not only nicer subjects but easier to cherry-pick and
> bisect then.
sure

> Regards,
> Andreas
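
For readers following the thread, here is a rough sketch of the difference
between the round-robin default and a socket-aware default. It is not the
code from the patch: struct smp_topology, node_round_robin() and
node_by_socket() are hypothetical names, and the body of
cpu_index_to_socket_id() assumes that cpu_index enumerates threads socket by
socket, which may not hold for every machine type.

```c
/* Hypothetical topology holder for illustration; not a QEMU structure. */
struct smp_topology {
    unsigned cores;   /* cores per socket, as in -smp cores=    */
    unsigned threads; /* threads per core, as in -smp threads=  */
};

/*
 * Assumption: CPUs are numbered socket by socket, so integer division
 * by the number of threads per socket yields the socket number.
 */
static unsigned cpu_index_to_socket_id(const struct smp_topology *t,
                                       unsigned cpu_index)
{
    return cpu_index / (t->cores * t->threads);
}

/* Round-robin default: neighbouring CPUs alternate between nodes. */
static unsigned node_round_robin(unsigned cpu_index, unsigned nb_nodes)
{
    return cpu_index % nb_nodes;
}

/* Socket-aware default: every CPU of a socket lands on the same node. */
static unsigned node_by_socket(const struct smp_topology *t,
                               unsigned cpu_index, unsigned nb_nodes)
{
    return cpu_index_to_socket_id(t, cpu_index) % nb_nodes;
}
```

With cores=4, threads=1 and two nodes, node_by_socket() puts CPUs 0-3 on
node 0 and CPUs 4-7 on node 1, which matches the manual cpus=0-3 / cpus=4-7
assignment quoted above, while node_round_robin() puts CPUs 0 and 1 of the
same socket on different nodes.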