On Tue, 14 Jan 2025 at 15:15, Peter Xu <pet...@redhat.com> wrote:
>
> On Tue, Jan 14, 2025 at 02:28:46PM -0500, Stefan Hajnoczi wrote:
> > On Tue, 14 Jan 2025 at 09:15, Fabiano Rosas <faro...@suse.de> wrote:
> > >
> > > Stefan Hajnoczi <stefa...@gmail.com> writes:
> > >
> > > > On Mon, 13 Jan 2025 at 16:09, Fabiano Rosas <faro...@suse.de> wrote:
> > > >>
> > > >> Bug #2594 is about a failure during migration after a cpu hotplug. Add
> > > >> a test that covers that scenario. Start the source with -smp 2 and
> > > >> the destination with -smp 3, plug one extra cpu to match and migrate.
> > > >>
> > > >> The issue seems to be a mismatch in the number of virtqueues between
> > > >> the source and destination due to the hotplug not changing the
> > > >> num_queues:
> > > >>
> > > >> get_pci_config_device: Bad config data: i=0x9a read: 4 device: 5
> > > >> cmask: ff wmask: 0 w1cmask:0
> > > >>
> > > >> Usage:
> > > >> $ QTEST_QEMU_IMG=./qemu-img QTEST_QEMU_BINARY=./qemu-system-x86_64 \
> > > >>   ./tests/qtest/migration-test -p /x86_64/migration/hotplug/cpu
> > > >>
> > > >> References: https://gitlab.com/qemu-project/qemu/-/issues/2594
> > > >> References: https://issues.redhat.com/browse/RHEL-68302
> > > >> Signed-off-by: Fabiano Rosas <faro...@suse.de>
> > > >> ---
> > > >> As you can see there's no fix attached to this. I haven't reached that
> > > >> part yet, suggestions welcome =). Posting the test case if anyone
> > > >> wants to play with this.
> > > >>
> > > >> (if someone at RH is already working on this, that's fine. I'm just
> > > >> trying to get some upstream bugs to move)
> > > >
> > > > The management tool should set num_queues on the destination to ensure
> > > > migration compatibility.
> > >
> > > I'm not sure that's feasible. The default num-queues seems like an
> > > implementation detail that the management application would not have a
> > > way to query. Unless it starts the source with a fixed number that
> > > already accounts for all hotplug/unplug operations during the VM
> > > lifetime, which would be wasteful in terms of resources allocated
> > > upfront.
> > >
> > > That would also make the destination run with a suboptimal (< #vcpus)
> > > number of queues, although that's already the case in the source after
> > > the hotplug. Do we have any definition of what should happen during
> > > hotplug? If one plugs 100 vcpus, should num-queues remain as 2?
> >
> > QEMU defaults num_queues to the number of present CPUs. A management
> > tool that wants to ensure that all hotplugged CPUs will have their own
> > virtqueues must set num_queues to max_cpus instead. This wastes
> > resources upfront, but in theory the guest can operate efficiently. I
> > haven't checked the Linux guest drivers to see if they actually handle
> > virtqueue allocation after hotplug. The Linux drivers vary in how they
> > allocate virtqueue interrupts, so be sure to check several device
> > types like virtio-net and virtio-blk as they may behave differently.
> >
> > Or the management tool can explicitly set num_queues to the number of
> > present CPUs and preserve that across live migration and CPU hotplug.
> > In that case num_queues can be updated across guest cold boots in order
> > to (eventually) achieve the optimal multi-queue configuration.
> >
> > Other approaches might be possible too. The management tool has a
> > choice of how to implement this and QEMU doesn't dictate a specific
> > approach.
>
> Thanks for the answer, Stefan.
> I've left a comment in each of the issue reports so that the reporter
> can verify this works properly.
>
> This also reminded me we could have specified a very large number of
> queues in many cases - I remember it used to be 1024 somehow (perhaps
> also the max vcpu number, but I'm not sure), which caused unwanted
> slowness on the migration loading side (aka, the downtime portion) due
> to the MMIO regions of each queue - each of the queues may need a
> global address space update on the guest physical address space. I
> didn't verify this issue, but if it can be reproduced and verified
> true, I wonder if the MMIO regions (or any relevant resources that
> would be enabled with num_queues even though some of them are not in
> use) could be plugged lazily, so that we can save quite some time on
> the loadvm side of migration.
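To make the num_queues pinning discussed above concrete: a management
tool could pass an explicit queue count on both sides of the migration
so the value no longer depends on -smp. A minimal sketch only, assuming
a virtio-blk-pci device with made-up drive/image names (the qtest above
does not necessarily use this exact device), and assuming the
present-CPU-based default Stefan describes:

# Source: 2 present vCPUs, 1 hotpluggable; queue count pinned to 2.
$ qemu-system-x86_64 -smp 2,maxcpus=3 \
    -drive if=none,id=drive0,file=disk.img \
    -device virtio-blk-pci,drive=drive0,num-queues=2 ...

# Destination: 3 present vCPUs. Without the explicit num-queues=2 it
# would pick a default based on the present CPUs, and the incoming
# stream would likely be rejected with the get_pci_config_device error
# shown in the test description.
$ qemu-system-x86_64 -smp 3,maxcpus=3 \
    -drive if=none,id=drive0,file=disk.img \
    -device virtio-blk-pci,drive=drive0,num-queues=2 ... -incoming defer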
Greg Kurz's commit 9cf4fd872d14 ("virtio: Clarify MR transaction
optimization") is about the scaling optimization where ioeventfd changes
are batched into a single transaction. This made a big difference. Maybe
something similar can be done in your case too?

Stefan
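The shape of that batching pattern is roughly the following. This is a
sketch only: memory_region_transaction_begin()/commit() are the real
QEMU calls, but set_up_queue_notifier() is a made-up placeholder for
whatever per-queue work touches the memory map (in QEMU that is the
ioeventfd/host-notifier assignment):

/* Batch per-queue updates into one memory region transaction so the
 * address space is rebuilt once instead of once per queue. */
static void assign_all_notifiers(VirtIODevice *vdev, int nvqs)
{
    int i;

    memory_region_transaction_begin();
    for (i = 0; i < nvqs; i++) {
        /* Placeholder: per-queue setup that would otherwise trigger an
         * address space update each time it runs. */
        set_up_queue_notifier(vdev, i);
    }
    /* All deferred address space updates are flushed here in one go. */
    memory_region_transaction_commit();
}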