On Mon, 18 Jul 2016 17:06:18 +0200 Peter Krempa <pkre...@redhat.com> wrote:
> On Mon, Jul 18, 2016 at 19:19:18 +1000, David Gibson wrote:
> > I'm not entirely sure if this is a good idea, and if it is whether
> > this is a good approach to it. But I'd like to discuss it and see if
> > anyone has better ideas.
> >
> > As you may know we've hit a bunch of complications with cpu_index
> > which will impose some limitations with what we can do with the new
> > query-hotpluggable-cpus interface, and we've run out of time to
> > address these in qemu-2.7.
> >
> > At the same time we're hitting complications with the fact that the
> > new qemu interface requires a new libvirt interface to use properly,
> > and that has follow on effects further up the stack.
>
> The libvirt interface is basically now depending on adding a working
> implementation for qemu or a different hypervisor. APIs without
> implementation are not accepted upstream.
>
> It looks like there are the following problems which make the above
> hard:
>
> The first problem is the missing link between the NUMA topology
> (currently configured via 'cpu id' which is not linked in any way to the
> query-hotpluggable-cpus entries). This basically means that I'll have to
> re-implement the qemu numbering scheme and hope that it doesn't change
> until a better approach is added.

With the current 'in order' plug/unplug limitation, behavior is the same
as for cpu-add (wrt x86), so device_add could be used as a direct
replacement for cpu-add in the NUMA case.

The NUMA node to CPU mapping in query-hotpluggable-cpus is a missing part,
but once NUMA mapping for hotplugged CPUs (which is broken now) is fixed
(fix: https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg00595.html)
I'll be ready to extend the x86 query-hotpluggable-cpus with the NUMA
mapping that -numa cpus=1,2,3... happened to configure.
(Note that a device_add cpu,node=X that doesn't match whatever has been
configured with -numa cpus=...
will raise an error, as the NUMA configuration is static and fixed at VM
creation time, meaning that the "node" option in query-hotpluggable-cpus
is optional and only informs users which node a CPU belongs to.)

> Secondly from my understanding of the current state it's impossible to
> select an arbitrary cpu to hotplug but they need to happen 'in order' of
> the cpu id pointed out above (which is not accessible). The grand plan
> is to allow adding the cpus in any order. This makes the feature look
> like a proof of concept rather than something useful.

Having out-of-order plug/unplug would be nice, but that wasn't the grand
plan. The main reason is to replace cpu-add with 'device_add cpu' and, on
top of that, to provide support for 'device_del cpu' instead of adding a
cpu-del command.
As a result of migrating to device_add, we also avoid changing -smp to
match the present CPU count on the target, and we reuse the same interface
as other devices.

We can still pick 'out of order' device_add cpu by using the migration_id
patch and reverting the in-order limit patch. It would work for x86,
but I think there were issues with SPAPR, which is why I'm in favor of
the in-order limit approach.

> The two problems above make this feature hard to implement and hard to
> sell to libvirt's upstream.
>
> > Together this means a bunch more delays to having usable CPU hotplug
> > on Power for downstream users, which is unfortunate.
>
> I'm not in favor of adding upstream hacks for sake of downstream
> deadlines.
>
> > This is an attempt to get something limited working in a shorter time
> > frame, by implementing the old cpu-add interface in terms of the new
> > interface. Obviously this can't fully exploit the new interface's
> > capabilities, but you can do basic in-order cpu hotplug without removal.
>
> As a side note, cpu-add technically allows out of order usage. Libvirt
> didn't use it that way though.

Out-of-order cpu-add breaks migration; that's why it hasn't been used.

> > To make this work, I need to broaden the semantics of cpu-add: it will
> > add a single entry from query-hotpluggable-cpus, which means it may add
> > multiple vcpus, which the x86 implementation did not previously do.
>
> See my response to 2/2. If this requires to add -device for the
> hotplugged entries when migrating it basically doesn't help at all.
>
> > I'm not sure if the intended semantics of cpu-add were ever defined
> > well enough to say if this is "wrong" or not.
>
> For x86 I'll also need to experiment with the combined use of cpu-add
> and device_add interfaces.

It should work, though I wouldn't recommend using them together, as
cpu-add will be obsoleted eventually.

> I plan to add an implementation which
> basically uses the old API in libvirt but calls the new APIs in qemu if
> they were used previously.
(skip)
> (We still need to fall back to the old API for migration compatibility)

Why?

> >
> > Because of this, I suspect libvirt will still need some work, but I'm
> > hoping it might be less than the full new API implementation.
>
> Mostly as adding a single entry via the interface will result in
> multiple entries in query-cpus. Also libvirt's interface takes the
> target number of vcpus as argument so any increment that is not
> divisible by the thread count needs to be rejected.
>
> Peter
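
To make Peter's last point concrete, here is a rough sketch of the check a
management layer would need when the caller passes a target vcpu count but
each hotplug operation adds a whole entry (e.g. a core with several
threads). The function name and parameters are hypothetical and are not
taken from libvirt or qemu; it only illustrates the divisibility rule and
the "no removal" limitation of cpu-add mentioned above.

```python
def entries_to_plug(current_vcpus: int, target_vcpus: int,
                    vcpus_per_entry: int) -> int:
    """Return how many hotpluggable entries must be added to go from
    current_vcpus to target_vcpus.

    Hypothetical helper: vcpus_per_entry stands for the number of vcpus
    one hotpluggable entry brings in (the thread count per entry).
    """
    if vcpus_per_entry <= 0:
        raise ValueError("vcpus_per_entry must be positive")

    increment = target_vcpus - current_vcpus

    # The cpu-add based approach discussed here only covers hotplug,
    # not removal, so a smaller target is rejected.
    if increment < 0:
        raise ValueError("target is below the current vcpu count")

    # An increment that is not a multiple of the per-entry vcpu count
    # cannot be satisfied by adding whole entries.
    if increment % vcpus_per_entry != 0:
        raise ValueError(
            f"increment {increment} is not divisible by {vcpus_per_entry}"
        )

    return increment // vcpus_per_entry
```

For example, with 8 threads per entry, going from 8 to 24 vcpus needs two
entries, while a target of 12 has to be rejected.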