On Tue, 19 Jul 2016 09:58:59 +0530 Bharata B Rao <bhar...@linux.vnet.ibm.com> wrote:
> On Mon, Jul 18, 2016 at 06:20:35PM +0200, Igor Mammedov wrote:
> > On Mon, 18 Jul 2016 17:06:18 +0200
> > Peter Krempa <pkre...@redhat.com> wrote:
> >
> > > On Mon, Jul 18, 2016 at 19:19:18 +1000, David Gibson wrote:
> > > > I'm not entirely sure if this is a good idea, and if it is whether
> > > > this is a good approach to it. But I'd like to discuss it and see if
> > > > anyone has better ideas.
> > > >
> > > > As you may know we've hit a bunch of complications with cpu_index
> > > > which will impose some limitations with what we can do with the new
> > > > query-hotpluggable-cpus interface, and we've run out of time to
> > > > address these in qemu-2.7.
> > > >
> > > > At the same time we're hitting complications with the fact that the
> > > > new qemu interface requires a new libvirt interface to use properly,
> > > > and that has follow on effects further up the stack.
> > >
> > > The libvirt interface is basically now depending on adding a working
> > > implementation for qemu or a different hypervisor. APIs without
> > > implementation are not accepted upstream.
> > >
> > > It looks like there are the following problems which make the above
> > > hard:
> > >
> > > The first problem is the missing link between the NUMA topology
> > > (currently configured via 'cpu id' which is not linked in any way to the
> > > query-hotpluggable-cpus entries). This basically means that I'll have to
> > > re-implement the qemu numbering scheme and hope that it doesn't change
> > > until a better approach is added.
> > With the current 'in order' plug/unplug limitation, behavior is the same
> > as for cpu-add (wrt x86), so device_add could be used as a direct
> > replacement for cpu-add in the NUMA case.
> >
> > The NUMA node to CPU mapping in query-hotpluggable-cpus is a missing part,
> > but once numa mapping for hotplugged CPUs (which is broken now) is fixed
> > (fix https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg00595.html)
> > I'll be ready to extend x86.query-hotpluggable-cpus with the numa mapping
> > that -numa cpus=1,2,3... happened to configure.
> > (note: a device_add cpu,node=X that doesn't match whatever has been
> > configured with -numa cpus=... will raise an error, as numa configuration
> > is static and fixed at VM creation time, meaning that the "node" option
> > in query-hotpluggable-cpus is optional and only informs users to
> > which node a cpu belongs)
> >
> > > Secondly from my understanding of the current state it's impossible to
> > > select an arbitrary cpu to hotplug but they need to happen 'in order' of
> > > the cpu id pointed out above (which is not accessible). The grand plan
> > > is to allow adding the cpus in any order. This makes the feature look
> > > like a proof of concept rather than something useful.
> >
> > Having out-of-order plug/unplug would be nice but that wasn't
> > the grand plan. The main reason is to replace cpu-add with
> > 'device_add cpu' and on top of that provide support for
> > 'device_del cpu' instead of adding a cpu-del command.
> > And, as a result of the migration to device_add, to avoid changing -smp
> > to match the present cpus count on target and to reuse the same
> > interface as other devices.
> >
> > We can still pick 'out of order' device_add cpu using the migration_id
> > patch and revert the in-order limit patch. It would work for x86,
> > but I think there were issues with SPAPR, that's why I'm in favor of
> > the in-order limit approach.
>
> Not that the migration_id patch doesn't work for sPAPR, but it was felt
> that having too many IDs (cpu_dt_id, arch_id, migration_id) is not
> good/ideal/preferable and could cause confusion.
migration_id is an internal thing and doesn't concern libvirt at all, so
it will be a QEMU-only thing that we can deal with later, either by
eliminating cpu_index and leaving only migration_id, or by merging them
into one id after the cpu_index refactoring.

> I am not clear as to why limiting the out-of-order hotplug is a show
> stopper for libvirt actually. Isn't that how it is for cpu-add currently ?

It's not a show stopper, but as Eric pointed out there is a caveat.
If we ship a limited device_add then we would need to extend the external
interface to report that out-of-order creation is available.
Looking at it from that point of view, it's better to go the migration_id
route, keeping the external API simple, if spapr is able to handle
out-of-order cpu creation and migration.

>
> Regards,
> Bharata.
>
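
For illustration only, here is a toy Python model of the two constraints
discussed in this thread: the 'in order' plug/unplug limit and the static
NUMA node check. It is a sketch under stated assumptions, not QEMU code.
The class name HotplugModel, its methods and the slot/node_of names are all
hypothetical, and the "unplug in reverse order" rule is my reading of what
the in-order limit implies for device_del.

```python
class HotplugModel:
    """Toy model of in-order CPU hotplug; hypothetical, not QEMU code."""

    def __init__(self, max_cpus, boot_cpus, node_of):
        self.max_cpus = max_cpus
        # Slots 0..present-1 are plugged; the rest are free.
        self.present = boot_cpus
        # Static slot -> NUMA node map, fixed at VM creation time
        # (stands in for whatever -numa cpus=... configured).
        self.node_of = dict(node_of)

    def device_add(self, slot, node=None):
        if slot >= self.max_cpus:
            raise ValueError("slot %d out of range" % slot)
        # In-order limit: only the next free slot may be plugged.
        if slot != self.present:
            raise ValueError("out-of-order plug: expected slot %d"
                             % self.present)
        # "node" is optional; if given, it must match the static map.
        if node is not None and self.node_of.get(slot) != node:
            raise ValueError("node %r does not match static NUMA config"
                             % node)
        self.present += 1

    def device_del(self, slot):
        # Assumption: unplug must happen in the reverse order of plug.
        if self.present == 0 or slot != self.present - 1:
            raise ValueError("out-of-order unplug")
        self.present -= 1


# Usage: 4 possible CPUs, 1 present at boot, two NUMA nodes.
vm = HotplugModel(max_cpus=4, boot_cpus=1,
                  node_of={0: 0, 1: 0, 2: 1, 3: 1})
vm.device_add(1, node=0)   # next free slot, matching node: accepted
```

In this model a management layer never needs to ask whether out-of-order
creation is supported, because the only valid slot is always the next one.
Lifting the limit later is what would require the extra capability report
in the external interface mentioned above.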