On Mon, Jul 18, 2016 at 05:06:18PM +0200, Peter Krempa wrote:
> On Mon, Jul 18, 2016 at 19:19:18 +1000, David Gibson wrote:
> > I'm not entirely sure if this is a good idea, and if it is whether
> > this is a good approach to it.  But I'd like to discuss it and see if
> > anyone has better ideas.
> >
> > As you may know we've hit a bunch of complications with cpu_index
> > which will impose some limitations with what we can do with the new
> > query-hotpluggable-cpus interface, and we've run out of time to
> > address these in qemu-2.7.
> >
> > At the same time we're hitting complications with the fact that the
> > new qemu interface requires a new libvirt interface to use properly,
> > and that has follow on effects further up the stack.
>
> The libvirt interface is basically now depending on adding a working
> implementation for qemu or a different hypervisor. APIs without
> implementation are not accepted upstream.
>
> It looks like there are the following problems which make the above
> hard:
>
> The first problem is the missing link between the NUMA topology
> (currently configured via 'cpu id' which is not linked in any way to the
> query-hotpluggable-cpus entries). This basically means that I'll have to
> re-implement the qemu numbering scheme and hope that it doesn't change
> until a better approach is added.

I have at least a start on how to fix this in mind, and it's the next
thing I'll work on.  However, it obviously won't be merged for
qemu-2.7.

> Secondly from my understanding of the current state it's impossible to
> select an arbitrary cpu to hotplug but they need to happen 'in order' of
> the cpu id pointed out above (which is not accessible). The grand plan
> is to allow adding the cpus in any order. This makes the feature look
> like a proof of concept rather than something useful.

Alas, yes :(.  Again, I have a plan on this, but it's missed the 2.7
window.

> The two problems above make this feature hard to implement and hard to
> sell to libvirt's upstream.
>
> > Together this means a bunch more delays to having usable CPU hotplug
> > on Power for downstream users, which is unfortunate.
>
> I'm not in favor of adding upstream hacks for sake of downstream
> deadlines.

As a rule, I'm not either.  But if the hacks are small and isolated
enough, I think it can be reasonable.  Whether that's the case is what
I'm trying to assess here.

> > This is an attempt to get something limited working in a shorter time
> > frame, by implementing the old cpu-add interface in terms of the new
> > interface.  Obviously this can't fully exploit the new interface's
> > capabilities, but you can do basic in-order cpu hotplug without removal.
>
> As a side note, cpu-add technically allows out of order usage. Libvirt
> didn't use it that way though.

Yes, I know.  I gather it will break migration though.  With this patch
out-of-order cpu-add will fail because of the test enforcing in-order
device_add.

> > To make this work, I need to broaden the semantics of cpu-add: it will
> > add a single entry from query-hotpluggable-cpus, which means it may add
> > multiple vcpus, which the x86 implementation did not previously do.
>
> See my response to 2/2. If this requires adding -device for the
> hotplugged entries when migrating it basically doesn't help at all.

It doesn't.
But it does require a more complex calculation of how to increase -smp.

> > I'm not sure if the intended semantics of cpu-add were ever defined
> > well enough to say if this is "wrong" or not.
>
> For x86 I'll also need to experiment with the combined use of cpu-add
> and device_add interfaces. I plan to add an implementation which
> basically uses the old API in libvirt but calls the new APIs in qemu if
> they were used previously. (We still need to fall back to the old API
> for migration compatibility)

> > Because of this, I suspect libvirt will still need some work, but I'm
> > hoping it might be less than the full new API implementation.
>
> Mostly as adding a single entry via the interface will result in
> multiple entries in query-cpus. Also libvirt's interface takes the
> target number of vcpus as argument so any increment that is not
> divisible by the thread count needs to be rejected.

Yes.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson