On Thu, 27 Apr 2017 13:32:25 -0300
Eduardo Habkost <ehabk...@redhat.com> wrote:
> On Thu, Apr 27, 2017 at 03:14:06PM +0200, Igor Mammedov wrote:
> > On Wed, 26 Apr 2017 09:21:38 -0300
> > Eduardo Habkost <ehabk...@redhat.com> wrote:
> >
> > adding Peter to CC list
> >
> > [...]
> >
> > > On Wed, Apr 19, 2017 at 01:14:58PM +0200, Igor Mammedov wrote:
> > > > On Wed, 12 Apr 2017 18:02:39 -0300
> > > > Eduardo Habkost <ehabk...@redhat.com> wrote:
> > > > > On Wed, Mar 22, 2017 at 02:32:32PM +0100, Igor Mammedov wrote:
> > > > > > it will allow switching from cpu_index to property based
> > > > > > numa mapping in follow up patches.
> > > > >
> > > > > I am not sure I understand all the consequences of this, so I
> > > > > will give it a try:
> > > > >
> > > > > "node-id" is an existing field in CpuInstanceProperties.
> > > > > CpuInstanceProperties is used on both query-hotpluggable-cpus
> > > > > output and in MachineState::possible_cpus.
> > > > >
> > > > > We will start using MachineState::possible_cpus to keep track of
> > > > > NUMA CPU affinity, and that means query-hotpluggable-cpus will
> > > > > start reporting a "node-id" property when a NUMA mapping is
> > > > > configured.
> > > > >
> > > > > To allow query-hotpluggable-cpus to report "node-id", the CPU
> > > > > objects must have a "node-id" property that can be set. This
> > > > > patch adds the "node-id" property to X86CPU.
> > > > >
> > > > > Is this description accurate? Is the presence of "node-id" in
> > > > > query-hotpluggable-cpus the only reason we really need this
> > > > > patch, or is there something else that requires the "node-id"
> > > > > property?
> > > > That's an accurate description; node-id is in the same 'address'
> > > > properties category as socket/core/thread-id. So if you have a
> > > > numa-enabled machine you'd see the node-id property in
> > > > query-hotpluggable-cpus.
> > >
> > > I agree that we can make -numa cpu affect query-hotpluggable-cpus
> > > output (i.e. affect some field on HotpluggableCPU.props).
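To make the point above concrete — on a NUMA-enabled machine, node-id would appear in query-hotpluggable-cpus 'props' next to the other 'address' properties, and a client that replays 'props' verbatim into device_add carries node-id along for free — here is a minimal plain-Python sketch. The entry contents (CPU type, id values) are invented for illustration and are not actual QEMU output:

```python
# Invented example of what one query-hotpluggable-cpus entry might look
# like on a NUMA-enabled machine (illustrative values only):
hotpluggable_cpu = {
    "type": "qemu64-x86_64-cpu",
    "vcpus-count": 1,
    "props": {"node-id": 3, "socket-id": 2, "core-id": 1, "thread-id": 0},
}

def device_add_args(entry, dev_id):
    """Compose device_add arguments by replaying 'props' opaquely,
    without interpreting individual keys; any new property QEMU adds
    (such as node-id) is forwarded automatically."""
    args = {"driver": entry["type"], "id": dev_id}
    args.update(entry["props"])  # node-id travels along with the rest
    return args

args = device_add_args(hotpluggable_cpu, "cpu-3")
print(args)
```

By contrast, a client that cherry-picks only socket/core/thread from the topology info (as the thread notes old libvirt does) would silently drop node-id from the composed device_add.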
> > >
> > > But it looks like we disagree about the purpose of
> > > HotpluggableCPU.props:
> > >
> > > I believe HotpluggableCPU.props is just an opaque identifier for
> > > the location we want to plug the CPU, and the only requirement is
> > > that it should be unique and have all the information device_add
> > > needs. As socket IDs are already unique on our existing machines,
> > > and socket<=>node mapping is already configured using -numa cpu,
> > > node-id doesn't need to be in HotpluggableCPU.props. (see example
> > > below)
> > node-id is also a location property which logically complements
> > the socket/core/thread properties. Also, a socket is not necessarily
> > a unique id that maps 1:1 to node-id from a generic pov.
> > BTW, -numa cpu[s] is not the only way to specify the mapping;
> > it could be specified like we do with pc-dimm:
> >   device_add pc-dimm,node=x
> >
> > Looking at it more generically, there could be the same
> > socket-ids for different nodes; then we would have to add
> > node-id to props anyway and end up with 2 node-ids, one in props
> > and another in the parent struct.
>
> This is where my expectations are different: I think
> HotpluggableCPU.props is just an identifier property for CPU
> slots that is used for device_add (and will be used for -numa
> cpu), and isn't supposed to be interpreted by clients.
>
> The problem I see is that the property has two completely
> different purposes: identifying a given CPU slot for device_add
> (and -numa cpu), and introspection of topology information about
> the CPU slot. Today we are lucky and those goals don't conflict
> with each other, but I worry this might cause trouble in the
> future.
>
> > > I don't think clients should assume topology information in
> > > HotpluggableCPU.props is always present, because the field has a
> > > different purpose: letting clients know what are the required
> > > device_add arguments.
> > > If we need introspection of CPU topology,
> > > we can add new fields to HotpluggableCPU, outside 'props'.
> > looking at the existing clients (libvirt), it doesn't treat 'props'
> > as an opaque set, but parses it into topology information (my guess
> > is that's because it's the sole source of such info from QEMU).
> > Actually, we never forbade this; the only requirement for
> > props was that mgmt should provide those properties to create
> > a cpu. Property names were designed in topology/location
> > friendly terms so that clients could make sense of them.
> >
> > So I wouldn't try now to reduce the meaning of 'props' to
> > opaque as you suggest.
>
> I see. This means my expectation is not met even today. I am not
> thrilled about it, but that's OK.
>
> > [..]
> > > My question is: if we use:
> > >
> > >   -numa cpu,socket=2,core=1,thread=0,node-id=3
> > >
> > > and then:
> > >
> > >   device_add ...,socket=2,core=1,thread=0
> > >   (omitting node-id on device_add)
> > >
> > > won't it work exactly the same, and place the new CPU on NUMA
> > > node 3?
> > yep, it's allowed for compat reasons:
> > 1: to allow DST to start with an old CLI variant that didn't have
> >    node-id (migration)
> > 2: to let old libvirt hotplug CPUs; it doesn't treat 'props'
> >    as an opaque set that is just replayed to device_add,
> >    instead it composes the command from topo info it got
> >    from QEMU, and unfortunately node-id is only read but is
> >    not emitted when device_add is composed
> >    (I consider this a bug but it's out in the wild so we have to
> >    deal with it)
> >
> > we can't enforce presence in these cases, or at least have to
> > keep it relaxed for old machine types.
>
> I see.
>
> > > In this case, we don't need a node-id argument on device_add,
> > > so node-id doesn't need to be in HotpluggableCPU.props.
> > I'd say we currently don't have to (for the above reasons), but
> > it doesn't hurt and actually allows using the pc-dimm way of
> > mapping CPUs to nodes, as David noted.
> > i.e.:
> >   -device cpu-foo,node-id=x,...
> > without any of the -numa cpu[s] options on the CLI.
> > It's currently explicitly disabled, but it should work if one
> > doesn't care about hotplug, or if the target doesn't care about
> > the mapping at startup (sPAPR); it also might work for x86 as
> > well using the _PXM method in ACPI.
> > (But that's out of scope of this series and needs more
> > testing, as some guest OSes might expect a populated SRAT
> > to work correctly.)
>
> Yep. I understand that setting node-id is useful, I just didn't
> expect it to be mandatory and included on HotpluggableCPU.props.
>
> > [...]
> > > >
> > > > In future, to avoid starting QEMU twice, we were thinking about
> > > > configuring QEMU from QMP at runtime; that's where the preconfigure
> > > > approach could be used to help solve it in the future:
> > > >
> > > > 1. introduce a pause-before-machine_init CLI option to allow
> > > >    preconfiguring the machine from qmp/monitor
> > > > 2. make query-hotpluggable-cpus usable at preconfig time
> > > > 3. start qemu with the needed number of numa nodes and default
> > > >    mapping:
> > > >    #qemu -smp ... -numa node,nodeid=0 -numa node,nodeid=1
> > > > 4. get the possible cpus list
> > >
> > > This is where things can get tricky: if we have the default
> > > mapping set, step 4 would return "node-id" already set on all
> > > possible CPUs.
> > that would depend on the impl:
> > - display node-id with default preset values to override
> > - do not set defaults and force the user to do the mapping
>
> Right. We could choose to initialize default values much later,
> and leave it uninitialized.
>
> > > > 5. add a qmp/monitor command variant of '-numa cpu' to set the
> > > >    numa mapping
> > >
> > > This is where I think we would make things simpler: if node-id
> > > isn't present on 'props', we can simply document the arguments
> > > that identify the CPU for the numa-cpu command as "just use the
> > > properties you get on query-hotpluggable-cpus.props".
Clients > > > would be able to treat CpuInstanceProperties as an opaque CPU > > > slot identifier. > > > > > > i.e. I think this would be a better way to define and document > > > the interface: > > > > > > ## > > > # @NumaCpuOptions: > > > # > > > # Mapping of a given CPU (or a set of CPUs) to a NUMA node. > > > # > > > # @cpu: Properties identifying the CPU(s). Use the 'props' field of > > > # query-hotpluggable-cpus for possible values for this > > > # field. > > > # TODO: describe what happens when 'cpu' matches > > > # multiple slots. > > > # @node-id: NUMA node where the CPUs are going to be located. > > > ## > > > { 'struct': 'NumaCpuOptions', > > > 'data': { > > > 'cpu': 'CpuInstanceProperties', > > > 'node-id': 'int' } } > > > > > > This separates "what identifies the CPU slot(s) we are > > > configuring" from "what identifies the node ID we are binding > > > to". > > Doesn't look any simpler to me, I'd better document node-id usage > > in props, like > > > > Well, it still looks simpler and more intuitive to me, but just > because it matches my initial expectations about the semantics of > query-hotpluggable-cpus and CpuInstanceProperties. If your > alternative is very clearly documented (like below), it is not a > problem to me. > > > # > > # A discriminated record of NUMA options. (for OptsVisitor) > > # > > +# For 'cpu' type as arguments use a set of cpu properties returned > > +# by query-hotpluggable-cpus[].props, where node-id could be used > > +# to override default node mapping. Since: 2.10 > > +# > > # Since: 2.1 > > ## > > { 'union': 'NumaOptions', > > 'base': { 'type': 'NumaOptionsType' }, > > 'discriminator': 'type', > > 'data': { > > - 'node': 'NumaNodeOptions' }} > > + 'node': 'NumaNodeOptions', > > + 'cpu' : 'CpuInstanceProperties' }} > > I worry about not being able to add extra options to "-numa cpu" > in the future without affecting HotpluggableCPU.props too. 
> Being
> able to document the semantics of -numa cpu inside a dedicated
> NumaCpuOptions struct would be nice too.
>
> I believe this can be addressed by defining "NumaCpuOptions" with
> "CpuInstanceProperties" as base:
>
>   { 'union': 'NumaOptions',
>     'base': { 'type': 'NumaOptionsType' },
>     'discriminator': 'type',
>     'data': {
>       'node': 'NumaNodeOptions',
>       'cpu' : 'NumaCpuOptions' }}
>
>   ##
>   # Options for -numa cpu,...
>   #
>   # "-numa cpu" accepts the same set of cpu properties returned by
>   # query-hotpluggable-cpus[].props, where node-id could be used to
>   # override default node mapping.
>   #
>   # Since: 2.10
>   ##
>   { 'struct': 'NumaCpuOptions',
>     'base': 'CpuInstanceProperties' }

Is it inheritance or encapsulation? If it's encapsulation it wouldn't
look nice, but we can duplicate the fields from CpuInstanceProperties
in NumaCpuOptions like you proposed below and marshal them into
CpuInstanceProperties inside of parse_numa() where needed.

> > ##
> > # @NumaNodeOptions:
>
> > > In case we have trouble making this struct work with QemuOpts, we
> > > could do this (temporarily?):
> > >
> > > ##
> > > # @NumaCpuOptions:
> > > #
> > > # Mapping of a given CPU (or a set of CPUs) to a NUMA node.
> > > #
> > > # @cpu: Properties identifying the CPU(s). Use the 'props' field of
> > > #       query-hotpluggable-cpus for possible values for this
> > > #       field.
> > > #       TODO: describe what happens when 'cpu' matches
> > > #       multiple slots.
> > > # @node-id: NUMA node where the CPUs are going to be located.
> > > #
> > > # @socket-id: Shortcut for cpu.socket-id, to make this struct
> > > #             friendly to QemuOpts.
> > > # @core-id: Shortcut for cpu.core-id, to make this struct
> > > #           friendly to QemuOpts.
> > > # @thread-id: Shortcut for cpu.thread-id, to make this struct
> > > #             friendly to QemuOpts.
> > > ##
> > > { 'struct': 'NumaCpuOptions',
> > >   'data': {
> > >     '*cpu': 'CpuInstanceProperties',
> > >     '*socket-id': 'int',
> > >     '*core-id': 'int',
> > >     '*thread-id': 'int',
> > >     'node-id': 'int' } }
> > >
> > > > 6. optionally, set a new numa mapping and get the updated
> > > >    possible cpus list with query-hotpluggable-cpus
> > > > 7. optionally, add extra cpus with device_add using the updated
> > > >    cpus list, and get the updated cpus list as it's been changed
> > > >    again
> > > > 8. unpause the preconfig stage and let qemu continue to execute
> > > >    machine_init and the rest
> > > >
> > > > Since we would need to implement QMP configuration for '-device cpu',
> > > > we might as well reuse it for custom numa mapping.
> > > >
> > > > [...]
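The QemuOpts-friendly variant quoted above flattens socket-id/core-id/thread-id next to node-id; the marshalling Igor mentions (folding the shortcut keys back into a nested CpuInstanceProperties-like 'cpu' member before use) can be sketched in plain Python. This is a hypothetical model of the idea, not QEMU's actual parse_numa() code:

```python
# The shortcut keys that mirror fields of CpuInstanceProperties
# in the flattened NumaCpuOptions variant discussed above.
CPU_PROP_SHORTCUTS = ("socket-id", "core-id", "thread-id")

def marshal_numa_cpu_options(opts):
    """Split parsed '-numa cpu' options into (cpu_props, node_id).

    The explicit nested 'cpu' member (if present) takes precedence;
    flattened shortcut keys are folded in only where missing.
    """
    node_id = opts["node-id"]             # the target NUMA node
    cpu = dict(opts.get("cpu", {}))       # start from the nested form
    for key in CPU_PROP_SHORTCUTS:        # fold in flattened shortcuts
        if key in opts:
            cpu.setdefault(key, opts[key])
    return cpu, node_id

# e.g. -numa cpu,socket-id=2,core-id=1,thread-id=0,node-id=3
cpu, node = marshal_numa_cpu_options(
    {"socket-id": 2, "core-id": 1, "thread-id": 0, "node-id": 3})
print(cpu, node)
```

The resulting `cpu` dict plays the role of the CpuInstanceProperties that would then be matched against MachineState::possible_cpus entries.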