Re: [Qemu-devel] [PATCH v2 1/1] target-ppc: Implement rtas_get_sysparm(PROCESSOR_MODULE_INFO)

David Gibson Tue, 10 Nov 2015 18:28:22 -0800

On Tue, Nov 10, 2015 at 04:56:38PM -0800, Nishanth Aravamudan wrote:
> On 11.11.2015 [11:17:58 +1100], David Gibson wrote:
> > On Mon, Nov 09, 2015 at 08:22:32PM -0800, Sukadev Bhattiprolu wrote:
> > > David Gibson [da...@gibson.dropbear.id.au] wrote:
> > > | On Wed, Nov 04, 2015 at 03:06:05PM -0800, Sukadev Bhattiprolu wrote:
> > > | > Implement RTAS_SYSPARM_PROCESSOR_MODULE_INFO parameter to rtas_get_sysparm()
> > > | > call in qemu. This call returns the processor module (socket), chip and core
> > > | > information as specified in section 7.3.16.18 of PAPR v2.7.
> > > | 
> > > | PAPR v2.7 isn't available publicly.  For upstream patches, please
> > > | reference LoPAPR instead (where it's section 7.3.16.17 AFAICT).
> > > 
> > > Ok.
> > > 
> > > | 
> > > | > We walk the /proc/device-tree to determine the number of chips, cores and
> > > | > modules in the _host_ system and return this info to the guest application
> > > | > that makes the rtas_get_sysparm() call.
> > > | > 
> > > | > We currently hard code the number of module_types to 1, but we should determine
> > > | > that dynamically somehow later.
> > > | > 
> > > | > Thanks to input from Nishanth Aravamudan and Alexey Kardashevskiy.
> > > | > 
> > > | > Signed-off-by: Sukadev Bhattiprolu <suka...@linux.vnet.ibm.com>
> > > | 
> > > | This isn't ready to go yet - you need to put some more consideration
> > > | into the uncommon cases: PR KVM, TCG and non-Power hosts.
> > > 
> > > Ok. Is there a way we can make this code applicable only to a PowerPC host?
> > > (would moving this code to target-ppc/kvm.c do that?)
> > 
> > Yes, moving it to target-ppc/kvm.c would mostly do that.  You'd need
> > some logic to make sure it fails gracefully in other cases, of course.
> > 
> > [snip]
> > > | >      switch (parameter) {
> > > | > +    case RTAS_SYSPARM_PROCESSOR_MODULE_INFO: {
> > > | > +        int i;
> > > | > +        int offset = 0;
> > > | > +        int size;
> > > | > +        struct rtas_module_info modinfo;
> > > | > +
> > > | > +        if (rtas_get_module_info(&modinfo)) {
> > > | > +            break;
> > > | > +        }
> > > | 
> > > | So, you handle the variable size of this structure before sending it
> > > | to the guest, but you don't handle it in allocation of the structure
> > > | right here.  You'll get away with that because for now you only ever
> > > | have one entry in the sockets array, but it's a bit icky.
> > > 
> > > Can we assume that the size is static for now...
> > > | 
> > > | > +
> > > | > +        size = sizeof(modinfo);
> > > | > +        size += (modinfo.module_types - 1) * sizeof(struct rtas_socket_info);
> > > | 
> > > | More seriously, this calculation will break horribly if you change the
> > > | size of the array in struct rtas_module_info.
> > > 
> > > and just set 'size' to sizeof(modinfo)?.
> > 
> > For purposes of allocation you could just use a fixed size.  But the
> > guest might get confused by additional data beyond the declared size,
> > so you do need to get the value correct that you send back to the guest.
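
To make the sizing point concrete, here is a minimal sketch of one way to
keep the allocation size and the size reported to the guest in agreement:
declare the per-type entries as a C99 flexible array member and compute the
size from the entry count. The struct and function names below
(module_info_sketch, socket_info_sketch, module_info_size,
module_info_alloc) and the field layout are made up for illustration; they
are not the definitions from the patch or from LoPAPR.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-module-type entry; the real field layout may differ. */
struct socket_info_sketch {
    uint16_t module_type;
    uint16_t max_sockets;
    uint16_t chips_per_socket;
    uint16_t cores_per_chip;
};

/* Hypothetical header followed by a variable number of entries. */
struct module_info_sketch {
    uint16_t module_types;            /* number of valid entries in si[] */
    struct socket_info_sketch si[];   /* C99 flexible array member */
};

/*
 * Bytes occupied by the header plus n_types entries. Because the size is
 * derived from offsetof() and the entry size, it does not depend on any
 * fixed array length baked into the struct declaration.
 */
static size_t module_info_size(uint16_t n_types)
{
    return offsetof(struct module_info_sketch, si)
           + (size_t)n_types * sizeof(struct socket_info_sketch);
}

/* Allocate a zeroed structure with room for exactly n_types entries. */
static struct module_info_sketch *module_info_alloc(uint16_t n_types)
{
    struct module_info_sketch *mi = calloc(1, module_info_size(n_types));

    if (mi) {
        mi->module_types = n_types;
    }
    return mi;
}
```

With this shape the same module_info_size() value can be used both for the
allocation and for the length sent back to the guest, so the two cannot
drift apart if the number of module types changes.
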
> > 
> > [snip]
> > > | > +/*
> > > | > + * Each module's (aka socket's) id is contained in the 'ibm,hw-module-id'
> > > | > + * file in the "xscom" directory (/proc/device-tree/xscom*). Similarly each
> > > | > + * chip's id is contained in the 'ibm,chip-id' file in the xscom directory.
> > > | > + *
> > > | > + * A module can contain more than one chip and a chip can contain more
> > > | > + * than one core. So there are likely to be duplicates in the module
> > > | > + * and chip identifiers (i.e. more than one xscom directory can contain
> > > | > + * the same module/chip id).
> > > | > + *
> > > | > + * Search the xscom directories and count the number of _UNIQUE_ module
> > > | > + * and chip identifiers in the system.
> > > | 
> > > | There's no direct way to go from a core
> > > | (i.e. /proc/device-tree/cpus/cpu@NNN) to the corresponding chip and/or
> > > | module?
> > > 
> > > Yes, it would be logical to find the chip and module from the core :-)
> > > 
> > > While 'ibm,chip-id' is in the core dir (/proc/device-tree/cpus/PowerPC,*/),
> > > the 'ibm,hw-module-id' is not there (on my Tuleta system). Maybe the
> > > 'ibm,hw-module-id' will be added in the future?
> > 
> > Hm, I see.  Is there any device node that represents the "chip"?
> > 
> > > I am using the xscom node to be consistent in counting chips and modules.
> > 
> > The trouble with xscom is that it's extremely specific to the way the
> > current IBM servers present things.  It won't work on other types of
> > host machine (which could happen with PR KVM), and could even break if
> > IBM changes the way it organizes the SCOMs in a future machine.
> > 
> > Working from the nodes in /cpus still has some dependencies on IBM
> > specific properties, but it's at least partially based on OF
> > standards.
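
As a rough illustration of the /cpus approach, the sketch below assumes the
caller has already read one 'ibm,chip-id' value per cpu node into an array
and only needs to count the distinct chips. Device tree property cells are
32-bit big-endian, so a small decode helper is included. The function names
(dt_cell_to_cpu, count_unique_ids) are hypothetical, and the file walking
itself is left out on purpose.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode one 32-bit big-endian device tree cell into host byte order. */
static uint32_t dt_cell_to_cpu(const unsigned char cell[4])
{
    return ((uint32_t)cell[0] << 24) | ((uint32_t)cell[1] << 16) |
           ((uint32_t)cell[2] << 8)  |  (uint32_t)cell[3];
}

/*
 * Count the distinct values in ids[0..n). An id is counted only the first
 * time it appears. This is quadratic, which is acceptable for the small
 * number of cores found in a single host.
 */
static size_t count_unique_ids(const uint32_t *ids, size_t n)
{
    size_t unique = 0;

    for (size_t i = 0; i < n; i++) {
        size_t j;

        for (j = 0; j < i; j++) {
            if (ids[j] == ids[i]) {
                break;          /* already seen earlier in the array */
            }
        }
        if (j == i) {
            unique++;           /* no earlier duplicate found */
        }
    }
    return unique;
}
```

The same counting helper would apply to module ids if a per-core module
property ever becomes available.
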
> > 
> > There's also another possible approach here, though I don't know if it
> > will work.  Instead of looking directly in the device tree, try to get
> > the information from lscpu, or libosinfo.  That would at least give
> > you some hope of providing meaningful information on other host types.
> 
> Heh, the issue that is underlying all of this, is that `lscpu` itself is
> quite wrong.
> 
> On PAPR-compliant hypervisors (well, PowerVM, at least), the only
> supported means of determining the underlying hardware CPU information
> (which is what licensing models want in the end), is to use this RTAS
> call in an LPAR. `lscpu` is explicitly incorrect in these environments
> (its values are "derived" from sysfs and some are adjusted to ensure
> the division of values works out).


So.. I'm not sure if you're just saying that lscpu is wrong because it
gives the guest information, or because of other problems.

What I was suggesting is implementing the RTAS call so that it
effectively lets the guest get lscpu information from the host.

> So, we are trying to at least resolve what a PowerKVM guest can see by
> supporting this RTAS call there. We should report *something* to the
> guest, if possible, and we can adjust what is reported to the guests as
> we go, from the host perspective.
> 
> I haven't followed along too closely in this thread, but would it be
> reasonable to only report this RTAS call as being supported under
> KVM?

Possibly, yes.

> How are other RTAS calls dealt with for PR and non-IBM models
> currently?

Most of them still make sense in PR or TCG.  A few do look in the host
device tree, in which case they're likely to fail on non-KVM.
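
For the "fail gracefully" case, something along these lines is what I have
in mind, as a sketch only: gate the host lookup on whether we are on a
suitable host, and return a "not supported" status rather than garbage when
the lookup is unavailable or fails. Every name here (sketch_status,
host_topology, topology_probe_fn, get_module_info_sketch, and the two
stand-in probes) is invented for the example, and the status values are
placeholders rather than the real RTAS return codes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Placeholder status values for this sketch, not the real RTAS codes. */
enum sketch_status {
    SKETCH_OK = 0,
    SKETCH_NOT_SUPPORTED = 1,
};

/* Hypothetical summary of the host topology. */
struct host_topology {
    unsigned modules;
    unsigned chips;
    unsigned cores;
};

/* A probe fills *out and returns true on success, false on any failure. */
typedef bool (*topology_probe_fn)(struct host_topology *out);

/* Stand-in probe for a host where the device tree lookup fails. */
static bool probe_unavailable(struct host_topology *out)
{
    (void)out;
    return false;
}

/* Stand-in probe returning fixed numbers, only for exercising the logic. */
static bool probe_fixed(struct host_topology *out)
{
    out->modules = 1;
    out->chips = 2;
    out->cores = 16;
    return true;
}

/*
 * Report topology only when the host is suitable and the probe succeeds;
 * otherwise tell the caller the parameter is not supported.
 */
static enum sketch_status get_module_info_sketch(bool host_is_suitable,
                                                 topology_probe_fn probe,
                                                 struct host_topology *out)
{
    if (!host_is_suitable || probe == NULL || out == NULL) {
        return SKETCH_NOT_SUPPORTED;
    }
    if (!probe(out)) {
        return SKETCH_NOT_SUPPORTED;
    }
    return SKETCH_OK;
}
```

The point of passing the probe in is that TCG, PR KVM and non-Power hosts
all take the same "not supported" path without any host-specific code
running.
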

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
