On Tue, Nov 10, 2015 at 04:56:38PM -0800, Nishanth Aravamudan wrote:
> On 11.11.2015 [11:17:58 +1100], David Gibson wrote:
> > On Mon, Nov 09, 2015 at 08:22:32PM -0800, Sukadev Bhattiprolu wrote:
> > > David Gibson [da...@gibson.dropbear.id.au] wrote:
> > > | On Wed, Nov 04, 2015 at 03:06:05PM -0800, Sukadev Bhattiprolu wrote:
> > > | > Implement RTAS_SYSPARM_PROCESSOR_MODULE_INFO parameter to rtas_get_sysparm()
> > > | > call in qemu. This call returns the processor module (socket), chip and core
> > > | > information as specified in section 7.3.16.18 of PAPR v2.7.
> > > |
> > > | PAPR v2.7 isn't available publicly.  For upstream patches, please
> > > | reference LoPAPR instead (where it's section 7.3.16.17 AFAICT).
> > >
> > > Ok.
> > >
> > > |
> > > | > We walk the /proc/device-tree to determine the number of chips, cores and
> > > | > modules in the _host_ system and return this info to the guest application
> > > | > that makes the rtas_get_sysparm() call.
> > > | >
> > > | > We currently hard code the number of module_types to 1, but we should determine
> > > | > that dynamically somehow later.
> > > | >
> > > | > Thanks to input from Nishanth Aravamudan and Alexey Kardashevsk.
> > > | >
> > > | > Signed-off-by: Sukadev Bhattiprolu <suka...@linux.vnet.ibm.com>
> > > |
> > > | This isn't ready to go yet - you need to put some more consideration
> > > | into the uncommon cases: PR KVM, TCG and non-Power hosts.
> > >
> > > Ok. Is there a way we can make this code applicable only to a PowerPC host?
> > > (would moving this code to target-ppc/kvm.c do that?)
> >
> > Yes, moving it to target-ppc/kvm.c would mostly do that.  You'd need
> > some logic to make sure it fails gracefully in other cases, of course.
> >
> > [snip]
> > > | >      switch (parameter) {
> > > | > +    case RTAS_SYSPARM_PROCESSOR_MODULE_INFO: {
> > > | > +        int i;
> > > | > +        int offset = 0;
> > > | > +        int size;
> > > | > +        struct rtas_module_info modinfo;
> > > | > +
> > > | > +        if (rtas_get_module_info(&modinfo)) {
> > > | > +            break;
> > > | > +        }
> > > |
> > > | So, you handle the variable size of this structure before sending it
> > > | to the guest, but you don't handle it in allocation of the structure
> > > | right here.  You'll get away with that because for now you only ever
> > > | have one entry in the sockets array, but it's a bit icky.
> > >
> > > Can we assume that the size is static for now...
> > > |
> > > | > +
> > > | > +        size = sizeof(modinfo);
> > > | > +        size += (modinfo.module_types - 1) * sizeof(struct rtas_socket_info);
> > > |
> > > | More seriously, this calculation will break horribly if you change the
> > > | size of the array in struct rtas_module_info.
> > >
> > > and just set 'size' to sizeof(modinfo)?.
> >
> > For purposes of allocation you could just use a fixed size.  But the
> > guest might get confused by additional data beyond the declared size,
> > so you do need to get the value correct that you send back to the guest.
> >
> > [snip]
> > > | > +/*
> > > | > + * Each module's (aka socket's) id is contained in the 'ibm,hw-module-id'
> > > | > + * file in the "xscom" directory (/proc/device-tree/xscom*). Similarly each
> > > | > + * chip's id is contained in the 'ibm,chip-id' file in the xscom directory.
> > > | > + *
> > > | > + * A module can contain more than one chip and a chip can contain more
> > > | > + * than one core. So there are likely to be duplicates in the module
> > > | > + * and chip identifiers (i.e more than one xscom directory can contain
> > > | > + * the same module/chip id).
> > > | > + *
> > > | > + * Search the xscom directories and count the number of _UNIQUE_ module
> > > | > + * and chip identifiers in the system.
> > > |
> > > | There's no direct way to go from a core
> > > | (i.e. /proc/device-tree/cpus/cpu@NNN) to the corresponding chip and/or
> > > | module?
> > >
> > > Yes, it would be logical to find the chip and module from the core :-)
> > >
> > > While 'ibm,chip-id' is in the core dir (/proc/device-tree/cpus/PowerPC,*/),
> > > the 'ibm,hw-module-id' is not there (on my Tuleta system). Maybe the
> > > 'ibm,hw-module-id' will be added in the future?
> >
> > Hm, I see.  Is there any device node that represents the "chip"?
> >
> > > I am using the xscom node to be consistent in counting chips and modules.
> >
> > The trouble with xscom is that it's extremely specific to the way the
> > current IBM servers present things.  It won't work on other types of
> > host machine (which could happen with PR KVM), and could even break if
> > IBM changes the way it organizes the SCOMs in a future machine.
> >
> > Working from the nodes in /cpus still has some dependencies on IBM
> > specific properties, but it's at least partially based on OF
> > standards.
> >
> > There's also another possible approach here, though I don't know if it
> > will work.  Instead of looking directly in the device tree, try to get
> > the information from lscpu, or libosinfo.  That would at least give
> > you some hope of providing meaningful information on other host types.
>
> Heh, the issue that is underlying all of this, is that `lscpu` itself is
> quite wrong.
>
> On PAPR-compliant hypervisors (well, PowerVM, at least), the only
> supported means of determining the underlying hardware CPU information
> (which is what licensing models want in the end), is to use this RTAS
> call in an LPAR. `lscpu` is explicitly incorrect in these environments
> (its values are "derived" from sysfs and some are adjusted to ensure
> the division of values works out).

So.. I'm not sure if you're just saying that lscpu is wrong because it
gives the guest information, or because of other problems.

What I was suggesting is implementing the RTAS call so that it
effectively lets the guest get lscpu information from the host.

> So, we are trying to at least resolve what a PowerKVM guest can see by
> supporting this RTAS call there. We should report *something* to the
> guest, if possible, and we can adjust what is reported to the guests as
> we go, from the host perspective.
>
> I haven't followed along too closely in this thread, but would it be
> reasonable to only report this RTAS call as being supported under
> KVM?

Possibly, yes.

> How are other RTAS calls dealt with for PR and non-IBM models
> currently?

Most of them still make sense in PR or TCG.  A few do look in the host
device tree, in which case they're likely to fail on non-KVM.

--
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
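
A note on the size calculation raised in the review comments: one way to
avoid both the fixed-size allocation and the "module_types - 1"
arithmetic is to declare the per-type entries as a C99 flexible array
member and derive the byte count from offsetof().  The sketch below is
only an illustration under assumptions: the struct names, field names
and field widths are hypothetical, and are not the rtas_module_info /
rtas_socket_info definitions from the patch, which are not shown in
this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-module-type entry; field names and widths are assumed. */
struct socket_info_sketch {
    uint16_t module_type;
    uint16_t max_sockets;
    uint16_t max_chips;
    uint16_t max_cores_per_chip;
};

/* Hypothetical header followed by a variable number of entries. */
struct module_info_sketch {
    uint16_t module_types;            /* number of valid entries in si[] */
    struct socket_info_sketch si[];   /* C99 flexible array member */
};

/*
 * Byte count for a structure holding n_types entries.  The same value
 * can be used for the allocation and for the length reported to the
 * guest, and it does not depend on any declared array size.
 */
size_t module_info_size(uint16_t n_types)
{
    return offsetof(struct module_info_sketch, si)
           + (size_t)n_types * sizeof(struct socket_info_sketch);
}

/* Allocate a zeroed structure with room for n_types entries. */
struct module_info_sketch *module_info_alloc(uint16_t n_types)
{
    struct module_info_sketch *mi;

    assert(n_types > 0);              /* at least one module type expected */
    mi = calloc(1, module_info_size(n_types));
    if (mi) {
        mi->module_types = n_types;
    }
    return mi;
}
```

With a layout like this, a caller would allocate with
module_info_alloc(n), fill si[0..n-1], and pass module_info_size(n) as
the length sent back to the guest, so the allocated size and the
declared size cannot drift apart.  Byte-order conversion of the fields
for the guest is deliberately left out of the sketch.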