On Wed, Nov 11, 2015 at 02:10:48PM -0800, Nishanth Aravamudan wrote:
> On 11.11.2015 [12:41:26 +1100], David Gibson wrote:
> > On Tue, Nov 10, 2015 at 04:56:38PM -0800, Nishanth Aravamudan wrote:
> > > On 11.11.2015 [11:17:58 +1100], David Gibson wrote:
> > > > On Mon, Nov 09, 2015 at 08:22:32PM -0800, Sukadev Bhattiprolu wrote:
>
> <snip>
>
> > > > The trouble with xscom is that it's extremely specific to the way the
> > > > current IBM servers present things.  It won't work on other types of
> > > > host machine (which could happen with PR KVM), and could even break if
> > > > IBM changes the way it organizes the SCOMs in a future machine.
> > > >
> > > > Working from the nodes in /cpus still has some dependencies on IBM
> > > > specific properties, but it's at least partially based on OF
> > > > standards.
> > > >
> > > > There's also another possible approach here, though I don't know if it
> > > > will work.  Instead of looking directly in the device tree, try to get
> > > > the information from lscpu, or libosinfo.  That would at least give
> > > > you some hope of providing meaningful information on other host types.
> > >
> > > Heh, the issue that is underlying all of this, is that `lscpu` itself is
> > > quite wrong.
> > >
> > > On PAPR-compliant hypervisors (well, PowerVM, at least), the only
> > > supported means of determining the underlying hardware CPU information
> > > (which is what licensing models want in the end), is to use this RTAS
> > > call in an LPAR. `lscpu` is explicitly incorrect in these environments
> > > (its values are "derived" from sysfs and some are adjusted to ensure
> > > the division of values works out).
> >
> > So.. I'm not sure if you're just saying that lscpu is wrong because it
> > gives the guest information, or because of other problems.
>
> `lscpu`'s man-page specifically says that on virtualized platforms, the
> output may be inaccurate. And, in fact, on Power, in a KVM guest (and
> in a LPAR), `lscpu` is outputting the guest CPU information, which is
> completely fake. This is true on x86 KVM guests too, afaict.

Um.. yes, I was assuming lscpu reporting information about virtual
cpus and sockets was intended and correct behaviour.

> *If* we have a valid RTAS implementation on PowerKVM (or under qemu
> generally), I think we can modify `lscpu` to do the right thing in at
> least those two environments.
>
> > What I was suggesting is implementing the RTAS call so that it
> > effectively lets the guest get lscpu information from the host.
>
> A bit of a chicken & egg problem, I'd say. The `lscpu` output in PowerNV
> is also wrong :)

Ok.. why is it wrong in PowerNV?  This sounds like something you'd
want to fix anyway.

> > > So, we are trying to at least resolve what PowerKVM guest can see by
> > > supporting this RTAS call there. We should report *something* to the
> > > guest, if possible, and we can adjust what is reported to the guests as
> > > we go, from the host perspective.
> > >
> > > I haven't followed along too closely in this thread, but would it be
> > > reasonable to only report this RTAS call as being supported under
> > > KVM?
> >
> > Possibly, yes.
>
> At least, as a first step, I guess.
>
> > > How are other RTAS calls dealt with for PR and non-IBM models
> > > currently?
> >
> > Most of them still make sense in PR or TCG.  A few do look in the host
> > device tree, in which case they're likely to fail on non-KVM.
>
> Got it, thanks.
>
> So my investigation overall led me to this set of conclusions:
>
> 1) Under PowerVM, we do not use this RTAS call, which is the only (as
> asserted by pHyp developers) valid way to get hardware information about
> the machine. Therefore, the PowerVM `lscpu` output is the "virtual" CPU
> information -- where cores are as defined by sharing of the L2-cache.
>
> 2) Under PowerKVM, we do not use this RTAS call, because it's not
> supported, and just spit out whatever the qemu topology is (which has no
> connection to the host (physical) CPU information).

Right.. so does that mean nothing is using this call yet?

> --> so if we implement the RTAS call of some sort under PowerKVM, then
> we can update `lscpu` to use that RTAS call.

Yeah, I'm not convinced that's correct.  Shouldn't lscpu return the
virtual cpu information, at least by default?

> 3) Under PowerNV, there is a dependency on the hack that is ibm,chip-id
> from OPAL, which leads to twice as many sockets potentially being
> reported. `lscpu` also uses the sysfs files directly, which may or may
> not be the physical topology (I'm still tracking all of this down).
>
> *Also* `lscpu` has no knowledge of offline/online CPUs, so as you
> online/offline CPUs, the output of `lscpu` starts to change.

Ah, true.

> I think what we eventually want to do is add some fields to `lscpu` to
> indicate the "physical" data vs. the "virtual" data.

Ok.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
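
As a rough illustration of the sysfs dependence raised in point 3 of the thread, the sketch below counts sockets and cores the way a sysfs-based tool might. It is not `lscpu`'s actual implementation: the function names `read_topology` and `summarize` are made up for this example, and it assumes the standard Linux `topology/physical_package_id` and `topology/core_id` files under `/sys/devices/system/cpu`. CPUs without a readable `topology` directory (which is typically the case for offline CPUs) are skipped, so the totals shift as CPUs are onlined or offlined.

```python
import os
import re
from collections import defaultdict


def read_topology(sysfs_cpu_dir="/sys/devices/system/cpu"):
    """Return {package_id: set(core_id)} for CPUs that expose topology files.

    Assumption: a CPU with no readable topology/ directory is treated as
    offline and ignored, which is why the result changes on hotplug.
    """
    packages = defaultdict(set)
    try:
        entries = sorted(os.listdir(sysfs_cpu_dir))
    except OSError:
        return {}  # no sysfs here (non-Linux host, container, ...)
    for entry in entries:
        # Only cpuN directories; skip cpufreq, cpuidle, online, etc.
        if not re.fullmatch(r"cpu\d+", entry):
            continue
        topo = os.path.join(sysfs_cpu_dir, entry, "topology")
        try:
            with open(os.path.join(topo, "physical_package_id")) as f:
                pkg = int(f.read().strip())
            with open(os.path.join(topo, "core_id")) as f:
                core = int(f.read().strip())
        except (OSError, ValueError):
            continue  # offline CPU or missing topology information
        packages[pkg].add(core)
    return dict(packages)


def summarize(packages):
    """Return (socket_count, core_count) as seen by the running kernel."""
    sockets = len(packages)
    cores = sum(len(core_ids) for core_ids in packages.values())
    return sockets, cores


if __name__ == "__main__":
    sockets, cores = summarize(read_topology())
    print(f"sockets={sockets} cores={cores}")
```

In a guest these numbers describe whatever topology the hypervisor presents, not the host hardware, which is the distinction the thread is arguing about.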