On Thu, Dec 22, 2016 at 01:19:36PM -0500, Boris Ostrovsky wrote:
> On 12/22/2016 11:44 AM, Roger Pau Monne wrote:
> > On Thu, Dec 22, 2016 at 09:24:02AM -0700, Jan Beulich wrote:
> >>>>> On 22.12.16 at 17:17, <boris.ostrov...@oracle.com> wrote:
> >>> On 12/22/2016 07:17 AM, Roger Pau Monne wrote:
> >>>> Maybe Boris has some ideas about how to do CPU hotplug for Dom0?
> >>> Why would Xen need to be able to parse the AML that is intended to be
> >>> executed by dom0? I'd think that all the hypervisor would need to do is
> >>> to load it into memory, not any different from how it's done for regular
> >>> guests.
> >> Well, if Dom0 executed the unmodified _MAT, it would get back
> >> data relating to the physical CPU. Roger is overriding MADT to
> >> present virtual CPU information to Dom0, and this would likewise
> >> need to happen for the _MAT return value.
>
> By "unmodified _MAT" you mean system's _MAT? Why can't we provide our
> own that will return _MAT object properly adjusted for dom0? We are
> going to provide our own (i.e. not system's) DSDT, aren't we?

Providing Dom0 with a different DSDT is almost impossible from a Xen PoV. For
one, Xen cannot parse the original DSDT (because it's a dynamic table), and if
we were to provide a modified DSDT we would also need an ASL assembler, so that
we could parse the DSDT, modify it, and then compile it again in order to
provide it to Dom0. Although all this code could be put under an init section
that would be freed after Dom0 creation, it seems overkill and very far from
trivial, not to mention that I'm not even sure what side-effects there would be
if Xen parsed the DSDT itself without having any drivers.

> > This is one of the problems with this Dom/Xen0 split brain problem that we
> > have, and cannot get away from.
> >
> > To clarify a little bit, I'm not overwriting the original MADT in memory, so
> > Dom0 should still be able to access it if the implementation of _MAT returns
> > data from that area. AFAICT when plugging in a physical CPU (pCPU) into the
> > hardware, Dom0 should see "correct" data returned by the _MAT method.
> > However
> > (as represented by the " I've used), this data will not match Dom0 vCPU
> > topology, and should not be used by Dom0 (only reported to Xen in order to
> > bring up the new pCPU).
>
> So the problem seems to be that we need to run both _MATs --- system's
> and dom0's.

Exactly, we need _MAT for pCPUs and _MAT for vCPUs.

> > Then the problem arises because we have no way to perform vCPU hotplug for
> > Dom0, not at least as it is done for DomU (using ACPI), so we would have to
> > use
> > an out-of-band method in order to do vCPU hotplug for Dom0, which is a PITA.
>
>
> I would very much like to avoid this.

Maybe we can provide an extra SSDT for Dom0 that basically overwrites the CPU
objects (_PR.CPUX), but I'm not sure whether ACPI allows this kind of object
overwrite.

After reading the spec, I came across the following note in the SSDT section:

"Additional tables can only add data; they cannot overwrite data from previous
tables."

So I guess this is a no-go. I only see the following options:

 - Prevent Dom0 from using the original _MAT methods (or even the full _PR.CPU
   objects) using the STAO, and then provide Dom0 with an out-of-band method
   (ie: not ACPI) in order to do CPU hotplug.

 - Expand the STAO so that it can be used to override ACPI namespace objects,
   possibly by adding a payload field that contains AML code. It seems that
   Linux already supports overwriting part of the ACPI namespace from
   user-space[0], so part of the needed machinery seems to be already in place
   (hopefully in ACPICA code?).

 - Disable the native CPU objects in the DSDT/SSDT using the STAO. Then pick
   unused ACPI CPU IDs and use those for vCPUs. Provide an additional SSDT that
   contains ACPI objects for those vCPUs (as is done for DomU). This means we
   would probably have to start using x2APIC entries in the MADT, since the CPU
   IDs might easily expand past 255 (AFAICT we could still keep the APIC IDs
   low however, since the ACPI ID and the APIC ID are independent of each
   other).

I don't really fancy any of these options. The last one is probably the best
IMHO, but I would like to hear some feedback about them, and of course I'm open
to suggestions :).

Roger.

[0] https://www.kernel.org/doc/Documentation/acpi/method-customizing.txt

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
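
As a rough illustration of the last option above (vCPUs given previously unused
ACPI IDs that may go past 255 while their APIC IDs stay low), the sketch below
packs a single MADT processor entry. The field layouts follow the ACPI
specification's Processor Local APIC (type 0) and Processor Local x2APIC
(type 9) structures. The helper name madt_cpu_entry and the rule for choosing
between the two forms are assumptions made for this example; none of this is
Xen code.

```python
import struct

# MADT interrupt controller structure types, per the ACPI specification.
MADT_LOCAL_APIC = 0
MADT_LOCAL_X2APIC = 9
MADT_ENABLED = 1  # bit 0 of the Flags field


def madt_cpu_entry(acpi_uid, apic_id, enabled=True):
    """Pack one MADT processor entry (hypothetical helper for illustration).

    Emits the 8-byte Local APIC structure when the ACPI UID fits in 8 bits
    and the APIC ID is below 0xFF (the xAPIC broadcast ID); otherwise emits
    the 16-byte Local x2APIC structure, whose ID fields are 32 bits wide.
    """
    flags = MADT_ENABLED if enabled else 0
    if acpi_uid <= 0xFF and apic_id < 0xFF:
        # type (u8), length (u8), ACPI processor UID (u8), APIC ID (u8),
        # flags (u32)
        return struct.pack("<BBBBI", MADT_LOCAL_APIC, 8,
                           acpi_uid, apic_id, flags)
    # type (u8), length (u8), reserved (u16), x2APIC ID (u32), flags (u32),
    # ACPI processor UID (u32)
    return struct.pack("<BBHIII", MADT_LOCAL_X2APIC, 16, 0,
                       apic_id, flags, acpi_uid)
```

For example, madt_cpu_entry(300, 4) yields a 16-byte x2APIC entry carrying
ACPI UID 300 and APIC ID 4, while madt_cpu_entry(1, 2) still fits the 8-byte
Local APIC form.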