On 09/04/2013 10:13 PM, Alexander Graf wrote: > > On 04.09.2013, at 13:40, Alexey Kardashevskiy wrote: > >> On 09/04/2013 08:42 PM, Alexander Graf wrote: >>> >>> On 04.09.2013, at 12:19, Alexey Kardashevskiy wrote: >>> >>>> This is an RFC patch. >>>> >>>> The modern Linux kernel supports every known POWERPC CPU so when >>>> it boots, it can always find a matching cpu_spec from the cpu_specs array. >>>> However if the kernel is quite old, it may be missing the definition of >>>> the actual CPU. To provide ability for old kernels to work on modern >>>> hardware, a Logical Processor Version concept was introduced in PowerISA. >>>> From the hardware prospective, it is supported by PCR (Processor >>>> Compatibility Register) which is defined in PowerISA. The register >>>> enables compatibility mode which can be set to PowerISA 2.05 or 2.06. >>>> >>>> PAPR+ specification defines a Logical Processor Version per every >>>> version of PowerISA specification. PAPR+ also defines >>>> a ibm,client-architecture-support rtas call which purpose is to provide >>>> a negotiation mechanism for the guest and the hypervisor to work out >>>> the best Logical Processor Version to continue with. >>>> >>>> At the moment, the Linux kernel calls the ibm,client-architecture-support >>>> method and only then reads the device. The current RTAS's handler checks >>>> the capabilities from the array supplied by the guest kernel, analyses >>>> if QEMU can or cannot provide with the requested features. >>>> If QEMU supports everything the guest has requested, it returns from rtas >>>> call and the guest continues booting. >>>> If some parameter changes, QEMU fixes the device tree and reboots >>>> the guest with a new tree. >>>> >>>> In this version, the ibm,client-architecture-support handler checks >>>> if the current CPU is in the list from the guest and if it is not, QEMU >>>> adds a "cpu-version" property to a cpu node with the best of logical PVRs >>>> supported by the guest. >>>> >>>> Technically QEMU reboots and as a part of reboot, it fixes the tree and >>>> this is when the cpu-version property is actually added. >>>> >>>> Although it seems possible to add a custom interface between SLOF and QEMU >>>> and implement device tree update on the fly to avoid a guest reboot, >>>> there still may be cases when device tree change would not be enough. >>>> As an example, the guest may ask for a bigger RMA area than QEMU allocates >>>> by default. >>>> >>>> The patch depends on "[PATCH v5] powerpc: add PVR mask support". >>>> >>>> Cc: Nikunj A Dadhania <nik...@linux.vnet.ibm.com> >>>> Cc: Andreas Färber <afaer...@suse.de> >>>> Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru> >>>> --- >>>> hw/ppc/spapr.c | 10 ++++++ >>>> hw/ppc/spapr_hcall.c | 76 >>>> +++++++++++++++++++++++++++++++++++++++++++++ >>>> include/hw/ppc/spapr.h | 7 ++++- >>>> target-ppc/cpu-models.h | 13 ++++++++ >>>> target-ppc/cpu-qom.h | 1 + >>>> target-ppc/translate_init.c | 3 ++ >>>> 6 files changed, 109 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c >>>> index 13574bf..5adf53c 100644 >>>> --- a/hw/ppc/spapr.c >>>> +++ b/hw/ppc/spapr.c >>>> @@ -238,6 +238,16 @@ static int spapr_fixup_cpu_dt(void *fdt, >>>> sPAPREnvironment *spapr) >>>> if (ret < 0) { >>>> return ret; >>>> } >>>> + >>>> + if (spapr->pvr_new) { >>>> + ret = fdt_setprop(fdt, offset, "cpu-version", >>>> + &spapr->pvr_new, sizeof(spapr->pvr_new)); >>>> + if (ret < 0) { >>>> + return ret; >>>> + } >>>> + /* Reset as the guest after reboot may give other PVR set */ >>>> + spapr->pvr_new = 0; >>>> + } >>>> } >>>> return ret; >>>> } >>>> diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c >>>> index 9f6e7b8..509de89 100644 >>>> --- a/hw/ppc/spapr_hcall.c >>>> +++ b/hw/ppc/spapr_hcall.c >>>> @@ -3,6 +3,7 @@ >>>> #include "helper_regs.h" >>>> #include "hw/ppc/spapr.h" >>>> #include "mmu-hash64.h" >>>> +#include "cpu-models.h" >>>> >>>> static target_ulong h_random(PowerPCCPU *cpu, sPAPREnvironment *spapr, >>>> target_ulong opcode, target_ulong *args) >>>> @@ -792,6 +793,78 @@ out: >>>> return ret; >>>> } >>>> >>>> +static target_ulong h_client_architecture_support(PowerPCCPU *cpu, >>>> + sPAPREnvironment *spapr, >>>> + target_ulong opcode, >>>> + target_ulong *args) >>>> +{ >>>> + target_ulong list = args[0]; >>>> + int i, number_of_option_vectors; >>>> + PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu); >>>> + bool cpu_match = false; >>>> + unsigned compat_cpu_level = 0, pvr_new; >>>> + >>>> + /* Parse PVR list */ >>>> + for ( ; ; ) { >>>> + uint32_t pvr, pvr_mask; >>>> + >>>> + pvr_mask = ldl_phys(list); >>>> + list += 4; >>>> + pvr = ldl_phys(list); >>>> + list += 4; >>>> + >>>> + if ((cpu->env.spr[SPR_PVR] & pvr_mask) == (pvr & pvr_mask)) { >>>> + cpu_match = true; >>>> + pvr_new = cpu->env.spr[SPR_PVR]; >>>> + } >>>> + >>>> + /* Is it a logical PVR? */ >>>> + if ((pvr & CPU_POWERPC_LOGICAL_MASK) == CPU_POWERPC_LOGICAL_MASK) >>>> { >>>> + switch (pvr) { >>>> + case CPU_POWERPC_LOGICAL_2_05: >>>> + if ((pcc->pcr & POWERPC_ISA_COMPAT_2_05) && >>>> + (compat_cpu_level < 2050)) { >>>> + compat_cpu_level = 2050; >>>> + pvr_new = pvr; >>>> + } >>>> + break; >>>> + case CPU_POWERPC_LOGICAL_2_06: >>>> + if ((pcc->pcr & POWERPC_ISA_COMPAT_2_06) && >>>> + (compat_cpu_level < 2060)) { >>>> + compat_cpu_level = 2060; >>>> + pvr_new = pvr; >>>> + } >>>> + break; >>>> + case CPU_POWERPC_LOGICAL_2_06_PLUS: >>>> + if ((pcc->pcr & POWERPC_ISA_COMPAT_2_06) && >>>> + (compat_cpu_level < 2061)) { >>>> + compat_cpu_level = 2061; >>>> + pvr_new = pvr; >>>> + } >>>> + break; >>>> + } >>>> + } >>>> + >>>> + if (~pvr_mask & pvr) { >>>> + break; >>>> + } >>>> + } >>>> + >>>> + /* Parsing finished. Making decision whether to reboot or not */ >>>> + if (cpu_match) { >>>> + return H_SUCCESS; >>>> + } >>>> + if (pvr_new == cpu->env.spr[SPR_PVR]) { >>>> + return H_SUCCESS; >>>> + } >>>> + >>>> + cpu->env.spr[SPR_PVR] = pvr_new; >>> >> >>> This would change the SPR value the guest sees. IIUC this is not what is >>> happening with this call. >> >> I am always trying to avoid adding new variables but yes, this is not the >> case, I'll fix it. >> >>> >>>> + spapr->pvr_new = pvr_new; >>>> + qemu_system_reset_request(); >>> >> >>> I think there should be a way to define the compat mode from the >>> beginning without the need to reboot the machine from itself. >> >> That is a different problem which I do not how to solve in a way to make >> everybody happy. Add logical CPUs to every CPU family (at least for >> POWER7/7+/8)? >
> I'm not sure the CPU is really the right place to put this information. > After all, you are not changing the CPU itself, you're only changing a > few bits inside of that CPU that make it appear closer to the legacy CPU > plus a few device tree changes. > So IMHO this whole thing should be orthogonal to -cpu. Well, since we cannot change CPU class on the fly, yes, it should be a "compatibility" flags/properties/methods/whatever of the default CPU for the specific family. >>> That way >>> management tools can straight on create POWER6 compat machines without >>> jumping through reboot hoops. >> >> One of the examples (came from Paul) is: >> the host runs on POWER8, the guest boots a kernel which is capable of >> POWER7 only + POWER7-compat. We do this reboot tweak and boot in >> POWER7-compat mode. Then the guest does "yum update" and gets POWER8 kernel >> installed so when it reboots, it will boot in normal POWER8 mode. Everybody >> is happy. >> >> Having POWER7-compat mode set from the very beginning will break this >> behaviour. > > Why? Just because you're on POWER7 as default doesn't mean you can't bump to > a newer compat later on, no? Bump when exactly? And it won't be a new compat, it will be a native CPU. I thought you are totally against reboots and you want libvirt (for example) to detect that this was ibm,client-architecture-support and reboot the guest with new CPU type (or machine option). > > > Alex > -- Alexey