On Wednesday 05 November 2014 04:37 PM, Alexander Graf wrote: > > > On 05.11.14 11:37, Aravinda Prasad wrote: >> >> >> On Wednesday 05 November 2014 02:02 PM, Alexander Graf wrote: >>> >>> >>> On 05.11.14 08:13, Aravinda Prasad wrote: >>>> This patch adds FWNMI support in qemu for powerKVM >>>> guests by handling the ibm,nmi-register rtas call. >>>> Whenever OS issues ibm,nmi-register RTAS call, the >>>> machine check notification address is saved and the >>>> machine check interrupt vector 0x200 is patched to >>>> issue a private hcall. >>>> >>>> This patch also handles the cases when multi-processors >>>> experience machine check at or about the same time. >>>> As per PAPR, subsequent processors serialize waiting >>>> for the first processor to issue the ibm,nmi-interlock call. >>>> The second processor retries if the first processor which >>>> received a machine check is still reading the error log >>>> and is yet to issue ibm,nmi-interlock call. >>>> >>>> Signed-off-by: Aravinda Prasad <aravi...@linux.vnet.ibm.com> >>>> --- >>>> hw/ppc/spapr_hcall.c | 16 +++++++ >>>> hw/ppc/spapr_rtas.c | 93 >>>> +++++++++++++++++++++++++++++++++++++++ >>>> include/hw/ppc/spapr.h | 17 +++++++ >>>> pc-bios/spapr-rtas/spapr-rtas.S | 38 ++++++++++++++++ >>>> 4 files changed, 163 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c >>>> index 8f16160..eceb5e5 100644 >>>> --- a/hw/ppc/spapr_hcall.c >>>> +++ b/hw/ppc/spapr_hcall.c >>>> @@ -97,6 +97,9 @@ struct rtas_mc_log { >>>> struct rtas_error_log err_log; >>>> }; >>>> >>>> +/* Whether machine check handling is in progress by any CPU */ >>>> +bool mc_in_progress; >>>> + >>>> static void do_spr_sync(void *arg) >>>> { >>>> struct SPRSyncState *s = arg; >>>> @@ -678,6 +681,19 @@ static target_ulong h_report_mc_err(PowerPCCPU *cpu, >>>> sPAPREnvironment *spapr, >>>> cpu_synchronize_state(CPU(ppc_env_get_cpu(env))); >>>> >>>> /* >>>> + * Only one VCPU can process machine check NMI at a time. Hence >>>> + * set the lock mc_in_progress. Once the VCPU finishes processing >>>> + * NMI, it executes ibm,nmi-interlock and mc_in_progress is unset >>>> + * in ibm,nmi-interlock handler. Meanwhile if other VCPUs encounter >>>> + * NMI we return 0 asking the VCPU to retry h_report_mc_err >>>> + */ >>>> + if (mc_in_progress == 1) { >>> >>> Please don't depend on bools being numbers. Use true / false. For if()s, >>> just don't use == at all - it makes it more readable. >> >> ok >> >>> >>>> + return 0; >>>> + } >>>> + >>>> + mc_in_progress = 1; >>>> + >>>> + /* >>>> * We save the original r3 register in SPRG2 in 0x200 vector, >>>> * which is patched during call to ibm.nmi-register. Original >>>> * r3 is required to be included in error log >>>> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c >>>> index 2ec2a8e..71c7662 100644 >>>> --- a/hw/ppc/spapr_rtas.c >>>> +++ b/hw/ppc/spapr_rtas.c >>>> @@ -36,6 +36,9 @@ >>>> >>>> #include <libfdt.h> >>>> >>>> +#define BRANCH_INST_MASK 0xFC000000 >>>> +extern bool mc_in_progress; >>> >>> Please put this into the spapr struct. >> >> ok >> >>> >>>> + >>>> static void rtas_display_character(PowerPCCPU *cpu, sPAPREnvironment >>>> *spapr, >>>> uint32_t token, uint32_t nargs, >>>> target_ulong args, >>>> @@ -290,6 +293,90 @@ static void rtas_ibm_os_term(PowerPCCPU *cpu, >>>> rtas_st(rets, 0, ret); >>>> } >>>> >>>> +static void rtas_ibm_nmi_register(PowerPCCPU *cpu, >>>> + sPAPREnvironment *spapr, >>>> + uint32_t token, uint32_t nargs, >>>> + target_ulong args, >>>> + uint32_t nret, target_ulong rets) >>>> +{ >>>> + int i; >>>> + uint32_t ori_inst = 0x60630000; >>>> + uint32_t branch_inst = 0x48000002; >>>> + target_ulong guest_machine_check_addr; >>>> + uint32_t trampoline[TRAMPOLINE_INSTS]; >>>> + int total_inst = sizeof(trampoline) / sizeof(uint32_t); >>> >>> ARRAY_SIZE(trampoline), though I don't quite understand why you need a >>> variable that contains the same value as a constant (TRAMPOLINE_INSTS). >>> >>> But since you're moving all of those bits into variable fields on the >>> rtas blob itself as we discussed in the last version, I guess this code >>> will go away anyways ;). >> >> I think we still need this. We need to patch the KVMPPC_H_REPORT_MC_ERR >> number and branch address in the trampoline and also, depending on >> whether the guest running in LE/BE we may need to flip the bits in the >> trampoline before copying it to 0x200 machine check vector. >> >> As rtas-blob is part of the guest memory I do not want to patch these in >> rtas-blob, hence I copy the trampoline from the rtas-blob to an array, >> modify accordingly and then move it to 0x200 machine check vector. > > Yes, you will still need the array. But the array should be dynamically > sized based on spapr->rtas_info->fwnmi_size which you extract from the > blob on load. > > That way you wouldn't need the "total_inst" variable anymore ;).
Yes, I will fix it that way. > >> >>> >>>> + PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu); >>>> + >>>> + /* Store the system reset and machine check address */ >>>> + guest_machine_check_addr = rtas_ld(args, 1); >>> >>> Load or Store? I don't find the comment particularly useful either ;). >> >> will reword it or may delete it. >> >>> >>>> + >>>> + /* >>>> + * Read the trampoline instructions from RTAS Blob and patch >>>> + * the KVMPPC_H_REPORT_MC_ERR hcall number and the guest >>>> + * machine check address before copying to 0x200 vector >>>> + */ >>>> + cpu_physical_memory_read(spapr->rtas_addr + RTAS_TRAMPOLINE_OFFSET, >>>> + trampoline, sizeof(trampoline)); >>>> + >>>> + /* Safety Check */ >>> >>> Same for this comment. >> >> we have only 0x100 bytes that can be copied at 0x200. If the trampoline >> size exceeds then the next interrupt vector code is overwritten. Hence a >> safety check during compile time to make sure trampoline is within 0x100 >> bytes. > > I think the check is fine, the comment is just redundant with > QEMU_BUILD_BUG_ON. Either be more verbose in the comment or remove it I will add above lines as comment. > ;). But something a la > > /* check for failure */ > BUG_ON(foo); > > is useless redundancy, because everyone already knows that BUG_ON checks > for failure. > > The interesting bit is the why. Also, as a general rule of thumb, if you > need a comment explaining the "what" of what you're doing, your function > and/or variable names are probably not well chosen ;). But I don't think > this is a problem here. > > Thanks for the patches btw :) > > > Alex > -- Regards, Aravinda