On Tue, Sep 20, 2022 at 2:27 PM Tamas K Lengyel <tamas.k.leng...@gmail.com> wrote:
> > > On Tue, Sep 20, 2022 at 2:12 PM Boris Ostrovsky < > boris.ostrov...@oracle.com> wrote: > >> >> On 9/20/22 10:54 AM, Jan Beulich wrote: >> > On 20.09.2022 16:26, Boris Ostrovsky wrote: >> >> On 9/20/22 4:01 AM, Jan Beulich wrote: >> >>> On 20.09.2022 00:42, Boris Ostrovsky wrote: >> >>>> It is saving vpmu data from current pcpu's MSRs for a remote vcpu so >> @v >> >>>> in vmx_find_msr() is not @current: >> >>>> >> >>>> vpmu_load() >> >>>> ... >> >>>> prev = per_cpu(last_vcpu, pcpu); >> >>>> vpmu_save_force(prev) >> >>>> core2_vpmu_save() >> >>>> __core2_vpmu_save() >> >>>> vmx_read_guest_msr() >> >>>> vmx_find_msr() >> >>>> >> >>>> >> >>>> The call to vmx_find_msr() was introduced by 755087eb9b10c. I wonder >> though whether >> >>>> this call is needed when code path above is executed (i.e. when we >> are saving >> >>>> remove vcpu) >> >>> How could it not be needed? We need to obtain the guest value. The >> >>> thing I don't understand is why this forced saving is necessary, >> >>> when context_switch() unconditionally calls vpmu_switch_from(). >> >> >> >> IIRC the logic is: >> >> >> >> 1. vcpuA runs on pcpu0 >> >> 2. vcpuA is de-scheduled and is selected to run on pcpu1. It has not >> yet called vpmu_load() from pcpu1 >> > The calling of vpmu_load() shouldn't matter here. What does matter is >> > that vpmu_save() was necessarily called already. >> >> >> I thought we don't always save MSRs on context switch because we optimize >> for case when the same vcpu gets scheduled again. But I am not sure I see >> this now that I am looking at the code. >> > > I see context_switch calling vpmu_save_from before __context_switch, but > if that did actually save the vPMU state it would have reset > VPMU_CONTEXT_LOADED, so by the time vpmu_load calls vpmu_save_force it > would have just bailed before hitting the ASSERT failure in vmx_find_msrs. > The context must still be loaded at that point (I'm trying to verify right > now by adding some printks). > OK, Boris was correct above, MSRs are not saved on context switch automatically because of that optimization. VPMU_CONTEXT_SAVE isn't set, so the only thing vpmu_switch_from does is set global control to 0 in case it was a PV vCPU (see core2_vpmu_save checks for both VPMU_CONTEXT_SAVE and VPMU_CONTEXT_LOADED) and vpmu_switch_from doesn't set VPMU_CONTEXT_SAVE. So for HVM vCPUs it does nothing, that's why we still see the context still loaded when we get to vpmu_load. It would be a simple enough change to make vpmu_switch_from always save the context and then we could get rid of vpmu_load trying to do it later and getting into that ASSERT failure. Thoughts? Tamas