> On 27 May 2025, at 16:52, Marc Zyngier <m...@kernel.org> wrote:
> 
> On Tue, 27 May 2025 16:55:32 +0100,
> Miguel Luis <miguel.l...@oracle.com> wrote:
>> 
>> Hi Marc,
>> 
>>> On 27 May 2025, at 13:46, Marc Zyngier <m...@kernel.org> wrote:
>>> 
>>> On Tue, 27 May 2025 14:24:31 +0100,
>>> Miguel Luis <miguel.l...@oracle.com> wrote:
>>>> 
>>>> 
>>>> 
>>>>> On 27 May 2025, at 12:02, Marc Zyngier <m...@kernel.org> wrote:
>>>>> 
>>>>> On Tue, 27 May 2025 12:40:35 +0100,
>>>>> Miguel Luis <miguel.l...@oracle.com> wrote:
>>>>>> 
>>>>>> Hi Marc,
>>>>>> 
>>>>>>> On 27 May 2025, at 07:39, Marc Zyngier <m...@kernel.org> wrote:
>>>>>>> 
>>>>>>> Hi Eric,
>>>>>>> 
>>>>>>> On Tue, 27 May 2025 07:24:32 +0100,
>>>>>>> Eric Auger <eric.au...@redhat.com> wrote:
>>>>>>>> 
>>>>>>>> Now that ARM nested virt has landed in kvm/next, let's turn the series
>>>>>>>> into a PATCH series. The linux header update was made against kvm/next.
>>>>>>>> 
>>>>>>>> To gain virt functionality in a KVM-accelerated L1, the host needs to
>>>>>>>> be booted with the "kvm-arm.mode=nested" option and qemu needs to be
>>>>>>>> invoked with: -machine virt,virtualization=on.
>>>>>>> 
>>>>>>> Thanks for respinning this series.
>>>>>>> 
>>>>>>> Do you have any plan to support the non-VHE version of the NV support
>>>>>>> (as advertised by KVM_CAP_ARM_EL2_E2H0)? It would allow running lesser
>>>>>>> hypervisors (such as *cough* Xen *cough*), which completely rely on
>>>>>>> HCR_EL2.E2H being 0?
>>>>>>> 
>>>>>> 
>>>>>> Something that pops up is early_kvm_mode_cfg trying to handle nested mode
>>>>>> while KVM_ARM_VCPU_HAS_EL2_E2H0 is set.
>>>>> 
>>>>> Care to elaborate?
>>>>> 
>>>> 
>>>> Say the host is booted in nested mode (kvm-arm.mode=nested) and the host's
>>>> KVM supports both KVM_CAP_ARM_EL2 and KVM_CAP_ARM_EL2_E2H0.
>>>> 
>>>> An L1 guest boots with both KVM_ARM_VCPU_HAS_EL2 and
>>>> KVM_ARM_VCPU_HAS_EL2_E2H0 set, and the guest kernel's command line
>>>> states kvm-arm.mode=nested.
>>>> 
>>>> This splats the kernel from early_kvm_mode_cfg, along with a malformed
>>>> early option message.
>>> 
>>> PEBKAC. You are asking for nested on a (virtual) machine that doesn't
>>> support it, and the kernel tells you so with a warning. Try the same
>>> thing on a physical machine that doesn't have NV, and observe the
>>> result.
>>> 
>> 
>> Ack.
>> 
>> I find trying them a great way to improve resilience.
>> I’ve tried the scenarios below which have similar results on the guest:
>> 
>> 1.
>> Host: kvm-arm.mode=nested
>> 
>> L1 Guest: kvm-arm.mode=nvhe setting both
>> KVM_ARM_VCPU_HAS_EL2 and KVM_ARM_VCPU_HAS_EL2_E2H0
>> 
>> Result on the guest: No early_kvm_mode_cfg splat, boot proceeds, ends up in 
>> a hard lockup splat.
> 
> Setting kvm-arm.mode=nvhe when KVM_ARM_VCPU_HAS_EL2_E2H0 is set is a
> tautology. The very definition of nVHE is that HCR_EL2.E2H=0.
> 
>> 
>> 2.
>> Host: kvm-arm.mode=nested
>> 
>> L1 Guest: kvm-arm.mode=nested setting both
>> KVM_ARM_VCPU_HAS_EL2 and KVM_ARM_VCPU_HAS_EL2_E2H0
>> 
>> Result on the guest: Splat at early_kvm_mode_cfg, boot proceeds, ends up in 
>> hard lockup splat.
> 
> I don't see any of these lockups with kvmtool. See this:
> 
> https://pastebin.com/uyYzsBHc

Could you try bigger values for -c and check whether you can reproduce the 
issue?

> 
> for an example of a boot with both capabilities set and the nonsense
> "nested" on the command-line (your #2).
> 
>> Does this mean there’s a default fallback mode in which NV gets turned on
>> when the kvm-arm.mode fed to the guest kernel cmdline differs from what is
>> expected?
> 
> I don't understand your question. We have two modes of operation:
> 
> - HAS_EL2 enables NV on the host, and additionally enables recursive
>  NV. As a consequence, HCR_EL2.E2H is RES1. This is how NV will be
>  supported long term.
> 
> - HAS_EL2_E2H0 restricts the above by not exposing NV to the guest,
>  and enforcing HCR_EL2.E2H to be RES0. I expect this to gradually be
>  removed from implementations, and eventually disappear.
> 
> As you can see, there is no "fallback mode". You pick the mode you
> want based on the guest you want to run and the capabilities of the
> hardware.
> 
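
For reference, the two modes described above can be written down as a small
lookup. This is only a sketch: the flag constants are stand-ins whose numeric
values are made up for illustration (they are not the UAPI bit numbers), and
GuestMode / describe_mode are hypothetical names, not anything from KVM, QEMU
or kvmtool.

```python
from dataclasses import dataclass

# Stand-ins for the vCPU feature flags named in the thread.
# ASSUMPTION: these values are illustrative only, not the UAPI bit numbers.
HAS_EL2 = 1 << 0
HAS_EL2_E2H0 = 1 << 1


@dataclass(frozen=True)
class GuestMode:
    el2: bool           # guest sees a virtual EL2
    e2h: str            # how HCR_EL2.E2H behaves: "RES1", "RES0" or "n/a"
    recursive_nv: bool  # NV is exposed to the guest itself


def describe_mode(features: int) -> GuestMode:
    """Map the feature flags to the behaviour described in the thread."""
    if not features & HAS_EL2:
        # No virtual EL2 requested. The thread does not discuss E2H0 without
        # HAS_EL2; this sketch simply treats it as "no EL2".
        return GuestMode(el2=False, e2h="n/a", recursive_nv=False)
    if features & HAS_EL2_E2H0:
        # HAS_EL2_E2H0: NV is not exposed to the guest, E2H is RES0.
        return GuestMode(el2=True, e2h="RES0", recursive_nv=False)
    # HAS_EL2 alone: recursive NV, E2H is RES1.
    return GuestMode(el2=True, e2h="RES1", recursive_nv=True)
```

With both flags set (scenarios 1 and 2 above), the lookup gives E2H as RES0 and
no NV for the guest, which is consistent with kvm-arm.mode=nested being
rejected with a warning inside such a guest.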

I now suspect the lockups have a cause other than the guest’s mode.

Thanks
Miguel

> M.
> 
> -- 
> Without deviation from the norm, progress is not possible.
