On 2025-04-24 21:27:59 [+0530], Shrikanth Hegde wrote:
> > > diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> > > index 19f4d298d..123539642 100644
> > > --- a/arch/powerpc/kvm/book3s_hv.c
> > > +++ b/arch/powerpc/kvm/book3s_hv.c
> > > @@ -4901,7 +4901,7 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, 
> > > u64 time_limit,
> > >           }
> > >           if (need_resched())
> > > -         cond_resched();
> > > +         schedule();
> > 
> 
> 
> > This looks unrelated and odd. I don't why but this should be a
> > cond_resched() so it can be optimized away on PREEMPT kernels.
> 
> This is needed, otherwise KVM on powerVM setup gets stuck on 
> preempt=full/lazy.

But this makes no sense. On preempt=full the cond_resched() gets patched
out while schedule() doesn't. Okay, this explains the stuck.

On preempt=full need_resched() should not return true because the
preemption happens right away. Unless you are in a preempt-disabled
or interrupt disabled section. But any of the conditions can't be true
because in both cases you can't invoke schedule(). So you must have had
a wake up on the local CPU which sets need-resched but the schedule()
was delayed for some reason. Once that need-resched bit is observed by
a remote CPU then it won't send an interrupt for a scheduling request
because it should happen any time soon… This should be fixed.

If you replace the above with preempt_disable(); preempt_enable() then it
should also work…

…
> > > --- a/arch/powerpc/kvm/powerpc.c
> > > +++ b/arch/powerpc/kvm/powerpc.c
> > > @@ -34,6 +34,7 @@
> > >   #endif
> > >   #include <asm/ultravisor.h>
> > >   #include <asm/setup.h>
> > > +#include <linux/entry-kvm.h>
> > >   #include "timing.h"
> > >   #include "../mm/mmu_decl.h"
> > > @@ -80,24 +81,17 @@ int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
> > >   {
> > >           int r;
> > > + /* use generic framework to handle need resched and signals */
> > > + if (__xfer_to_guest_mode_work_pending()) {
> > > +         r = xfer_to_guest_mode_handle_work(vcpu);
> > 
> > there is nothing special you do checking and handling the work. Couldn't
> > you invoke xfer_to_guest_mode_handle_work() unconditionally?
> > 
> 
> I followed what was in arch/x86/kvm/x86.c. Since 
> xfer_to_guest_mode_handle_work does the same check
> it makes sense to call it without checks too.

Yeah but I guess x86 did some other updates, too.

> Will update in v2.
> 
…
> > > -
> > > -         if (signal_pending(current)) {
> > > -                 kvmppc_account_exit(vcpu, SIGNAL_EXITS);
> > > -                 vcpu->run->exit_reason = KVM_EXIT_INTR;
> > > -                 r = -EINTR;
> > > -                 break;
> > 
> > I don't how this works but couldn't SIGNAL_EXITS vanish now that it
> > isn't updated anymore? The stat itself moves in kvm_handle_signal_exit()
> > to a different counter so it is not lost. The reader just needs to look
> > somewhere else for it.
> 
> ok. thanks for pointing out.
> 
> AFAIU it is updating the stats mostly. But below could keep the stats happy.
> I will update that in v2.
> 
>         if (__xfer_to_guest_mode_work_pending()) {
>                 r = xfer_to_guest_mode_handle_work(vcpu);
> +               /* generic framework doesn't update ppc specific stats*/
> +               if (r == -EINTR)
> +                       kvmppc_account_exit(vcpu, SIGNAL_EXITS);
>                 if (r)
>                         return r;

Either that or you rip it out entirely but that is not my call.

Sebastian

Reply via email to