On Thu, Jul 03, 2025 at 11:21:38PM +0200, Cédric Le Goater wrote:
> On 7/3/25 21:45, Peter Xu wrote:
> > On Wed, Jul 02, 2025 at 03:41:08PM -0400, Steven Sistare wrote:
> > > The irq producer is not closed, but it is detached from the kvm consumer.
> > > It's eventfd is preserved in new QEMU, and interrupts that arrive during
> > > transition are pended there.
> > 
> > Ah I see, looks reasonable.
> > 
> > So can I understand the core issue here is about the irq consumer /
> > provider updates are atomic, meanwhile there's always the fallback paths
> > ready, so before / after the update the irq won't get lost?
> > 
> > E.g. in Post-Interrupt context of Intel's, the irte will be updated
> > atomically for these VFIO irqs, so that either it'll keep using the fast
> > path (provided by the irqbypass mechanism), or slow path (eventfd_signal),
> > so it's free of any kind of race that irq could trigger?
> > 
> > I saw that there's already a new version and Cedric queued it.  If possible
> > add some explanation into commit message, either when repost, or when
> > merge, would be nice, on explaning irq won't get lost.
> yes.
> 
> Steve, just resend the patch. I will update the vfio queue.
> Or we can address that with a follow up patch before QEMU 10.1
> is released.

I've just noticed maybe I was wrong that slow path was always present.
We've closed the kvm so likely the slow path is gone..

So I think I misunderstood, and Steve likely meant the irq will be
persisted in eventfd, which is still true if the irq eventfds are persisted
and passed over (I didn't check the patchset, but I'm assuming this is the
case).

Then I found, yes, indeed when irqfd is re-established on dest qemu, we
have such tricky code:

kvm_irqfd_assign():

        /*
         * Check if there was an event already pending on the eventfd
         * before we registered, and trigger it as if we didn't miss it.
         */
        events = vfs_poll(fd_file(f), &irqfd->pt);

        if (events & EPOLLIN)
                schedule_work(&irqfd->inject);

I've no idea whether it was intended to do this as the code was there since
2009, maybe this chunk of code is the core of why irq won't get lost for
CPR.  But in all cases, it can be a pretty tricky spot to prove that cpr
works and looks important piece of info.

Personally I'm ok doing it on top of what's queued.  Maybe such explanation
on how it works should be put directly into docs/../cpr.rst?

Thanks,

-- 
Peter Xu


Reply via email to