Hi Stefan,

Thanks for the detailed explanation!
at 6:21 PM, Stefan Hajnoczi <stefa...@gmail.com> wrote:
> Hi Jinhao,
> Thanks for working on this!
>
> irqfd is not necessarily faster than KVM ioctl interrupt injection.
>
> There are at least two non-performance reasons for irqfd:
> 1. It avoids QEMU emulation code, which historically was not thread safe and
> needed the Big QEMU Lock. IOThreads don't hold the BQL and therefore cannot
> safely call the regular interrupt emulation code in QEMU. I think this is
> still true today although parts of the code may now be less reliant on the
> BQL.

This probably means we need to move to irqfd when iothread support is
added to qemu-nvme.

> 2. The eventfd interface decouples interrupt injection from the KVM ioctl
> interface. Vhost kernel and vhost-user device emulation code has no
> dependency on KVM thanks to irqfd. They work with any eventfd, including
> irqfd.

This is contrary to our original belief. Klaus once pointed out that irqfd
is KVM-specific, and I agreed with him since I found the irqfd
implementation in virt/kvm/eventfd.c. But irqfd does indeed avoid the KVM
ioctl call. Could you elaborate on what “no dependency on KVM” means?
Sketch 1 at the end of this mail shows my current mental model.

> > 2. How can I debug this kind of cross QEMU-KVM problems?
>
> perf(1) is good at observing both kernel and userspace activity together.
> What is it that you want to debug?

I’ll look into perf(1). What I was trying to do is a breakdown analysis of
which part contributes how much latency, e.g. finding the root cause of the
performance improvements or regressions when irqfd is turned on.

> > What happens when the MSI-X vector is masked?
>
> I remember the VIRTIO code having masking support. I'm on my phone and can't
> check now, but I think it registers a temporary eventfd and buffers irqs
> while the vector is masked.

Yes, this RFC ignores interrupt masking support. Sketch 2 at the end of
this mail is my understanding of the VIRTIO scheme you describe.

> This makes me wonder if the VIRTIO and NVMe IOThread irqfd code can be
> unified. Maybe IOThread support can be built into the core device emulation
> code (e.g. irq APIs) so that it's not necessary to duplicate it.

Agreed. Recently, while working on ioeventfd, iothread and polling support,
my typical workflow has been to look at how virtio does something and then
adapt that code for nvme. Unifying the IOThread code would be beneficial,
since VIRTIO has accumulated many optimizations over the years that nvme
cannot directly benefit from. But I fear that subtle differences between
the two protocols may make the unification challenging. Sketch 3 at the end
of this mail is a rough idea of what a shared irq API could look like.

Again, thanks for your help :)
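
Sketch 1 (referenced above): a minimal illustration of my current mental
model, assuming QEMU's EventNotifier API. This is not code from the RFC,
and device_raise_irq()/wire_to_kvm()/irq_notifier_fd() are made-up names;
the point is that the device side only signals an eventfd and never sees
who consumes it:

#include "qemu/osdep.h"
#include "qemu/event_notifier.h"
#include "sysemu/kvm.h"

static EventNotifier irq_notifier;

/* One-time setup: create the eventfd behind the notifier. */
static int irq_notifier_setup(void)
{
    return event_notifier_init(&irq_notifier, 0);
}

/* Device emulation side: raise the interrupt.  No KVM ioctl, no BQL. */
static void device_raise_irq(void)
{
    event_notifier_set(&irq_notifier);
}

/* Wiring option A: hand the eventfd to KVM as an irqfd, so the kernel
 * injects the MSI directly.  virq would come from
 * kvm_irqchip_add_msi_route(). */
static int wire_to_kvm(int virq)
{
    return kvm_irqchip_add_irqfd_notifier_gsi(kvm_state, &irq_notifier,
                                              NULL, virq);
}

/* Wiring option B: no KVM at all.  Any consumer (a userspace fd
 * handler, or the VMM at the other end of a vhost-user connection) can
 * read the same fd.  device_raise_irq() is unchanged either way. */
static int irq_notifier_fd(void)
{
    return event_notifier_get_fd(&irq_notifier);
}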
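
Sketch 2 (referenced above): the masking scheme as I understand it from
your description, with made-up nvme_* helper names. While the vector is
masked, the eventfd is taken away from KVM and pending irqs are latched in
userspace; on unmask, the irqfd is rewired and the buffered irq replayed:

#include "qemu/osdep.h"
#include "qemu/event_notifier.h"
#include "qemu/main-loop.h"
#include "sysemu/kvm.h"

static bool vector_pending;     /* latched while the vector is masked */

static void nvme_latch_pending(void *opaque)
{
    EventNotifier *n = opaque;

    event_notifier_test_and_clear(n);
    vector_pending = true;      /* buffer the irq for unmask time */
}

static void nvme_vector_mask(EventNotifier *n, int virq)
{
    /* Stop KVM from consuming the eventfd... */
    kvm_irqchip_remove_irqfd_notifier_gsi(kvm_state, n, virq);
    /* ...and watch it in userspace instead so pending irqs are kept. */
    qemu_set_fd_handler(event_notifier_get_fd(n),
                        nvme_latch_pending, NULL, n);
}

static void nvme_vector_unmask(EventNotifier *n, int virq)
{
    qemu_set_fd_handler(event_notifier_get_fd(n), NULL, NULL, NULL);
    kvm_irqchip_add_irqfd_notifier_gsi(kvm_state, n, NULL, virq);
    if (vector_pending) {
        vector_pending = false;
        event_notifier_set(n);  /* replay the buffered irq */
    }
}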
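
Sketch 3 (referenced above): purely hypothetical; none of these functions
exist today and the names are made up.  It is only meant to show where a
shared, IOThread-aware irq API could live, i.e. core MSI-X code owning the
irqfd plumbing and the mask/unmask dance so devices don't duplicate it:

#include "qemu/osdep.h"
#include "qemu/event_notifier.h"
#include "hw/pci/pci.h"

/* Bind an eventfd to an MSI-X vector.  Core code would pick a KVM
 * irqfd when available and fall back to userspace injection (and
 * mask-time buffering) otherwise.  Hypothetical. */
int msix_vector_use_notifier(PCIDevice *dev, unsigned vector,
                             EventNotifier *n);

/* Undo the binding, e.g. on reset or vector deletion.  Hypothetical. */
void msix_vector_release_notifier(PCIDevice *dev, unsigned vector);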