Re: [PATCH v3 4/4] hw/nvme: add polling support

2022-11-08 Thread Jinhao Fan
at 10:11 PM, Klaus Jensen wrote: > On Nov 8 12:39, John Levon wrote: >> On Fri, Nov 04, 2022 at 07:32:12AM +0100, Klaus Jensen wrote: >> >>> On Nov 3 21:19, Jinhao Fan wrote: >>>> On 11/3/2022 8:10 PM, Klaus Jensen wrote: >>>>> I agree that

Re: [PATCH v3 4/4] hw/nvme: add polling support

2022-11-03 Thread Jinhao Fan
On 11/3/2022 8:10 PM, Klaus Jensen wrote: I agree that the spec is a little unclear on this point. In any case, in Linux, when the driver has decided that the sq tail must be updated, it will use this check: (new_idx - event_idx - 1) < (new_idx - old) When eventidx is already behind, it's l

Re: [PATCH v3 3/4] hw/nvme: add iothread support

2022-11-03 Thread Jinhao Fan
On 11/3/2022 8:11 PM, Klaus Jensen wrote: I'll rerun some experiments and get back to you 😄 Thanks Klaus. I'll also run some experments on my machine, to see if the reenabling cqe batching patch solves this problem.

Re: [PATCH v3 4/4] hw/nvme: add polling support

2022-11-02 Thread Jinhao Fan
at 7:10 PM, Klaus Jensen wrote: > This doesn't do what you expect it to. By not updaring the eventidx it > will fall behind the actual head, causing the host to think that the > device is not processing events (but it is!), resulting in doorbell > ringing. I’m not sure I understand this correctl

Re: [PATCH v3 3/4] hw/nvme: add iothread support

2022-11-02 Thread Jinhao Fan
at 7:13 PM, Klaus Jensen wrote: > Otherwise, it all looks fine. I'm still seeing the weird slowdown when > an iothread is enabled. I have yet to figure out why that is... But it > scales! :) How much slowdown do you observe on your machine?

Re: [PATCH] hw/nvme: reenable cqe batching

2022-10-20 Thread Jinhao Fan
at 1:37 PM, Klaus Jensen wrote: > On Okt 21 10:37, Jinhao Fan wrote: >> at 7:35 PM, Klaus Jensen wrote: >> >>> Commit 2e53b0b45024 ("hw/nvme: Use ioeventfd to handle doorbell >>> updates") had the unintended effect of disabling batching of CQEs. >

Re: [PATCH] hw/nvme: reenable cqe batching

2022-10-20 Thread Jinhao Fan
at 7:35 PM, Klaus Jensen wrote: > Commit 2e53b0b45024 ("hw/nvme: Use ioeventfd to handle doorbell > updates") had the unintended effect of disabling batching of CQEs. > > This patch changes the sq/cq timers to bottom halfs and instead of > calling nvme_post_cqes() immediately (causing an interru

Re: [PATCH 0/3] iothread and irqfd support

2022-10-13 Thread Jinhao Fan
I also came across the Interrupt Coalescing feature in NVMe (5.21.1.8 in Spec 1.4c). It seems to suggest an algorithm based on thresholds for the maximum interrupt delay time and minimum number of coalesced interrupts. Klaus, based on your experience, how does this algorithm work? Is is commonly u

Re: [PATCH 0/3] iothread and irqfd support

2022-10-12 Thread Jinhao Fan
On 10/12/2022 10:39 PM, Klaus Jensen wrote: I have been meaning to pick it up, but I got side-tracked. The polling performance drop needs to be address as we discussed offline. But the v4 looks pretty good and I can pick that up without the polling support for now. I've been using the v4 witho

[PATCH v3 4/4] hw/nvme: add polling support

2022-08-27 Thread Jinhao Fan
shadow doorbell buffer when we want to post a cqe. Updates on the SQ eventidx buffer is stopped during polling to avoid the host doing unnecessary doorbell buffer writes. Comparison (KIOPS): QD1 4 16 64 QEMU 53 155 245 309 polling 123 165 189 191 Signed-off-by: Jinhao Fan --- hw/nvme

[PATCH v3 0/4] irqfd, iothread and polling support

2022-08-27 Thread Jinhao Fan
duplicate initilization of cq timer Jinhao Fan (4): hw/nvme: support irq(de)assertion with eventfd hw/nvme: use KVM irqfd when available hw/nvme: add iothread support hw/nvme: add polling support hw/nvme/ctrl.c | 394 --- hw/nvme/ns.c

[PATCH v3 1/4] hw/nvme: support irq(de)assertion with eventfd

2022-08-27 Thread Jinhao Fan
Asserting and deasseting irq with eventfd has some performance implications. For small queue depth it increases request latency but for large queue depth it effectively coalesces irqs. Comparision (KIOPS): QD1 4 16 64 QEMU 38 123 210 329 irq-eventfd 32 106 240 364 Signed

[PATCH v3 2/4] hw/nvme: use KVM irqfd when available

2022-08-27 Thread Jinhao Fan
Use KVM's irqfd to send interrupts when possible. This approach is thread safe. Moreover, it does not have the inter-thread communication overhead of plain event notifiers since handler callback are called in the same system call as irqfd write. Signed-off-by: Jinhao Fan Signed-off-by:

[PATCH v3 3/4] hw/nvme: add iothread support

2022-08-27 Thread Jinhao Fan
oorbells, which I will bring up in a following patch. Iothread can be enabled by: -object iothread,id=nvme0 \ -device nvme,iothread=nvme0 \ Performance comparisons (KIOPS): QD 1 4 16 64 QEMU 41 136 242 338 iothread 53 155 245 309 Signed-off-by: Jinhao Fan --- hw/n

Re: [PATCH v2 2/3] hw/nvme: use KVM irqfd when available

2022-08-27 Thread Jinhao Fan
at 11:34 PM, Keith Busch wrote: > On Fri, Aug 26, 2022 at 11:12:04PM +0800, Jinhao Fan wrote: >> Use KVM's irqfd to send interrupts when possible. This approach is >> thread safe. Moreover, it does not have the inter-thread communication >> overhead of plain eve

Re: [PATCH v2 2/3] hw/nvme: use KVM irqfd when available

2022-08-26 Thread Jinhao Fan
at 11:34 PM, Keith Busch wrote: > On Fri, Aug 26, 2022 at 11:12:04PM +0800, Jinhao Fan wrote: >> Use KVM's irqfd to send interrupts when possible. This approach is >> thread safe. Moreover, it does not have the inter-thread communication >> overhead of plain eve

[PATCH v2 3/3] hw/nvme: add iothread support

2022-08-26 Thread Jinhao Fan
oorbells, which I will bring up in a following patch. Iothread can be enabled by: -object iothread,id=nvme0 \ -device nvme,iothread=nvme0 \ Performance comparisons (KIOPS): QD 1 4 16 64 QEMU 41 136 242 338 iothread 53 155 245 309 Signed-off-by: Jinhao Fan --- hw/n

[PATCH v2 0/3] iothread and irqfd support

2022-08-26 Thread Jinhao Fan
: - Avoid duplicate initilization of cq timer Jinhao Fan (3): hw/nvme: support irq(de)assertion with eventfd hw/nvme: use KVM irqfd when available hw/nvme: add iothread support hw/nvme/ctrl.c | 328 +++ hw/nvme/ns.c | 21 ++- hw/nvme/nvme.h

Re: [PATCH 3/3] hw/nvme: add iothread support

2022-08-26 Thread Jinhao Fan
at 7:18 PM, Jinhao Fan wrote: > @@ -4979,7 +5007,13 @@ static void nvme_init_cq(NvmeCQueue *cq, NvmeCtrl *n, > uint64_t dma_addr, > } > } > n->cq[cqid] = cq; > -cq->timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, nvme_post_cqes, cq); > + > +if (cq

[PATCH v2 1/3] hw/nvme: support irq(de)assertion with eventfd

2022-08-26 Thread Jinhao Fan
Asserting and deasseting irq with eventfd has some performance implications. For small queue depth it increases request latency but for large queue depth it effectively coalesces irqs. Comparision (KIOPS): QD1 4 16 64 QEMU 38 123 210 329 irq-eventfd 32 106 240 364 Signed

[PATCH v2 2/3] hw/nvme: use KVM irqfd when available

2022-08-26 Thread Jinhao Fan
Use KVM's irqfd to send interrupts when possible. This approach is thread safe. Moreover, it does not have the inter-thread communication overhead of plain event notifiers since handler callback are called in the same system call as irqfd write. Signed-off-by: Jinhao Fan Signed-off-by:

[PATCH 3/3] hw/nvme: add iothread support

2022-08-26 Thread Jinhao Fan
oorbells, which I will bring up in a following patch. Iothread can be enabled by: -object iothread,id=nvme0 \ -device nvme,iothread=nvme0 \ Performance comparisons (KIOPS): QD 1 4 16 64 QEMU 41 136 242 338 iothread 53 155 245 309 Signed-off-by: Jinhao Fan --- hw/n

[PATCH 2/3] hw/nvme: use KVM irqfd when available

2022-08-26 Thread Jinhao Fan
Use KVM's irqfd to send interrupts when possible. This approach is thread safe. Moreover, it does not have the inter-thread communication overhead of plain event notifiers since handler callback are called in the same system call as irqfd write. Signed-off-by: Jinhao Fan Signed-off-by:

[PATCH 1/3] hw/nvme: support irq(de)assertion with eventfd

2022-08-26 Thread Jinhao Fan
Asserting and deasseting irq with eventfd has some performance implications. For small queue depth it increases request latency but for large queue depth it effectively coalesces irqs. Comparision (KIOPS): QD1 4 16 64 QEMU 38 123 210 329 irq-eventfd 32 106 240 364 Signed

[PATCH 0/3] iothread and irqfd support

2022-08-26 Thread Jinhao Fan
This patch series adds support for using a seperate iothread for NVMe IO emulation, which brings the potential of applying polling. The first two patches implements support for irqfd, which solves thread safety problems for interrupt emulation outside the main loop thread. Jinhao Fan (3): hw

Re: [PATCH] hw/nvme: Add iothread support

2022-08-26 Thread Jinhao Fan
Sure. I’ve already reworked this iothread patch upon the new irqfd patch. I think I can post a v2 patch today. Do you mean I include irqfd v3 in the new iothread patch series? 发自我的iPhone > 在 2022年8月26日,15:12,Klaus Jensen 写道: > > On Jul 20 17:00, Jinhao Fan wrote: >> Add an o

Re: [PATCH v3 0/2] hw/nvme: add irqfd support

2022-08-25 Thread Jinhao Fan
s. Removing it from nvme_init_irq_notifier() makes > sure we do not try to double add. > > Now it does what it is supposed to; no hacks required :) > > Jinhao Fan (2): > hw/nvme: support irq(de)assertion with eventfd > hw/nvme: use KVM irqfd when available > > h

Re: [PATCH v2 1/3] hw/nvme: support irq(de)assertion with eventfd

2022-08-25 Thread Jinhao Fan
On 8/25/2022 9:59 PM, Klaus Jensen wrote: On Aug 25 21:09, Jinhao Fan wrote: 在 2022年8月25日,20:39,Klaus Jensen 写道: On Aug 25 13:56, Klaus Jensen wrote: On Aug 25 19:16, Jinhao Fan wrote: On 8/25/2022 5:33 PM, Klaus Jensen wrote: I'm still a bit perplexed by this issue, so I just

Re: [PATCH v2 1/3] hw/nvme: support irq(de)assertion with eventfd

2022-08-25 Thread Jinhao Fan
On 8/25/2022 7:56 PM, Klaus Jensen wrote: On Aug 25 19:16, Jinhao Fan wrote: On 8/25/2022 5:33 PM, Klaus Jensen wrote: I'm still a bit perplexed by this issue, so I just tried moving nvme_init_irq_notifier() to the end of nvme_init_cq() and removing this first_io_cqe thing. I did not ob

Re: [PATCH v2 1/3] hw/nvme: support irq(de)assertion with eventfd

2022-08-25 Thread Jinhao Fan
> 在 2022年8月25日,20:39,Klaus Jensen 写道: > > On Aug 25 13:56, Klaus Jensen wrote: >>> On Aug 25 19:16, Jinhao Fan wrote: >>> On 8/25/2022 5:33 PM, Klaus Jensen wrote: >>>> I'm still a bit perplexed by this issue, so I just tried moving >>>

Re: [PATCH v2 1/3] hw/nvme: support irq(de)assertion with eventfd

2022-08-25 Thread Jinhao Fan
On 8/25/2022 5:33 PM, Klaus Jensen wrote: I'm still a bit perplexed by this issue, so I just tried moving nvme_init_irq_notifier() to the end of nvme_init_cq() and removing this first_io_cqe thing. I did not observe any particular issues? What bad behavior did you encounter, it seems to work fin

[PATCH v2 3/3] hw/nvme: add MSI-x mask handlers for irqfd

2022-08-25 Thread Jinhao Fan
When irqfd is enabled, we bypass QEMU's irq emulation and let KVM to directly assert the irq. However, KVM is not aware of the device's MSI-x masking status. Add MSI-x mask bookkeeping in NVMe emulation and detach the corresponding irqfd when the certain vector is masked. Signed-off-

[PATCH v2 1/3] hw/nvme: support irq(de)assertion with eventfd

2022-08-25 Thread Jinhao Fan
Asserting and deasseting irq with eventfd has some performance implications. For small queue depth it increases request latency but for large queue depth it effectively coalesces irqs. Comparision (KIOPS): QD1 4 16 64 QEMU 38 123 210 329 irq-eventfd 32 106 240 364 Signed

[PATCH v2 2/3] hw/nvme: use KVM irqfd when available

2022-08-25 Thread Jinhao Fan
Use KVM's irqfd to send interrupts when possible. This approach is thread safe. Moreover, it does not have the inter-thread communication overhead of plain event notifiers since handler callback are called in the same system call as irqfd write. Signed-off-by: Jinhao Fan --- hw/nvme/ctrl.c

[PATCH v2 0/3] hw/nvme: add irqfd support

2022-08-25 Thread Jinhao Fan
mulation is bypassed. Changes since v1: - Made nvme_irq_(de)assert wrappers around eventfd call and actual irq assertion - Dropped the previous first patch to avoid duplicate checks for irq_enabled and msix_enabled Jinhao Fan (3): hw/nvme: support irq(de)assertion with eventfd hw/nvme

Re: [PATCH 4/4] hw/nvme: add MSI-x mask handlers for irqfd

2022-08-24 Thread Jinhao Fan
On 8/24/2022 7:22 PM, Klaus Jensen wrote: What are the implications if we drop it? That is, if we go back to your version that did not include this? If it doesnt impact the kvm irqchip logic, then I'd rather that we rip it out and leave the device without masking/unmasking support, keeping irqfd

Re: [PATCH 4/4] hw/nvme: add MSI-x mask handlers for irqfd

2022-08-23 Thread Jinhao Fan
On 8/16/2022 6:46 PM, Klaus Jensen wrote: Did qtest work out for you for testing? If so, it would be nice to add a simple test case as well. Since MSI-x masking handlers are only implemented for IO queues, if we want to use qtest we need to implement utilities for controller initialization an

Re: [PATCH 1/4] hw/nvme: avoid unnecessary call to irq (de)assertion functions

2022-08-16 Thread Jinhao Fan
at 11:24 PM, Stefan Hajnoczi wrote: > Can the logic be moved into assert()/deassert() so callers don't need > to duplicate the checks? > > (I assume the optimization is that eventfd syscalls are avoided, not > that the function call is avoided.) I guess I can move the eventfd syscall into asser

Re: [PATCH 2/4] hw/nvme: add option to (de)assert irq with eventfd

2022-08-16 Thread Jinhao Fan
at 7:20 PM, Klaus Jensen wrote: > This option does not seem to change anything - the value is never used > ;) What a stupid mistake. I’ll fix this in the next version.

Re: [PATCH 4/4] hw/nvme: add MSI-x mask handlers for irqfd

2022-08-16 Thread Jinhao Fan
at 6:46 PM, Klaus Jensen wrote: > > Did qtest work out for you for testing? If so, it would be nice to add a > simple test case as well. I found MSI-x masking harder to test than we imagined. My plan is to only emulate IO queues in the IOthread and leave admin queue emulation in the main loop s

Re: [PATCH 0/4] hw/nvme: add irqfd support

2022-08-15 Thread Jinhao Fan
at 11:37 PM, Jinhao Fan wrote: > This patch series changes qemu-nvme's interrupt emulation to use event > notifiers, which can ensure thread-safe interrupt delivery when iothread > is used. In the first two patches, I convert qemu-nvme's IO emulation > logic to send irq vi

[PATCH 4/4] hw/nvme: add MSI-x mask handlers for irqfd

2022-08-11 Thread Jinhao Fan
When irqfd is enabled, we bypass QEMU's irq emulation and let KVM to directly assert the irq. However, KVM is not aware of the device's MSI-x masking status. Add MSI-x mask bookkeeping in NVMe emulation and detach the corresponding irqfd when the certain vector is masked. Signed-off-

[PATCH 3/4] hw/nvme: use irqfd to send interrupts

2022-08-11 Thread Jinhao Fan
Use KVM's irqfd to send interrupts when possible. This approach is thread safe. Moreover, it does not have the inter-thread communication overhead of plain event notifiers since handler callback are called in the same system call as irqfd write. Signed-off-by: Jinhao Fan --- hw/nvme/ctrl.c

[PATCH 2/4] hw/nvme: add option to (de)assert irq with eventfd

2022-08-11 Thread Jinhao Fan
Asserting and deasseting irq with eventfd has some performance implications. For small queue depth it increases request latency but for large queue depth it effectively coalesces irqs. Comparision (KIOPS): QD1 4 16 64 QEMU 38 123 210 329 irq-eventfd 32 106 240 364 Signed

[PATCH 1/4] hw/nvme: avoid unnecessary call to irq (de)assertion functions

2022-08-11 Thread Jinhao Fan
the unnecessary overhead of signalling eventfd. Signed-off-by: Jinhao Fan --- hw/nvme/ctrl.c | 40 ++-- 1 file changed, 22 insertions(+), 18 deletions(-) diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c index 87aeba0564..bd3350d7e0 100644 --- a/hw/nvme/ctrl.c +++ b/hw/n

[PATCH 0/4] hw/nvme: add irqfd support

2022-08-11 Thread Jinhao Fan
s MSI-x emulation is bypassed. Jinhao Fan (4): hw/nvme: avoid unnecessary call to irq (de)assertion functions hw/nvme: add option to (de)assert irq with eventfd hw/nvme: use irqfd to send interrupts hw/nvme: add MSI-x mask handlers for irqfd hw/nvme/ctrl.c | 251 +++

Re: [RFC] hw/nvme: Use irqfd to send interrupts

2022-08-07 Thread Jinhao Fan
at 12:35 PM, Jinhao Fan wrote: > static void nvme_irq_assert(NvmeCtrl *n, NvmeCQueue *cq) > { > if (cq->irq_enabled) { > if (msix_enabled(&(n->parent_obj))) { > +/* Initialize CQ irqfd */ > +if (!cq->irqfd_enabled &&am

[PATCH v2] hw/nvme: Add helper functions for qid-db conversion

2022-08-02 Thread Jinhao Fan
With the introduction of shadow doorbell and ioeventfd, we need to do frequent conversion between qid and its doorbell offset. The original hard-coded calculation is confusing and error-prone. Add several helper functions to do this task. Signed-off-by: Jinhao Fan --- Changes since v1: - Use

Re: [PATCH] hw/nvme: Add helper functions for qid-db conversion

2022-08-02 Thread Jinhao Fan
at 4:54 PM, Klaus Jensen wrote: > I am unsure if the compiler will transform that division into the shift > if it can infer that the divisor is a power of two (it most likely > will be able to). > > But I see no reason to have a potential division here when we can do > without and to me it is ju

Re: [PATCH] hw/nvme: Add helper functions for qid-db conversion

2022-08-02 Thread Jinhao Fan
at 2:02 PM, Klaus Jensen wrote: > On Jul 28 16:07, Jinhao Fan wrote: >> With the introduction of shadow doorbell and ioeventfd, we need to do >> frequent conversion between qid and its doorbell offset. The original >> hard-coded calculation is confusing and error-pron

Re: [RFC] hw/nvme: Use irqfd to send interrupts

2022-08-01 Thread Jinhao Fan
asked. Hi Stefan, While implementing interrupt masking support, I found it hard to test this feature on the host. Keith told me that no NVMe drivers are currently using this feature. Do you remember how you tested interrupt masking? Regards, Jinhao Fan

Re: [PATCH] hw/nvme: Add helper functions for qid-db conversion

2022-08-01 Thread Jinhao Fan
at 4:07 PM, Jinhao Fan wrote: > With the introduction of shadow doorbell and ioeventfd, we need to do > frequent conversion between qid and its doorbell offset. The original > hard-coded calculation is confusing and error-prone. Add several helper > functions to do this task. >

Re: [RFC] hw/nvme: Use irqfd to send interrupts

2022-07-28 Thread Jinhao Fan
at 11:18 PM, Stefan Hajnoczi wrote: > I think that is incorrect. QEMU has guest notifier emulation for the > non-KVM (and non-MSI-X PCI) cases. When there is no irqfd support > available, QEMU sets up a regular eventfd and calls > virtio_queue_guest_notifier_read() when it becomes readable. Than

Re: [PATCH for-7.1 0/3] hw/nvme: misc ioeventfd fixes

2022-07-28 Thread Jinhao Fan
not enable ioeventfd by default > > hw/nvme/ctrl.c | 12 +--- > 1 file changed, 9 insertions(+), 3 deletions(-) > > -- > 2.36.1 Looks good to me as well. Reviewed-by: Jinhao Fan

[PATCH] hw/nvme: Add helper functions for qid-db conversion

2022-07-28 Thread Jinhao Fan
With the introduction of shadow doorbell and ioeventfd, we need to do frequent conversion between qid and its doorbell offset. The original hard-coded calculation is confusing and error-prone. Add several helper functions to do this task. Signed-off-by: Jinhao Fan --- hw/nvme/ctrl.c | 61

Re: [PATCH v4] hw/nvme: Use ioeventfd to handle doorbell updates

2022-07-27 Thread Jinhao Fan
After digging through the source code, I found event_notifier_cleanup() only closes the eventfd, but does not de-register the event from QEMU’s event loop. event_notifier_set_handler() actually interacts with the event loop. It is a wrapper around aio_set_event_notifier(), which is again a wrapper

Re: [PATCH v4] hw/nvme: Use ioeventfd to handle doorbell updates

2022-07-27 Thread Jinhao Fan
at 3:06 PM, Klaus Jensen wrote: > On Jul 26 14:08, Klaus Jensen wrote: >> Alright. Forget about the iommu, that was just a coincidence. >> >> This patch seems to fix it. I guess it is the >> event_notifier_set_handler(..., NULL) that does the trick, but I'd like >> to understand why ;) >> >> >

Re: [PATCH] hw/nvme: add trace events for ioeventfd

2022-07-27 Thread Jinhao Fan
at 3:10 PM, Klaus Jensen wrote: > On Jul 20 10:47, Jinhao Fan wrote: >> at 10:41 PM, Jinhao Fan wrote: >> >>> at 1:34 PM, Klaus Jensen wrote: >>> >>>> From: Klaus Jensen >>>> >>>> While testing Jinhaos ioeventfd patch I f

Re: [PATCH] hw/nvme: Add iothread support

2022-07-26 Thread Jinhao Fan
at 10:56 AM, Jinhao Fan wrote: > at 2:07 AM, Keith Busch wrote: > >> MSI-x uses MMIO for masking, so there's no need for an NVMe specific way to >> mask these interrupts. You can just use the generic PCIe methods to clear the >> vector's enable bit. But no NV

Re: [PATCH] hw/nvme: Add iothread support

2022-07-26 Thread Jinhao Fan
at 2:07 AM, Keith Busch wrote: > MSI-x uses MMIO for masking, so there's no need for an NVMe specific way to > mask these interrupts. You can just use the generic PCIe methods to clear the > vector's enable bit. But no NVMe driver that I know of is making use of these > either, though it should b

Re: [PATCH] hw/nvme: Add iothread support

2022-07-26 Thread Jinhao Fan
at 10:45 PM, Keith Busch wrote: > On Tue, Jul 26, 2022 at 04:55:54PM +0800, Jinhao Fan wrote: >> Hi Klaus and Keith, >> >> I just added support for interrupt masking. How can I test interrupt >> masking? > > Are you asking about MSI masking? I don't thi

Re: [PATCH] hw/nvme: Add iothread support

2022-07-26 Thread Jinhao Fan
at 5:00 PM, Jinhao Fan wrote: > Add an option "iothread=x" to do emulation in a seperate iothread. > This improves the performance because QEMU's main loop is responsible > for a lot of other work while iothread is dedicated to NVMe emulation. > Moreover, emula

Re: [PATCH v4] hw/nvme: Use ioeventfd to handle doorbell updates

2022-07-26 Thread Jinhao Fan
at 3:41 PM, Klaus Jensen wrote: > On Jul 26 15:35, Jinhao Fan wrote: >> at 4:55 AM, Klaus Jensen wrote: >> >>> We have a regression following this patch that we need to address. >>> >>> With this patch, issuing a reset on the device (`nvme reset /dev

Re: [PATCH v4] hw/nvme: Use ioeventfd to handle doorbell updates

2022-07-26 Thread Jinhao Fan
at 4:55 AM, Klaus Jensen wrote: > > We have a regression following this patch that we need to address. > > With this patch, issuing a reset on the device (`nvme reset /dev/nvme0` > will do the trick) causes QEMU to hog my host cpu at 100%. > > I'm still not sure what causes this. The trace out

Re: [RFC] hw/nvme: Use irqfd to send interrupts

2022-07-24 Thread Jinhao Fan
at 3:36 AM, Stefan Hajnoczi wrote: > > > On Sun, Jul 24, 2022, 11:21 Jinhao Fan wrote: > at 9:29 PM, Stefan Hajnoczi wrote: > > > > > Nice, perf(1) is good for that. You can enable trace events and add > > kprobes/uprobes to record timestamps wh

Re: [RFC] hw/nvme: Use irqfd to send interrupts

2022-07-24 Thread Jinhao Fan
at 9:29 PM, Stefan Hajnoczi wrote: > > Nice, perf(1) is good for that. You can enable trace events and add > kprobes/uprobes to record timestamps when specific functions are entered. > Thanks Stefan, One last question: Currently we can achieve hundreds of KIOPS. That means perf can easily cap

[PATCH] block/io_uring: add missing include file

2022-07-21 Thread Jinhao Fan
The commit "Use io_uring_register_ring_fd() to skip fd operations" uses warn_report but did not include the header file "qemu/error-report.h". This causes "error: implicit declaration of function ‘warn_report’". Include this header file. Signed-off-by: Jinhao Fan

Re: [RFC] hw/nvme: Use irqfd to send interrupts

2022-07-20 Thread Jinhao Fan
Hi Stefan, Thanks for the detailed explanation! at 6:21 PM, Stefan Hajnoczi wrote: > Hi Jinhao, > Thanks for working on this! > > irqfd is not necessarily faster than KVM ioctl interrupt injection. > > There are at least two non performance reasons for irqfd: > 1. It avoids QEMU emulation co

Re: [RFC] hw/nvme: Use irqfd to send interrupts

2022-07-20 Thread Jinhao Fan
at 12:35 PM, Jinhao Fan wrote: > Use irqfd to directly notify KVM to inject interrupts. This is done by > registering a virtual IRQ(virq) in KVM and associate the virq with an > irqfd, so that KVM can directly inject the interrupt when it receives > notification from the irqfd. This

[PATCH] hw/nvme: Add iothread support

2022-07-20 Thread Jinhao Fan
oorbells, which I will bring up in a following patch. Iothread can be enabled by: -object iothread,id=nvme0 \ -device nvme,iothread=nvme0 \ Performance comparisons (KIOPS): QD 1 4 16 64 QEMU 41 136 242 338 iothread 53 155 245 309 Signed-off-by: Jinhao Fan --- hw/n

Re: [PATCH] hw/nvme: add trace events for ioeventfd

2022-07-19 Thread Jinhao Fan
at 10:41 PM, Jinhao Fan wrote: > at 1:34 PM, Klaus Jensen wrote: > >> From: Klaus Jensen >> >> While testing Jinhaos ioeventfd patch I found it useful with a couple of >> additional trace events since we no longer see the mmio events. >> >> Sign

Re: [PATCH] hw/nvme: add trace events for ioeventfd

2022-07-14 Thread Jinhao Fan
at 1:34 PM, Klaus Jensen wrote: > From: Klaus Jensen > > While testing Jinhaos ioeventfd patch I found it useful with a couple of > additional trace events since we no longer see the mmio events. > > Signed-off-by: Klaus Jensen > --- > hw/nvme/ctrl.c | 8 > hw/nvme/trace-events

Re: [RFC] hw/nvme: Use irqfd to send interrupts

2022-07-14 Thread Jinhao Fan
at 12:18 PM, Klaus Jensen wrote: > On Jul 12 14:26, Klaus Jensen wrote: >> On Jul 9 12:35, Jinhao Fan wrote: >>> Use irqfd to directly notify KVM to inject interrupts. This is done by >>> registering a virtual IRQ(virq) in KVM and associate the virq with an >>

[RFC] hw/nvme: Use irqfd to send interrupts

2022-07-08 Thread Jinhao Fan
kernel KVM. Could you help me figure out the following questions: 1. How much performance improvement can I expect from using irqfd? 2. How can I debug this kind of cross QEMU-KVM problems? Signed-off-by: Jinhao Fan --- hw/nvme/ctrl.c | 67 - hw/n

Re: [PATCH v4] hw/nvme: Use ioeventfd to handle doorbell updates

2022-07-08 Thread Jinhao Fan
at 10:24 PM, Jinhao Fan wrote: > @@ -5793,6 +5891,7 @@ static uint16_t nvme_dbbuf_config(NvmeCtrl *n, const > NvmeRequest *req) > uint64_t dbs_addr = le64_to_cpu(req->cmd.dptr.prp1); > uint64_t eis_addr = le64_to_cpu(req->cmd.dptr.prp2); > int i; > +int

Re: [PATCH v4] hw/nvme: Use ioeventfd to handle doorbell updates

2022-07-07 Thread Jinhao Fan
at 1:51 PM, Klaus Jensen wrote: > On Jul 6 19:34, Jinhao Fan wrote: >> at 2:43 AM, Keith Busch wrote: >> >>> On Tue, Jul 05, 2022 at 07:11:36PM +0200, Klaus Jensen wrote: >>>> On Jul 5 22:24, Jinhao Fan wrote: >>>>> @@ -1374,7 +1374,14 @@

Re: [PATCH v4] hw/nvme: Use ioeventfd to handle doorbell updates

2022-07-06 Thread Jinhao Fan
at 2:43 AM, Keith Busch wrote: > On Tue, Jul 05, 2022 at 07:11:36PM +0200, Klaus Jensen wrote: >> On Jul 5 22:24, Jinhao Fan wrote: >>> @@ -1374,7 +1374,14 @@ static void nvme_enqueue_req_completion(NvmeCQueue >>> *cq, NvmeRequest *req) >>> >>>

Re: [PATCH v4] hw/nvme: Use ioeventfd to handle doorbell updates

2022-07-06 Thread Jinhao Fan
at 1:11 AM, Klaus Jensen wrote: > On Jul 5 22:24, Jinhao Fan wrote: >> Add property "ioeventfd" which is enabled by default. When this is >> enabled, updates on the doorbell registers will cause KVM to signal >> an event to the QEMU main loop to handle the

[PATCH v4] hw/nvme: Use ioeventfd to handle doorbell updates

2022-07-05 Thread Jinhao Fan
e doorbell update event. IOPS comparison on Linux 5.19-rc2: (Unit: KIOPS) qd 1 4 16 64 qemu35 121 176 153 ioeventfd 41 133 258 313 Changes since v3: - Do not deregister ioeventfd when it was not enabled on a SQ/CQ Signed-off-by: Jinhao Fan --- hw/nvme/ct

Re: [PATCH v2] hw/nvme: Use ioeventfd to handle doorbell updates

2022-07-03 Thread Jinhao Fan
at 12:04 AM, Keith Busch wrote: > On Thu, Jun 30, 2022 at 11:22:31AM +0800, Jinhao Fan wrote: >> +static int nvme_init_sq_ioeventfd(NvmeSQueue *sq) >> +{ >> +NvmeCtrl *n = sq->ctrl; >> +uint16_t offset = sq->sqid << 3; >> +int ret; >>

[PATCH v3] hw/nvme: Use ioeventfd to handle doorbell updates

2022-07-03 Thread Jinhao Fan
e doorbell update event. IOPS comparison on Linux 5.19-rc2: (Unit: KIOPS) qd 1 4 16 64 qemu35 121 176 153 ioeventfd 41 133 258 313 Changes since v2: - Add memory_region_del_eventfd() when freeing queues to avoid leak Signed-off-by: Jinhao Fan --- hw/nvme/ct

[PATCH v2] hw/nvme: Use ioeventfd to handle doorbell updates

2022-06-29 Thread Jinhao Fan
e doorbell update event. IOPS comparison on Linux 5.19-rc2: (Unit: KIOPS) qd 1 4 16 64 qemu35 121 176 153 ioeventfd 41 133 258 313 Changes since v1: - Return value handling code cleanup Signed-off-by: Jinhao Fan --- hw/nvme/ct

Re: [PATCH] hw/nvme: Use ioeventfd to handle doorbell updates

2022-06-29 Thread Jinhao Fan
at 4:13 AM, Klaus Jensen wrote: > On Jun 27 18:48, Jinhao Fan wrote: >> Add property "ioeventfd" which is enabled by default. When this is >> enabled, updates on the doorbell registers will cause KVM to signal >> an event to the QEMU main loop to handle the

Re: [PATCH] hw/nvme: Use ioeventfd to handle doorbell updates

2022-06-29 Thread Jinhao Fan
> That looks correct since we don't need the ioevent is an optional > optimization. > > I would just suggest making this easier to read. For example, in > nvme_init_sq_ioeventfd(), instead of assigning within a conditional: > >if ((ret = event_notifier_init(&cq->notifier, 0))) { > > Do ea

Re: [PATCH] hw/nvme: Use ioeventfd to handle doorbell updates

2022-06-29 Thread Jinhao Fan
Ping~ > @@ -4271,6 +4343,11 @@ static void nvme_init_sq(NvmeSQueue *sq, NvmeCtrl *n, > uint64_t dma_addr, > if (n->dbbuf_enabled) { > sq->db_addr = n->dbbuf_dbs + (sqid << 3); > sq->ei_addr = n->dbbuf_eis + (sqid << 3); > + > +if (n->params.ioeventfd && sq-

Re: [PATCH v3 1/2] hw/nvme: Implement shadow doorbell buffer support

2022-06-27 Thread Jinhao Fan
> On Jun 28, 2022, at 3:33 AM, Klaus Jensen wrote: > > On Jun 27 13:17, Keith Busch wrote: >> On Thu, Jun 16, 2022 at 08:34:07PM +0800, Jinhao Fan wrote: >>> } >>> sq->timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, nvme_process_sq, sq); >>> >

[PATCH] hw/nvme: Use ioeventfd to handle doorbell updates

2022-06-27 Thread Jinhao Fan
e doorbell update event. IOPS comparison on Linux 5.19-rc2: (Unit: KIOPS) qd 1 4 16 64 qemu35 121 176 153 ioeventfd 41 133 258 313 Signed-off-by: Jinhao Fan --- hw/nvme/ctrl.c | 97 +- hw/nvme/nvme.h | 5 +++ 2 files ch

hw/nvme: why schedule sq timer when cq is full?

2022-06-24 Thread Jinhao Fan
etions”.Am I missing something? Thanks, Jinhao Fan

Re: [PATCH v3 0/2] hw/nvme: Add shadow doorbell buffer support

2022-06-17 Thread Jinhao Fan
> On Jun 17, 2022, at 8:56 PM, Klaus Jensen wrote: > > On Jun 17 20:47, Jinhao Fan wrote: >> >> >>> On Jun 17, 2022, at 7:54 PM, Klaus Jensen wrote: >>> >>> LGTM, >>> >>> Reviewed-by: Klaus Jensen >>> >> &g

Re: [PATCH v3 0/2] hw/nvme: Add shadow doorbell buffer support

2022-06-17 Thread Jinhao Fan
> On Jun 17, 2022, at 7:54 PM, Klaus Jensen wrote: > > On Jun 16 20:34, Jinhao Fan wrote: >> This patch adds shadow doorbell buffer support in NVMe 1.3 to QEMU >> NVMe. The Doorbell Buffer Config admin command is implemented for the >> guest to enable shadow doobel

[PATCH v3 0/2] hw/nvme: Add shadow doorbell buffer support

2022-06-16 Thread Jinhao Fan
hosts that do not use admin queue shadow doorbell Jinhao Fan (2): hw/nvme: Implement shadow doorbell buffer support hw/nvme: Add trace events for shadow doorbell buffer hw/nvme/ctrl.c | 118 ++- hw/nvme/nvme.h | 8 +++ hw/nvme/trace-events |

[PATCH v3 2/2] hw/nvme: Add trace events for shadow doorbell buffer

2022-06-16 Thread Jinhao Fan
command. Signed-off-by: Jinhao Fan --- hw/nvme/ctrl.c | 5 + hw/nvme/trace-events | 5 + 2 files changed, 10 insertions(+) diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c index f3aaff3e8d..c952c34f94 100644 --- a/hw/nvme/ctrl.c +++ b/hw/nvme/ctrl.c @@ -1309,6 +1309,7 @@ static void

[PATCH v3 1/2] hw/nvme: Implement shadow doorbell buffer support

2022-06-16 Thread Jinhao Fan
dmin queue. Copying the doorbell register value to the shadow doorbell buffer allows us to support these hosts as well as spec-compliant hosts that use shadow doorbell buffer for the admin queue. Signed-off-by: Jinhao Fan --- hw/nvme/ctrl.c | 113 ++-

Re: [PATCH v2 1/2] hw/nvme: Implement shadow doorbell buffer support

2022-06-16 Thread Jinhao Fan
> On Jun 16, 2022, at 6:40 PM, Klaus Jensen wrote: > > This wont work for drivers that *do* rely on updating the buffer for > admin queues, so we should read it regardless of the value of the queue > id (since we are now updating it through the "Keith Hack^TM"). Sure. Sorry I forgot to update th

Re: [PATCH 1/2] hw/nvme: Implement shadow doorbell buffer support

2022-06-15 Thread Jinhao Fan
> On Jun 15, 2022, at 5:38 PM, Klaus Jensen wrote: > > I prefer we use the NVMe terminology to minimize misunderstandings, so > "host" means the driver and "device" means the qemu side of things > Thanks for helping me disambiguate this! Now that we have resolved all issues in v1, I’ve subm

[PATCH v2 2/2] hw/nvme: Add trace events for shadow doorbell buffer

2022-06-15 Thread Jinhao Fan
command. Signed-off-by: Jinhao Fan --- hw/nvme/ctrl.c | 5 + hw/nvme/trace-events | 5 + 2 files changed, 10 insertions(+) diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c index 7be2e43f52..77fa79143d 100644 --- a/hw/nvme/ctrl.c +++ b/hw/nvme/ctrl.c @@ -1309,6 +1309,7 @@ static void

[PATCH v2 1/2] hw/nvme: Implement shadow doorbell buffer support

2022-06-15 Thread Jinhao Fan
dmin queue. Copying the doorbell register value to the shadow doorbell buffer allows us to support these hosts as well as spec-compliant hosts that use shadow doorbell buffer for the admin queue. Signed-off-by: Jinhao Fan --- hw/nvme/ctrl.c | 112 ++-

[PATCH v2 0/2] hw/nvme: Add shadow doorbell buffer support

2022-06-15 Thread Jinhao Fan
rect=1 verify=0 time_based=1 ramp_time=0 runtime=30 ;size=1G ;iodepth=1 rw=randread bs=4k [test] numjobs=1 Changes since v1: - Add compatibility with hosts that do not use admin queue shadow doorbell Jinhao Fan (2): hw/nvme: Implement shadow doorbell buffer support hw/nvme: Add trace eve

Re: [PATCH 1/2] hw/nvme: Implement shadow doorbell buffer support

2022-06-15 Thread Jinhao Fan
eventidx of 3 > nvme_dbbuf_need_event (1) > > set eventidx to 4 Therefore, at this point, we read the tail of 5. > go to sleep > > at (1), our tail update of 4->5 doesn't straddle the eventidx, so we don't > send > any MMIO, and the device won't wake up. This is why the above code checks the > tail twice for any concurrent update. Thanks, Jinhao Fan > > regards > john

Re: [PATCH 1/2] hw/nvme: Implement shadow doorbell buffer support

2022-06-14 Thread Jinhao Fan
> On Jun 14, 2022, at 11:41 PM, Keith Busch wrote: > > It's a pretty nasty hack, and definitely not in compliance with the spec: the > db_addr is supposed to be read-only from the device side, though I do think > it's safe for this environment. Unless Klaus or anyone finds something I'm > missi

  1   2   >