at 10:11 PM, Klaus Jensen wrote:
> On Nov 8 12:39, John Levon wrote:
>> On Fri, Nov 04, 2022 at 07:32:12AM +0100, Klaus Jensen wrote:
>>
>>> On Nov 3 21:19, Jinhao Fan wrote:
>>>> On 11/3/2022 8:10 PM, Klaus Jensen wrote:
>>>>> I agree that
On 11/3/2022 8:10 PM, Klaus Jensen wrote:
I agree that the spec is a little unclear on this point. In any case, in
Linux, when the driver has decided that the sq tail must be updated,
it will use this check:
(new_idx - event_idx - 1) < (new_idx - old)
When eventidx is already behind, it's l
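That check can be sketched as a runnable function (modelled on nvme_dbbuf_need_event() in the Linux driver; the unsigned 16-bit arithmetic is what makes it safe across queue-index wraparound):

```c
#include <stdbool.h>
#include <stdint.h>

/* Sketch of the check quoted above, as the Linux driver implements it in
 * nvme_dbbuf_need_event(): a doorbell write is needed only when the move
 * from `old` to `new_idx` stepped over the device-published eventidx.
 * The casts force modulo-2^16 arithmetic so the comparison survives
 * wraparound of the queue indices. */
static bool nvme_dbbuf_need_event(uint16_t event_idx, uint16_t new_idx,
                                  uint16_t old)
{
    return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old);
}
```

For example, moving the tail from 4 to 5 with eventidx at 3 does not straddle it (no doorbell write), while the same move with eventidx at 4 does.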
On 11/3/2022 8:11 PM, Klaus Jensen wrote:
I'll rerun some experiments and get back to you 😄
Thanks Klaus. I'll also run some experiments on my machine, to see if the
reenabling cqe batching patch solves this problem.
at 7:10 PM, Klaus Jensen wrote:
> This doesn't do what you expect it to. By not updating the eventidx it
> will fall behind the actual head, causing the host to think that the
> device is not processing events (but it is!), resulting in doorbell
> ringing.
I’m not sure I understand this correctl
at 7:13 PM, Klaus Jensen wrote:
> Otherwise, it all looks fine. I'm still seeing the weird slowdown when
> an iothread is enabled. I have yet to figure out why that is... But it
> scales! :)
How much slowdown do you observe on your machine?
at 1:37 PM, Klaus Jensen wrote:
> On Oct 21 10:37, Jinhao Fan wrote:
>> at 7:35 PM, Klaus Jensen wrote:
>>
>>> Commit 2e53b0b45024 ("hw/nvme: Use ioeventfd to handle doorbell
>>> updates") had the unintended effect of disabling batching of CQEs.
>
at 7:35 PM, Klaus Jensen wrote:
> Commit 2e53b0b45024 ("hw/nvme: Use ioeventfd to handle doorbell
> updates") had the unintended effect of disabling batching of CQEs.
>
> This patch changes the sq/cq timers to bottom halves and instead of
> calling nvme_post_cqes() immediately (causing an interru
I also came across the Interrupt Coalescing feature in NVMe (5.21.1.8 in
Spec 1.4c). It seems to suggest an algorithm based on thresholds for the
maximum interrupt delay time and minimum number of coalesced interrupts.
Klaus, based on your experience, how does this algorithm work? Is it
commonly u
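For what it's worth, the rule in 5.21.1.8 boils down to two per-vector thresholds: an aggregation threshold (minimum number of completions to coalesce) and an aggregation time (maximum delay). A hypothetical sketch of that decision, with names of my own invention rather than anything from the spec or QEMU:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sketch of the NVMe Interrupt Coalescing rule (spec 1.4c,
 * section 5.21.1.8): hold back the interrupt until either the aggregation
 * threshold (pending completions) or the aggregation time (delay since the
 * first unacknowledged completion) is reached. */
static bool coalesce_should_fire(uint32_t pending_cqes, uint32_t threshold,
                                 uint64_t now_us, uint64_t first_cqe_us,
                                 uint64_t max_delay_us)
{
    if (pending_cqes == 0) {
        return false;               /* nothing to signal yet */
    }
    return pending_cqes >= threshold ||
           (now_us - first_cqe_us) >= max_delay_us;
}
```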
On 10/12/2022 10:39 PM, Klaus Jensen wrote:
I have been meaning to pick it up, but I got side-tracked. The polling
performance drop needs to be addressed as we discussed offline.
But the v4 looks pretty good and I can pick that up without the polling
support for now.
I've been using the v4 witho
shadow doorbell buffer when we want to post a cqe. Updates on the
SQ eventidx buffer is stopped during polling to avoid the host doing
unnecessary doorbell buffer writes.
Comparison (KIOPS):
QD           1    4   16   64
QEMU        53  155  245  309
polling    123  165  189  191
Signed-off-by: Jinhao Fan
---
hw/nvme
duplicate initialization of cq timer
Jinhao Fan (4):
hw/nvme: support irq(de)assertion with eventfd
hw/nvme: use KVM irqfd when available
hw/nvme: add iothread support
hw/nvme: add polling support
hw/nvme/ctrl.c | 394 ---
hw/nvme/ns.c
Asserting and deasserting irq with eventfd has some performance
implications. For small queue depth it increases request latency but
for large queue depth it effectively coalesces irqs.
Comparison (KIOPS):
QD            1    4   16   64
QEMU         38  123  210  329
irq-eventfd  32  106  240  364
Signed
Use KVM's irqfd to send interrupts when possible. This approach is
thread safe. Moreover, it does not have the inter-thread communication
overhead of plain event notifiers since handler callbacks are called
in the same system call as irqfd write.
Signed-off-by: Jinhao Fan
Signed-off-by:
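The primitive underneath can be shown in isolation. A minimal Linux-only sketch using eventfd(2), which both QEMU's event notifiers and KVM irqfd build on (illustrative, not QEMU code):

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Minimal sketch of the eventfd primitive: one side write(2)s a count,
 * the other read(2)s it back (which also resets the counter).  With
 * irqfd, KVM consumes the write inside the same syscall and injects the
 * interrupt directly, so no userspace handler callback runs at all. */
static uint64_t signal_and_consume(void)
{
    int fd = eventfd(0, 0);          /* counter starts at zero */
    uint64_t v = 1, got = 0;

    if (fd < 0) {
        return 0;
    }
    if (write(fd, &v, sizeof(v)) != (ssize_t)sizeof(v)) {   /* signal */
        close(fd);
        return 0;
    }
    if (read(fd, &got, sizeof(got)) != (ssize_t)sizeof(got)) { /* consume */
        got = 0;
    }
    close(fd);
    return got;
}
```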
oorbells, which I will bring up in a following patch.
Iothread can be enabled by:
-object iothread,id=nvme0 \
-device nvme,iothread=nvme0 \
Performance comparisons (KIOPS):
QD          1    4   16   64
QEMU       41  136  242  338
iothread   53  155  245  309
Signed-off-by: Jinhao Fan
---
hw/n
at 11:34 PM, Keith Busch wrote:
> On Fri, Aug 26, 2022 at 11:12:04PM +0800, Jinhao Fan wrote:
>> Use KVM's irqfd to send interrupts when possible. This approach is
>> thread safe. Moreover, it does not have the inter-thread communication
>> overhead of plain eve
:
- Avoid duplicate initialization of cq timer
Jinhao Fan (3):
hw/nvme: support irq(de)assertion with eventfd
hw/nvme: use KVM irqfd when available
hw/nvme: add iothread support
hw/nvme/ctrl.c | 328 +++
hw/nvme/ns.c | 21 ++-
hw/nvme/nvme.h
at 7:18 PM, Jinhao Fan wrote:
> @@ -4979,7 +5007,13 @@ static void nvme_init_cq(NvmeCQueue *cq, NvmeCtrl *n,
> uint64_t dma_addr,
> }
> }
> n->cq[cqid] = cq;
> -cq->timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, nvme_post_cqes, cq);
> +
> +if (cq
This patch series adds support for using a separate iothread for NVMe
IO emulation, which brings the potential of applying polling. The
first two patches implement support for irqfd, which solves thread
safety problems for interrupt emulation outside the main loop thread.
Jinhao Fan (3):
hw
Sure. I’ve already reworked this iothread patch upon the new irqfd patch. I
think I can post a v2 patch today. Do you mean I should include irqfd v3
in the new iothread patch series?
Sent from my iPhone
> On Aug 26, 2022, at 15:12, Klaus Jensen wrote:
>
> On Jul 20 17:00, Jinhao Fan wrote:
>> Add an o
s. Removing it from nvme_init_irq_notifier() makes
> sure we do not try to double add.
>
> Now it does what it is supposed to; no hacks required :)
>
> Jinhao Fan (2):
> hw/nvme: support irq(de)assertion with eventfd
> hw/nvme: use KVM irqfd when available
>
> h
On 8/25/2022 9:59 PM, Klaus Jensen wrote:
On Aug 25 21:09, Jinhao Fan wrote:
On Aug 25, 2022, at 20:39, Klaus Jensen wrote:
On Aug 25 13:56, Klaus Jensen wrote:
On Aug 25 19:16, Jinhao Fan wrote:
On 8/25/2022 5:33 PM, Klaus Jensen wrote:
I'm still a bit perplexed by this issue, so I just
On 8/25/2022 7:56 PM, Klaus Jensen wrote:
On Aug 25 19:16, Jinhao Fan wrote:
On 8/25/2022 5:33 PM, Klaus Jensen wrote:
I'm still a bit perplexed by this issue, so I just tried moving
nvme_init_irq_notifier() to the end of nvme_init_cq() and removing this
first_io_cqe thing. I did not ob
> On Aug 25, 2022, at 20:39, Klaus Jensen wrote:
>
> On Aug 25 13:56, Klaus Jensen wrote:
>>> On Aug 25 19:16, Jinhao Fan wrote:
>>> On 8/25/2022 5:33 PM, Klaus Jensen wrote:
>>>> I'm still a bit perplexed by this issue, so I just tried moving
>>>
On 8/25/2022 5:33 PM, Klaus Jensen wrote:
I'm still a bit perplexed by this issue, so I just tried moving
nvme_init_irq_notifier() to the end of nvme_init_cq() and removing this
first_io_cqe thing. I did not observe any particular issues?
What bad behavior did you encounter, it seems to work fin
When irqfd is enabled, we bypass QEMU's irq emulation and let KVM
directly assert the irq. However, KVM is not aware of the device's MSI-x
masking status. Add MSI-x mask bookkeeping in NVMe emulation and
detach the irqfd when the corresponding vector is masked.
Signed-off-
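The bookkeeping itself can be as small as a per-vector bitmap. An illustrative sketch of the state such masking support has to track so irqfd can be detached on mask and re-attached on unmask (names are mine, not the patch's):

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative per-vector MSI-X mask bookkeeping: bit n set means vector
 * n is masked.  A device model would detach the vector's irqfd in
 * vector_mask() and re-attach it in vector_unmask(). */
static uint32_t masked_vectors;

static void vector_mask(unsigned v)
{
    masked_vectors |= 1u << v;
}

static void vector_unmask(unsigned v)
{
    masked_vectors &= ~(1u << v);
}

static bool vector_is_masked(unsigned v)
{
    return (masked_vectors >> v) & 1u;
}
```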
mulation is bypassed.
Changes since v1:
- Made nvme_irq_(de)assert wrappers around eventfd call and actual irq
assertion
- Dropped the previous first patch to avoid duplicate checks for
irq_enabled and msix_enabled
Jinhao Fan (3):
hw/nvme: support irq(de)assertion with eventfd
hw/nvme
On 8/24/2022 7:22 PM, Klaus Jensen wrote:
What are the implications if we drop it? That is, if we go back to your
version that did not include this? If it doesn't impact the kvm irqchip
logic, then I'd rather that we rip it out and leave the device without
masking/unmasking support, keeping irqfd
On 8/16/2022 6:46 PM, Klaus Jensen wrote:
Did qtest work out for you for testing? If so, it would be nice to add a
simple test case as well.
Since MSI-x masking handlers are only implemented for IO queues, if we
want to use qtest we need to implement utilities for controller
initialization an
at 11:24 PM, Stefan Hajnoczi wrote:
> Can the logic be moved into assert()/deassert() so callers don't need
> to duplicate the checks?
>
> (I assume the optimization is that eventfd syscalls are avoided, not
> that the function call is avoided.)
I guess I can move the eventfd syscall into asser
at 7:20 PM, Klaus Jensen wrote:
> This option does not seem to change anything - the value is never used
> ;)
What a stupid mistake. I’ll fix this in the next version.
at 6:46 PM, Klaus Jensen wrote:
>
> Did qtest work out for you for testing? If so, it would be nice to add a
> simple test case as well.
I found MSI-x masking harder to test than we imagined. My plan is to only
emulate IO queues in the IOthread and leave admin queue emulation in the
main loop s
at 11:37 PM, Jinhao Fan wrote:
> This patch series changes qemu-nvme's interrupt emulation to use event
> notifiers, which can ensure thread-safe interrupt delivery when iothread
> is used. In the first two patches, I convert qemu-nvme's IO emulation
> logic to send irq vi
the unnecessary overhead of signalling eventfd.
Signed-off-by: Jinhao Fan
---
hw/nvme/ctrl.c | 40 ++--
1 file changed, 22 insertions(+), 18 deletions(-)
diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 87aeba0564..bd3350d7e0 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/n
s MSI-x emulation is bypassed.
Jinhao Fan (4):
hw/nvme: avoid unnecessary call to irq (de)assertion functions
hw/nvme: add option to (de)assert irq with eventfd
hw/nvme: use irqfd to send interrupts
hw/nvme: add MSI-x mask handlers for irqfd
hw/nvme/ctrl.c | 251 +++
at 12:35 PM, Jinhao Fan wrote:
> static void nvme_irq_assert(NvmeCtrl *n, NvmeCQueue *cq)
> {
> if (cq->irq_enabled) {
> if (msix_enabled(&(n->parent_obj))) {
> +/* Initialize CQ irqfd */
> +if (!cq->irqfd_enabled &&am
With the introduction of shadow doorbell and ioeventfd, we need to do
frequent conversion between qid and its doorbell offset. The original
hard-coded calculation is confusing and error-prone. Add several helper
functions to do this task.
Signed-off-by: Jinhao Fan
---
Changes since v1:
- Use
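With the default 4-byte doorbell stride (CAP.DSTRD = 0), each queue pair owns one 8-byte slot in doorbell space. A sketch of what such helpers compute (function names are illustrative, not necessarily the ones the patch adds):

```c
#include <stdint.h>

/* Offset helpers for the default 4-byte doorbell stride (CAP.DSTRD = 0):
 * queue `qid` has its SQ tail doorbell at qid * 2 * 4 = qid << 3 and its
 * CQ head doorbell 4 bytes after that. */
static inline uint32_t sq_db_offset(uint16_t qid)
{
    return (uint32_t)qid << 3;
}

static inline uint32_t cq_db_offset(uint16_t qid)
{
    return ((uint32_t)qid << 3) + 4;
}
```

This is the same arithmetic as the hard-coded `sqid << 3` expressions scattered through the existing code, just given a name.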
at 4:54 PM, Klaus Jensen wrote:
> I am unsure if the compiler will transform that division into the shift
> if it can infer that the divisor is a power of two (it most likely
> will be able to).
>
> But I see no reason to have a potential division here when we can do
> without and to me it is ju
at 2:02 PM, Klaus Jensen wrote:
> On Jul 28 16:07, Jinhao Fan wrote:
>> With the introduction of shadow doorbell and ioeventfd, we need to do
>> frequent conversion between qid and its doorbell offset. The original
>> hard-coded calculation is confusing and error-pron
asked.
Hi Stefan,
While implementing interrupt masking support, I found it hard to test this
feature on the host. Keith told me that no NVMe drivers are currently using
this feature. Do you remember how you tested interrupt masking?
Regards,
Jinhao Fan
at 4:07 PM, Jinhao Fan wrote:
> With the introduction of shadow doorbell and ioeventfd, we need to do
> frequent conversion between qid and its doorbell offset. The original
> hard-coded calculation is confusing and error-prone. Add several helper
> functions to do this task.
>
at 11:18 PM, Stefan Hajnoczi wrote:
> I think that is incorrect. QEMU has guest notifier emulation for the
> non-KVM (and non-MSI-X PCI) cases. When there is no irqfd support
> available, QEMU sets up a regular eventfd and calls
> virtio_queue_guest_notifier_read() when it becomes readable.
Than
not enable ioeventfd by default
>
> hw/nvme/ctrl.c | 12 +---
> 1 file changed, 9 insertions(+), 3 deletions(-)
>
> --
> 2.36.1
Looks good to me as well.
Reviewed-by: Jinhao Fan
After digging through the source code, I found event_notifier_cleanup() only
closes the eventfd, but does not de-register the event from QEMU’s event
loop. event_notifier_set_handler() actually interacts with the event loop.
It is a wrapper around aio_set_event_notifier(), which is again a wrapper
at 3:06 PM, Klaus Jensen wrote:
> On Jul 26 14:08, Klaus Jensen wrote:
>> Alright. Forget about the iommu, that was just a coincidence.
>>
>> This patch seems to fix it. I guess it is the
>> event_notifier_set_handler(..., NULL) that does the trick, but I'd like
>> to understand why ;)
>>
>>
>
at 3:10 PM, Klaus Jensen wrote:
> On Jul 20 10:47, Jinhao Fan wrote:
>> at 10:41 PM, Jinhao Fan wrote:
>>
>>> at 1:34 PM, Klaus Jensen wrote:
>>>
>>>> From: Klaus Jensen
>>>>
>>>> While testing Jinhao's ioeventfd patch I f
at 10:56 AM, Jinhao Fan wrote:
> at 2:07 AM, Keith Busch wrote:
>
>> MSI-x uses MMIO for masking, so there's no need for an NVMe specific way to
>> mask these interrupts. You can just use the generic PCIe methods to clear the
>> vector's enable bit. But no NV
at 2:07 AM, Keith Busch wrote:
> MSI-x uses MMIO for masking, so there's no need for an NVMe specific way to
> mask these interrupts. You can just use the generic PCIe methods to clear the
> vector's enable bit. But no NVMe driver that I know of is making use of these
> either, though it should b
at 10:45 PM, Keith Busch wrote:
> On Tue, Jul 26, 2022 at 04:55:54PM +0800, Jinhao Fan wrote:
>> Hi Klaus and Keith,
>>
>> I just added support for interrupt masking. How can I test interrupt
>> masking?
>
> Are you asking about MSI masking? I don't thi
at 5:00 PM, Jinhao Fan wrote:
> Add an option "iothread=x" to do emulation in a separate iothread.
> This improves the performance because QEMU's main loop is responsible
> for a lot of other work while iothread is dedicated to NVMe emulation.
> Moreover, emula
at 3:41 PM, Klaus Jensen wrote:
> On Jul 26 15:35, Jinhao Fan wrote:
>> at 4:55 AM, Klaus Jensen wrote:
>>
>>> We have a regression following this patch that we need to address.
>>>
>>> With this patch, issuing a reset on the device (`nvme reset /dev
at 4:55 AM, Klaus Jensen wrote:
>
> We have a regression following this patch that we need to address.
>
> With this patch, issuing a reset on the device (`nvme reset /dev/nvme0`
> will do the trick) causes QEMU to hog my host cpu at 100%.
>
> I'm still not sure what causes this. The trace out
at 3:36 AM, Stefan Hajnoczi wrote:
>
>
> On Sun, Jul 24, 2022, 11:21 Jinhao Fan wrote:
> at 9:29 PM, Stefan Hajnoczi wrote:
>
> >
> > Nice, perf(1) is good for that. You can enable trace events and add
> > kprobes/uprobes to record timestamps wh
at 9:29 PM, Stefan Hajnoczi wrote:
>
> Nice, perf(1) is good for that. You can enable trace events and add
> kprobes/uprobes to record timestamps when specific functions are entered.
>
Thanks Stefan,
One last question: Currently we can achieve hundreds of KIOPS. That means
perf can easily cap
The commit "Use io_uring_register_ring_fd() to skip fd operations" uses
warn_report but did not include the header file "qemu/error-report.h".
This causes "error: implicit declaration of function ‘warn_report’".
Include this header file.
Signed-off-by: Jinhao Fan
Hi Stefan,
Thanks for the detailed explanation!
at 6:21 PM, Stefan Hajnoczi wrote:
> Hi Jinhao,
> Thanks for working on this!
>
> irqfd is not necessarily faster than KVM ioctl interrupt injection.
>
> There are at least two non performance reasons for irqfd:
> 1. It avoids QEMU emulation co
at 12:35 PM, Jinhao Fan wrote:
> Use irqfd to directly notify KVM to inject interrupts. This is done by
> registering a virtual IRQ (virq) in KVM and associating the virq with an
> irqfd, so that KVM can directly inject the interrupt when it receives
> notification from the irqfd. This
at 10:41 PM, Jinhao Fan wrote:
> at 1:34 PM, Klaus Jensen wrote:
>
>> From: Klaus Jensen
>>
>> While testing Jinhao's ioeventfd patch I found it useful with a couple of
>> additional trace events since we no longer see the mmio events.
>>
>> Sign
at 1:34 PM, Klaus Jensen wrote:
> From: Klaus Jensen
>
> While testing Jinhao's ioeventfd patch I found it useful with a couple of
> additional trace events since we no longer see the mmio events.
>
> Signed-off-by: Klaus Jensen
> ---
> hw/nvme/ctrl.c | 8
> hw/nvme/trace-events
at 12:18 PM, Klaus Jensen wrote:
> On Jul 12 14:26, Klaus Jensen wrote:
>> On Jul 9 12:35, Jinhao Fan wrote:
>>> Use irqfd to directly notify KVM to inject interrupts. This is done by
>>> registering a virtual IRQ(virq) in KVM and associate the virq with an
>>
kernel KVM.
Could you help me figure out the following questions:
1. How much performance improvement can I expect from using irqfd?
2. How can I debug this kind of cross QEMU-KVM problems?
Signed-off-by: Jinhao Fan
---
hw/nvme/ctrl.c | 67 -
hw/n
at 10:24 PM, Jinhao Fan wrote:
> @@ -5793,6 +5891,7 @@ static uint16_t nvme_dbbuf_config(NvmeCtrl *n, const
> NvmeRequest *req)
> uint64_t dbs_addr = le64_to_cpu(req->cmd.dptr.prp1);
> uint64_t eis_addr = le64_to_cpu(req->cmd.dptr.prp2);
> int i;
> +int
at 1:51 PM, Klaus Jensen wrote:
> On Jul 6 19:34, Jinhao Fan wrote:
>> at 2:43 AM, Keith Busch wrote:
>>
>>> On Tue, Jul 05, 2022 at 07:11:36PM +0200, Klaus Jensen wrote:
>>>> On Jul 5 22:24, Jinhao Fan wrote:
>>>>> @@ -1374,7 +1374,14 @@
at 2:43 AM, Keith Busch wrote:
> On Tue, Jul 05, 2022 at 07:11:36PM +0200, Klaus Jensen wrote:
>> On Jul 5 22:24, Jinhao Fan wrote:
>>> @@ -1374,7 +1374,14 @@ static void nvme_enqueue_req_completion(NvmeCQueue
>>> *cq, NvmeRequest *req)
>>>
>>>
at 1:11 AM, Klaus Jensen wrote:
> On Jul 5 22:24, Jinhao Fan wrote:
>> Add property "ioeventfd" which is enabled by default. When this is
>> enabled, updates on the doorbell registers will cause KVM to signal
>> an event to the QEMU main loop to handle the
e doorbell
update event.
IOPS comparison on Linux 5.19-rc2: (Unit: KIOPS)
qd          1    4   16   64
qemu       35  121  176  153
ioeventfd  41  133  258  313
Changes since v3:
- Do not deregister ioeventfd when it was not enabled on a SQ/CQ
Signed-off-by: Jinhao Fan
---
hw/nvme/ct
at 12:04 AM, Keith Busch wrote:
> On Thu, Jun 30, 2022 at 11:22:31AM +0800, Jinhao Fan wrote:
>> +static int nvme_init_sq_ioeventfd(NvmeSQueue *sq)
>> +{
>> +NvmeCtrl *n = sq->ctrl;
>> +uint16_t offset = sq->sqid << 3;
>> +int ret;
>>
e doorbell
update event.
IOPS comparison on Linux 5.19-rc2: (Unit: KIOPS)
qd          1    4   16   64
qemu       35  121  176  153
ioeventfd  41  133  258  313
Changes since v2:
- Add memory_region_del_eventfd() when freeing queues to avoid leak
Signed-off-by: Jinhao Fan
---
hw/nvme/ct
e doorbell
update event.
IOPS comparison on Linux 5.19-rc2: (Unit: KIOPS)
qd          1    4   16   64
qemu       35  121  176  153
ioeventfd  41  133  258  313
Changes since v1:
- Return value handling code cleanup
Signed-off-by: Jinhao Fan
---
hw/nvme/ct
at 4:13 AM, Klaus Jensen wrote:
> On Jun 27 18:48, Jinhao Fan wrote:
>> Add property "ioeventfd" which is enabled by default. When this is
>> enabled, updates on the doorbell registers will cause KVM to signal
>> an event to the QEMU main loop to handle the
> That looks correct, since the ioeventfd is an optional
> optimization.
>
> I would just suggest making this easier to read. For example, in
> nvme_init_sq_ioeventfd(), instead of assigning within a conditional:
>
>if ((ret = event_notifier_init(&cq->notifier, 0))) {
>
> Do ea
Ping~
> @@ -4271,6 +4343,11 @@ static void nvme_init_sq(NvmeSQueue *sq, NvmeCtrl *n,
> uint64_t dma_addr,
> if (n->dbbuf_enabled) {
> sq->db_addr = n->dbbuf_dbs + (sqid << 3);
> sq->ei_addr = n->dbbuf_eis + (sqid << 3);
> +
> +if (n->params.ioeventfd && sq-
> On Jun 28, 2022, at 3:33 AM, Klaus Jensen wrote:
>
> On Jun 27 13:17, Keith Busch wrote:
>> On Thu, Jun 16, 2022 at 08:34:07PM +0800, Jinhao Fan wrote:
>>> }
>>> sq->timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, nvme_process_sq, sq);
>>>
>
e doorbell
update event.
IOPS comparison on Linux 5.19-rc2: (Unit: KIOPS)
qd          1    4   16   64
qemu       35  121  176  153
ioeventfd  41  133  258  313
Signed-off-by: Jinhao Fan
---
hw/nvme/ctrl.c | 97 +-
hw/nvme/nvme.h | 5 +++
2 files ch
etions”. Am
I missing something?
Thanks,
Jinhao Fan
> On Jun 17, 2022, at 8:56 PM, Klaus Jensen wrote:
>
> On Jun 17 20:47, Jinhao Fan wrote:
>>
>>
>>> On Jun 17, 2022, at 7:54 PM, Klaus Jensen wrote:
>>>
>>> LGTM,
>>>
>>> Reviewed-by: Klaus Jensen
>>>
>>
&g
> On Jun 17, 2022, at 7:54 PM, Klaus Jensen wrote:
>
> On Jun 16 20:34, Jinhao Fan wrote:
>> This patch adds shadow doorbell buffer support in NVMe 1.3 to QEMU
>> NVMe. The Doorbell Buffer Config admin command is implemented for the
>> guest to enable shadow doobel
hosts that do not use admin queue shadow doorbell
Jinhao Fan (2):
hw/nvme: Implement shadow doorbell buffer support
hw/nvme: Add trace events for shadow doorbell buffer
hw/nvme/ctrl.c | 118 ++-
hw/nvme/nvme.h | 8 +++
hw/nvme/trace-events |
command.
Signed-off-by: Jinhao Fan
---
hw/nvme/ctrl.c | 5 +
hw/nvme/trace-events | 5 +
2 files changed, 10 insertions(+)
diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index f3aaff3e8d..c952c34f94 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -1309,6 +1309,7 @@ static void
dmin queue. Copying the doorbell register
value to the shadow doorbell buffer allows us to support these hosts
as well as spec-compliant hosts that use shadow doorbell buffer for
the admin queue.
Signed-off-by: Jinhao Fan
---
hw/nvme/ctrl.c | 113 ++-
> On Jun 16, 2022, at 6:40 PM, Klaus Jensen wrote:
>
> This won't work for drivers that *do* rely on updating the buffer for
> admin queues, so we should read it regardless of the value of the queue
> id (since we are now updating it through the "Keith Hack^TM").
Sure. Sorry I forgot to update th
> On Jun 15, 2022, at 5:38 PM, Klaus Jensen wrote:
>
> I prefer we use the NVMe terminology to minimize misunderstandings, so
> "host" means the driver and "device" means the qemu side of things
>
Thanks for helping me disambiguate this!
Now that we have resolved all issues in v1, I’ve subm
command.
Signed-off-by: Jinhao Fan
---
hw/nvme/ctrl.c | 5 +
hw/nvme/trace-events | 5 +
2 files changed, 10 insertions(+)
diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 7be2e43f52..77fa79143d 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -1309,6 +1309,7 @@ static void
dmin queue. Copying the doorbell register
value to the shadow doorbell buffer allows us to support these hosts
as well as spec-compliant hosts that use shadow doorbell buffer for
the admin queue.
Signed-off-by: Jinhao Fan
---
hw/nvme/ctrl.c | 112 ++-
rect=1
verify=0
time_based=1
ramp_time=0
runtime=30
;size=1G
;iodepth=1
rw=randread
bs=4k
[test]
numjobs=1
Changes since v1:
- Add compatibility with hosts that do not use admin queue shadow doorbell
Jinhao Fan (2):
hw/nvme: Implement shadow doorbell buffer support
hw/nvme: Add trace eve
eventidx of 3
> nvme_dbbuf_need_event (1)
>
> set eventidx to 4
Therefore, at this point, we read the tail of 5.
> go to sleep
>
> at (1), our tail update of 4->5 doesn't straddle the eventidx, so we don't
> send
> any MMIO, and the device won't wake up. This is why the above code checks the
> tail twice for any concurrent update.
Thanks,
Jinhao Fan
>
> regards
> john
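The host-side sequence being discussed can be sketched as a self-contained model (patterned after the Linux driver's nvme_dbbuf_update_and_check_event(); globals stand in for the host/device shared shadow buffers, and the real code also needs memory barriers):

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of the host-side shadow doorbell update: publish the new tail to
 * the shadow buffer first, then evaluate eventidx.  Reading eventidx only
 * after the tail is visible is what closes the race above, where the
 * device advances eventidx concurrently with the tail update. */
static uint16_t shadow_db;  /* host-written tail */
static uint16_t shadow_ei;  /* device-written eventidx */

static bool dbbuf_update_and_check_event(uint16_t value)
{
    uint16_t old = shadow_db;

    shadow_db = value;  /* publish before checking (wmb() in real code) */
    return (uint16_t)(value - shadow_ei - 1) < (uint16_t)(value - old);
}
```

Replaying the example above: with eventidx still at 3, the 4->5 update returns false (no MMIO); after the device moves eventidx to 4, the same update returns true and the doorbell is rung.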
> On Jun 14, 2022, at 11:41 PM, Keith Busch wrote:
>
> It's a pretty nasty hack, and definitely not in compliance with the spec: the
> db_addr is supposed to be read-only from the device side, though I do think
> it's safe for this environment. Unless Klaus or anyone finds something I'm
> missi