On 15/07/2016 12:17, Roman Penyaev wrote: > On Fri, Jul 15, 2016 at 11:58 AM, Paolo Bonzini <pbonz...@redhat.com> wrote: >> >> >> On 15/07/2016 11:18, Roman Penyaev wrote: >>> Those 3 red spikes and a blue hill is what we have to focus on. The >>> blue hill at the right corner of the chart means that almost always the >>> ring buffer was observed as full, i.e. qemu_laio_completion_bh() got >>> a chance to reap completions not very often, meanwhile completed >>> requests stand in the ring buffer for quite a long time which degrades >>> the overall performance. >>> >>> The results covered by the red line are much better and that can be >>> explained by those 3 red spikes, which are almost in the middle of the >>> whole distribution, i.e. qemu_laio_completion_bh() is called more often, >>> completed requests do not stall, giving fio a chance to submit new fresh >>> requests. >>> >>> The theoretical fix would be to schedule completion BH just after >>> successful io_submit, i.e.: >> >> What about removing the qemu_bh_cancel but keeping the rest of the patch? > > That exactly what I did. Numbers go to expected from ~1600MB/s to ~1800MB/s. > So basically this hunk of the debatable patch: > > if (event_notifier_test_and_clear(&s->e)) { > - qemu_bh_schedule(s->completion_bh); > + qemu_laio_completion_bh(s); > } > > does not have any impact and can be ignored. At least I did not notice > anything important. > >> >> I'm also interested in a graph with this patch ("linux-aio: prevent >> submitting more than MAX_EVENTS") on top of origin/master. > > I can plot it also of course. > >> >> Thanks for the analysis. Sometimes a picture _is_ worth a thousand >> words, even if it's measuring "only" second-order effects (# of >> completions is not what causes the slowdown, but # of completions >> affects latency which causes the slowdown). > > Yes, you are right, latency. With userspace io_getevents ~0 costs we > can peek requests as often as we like to decrease latency on very > fast devices. That can also bring something. Probably after each > io_submit() it makes sense to peek and complete something.
Right, especially 1) because io_getevents with timeout 0 is cheap (it peeks at the ring buffer before the syscall); 2) because we want anyway to replace io_getevents with userspace code through your other patch. Paolo