On Wed, Jul 13, 2016 at 1:45 PM, Kevin Wolf <kw...@redhat.com> wrote: > Am 13.07.2016 um 13:33 hat Roman Penyaev geschrieben: >> Just to be sure that we are on the same page: >> >> 1. We have this commit "linux-aio: Cancel BH if not needed" which >> >> a) introduces performance regression on my fio workloads on the >> following config: "iothread=1, VCPU=8, MQ=8". Performance >> dropped from 1878MB/s to 1606MB/s with Stefan's fix, that is >> ~14%. > > Do we already understand why the performance regresses with the patch?
This is very good question from the author of the patch. Speaking for myself, I do not understand. > As long as we don't, everything we do is just guesswork. Kevin, did you miss the ending "with Stefan's fix" ? Since your patch reproduces another problem it is impossible to test it isolated or IO hangs. I tested four variations and invite you to do the same, since you are the author of the debatable patch: 1. as-is, i.e. + Stefan's "virtio-blk: dataplane multiqueue support" + yours "linux-aio: Cancel BH if not needed" As we already discussed - IO hangs. 2. + Stefan's "linux-aio: keep processing events if MAX_EVENTS reached" READ: io=48199MB, aggrb=1606.5MB/s, minb=1606.5MB/s, maxb=1606.5MB/s, mint=30003msec, maxt=30003msec WRITE: io=48056MB, aggrb=1601.8MB/s, minb=1601.8MB/s, maxb=1601.8MB/s, mint=30003msec, maxt=30003msec 3. - Stefan's "linux-aio: keep processing events if MAX_EVENTS reached" + my "linux-aio: prevent submitting more than MAX_EVENTS" READ: io=53294MB, aggrb=1776.3MB/s, minb=1776.3MB/s, maxb=1776.3MB/s, mint=30003msec, maxt=30003msec WRITE: io=53177MB, aggrb=1772.4MB/s, minb=1772.4MB/s, maxb=1772.4MB/s, mint=30003msec, maxt=30003msec 4. - my "linux-aio: prevent submitting more than MAX_EVENTS" - yours "linux-aio: Cancel BH if not needed" READ: io=56362MB, aggrb=1878.4MB/s, minb=1878.4MB/s, maxb=1878.4MB/s, mint=30007msec, maxt=30007msec WRITE: io=56255MB, aggrb=1874.8MB/s, minb=1874.8MB/s, maxb=1874.8MB/s, mint=30007msec, maxt=30007msec The drop from 1878MB/s to 1776MB/s is ~5% and probably can be ignored (I say probably because would be nice to have some other numbers from you or anybody else) The drop from 1878MB/s to 1606Mb/s is ~14% and seems like a serious degradation. Also to go deeper and to avoid possible suspicions I tested isolated Stefan's "linux-aio: keep processing events if MAX_EVENTS reached" patch, without yours "linux-aio: Cancel BH if not needed": READ: io=109970MB, aggrb=1832.8MB/s, minb=1832.8MB/s, maxb=1832.8MB/s, mint=60003msec, maxt=60003msec WRITE: io=109820MB, aggrb=1830.3MB/s, minb=1830.3MB/s, maxb=1830.3MB/s, mint=60003msec, maxt=60003msec As you can see no significant drop. That means that only the following pair: Stefan's "linux-aio: keep processing events if MAX_EVENTS reached" yours "linux-aio: Cancel BH if not needed" impacts the performance. As I already told we have different choices to follow and one of them (the simplest) is to revert everything. -- Roman > > Kevin > >> b) reproduces IO hang, because of in-flights > MAX_EVENTS. >> >> So probably this commit should be reverted because of a) not b). >> >> 2. Stefan has fix for 1.b) issue repeating io_getevents(), which >> is obviously an excess for generic cases where MQ=1 (queue depth >> for virtio_blk is also set to 128 i.e. equal to MAX_EVENTS on >> QEMU side). >> >> 3. The current patch also aims to fix 1.b) issue restricting number >> of in-flights. >> >> >> Reverting 1. will fix all the problems, without any need to apply >> 2. or 3. The most lazy variant. >> >> Restricting in-flights is also a step forward, since submitting >> till -EAGAIN is also possible, but leads (as we already know) to >> IO hang on specific loads and conditions. >> >> But 2. and 3. are mutual exclusive and should not be applied >> together. >> >> So we have several alternatives and a choice what to follow. >> >> -- >> Roman