On Mon, Oct 26, 2015 at 7:37 PM, Paolo Bonzini <pbonz...@redhat.com> wrote:
> On 26/10/2015 17:31, Andrey Korolyov wrote:
>>> the virtio block device is always splitting a single read
>>> range request into 4k ones, bringing the overall performance of
>>> sequential reads far below virtio-scsi.
>>>
>>> What does the blktrace look like in the guest?
>>
>> Yep, thanks for the suggestion. It now looks like a pure driver issue:
>>
>> Reads Queued:      11008,    44032KiB  Writes Queued:        0,     0KiB
>> Read Dispatches:   11008,    44032KiB  Write Dispatches:     0,     0KiB
>>
>> vs
>>
>> Reads Queued:     185728,   742912KiB  Writes Queued:        0,     0KiB
>> Read Dispatches:    2902,   742912KiB  Write Dispatches:     0,     0KiB
>>
>> Because the guest virtio-blk driver lacks *any* blk scheduler management,
>> this is kinda logical. Requests for the scsi backend are dispatched in
>                                                            ^^^^^^^^^^
>
> queued, you mean?
>
>> single block-sized chunks as well, but they are mostly merged by a
>> scheduler before being passed to the device layer. Could there be any
>> improvement to this situation short of writing an underlay between the
>> virtio emulator backend and the real storage?
>
> This is probably the fall-out of converting virtio-blk to use blk-mq,
> which was premature to say the least. Jeff Moyer was working on it, but
> I'm not sure if that has been merged. Andrey, what kernel are you
> using?
>
> Paolo
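For reference, the summaries above come from blkparse. A minimal sketch of how to reproduce the comparison in the guest, assuming the virtio-blk disk shows up as /dev/vda (the device name is an assumption; adjust as needed):

    # With blk-mq on a 3.16 kernel, virtio-blk exposes no I/O scheduler
    # at all ("none"), while a legacy-path disk shows e.g. noop/deadline/cfq.
    cat /sys/block/vda/queue/scheduler

    # Trace 30 seconds of a sequential read and keep the final summary;
    # comparing "Reads Queued" with "Read Dispatches" shows how much
    # merging happened before requests reached the device.
    blktrace -d /dev/vda -w 30 -o - | blkparse -i - | tail -n 30 &
    dd if=/dev/vda of=/dev/null bs=1M count=512
    wait

A Queued-to-Dispatched ratio of 1:1, as in the first summary, means every 4k request hit the device unmerged; the roughly 64:1 ratio in the second summary is the scheduler's merging at work.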
Queued, sorry for the honest typo. The guest kernel is a 3.16.x from jessie, so regular blk-mq is there. Is there any point in trying something newer? And of course I didn't think of trying something older; I will try against 3.10 now.
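When comparing kernels, it may help to confirm which request path virtio-blk is actually on; the mq directory under sysfs only exists for blk-mq devices (again assuming the disk is /dev/vda):

    # virtio-blk was converted to blk-mq in 3.13, so this should print
    # "blk-mq" on the 3.16 guest and "legacy" on a 3.10 one.
    if [ -d /sys/block/vda/mq ]; then echo blk-mq; else echo legacy; fi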