On Fri, Dec 15, 2017 at 06:02:50PM +0300, Denis V. Lunev wrote:
> Linux guests submit IO requests no longer than PAGE_SIZE * max_seg
> field reported by SCSI controler. Thus typical sequential read with
> 1 MB size results in the following pattern of the IO from the guest:
>   8,16 1 15754 2.766095122 2071 D R 2095104 + 1008 [dd]
>   8,16 1 15755 2.766108785 2071 D R 2096112 + 1008 [dd]
>   8,16 1 15756 2.766113486 2071 D R 2097120 + 32 [dd]
>   8,16 1 15757 2.767668961 0 C R 2095104 + 1008 [0]
>   8,16 1 15758 2.768534315 0 C R 2096112 + 1008 [0]
>   8,16 1 15759 2.768539782 0 C R 2097120 + 32 [0]
> The IO was generated by
>   dd if=/dev/sda of=/dev/null bs=1024 iflag=direct
>
> This effectively means that on rotational disks we will observe 3 IOPS
> for each 2 MBs processed. This definitely negatively affects both
> guest and host IO performance.
>
> The cure is relatively simple - we should report lengthy scatter-gather
> ability of the SCSI controller. Fortunately the situation here is very
> good. VirtIO transport layer can accomodate 1024 items in one request
> while we are using only 128. This situation is present since almost
> very beginning. 2 items are dedicated for request metadata thus we
> should publish VIRTQUEUE_MAX_SIZE - 2 as max_seg.
>
> The following pattern is observed after the patch:
>   8,16 1 9921 2.662721340 2063 D R 2095104 + 1024 [dd]
>   8,16 1 9922 2.662737585 2063 D R 2096128 + 1024 [dd]
>   8,16 1 9923 2.665188167 0 C R 2095104 + 1024 [0]
>   8,16 1 9924 2.665198777 0 C R 2096128 + 1024 [0]
> which is much better.
>
> The dark side of this patch is that we are tweaking guest visible
> parameter, though this should be relatively safe as above transport
> layer support is present in QEMU/host Linux for a very long time.
> The patch adds configurable property for VirtIO SCSI with a new default
> and hardcode option for VirtBlock which does not provide good
> configurable framework.
>
> Signed-off-by: Denis V. Lunev <d...@openvz.org>
> CC: "Michael S. Tsirkin" <m...@redhat.com>
> CC: Stefan Hajnoczi <stefa...@redhat.com>
> CC: Kevin Wolf <kw...@redhat.com>
> CC: Max Reitz <mre...@redhat.com>
> CC: Paolo Bonzini <pbonz...@redhat.com>
> CC: Richard Henderson <r...@twiddle.net>
> CC: Eduardo Habkost <ehabk...@redhat.com>
> ---
>  include/hw/compat.h             | 17 +++++++++++++++++
>  include/hw/virtio/virtio-blk.h  |  1 +
>  include/hw/virtio/virtio-scsi.h |  1 +
>  hw/block/virtio-blk.c           |  4 +++-
>  hw/scsi/vhost-scsi.c            |  2 ++
>  hw/scsi/vhost-user-scsi.c       |  2 ++
>  hw/scsi/virtio-scsi.c           |  4 +++-
>  7 files changed, 29 insertions(+), 2 deletions(-)
>
> diff --git a/include/hw/compat.h b/include/hw/compat.h
> index 026fee9..b9be5d7 100644
> --- a/include/hw/compat.h
> +++ b/include/hw/compat.h
> @@ -2,6 +2,23 @@
>  #define HW_COMPAT_H
>
>  #define HW_COMPAT_2_11 \
> +    {\
> +        .driver = "virtio-blk-device",\
> +        .property = "max_segments",\
> +        .value = "126",\
> +    },{\
> +        .driver = "vhost-scsi",\
> +        .property = "max_segments",\
> +        .value = "126",\
> +    },{\
> +        .driver = "vhost-user-scsi",\
> +        .property = "max_segments",\
> +        .value = "126",\

Existing vhost-user-scsi slave programs might not expect up to 1022 segments. Hopefully we can get away with this change since there are relatively few vhost-user-scsi slave programs. CCed Felipe (Nutanix) and Jim (SPDK) in case they have comments.
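
For reference, the segment arithmetic behind the quoted traces can be checked with a small standalone sketch. This is not code from the patch. It assumes 4 KiB guest pages, 512-byte sectors, and the worst case of one page per scatter-gather segment. The helper names (max_request_sectors, split_request) are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumptions for illustration only: 4 KiB guest pages, 512-byte sectors,
 * and one page per scatter-gather segment (the worst case). */
enum { PAGE_SIZE_BYTES = 4096, SECTOR_BYTES = 512 };

/* Largest request, in sectors, that fits into max_seg one-page segments. */
static unsigned max_request_sectors(unsigned max_seg)
{
    return max_seg * (PAGE_SIZE_BYTES / SECTOR_BYTES);
}

/* Split a transfer of `total` sectors into requests of at most `limit`
 * sectors. The first `cap` request sizes are stored in `out`; the return
 * value is the total number of requests needed. */
static size_t split_request(unsigned total, unsigned limit,
                            unsigned *out, size_t cap)
{
    size_t n = 0;

    assert(limit > 0);
    while (total > 0) {
        unsigned chunk = total < limit ? total : limit;

        if (n < cap) {
            out[n] = chunk;
        }
        n++;
        total -= chunk;
    }
    return n;
}
```

Under these assumptions 126 segments cap a request at 1008 sectors, so a 1 MB (2048-sector) read splits into 1008 + 1008 + 32, matching the first trace. With 1022 segments the segment limit alone would allow 8176 sectors, so the 1024-sector requests in the second trace must be capped by some other limit.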