On 11/21/2012 01:39 PM, Asias He wrote:
> On 11/20/2012 08:25 PM, Stefan Hajnoczi wrote:
>> On Tue, Nov 20, 2012 at 1:21 PM, Stefan Hajnoczi <stefa...@gmail.com> wrote:
>>> On Tue, Nov 20, 2012 at 10:02 AM, Asias He <as...@redhat.com> wrote:
>>>> Hello Stefan,
>>>>
>>>> On 11/15/2012 11:18 PM, Stefan Hajnoczi wrote:
>>>>> This series adds the -device virtio-blk-pci,x-data-plane=on property that
>>>>> enables a high performance I/O codepath. A dedicated thread is used to
>>>>> process virtio-blk requests outside the global mutex and without going
>>>>> through the QEMU block layer.
>>>>>
>>>>> Khoa Huynh <k...@us.ibm.com> reported an increase from 140,000 IOPS to
>>>>> 600,000 IOPS for a single VM using virtio-blk-data-plane in July:
>>>>>
>>>>>   http://comments.gmane.org/gmane.comp.emulators.kvm.devel/94580
>>>>>
>>>>> The virtio-blk-data-plane approach was originally presented at Linux
>>>>> Plumbers Conference 2010. The following slides contain a brief overview:
>>>>>
>>>>>   http://linuxplumbersconf.org/2010/ocw/system/presentations/651/original/Optimizing_the_QEMU_Storage_Stack.pdf
>>>>>
>>>>> The basic approach is:
>>>>> 1. Each virtio-blk device has a thread dedicated to handling ioeventfd
>>>>>    signalling when the guest kicks the virtqueue.
>>>>> 2. Requests are processed without going through the QEMU block layer using
>>>>>    Linux AIO directly.
>>>>> 3. Completion interrupts are injected via irqfd from the dedicated thread.
>>>>>
>>>>> To try it out:
>>>>>
>>>>>   qemu -drive if=none,id=drive0,cache=none,aio=native,format=raw,file=...
>>>>>        -device virtio-blk-pci,drive=drive0,scsi=off,x-data-plane=on
>>>>
>>>> Is this the latest dataplane bits:
>>>> (git://github.com/stefanha/qemu.git virtio-blk-data-plane)
>>>>
>>>> commit 7872075c24fa01c925d4f41faa9d04ce69bf5328
>>>> Author: Stefan Hajnoczi <stefa...@redhat.com>
>>>> Date:   Wed Nov 14 15:45:38 2012 +0100
>>>>
>>>>     virtio-blk: add x-data-plane=on|off performance feature
>>>>
>>>> With this commit on a ramdisk based box, I am seeing about 10K IOPS with
>>>> x-data-plane on and 90K IOPS with x-data-plane off.
>>>>
>>>> Any ideas?
>>>>
>>>> Command line I used:
>>>>
>>>> IMG=/dev/ram0
>>>> x86_64-softmmu/qemu-system-x86_64 \
>>>>   -drive file=/root/img/sid.img,if=ide \
>>>>   -drive file=${IMG},if=none,cache=none,aio=native,id=disk1 \
>>>>   -device virtio-blk-pci,x-data-plane=off,drive=disk1,scsi=off \
>>>>   -kernel $KERNEL -append "root=/dev/sdb1 console=tty0" \
>>>>   -L /tmp/qemu-dataplane/share/qemu/ -nographic -vnc :0 -enable-kvm \
>>>>   -m 2048 -smp 4 -cpu qemu64,+x2apic -M pc
>>>
>>> Was just about to send out the latest patch series which addresses
>>> review comments, so I have tested the latest code
>>> (61b70fef489ce51ecd18d69afb9622c110b9315c).
>>
>> Rebased onto qemu.git/master before sending out. The commit ID is now:
>> cf6ed6406543ecc43895012a9ac9665e3753d5e8
>>
>> https://github.com/stefanha/qemu/commits/virtio-blk-data-plane
>>
>> Stefan
>
> Ok, thanks. /me trying
Hi Stefan,

If I enable merging in the guest, the IOPS for seq read/write go up to
~400K/300K. If I disable merging in the guest, they drop to ~17K/24K for
seq read/write (similar to the result I posted yesterday, where merging
was disabled).

Could you please also share the numbers for rand read and write in your
setup?

(In each results block below, the four read/write lines correspond, in
order, to the seq-read, seq-write, rnd-read and rnd-write jobs of the fio
command listed at the end of the block.)

1. (With merge enabled in guest + dataplane on)

echo noop > /sys/block/vda/queue/scheduler
echo 0 > /sys/block/vda/queue/nomerges
-------------------------------------
read : io=0 B, bw=1575.2MB/s, iops=403453 , runt= 10396msec
write: io=0 B, bw=1224.1MB/s, iops=313592 , runt= 13375msec
read : io=0 B, bw=99534KB/s, iops=24883 , runt=168558msec
write: io=0 B, bw=197695KB/s, iops=49423 , runt= 84864msec
clat (usec): min=102 , max=395042 , avg=2167.18, stdev=4307.91
clat (usec): min=89 , max=636874 , avg=2728.93, stdev=6777.71
clat (usec): min=285 , max=4737.1K, avg=39939.42, stdev=87137.41
clat (usec): min=74 , max=1408.1K, avg=18575.50, stdev=47752.13
cpu : usr=79.16%, sys=261.43%, ctx=1837824, majf=0, minf=58
cpu : usr=73.61%, sys=251.80%, ctx=1585892, majf=0, minf=7
cpu : usr=17.03%, sys=71.03%, ctx=6427788, majf=0, minf=6
cpu : usr=25.51%, sys=88.46%, ctx=5117624, majf=0, minf=1
vda: ios=5037311/5303500, merge=3351489/3084584, ticks=30674372/15110818, in_queue=45816533, util=97.50%
41: 40099 40163 40110 40157 PCI-MSI-edge virtio0-requests
41: interrupt in total: 160529

fio --exec_prerun="echo 3 > /proc/sys/vm/drop_caches" --group_reporting \
    --ioscheduler=noop --thread --bs=4k --size=512MB --direct=1 \
    --numjobs=16 --ioengine=libaio --iodepth=64 --loops=3 --ramp_time=0 \
    --filename=/dev/vda \
    --name=seq-read --stonewall --rw=read \
    --name=seq-write --stonewall --rw=write \
    --name=rnd-read --stonewall --rw=randread \
    --name=rnd-write --stonewall --rw=randwrite

2. (With merge disabled in guest + dataplane on)

echo noop > /sys/block/vda/queue/scheduler
echo 2 > /sys/block/vda/queue/nomerges
-------------------------------------
read : io=0 B, bw=69185KB/s, iops=17296 , runt=242497msec
write: io=0 B, bw=96219KB/s, iops=24054 , runt=174365msec
read : io=0 B, bw=90866KB/s, iops=22716 , runt=184637msec
write: io=0 B, bw=202018KB/s, iops=50504 , runt= 83048msec
clat (usec): min=98 , max=1719.7K, avg=57623.32, stdev=84730.86
clat (usec): min=0 , max=1372.8K, avg=41286.47, stdev=73252.45
clat (usec): min=82 , max=1308.8K, avg=43828.82, stdev=73483.73
clat (usec): min=0 , max=1239.6K, avg=18445.77, stdev=49099.15
cpu : usr=10.12%, sys=72.64%, ctx=7942850, majf=0, minf=62
cpu : usr=14.72%, sys=78.99%, ctx=7358973, majf=0, minf=3
cpu : usr=16.34%, sys=72.60%, ctx=7394674, majf=0, minf=5
cpu : usr=29.69%, sys=83.41%, ctx=5262809, majf=0, minf=4
vda: ios=8389288/8388552, merge=0/0, ticks=76013872/44425860, in_queue=120493774, util=99.58%
41: 89414 89456 89504 89534 PCI-MSI-edge virtio0-requests
41: interrupt in total: 357908

fio --exec_prerun="echo 3 > /proc/sys/vm/drop_caches" --group_reporting \
    --ioscheduler=noop --thread --bs=4k --size=512MB --direct=1 \
    --numjobs=16 --ioengine=libaio --iodepth=64 --loops=3 --ramp_time=0 \
    --filename=/dev/vda \
    --name=seq-read --stonewall --rw=read \
    --name=seq-write --stonewall --rw=write \
    --name=rnd-read --stonewall --rw=randread \
    --name=rnd-write --stonewall --rw=randwrite
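For anyone reproducing these runs: the merge setting and the interrupt
counts quoted in each block can be read back with commands roughly like
the ones below. The device name vda is taken from the runs above; grepping
/proc/interrupts this way is just my shorthand, not part of the original
test script.

  # /sys/block/<dev>/queue/nomerges: 0 = merging enabled,
  # 1 = only simple one-hit merges, 2 = no merging at all
  cat /sys/block/vda/queue/nomerges
  echo 2 > /sys/block/vda/queue/nomerges
  # per-vCPU request interrupt counts for the virtio-blk device,
  # i.e. the "PCI-MSI-edge virtio0-requests" lines in the blocks above
  grep virtio0-requests /proc/interrupts

As a quick consistency check on the fio output, bw and iops line up: in
block 1 the seq-read job reports 403453 IOPS at bs=4k, and 403453 * 4 KiB
is roughly 1576 MiB/s, which matches the reported bw=1575.2MB/s.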
3. (With merge enabled in guest + dataplane off)

-------------------------------------
read : io=0 B, bw=810220KB/s, iops=202554 , runt= 20707msec
write: io=0 B, bw=999.97MB/s, iops=255984 , runt= 16385msec
read : io=0 B, bw=338066KB/s, iops=84516 , runt= 49627msec
write: io=0 B, bw=455420KB/s, iops=113854 , runt= 36839msec
clat (usec): min=0 , max=26340 , avg=5019.36, stdev=2185.04
clat (usec): min=58 , max=21572 , avg=3972.33, stdev=1708.00
clat (usec): min=34 , max=90185 , avg=11879.72, stdev=8054.58
clat (usec): min=189 , max=122825 , avg=8822.87, stdev=4608.65
cpu : usr=44.89%, sys=141.08%, ctx=1611401, majf=0, minf=54
cpu : usr=58.46%, sys=177.50%, ctx=1582260, majf=0, minf=0
cpu : usr=21.09%, sys=63.61%, ctx=7609871, majf=0, minf=2
cpu : usr=28.88%, sys=73.51%, ctx=8140689, majf=0, minf=10
vda: ios=5222932/5209798, merge=3164880/3163834, ticks=10133305/7618081, in_queue=17773509, util=99.46%
41: 286378 284870 285478 285759 PCI-MSI-edge virtio0-requests
41: interrupt in total: 1142485

fio --exec_prerun="echo 3 > /proc/sys/vm/drop_caches" --group_reporting \
    --ioscheduler=noop --thread --bs=4k --size=512MB --direct=1 \
    --numjobs=16 --ioengine=libaio --iodepth=64 --loops=3 --ramp_time=0 \
    --filename=/dev/vda \
    --name=seq-read --stonewall --rw=read \
    --name=seq-write --stonewall --rw=write \
    --name=rnd-read --stonewall --rw=randread \
    --name=rnd-write --stonewall --rw=randwrite

4. (With merge disabled in guest + dataplane off)

-------------------------------------
read : io=0 B, bw=331147KB/s, iops=82786 , runt= 50664msec
write: io=0 B, bw=431802KB/s, iops=107950 , runt= 38854msec
read : io=0 B, bw=376424KB/s, iops=94105 , runt= 44570msec
write: io=0 B, bw=373200KB/s, iops=93300 , runt= 44955msec
clat (usec): min=84 , max=99635 , avg=12075.27, stdev=7899.12
clat (usec): min=97 , max=84882 , avg=9289.49, stdev=5688.96
clat (usec): min=56 , max=76176 , avg=10662.84, stdev=6375.63
clat (usec): min=158 , max=112216 , avg=10762.62, stdev=6412.89
cpu : usr=15.16%, sys=59.67%, ctx=8007217, majf=0, minf=62
cpu : usr=26.45%, sys=99.42%, ctx=7477108, majf=0, minf=8
cpu : usr=29.52%, sys=85.40%, ctx=7889672, majf=0, minf=2
cpu : usr=30.71%, sys=86.54%, ctx=7893782, majf=0, minf=2
vda: ios=8389195/8374171, merge=0/0, ticks=15202863/12550941, in_queue=27840167, util=99.63%
41: 179902 179182 179232 179539 PCI-MSI-edge virtio0-requests
41: interrupt in total: 717855

fio --exec_prerun="echo 3 > /proc/sys/vm/drop_caches" --group_reporting \
    --ioscheduler=noop --thread --bs=4k --size=512MB --direct=1 \
    --numjobs=16 --ioengine=libaio --iodepth=64 --loops=3 --ramp_time=0 \
    --filename=/dev/vda \
    --name=seq-read --stonewall --rw=read \
    --name=seq-write --stonewall --rw=write \
    --name=rnd-read --stonewall --rw=randread \
    --name=rnd-write --stonewall --rw=randwrite

-- 
Asias