On Fri, Sep 9, 2011 at 6:38 PM, Stefan Hajnoczi
<stefa...@linux.vnet.ibm.com> wrote:
> On Fri, Sep 09, 2011 at 05:44:36PM +0800, Zhi Yong Wu wrote:
>> Today, I did some basic I/O testing and suddenly found that qemu write
>> and rw speeds are very low now. My qemu binary is built on commit
>> 344eecf6995f4a0ad1d887cec922f6806f91a3f8.
>>
>> Does qemu have a regression?
>>
>> The testing data is shown below:
>>
>> 1.) write
>>
>> test: (g=0): rw=write, bs=512-512/512-512, ioengine=libaio, iodepth=1
>
> Please post your QEMU command line. If your -drive is using
> cache=writethrough then small writes are slow because they require the
> physical disk to write and then synchronize its write cache. Typically
> cache=none is a good setting to use for local disks.
>
> The block size of 512 bytes is too small. Ext4 uses a 4 KB block size,
> so I think a 512 byte write from the guest could cause a 4 KB
> read-modify-write operation on the host filesystem.
>
> You can check this by running btrace(8) on the host during the
> benchmark. The blktrace output and the summary statistics will show
> what I/O pattern the host is issuing.

  8,2   0   1   0.000000000   337  A  WS 425081504 + 8 <- (253,1) 42611360
  8,0   0   2   0.000000896   337  A  WS 426107552 + 8 <- (8,2) 425081504
  8,2   0   3   0.000001772   337  Q  WS 426107552 + 8 [jbd2/dm-1-8]
  8,2   0   4   0.000006617   337  G  WS 426107552 + 8 [jbd2/dm-1-8]
  8,2   0   5   0.000007862   337  P   N [jbd2/dm-1-8]
  8,2   0   6   0.000010481   337  I  WS 426107552 + 8 [jbd2/dm-1-8]
.....
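If I read the blktrace output correctly, the "+ 8" length field is in
512-byte sectors, so each request above works out as follows (a quick
sanity check, assuming the standard blktrace sector unit):

```shell
# blktrace length fields ("+ 8" above) count 512-byte sectors, so:
sectors=8
echo "$((sectors * 512)) bytes per request"   # prints "4096 bytes per request"
```

That would be exactly one ext4 block per write, which matches the
read-modify-write theory.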
CPU0 (8,2):
 Reads Queued:          11,      416KiB  Writes Queued:          20,       72KiB
 Read Dispatches:       12,      440KiB  Write Dispatches:        8,       72KiB
 Reads Requeued:         0               Writes Requeued:         0
 Reads Completed:       14,      448KiB  Writes Completed:       12,       72KiB
 Read Merges:            0,        0KiB  Write Merges:           10,       40KiB
 Read depth:             2               Write depth:             2
 IO unplugs:            11               Timer unplugs:           0
CPU1 (8,2):
 Reads Queued:           8,       32KiB  Writes Queued:           0,        0KiB
 Read Dispatches:        2,        8KiB  Write Dispatches:        0,        0KiB
 Reads Requeued:         0               Writes Requeued:         0
 Reads Completed:        0,        0KiB  Writes Completed:        0,        0KiB
 Read Merges:            5,       20KiB  Write Merges:            0,        0KiB
 Read depth:             2               Write depth:             2
 IO unplugs:             0               Timer unplugs:           0
Total (8,2):
 Reads Queued:          19,      448KiB  Writes Queued:          20,       72KiB
 Read Dispatches:       14,      448KiB  Write Dispatches:        8,       72KiB
 Reads Requeued:         0               Writes Requeued:         0
 Reads Completed:       14,      448KiB  Writes Completed:       12,       72KiB
 Read Merges:            5,       20KiB  Write Merges:           10,       40KiB
 IO unplugs:            11               Timer unplugs:           0

Throughput (R/W): 69KiB/s / 11KiB/s
Events (8,2): 411 entries
Skips: 50 forward (5,937 -  93.5%)

From this log, the host writes 8 blocks each time. What is each block's size?

> I suggest changing your fio block size to 8 KB if you want to try a
> small block size. If you want a large block size, try 64 KB or 128 KB.

The results below are with -drive if=virtio,cache=none,file=xxx,bps=1000000
set (note that bps is in bytes per second).

test: (g=0): rw=write, bs=512-512/512-512, ioengine=libaio, iodepth=1
Starting 1 process
^Cbs: 1 (f=1): [W] [1.9% done] [0K/61K /s] [0/120 iops] [eta 01h:00m:57s]
fio: terminating on signal 2

test: (groupid=0, jobs=1): err= 0: pid=27515
  write: io=3,960KB, bw=56,775B/s, iops=110, runt= 71422msec
    slat (usec): min=19, max=31,032, avg=65.03, stdev=844.57
    clat (msec): min=1, max=353, avg= 8.93, stdev=11.91
     lat (msec): min=1, max=353, avg= 8.99, stdev=12.00
    bw (KB/s) : min=    2, max=   60, per=102.06%, avg=56.14, stdev=10.89
  cpu        : usr=0.04%, sys=0.61%, ctx=7936, majf=0, minf=26
  IO depths  : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w: total=0/7920, short=0/0
     lat (msec): 2=0.03%, 4=0.09%, 10=98.48%, 20=0.54%, 50=0.67%
     lat (msec): 100=0.03%, 250=0.05%, 500=0.11%

Run status group 0 (all jobs):
  WRITE: io=3,960KB, aggrb=55KB/s, minb=56KB/s, maxb=56KB/s,
         mint=71422msec, maxt=71422msec

Disk stats (read/write):
  dm-0: ios=6/8007, merge=0/0, ticks=179/78114, in_queue=78272, util=99.58%,
        aggrios=4/7975, aggrmerge=0/44, aggrticks=179/75153,
        aggrin_queue=75304, aggrutil=99.53%
  vda: ios=4/7975, merge=0/44, ticks=179/75153,
       in_queue=75304, util=99.53%

test: (g=0): rw=write, bs=8K-8K/8K-8K, ioengine=libaio, iodepth=1
Starting 1 process
Jobs: 1 (f=1): [W] [100.0% done] [0K/752K /s] [0/91 iops] [eta 00m:00s]

test: (groupid=0, jobs=1): err= 0: pid=27526
  write: io=51,200KB, bw=668KB/s, iops=83, runt= 76622msec
    slat (usec): min=20, max=570K, avg=386.16, stdev=11400.53
    clat (msec): min=1, max=1,699, avg=11.57, stdev=29.85
     lat (msec): min=1, max=1,699, avg=11.96, stdev=33.24
    bw (KB/s) : min=   20, max=  968, per=104.93%, avg=700.95, stdev=245.18
  cpu        : usr=0.08%, sys=0.41%, ctx=6418, majf=0, minf=25
  IO depths  : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w: total=0/6400, short=0/0
     lat (msec): 2=0.05%, 4=0.09%, 10=94.08%, 20=0.78%, 50=3.39%
     lat (msec): 100=0.33%, 250=1.12%, 500=0.11%, 750=0.03%, 2000=0.02%

Run status group 0 (all jobs):
  WRITE: io=51,200KB, aggrb=668KB/s, minb=684KB/s, maxb=684KB/s,
         mint=76622msec, maxt=76622msec

Disk stats (read/write):
  dm-0: ios=913/6731, merge=0/0, ticks=161809/696060, in_queue=858086, util=100.00%,
        aggrios=1070/6679, aggrmerge=316/410, aggrticks=163975/1587245,
        aggrin_queue=1751267, aggrutil=100.00%
  vda: ios=1070/6679, merge=316/410, ticks=163975/1587245,
       in_queue=1751267, util=100.00%

test: (g=0): rw=write, bs=64K-64K/64K-64K, ioengine=libaio, iodepth=1
Starting 1 process
Jobs: 1 (f=1): [W] [100.0% done] [0K/458K /s] [0/6 iops] [eta 00m:00s]

test: (groupid=0, jobs=1): err= 0: pid=27614
  write: io=51,200KB, bw=428KB/s, iops=6, runt=119618msec
    slat (usec): min=28, max=5,507K, avg=7618.90, stdev=194782.69
    clat (msec): min=14, max=9,418, avg=140.49, stdev=328.96
     lat (msec): min=14, max=9,418, avg=148.11, stdev=382.21
    bw (KB/s) : min=   11, max=  664, per=114.06%, avg=488.19, stdev=53.97
  cpu        : usr=0.03%, sys=0.04%, ctx=825, majf=0, minf=27
  IO depths  : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%,
               32=0.0%, >=64=0.0%
     submit  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w: total=0/800, short=0/0
     lat (msec): 20=0.38%, 50=2.50%, 250=97.00%, >=2000=0.12%

Run status group 0 (all jobs):
  WRITE: io=51,200KB, aggrb=428KB/s, minb=438KB/s, maxb=438KB/s,
         mint=119618msec, maxt=119618msec

Disk stats (read/write):
  dm-0: ios=0/938, merge=0/0, ticks=0/517622, in_queue=517712, util=100.00%,
        aggrios=0/883, aggrmerge=0/55, aggrticks=0/351498,
        aggrin_queue=351497, aggrutil=100.00%
  vda: ios=0/883, merge=0/55, ticks=0/351498, in_queue=351497, util=100.00%

test: (g=0): rw=write, bs=128K-128K/128K-128K, ioengine=libaio, iodepth=1
Starting 1 process
Jobs: 1 (f=1): [W] [100.0% done] [0K/374K /s] [0/2 iops] [eta 00m:00s]

test: (groupid=0, jobs=1): err= 0: pid=27623
  write: io=51,200KB, bw=422KB/s, iops=3, runt=121420msec
    slat (usec): min=29, max=5,484K, avg=17456.88, stdev=274747.99
    clat (msec): min=174, max=9,559, avg=283.02, stdev=465.37
     lat (msec): min=176, max=9,559, avg=300.47, stdev=538.58
    bw (KB/s) : min=   22, max=  552, per=114.21%, avg=480.81, stdev=49.10
  cpu        : usr=0.00%, sys=0.03%, ctx=425, majf=0, minf=27
  IO depths  : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w: total=0/400, short=0/0
     lat (msec): 250=9.50%, 500=90.25%, >=2000=0.25%

Run status group 0 (all jobs):
  WRITE: io=51,200KB, aggrb=421KB/s, minb=431KB/s, maxb=431KB/s,
         mint=121420msec, maxt=121420msec

Disk stats (read/write):
  dm-0: ios=0/541, merge=0/0, ticks=0/573761, in_queue=574004, util=100.00%,
        aggrios=0/484, aggrmerge=0/57, aggrticks=0/396662,
        aggrin_queue=396662, aggrutil=100.00%
  vda: ios=0/484, merge=0/57, ticks=0/396662, in_queue=396662, util=100.00%

> Stefan

-- 
Regards,

Zhi Yong Wu
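P.S. A back-of-the-envelope check on the throttle itself (shell arithmetic
only; this assumes bps=1000000 acts as a plain bytes-per-second cap):

```shell
# Upper bound on iops that a bps=1000000 throttle permits per block size
bps=1000000   # -drive bps= value, in bytes per second
for bs in 512 8192 65536 131072; do
    echo "bs=$bs: throttle allows at most $((bps / bs)) iops"
done
```

The measured 110 iops at bs=512 is far below the ~1953 iops the throttle
alone would allow, so at 512 bytes something other than the bps limit seems
to dominate; at 8 KB the measured 83 iops is much closer to the ~122 iops cap.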