This is a real log captured while fio was run with bs=128K and bps=1000000 (block I/O throttling):
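For reference while reading the trace: blktrace prints request sizes as 512-byte sector counts, so the "+ 256" entries below are 128 KiB requests, matching bs=128K. A minimal sketch of the unit conversion (editorial note, not part of the original mail):

```python
# blktrace's "+ N" field counts 512-byte sectors, so "+ 256" in the
# trace is 256 * 512 bytes = 128 KiB, matching the fio bs=128K setting.
SECTOR_SIZE = 512  # bytes per sector in blktrace output

def sectors_to_kib(sectors):
    """Convert a blktrace '+ N' sector count to KiB."""
    return sectors * SECTOR_SIZE / 1024

print(sectors_to_kib(256))  # 128.0 KiB per request at bs=128K
print(sectors_to_kib(8))    # 4.0 KiB, the request size in the earlier 512-byte trace
```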
8,2    0        1     0.000000000 24332  A  WS 79958528 + 256 <- (253,2) 71830016
8,0    0        2     0.000000912 24332  A  WS 80984576 + 256 <- (8,2) 79958528
8,2    0        3     0.000001778 24332  Q  WS 80984576 + 256 [qemu-system-x86]
8,2    0        4     0.000006527 24332  G  WS 80984576 + 256 [qemu-system-x86]
8,2    0        5     0.000007817 24332  P   N [qemu-system-x86]
8,2    0        6     0.000011234 24332  I  WS 80984576 + 256 [qemu-system-x86]

CPU0 (8,2):
 Reads Queued:           0,        0KiB   Writes Queued:         558,   25,244KiB
 Read Dispatches:        0,        0KiB   Write Dispatches:      265,   25,440KiB
 Reads Requeued:         0                Writes Requeued:         0
 Reads Completed:        0,        0KiB   Writes Completed:    1,027,   56,420KiB
 Read Merges:            0,        0KiB   Write Merges:           19,       76KiB
 Read depth:             0                Write depth:             3
 IO unplugs:           217                Timer unplugs:         268
CPU1 (8,2):
 Reads Queued:           0,        0KiB   Writes Queued:         483,   31,176KiB
 Read Dispatches:        0,        0KiB   Write Dispatches:      262,   30,980KiB
 Reads Requeued:         0                Writes Requeued:         0
 Reads Completed:        0,        0KiB   Writes Completed:        0,        0KiB
 Read Merges:            0,        0KiB   Write Merges:           20,       80KiB
 Read depth:             0                Write depth:             3
 IO unplugs:           265                Timer unplugs:         181

Total (8,2):
 Reads Queued:           0,        0KiB   Writes Queued:       1,041,   56,420KiB
 Read Dispatches:        0,        0KiB   Write Dispatches:      527,   56,420KiB
 Reads Requeued:         0                Writes Requeued:         0
 Reads Completed:        0,        0KiB   Writes Completed:    1,027,   56,420KiB
 Read Merges:            0,        0KiB   Write Merges:           39,      156KiB
 IO unplugs:           482                Timer unplugs:         449

Throughput (R/W): 0KiB/s / 482KiB/s
Events (8,2): 17,661 entries
Skips: 1,820 forward (3,918,005 - 99.6%)

I found that the I/O pattern the host issues differs each time fio is run with a different bs value.

On Tue, Sep 13, 2011 at 10:38 AM, Zhi Yong Wu <zwu.ker...@gmail.com> wrote:
> On Fri, Sep 9, 2011 at 6:38 PM, Stefan Hajnoczi
> <stefa...@linux.vnet.ibm.com> wrote:
>> On Fri, Sep 09, 2011 at 05:44:36PM +0800, Zhi Yong Wu wrote:
>>> Today, I did some basic I/O testing and suddenly found that qemu write
>>> and rw speed is very low now. My qemu binary is built on commit
>>> 344eecf6995f4a0ad1d887cec922f6806f91a3f8.
>>>
>>> Does qemu have a regression?
>>>
>>> The testing data is shown below:
>>>
>>> 1.) write
>>>
>>> test: (g=0): rw=write, bs=512-512/512-512, ioengine=libaio, iodepth=1
>>
>> Please post your QEMU command-line.  If your -drive is using
>> cache=writethrough then small writes are slow because they require the
>> physical disk to write and then synchronize its write cache.  Typically
>> cache=none is a good setting to use for local disks.
>>
>> The block size of 512 bytes is too small.  Ext4 uses a 4 KB block size,
>> so I think a 512 byte write from the guest could cause a 4 KB
>> read-modify-write operation on the host filesystem.
>>
>> You can check this by running btrace(8) on the host during the
>> benchmark.  The blktrace output and the summary statistics will show
>> what I/O pattern the host is issuing.
> 8,2    0        1     0.000000000   337  A  WS 425081504 + 8 <- (253,1) 42611360
> 8,0    0        2     0.000000896   337  A  WS 426107552 + 8 <- (8,2) 425081504
> 8,2    0        3     0.000001772   337  Q  WS 426107552 + 8 [jbd2/dm-1-8]
> 8,2    0        4     0.000006617   337  G  WS 426107552 + 8 [jbd2/dm-1-8]
> 8,2    0        5     0.000007862   337  P   N [jbd2/dm-1-8]
> 8,2    0        6     0.000010481   337  I  WS 426107552 + 8 [jbd2/dm-1-8]
> .....
> CPU0 (8,2):
>  Reads Queued:          11,      416KiB   Writes Queued:          20,       72KiB
>  Read Dispatches:       12,      440KiB   Write Dispatches:        8,       72KiB
>  Reads Requeued:         0                Writes Requeued:         0
>  Reads Completed:       14,      448KiB   Writes Completed:       12,       72KiB
>  Read Merges:            0,        0KiB   Write Merges:           10,       40KiB
>  Read depth:             2                Write depth:             2
>  IO unplugs:            11                Timer unplugs:           0
> CPU1 (8,2):
>  Reads Queued:           8,       32KiB   Writes Queued:           0,        0KiB
>  Read Dispatches:        2,        8KiB   Write Dispatches:        0,        0KiB
>  Reads Requeued:         0                Writes Requeued:         0
>  Reads Completed:        0,        0KiB   Writes Completed:        0,        0KiB
>  Read Merges:            5,       20KiB   Write Merges:            0,        0KiB
>  Read depth:             2                Write depth:             2
>  IO unplugs:             0                Timer unplugs:           0
>
> Total (8,2):
>  Reads Queued:          19,      448KiB   Writes Queued:          20,       72KiB
>  Read Dispatches:       14,      448KiB   Write Dispatches:        8,       72KiB
>  Reads Requeued:         0                Writes Requeued:         0
>  Reads Completed:       14,      448KiB   Writes Completed:       12,       72KiB
>  Read Merges:            5,       20KiB   Write Merges:           10,       40KiB
>  IO unplugs:            11                Timer unplugs:           0
>
> Throughput (R/W): 69KiB/s / 11KiB/s
> Events (8,2): 411 entries
> Skips: 50 forward (5,937 - 93.5%)
>
> From its log, the host will write 8 blocks each time. What is each block's size?
>
>>
>> I suggest changing your fio block size to 8 KB if you want to try a
>> small block size.  If you want a large block size, try 64 KB or 128 KB.
> When -drive if=virtio,cache=none,file=xxx,bps=1000000 is set
> (note that bps is in bytes).
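Since bps=1000000 caps the drive at one million bytes per second, the achievable iops must fall inversely with block size. A back-of-the-envelope editorial sketch (assuming the cap is a plain bytes-per-second budget, as the option name suggests, and ignoring latency and request merging):

```python
# With -drive ...,bps=1000000, QEMU throttles the drive to 1,000,000
# bytes per second, i.e. roughly 976.6 KiB/s.  The iops ceiling then
# scales as bps / bs, ignoring queueing and request-merge effects.
BPS_LIMIT = 1000000  # bytes/s, from the -drive bps= option

def iops_ceiling(bs_bytes):
    """Upper bound on iops achievable under the bps throttle."""
    return BPS_LIMIT / float(bs_bytes)

print(BPS_LIMIT / 1024.0)        # ~976.6 KiB/s overall ceiling
print(iops_ceiling(512))         # ~1953 iops possible at bs=512
print(iops_ceiling(128 * 1024))  # ~7.6 iops possible at bs=128K
```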
>
> test: (g=0): rw=write, bs=512-512/512-512, ioengine=libaio, iodepth=1
> Starting 1 process
> ^Cbs: 1 (f=1): [W] [1.9% done] [0K/61K /s] [0/120 iops] [eta 01h:00m:57s]
> fio: terminating on signal 2
>
> test: (groupid=0, jobs=1): err= 0: pid=27515
>   write: io=3,960KB, bw=56,775B/s, iops=110, runt= 71422msec
>     slat (usec): min=19, max=31,032, avg=65.03, stdev=844.57
>     clat (msec): min=1, max=353, avg= 8.93, stdev=11.91
>      lat (msec): min=1, max=353, avg= 8.99, stdev=12.00
>     bw (KB/s) : min=    2, max=   60, per=102.06%, avg=56.14, stdev=10.89
>   cpu          : usr=0.04%, sys=0.61%, ctx=7936, majf=0, minf=26
>   IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      issued r/w: total=0/7920, short=0/0
>
>      lat (msec): 2=0.03%, 4=0.09%, 10=98.48%, 20=0.54%, 50=0.67%
>      lat (msec): 100=0.03%, 250=0.05%, 500=0.11%
>
> Run status group 0 (all jobs):
>   WRITE: io=3,960KB, aggrb=55KB/s, minb=56KB/s, maxb=56KB/s,
> mint=71422msec, maxt=71422msec
>
> Disk stats (read/write):
>   dm-0: ios=6/8007, merge=0/0, ticks=179/78114, in_queue=78272,
> util=99.58%, aggrios=4/7975, aggrmerge=0/44, aggrticks=179/75153,
> aggrin_queue=75304, aggrutil=99.53%
>   vda: ios=4/7975, merge=0/44, ticks=179/75153, in_queue=75304, util=99.53%
>
> test: (g=0): rw=write, bs=8K-8K/8K-8K, ioengine=libaio, iodepth=1
> Starting 1 process
> Jobs: 1 (f=1): [W] [100.0% done] [0K/752K /s] [0/91 iops] [eta 00m:00s]
> test: (groupid=0, jobs=1): err= 0: pid=27526
>   write: io=51,200KB, bw=668KB/s, iops=83, runt= 76622msec
>     slat (usec): min=20, max=570K, avg=386.16, stdev=11400.53
>     clat (msec): min=1, max=1,699, avg=11.57, stdev=29.85
>      lat (msec): min=1, max=1,699, avg=11.96, stdev=33.24
>     bw (KB/s) : min=   20, max=  968, per=104.93%, avg=700.95, stdev=245.18
>   cpu          : usr=0.08%, sys=0.41%, ctx=6418, majf=0, minf=25
>   IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      issued r/w: total=0/6400, short=0/0
>
>      lat (msec): 2=0.05%, 4=0.09%, 10=94.08%, 20=0.78%, 50=3.39%
>      lat (msec): 100=0.33%, 250=1.12%, 500=0.11%, 750=0.03%, 2000=0.02%
>
> Run status group 0 (all jobs):
>   WRITE: io=51,200KB, aggrb=668KB/s, minb=684KB/s, maxb=684KB/s,
> mint=76622msec, maxt=76622msec
>
> Disk stats (read/write):
>   dm-0: ios=913/6731, merge=0/0, ticks=161809/696060, in_queue=858086,
> util=100.00%, aggrios=1070/6679, aggrmerge=316/410,
> aggrticks=163975/1587245, aggrin_queue=1751267, aggrutil=100.00%
>   vda: ios=1070/6679, merge=316/410, ticks=163975/1587245,
> in_queue=1751267, util=100.00%
>
> test: (g=0): rw=write, bs=64K-64K/64K-64K, ioengine=libaio, iodepth=1
> Starting 1 process
> Jobs: 1 (f=1): [W] [100.0% done] [0K/458K /s] [0/6 iops] [eta 00m:00s]
> test: (groupid=0, jobs=1): err= 0: pid=27614
>   write: io=51,200KB, bw=428KB/s, iops=6, runt=119618msec
>     slat (usec): min=28, max=5,507K, avg=7618.90, stdev=194782.69
>     clat (msec): min=14, max=9,418, avg=140.49, stdev=328.96
>      lat (msec): min=14, max=9,418, avg=148.11, stdev=382.21
>     bw (KB/s) : min=   11, max=  664, per=114.06%, avg=488.19, stdev=53.97
>   cpu          : usr=0.03%, sys=0.04%, ctx=825, majf=0, minf=27
>   IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      issued r/w: total=0/800, short=0/0
>
>      lat (msec): 20=0.38%, 50=2.50%, 250=97.00%, >=2000=0.12%
>
> Run status group 0 (all jobs):
>   WRITE: io=51,200KB, aggrb=428KB/s, minb=438KB/s, maxb=438KB/s,
> mint=119618msec, maxt=119618msec
>
> Disk stats (read/write):
>   dm-0: ios=0/938, merge=0/0, ticks=0/517622, in_queue=517712,
> util=100.00%, aggrios=0/883, aggrmerge=0/55, aggrticks=0/351498,
> aggrin_queue=351497, aggrutil=100.00%
>   vda: ios=0/883, merge=0/55, ticks=0/351498, in_queue=351497, util=100.00%
>
> test: (g=0): rw=write, bs=128K-128K/128K-128K, ioengine=libaio, iodepth=1
> Starting 1 process
> Jobs: 1 (f=1): [W] [100.0% done] [0K/374K /s] [0/2 iops] [eta 00m:00s]
> test: (groupid=0, jobs=1): err= 0: pid=27623
>   write: io=51,200KB, bw=422KB/s, iops=3, runt=121420msec
>     slat (usec): min=29, max=5,484K, avg=17456.88, stdev=274747.99
>     clat (msec): min=174, max=9,559, avg=283.02, stdev=465.37
>      lat (msec): min=176, max=9,559, avg=300.47, stdev=538.58
>     bw (KB/s) : min=   22, max=  552, per=114.21%, avg=480.81, stdev=49.10
>   cpu          : usr=0.00%, sys=0.03%, ctx=425, majf=0, minf=27
>   IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      issued r/w: total=0/400, short=0/0
>
>      lat (msec): 250=9.50%, 500=90.25%, >=2000=0.25%
>
> Run status group 0 (all jobs):
>   WRITE: io=51,200KB, aggrb=421KB/s, minb=431KB/s, maxb=431KB/s,
> mint=121420msec, maxt=121420msec
>
> Disk stats (read/write):
>   dm-0: ios=0/541, merge=0/0, ticks=0/573761, in_queue=574004,
> util=100.00%, aggrios=0/484, aggrmerge=0/57, aggrticks=0/396662,
> aggrin_queue=396662, aggrutil=100.00%
>   vda: ios=0/484, merge=0/57, ticks=0/396662, in_queue=396662, util=100.00%
>
>
>>
>> Stefan
>>
>
>
> --
> Regards,
>
> Zhi Yong Wu

--
Regards,

Zhi Yong Wu
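[Editorial postscript] As a quick sanity check on the fio numbers quoted above, the reported bandwidth should be roughly iops * bs for each run. A rough cross-check only: fio truncates iops to integers, so the products land slightly below the reported KB/s figures.

```python
# Cross-check the fio runs above: reported bandwidth ~= iops * bs.
# The (iops, bs) pairs are taken from the quoted fio summaries.
runs = {
    "bs=512":  (110, 512),        # reported 56KB/s
    "bs=8K":   (83, 8 * 1024),    # reported 668KB/s
    "bs=64K":  (6, 64 * 1024),    # reported 428KB/s
    "bs=128K": (3, 128 * 1024),   # reported 422KB/s
}
for name, (iops, bs) in runs.items():
    kib_per_s = iops * bs / 1024.0
    print("%-8s ~%.0f KiB/s" % (name, kib_per_s))
```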