On 8/13/2011 6:53 AM, Dion Kant wrote:
> Stan,
>
> You are right, with bs=4096 the write performance improves
> significantly. From the man page of dd I concluded that not specifying
> bs selects ibs=512 and obs=512. A bs=512 gives indeed similar
> performance as not specifying bs at all.
>
> When observing the system with vmstat I see the same (strange)
> behaviour for no bs specified, or bs=512:
>
> root@dom0-2:~# vmstat 2
> procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
>  r  b   swpd    free   buff  cache   si   so    bi    bo    in    cs us sy  id wa
>  1  0      0 6314620 125988  91612    0    0     0     3     5     5  0  0 100  0
>  1  1      0 6265404 173744  91444    0    0 23868    13 18020 12290  0  0  86 14
>  2  1      0 6214576 223076  91704    0    0 24666     1 18596 12417  0  0  90 10
>  0  1      0 6163004 273172  91448    0    0 25046     0 18867 12614  0  0  89 11
>  1  0      0 6111308 323252  91592    0    0 25042     0 18861 12608  0  0  92  8
>  0  1      0 6059860 373220  91648    0    0 24984     0 18821 12578  0  0  85 14
>  0  1      0 6008164 423304  91508    0    0 25040     0 18863 12611  0  0  95  5
>  2  1      0 5956344 473468  91604    0    0 25084     0 18953 12630  0  0  95  5
>  0  1      0 5904896 523548  91532    0    0 25038     0 18867 12607  0  0  87 13
>  0  1      0 5896068 528680  91520    0    0  2558 99597  2431  1373  0  0  92  8
>  0  2      0 5896088 528688  91520    0    0     0 73736   535   100  0  0  86 13
>  0  1      0 5896128 528688  91520    0    0     0 73729   545    99  0  0  88 12
>  1  0      0 6413920  28712  91612    0    0    54  2996   634   372  0  0  95  4
>  0  0      0 6413940  28712  91520    0    0     0     0    78    80  0  0 100  0
>  0  0      0 6413940  28712  91520    0    0     0     0    94    97  0  0 100  0
>
> Remarkable behaviour in the sense that there is a lot of bi in the
> beginning and finally I see bo at 75 MB/s.
That might be due to massive merges, but I'm not really a kernel hacker
so I can't say for sure.

> With obs=4096 it looks like
>
> root@dom0-2:~# vmstat 2
> procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
>  r  b   swpd    free   buff  cache   si   so    bi    bo    in    cs us sy  id wa
>  1  0      0 6413600  28744  91540    0    0     0     3     5     5  0  0 100  0
>  1  0      0 6413724  28744  91540    0    0     0     0   103    96  0  0 100  0
>  1  0      0 6121616 312880  91208    0    0     0    18   457   133  1  2  97  0
>  0  1      0 5895588 528756  91540    0    0     0 83216   587    88  1  3  90  6
>  0  1      0 5895456 528756  91540    0    0     0 73728   539    98  0  0  92  8
>  0  3      0 5895400 528760  91536    0    0     0 73735   535    93  0  0  86 14
>  1  0      0 6413520  28788  91436    0    0    54 19359   783   376  0  0  93  6
>  0  0      0 6413544  28788  91540    0    0     0     2   100    84  0  0 100  0
>  0  0      0 6413544  28788  91540    0    0     0     0    86    87  0  0 100  0
>  0  0      0 6413552  28796  91532    0    0     0    10   110   113  0  0 100  0
>
> As soon as I select a bs which is not a whole multiple of 4096, I get a
> lot of block input and a bad performance for writing data to disk.
> I'll try to Google your mentioned thread(s) on this. I still feel not
> very satisfied with your explanation though.

My explanation to you wasn't fully correct. I confused specifying no
block size with specifying an insanely large block size. The other post
I was referring to dealt with people using a 1GB (or larger) block size
because it made the math easier for them when wanting to write a large
test file. Instead of dividing their total file size by 4096 and using
the result for "bs=4096 count=X" (which is the proper method I described
to you), they were simply specifying, for example, "bs=2G count=1" to
write a 2 GB test file. Doing this causes the massive buffering I
described and, consequently, horrible performance, typically worse by a
factor of 10 or more, depending on the specific system.

The horrible performance with bs=512 is likely due to the LVM block size
being 4096, forcing block writes that are 1/8th the normal size and
causing lots of merging. If you divide 120 MB/s by 8 you get 15 MB/s,
which, IIRC from your original post, is approximately the write
performance you were seeing (19 MB/s).

If my explanation doesn't seem thorough enough, that's because I'm not a
kernel expert. I just have a little better than average
knowledge/understanding of some aspects of the kernel. If you want a
really good explanation of the reasons behind this dd block size
behavior while writing to a raw LVM device, try posting to lkml proper
or one of the sub-lists dealing with LVM and the block layer. Also, I'm
sure some of the expert developers on the XFS list could answer this as
well, though it would be a little OT there, unless of course your
filesystem test yielding the 120 MB/s was using XFS. ;)

-- 
Stan
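
P.S. To make the count arithmetic concrete, here is a rough sketch of
the two approaches for a 2 GB test file. The target /dev/vg0/test below
is only a placeholder, not a path from your setup:

  # 2G in dd means 2*1024^3 bytes, and 2*1024^3 / 4096 = 524288,
  # so this writes 2 GB as many small writes that line up with the
  # 4096 byte block size
  dd if=/dev/zero of=/dev/vg0/test bs=4096 count=524288

  # the "easy math" variant: the same 2 GB as one giant block, which
  # triggers the massive buffering (and poor throughput) described above
  dd if=/dev/zero of=/dev/vg0/test bs=2G count=1

Adjust the path and sizes for your own volumes, of course.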