Hi,
Robert Luberda wrote:
Forwarding yet another bug report.
----- Forwarded message from Gabor Gombas <[EMAIL PROTECTED]> -----
I just noticed that "iostat -x -d 2" reports bogus values for avgqu-sz
and %util:
Device:  rrqm/s  wrqm/s   r/s   w/s  rsec/s  wsec/s  rkB/s  wkB/s  avgrq-sz   avgqu-sz  await   svctm   %util
sda        0.00    5.00  0.00  6.00    0.00   88.00   0.00  44.00     14.67  978275.89   8.67  166.75  100.05
sdb        0.00    5.00  0.00  6.00    0.00   88.00   0.00  44.00     14.67  978275.87   4.58  166.75  100.05
md0        0.00    0.00  0.00  5.50    0.00   44.00   0.00  22.00      8.00       0.00   0.00    0.00    0.00
md4        0.00    0.00  0.00  0.00    0.00    0.00   0.00   0.00      0.00       0.00   0.00    0.00    0.00
md3        0.00    0.00  0.00  0.00    0.00    0.00   0.00   0.00      0.00       0.00   0.00    0.00    0.00
md2        0.00    0.00  0.00  1.50    0.00   12.00   0.00   6.00      8.00       0.00   0.00    0.00    0.00
md1        0.00    0.00  0.00  0.00    0.00    0.00   0.00   0.00      0.00       0.00   0.00    0.00    0.00
sdc        0.00    0.00  0.00  0.00    0.00    0.00   0.00   0.00      0.00       0.00   0.00    0.00    0.00
AMD64 has similar problems, only the numbers are larger:
Device:  rrqm/s  wrqm/s   r/s   w/s  rsec/s  wsec/s  rkB/s  wkB/s  avgrq-sz             avgqu-sz  await  svctm   %util
sda        0.00    0.00  0.00  0.00    0.00    0.00   0.00   0.00      0.00  9269720640049696.00   0.00   0.00  100.55
sdb        0.00    0.00  0.00  0.00    0.00    0.00   0.00   0.00      0.00  9269720640049696.00   0.00   0.00  100.55
sdc        0.00    0.00  0.00  0.00    0.00    0.00   0.00   0.00      0.00  9269720640049696.00   0.00   0.00  100.55
sdd        0.00    0.00  0.00  0.00    0.00    0.00   0.00   0.00      0.00  9269720640049696.00   0.00   0.00  100.55
md0        0.00    0.00  0.00  0.00    0.00    0.00   0.00   0.00      0.00                 0.00   0.00   0.00    0.00
md3        0.00    0.00  0.00  0.00    0.00    0.00   0.00   0.00      0.00                 0.00   0.00   0.00    0.00
md2        0.00    0.00  0.00  0.00    0.00    0.00   0.00   0.00      0.00                 0.00   0.00   0.00    0.00
md1        0.00    0.00  0.00  0.00    0.00    0.00   0.00   0.00      0.00                 0.00   0.00   0.00    0.00
I wonder whether this problem is linked to the following Linux kernel
bug, which is described in the sysstat FAQ:
*** BEGIN FAQ ***
3.6. iostat -x displays huge numbers for some fields...
Because of a Linux kernel bug, iostat -x may display huge I/O response times
(svctm) and a bandwidth utilization (%util) of 100% for some devices. Indeed
these devices have a value for the field #9 (beginning after the device name)
in /proc/{partitions,diskstats} which is always different from 0, and even
negative sometimes. Yet this field should go to zero, since it gives the
number of I/Os currently in progress (it is incremented as requests are
submitted, and decremented as they finish).
To (temporarily) solve the problem, you should reboot your system to reset the
counters in /proc/{partitions,diskstats}.
*** END FAQ ***
This could explain why we get such a value for %util (>100%).
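As a rough illustration of how a stuck in-flight counter could push %util
to, or just past, 100%: the sketch below assumes that %util is the growth
of the device busy-time counter (in milliseconds) divided by the elapsed
time. The function name and the sample numbers are made up for the example
and are not taken from the iostat source.

```python
def util_percent(prev_busy_ms, curr_busy_ms, interval_ms):
    """Percentage of the interval during which the device counted as busy."""
    return (curr_busy_ms - prev_busy_ms) / interval_ms * 100.0


# Hypothetical samples: the busy counter keeps ticking for the whole
# interval because the in-flight count never drops back to zero.
print(f"{util_percent(10_000, 12_001, 2_000):.2f}")  # busy 2001 ms of 2000 ms
```

Under that assumption, a busy counter that advances 2001 ms over a nominal
2000 ms interval gives 100.05, so a small timing mismatch between the two
measurements would be enough to end up slightly above 100%.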
Gabor: could you please send me the contents of your /proc/diskstats
file so that I can check it?
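For anyone who wants to run the same check locally, here is a minimal
sketch that lists the devices whose field #9 is non-zero. It assumes the
2.6 /proc/diskstats layout (major, minor, device name, then 11 counters);
the function name is hypothetical. A non-zero value on a busy disk is
normal, so it only hints at the bug if it stays non-zero, or is negative,
while the device is idle.

```python
def nonzero_in_flight(diskstats_text):
    """Return {device: in_flight} for devices whose field #9 is non-zero."""
    result = {}
    for line in diskstats_text.splitlines():
        tokens = line.split()
        # Assumed layout: major, minor, name, then at least 11 counters.
        # Shorter lines (e.g. partition entries with 4 counters) are skipped.
        if len(tokens) < 14:
            continue
        name = tokens[2]
        in_flight = int(tokens[11])  # field #9 counted after the device name
        if in_flight != 0:
            result[name] = in_flight
    return result


if __name__ == "__main__":
    try:
        with open("/proc/diskstats") as f:
            print(nonzero_in_flight(f.read()))
    except OSError:
        print("no readable /proc/diskstats on this system")
```

Running it a few times on an idle machine and seeing the same device
listed every time would match the symptom described in the FAQ.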
PS: Note that a problem with huge avgqu-sz values was also reported on
LKML for 64-bit machines.
Although iostat could have been changed to cope with this problem, it was
decided to fix the kernel's disk_stats structure instead (a patch from Ben
Woodard, which was eventually included in 2.6.17-rc1).
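To give a feel for where numbers of that size can come from, the sketch
below assumes that avgqu-sz is the growth of the weighted I/O time counter
(in milliseconds) divided by the elapsed milliseconds, and that the
difference is taken in unsigned arithmetic. If the counter appears to step
backwards, even slightly, the unsigned difference wraps around. This only
illustrates the arithmetic; it is not the actual iostat or kernel code, and
the function name and values are hypothetical.

```python
def avg_queue_size(prev_ms, curr_ms, interval_ms, counter_bits=64):
    """Average queue size from two samples of a weighted-time counter."""
    # Emulate an unsigned subtraction of the given width: a counter that
    # seems to go backwards wraps to a value close to 2**counter_bits.
    delta = (curr_ms - prev_ms) % (1 << counter_bits)
    return delta / interval_ms


print(avg_queue_size(5_000, 9_000, 2_000))           # normal case
print(f"{avg_queue_size(5_000, 4_999, 2_000):.3e}")  # counter went back by 1
```

With a 2-second interval the wrapped 64-bit case comes out at roughly
9.2e15, which is the same order of magnitude as the avgqu-sz values in the
AMD64 listing.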
Regards,
--
Sébastien Godard (sysstat <at> wanadoo.fr)
http://perso.wanadoo.fr/sebastien.godard/