> Yeah, it has a BBU, and it is charged and on..
> Very odd behavior, I'm stumped.

I advise double-checking raid volume settings and ensuring that policy
is truly such that the write cache is used. This may also be a
function of kernel driver settings depending on what RAID
controller/kernel version you have (for example make sure that an
fsync() doesn't result in asking the RAID controller to flush caches
regardless of BBU state). In any case, these stats:

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s
avgrq-sz avgqu-sz   await  svctm  %util
cciss/c0d0        0.00   908.50    0.00  110.50     0.00  8152.00
73.77     0.57    5.20   4.93  54.50

... are highly inconsistent with write-back caching under the
assumption that the writes are indeed the sequential writes to the
commit log. Write counts in the ~ 100/second ballpark, with no reads,
average request size of 74, an average transaction time of 4.93 and a
utilization of 54% *really really* sounds like the I/O is going all
the way down to the platter. Either that or the RAID firmware/driver
is not doing its job properly.

I've attached a small script (note: UGLY hack since I just whipped it
up, not a proper tool, but it does the job for this) that you can run
to test it:

   ./writelat.py /path/to/file_to_write_to # warning, file will be
truncated/destroyed if it exists

If you run this on an idle system and you're truly doing write-back
caching, you should see numbers BELOW 1 (i.e., sub-millisecond times)
(but you can ignore the first sample probably).

It writes one 8kb chunk of data, fsync:s, writes another 8kb, fsyncs,
etc. I predict you'll see numbers in the 3-7 millisecond range.

Sample output for me with a plain old SATA drive (and ZFS on FreeBSD) follows.

33.8141918182
8.32605361938
8.44812393188
8.44788551331
8.40210914612
8.4490776062
8.38303565979
8.57901573181
8.20922851562
8.21614265442
10.0581645966
8.37683677673
8.50605964661
8.376121521
9.86790657043
8.43715667725
8.45789909363
8.40520858765
8.4171295166
8.46195220947
8.41498374939
8.46099853516
8.44287872314
8.43000411987
8.455991745

-- 
/ Peter Schuller
#!/usr/bin/env python

# ugly hack, reader beware

import os
import sys
import time

COUNT = 25

BLOCK = ''.join('x' for _ in xrange(8*1024))

f = file(sys.argv[1], 'w')
for _ in xrange(COUNT):
    start_time = time.time()
    f.write(BLOCK)
    f.flush()
    os.fsync(f.fileno())
    stop_time = time.time()
    # uncomment to test-rewriting the same block repeatedly
    # f.seek(0)
    print (stop_time - start_time) * 1000
f.close()

Reply via email to