There are a few ideas that come to mind. Here, roughly in the order of
your postings, are some suggested lines of investigation:
> I ran the default bonnie++ test suite on a single disk (no RAID) then
> again on a two-disk RAID0 for each
Were the RAID0 sets stable, that is, did the RAID controller report
them as being in working condition?
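Depending on the controller you can also confirm this from the OS side;
for example (the device names here are only examples, adjust to your
setup):

    bioctl sd0              # OpenBSD, controllers supported by bio(4)
    mfiutil show volumes    # FreeBSD, LSI MegaRAID (mfi) controllers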
Also, you mentioned wanting stable (elegant, coherent, reliable)
performance for the machine, but did not choose RAID1 sets; was RAID0
chosen only for the performance tests?
> I ran these tests for various partition sizes.
Was the partitioning exactly the same between FreeBSD/OpenBSD and
between the 64-bit and 32-bit versions, especially the offset? Recall
that disk throughput varies by about 50% between the inside of the disk
(slower) and the outside of the disk (faster; track 0 is on the
outside). The bits-per-mm along a track is roughly the same across the
surface, so the longer circumference at the outside gives a much higher
bits-per-second rate.
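A quick way to compare the offsets (again, device names are examples):

    disklabel sd0      # OpenBSD: each partition's offset and size
    gpart show ada0    # FreeBSD: partition offsets and sizes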
bonnie++ results, single disk (extracted):

    OpenBSD 5.4 amd64     50412 kbyte/sec write      9020 kbyte/sec read
    FreeBSD 9.2 amd64     70558 kbyte/sec write     45458 kbyte/sec read
This is a 64GB file on an 8GB memory box, so it is a fair test.

Then:
    OpenBSD 5.4 i386     122454 kbyte/sec write    126183 kbyte/sec read
    FreeBSD 9.2 i386     124706 kbyte/sec write    130154 kbyte/sec read
These were run with 4GB files on a 4GB box (the limit at 32 bits); not
really fair, since there could be some BSD write caching going on. You
didn't say whether the i386 tests were single-disk or RAID0; do you
recall which?
Just accepting these as fair, both FreeBSD and OpenBSD then show a
32-bit performance anomaly, as the disk IO should be much closer
between 32-bit and 64-bit. I would expect initiating and controlling
SCSI IO to be fairly consistent irrespective of the bit-ness of the OS.
You might try to repeat the single-disk test with a 64GB partition and
a 16GB file size.
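Something like this (the mount point is a placeholder; -s and -r are in
MB, and -u is needed if you run it as root):

    bonnie++ -d /mnt/test -s 16384 -r 8192 -u root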
The 32-bit vs 64-bit performance difference, assuming the partitioning
and controller/RAID config is constant, points at the system
configuration when running, e.g. memory controller policies, caches
enabled, etc., as set up by the *BSD processor-specific code.
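A cheap first check is to boot each kernel and compare the CPU and
cache lines each prints at boot, e.g.:

    dmesg | grep -i -e cpu -e cache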
If you can reboot into both 32-bit and 64-bit, compare the memory
bandwidth of the two widths (tcpbench(1) to localhost and/or download
and compile the STREAM memory benchmark). This might support the idea
that the problem is outside the RAID controller.
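Roughly like this (the stream.c URL is from memory, so verify it; on
FreeBSD use fetch(1) rather than ftp(1)):

    tcpbench -s &           # server on localhost
    tcpbench localhost      # client; compare reported throughput

    ftp https://www.cs.virginia.edu/stream/FTP/Code/stream.c
    cc -O2 stream.c -o stream && ./stream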
Getting back to the RAID controller: how you configure a single disk
can vary. With the competition (Dell PERC), setting single disks to
"RAID-0" with no caching is a signal to the controller to pass through
all I/O, resulting in higher throughput. That would not explain 9
MB/sec read speeds, however. For those, there is some unhappy
interaction between the RAID controller and the OS: the average IO size
is much smaller for the same request rate, or caching by the controller
or drive is ineffective, or coalescing of IOs is not happening.
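To tell which of those it is, watch the average transfer size while a
bonnie++ run is in progress (average IO size is roughly KB/s divided by
transfers/s); for example:

    iostat -x 1        # FreeBSD: per-device r/s, w/s, kr/s, kw/s
    systat iostat      # OpenBSD: live per-disk throughput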
How you configure RAID0 also counts: for the best IO you need the
stripe size to be different from the average IO size, so that most
transfers are split in two and each disk is active at the same time.
So for comparison purposes the RAID0 configs have to be the same.
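For example, with 128 KB average IOs and a 64 KB stripe, a typical
request spans both disks (first 64 KB on disk 0, next 64 KB on disk 1)
and both spindles transfer in parallel; with a 256 KB stripe, most
128 KB requests land on a single disk and you get roughly single-disk
throughput.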
That's a few ideas anyways.
--John