On Friday, June 15, 2012 12:28:59 am Charles Owens wrote:
Hello FreeBSD folk,
We're seeing what appears to be a storage performance regression as we
try to move from 8.1 (i386) to 8.3. We looked at 8.2 also and it
appears that the regression happened between 8.1 and 8.2.
Our system is an Intel S5520UR Server with 12 GB RAM, dual 4-core CPUs.
Storage is a LSI MegaSAS 1078 controller (mfi) in a RAID-10
configuration, using UFS + geom_journal for filesystem.
Postgresql performance, as seen via pgbench, dropped by approx 20%.
This testing was done with our usual PAE-enabled kernels. We then went
back to GENERIC kernels and did comparisons using "bonnie", results
below. Following that is a kernel boot log.
Notably, we're seeing this regression only with our RAID mfi(4) based
systems. Notably, from looking at FreeBSD source changelogs it appears
that the mfi(4) code has seen some changes since 8.1.
Between 8.1 and 8.2 mfi has not had any significant changes. The only changes
made to sys/dev/mfi were to add a new constant:
svn diff svn+ssh://svn.freebsd.org/base/releng/8.1/sys/dev/mfi
svn+ssh://svn.freebsd.org/base/releng/8.2/sys/dev/mfi
Index: mfireg.h
===================================================================
--- mfireg.h (.../8.1/sys/dev/mfi) (revision 237134)
+++ mfireg.h (.../8.2/sys/dev/mfi) (revision 237134)
@@ -975,7 +975,9 @@
MFI_PD_STATE_OFFLINE = 0x10,
MFI_PD_STATE_FAILED = 0x11,
MFI_PD_STATE_REBUILD = 0x14,
- MFI_PD_STATE_ONLINE = 0x18
+ MFI_PD_STATE_ONLINE = 0x18,
+ MFI_PD_STATE_COPYBACK = 0x20,
+ MFI_PD_STATE_SYSTEM = 0x40
};
union mfi_ld_ref {
The difference in write performance must be due to something else. You
mentioned you are using UFS + gjournal. I think gjournal uses BIO_FLUSH, so I
wonder if this is related:
------------------------------------------------------------------------
r212939 | gibbs | 2010-09-20 19:39:00 -0400 (Mon, 20 Sep 2010) | 61 lines
MFC 212160:
Correct bioq_disksort so that bioq_insert_tail() offers barrier semantic.
Add the BIO_ORDERED flag for struct bio and update bio clients to use it.
The barrier semantics of bioq_insert_tail() were broken in two ways:
o In bioq_disksort(), an added bio could be inserted at the head of
the queue, even when a barrier was present, if the sort key for
the new entry was less than that of the last queued barrier bio.
o The last_offset used to generate the sort key for newly queued bios
did not stay at the position of the barrier until either the
barrier was de-queued, or a new barrier (which updates last_offset)
was queued. When a barrier is in effect, we know that the disk
will pass through the barrier position just before the
"blocked bios" are released, so using the barrier's offset for
last_offset is the optimal choice.
sys/geom/sched/subr_disk.c:
sys/kern/subr_disk.c:
o Update last_offset in bioq_insert_tail().
o Only update last_offset in bioq_remove() if the removed bio is
at the head of the queue (typically due to a call via
bioq_takefirst()) and no barrier is active.
o In bioq_disksort(), if we have a barrier (insert_point is non-NULL),
set prev to the barrier and cur to it's next element. Now that
last_offset is kept at the barrier position, this change isn't
strictly necessary, but since we have to take a decision branch
anyway, it does avoid one, no-op, loop iteration in the while
loop that immediately follows.
o In bioq_disksort(), bypass the normal sort for bios with the
BIO_ORDERED attribute and instead insert them into the queue
with bioq_insert_tail(). bioq_insert_tail() not only gives
the desired command order during insertion, but also provides
barrier semantics so that commands disksorted in the future
cannot pass the just enqueued transaction.
sys/sys/bio.h:
Add BIO_ORDERED as bit 4 of the bio_flags field in struct bio.
sys/cam/ata/ata_da.c:
sys/cam/scsi/scsi_da.c
Use an ordered command for SCSI/ATA-NCQ commands issued in
response to bios with the BIO_ORDERED flag set.
sys/cam/scsi/scsi_da.c
Use an ordered tag when issuing a synchronize cache command.
Wrap some lines to 80 columns.
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c
sys/geom/geom_io.c
Mark bios with the BIO_FLUSH command as BIO_ORDERED.
Sponsored by: Spectra Logic Corporation
------------------------------------------------------------------------
Can you try perhaps commenting out the 'bp->bio_flags |= BIO_ORDERED' line
changed in geom_io.c in 8.2? That would be effectively reverting this
portion of the diff:
Index: geom_io.c
===================================================================
--- geom_io.c (.../8.1/sys/geom) (revision 237134)
+++ geom_io.c (.../8.2/sys/geom) (revision 237134)
@@ -265,6 +265,7 @@
g_trace(G_T_BIO, "bio_flush(%s)", cp->provider->name);
bp = g_alloc_bio();
bp->bio_cmd = BIO_FLUSH;
+ bp->bio_flags |= BIO_ORDERED;
bp->bio_done = NULL;
bp->bio_attribute = NULL;
bp->bio_offset = cp->provider->mediasize;