On Sat, Dec 09, 2017 at 10:14:50PM -0500, Mark Johnston wrote:
> On Sat, Dec 09, 2017 at 07:36:59PM -0700, Warner Losh wrote:
> > On Sat, Dec 9, 2017 at 11:03 AM, Andriy Gapon <a...@freebsd.org> wrote:
> >
> > > On 09/12/2017 17:44, Mark Johnston wrote:
> > > > Some GEOMs do not appear to handle BIO_ORDERED correctly, meaning
> > > > that the barrier write may not work as intended.
> > >
> > > There are a few places we send down a BIO_ORDERED BIO_FLUSH command
> > > (see softdep_synchronize for one). Will those matter?
>
> Some classes have separate handling for BIO_FLUSH, so it depends. I
> think gmirror's handling is buggy independent of BIO_ORDERED:
> g_mirror_start() sends BIO_FLUSH commands directly to the mirrors,
> while reads and writes are queued for handling by the gmirror worker
> thread. So as far as I can tell, a BIO_WRITE which arrives at the
> gmirror provider before a BIO_FLUSH might be sent to the mirrors
> after that BIO_FLUSH. I would expect BIO_FLUSH to implicitly have
> something like release semantics, i.e., a BIO_FLUSH shouldn't be
> reordered with a BIO_WRITE that preceded it. But I might be
> misunderstanding.
>
> > As I've noted elsewhere: I'd really like to kill BIO_ORDERED since it
> > has too many icky effects (and BIO_FLUSH + BIO_ORDERED isn't guaranteed
> > to do, well, anything, since it can turn into a NOP in a number of
> > places). Plus many of the implementations of BIO_ORDERED assume the
> > drive is like SCSI and you just set the right tag to make it 'ordered'.
> > For ATA we issue a non-NCQ command, which is a full drain of outstanding
> > commands, send this command, then start them again, which really shuts
> > down the parallelism we implemented NCQ for :(. We do similar for NVME,
> > which is even worse. There we have multiple submission queues in the
> > hardware. To simulate it, we do a similar drain, but that's going to get
> > in the way as we move to NUMA and systems where we try to do the I/O
> > entirely on one CPU (both submission and completion), and ordered I/O is
> > guaranteed lock contention.
>
> Independent of this, it doesn't really look like we have any way of
> handling write errors when dependencies are enforced using BIO_ORDERED.
> In the case of the babarrierwrite() consumer in FFS, what happens if the
> inode block write fails due to a transient error, but the following CG
> update succeeds?

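To make the hazard above a bit more concrete, here's a small userspace toy
that mimics the shape of the dispatch I was describing. None of it is real
GEOM or gmirror code; the queue, the worker, and the names are all made up
for illustration. "Writes" are queued for a worker thread while a "flush"
bypasses the queue, so the flush can reach the mirror ahead of a write that
was submitted before it:

/*
 * Toy model of the ordering hazard, not real GEOM or gmirror code.
 * Writes are queued for a worker thread; a flush bypasses the queue,
 * so it can be issued before an earlier write.
 */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static const char *queue[16];		/* pending "BIO_WRITE" requests */
static int qlen;
static pthread_mutex_t qlock = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for handing a request to the underlying mirror. */
static void
send_to_mirror(const char *req)
{

	printf("issued to mirror: %s\n", req);
}

/* Worker thread: drains queued writes, possibly much later. */
static void *
worker(void *arg)
{

	(void)arg;
	sleep(1);
	pthread_mutex_lock(&qlock);
	for (int i = 0; i < qlen; i++)
		send_to_mirror(queue[i]);
	qlen = 0;
	pthread_mutex_unlock(&qlock);
	return (NULL);
}

/* Start routine shaped like the problem: writes are queued, flushes aren't. */
static void
toy_start(const char *req, int is_flush)
{

	if (is_flush) {
		send_to_mirror(req);	/* bypasses the queue entirely */
		return;
	}
	pthread_mutex_lock(&qlock);
	queue[qlen++] = req;
	pthread_mutex_unlock(&qlock);
}

int
main(void)
{
	pthread_t tid;

	pthread_create(&tid, NULL, worker, NULL);
	toy_start("BIO_WRITE offset=0", 0);	/* submitted first... */
	toy_start("BIO_FLUSH", 1);		/* ...issued first anyway */
	pthread_join(tid, NULL);
	return (0);
}

Build with -lpthread: the flush is issued before the write that preceded it,
which is exactly the window described above.
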
Looking at this some more, I think iosched is ok, since bioq_disksort()
does in fact handle BIO_ORDERED (rough sketch of the idea below). I'm
concerned, though, that there are places in the tree where we should be
using BIO_ORDERED but aren't: gmirror metadata writes, for instance, are
performed synchronously but should not be reordered with preceding
BIO_WRITEs.

Anyway, I've posted a review for some gmirror I/O ordering fixes here, if
anyone is interested: https://reviews.freebsd.org/D13559
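
For reference, the property I'm relying on from bioq_disksort() is roughly
this: a BIO_ORDERED request goes to the tail of the queue and becomes a
barrier that later sorted insertions are not allowed to pass. Below is a
simplified userspace sketch of that idea; it is not the code in
sys/kern/subr_disk.c, and the structures, the flag value, and the plain
offset sort are pared down/invented for illustration:

/*
 * Simplified sketch of BIO_ORDERED-aware queueing, in the spirit of
 * bioq_disksort().  Not the real implementation; details are invented.
 */
#include <sys/queue.h>
#include <stdio.h>

#define	BIO_ORDERED	0x1	/* flag value made up for the toy */

struct bio {
	int		bio_flags;
	long		bio_offset;
	TAILQ_ENTRY(bio) bio_queue;
};

struct bioq {
	TAILQ_HEAD(, bio) queue;
	struct bio	*insert_point;	/* most recent barrier, if any */
};

static void
bioq_init(struct bioq *head)
{

	TAILQ_INIT(&head->queue);
	head->insert_point = NULL;
}

/*
 * Ordered requests go to the tail and become a barrier; unordered
 * requests are offset-sorted, but never ahead of the barrier.
 */
static void
bioq_insert(struct bioq *head, struct bio *bp)
{
	struct bio *cur, *prev;

	if (bp->bio_flags & BIO_ORDERED) {
		TAILQ_INSERT_TAIL(&head->queue, bp, bio_queue);
		head->insert_point = bp;
		return;
	}
	prev = head->insert_point;	/* start the sort after the barrier */
	cur = (prev != NULL) ? TAILQ_NEXT(prev, bio_queue) :
	    TAILQ_FIRST(&head->queue);
	while (cur != NULL && cur->bio_offset <= bp->bio_offset) {
		prev = cur;
		cur = TAILQ_NEXT(cur, bio_queue);
	}
	if (prev != NULL)
		TAILQ_INSERT_AFTER(&head->queue, prev, bp, bio_queue);
	else
		TAILQ_INSERT_HEAD(&head->queue, bp, bio_queue);
}

int
main(void)
{
	struct bioq q;
	struct bio w1 = { 0, 100 }, barrier = { BIO_ORDERED, 0 }, w2 = { 0, 50 };
	struct bio *bp;

	bioq_init(&q);
	bioq_insert(&q, &w1);		/* write at offset 100 */
	bioq_insert(&q, &barrier);	/* ordered request */
	bioq_insert(&q, &w2);		/* offset 50, arrives after the barrier */
	TAILQ_FOREACH(bp, &q.queue, bio_queue)
		printf("offset %ld%s\n", bp->bio_offset,
		    (bp->bio_flags & BIO_ORDERED) ? " (ordered)" : "");
	return (0);
}

In the toy, the write at offset 50 submitted after the ordered request stays
behind it, even though a pure offset sort would have moved it ahead.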