Re: Proposal for "proper" durable fsync() and fdatasync()

Jeff Garzik Tue, 26 Feb 2008 09:58:31 -0800

Jamie Lokier wrote:

Jeff Garzik wrote:
Nick Piggin wrote:
Anyway, the idea of making fsync/fdatasync etc. safe by default is
a good idea IMO, and is a bad bug that we don't do that :(
Agreed... it's also disappointing that [unless I'm mistaken] you haveto hack each filesystem to support barriers.
It seems far easier to make sync_blkdev() Do The Right Thing, andmagically make all filesystems data-safe.
Well, you need ordered metadata writes, barriers _and_ flushes with
some filesystems.

Merely writing all the data pages than issuing a drive cache flush
won't Do The Right Thing with those filesystems - someone already
mentioned Btrfs, where it won't.

Oh certainly. That's why we have a VFS :) fsync for NFS will lookquite different, too.

But I agree that your suggestion would make a superb default, for
filesystems which don't provide their own function.


Yep.  That would immediately cover a bunch of filesystems.

It's not optimal even then.

  Devices: On a software RAID, you ideally don't want to issue flushes
  to all drives if your database did a 1 block commit entry.  (But they
  probably use O_DIRECT anyway, changing the rules again).  But all that
  can be optimised in generic VFS code eventually.  It doesn't need
  filesystem assistance in most cases.

My own idea is that we create a FLUSH command for blkdev request queues,to exist alongside READ, WRITE, and the current barrier implementation.Then FLUSH could be passed down through MD or DM.


        Jeff


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Proposal for "proper" durable fsync() and fdatasync()

Reply via email to