Re: [zfs-discuss] Implementing fbarrier() on ZFS

2007-02-13 Thread Roch - PAE
> Given ZFS's copy-on-write transactional model, would it not be almost
> trivial to implement fbarrier()? Basically just choose to wrap up the
> transaction at the point of fbarrier() and that's it.
>
> Am I missing something?

How do you guarantee that the disk driver and/or the

Re: [zfs-discuss] Implementing fbarrier() on ZFS

2007-02-13 Thread Peter Schuller
> > That is interesting. Could this account for disproportionate kernel
> > CPU usage for applications that perform I/O one byte at a time, as
> > compared to other filesystems? (Nevermind that the application
> > shouldn't do that to begin with.)
>
> I just quickly measured this (overwriting
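A minimal reproduction of the kind of test being discussed, assuming
nothing about the original benchmark beyond "overwrite a file one byte
per write(2) call" (the function name, path, and sizes are invented for
the example):

    #include <fcntl.h>
    #include <unistd.h>

    /* Overwrite nbytes of an existing file one byte per syscall;
     * at this granularity, per-write CPU overhead dominates, which
     * is what the kernel-CPU comparison above is probing. */
    int
    overwrite_bytewise(const char *path, size_t nbytes)
    {
        int fd = open(path, O_WRONLY);
        char c = 'x';

        if (fd < 0)
            return (-1);
        for (size_t i = 0; i < nbytes; i++) {
            if (write(fd, &c, 1) != 1) {
                (void) close(fd);
                return (-1);
            }
        }
        return (close(fd));
    }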

Re: [zfs-discuss] Implementing fbarrier() on ZFS

2007-02-13 Thread Roch - PAE
Peter Schuller writes:
> > I agree about the usefulness of fbarrier() vs. fsync(), BTW. The cool
> > thing is that on ZFS, fbarrier() is a no-op. It's implicit after
> > every system call.
>
> That is interesting. Could this account for disproportionate kernel
> CPU usage for applications

Re: [zfs-discuss] Implementing fbarrier() on ZFS

2007-02-12 Thread Jeff Bonwick
> That is interesting. Could this account for disproportionate kernel
> CPU usage for applications that perform I/O one byte at a time, as
> compared to other filesystems? (Nevermind that the application
> shouldn't do that to begin with.)

No, this is entirely a matter of CPU efficiency in the current c

Re: [zfs-discuss] Implementing fbarrier() on ZFS

2007-02-12 Thread Peter Schuller
> I agree about the usefulness of fbarrier() vs. fsync(), BTW. The cool
> thing is that on ZFS, fbarrier() is a no-op. It's implicit after
> every system call.

That is interesting. Could this account for disproportionate kernel
CPU usage for applications that perform I/O one byte at a time, as c

Re: [zfs-discuss] Implementing fbarrier() on ZFS

2007-02-12 Thread Peter Schuller
> That said, actually implementing the underlying mechanisms may not be
> worth the trouble. It is only a matter of time before disks have fast
> non-volatile memory like PRAM or MRAM, and then the need to do
> explicit cache management basically disappears.

I meant fbarrier() as a syscall expose

Re: [zfs-discuss] Implementing fbarrier() on ZFS

2007-02-12 Thread Jeff Bonwick
> Do you agree that there is a major tradeoff of "builds up a wad of
> transactions in memory"?

I don't think so. We trigger a transaction group commit when we have
lots of dirty data, or 5 seconds elapse, whichever comes first. In
other words, we don't let updates get stale.

Jeff
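Jeff's dual trigger reduces to a simple predicate. The sketch below is
illustrative logic only, not the actual ZFS code; the names
(txg_maybe_commit, txg_commit, dirty_bytes, DIRTY_LIMIT) are invented
for the example:

    #include <stdint.h>
    #include <time.h>

    #define TXG_TIMEOUT_SEC 5                 /* per Jeff's post */
    #define DIRTY_LIMIT     (64ULL << 20)     /* arbitrary threshold */

    extern void txg_commit(void);             /* hypothetical */

    /* Commit when dirty data piles up OR the timer expires,
     * whichever comes first, so updates never sit in memory
     * indefinitely. */
    static void
    txg_maybe_commit(uint64_t dirty_bytes, time_t last_commit)
    {
        if (dirty_bytes >= DIRTY_LIMIT ||
            time(NULL) - last_commit >= TXG_TIMEOUT_SEC)
            txg_commit();
    }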

Re: [zfs-discuss] Implementing fbarrier() on ZFS

2007-02-12 Thread Erblichs
Jeff Bonwick,

Do you agree that there is a major tradeoff of "builds up a wad of
transactions in memory"? We lose the changes if we have an unstable
environment. Thus, I don't quite understand why a 2-phase approach to
commits isn't done. First, t

Re: [zfs-discuss] Implementing fbarrier() on ZFS

2007-02-12 Thread Jeff Bonwick
Toby Thain wrote:
> I'm no guru, but would not ZFS already require strict ordering for its
> transactions ... which property Peter was exploiting to get
> "fbarrier()" for free?

Exactly. Even if you disable the intent log, the transactional nature
of ZFS ensures preservation of event ordering. Not

Re: [zfs-discuss] Implementing fbarrier() on ZFS

2007-02-12 Thread Chris Csanady
2007/2/12, Frank Hofmann <[EMAIL PROTECTED]>:
On Mon, 12 Feb 2007, Chris Csanady wrote:
> This is true for NCQ with SATA, but SCSI also supports ordered tags,
> so it should not be necessary.
>
> At least, that is my understanding.

Except that ZFS doesn't talk SCSI, it talks to a target driver.

Re: [zfs-discuss] Implementing fbarrier() on ZFS

2007-02-12 Thread Frank Hofmann
On Mon, 12 Feb 2007, Toby Thain wrote:
[ ... ]
> I'm no guru, but would not ZFS already require strict ordering for its
> transactions ... which property Peter was exploiting to get
> "fbarrier()" for free?

It achieves this by flushing the disk write cache when there's need to
barrier. Which compl
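On Solaris, the write-cache flush Frank refers to is issued as a disk
ioctl; DKIOCFLUSHWRITECACHE from <sys/dkio.h> is what ZFS sends to its
vdevs. A minimal sketch, assuming a raw disk device node (the function
name and error handling are invented for the example):

    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/dkio.h>

    /* Ask the disk to empty its volatile write cache. Until this
     * returns, previously acknowledged writes may still sit in the
     * cache, unordered with respect to later ones. */
    int
    flush_write_cache(const char *rawdev)   /* e.g. a /dev/rdsk node */
    {
        int fd = open(rawdev, O_RDONLY);
        int rv;

        if (fd < 0)
            return (-1);
        rv = ioctl(fd, DKIOCFLUSHWRITECACHE, NULL);
        (void) close(fd);
        return (rv);
    }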

Re: [zfs-discuss] Implementing fbarrier() on ZFS

2007-02-12 Thread Toby Thain
On 12-Feb-07, at 5:55 PM, Frank Hofmann wrote:
> On Mon, 12 Feb 2007, Peter Schuller wrote:
> > Hello,
> >
> > Often fsync() is used not because one cares that some piece of data
> > is on stable storage, but because one wants to ensure the subsequent
> > I/O operations are performed after previous I/O operat

Re: [zfs-discuss] Implementing fbarrier() on ZFS

2007-02-12 Thread Bart Smaalders
Peter Schuller wrote:
> Hello,
>
> Often fsync() is used not because one cares that some piece of data is
> on stable storage, but because one wants to ensure the subsequent I/O
> operations are performed after previous I/O operations are on stable
> storage. In these cases the latency introduced by an f

Re: [zfs-discuss] Implementing fbarrier() on ZFS

2007-02-12 Thread Frank Hofmann
On Mon, 12 Feb 2007, Chris Csanady wrote:
[ ... ]
> Am I missing something? How do you guarantee that the disk driver
> and/or the disk firmware doesn't reorder writes ?

The only guarantee for in-order writes, on actual storage level, is to
complete the outstanding ones before issuing new ones.
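Frank's "complete the outstanding ones before issuing new ones" is the
only portable barrier at this level. A sketch with POSIX AIO that makes
the drain-then-issue step explicit (the function name and bookkeeping
are invented for the example):

    #include <aio.h>
    #include <errno.h>

    /* Wait for every outstanding write to complete, then issue the
     * dependent write; nothing weaker guarantees on-platter order
     * once the driver or firmware is free to reorder. */
    int
    barriered_write(struct aiocb **pending, int npending,
        struct aiocb *next)
    {
        for (int i = 0; i < npending; i++) {
            const struct aiocb *one[1] = { pending[i] };

            while (aio_error(pending[i]) == EINPROGRESS)
                (void) aio_suspend(one, 1, NULL);  /* block until done */
            if (aio_return(pending[i]) < 0)
                return (-1);
        }
        return (aio_write(next));  /* all predecessors are complete */
    }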

Re: [zfs-discuss] Implementing fbarrier() on ZFS

2007-02-12 Thread Chris Csanady
2007/2/12, Frank Hofmann <[EMAIL PROTECTED]>:
On Mon, 12 Feb 2007, Peter Schuller wrote:
> Hello,
>
> Often fsync() is used not because one cares that some piece of data is on
> stable storage, but because one wants to ensure the subsequent I/O operations
> are performed after previous I/O opera

Re: [zfs-discuss] Implementing fbarrier() on ZFS

2007-02-12 Thread Frank Hofmann
On Mon, 12 Feb 2007, Peter Schuller wrote:
> Hello,
>
> Often fsync() is used not because one cares that some piece of data is
> on stable storage, but because one wants to ensure the subsequent I/O
> operations are performed after previous I/O operations are on stable
> storage. In these cases the latency

[zfs-discuss] Implementing fbarrier() on ZFS

2007-02-12 Thread Peter Schuller
Hello,

Often fsync() is used not because one cares that some piece of data is
on stable storage, but because one wants to ensure the subsequent I/O
operations are performed after previous I/O operations are on stable
storage. In these cases the latency introduced by an fsync() is
completely unn
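The pattern Peter describes, in a minimal C sketch. The fsync() variant
is standard POSIX; fbarrier() is the hypothetical interface proposed in
this thread, so the commented alternative does not exist in any libc:

    #include <fcntl.h>
    #include <unistd.h>

    /* fsync() used purely for ordering: the caller pays full
     * stable-storage latency even though it only needs write 'b'
     * to land after write 'a'. */
    int
    ordered_writes(int fd)
    {
        const char a[] = "first";
        const char b[] = "second";

        if (write(fd, a, sizeof (a) - 1) < 0)
            return (-1);
        if (fsync(fd) < 0)          /* blocks until 'a' is stable */
            return (-1);
        if (write(fd, b, sizeof (b) - 1) < 0)
            return (-1);
        /* A hypothetical fbarrier(fd) in place of the fsync() would
         * promise only the ordering, not the stability, and so could
         * return without waiting on the disk at all. */
        return (0);
    }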