Peter Schuller wrote:
> > fsync() is, indeed, expensive. Lots of calls to fsync() that are not
> > necessary for correct application operation EXCEPT as a workaround for
> > lame filesystem re-ordering are a sure way to kill performance.
>
> IMO the fundamental problem is that the only way to ac
Uh, I should probably clarify some things (I was too quick to hit
send):
> IMO the fundamental problem is that the only way to achieve a write
> barrier is fsync() (disregarding direct I/O etc). Again I would just
> like an fbarrier() as I've mentioned on the list previously. It seems
Of course i
> fsync() is, indeed, expensive. Lots of calls to fsync() that are not
> necessary for correct application operation EXCEPT as a workaround for
> lame filesystem re-ordering are a sure way to kill performance.
IMO the fundamental problem is that the only way to achieve a write
barrier is fsync()
> "bf" == Bob Friesenhahn writes:
bf> If ZFS does try to order its disk updates in chronological
bf> order without prioritizing metadata updates over data, then
bf> the risk is minimized.
AIUI it doesn't exactly order them, just puts them into 5-second
chunks. so it rolls the on-
On Thu, 19 Mar 2009, Miles Nordin wrote:
And the guarantees ARE minimal---just:
http://www.google.com/search?q=POSIX+%22crash+consistency%22
and you'll find that even people who disagree with Ts'o and want to
change ext4 still agree POSIX is on Ts'o's side.
Clearly I am guilty of inflated expectations.
> "dm" == David Magda writes:
dm> is this what POSIX actually specifies?
i doubt it. If it did, it would basically mandate a log-structured /
COW filesystem, which, although not a _bad_ idea, is way too far from
a settled debate to be enshrining in a mandatory ``standard'' (ex.,
the dat
POSIX has a Synchronized I/O Data (and File) Integrity Completion
definition (line 115434 of the Issue 7 (POSIX.1-2008) specification).
What it says is that writes for a byte range in a file must complete
before any pending reads for that byte range are satisfied.
It does not say that if you
On Mar 18, 2009, at 12:43, Bob Friesenhahn wrote:
POSIX does not care about "disks" or "filesystems". The only
correct behavior is for operations to be applied in the order that
they are requested of the operating system. This is a core function
of any operating system. It is therefore o
On Wed, March 18, 2009 11:59, Richard Elling wrote:
> Bob Friesenhahn wrote:
>> As it happens, current versions of my own application should be safe
>> from this Linux filesystem bug, but older versions are not. There is
>> even a way to request fsync() on every file close, but that could be
>> qu
On Wed, March 18, 2009 11:43, Bob Friesenhahn wrote:
> On Wed, 18 Mar 2009, Joerg Schilling wrote:
>>
>> The problem in this case is not whether rename() is atomic but whether
>> the file that replaces the old file in an atomic rename() operation is in a
>> stable state on the disk before calli
>On Wed, Mar 18, 2009 at 11:43:09AM -0500, Bob Friesenhahn wrote:
>> In summary, I don't agree with you that the misbehavior is correct,
>> but I do agree that copious expensive fsync()s should be assured to
>> work around the problem.
>
>fsync() is, indeed, expensive. Lots of calls to fsync()
> "c" == Miles Nordin writes:
c> fbarrier()
on second thought that couldn't help this problem. The goal is to
associate writing to the directory (rename) with writing to the file
referenced by that inode/handle (write/fsync/``fbarrier''), and in
POSIX these two things are pretty distan
> "ja" == James Andrewartha writes:
ja> other people are arguing that POSIX says rename(2) is atomic,
Their statement is true but it's NOT an argument against Ts'o, who is
100% right: the applications using that calling sequence for crash
consistency are not portable under POSIX.
atomic
On Wed, 18 Mar 2009, Richard Elling wrote:
Bob Friesenhahn wrote:
As it happens, current versions of my own application should be safe from
this Linux filesystem bug, but older versions are not. There is even a way
to request fsync() on every file close, but that could be quite expensive
so i
On Wed, Mar 18, 2009 at 11:43:09AM -0500, Bob Friesenhahn wrote:
> In summary, I don't agree with you that the misbehavior is correct,
> but I do agree that copious expensive fsync()s should be assured to
> work around the problem.
fsync() is, indeed, expensive. Lots of calls to fsync() that ar
On Wed, Mar 18, 2009 at 11:15:48AM -0400, Moore, Joe wrote:
> Posix doesn't require the OS to sync() the file contents on close for
> local files like it does for NFS access? How odd.
Why should it? If POSIX is agnostic as to system crashes / power
failures, then why should it say anything about
Bob Friesenhahn wrote:
As it happens, current versions of my own application should be safe
from this Linux filesystem bug, but older versions are not. There is
even a way to request fsync() on every file close, but that could be
quite expensive so it is not the default.
Pragmatically, it is
On Wed, March 18, 2009 05:08, Joerg Schilling wrote:
> The problem in this case is not whether rename() is atomic but whether the
> file that replaces the old file in an atomic rename() operation is in a
> stable state on the disk before calling rename().
Good, I was hoping somebody saw it that
On Wed, 18 Mar 2009, Joerg Schilling wrote:
The problem in this case is not whether rename() is atomic but whether the
file that replaces the old file in an atomic rename() operation is in a
stable state on the disk before calling rename().
This topic is quite disturbing to me ...
The callin
>AFAIUI, the ZFS transaction group maintains write ordering, at least as far as
>write()s to the file would be in the ZIL ahead of the rename() metadata updates.
>
>So I think the atomicity is maintained without requiring the application to
>call fsync() before closing the file. If the TXG i
Joerg Schilling wrote:
> James Andrewartha wrote:
> > Recently there's been discussion [1] in the Linux community about how
> > filesystems should deal with rename(2), particularly in the case of a crash.
> > ext4 was found to truncate files after a crash, that had been written with
> > open("foo
James Andrewartha wrote:
> Recently there's been discussion [1] in the Linux community about how
> filesystems should deal with rename(2), particularly in the case of a crash.
> ext4 was found to truncate files after a crash, that had been written with
> open("foo.tmp"), write(), close() and then
>Recently there's been discussion [1] in the Linux community about how
>filesystems should deal with rename(2), particularly in the case of a crash.
>ext4 was found to truncate files after a crash, that had been written with
>open("foo.tmp"), write(), close() and then rename("foo.tmp", "foo"). Thi
Hi all,
Recently there's been discussion [1] in the Linux community about how
filesystems should deal with rename(2), particularly in the case of a crash.
ext4 was found to truncate files after a crash, that had been written with
open("foo.tmp"), write(), close() and then rename("foo.tmp", "foo").
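
For readers following along, here is a rough sketch of the calling sequence
under discussion, with the fsync() that the thread argues over added before
the rename(). It is only an illustration under stated assumptions: the helper
name replace_file(), its arguments and the 0644 mode are made up for this
example and do not come from any of the posts. It shows one way to get the
new contents onto stable storage before the new name replaces the old one; it
makes no attempt to sync the containing directory.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper (not from the thread): write len bytes of buf to
 * tmp_path, ask for them to be flushed to stable storage, then rename
 * tmp_path over path. Returns 0 on success, -1 on failure. */
int replace_file(const char *tmp_path, const char *path,
                 const void *buf, size_t len)
{
    assert(tmp_path != NULL && path != NULL && buf != NULL);

    int fd = open(tmp_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* write() may write fewer bytes than asked for, so loop. */
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            goto fail;
        }
        p += n;
        len -= (size_t)n;
    }

    /* The contested step: without it, the data may still be unwritten
     * when the rename reaches the disk. */
    if (fsync(fd) < 0)
        goto fail;

    if (close(fd) < 0) {
        int saved = errno;
        unlink(tmp_path);
        errno = saved;
        return -1;
    }

    /* rename() replaces the old name atomically; if it fails, the
     * temporary file is left in place for the caller to inspect. */
    return rename(tmp_path, path);

fail:
    {
        int saved = errno;      /* keep the errno of the failing call */
        close(fd);
        unlink(tmp_path);
        errno = saved;
    }
    return -1;
}
```

Dropping the fsync() call gives the open/write/close/rename sequence quoted
above, which is the case the ext4 reports are about.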