Re: [HACKERS] silent data loss with ext4 / all current versions

2016-05-12 Thread Michael Paquier

On Thu, May 12, 2016 at 2:58 PM, Michael Paquier wrote: > On Mon, Mar 28, 2016 at 8:25 AM, Andres Freund wrote: >> I've also noticed that > > Coming back to this issue because... > >> a) pg_basebackup doesn't do anything about durability (it probably needs >>a very similar patch to the one pg

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-05-11 Thread Michael Paquier

On Mon, Mar 28, 2016 at 8:25 AM, Andres Freund wrote: > I've also noticed that Coming back to this issue because... > a) pg_basebackup doesn't do anything about durability (it probably needs >a very similar patch to the one pg_rewind just received). I think that one of the QE tests running

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-27 Thread Michael Paquier

On Mon, Mar 28, 2016 at 8:25 AM, Andres Freund wrote: > On 2016-03-18 15:08:32 +0900, Michael Paquier wrote: >> + fprintf(stderr, _("%s: could not rename file \"%s\": %s\n"), >> + progname, current_walfile_name, >> strerror(errno)); > > current_walfile_name

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-27 Thread Andres Freund

Hi, On 2016-03-18 15:08:32 +0900, Michael Paquier wrote: > +/* > + * Sync data directory to ensure that what has been generated up to now is > + * persistent in case of a crash, and this is done once globally for > + * performance reasons as sync requests on individual files would be > + * a negat

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-19 Thread Michael Paquier

On Wed, Mar 16, 2016 at 2:46 AM, Andres Freund wrote: > On 2016-03-15 15:39:50 +0100, Michael Paquier wrote: >> Yeah, true. We definitely need to do something for that, even for HEAD >> it seems like an overkill to have something in for example src/common >> to allow frontends to have something if

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-19 Thread Andres Freund

On 2016-03-17 23:05:42 +0900, Michael Paquier wrote: > > Are you working on a fix for pg_rewind? Let's go with initdb -S in a > > first iteration, then we can, if somebody is interested enough, work on > > making this nicer in master. > > I am really -1 for this approach. Wrapping initdb -S with > f

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-19 Thread Michael Paquier

On Fri, Mar 18, 2016 at 12:03 AM, Andres Freund wrote: > This is a *much* more expensive approach though. Doing the fsync > directly after modifying the file. One file by one file. Will usually > result in each fsync blocking for a while. > > In comparison of doing a flush and then an fsync pass o

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-18 Thread Robert Haas

On Thu, Mar 17, 2016 at 11:03 AM, Andres Freund wrote: > On 2016-03-17 23:05:42 +0900, Michael Paquier wrote: >> > Are you working on a fix for pg_rewind? Let's go with initdb -S in a >> > first iteration, then we can, if somebody is interested enough, work on >> > making this nicer in master. >> >>

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-15 Thread David Steele

On 3/15/16 10:39 AM, Michael Paquier wrote: > On Thu, Mar 10, 2016 at 4:25 AM, Andres Freund wrote: > >> Note that we currently have some frontend programs with the equivalent >> problem. Most importantly receivelog.c (pg_basebackup/pg_receivexlog) >> are missing pretty much the same directory f

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-15 Thread Andres Freund

On 2016-03-15 15:39:50 +0100, Michael Paquier wrote: > I have finally been able to spend some time reviewing what you pushed > on back-branches, and things are in correct shape I think. One small > issue that I have is that for EXEC_BACKEND builds, in > write_nondefault_variables we still use one i

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-15 Thread Michael Paquier

On Thu, Mar 10, 2016 at 4:25 AM, Andres Freund wrote: > I've finally pushed these, after making a number of mostly cosmetic > fixes. The only of real consequence is that I've removed the durable_* > call from the renames to .deleted in xlog[archive].c - these don't need > to be durable, and are win

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-09 Thread Andres Freund

On 2016-03-07 21:55:52 -0800, Andres Freund wrote: > Here's my updated version. > > Note that I've split the patch into two. One for the infrastructure, and > one for the callsites. I've finally pushed these, after making a number of mostly cosmetic fixes. The only of real consequence is that I'v

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-08 Thread Joshua D. Drake

On 03/08/2016 02:16 PM, Robert Haas wrote: On Mon, Mar 7, 2016 at 10:18 PM, Andres Freund wrote: Instead of "durable" I think that "persistent" makes more sense. I find durable a lot more descriptive. persistent could refer to retrying the rename or something. Yeah, I like durable, too. T

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-08 Thread Andres Freund

On 2016-03-08 23:47:48 +0100, Tomas Vondra wrote: > I've repeated the power-loss testing today. With the patches applied I'm > no longer able to reproduce the issue (despite trying about 10x), while > without them I've hit it on the first try. This is on kernel 4.4.2. Yay, thanks for testing! An

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-08 Thread Tomas Vondra

Hi, On Mon, 2016-03-07 at 21:55 -0800, Andres Freund wrote: > On 2016-03-08 12:26:34 +0900, Michael Paquier wrote: > > On Tue, Mar 8, 2016 at 12:18 PM, Andres Freund wrote: > > > On 2016-03-08 12:01:18 +0900, Michael Paquier wrote: > > >> I have spent a couple of hours looking at that in details,

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-08 Thread Robert Haas

On Mon, Mar 7, 2016 at 10:18 PM, Andres Freund wrote: >> Instead of "durable" I think that "persistent" makes more sense. > > I find durable a lot more descriptive. persistent could refer to > retrying the rename or something. Yeah, I like durable, too. -- Robert Haas EnterpriseDB: http://www.e

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-07 Thread Andres Freund

Hi, On 2016-03-08 16:21:45 +0900, Michael Paquier wrote: > + durable_link_or_rename(tmppath, path, ERROR); > + durable_rename(path, xlogfpath, ERROR); > You may want to add a (void) cast in front of those calls for correctness. "correctness"? This is neatnikism, not correctness. I've actual

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-07 Thread Michael Paquier

On Tue, Mar 8, 2016 at 2:55 PM, Andres Freund wrote: > On 2016-03-08 12:26:34 +0900, Michael Paquier wrote: >> On Tue, Mar 8, 2016 at 12:18 PM, Andres Freund wrote: >> > On 2016-03-08 12:01:18 +0900, Michael Paquier wrote: >> >> I have spent a couple of hours looking at that in details, and the >

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-07 Thread Andres Freund

On 2016-03-08 12:26:34 +0900, Michael Paquier wrote: > On Tue, Mar 8, 2016 at 12:18 PM, Andres Freund wrote: > > On 2016-03-08 12:01:18 +0900, Michael Paquier wrote: > >> I have spent a couple of hours looking at that in details, and the > >> patch is neat. > > > > Cool. Doing some more polishing

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-07 Thread Michael Paquier

On Tue, Mar 8, 2016 at 12:18 PM, Andres Freund wrote: > On 2016-03-08 12:01:18 +0900, Michael Paquier wrote: >> I have spent a couple of hours looking at that in details, and the >> patch is neat. > > Cool. Doing some more polishing right now. Will be back with an updated > version soonish. > > Di

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-07 Thread Andres Freund

Hi, On 2016-03-08 12:01:18 +0900, Michael Paquier wrote: > I have spent a couple of hours looking at that in details, and the > patch is neat. Cool. Doing some more polishing right now. Will be back with an updated version soonish. Did you do some testing? > + * This routine ensures that, after

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-07 Thread Michael Paquier

On Mon, Mar 7, 2016 at 3:38 PM, Andres Freund wrote: > On 2016-03-05 19:54:05 -0800, Andres Freund wrote: >> I started working on this; delayed by taking longer than planned on the >> logical decoding stuff (quite a bit complicated by >> e1a11d93111ff3fba7a91f3f2ac0b0aca16909a8). I'm not very hap

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-06 Thread Andres Freund

Hi, On 2016-03-05 19:54:05 -0800, Andres Freund wrote: > I started working on this; delayed by taking longer than planned on the > logical decoding stuff (quite a bit complicated by > e1a11d93111ff3fba7a91f3f2ac0b0aca16909a8). I'm not very happy with the > error handling as it is right now. For

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-05 Thread Andres Freund

On 2016-03-05 22:25:36 +0900, Michael Paquier wrote: > OK, I hacked a v7: > - Move the link()/rename() group with HAVE_WORKING_LINK into a single > routine, making the previous link_safe renamed to replace_safe. This > is sharing a lot of things with rename_safe. I am not sure it is worth > complic

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-05 Thread Michael Paquier

On Sat, Mar 5, 2016 at 7:47 AM, Andres Freund wrote: > On 2016-03-05 07:43:00 +0900, Michael Paquier wrote: >> On Sat, Mar 5, 2016 at 7:35 AM, Andres Freund wrote: >> > On 2016-03-04 14:51:50 +0900, Michael Paquier wrote: >> >> On Fri, Mar 4, 2016 at 4:06 AM, Andres Freund wrote: >> >> Hm. OK. I

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-04 Thread Andres Freund

On 2016-03-05 07:43:00 +0900, Michael Paquier wrote: > On Sat, Mar 5, 2016 at 7:35 AM, Andres Freund wrote: > > On 2016-03-04 14:51:50 +0900, Michael Paquier wrote: > >> On Fri, Mar 4, 2016 at 4:06 AM, Andres Freund wrote: > >> Hm. OK. I don't see any reason why switching to link() even in code >

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-04 Thread Michael Paquier

On Sat, Mar 5, 2016 at 7:37 AM, Andres Freund wrote: > On 2016-03-05 07:29:35 +0900, Michael Paquier wrote: >> OK. I could produce that by tonight my time, not before unfortunately. > > I'm switching to this patch, after pushing the pending logical decoding > fixes. Probably not today, but tomorro

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-04 Thread Michael Paquier

On Sat, Mar 5, 2016 at 7:35 AM, Andres Freund wrote: > On 2016-03-04 14:51:50 +0900, Michael Paquier wrote: >> On Fri, Mar 4, 2016 at 4:06 AM, Andres Freund wrote: >> > I don't think we want any stat()s here. I'd much, much rather check open >> > for ENOENT. >> >> OK. So you mean more or less tha
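
The point quoted in this message, checking open() for ENOENT instead of calling stat() first, can be illustrated with a small sketch. This is not the code from the patch under review: the name fsync_fname_sketch and its missing_ok parameter are hypothetical stand-ins, and it assumes a platform such as Linux where fsync() is accepted on a descriptor opened O_RDONLY. Letting open() report the missing file avoids the window between a separate existence check and the open.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): fsync the file at `path`.
 * If the file does not exist and missing_ok is true, that is treated
 * as success. Returns 0 on success, -1 on failure with errno set.
 */
static int
fsync_fname_sketch(const char *path, bool missing_ok)
{
    int fd;
    int rc;
    int saved_errno;

    assert(path != NULL);

    /* No stat() beforehand: open() itself tells us if the file is absent. */
    fd = open(path, O_RDONLY);
    if (fd < 0)
    {
        if (errno == ENOENT && missing_ok)
            return 0;
        return -1;
    }

    rc = fsync(fd);
    saved_errno = errno;    /* close() may overwrite errno */
    close(fd);
    errno = saved_errno;
    return rc;
}
```

A caller that tolerates a missing target would pass true for missing_ok; one that requires the file to exist would pass false and inspect errno on failure.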

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-04 Thread Andres Freund

On 2016-03-05 07:29:35 +0900, Michael Paquier wrote: > OK. I could produce that by tonight my time, not before unfortunately. I'm switching to this patch, after pushing the pending logical decoding fixes. Probably not today, but tomorrow PST afternoon should work. > And FWIW, per the comments of

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-04 Thread Andres Freund

On 2016-03-04 14:51:50 +0900, Michael Paquier wrote: > On Fri, Mar 4, 2016 at 4:06 AM, Andres Freund wrote: > > Hi, > > Thanks for the review. > > >> +/* > >> + * rename_safe -- rename of a file, making it on-disk persistent > >> + * > >> + * This routine ensures that a rename file persists in c

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-04 Thread Michael Paquier

On Sat, Mar 5, 2016 at 1:23 AM, Robert Haas wrote: > On Fri, Mar 4, 2016 at 11:09 AM, Tom Lane wrote: >> Alvaro Herrera writes: >>> I would like to have a patch for this finalized today, so that we can >>> apply to master before or during the weekend; with it in the tree for >>> about a week we

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-04 Thread Robert Haas

On Fri, Mar 4, 2016 at 11:09 AM, Tom Lane wrote: > Alvaro Herrera writes: >> I would like to have a patch for this finalized today, so that we can >> apply to master before or during the weekend; with it in the tree for >> about a week we can be more confident and backpatch close to next >> weeke

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-04 Thread Tom Lane

Alvaro Herrera writes: > I would like to have a patch for this finalized today, so that we can > apply to master before or during the weekend; with it in the tree for > about a week we can be more confident and backpatch close to next > weekend, so that we see it in the next set of minor releases.

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-04 Thread Alvaro Herrera

I would like to have a patch for this finalized today, so that we can apply to master before or during the weekend; with it in the tree for about a week we can be more confident and backpatch close to next weekend, so that we see it in the next set of minor releases. Does that sound good? -- Álv

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-03 Thread Michael Paquier

On Fri, Mar 4, 2016 at 4:06 AM, Andres Freund wrote: > Hi, Thanks for the review. >> +/* >> + * rename_safe -- rename of a file, making it on-disk persistent >> + * >> + * This routine ensures that a rename file persists in case of a crash by >> using >> + * fsync on the old and new files befor

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-03 Thread Andres Freund

Hi, > +/* > + * rename_safe -- rename of a file, making it on-disk persistent > + * > + * This routine ensures that a rename file persists in case of a crash by > using > + * fsync on the old and new files before and after performing the rename so > as > + * this categorizes as an all-or-nothing

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-02-23 Thread Michael Paquier

On Wed, Feb 24, 2016 at 7:26 AM, Tomas Vondra wrote: > 1) I'm not quite sure why the patch adds missing_ok to fsync_fname()? The > only place where we use missing_ok=true is in rename_safe, where right at > the beginning we do this: > > fsync_fname(newfile, false, true); > > I.e. we're fsyncing

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-02-23 Thread Tomas Vondra

Hi, On 02/05/2016 10:40 AM, Michael Paquier wrote: On Thu, Feb 4, 2016 at 2:34 PM, Michael Paquier wrote: On Thu, Feb 4, 2016 at 12:02 PM, Michael Paquier wrote: On Tue, Feb 2, 2016 at 4:20 PM, Michael Paquier wrote: ... So, attached is an updated patch that adds a new routine link_safe()

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-02-06 Thread Andres Freund

On 2016-02-06 17:43:48 +0100, Tomas Vondra wrote: > >Still the data is here... But well. I won't insist. > > Huh? This thread started by an example how to cause loss of committed > transactions. That fits my definition of "data loss" quite well. Agreed, that view doesn't seem to make much sense.

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-02-06 Thread Tomas Vondra

Hi, On 02/06/2016 01:16 PM, Michael Paquier wrote: On Sat, Feb 6, 2016 at 2:11 AM, Tomas Vondra wrote: On 02/04/2016 09:59 AM, Michael Paquier wrote: On Tue, Feb 2, 2016 at 9:59 AM, Andres Freund wrote: On 2016-02-02 09:56:40 +0900, Michael Paquier wrote: And there is no actual risk of

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-02-06 Thread Michael Paquier

On Sat, Feb 6, 2016 at 2:11 AM, Tomas Vondra wrote: > On 02/04/2016 09:59 AM, Michael Paquier wrote: >> >> On Tue, Feb 2, 2016 at 9:59 AM, Andres Freund wrote: >>> >>> On 2016-02-02 09:56:40 +0900, Michael Paquier wrote: And there is no actual risk of data loss >>> >>> >>> Huh? >> >> >>

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-02-05 Thread Tomas Vondra

On 02/04/2016 09:59 AM, Michael Paquier wrote: On Tue, Feb 2, 2016 at 9:59 AM, Andres Freund wrote: On 2016-02-02 09:56:40 +0900, Michael Paquier wrote: And there is no actual risk of data loss Huh? More precise: what I mean here is that should an OS crash or a power failure happen, we wou

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-02-05 Thread Michael Paquier

On Thu, Feb 4, 2016 at 2:34 PM, Michael Paquier wrote: > On Thu, Feb 4, 2016 at 12:02 PM, Michael Paquier > wrote: >> On Tue, Feb 2, 2016 at 4:20 PM, Michael Paquier wrote: >>> Not wrong, and this leads to the following: >>> void rename_safe(const char *old, const char *new, bool isdir, int eleve

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-02-04 Thread Michael Paquier

On Thu, Feb 4, 2016 at 12:02 PM, Michael Paquier wrote: > On Tue, Feb 2, 2016 at 4:20 PM, Michael Paquier wrote: >> Not wrong, and this leads to the following: >> void rename_safe(const char *old, const char *new, bool isdir, int elevel); >> Controlling elevel is necessary per the multiple code pa

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-02-04 Thread Michael Paquier

On Tue, Feb 2, 2016 at 4:20 PM, Michael Paquier wrote: > Not wrong, and this leads to the following: > void rename_safe(const char *old, const char *new, bool isdir, int elevel); > Controlling elevel is necessary per the multiple code paths that would > use it. Some use ERROR, most of them FATAL, a

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-02-04 Thread Michael Paquier

On Tue, Feb 2, 2016 at 9:59 AM, Andres Freund wrote: > On 2016-02-02 09:56:40 +0900, Michael Paquier wrote: >> And there is no actual risk of data loss > > Huh? More precise: what I mean here is that should an OS crash or a power failure happen, we would fall back to recovery at next restart, so

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-02-01 Thread Michael Paquier

On Tue, Feb 2, 2016 at 1:07 AM, Andres Freund wrote: > On 2016-01-25 16:30:47 +0900, Michael Paquier wrote: >> diff --git a/src/backend/access/transam/xlog.c >> b/src/backend/access/transam/xlog.c >> index a2846c4..b124f90 100644 >> --- a/src/backend/access/transam/xlog.c >> +++ b/src/backend/acc

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-02-01 Thread Andres Freund

On 2016-02-02 09:56:40 +0900, Michael Paquier wrote: > And there is no actual risk of data loss Huh? - Andres

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-02-01 Thread Michael Paquier

On Tue, Feb 2, 2016 at 12:49 AM, Alvaro Herrera wrote: > Michael Paquier wrote: >> On Mon, Jan 25, 2016 at 6:50 PM, Tomas Vondra >> wrote: >> > Seems OK to me. Thanks for the time and improvements! >> >> Thanks. Perhaps a committer could have a look then? I have switched >> the patch as such in t

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-02-01 Thread Michael Paquier

On Tue, Feb 2, 2016 at 1:08 AM, Andres Freund wrote: > On 2016-02-01 16:49:46 +0100, Alvaro Herrera wrote: >> Yeah. On 9.4 there are already some conflicts, and I'm sure there will >> be more in almost each branch. Does anyone want to volunteer for >> producing per-branch versions? > >> The next

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-02-01 Thread Andres Freund

On 2016-02-01 16:49:46 +0100, Alvaro Herrera wrote: > Yeah. On 9.4 there are already some conflicts, and I'm sure there will > be more in almost each branch. Does anyone want to volunteer for > producing per-branch versions? > The next minor release is to be tagged next week and it'd be good to

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-02-01 Thread Andres Freund

On 2016-01-25 16:30:47 +0900, Michael Paquier wrote: > diff --git a/src/backend/access/transam/xlog.c > b/src/backend/access/transam/xlog.c > index a2846c4..b124f90 100644 > --- a/src/backend/access/transam/xlog.c > +++ b/src/backend/access/transam/xlog.c > @@ -3278,6 +3278,14 @@ InstallXLogFileSe

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-02-01 Thread Alvaro Herrera

Michael Paquier wrote: > On Mon, Jan 25, 2016 at 6:50 PM, Tomas Vondra > wrote: > > Seems OK to me. Thanks for the time and improvements! > > Thanks. Perhaps a committer could have a look then? I have switched > the patch as such in the CF app. Seeing the accumulated feedback > upthread that's so

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-01-25 Thread Michael Paquier

On Mon, Jan 25, 2016 at 6:50 PM, Tomas Vondra wrote: > Seems OK to me. Thanks for the time and improvements! Thanks. Perhaps a committer could have a look then? I have switched the patch as such in the CF app. Seeing the accumulated feedback upthread that's something that should be backpatched. -

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-01-25 Thread Tomas Vondra

On 01/25/2016 08:30 AM, Michael Paquier wrote: On Fri, Jan 22, 2016 at 9:32 PM, Michael Paquier wrote: ,,, My first line of thoughts after looking at the patch is that I am not against adding those fsync calls on HEAD as there is roughly an advantage to not go back to recovery in most cases

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-01-24 Thread Michael Paquier

On Fri, Jan 22, 2016 at 9:32 PM, Michael Paquier wrote: > On Fri, Jan 22, 2016 at 5:26 PM, Tomas Vondra > wrote: >> On 01/22/2016 06:45 AM, Michael Paquier wrote: >>> Here are some comments about your patch after a look at the code. >>> >>> Regarding the additions in fsync_fname() in xlog.c: >>>

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-01-23 Thread Michael Paquier

On Sat, Jan 23, 2016 at 11:39 AM, Tomas Vondra wrote: > On 01/23/2016 02:35 AM, Michael Paquier wrote: >> >> On Fri, Jan 22, 2016 at 9:41 PM, Greg Stark wrote: >>> On Fri, Jan 22, 2016 at 8:26 AM, Tomas Vondra >>> wrote: >>> LVM snapshots would have the advantage that you can keep running the >>

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-01-22 Thread Tomas Vondra

On 01/23/2016 02:35 AM, Michael Paquier wrote: On Fri, Jan 22, 2016 at 9:41 PM, Greg Stark wrote: On Fri, Jan 22, 2016 at 8:26 AM, Tomas Vondra wrote: On 01/22/2016 06:45 AM, Michael Paquier wrote: So, I have been playing with a Linux VM with VMware Fusion and on ext4 with data=ordered the

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-01-22 Thread Michael Paquier

On Fri, Jan 22, 2016 at 9:41 PM, Greg Stark wrote: > On Fri, Jan 22, 2016 at 8:26 AM, Tomas Vondra > wrote: >> On 01/22/2016 06:45 AM, Michael Paquier wrote: >> >>> So, I have been playing with a Linux VM with VMware Fusion and on >>> ext4 with data=ordered the renames are getting lost if the roo

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-01-22 Thread Andres Freund

On 2016-01-22 21:32:29 +0900, Michael Paquier wrote: > Group shot with 3), 4) and 5). Well, there is no data loss here, > putting me in the direction of considering this addition of an fsync > as an optimization and not a bug. I think this is an extremely weak argument. The reasoning when exactly

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-01-22 Thread Greg Stark

On Fri, Jan 22, 2016 at 8:26 AM, Tomas Vondra wrote: > On 01/22/2016 06:45 AM, Michael Paquier wrote: > >> So, I have been playing with a Linux VM with VMware Fusion and on >> ext4 with data=ordered the renames are getting lost if the root >> folder is not fsync. By killing-9 the VM I am able to r

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-01-22 Thread Michael Paquier

On Fri, Jan 22, 2016 at 5:26 PM, Tomas Vondra wrote: > On 01/22/2016 06:45 AM, Michael Paquier wrote: >> Here are some comments about your patch after a look at the code. >> >> Regarding the additions in fsync_fname() in xlog.c: >> 1) In InstallXLogFileSegment, rename() will be called only if >> H

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-01-22 Thread Magnus Hagander

On Fri, Jan 22, 2016 at 9:26 AM, Tomas Vondra wrote: > Hi, > > On 01/22/2016 06:45 AM, Michael Paquier wrote: > > So, I have been playing with a Linux VM with VMware Fusion and on >> ext4 with data=ordered the renames are getting lost if the root >> folder is not fsync. By killing-9 the VM I am a

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-01-22 Thread Tomas Vondra

Hi, On 01/22/2016 06:45 AM, Michael Paquier wrote: So, I have been playing with a Linux VM with VMware Fusion and on ext4 with data=ordered the renames are getting lost if the root folder is not fsync. By killing-9 the VM I am able to reproduce that really easily. Yep. Same experience here (w

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-01-21 Thread Michael Paquier

On Tue, Jan 19, 2016 at 4:20 PM, Tomas Vondra wrote: > > > On 01/19/2016 08:03 AM, Michael Paquier wrote: >> >> On Tue, Jan 19, 2016 at 3:58 PM, Tomas Vondra >> wrote: >>> >>> > ... Tomas, I am planning to have a look at that, because it seems to be important. In case it becomes lo

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-01-18 Thread Tomas Vondra

On 01/19/2016 08:03 AM, Michael Paquier wrote: On Tue, Jan 19, 2016 at 3:58 PM, Tomas Vondra wrote: ... Tomas, I am planning to have a look at that, because it seems to be important. In case it becomes lost on my radar, do you mind if I add it to the 2016-03 CF? Well, what else can I do

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-01-18 Thread Michael Paquier

On Tue, Jan 19, 2016 at 3:58 PM, Tomas Vondra wrote: > > > On 01/19/2016 07:44 AM, Michael Paquier wrote: >> >> On Wed, Dec 2, 2015 at 3:24 PM, Michael Paquier >> wrote: >>> >>> On Wed, Dec 2, 2015 at 3:23 PM, Michael Paquier >>> wrote: On Wed, Dec 2, 2015 at 7:05 AM, Tomas Vondra

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-01-18 Thread Tomas Vondra

On 01/19/2016 07:44 AM, Michael Paquier wrote: On Wed, Dec 2, 2015 at 3:24 PM, Michael Paquier wrote: On Wed, Dec 2, 2015 at 3:23 PM, Michael Paquier wrote: On Wed, Dec 2, 2015 at 7:05 AM, Tomas Vondra wrote: Attached is v2 of the patch, that (a) adds explicit fsync on the parent directo

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-01-18 Thread Michael Paquier

On Wed, Dec 2, 2015 at 3:24 PM, Michael Paquier wrote: > On Wed, Dec 2, 2015 at 3:23 PM, Michael Paquier > wrote: >> On Wed, Dec 2, 2015 at 7:05 AM, Tomas Vondra >> wrote: >>> Attached is v2 of the patch, that >>> >>> (a) adds explicit fsync on the parent directory after all the rename() >>>

Re: [HACKERS] silent data loss with ext4 / all current versions

2015-12-01 Thread Michael Paquier

On Wed, Dec 2, 2015 at 3:23 PM, Michael Paquier wrote: > On Wed, Dec 2, 2015 at 7:05 AM, Tomas Vondra > wrote: >> Attached is v2 of the patch, that >> >> (a) adds explicit fsync on the parent directory after all the rename() >> calls in timeline.c, xlog.c, xlogarchive.c and pgarch.c >> >> (b)

Re: [HACKERS] silent data loss with ext4 / all current versions

2015-12-01 Thread Michael Paquier

On Wed, Dec 2, 2015 at 7:05 AM, Tomas Vondra wrote: > Attached is v2 of the patch, that > > (a) adds explicit fsync on the parent directory after all the rename() > calls in timeline.c, xlog.c, xlogarchive.c and pgarch.c > > (b) adds START/END_CRIT_SECTION around the new fsync_fname calls >

Re: [HACKERS] silent data loss with ext4 / all current versions

2015-12-01 Thread Tomas Vondra

Attached is v2 of the patch, that (a) adds explicit fsync on the parent directory after all the rename() calls in timeline.c, xlog.c, xlogarchive.c and pgarch.c (b) adds START/END_CRIT_SECTION around the new fsync_fname calls (except for those in timeline.c, as the START/END_CRIT_SECTION

Re: [HACKERS] silent data loss with ext4 / all current versions

2015-12-01 Thread Tomas Vondra

On 12/01/2015 10:44 PM, Peter Eisentraut wrote: On 11/27/15 8:18 AM, Michael Paquier wrote: On Fri, Nov 27, 2015 at 8:17 PM, Tomas Vondra wrote: So, what's going on? The problem is that while the rename() is atomic, it's not guaranteed to be durable without an explicit fsync on the parent di

Re: [HACKERS] silent data loss with ext4 / all current versions

2015-12-01 Thread Peter Eisentraut

On 11/27/15 8:18 AM, Michael Paquier wrote: > On Fri, Nov 27, 2015 at 8:17 PM, Tomas Vondra > wrote: >> > So, what's going on? The problem is that while the rename() is atomic, it's >> > not guaranteed to be durable without an explicit fsync on the parent >> > directory. And by default we only do

Re: [HACKERS] silent data loss with ext4 / all current versions

2015-11-29 Thread Tomas Vondra

On 11/29/2015 03:33 PM, Tomas Vondra wrote: Hi, On 11/29/2015 02:38 PM, Craig Ringer wrote: I've had a few tries at implementing a qemu-based crashtester where it hard kills the qemu instance at a random point then starts it back up. I've tried to reproduce the issue by killing a qemu VM, a

Re: [HACKERS] silent data loss with ext4 / all current versions

2015-11-29 Thread Tomas Vondra

On 11/29/2015 02:41 PM, Craig Ringer wrote: On 27 November 2015 at 19:17, Tomas Vondra wrote: It's also possible to mitigate this by setting wal_sync_method=fsync Are you sure? https://lwn.net/Articles/322823/ tends to suggest that fsync() on the fi

Re: [HACKERS] silent data loss with ext4 / all current versions

2015-11-29 Thread Tomas Vondra

Hi, On 11/29/2015 02:38 PM, Craig Ringer wrote: On 27 November 2015 at 21:28, Greg Stark wrote: On Fri, Nov 27, 2015 at 11:17 AM, Tomas Vondra wrote: > I plan to do more power failure testing soon, with more complex te

Re: [HACKERS] silent data loss with ext4 / all current versions

2015-11-29 Thread Craig Ringer

On 27 November 2015 at 19:17, Tomas Vondra wrote: > It's also possible to mitigate this by setting wal_sync_method=fsync Are you sure? https://lwn.net/Articles/322823/ tends to suggest that fsync() on the file is insufficient to ensure rename() is persistent, though it's somewhat old. -- C

Re: [HACKERS] silent data loss with ext4 / all current versions

2015-11-29 Thread Craig Ringer

On 27 November 2015 at 21:28, Greg Stark wrote: > On Fri, Nov 27, 2015 at 11:17 AM, Tomas Vondra > wrote: > > I plan to do more power failure testing soon, with more complex test > > scenarios. I suspect there might be other similar issues (e.g. when we > > rename a file before a checkpoint and

Re: [HACKERS] silent data loss with ext4 / all current versions

2015-11-29 Thread Michael Paquier

On Sat, Nov 28, 2015 at 3:01 AM, Tomas Vondra wrote: > > > On 11/27/2015 02:18 PM, Michael Paquier wrote: >> >> On Fri, Nov 27, 2015 at 8:17 PM, Tomas Vondra >> wrote: >>> >>> So, what's going on? The problem is that while the rename() is atomic, >>> it's >>> not guaranteed to be durable without

Re: [HACKERS] silent data loss with ext4 / all current versions

2015-11-27 Thread Tomas Vondra

On 11/27/2015 02:18 PM, Michael Paquier wrote: On Fri, Nov 27, 2015 at 8:17 PM, Tomas Vondra wrote: So, what's going on? The problem is that while the rename() is atomic, it's not guaranteed to be durable without an explicit fsync on the parent directory. And by default we only do fdatasync o

Re: [HACKERS] silent data loss with ext4 / all current versions

2015-11-27 Thread Tomas Vondra

Hi, On 11/27/2015 02:28 PM, Greg Stark wrote: On Fri, Nov 27, 2015 at 11:17 AM, Tomas Vondra wrote: I plan to do more power failure testing soon, with more complex test scenarios. I suspect there might be other similar issues (e.g. when we rename a file before a checkpoint and don't fsync the

Re: [HACKERS] silent data loss with ext4 / all current versions

2015-11-27 Thread Greg Stark

On Fri, Nov 27, 2015 at 11:17 AM, Tomas Vondra wrote: > I plan to do more power failure testing soon, with more complex test > scenarios. I suspect there might be other similar issues (e.g. when we > rename a file before a checkpoint and don't fsync the directory - then the > rename won't be repla

Re: [HACKERS] silent data loss with ext4 / all current versions

2015-11-27 Thread Michael Paquier

On Fri, Nov 27, 2015 at 8:17 PM, Tomas Vondra wrote: > So, what's going on? The problem is that while the rename() is atomic, it's > not guaranteed to be durable without an explicit fsync on the parent > directory. And by default we only do fdatasync on the recycled segments, > which may not force

Re: [HACKERS] silent data loss with ext4 / all current versions

2015-11-27 Thread Teodor Sigaev

What happens is that when we recycle WAL segments, we rename them and then sync them using fdatasync (which is the default on Linux). However fdatasync does not force fsync on the parent directory, so in case of power failure the rename may get lost. The recovery won't realize those segments actua

[HACKERS] silent data loss with ext4 / all current versions

2015-11-27 Thread Tomas Vondra

Hi, I've been doing some power failure tests (i.e. unexpectedly interrupting power) a few days ago, and I've discovered a fairly serious case of silent data loss on ext3/ext4. Initially I thought it was a filesystem bug, but after further investigation I'm pretty sure it's our fault. What ha

86 matches
