Re: [HACKERS] Potential data loss of 2PC files

2017-03-27 Thread Michael Paquier
On Tue, Mar 28, 2017 at 9:37 AM, Michael Paquier wrote:
> On Tue, Mar 28, 2017 at 8:38 AM, Tsunakawa, Takayuki wrote:
>> From: pgsql-hackers-ow...@postgresql.org
>>> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Michael Paquier
>>> Do you think that this qualifies as a bug fix for a

Re: [HACKERS] Potential data loss of 2PC files

2017-03-27 Thread Michael Paquier
On Tue, Mar 28, 2017 at 8:38 AM, Tsunakawa, Takayuki wrote:
> From: pgsql-hackers-ow...@postgresql.org
>> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Michael Paquier
>> Do you think that this qualifies as a bug fix for a backpatch? I would think
>> so, but I would not mind waiting for

Re: [HACKERS] Potential data loss of 2PC files

2017-03-27 Thread Tsunakawa, Takayuki
From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Michael Paquier
> Do you think that this qualifies as a bug fix for a backpatch? I would think
> so, but I would not mind waiting for some dust to be on it before considering
> applying that on back-

Re: [HACKERS] Potential data loss of 2PC files

2017-03-27 Thread Michael Paquier
On Tue, Mar 28, 2017 at 1:34 AM, Teodor Sigaev wrote:
> Thank you, pushed.

Thanks! Do you think that this qualifies as a bug fix for a backpatch? I would think so, but I would not mind waiting for some dust to be on it before considering applying that on back-branches. Thoughts from others?
--

Re: [HACKERS] Potential data loss of 2PC files

2017-03-27 Thread Teodor Sigaev
Thank you, pushed Michael Paquier wrote: On Fri, Mar 24, 2017 at 11:36 PM, Teodor Sigaev wrote: And the renaming of pg_clog to pg_xact is also my fault. Attached is an updated patch. Thank you. One more question: what about symlinks? If DBA moves, for example, pg_xact to another disk and le

Re: [HACKERS] Potential data loss of 2PC files

2017-03-27 Thread Teodor Sigaev
Thank you. One more question: what about symlinks? If DBA moves, for example, pg_xact to another disk and leaves the symlink in data directory. Suppose, fsync on symlink will do nothing actually. I did not think of this case, but is that really common? There is even I saw a lot such cases. If

Re: [HACKERS] Potential data loss of 2PC files

2017-03-24 Thread Michael Paquier
On Fri, Mar 24, 2017 at 11:36 PM, Teodor Sigaev wrote:
>> And the renaming of pg_clog to pg_xact is also my fault. Attached is
>> an updated patch.
>
> Thank you. One more question: what about symlinks? If DBA moves, for
> example, pg_xact to another disk and leaves the symlink in data directory

Re: [HACKERS] Potential data loss of 2PC files

2017-03-24 Thread Teodor Sigaev
And the renaming of pg_clog to pg_xact is also my fault. Attached is an updated patch. Thank you. One more question: what about symlinks? If DBA moves, for example, pg_xact to another disk and leaves the symlink in data directory. Suppose, fsync on symlink will do nothing actually. -- Teodor

Re: [HACKERS] Potential data loss of 2PC files

2017-03-23 Thread Michael Paquier
On Fri, Mar 24, 2017 at 5:08 AM, Teodor Sigaev wrote:
> Hmm, it doesn't work (but applies) on current HEAD:
> [...]
> Data page checksums are disabled.
>
> fixing permissions on existing directory /spool/pg_data ... ok
> creating subdirectories ... ok
> selecting default max_connections ... 100
>

Re: [HACKERS] Potential data loss of 2PC files

2017-03-23 Thread Teodor Sigaev
Hmm, it doesn't work (but applies) on current HEAD:

% uname -a
FreeBSD *** 11.0-RELEASE-p8 FreeBSD 11.0-RELEASE-p8 #0 r315651: Tue Mar 21 02:44:23 MSK 2017 teodor@***:/usr/obj/usr/src/sys/XOR amd64
% pg_config --configure
'--enable-depend' '--enable-cassert' '--enable-debug' '--enable-tap

Re: [HACKERS] Potential data loss of 2PC files

2017-03-21 Thread Michael Paquier
On Wed, Mar 22, 2017 at 12:46 AM, Teodor Sigaev wrote: If that can happen, don't we have the same problem in many other places? Like, all the SLRUs? They don't fsync the directory either. >>> >>> Right, pg_commit_ts and pg_clog enter in this category. >> >> >> Implemented as attached. >>

Re: [HACKERS] Potential data loss of 2PC files

2017-03-21 Thread Teodor Sigaev
If that can happen, don't we have the same problem in many other places? Like, all the SLRUs? They don't fsync the directory either. Right, pg_commit_ts and pg_clog enter in this category. Implemented as attached. Is unlink() guaranteed to be durable, without fsyncing the directory? If not, t

Re: [HACKERS] Potential data loss of 2PC files

2017-03-17 Thread Tsunakawa, Takayuki
From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Ashutosh Bapat
> The scope of this work has expanded, since last time I reviewed and marked
> it as RFC. Right now I am busy with partition-wise joins and do not have
> sufficient time to take a look

Re: [HACKERS] Potential data loss of 2PC files

2017-03-16 Thread Ashutosh Bapat
On Thu, Mar 16, 2017 at 10:17 PM, David Steele wrote: > On 2/13/17 12:10 AM, Michael Paquier wrote: >> On Tue, Jan 31, 2017 at 11:07 AM, Michael Paquier >> wrote: >>> On Mon, Jan 30, 2017 at 10:52 PM, Heikki Linnakangas >>> wrote: If that can happen, don't we have the same problem in many

Re: [HACKERS] Potential data loss of 2PC files

2017-03-16 Thread David Steele
On 2/13/17 12:10 AM, Michael Paquier wrote: > On Tue, Jan 31, 2017 at 11:07 AM, Michael Paquier > wrote: >> On Mon, Jan 30, 2017 at 10:52 PM, Heikki Linnakangas wrote: >>> If that can happen, don't we have the same problem in many other places? >>> Like, all the SLRUs? They don't fsync the direct

Re: [HACKERS] Potential data loss of 2PC files

2017-02-12 Thread Michael Paquier
On Tue, Jan 31, 2017 at 11:07 AM, Michael Paquier wrote:
> On Mon, Jan 30, 2017 at 10:52 PM, Heikki Linnakangas wrote:
>> If that can happen, don't we have the same problem in many other places?
>> Like, all the SLRUs? They don't fsync the directory either.
>
> Right, pg_commit_ts and pg_clog ent

Re: [HACKERS] Potential data loss of 2PC files

2017-01-30 Thread Michael Paquier
On Fri, Jan 6, 2017 at 9:26 PM, Ashutosh Bapat wrote:
> On Wed, Jan 4, 2017 at 12:16 PM, Michael Paquier wrote:
>> On Wed, Jan 4, 2017 at 1:23 PM, Ashutosh Bapat wrote:
>>> I don't have anything more to review in this patch. I will leave that
>>> commitfest entry in "needs review" status fo

Re: [HACKERS] Potential data loss of 2PC files

2017-01-30 Thread Michael Paquier
On Mon, Jan 30, 2017 at 10:52 PM, Heikki Linnakangas wrote:
> So, if I understood correctly, the problem scenario is:
>
> 1. Create and write to a file.
> 2. fsync() the file.
> 3. Crash.
> 4. After restart, the file is gone.

Yes, that's a problem with fsync's durability, and we need to achieve t
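
The four-step scenario quoted above (create, write, fsync, crash, file gone) is commonly avoided by fsyncing the parent directory as well as the file, so that the new directory entry reaches disk too. Below is a minimal sketch of that idea, written in Python rather than PostgreSQL's C and not taken from the patch under discussion. The helper name `durable_write` is hypothetical, and the sketch assumes a POSIX system where a directory can be opened read-only and passed to fsync.

```python
import os


def durable_write(path: str, data: bytes) -> None:
    """Hypothetical helper: write data, fsync the file, then fsync its directory."""
    # Steps 1 and 2 of the scenario: create the file, write it, fsync its contents.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        offset = 0
        while offset < len(data):
            # os.write may write fewer bytes than requested, so loop until done.
            offset += os.write(fd, data[offset:])
        os.fsync(fd)
    finally:
        os.close(fd)

    # The additional step: fsync the parent directory so that the directory
    # entry pointing at the new file is also flushed to stable storage.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Calling `durable_write("/some/dir/00000001", b"...")` with any writable path would exercise both fsync calls. Whether the directory fsync is strictly required depends on the filesystem, which is part of what the thread is debating.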

Re: [HACKERS] Potential data loss of 2PC files

2017-01-30 Thread Heikki Linnakangas
On 12/27/2016 01:31 PM, Andres Freund wrote: On 2016-12-27 14:09:05 +0900, Michael Paquier wrote: On Fri, Dec 23, 2016 at 3:02 AM, Andres Freund wrote: Not quite IIRC: that doesn't deal with file size increase. All this would be easier if hardlinks wouldn't exist IIUC. It's basically a quest

Re: [HACKERS] Potential data loss of 2PC files

2017-01-06 Thread Ashutosh Bapat
Marking this as ready for committer.

On Wed, Jan 4, 2017 at 12:16 PM, Michael Paquier wrote:
> On Wed, Jan 4, 2017 at 1:23 PM, Ashutosh Bapat wrote:
>> I don't have anything more to review in this patch. I will leave that
>> commitfest entry in "needs review" status for few days in case anyone

Re: [HACKERS] Potential data loss of 2PC files

2017-01-03 Thread Michael Paquier
On Wed, Jan 4, 2017 at 1:23 PM, Ashutosh Bapat wrote:
> I don't have anything more to review in this patch. I will leave that
> commitfest entry in "needs review" status for few days in case anyone
> else wants to review it. If none is going to review it, we can mark it
> as "ready for committer".

Re: [HACKERS] Potential data loss of 2PC files

2017-01-03 Thread Ashutosh Bapat
On Tue, Jan 3, 2017 at 5:38 PM, Michael Paquier wrote:
> On Tue, Jan 3, 2017 at 8:41 PM, Ashutosh Bapat wrote:
>> Are you talking about
>>     /*
>>      * Now we can mark ourselves as out of the commit critical section: a
>>      * checkpoint starting after this will certainly see the gxact as

Re: [HACKERS] Potential data loss of 2PC files

2017-01-03 Thread Michael Paquier
On Tue, Jan 3, 2017 at 8:41 PM, Ashutosh Bapat wrote:
> Are you talking about
>     /*
>      * Now we can mark ourselves as out of the commit critical section: a
>      * checkpoint starting after this will certainly see the gxact as a
>      * candidate for fsyncing.
>      */
>     MyPgXact->de

Re: [HACKERS] Potential data loss of 2PC files

2017-01-03 Thread Ashutosh Bapat
On Tue, Jan 3, 2017 at 2:50 PM, Michael Paquier wrote:
> On Tue, Jan 3, 2017 at 3:32 PM, Ashutosh Bapat wrote:
>> I am wondering what happens if a 2PC file gets created, at the time of
>> checkpoint we flush the pg_twophase directory, then the file gets
>> removed. Do we need to flush the direc

Re: [HACKERS] Potential data loss of 2PC files

2017-01-03 Thread Michael Paquier
On Tue, Jan 3, 2017 at 3:32 PM, Ashutosh Bapat wrote:
> I am wondering what happens if a 2PC file gets created, at the time of
> checkpoint we flush the pg_twophase directory, then the file gets
> removed. Do we need to flush the directory to ensure that the removal
> persists?

Whatever material I
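
The question quoted above, whether a removal needs a directory flush to persist, can be illustrated with a short sketch. It is written in Python rather than PostgreSQL's C, the helper name `durable_unlink` is hypothetical, and it assumes a POSIX system where a directory opened read-only can be passed to fsync. It shows the conservative approach of flushing the directory after the unlink, and does not claim to be what the patch does.

```python
import os


def durable_unlink(path: str) -> None:
    """Hypothetical helper: remove a file, then fsync its parent directory."""
    parent = os.path.dirname(os.path.abspath(path))

    # Remove the directory entry. At this point the change may exist only in
    # the filesystem's cache.
    os.unlink(path)

    # Flush the parent directory so the removal itself is written out. Without
    # this, a crash could in principle bring the old entry back after restart.
    dir_fd = os.open(parent, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

The cost is one extra fsync on the directory per removal, which is the kind of trade-off the thread weighs against checkpoint overhead.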

Re: [HACKERS] Potential data loss of 2PC files

2017-01-02 Thread Ashutosh Bapat
On Sat, Dec 31, 2016 at 5:53 AM, Michael Paquier wrote:
> On Fri, Dec 30, 2016 at 10:59 PM, Ashutosh Bapat wrote:
>>>
>>> Well, flushing the meta-data of pg_twophase is really going to be far
>>> cheaper than the many pages done until CheckpointTwoPhase is reached.
>>> There should really be a

Re: [HACKERS] Potential data loss of 2PC files

2016-12-30 Thread Michael Paquier
On Fri, Dec 30, 2016 at 10:59 PM, Ashutosh Bapat wrote:
>>
>> Well, flushing the meta-data of pg_twophase is really going to be far
>> cheaper than the many pages done until CheckpointTwoPhase is reached.
>> There should really be a check on serialized_xacts for the
>> non-recovery code path, but

Re: [HACKERS] Potential data loss of 2PC files

2016-12-30 Thread Ashutosh Bapat
>
> Well, flushing the meta-data of pg_twophase is really going to be far
> cheaper than the many pages done until CheckpointTwoPhase is reached.
> There should really be a check on serialized_xacts for the
> non-recovery code path, but considering how cheap that's going to be
> compared to the res

Re: [HACKERS] Potential data loss of 2PC files

2016-12-30 Thread Michael Paquier
On Fri, Dec 30, 2016 at 5:20 PM, Ashutosh Bapat wrote:
> As per the prologue of the function, it doesn't expect any 2PC files
> to be written out in the function i.e. between two checkpoints. Most
> of those are created and deleted between two checkpoints. Same would
> be true for recovery as well

Re: [HACKERS] Potential data loss of 2PC files

2016-12-30 Thread Ashutosh Bapat
On Fri, Dec 30, 2016 at 11:22 AM, Michael Paquier wrote:
> On Thu, Dec 29, 2016 at 6:41 PM, Ashutosh Bapat wrote:
>> I agree with this.
>> If no prepared transactions were required to be fsynced in
>> CheckPointTwoPhase(), do we want to still fsync the directory?
>> Probably not.
>>
>> Maybe you

Re: [HACKERS] Potential data loss of 2PC files

2016-12-29 Thread Michael Paquier
On Thu, Dec 29, 2016 at 6:41 PM, Ashutosh Bapat wrote:
> I agree with this.
> If no prepared transactions were required to be fsynced in
> CheckPointTwoPhase(), do we want to still fsync the directory?
> Probably not.
>
> Maybe you want to call fsync_fname(TWOPHASE_DIR, true); if
> serialized_xacts

Re: [HACKERS] Potential data loss of 2PC files

2016-12-29 Thread Ashutosh Bapat
On Thu, Dec 22, 2016 at 7:00 AM, Michael Paquier wrote:
> Hi all,
>
> 2PC files are created using RecreateTwoPhaseFile() in two places currently:
> - at replay on a XLOG_XACT_PREPARE record.
> - At checkpoint with CheckPointTwoPhase().
>
> Now RecreateTwoPhaseFile() is careful to call pg_fsync() t

Re: [HACKERS] Potential data loss of 2PC files

2016-12-27 Thread Andres Freund
On 2016-12-27 14:09:05 +0900, Michael Paquier wrote:
> On Fri, Dec 23, 2016 at 3:02 AM, Andres Freund wrote:
> > Not quite IIRC: that doesn't deal with file size increase. All this would
> > be easier if hardlinks wouldn't exist IIUC. It's basically a question
> > whether dentry, inode or conte

Re: [HACKERS] Potential data loss of 2PC files

2016-12-26 Thread Michael Paquier
On Fri, Dec 23, 2016 at 3:02 AM, Andres Freund wrote:
> Not quite IIRC: that doesn't deal with file size increase. All this would be
> easier if hardlinks wouldn't exist IIUC. It's basically a question whether
> dentry, inode or contents need to be synced.

Yes, it sucks. I did more monitorin

Re: [HACKERS] Potential data loss of 2PC files

2016-12-22 Thread Michael Paquier
On Fri, Dec 23, 2016 at 6:33 AM, Jim Nasby wrote: > On 12/22/16 12:02 PM, Andres Freund wrote: >> >> >> On December 22, 2016 6:44:22 PM GMT+01:00, Robert Haas >> wrote: >>> >>> On Thu, Dec 22, 2016 at 12:39 PM, Andres Freund >>> wrote: It makes more sense if you mentally separate betwe

Re: [HACKERS] Potential data loss of 2PC files

2016-12-22 Thread Jim Nasby
On 12/22/16 12:02 PM, Andres Freund wrote: On December 22, 2016 6:44:22 PM GMT+01:00, Robert Haas wrote: On Thu, Dec 22, 2016 at 12:39 PM, Andres Freund wrote: It makes more sense if you mentally separate between filename(s) and file contents. Having to do filesystem metadata transactions

Re: [HACKERS] Potential data loss of 2PC files

2016-12-22 Thread Andres Freund
On December 22, 2016 6:44:22 PM GMT+01:00, Robert Haas wrote:
> On Thu, Dec 22, 2016 at 12:39 PM, Andres Freund wrote:
>> It makes more sense if you mentally separate between filename(s) and
>> file contents. Having to do filesystem metadata transactions for an
>> fsync intended to sync contents

Re: [HACKERS] Potential data loss of 2PC files

2016-12-22 Thread Robert Haas
On Thu, Dec 22, 2016 at 12:39 PM, Andres Freund wrote:
> It makes more sense if you mentally separate between filename(s) and file
> contents. Having to do filesystem metadata transactions for an fsync
> intended to sync contents would be annoying...

I thought that's why there's fdatasync.
--

Re: [HACKERS] Potential data loss of 2PC files

2016-12-22 Thread Andres Freund
On December 22, 2016 5:50:38 PM GMT+01:00, Robert Haas wrote:
> On Wed, Dec 21, 2016 at 8:30 PM, Michael Paquier wrote:
>> Hi all,
>>
>> 2PC files are created using RecreateTwoPhaseFile() in two places currently:
>> - at replay on a XLOG_XACT_PREPARE record.
>> - At checkpoint with CheckPoint

Re: [HACKERS] Potential data loss of 2PC files

2016-12-22 Thread Robert Haas
On Wed, Dec 21, 2016 at 8:30 PM, Michael Paquier wrote:
> Hi all,
>
> 2PC files are created using RecreateTwoPhaseFile() in two places currently:
> - at replay on a XLOG_XACT_PREPARE record.
> - At checkpoint with CheckPointTwoPhase().
>
> Now RecreateTwoPhaseFile() is careful to call pg_fsync() t