Re: Back-patch of: avoid multiple hard links to same WAL file after a crash

2025-04-26 Thread Noah Misch
On Fri, Apr 25, 2025 at 03:35:06PM -0400, Andres Freund wrote: > On 2025-04-20 14:53:39 -0700, Noah Misch wrote: > > The checkpoints and WAL creation took 30s, but archiving was only 20% done > > (based on file name 0001006D) at the 360s PGCTLTIMEOUT. > > Huh. That seems surprisin

Re: Back-patch of: avoid multiple hard links to same WAL file after a crash

2025-04-25 Thread Andres Freund
Hi, On 2025-04-20 14:53:39 -0700, Noah Misch wrote: > On Mon, Apr 14, 2025 at 09:19:35AM +0900, Michael Paquier wrote: > > On Sun, Apr 13, 2025 at 11:51:57AM -0400, Tom Lane wrote: > > > Noah Misch writes: > > > > Tom and Michael, do you still object to the test addition, or not? If > > > > the

Re: Back-patch of: avoid multiple hard links to same WAL file after a crash

2025-04-20 Thread Noah Misch
On Sun, Apr 20, 2025 at 02:53:39PM -0700, Noah Misch wrote: > On Mon, Apr 14, 2025 at 09:19:35AM +0900, Michael Paquier wrote: > > On Sun, Apr 13, 2025 at 11:51:57AM -0400, Tom Lane wrote: > > > Noah Misch writes: > > > > Tom and Michael, do you still object to the test addition, or not? If > >

Re: Back-patch of: avoid multiple hard links to same WAL file after a crash

2025-04-20 Thread Noah Misch
On Mon, Apr 14, 2025 at 09:19:35AM +0900, Michael Paquier wrote: > On Sun, Apr 13, 2025 at 11:51:57AM -0400, Tom Lane wrote: > > Noah Misch writes: > > > Tom and Michael, do you still object to the test addition, or not? If > > > there > > > are no new or renewed objections by 2025-04-20, I'll p

Re: Back-patch of: avoid multiple hard links to same WAL file after a crash

2025-04-13 Thread Michael Paquier
On Sun, Apr 13, 2025 at 11:51:57AM -0400, Tom Lane wrote: > Noah Misch writes: > > Tom and Michael, do you still object to the test addition, or not? If there > > are no new or renewed objections by 2025-04-20, I'll proceed to add the > > test. > > While I still don't love it, I don't have a be

Re: Back-patch of: avoid multiple hard links to same WAL file after a crash

2025-04-13 Thread Tom Lane
Noah Misch writes: > Tom and Michael, do you still object to the test addition, or not? If there > are no new or renewed objections by 2025-04-20, I'll proceed to add the test. While I still don't love it, I don't have a better proposal. regards, tom lane

Re: Back-patch of: avoid multiple hard links to same WAL file after a crash

2025-04-13 Thread Noah Misch
On Sat, Apr 05, 2025 at 07:09:58PM -0700, Noah Misch wrote: > On Sun, Apr 06, 2025 at 07:42:02AM +0900, Michael Paquier wrote: > > On Sat, Apr 05, 2025 at 12:13:39PM -0700, Noah Misch wrote: > > > Since the 2025-02 releases made non-toy-size archive recoveries fail > > > easily, > > > that's not e

Re: Back-patch of: avoid multiple hard links to same WAL file after a crash

2025-04-05 Thread Michael Paquier
On Sat, Apr 05, 2025 at 07:14:27PM +0900, Michael Paquier wrote: > On Wed, Apr 02, 2025 at 05:29:00PM -0700, Noah Misch wrote: >> Your back-patches are correct. Thanks. > > Thanks for double-checking. I'll move on with what I have after a > second look as it's been a few weeks since I've looked

Re: Back-patch of: avoid multiple hard links to same WAL file after a crash

2025-04-05 Thread Noah Misch
On Sun, Apr 06, 2025 at 07:42:02AM +0900, Michael Paquier wrote: > On Sat, Apr 05, 2025 at 12:13:39PM -0700, Noah Misch wrote: > > Since the 2025-02 releases made non-toy-size archive recoveries fail easily, > > that's not enough. If the proposed 3-second test is the wrong thing, what > > instead?

Re: Back-patch of: avoid multiple hard links to same WAL file after a crash

2025-04-05 Thread Michael Paquier
On Sat, Apr 05, 2025 at 12:13:39PM -0700, Noah Misch wrote: > Since the 2025-02 releases made non-toy-size archive recoveries fail easily, > that's not enough. If the proposed 3-second test is the wrong thing, what > instead? I don't have a good idea about that in ~16, TBH, but I am sure to not b

Re: Back-patch of: avoid multiple hard links to same WAL file after a crash

2025-04-05 Thread Noah Misch
On Sat, Apr 05, 2025 at 11:07:13AM -0400, Tom Lane wrote: > Michael Paquier writes: > > On Wed, Apr 02, 2025 at 05:29:00PM -0700, Noah Misch wrote: > >> Here it is. Making it fail three times took looping 1383s, 5841s, and > >> 2594s. > >> Hence, it couldn't be expected to catch the regression b

Re: Back-patch of: avoid multiple hard links to same WAL file after a crash

2025-04-05 Thread Noah Misch
On Tue, Mar 11, 2025 at 06:23:15PM -0700, Noah Misch wrote: > On Wed, Mar 12, 2025 at 09:46:27AM +0900, Michael Paquier wrote: > > On Tue, Mar 11, 2025 at 01:57:49PM -0700, Noah Misch wrote: > > > Thanks for crafting back-branch versions. I've queued a task to confirm > > > I get > > > the same r

Re: Back-patch of: avoid multiple hard links to same WAL file after a crash

2025-04-05 Thread Tom Lane
Michael Paquier writes: > On Wed, Apr 02, 2025 at 05:29:00PM -0700, Noah Misch wrote: >> Here it is. Making it fail three times took looping 1383s, 5841s, and 2594s. >> Hence, it couldn't be expected to catch the regression before commit, but it >> would have made sufficient buildfarm and CI nois

Re: Back-patch of: avoid multiple hard links to same WAL file after a crash

2025-04-05 Thread Michael Paquier
On Wed, Apr 02, 2025 at 05:29:00PM -0700, Noah Misch wrote: > Your back-patches are correct. Thanks. Thanks for double-checking. I'll move on with what I have after a second look as it's been a few weeks since I've looked at all these conflicts. I am also planning to add a few notes in the comm

Re: Back-patch of: avoid multiple hard links to same WAL file after a crash

2025-03-11 Thread Noah Misch
On Wed, Mar 12, 2025 at 09:46:27AM +0900, Michael Paquier wrote: > On Tue, Mar 11, 2025 at 01:57:49PM -0700, Noah Misch wrote: > > Thanks for crafting back-branch versions. I've queued a task to confirm I > > get > > the same result. > > Thanks for that. That helps a lot. I'll let you know whe

Re: Back-patch of: avoid multiple hard links to same WAL file after a crash

2025-03-11 Thread Michael Paquier
On Tue, Mar 11, 2025 at 01:57:49PM -0700, Noah Misch wrote: > Thanks for crafting back-branch versions. I've queued a task to confirm I get > the same result. Thanks for that. That helps a lot. > There's a test case I'll polish, too. Are you considering the addition of a TAP test in 17~ based

Re: Back-patch of: avoid multiple hard links to same WAL file after a crash

2025-03-11 Thread Noah Misch
On Mon, Mar 10, 2025 at 02:25:28PM +0900, Michael Paquier wrote: > On Thu, Mar 06, 2025 at 01:44:30PM -0600, Nathan Bossart wrote: > > On Thu, Mar 06, 2025 at 11:30:13AM -0800, Noah Misch wrote: > >> 1. Make v14 and v13 skip WAL recycling and preallocation during archive > >>recovery, like newe

Re: Back-patch of: avoid multiple hard links to same WAL file after a crash

2025-03-11 Thread Noah Misch
On Thu, Dec 19, 2024 at 02:44:53PM +0900, Michael Paquier wrote: > On Wed, Dec 18, 2024 at 08:51:20PM -0500, Andres Freund wrote: > > I don't think the issue is actually quite as unlikely to be hit as reasoned > > in > > the commit message. The crash has indeed to happen between the link() and >

Re: Back-patch of: avoid multiple hard links to same WAL file after a crash

2025-03-11 Thread Michael Paquier
On Mon, Mar 10, 2025 at 02:25:28PM +0900, Michael Paquier wrote: > I am attaching a full set of patches for v14 and v13 that can be used > for these tests. WDYT? And I forgot to mention that I've checked both v14 and v13, where I have noticed the two patterns "invalid magic number in log seg

Re: Back-patch of: avoid multiple hard links to same WAL file after a crash

2025-03-10 Thread Michael Paquier
On Thu, Mar 06, 2025 at 01:44:30PM -0600, Nathan Bossart wrote: > On Thu, Mar 06, 2025 at 11:30:13AM -0800, Noah Misch wrote: >> 1. Make v14 and v13 skip WAL recycling and preallocation during archive >>recovery, like newer branches do. I think that means back-patching the >> six >>commit

Re: Back-patch of: avoid multiple hard links to same WAL file after a crash

2025-03-06 Thread Nathan Bossart
On Thu, Mar 06, 2025 at 11:30:13AM -0800, Noah Misch wrote: > Options I see: > > 1. Make v14 and v13 skip WAL recycling and preallocation during archive >recovery, like newer branches do. I think that means back-patching the six >commits cc2c7d6~4 cc2c7d6~3 cc2c7d6~2 cc2c7d6~1 cc2c7d6 e36

Re: Back-patch of: avoid multiple hard links to same WAL file after a crash

2024-12-19 Thread Robert Pang
Dear Michael, Thank you for applying this back-patch. I also appreciate everyone's input on this issue. Sincerely, Robert Pang On Thu, Dec 19, 2024 at 4:13 PM Michael Paquier wrote: > On Thu, Dec 19, 2024 at 11:07:25AM -0500, Andres Freund wrote: > > On 2024-12-19 09:31:14 -0600, Nathan Boss

Re: Back-patch of: avoid multiple hard links to same WAL file after a crash

2024-12-19 Thread Michael Paquier
On Thu, Dec 19, 2024 at 11:07:25AM -0500, Andres Freund wrote: > On 2024-12-19 09:31:14 -0600, Nathan Bossart wrote: >> LGTM > > Dito. Thanks for double-checking. Done this one. -- Michael signature.asc Description: PGP signature

Re: Back-patch of: avoid multiple hard links to same WAL file after a crash

2024-12-19 Thread Andres Freund
On 2024-12-19 09:31:14 -0600, Nathan Bossart wrote: > On Thu, Dec 19, 2024 at 02:44:53PM +0900, Michael Paquier wrote: > > I've been double-checking the code to refresh myself with the problem, > > and I don't see a reason to not apply something like the attached set > > down to v13 for all these r

Re: Back-patch of: avoid multiple hard links to same WAL file after a crash

2024-12-19 Thread Nathan Bossart
On Thu, Dec 19, 2024 at 02:44:53PM +0900, Michael Paquier wrote: > I've been double-checking the code to refresh myself with the problem, > and I don't see a reason to not apply something like the attached set > down to v13 for all these remaining branches (minus an edit of the > commit message).

Re: Back-patch of: avoid multiple hard links to same WAL file after a crash

2024-12-18 Thread Michael Paquier
On Wed, Dec 18, 2024 at 08:51:20PM -0500, Andres Freund wrote: > I don't think the issue is actually quite as unlikely to be hit as reasoned in > the commit message. The crash has indeed to happen between the link() and > unlink() - but at the end of a checkpoint we do that operations hundreds of

Re: Back-patch of: avoid multiple hard links to same WAL file after a crash

2024-12-18 Thread Andres Freund
Hi, On 2024-12-18 10:38:19 -0600, Nathan Bossart wrote: > On Tue, Dec 17, 2024 at 04:50:16PM -0800, Robert Pang wrote: > > We recently observed a few cases where Postgres running on Linux > > encountered an issue with WAL segment files. Specifically, two WAL > > segments were linked to the same ph

Re: Back-patch of: avoid multiple hard links to same WAL file after a crash

2024-12-18 Thread Nathan Bossart
On Tue, Dec 17, 2024 at 04:50:16PM -0800, Robert Pang wrote: > We recently observed a few cases where Postgres running on Linux > encountered an issue with WAL segment files. Specifically, two WAL > segments were linked to the same physical file after Postgres ran out > of memory and the OOM killer

Back-patch of: avoid multiple hard links to same WAL file after a crash

2024-12-17 Thread Robert Pang
Dear team, We recently observed a few cases where Postgres running on Linux encountered an issue with WAL segment files. Specifically, two WAL segments were linked to the same physical file after Postgres ran out of memory and the OOM killer terminated one of its processes. This resulted in the WA

Re: avoid multiple hard links to same WAL file after a crash

2022-07-05 Thread Michael Paquier
On Tue, Jul 05, 2022 at 09:58:38AM -0700, Nathan Bossart wrote: > Thanks! I wonder if we should add a comment in writeTimeLineHistoryFile() > about possible concurrent use by a WAL receiver and the startup process and > why that is okay. Agreed. Adding an extra note at the top of the routine wou

Re: avoid multiple hard links to same WAL file after a crash

2022-07-05 Thread Nathan Bossart
On Tue, Jul 05, 2022 at 10:19:49AM +0900, Michael Paquier wrote: > On Thu, May 05, 2022 at 08:10:02PM +0900, Michael Paquier wrote: >> I'd agree with removing all the callers at the end. pgrename() is >> quite robust on Windows, but I'd keep the two checks in >> writeTimeLineHistory(), as the logi

Re: avoid multiple hard links to same WAL file after a crash

2022-07-04 Thread Michael Paquier
On Thu, May 05, 2022 at 08:10:02PM +0900, Michael Paquier wrote: > I'd agree with removing all the callers at the end. pgrename() is > quite robust on Windows, but I'd keep the two checks in > writeTimeLineHistory(), as the logic around findNewestTimeLine() would > consider a past TLI history file

Re: avoid multiple hard links to same WAL file after a crash

2022-05-05 Thread Michael Paquier
On Mon, May 02, 2022 at 04:06:13PM -0700, Nathan Bossart wrote: > Here is a new patch set. For now, I've only removed the file existence > check in writeTimeLineHistoryFile(). I don't know if I'm totally convinced > that there isn't a problem here (e.g., due to concurrent .ready file > creation),

Re: avoid multiple hard links to same WAL file after a crash

2022-05-02 Thread Nathan Bossart
On Mon, May 02, 2022 at 10:39:07AM -0700, Nathan Bossart wrote: > On Mon, May 02, 2022 at 07:48:18PM +0900, Michael Paquier wrote: >> The WAL receiver upgrades the ERROR to a FATAL, and restarts >> streaming shortly after. Using durable_rename() would not be an issue >> here. > > Thanks for inves

Re: avoid multiple hard links to same WAL file after a crash

2022-05-02 Thread Nathan Bossart
On Mon, May 02, 2022 at 07:48:18PM +0900, Michael Paquier wrote: > Skimming through at the buildfarm logs, it happens that the tests are > able to see this race from time to time. Here is one such example on > rorqual: > https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=rorqual&dt=2022

Re: avoid multiple hard links to same WAL file after a crash

2022-05-02 Thread Nathan Bossart
On Sun, May 01, 2022 at 10:08:53PM +0900, Michael Paquier wrote: > At the end, switching directly from durable_rename_excl() to > durable_rename() should be fine for the WAL segment initialization, > but we could do things a bit more carefully by adding a check on the > file existence before callin

Re: avoid multiple hard links to same WAL file after a crash

2022-05-02 Thread Michael Paquier
On Sun, May 01, 2022 at 10:08:53PM +0900, Michael Paquier wrote: > Now, I am surprised by the third code path of durable_rename_excl(), > as of the WAL receiver doing writeTimeLineHistoryFile(), to not cause > any issues, as link() should exit with EEXIST when the startup process > grabs the same h

Re: avoid multiple hard links to same WAL file after a crash

2022-05-01 Thread Michael Paquier
On Tue, Apr 12, 2022 at 09:27:42AM -0700, Nathan Bossart wrote: > On Tue, Apr 12, 2022 at 03:46:31PM +0900, Kyotaro Horiguchi wrote: >> At Mon, 11 Apr 2022 09:52:57 -0700, Nathan Bossart >> wrote in >>> I traced this back a while ago. I believe the link() was first added in >>> November 2000 as

Re: avoid multiple hard links to same WAL file after a crash

2022-04-27 Thread Michael Paquier
On Wed, Apr 27, 2022 at 11:42:04AM -0700, Nathan Bossart wrote: > Here is a new patch set with these assertions added. I think at least the > xlog.c change ought to be back-patched. The problem may be unlikely, but > AFAICT the possible consequences include WAL corruption. Okay, so I have applie

Re: avoid multiple hard links to same WAL file after a crash

2022-04-27 Thread Nathan Bossart
On Wed, Apr 27, 2022 at 04:09:20PM +0900, Michael Paquier wrote: > I am not sure that have any need to backpatch this change based on the > unlikeliness of the problem, TBH. One thing that is itching me a bit, > like Robert upthread, is that we don't check anymore that the newfile > does not exist

Re: avoid multiple hard links to same WAL file after a crash

2022-04-27 Thread Michael Paquier
On Tue, Apr 26, 2022 at 01:09:35PM -0700, Nathan Bossart wrote: > Here is an attempt at creating something that can be back-patched. 0001 > simply replaces calls to durable_rename_excl() with durable_rename() and is > intended to be back-patched. 0002 removes the definition of > durable_rename_ex

Re: avoid multiple hard links to same WAL file after a crash

2022-04-26 Thread Nathan Bossart
Here is an attempt at creating something that can be back-patched. 0001 simply replaces calls to durable_rename_excl() with durable_rename() and is intended to be back-patched. 0002 removes the definition of durable_rename_excl() and is _not_ intended for back-patching. I imagine 0002 will need

Re: avoid multiple hard links to same WAL file after a crash

2022-04-18 Thread Greg Stark
The readdir interface allows processes to be in the middle of reading a directory and unless a kernel was happy to either materialize the entire directory list when the readdir starts, or lock the entire directory against modification for the entire time the a process has a readdir fd open it's alw

Re: avoid multiple hard links to same WAL file after a crash

2022-04-18 Thread Tom Lane
Michael Paquier writes: > On Fri, Apr 08, 2022 at 09:00:36PM -0400, Robert Haas wrote: >> I wonder if this is really true. I thought rename() was supposed to be >> atomic. > Not always. For example, some old versions of MacOS have a non-atomic > implementation of rename(), like prairiedog with

Re: avoid multiple hard links to same WAL file after a crash

2022-04-18 Thread Nathan Bossart
On Mon, Apr 18, 2022 at 04:48:35PM +0900, Michael Paquier wrote: > Saying that, it would be nice to see durable_rename_excl() gone as it > has created quite a bit of pain for us in the past years. Yeah, I think this is the right thing to do. Patch upthread [0]. For back-branches, I suspect we'll

Re: avoid multiple hard links to same WAL file after a crash

2022-04-18 Thread Michael Paquier
On Fri, Apr 08, 2022 at 09:00:36PM -0400, Robert Haas wrote: > On Fri, Apr 8, 2022 at 12:53 PM Nathan Bossart > wrote: >> I think there might be another problem. The man page for rename() seems to >> indicate that overwriting an existing file also introduces a window where >> the old and new pat

Re: avoid multiple hard links to same WAL file after a crash

2022-04-12 Thread Nathan Bossart
On Tue, Apr 12, 2022 at 03:46:31PM +0900, Kyotaro Horiguchi wrote: > At Mon, 11 Apr 2022 09:52:57 -0700, Nathan Bossart > wrote in >> I traced this back a while ago. I believe the link() was first added in >> November 2000 as part of f0e37a8. This even predates WAL recycling, which >> was adde

Re: avoid multiple hard links to same WAL file after a crash

2022-04-11 Thread Kyotaro Horiguchi
At Mon, 11 Apr 2022 09:52:57 -0700, Nathan Bossart wrote in > On Mon, Apr 11, 2022 at 12:28:47PM -0400, Tom Lane wrote: > > Robert Haas writes: > >> On Mon, Apr 11, 2022 at 5:12 AM Kyotaro Horiguchi > >> wrote: > >>> If this diagnosis is correct, the comment is proved to be paranoid. > > > >>

Re: avoid multiple hard links to same WAL file after a crash

2022-04-11 Thread Nathan Bossart
On Mon, Apr 11, 2022 at 12:28:47PM -0400, Tom Lane wrote: > Robert Haas writes: >> On Mon, Apr 11, 2022 at 5:12 AM Kyotaro Horiguchi >> wrote: >>> If this diagnosis is correct, the comment is proved to be paranoid. > >> It's sometimes difficult to understand what problems really old code >> comm

Re: avoid multiple hard links to same WAL file after a crash

2022-04-11 Thread Tom Lane
Robert Haas writes: > On Mon, Apr 11, 2022 at 5:12 AM Kyotaro Horiguchi > wrote: >> If this diagnosis is correct, the comment is proved to be paranoid. > It's sometimes difficult to understand what problems really old code > comments are worrying about. For example, could they have been > worryi

Re: avoid multiple hard links to same WAL file after a crash

2022-04-11 Thread Robert Haas
On Mon, Apr 11, 2022 at 5:12 AM Kyotaro Horiguchi wrote: > So, the only thing we need to care is segment switch. Without it, the > segment that InstallXLogFileSegment found by the stat loop is known to > be safe to overwrite even if exists. > > When segment switch finds an existing file, it's no p

Re: avoid multiple hard links to same WAL file after a crash

2022-04-11 Thread Kyotaro Horiguchi
At Thu, 7 Apr 2022 11:29:54 -0700, Nathan Bossart wrote in > The attached patch prevents this problem by using durable_rename() instead > of durable_rename_excl() for WAL recycling. This removes the protection > against accidentally overwriting an existing WAL file, but there shouldn't > be one

Re: avoid multiple hard links to same WAL file after a crash

2022-04-08 Thread Justin Pryzby
On Fri, Apr 08, 2022 at 09:00:36PM -0400, Robert Haas wrote: > > I think there might be another problem. The man page for rename() seems to > > indicate that overwriting an existing file also introduces a window where > > the old and new path are hard links to the same file. This isn't a problem

Re: avoid multiple hard links to same WAL file after a crash

2022-04-08 Thread Robert Haas
On Fri, Apr 8, 2022 at 12:53 PM Nathan Bossart wrote: > On Fri, Apr 08, 2022 at 10:38:03AM -0400, Robert Haas wrote: > > I see that durable_rename_excl() has the following comment: "Similar > > to durable_rename(), except that this routine tries (but does not > > guarantee) not to overwrite the ta

Re: avoid multiple hard links to same WAL file after a crash

2022-04-08 Thread Nathan Bossart
On Fri, Apr 08, 2022 at 10:38:03AM -0400, Robert Haas wrote: > I'd actually be in favor of nuking durable_rename_excl() from orbit > and putting the file-exists tests in the callers. Otherwise, someone > might assume that it actually has the semantics that its name > suggests, which could be pretty

Re: avoid multiple hard links to same WAL file after a crash

2022-04-08 Thread Nathan Bossart
On Fri, Apr 08, 2022 at 09:53:12AM -0700, Nathan Bossart wrote: > On Fri, Apr 08, 2022 at 10:38:03AM -0400, Robert Haas wrote: >> I'd actually be in favor of nuking durable_rename_excl() from orbit >> and putting the file-exists tests in the callers. Otherwise, someone >> might assume that it actua

Re: avoid multiple hard links to same WAL file after a crash

2022-04-08 Thread Nathan Bossart
On Fri, Apr 08, 2022 at 10:38:03AM -0400, Robert Haas wrote: > I see that durable_rename_excl() has the following comment: "Similar > to durable_rename(), except that this routine tries (but does not > guarantee) not to overwrite the target file." If those are the desired > semantics, we could achi

Re: avoid multiple hard links to same WAL file after a crash

2022-04-08 Thread Robert Haas
On Thu, Apr 7, 2022 at 2:30 PM Nathan Bossart wrote: > Presently, WAL recycling uses durable_rename_excl(), which notes that a > crash at an unfortunate moment can result in two links to the same file. > My testing [1] demonstrated that it was possible to end up with two links > to the same file i

avoid multiple hard links to same WAL file after a crash

2022-04-07 Thread Nathan Bossart
>From 244726f6a78aca52c2fe6e70cef966f152057191 Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Thu, 7 Apr 2022 10:07:42 -0700 Subject: [PATCH v1 1/1] Avoid multiple hard links to same WAL file after a crash. Presently, WAL recycling uses durable_rename_excl(), which notes that a cra