On 13/03/2025 11:04, vignesh C wrote:
## Analysis
I think it was caused by a STATUS_DELETE_PENDING failure, and is not
related to the recent
updates to pg_upgrade.

The file "base/1/2683" is an index file for
pg_largeobject_loid_pn_index, and the
output meant that file creation failed. Below is a backtrace.

```
pgwin32_open() // <-- this returns -1
open()
BasicOpenFilePerm()
PathNameOpenFilePerm()
PathNameOpenFile()
mdcreate()
smgrcreate()
RelationCreateStorage()
RelationSetNewRelfilenumber()
ExecuteTruncateGuts()
ExecuteTruncate()
```

But this is strange. Before calling mdcreate(), we surely unlink the
file that has the same name. Below is the trace down to the unlink.

```
pgunlink()
unlink()
mdunlinkfork()
mdunlink()
smgrdounlinkall()
RelationSetNewRelfilenumber() // common path with above
ExecuteTruncateGuts()
ExecuteTruncate()
```

I found that Thomas said [4] that pgunlink() sometimes fails to remove
a file even though it returns OK; in that case the NTSTATUS is
STATUS_DELETE_PENDING. Also, a comment in pgwin32_open_handle()
mentions the same thing:

```
/*
* ERROR_ACCESS_DENIED is returned if the file is deleted but not yet
* gone (Windows NT status code is STATUS_DELETE_PENDING).  In that
* case, we'd better ask for the NT status too so we can translate it
* to a more Unix-like error.  We hope that nothing clobbers the NT
* status in between the internal NtCreateFile() call and CreateFile()
* returning.
*
```

The definition of STATUS_DELETE_PENDING can be seen in [5]. Based on
that, open() can indeed fail with STATUS_DELETE_PENDING if it tries to
open a file whose deletion is still pending.
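
To make the sequence concrete, here is a toy model in C of a single
file name with deferred deletion. It is only a sketch of the behaviour
described above, not PostgreSQL or Windows code: ToyFile, toy_unlink(),
toy_close() and toy_create() are hypothetical names, and reporting the
failed creation as EEXIST is an assumption of the model.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy model of one file name; all names here are hypothetical. */
typedef struct ToyFile
{
    bool exists;          /* name is still visible in the directory */
    bool delete_pending;  /* unlink was requested but is deferred */
    int  open_handles;    /* handles still held by other processes */
} ToyFile;

/* Model of unlink(): reports success even when removal is deferred. */
static int
toy_unlink(ToyFile *f)
{
    if (!f->exists)
    {
        errno = ENOENT;
        return -1;
    }
    if (f->open_handles > 0)
        f->delete_pending = true;   /* name survives until the last close */
    else
        f->exists = false;
    return 0;
}

/* Model of another process closing its handle. */
static void
toy_close(ToyFile *f)
{
    assert(f->open_handles > 0);
    f->open_handles--;
    if (f->open_handles == 0 && f->delete_pending)
    {
        f->exists = false;          /* the deferred delete completes now */
        f->delete_pending = false;
    }
}

/* Model of an exclusive create; EEXIST is an assumption of this model. */
static int
toy_create(ToyFile *f)
{
    if (f->exists)
    {
        errno = EEXIST;
        return -1;
    }
    f->exists = true;
    return 0;
}
```

With one handle still open elsewhere, toy_unlink() returns 0 but a
following toy_create() fails; it succeeds only after toy_close() drops
the last handle.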
---------------------------------------------

This was fixed by the following change in the target upgrade nodes:
bgwriter_lru_maxpages = 0
checkpoint_timeout = 1h

Attached is a patch along similar lines for 004_subscription.

Hmm, this problem isn't limited to this one pg_upgrade test, right? It could happen with any pg_upgrade invocation. And perhaps in a running server too, if a relfilenumber is reused quickly. In dropdb() and DropTableSpace() we do this:

WaitForProcSignalBarrier(EmitProcSignalBarrier(PROCSIGNAL_BARRIER_SMGRRELEASE));

Should we do the same here? Not sure where exactly to put that; perhaps in mdcreate(), if the creation fails with STATUS_DELETE_PENDING.
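
To illustrate the shape of that idea, here is a standalone sketch in C. It is not a patch: create_with_release_retry(), DemoState, demo_create() and demo_release() are hypothetical names, the callback stands in for the WaitForProcSignalBarrier(EmitProcSignalBarrier(...)) call above, and using errno == EEXIST to detect the delete-pending case is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical callbacks: create returns an fd or -1 with errno set. */
typedef int  (*create_fn) (void *arg);
typedef void (*release_fn) (void *arg);

/*
 * Try to create the file once. If that fails in a way that could mean a
 * pending delete (modelled here as EEXIST, an assumption), ask everyone
 * to release their handles and try exactly one more time.
 */
static int
create_with_release_retry(create_fn create, release_fn release_all, void *arg)
{
    int fd;

    assert(create != NULL && release_all != NULL);

    fd = create(arg);
    if (fd < 0 && errno == EEXIST)
    {
        release_all(arg);   /* stands in for the SMGRRELEASE barrier */
        fd = create(arg);
    }
    return fd;
}

/* Demo state: how many stale handles other processes hold. */
typedef struct DemoState
{
    int stale_handles;
    int barriers;           /* how many times the barrier was emitted */
} DemoState;

/* Demo create: fails while any stale handle keeps the old file alive. */
static int
demo_create(void *arg)
{
    DemoState *s = arg;

    if (s->stale_handles > 0)
    {
        errno = EEXIST;
        return -1;
    }
    return 3;               /* arbitrary fake descriptor */
}

/* Demo barrier: every process drops its handles. */
static void
demo_release(void *arg)
{
    DemoState *s = arg;

    s->stale_handles = 0;
    s->barriers++;
}
```

The barrier is only paid for on the failure path, so the common case of a successful first create is unaffected.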

--
Heikki Linnakangas
Neon (https://neon.tech)


