Hi Michael,
Is there any progress on this problem? I can provide more detailed information if you need it.

Best wishes,
Chengwen Wu

------------------ Original ------------------
From: "Michael Paquier" <mich...@paquier.xyz>;
Date: Wed, Sep 11, 2024 05:21 PM
To: "????"<drec...@foxmail.com>;
Cc: "pgsql-hackers"<pgsql-hackers@lists.postgresql.org>;
Subject: Re: Fix orphaned 2pc file which may casue instance restart failed

On Sun, Sep 08, 2024 at 01:01:37PM +0800, ???? wrote:
> Hi all, I found that there is a race condition
> between two global transactions, which may cause instance restart
> to fail with error 'could not access status of transaction
> xxx","Could not open file ""pg_xact/xxx"": No such file or
> directory'.
>
> The scenario to reproduce the problem is:
> 1. gxact1 is doing `FinishPreparedTransaction` and a checkpoint
>    is issued, so gxact1 will generate a 2pc file.
> 2. Then gxact1 is removed from `TwoPhaseState->prepXacts` and
>    its state memory is returned to the freelist.
> 3. But just before gxact1 removes its 2pc file, gxact2 is issued;
>    gxact2 reuses the same state memory as gxact1 and
>    resets `gxact->ondisk` to false.
> 4. gxact1 continues, finds that `gxact->ondisk` is false, and does not
>    remove its 2pc file. This file is orphaned.
>
> If gxact1's local xid is not frozen, the startup process will remove
> the orphaned 2pc file. However, if the xid's corresponding clog file is
> truncated by `vacuum`, the startup process will raise error 'could not
> access status of transaction xxx', because it cannot find the
> transaction's status file in dir `pg_xact`.

Hmm. I've not seen that in the field. Let me check that..
--
Michael
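
For reference, the four steps quoted above can be modeled with a small standalone C sketch. This is only a toy model under stated assumptions, not PostgreSQL code: `SimGxact`, `sim_prepare`, `sim_checkpoint`, and `sim_finish_races_with_prepare` are hypothetical names, the "freelist" is a single slot so that reuse is guaranteed, and the interleaving is replayed sequentially rather than with real concurrency. The `snapshot_flag_first` variant (copying the flag before the slot is released) is just one conceivable way to avoid the stale read in this model, not a statement about what the actual fix should be.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define SIM_MAX_XID 16

/* Hypothetical stand-in for a gxact state entry; only the fields
 * needed to illustrate the race are modeled. */
typedef struct SimGxact
{
    unsigned xid;
    bool     ondisk;   /* true once a 2pc file has been written */
} SimGxact;

/* A one-entry "freelist": every prepare reuses the same memory. */
static SimGxact sim_slot;
/* Simulated 2pc files, indexed by xid. */
static bool sim_file_exists[SIM_MAX_XID];

static void sim_reset(void)
{
    memset(&sim_slot, 0, sizeof(sim_slot));
    memset(sim_file_exists, 0, sizeof(sim_file_exists));
}

/* Take the slot for a new prepared transaction; this resets ondisk. */
static SimGxact *sim_prepare(unsigned xid)
{
    assert(xid < SIM_MAX_XID);
    sim_slot.xid = xid;
    sim_slot.ondisk = false;
    return &sim_slot;
}

/* A checkpoint writes the 2pc file and marks the entry as on disk. */
static void sim_checkpoint(SimGxact *g)
{
    sim_file_exists[g->xid] = true;
    g->ondisk = true;
}

/* Replays steps 1-4 in order. Returns true if gxact1's 2pc file is
 * still present at the end, i.e. it has been orphaned. */
static bool sim_finish_races_with_prepare(bool snapshot_flag_first)
{
    const unsigned xid1 = 1, xid2 = 2;

    sim_reset();

    /* Step 1: gxact1 is being finished while a checkpoint writes its file. */
    SimGxact *g1 = sim_prepare(xid1);
    sim_checkpoint(g1);

    /* Variant: remember the flag while the slot still belongs to gxact1. */
    bool ondisk_snapshot = g1->ondisk;

    /* Step 2: the slot goes back to the freelist (implicit with one slot). */

    /* Step 3: gxact2 reuses the same memory and resets ondisk to false. */
    SimGxact *g2 = sim_prepare(xid2);
    (void) g2;

    /* Step 4: gxact1 decides whether to remove its 2pc file. Reading
     * g1->ondisk here sees the value written on behalf of gxact2. */
    bool remove_file = snapshot_flag_first ? ondisk_snapshot : g1->ondisk;
    if (remove_file)
        sim_file_exists[xid1] = false;

    return sim_file_exists[xid1];
}
```

In this model, `sim_finish_races_with_prepare(false)` leaves the file for xid 1 behind because the flag is read after the slot has been reused, while `sim_finish_races_with_prepare(true)` removes it.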