Re: PANIC: could not fsync file "pg_multixact/..." since commit dee663f7843

Tomas Vondra Wed, 04 Nov 2020 15:08:26 -0800

On 11/4/20 2:50 PM, Tomas Vondra wrote:

On Wed, Nov 04, 2020 at 05:36:46PM +1300, Thomas Munro wrote:

On Wed, Nov 4, 2020 at 2:57 PM Tomas Vondra
<tomas.von...@2ndquadrant.com> wrote:

On Wed, Nov 04, 2020 at 02:49:24PM +1300, Thomas Munro wrote:
>On Wed, Nov 4, 2020 at 2:32 PM Tomas Vondra
><tomas.von...@2ndquadrant.com> wrote:

>> After a while (~1h on my machine) the pg_multixact gets over 10GB, which
>> triggers a more aggressive cleanup (per MultiXactMemberFreezeThreshold).

>> My guess is that this discards some of the files, but checkpointer is
>> not aware of that, or something like that. Not sure.
>
>Urgh.  Thanks.  Looks like perhaps the problem is that I have
>RegisterSyncRequest(&tag, SYNC_FORGET_REQUEST, true) in one codepath
>that unlinks files, but not another.  Looking.


Maybe. I didn't have time to investigate this more deeply, and it takes
quite a bit of time to reproduce. I can try again with extra logging or
test some proposed fixes, if you give me a patch.


I think this should be fixed by doing all unlinking through a common
code path.  Does this pass your test?
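
As a rough illustration of that idea (a toy model only, not the patch
itself): the sketch below tracks queued fsync requests per segment and
shows why an unlink that skips the forget step leaves a stale request for
the next checkpoint. Every toy_* name is hypothetical; the line marked
"forget" merely stands in for RegisterSyncRequest(&tag,
SYNC_FORGET_REQUEST, true).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_MAX_SEGMENTS 16

/* Toy state, one slot per segment number (hypothetical, not PostgreSQL code). */
static bool toy_file_exists[TOY_MAX_SEGMENTS];
static bool toy_sync_pending[TOY_MAX_SEGMENTS];

/* Clear all state so each scenario starts from scratch. */
void toy_reset(void)
{
    memset(toy_file_exists, 0, sizeof(toy_file_exists));
    memset(toy_sync_pending, 0, sizeof(toy_sync_pending));
}

/* Writing a segment creates the file and queues an fsync for the checkpoint. */
void toy_write_segment(int segno)
{
    assert(segno >= 0 && segno < TOY_MAX_SEGMENTS);
    toy_file_exists[segno] = true;
    toy_sync_pending[segno] = true;
}

/* Problematic path: the file goes away, the queued request stays. */
void toy_unlink_without_forget(int segno)
{
    assert(segno >= 0 && segno < TOY_MAX_SEGMENTS);
    toy_file_exists[segno] = false;
}

/* Common path: cancel the queued request first, then remove the file. */
void toy_unlink_segment(int segno)
{
    assert(segno >= 0 && segno < TOY_MAX_SEGMENTS);
    toy_sync_pending[segno] = false;   /* forget */
    toy_file_exists[segno] = false;    /* unlink */
}

/*
 * Process every queued request. Returns how many requests pointed at a
 * file that no longer exists; in this toy model that count stands for
 * the "could not fsync file" failure.
 */
int toy_checkpoint(void)
{
    int failures = 0;

    for (int i = 0; i < TOY_MAX_SEGMENTS; i++)
    {
        if (!toy_sync_pending[i])
            continue;
        if (!toy_file_exists[i])
            failures++;
        toy_sync_pending[i] = false;
    }
    return failures;
}
```

In this model, toy_write_segment(3) followed by
toy_unlink_without_forget(3) makes toy_checkpoint() report one failure,
while the same sequence through toy_unlink_segment(3) reports none.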


Seems to be working - without the patch it failed after ~1h, now it's
running for more than 2h without a crash. I'll let it run for a few more
hours (on both machines).

It's been running for hours on both machines, without any crashes etc.
While that's not a definitive proof the fix is correct, it certainly
behaves differently.


regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
