Hi,

While running some multixact-oriented stress tests, I noticed that
commit dee663f7843:

    Defer flushing of SLRU files.

    Previously, we called fsync() after writing out individual pg_xact,
    pg_multixact and pg_commit_ts pages due to cache pressure, leading to
    regular I/O stalls in user backends and recovery.  Collapse requests for
    the same file into a single system call as part of the next checkpoint,
    as we already did for relation files, using the infrastructure developed
    by commit 3eb77eba.  This can cause a significant improvement to
    recovery performance, especially when it's otherwise CPU-bound.

    ...

seems to trigger this issue:

    [17820] LOG:  checkpoint starting: wal
    [17820] PANIC:  could not fsync file "pg_multixact/offsets/06E0": No such 
file or directory
    [17818] LOG:  checkpointer process (PID 17820) was terminated by signal 6: 
Aborted
    [17818] LOG:  terminating any other active server processes

which is then followed by this during recovery:

    [18599] LOG:  redo starts at 1F/FF098138
    [18599] LOG:  file "pg_multixact/offsets/0635" doesn't exist, reading as 
zeroes
    [18599] CONTEXT:  WAL redo at 1F/FF09A218 for MultiXact/CREATE_ID: 
104201060 offset 1687158668 nmembers 3: 2128819 (keysh) 2128823 (keysh) 2128827 
(keysh)
    [18599] LOG:  file "pg_multixact/members/7DE3" doesn't exist, reading as 
zeroes
    [18599] CONTEXT:  WAL redo at 1F/FF09A218 for MultiXact/CREATE_ID: 
104201060 offset 1687158668 nmembers 3: 2128819 (keysh) 2128823 (keysh) 2128827 
(keysh)
    [18599] LOG:  redo done at 2A/D4D8BFB0 system usage: CPU: user: 265.57 s, 
system: 12.43 s, elapsed: 278.06 s
    [18599] LOG:  checkpoint starting: end-of-recovery immediate
    [18599] PANIC:  could not fsync file "pg_multixact/offsets/06E0": No such 
file or directory
    [17818] LOG:  startup process (PID 18599) was terminated by signal 6: 
Aborted
    [17818] LOG:  aborting startup due to startup process failure
    [17818] LOG:  database system is shut down

at which point the cluster is kaput, of course.

It's clearly the fault of dee663f7843 - 4 failures out of 4 attempts on
that commit, and after switching to ca7f8e2b86 it goes away.

Reproducing it is pretty simple, but it takes a bit of time. Essentially
do this:

    create table t (a int primary key);
    insert into t select i from generate_series(1,1000) s(i);

and then run

    SELECT * FROM t FOR KEY SHARE;

from pgbench with many concurrent clients. I do this:

    pgbench -n -c 32 -j 8 -f select.sql -T 86400 test

After a while (~1h on my machine) the pg_multixact gets over 10GB, which
triggers a more aggressive cleanup (per MultiXactMemberFreezeThreshold).
My guess is that this discards some of the files, but checkpointer is
not aware of that, or something like that. Not sure.

Attached are backtraces from the two crashes - regular and during
recovery. Not sure how interesting / helpful that is, it probably does
not say much about how we got there.


regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:49
49      return ret;
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:49
#1  0x00007faee64fe53a in __GI_abort () at abort.c:79
#2  0x0000564dbf3ac117 in errfinish (filename=<optimized out>, 
lineno=<optimized out>, funcname=0x564dbf4ebb00 <__func__.16955> 
"ProcessSyncRequests") at elog.c:592
#3  0x0000564dbf2a2191 in ProcessSyncRequests () at sync.c:416
#4  0x0000564dbf05dee4 in CheckPointGuts (checkPointRedo=274827688840, 
flags=flags@entry=192) at xlog.c:9196
#5  0x0000564dbf0642e6 in CreateCheckPoint (flags=192) at xlog.c:8974
#6  0x0000564dbf222642 in CheckpointerMain () at checkpointer.c:451
#7  0x0000564dbf0740e7 in AuxiliaryProcessMain (argc=argc@entry=2, 
argv=argv@entry=0x7ffd07376d40) at bootstrap.c:451
#8  0x0000564dbf22dd8b in StartChildProcess (type=CheckpointerProcess) at 
postmaster.c:5512
#9  0x0000564dbf22f042 in reaper (postgres_signal_arg=<optimized out>) at 
postmaster.c:3044
#10 <signal handler called>
#11 0x00007faee65d1d46 in __GI___select (nfds=nfds@entry=8, 
readfds=readfds@entry=0x7ffd073774e0, writefds=writefds@entry=0x0, 
exceptfds=exceptfds@entry=0x0, timeout=timeout@entry=0x7ffd07377440) at 
../sysdeps/unix/sysv/linux/select.c:41
#12 0x0000564dbf22fa0c in ServerLoop () at postmaster.c:1706
#13 0x0000564dbf2314f3 in PostmasterMain (argc=3, argv=<optimized out>) at 
postmaster.c:1415
#14 0x0000564dbefd78ae in main (argc=3, argv=0x564dbfe19530) at main.c:209


Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:49
49      return ret;
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:49
#1  0x00007faee64fe53a in __GI_abort () at abort.c:79
#2  0x0000564dbf3ac117 in errfinish (filename=<optimized out>, 
lineno=<optimized out>, funcname=0x564dbf4ebb00 <__func__.16955> 
"ProcessSyncRequests") at elog.c:592
#3  0x0000564dbf2a2191 in ProcessSyncRequests () at sync.c:416
#4  0x0000564dbf05dee4 in CheckPointGuts (checkPointRedo=275381223376, 
flags=flags@entry=6) at xlog.c:9196
#5  0x0000564dbf0642e6 in CreateCheckPoint (flags=6) at xlog.c:8974
#6  0x0000564dbf06790f in StartupXLOG () at xlog.c:7775
#7  0x0000564dbf231bc2 in StartupProcessMain () at startup.c:204
#8  0x0000564dbf0740f1 in AuxiliaryProcessMain (argc=argc@entry=2, 
argv=argv@entry=0x7ffd07376d30) at bootstrap.c:441
#9  0x0000564dbf22dd8b in StartChildProcess (type=StartupProcess) at 
postmaster.c:5512
#10 0x0000564dbf22dea0 in PostmasterStateMachine () at postmaster.c:4027
#11 0x0000564dbf22ee2d in reaper (postgres_signal_arg=<optimized out>) at 
postmaster.c:3255
#12 <signal handler called>
#13 0x00007faee65d1d46 in __GI___select (nfds=nfds@entry=8, 
readfds=readfds@entry=0x7ffd073774e0, writefds=writefds@entry=0x0, 
exceptfds=exceptfds@entry=0x0, timeout=timeout@entry=0x7ffd07377440) at 
../sysdeps/unix/sysv/linux/select.c:41
#14 0x0000564dbf22fa0c in ServerLoop () at postmaster.c:1706
#15 0x0000564dbf2314f3 in PostmasterMain (argc=3, argv=<optimized out>) at 
postmaster.c:1415
#16 0x0000564dbefd78ae in main (argc=3, argv=0x564dbfe19530) at main.c:209

Reply via email to