On Thu, Sep 01, 2022 at 12:05:36PM +1200, Thomas Munro wrote: > On Thu, Sep 1, 2022 at 2:01 AM Justin Pryzby <pry...@telsasoft.com> wrote: > > < 2022-08-31 08:44:10.495 CDT >LOG: checkpoint starting: end-of-recovery > > immediate wait > > < 2022-08-31 08:44:10.609 CDT >LOG: request to flush past end of > > generated WAL; request 1201/1CAF84F0, current position 1201/1CADB730 > > < 2022-08-31 08:44:10.609 CDT >CONTEXT: writing block 0 of relation > > base/16881/2840_vm > > < 2022-08-31 08:44:10.609 CDT >ERROR: xlog flush request 1201/1CAF84F0 is > > not satisfied --- flushed only to 1201/1CADB730 > > < 2022-08-31 08:44:10.609 CDT >CONTEXT: writing block 0 of relation > > base/16881/2840_vm > > < 2022-08-31 08:44:10.609 CDT >FATAL: checkpoint request failed > > > > I was able to start it with -c recovery_prefetch=no, so it seems like > > prefetch tried to do too much. The VM runs centos7 under qemu. > > I'm making a copy of the data dir in cases it's needed. > > Hmm, a page with an LSN set 118208 bytes past the end of WAL. It's a > vm fork page (which recovery prefetch should ignore completely). Did > you happen to get a copy before the successful recovery? After the > successful recovery, what LSN does that page have, and can you find > the references to it in the WAL with eg pg_waldump -R 1663/16681/2840 > -F vm? Have you turned FPW off (perhaps this is on ZFS?)?
Yes, I have a copy that reproduces the issue: #1 0x00000000009a0df4 in errfinish (filename=<optimized out>, filename@entry=0xa15535 "xlog.c", lineno=lineno@entry=2671, funcname=funcname@entry=0xa1da27 <__func__.22763> "XLogFlush") at elog.c:588 #2 0x000000000055f1cf in XLogFlush (record=19795985532144) at xlog.c:2668 #3 0x0000000000813b24 in FlushBuffer (buf=0x7fffdf1f8900, reln=<optimized out>) at bufmgr.c:2889 #4 0x0000000000817a5b in SyncOneBuffer (buf_id=buf_id@entry=7796, skip_recently_used=skip_recently_used@entry=false, wb_context=wb_context@entry=0x7fffffffcdf0) at bufmgr.c:2576 #5 0x00000000008181c2 in BufferSync (flags=flags@entry=358) at bufmgr.c:2164 #6 0x00000000008182f5 in CheckPointBuffers (flags=flags@entry=358) at bufmgr.c:2743 #7 0x00000000005587b2 in CheckPointGuts (checkPointRedo=19795985413936, flags=flags@entry=358) at xlog.c:6855 #8 0x000000000055feb3 in CreateCheckPoint (flags=flags@entry=358) at xlog.c:6534 #9 0x00000000007aceaa in CheckpointerMain () at checkpointer.c:455 #10 0x00000000007aac52 in AuxiliaryProcessMain (auxtype=auxtype@entry=CheckpointerProcess) at auxprocess.c:153 #11 0x00000000007b0bd8 in StartChildProcess (type=<optimized out>) at postmaster.c:5430 #12 0x00000000007b388f in PostmasterMain (argc=argc@entry=7, argv=argv@entry=0xf139e0) at postmaster.c:1463 #13 0x00000000004986a6 in main (argc=7, argv=0xf139e0) at main.c:202 It's not on zfs, and FPW have never been turned off. I should add that this instance has been pg_upgraded since v10. BTW, base/16881 is the postgres DB )which has 43GB of logfiles imported from CSV, plus 2GB of snapshots of pg_control_checkpoint, pg_settings, pg_stat_bgwriter, pg_stat_database, pg_stat_wal). postgres=# SELECT * FROM page_header(get_raw_page('pg_toast.pg_toast_2619', 'main', 0)); lsn | checksum | flags | lower | upper | special | pagesize | version | prune_xid ---------------+----------+-------+-------+-------+---------+----------+---------+------------ 1201/1CDD1F98 | -6200 | 1 | 44 | 424 | 8192 | 8192 | 4 | 3681043287 (1 fila) postgres=# SELECT * FROM page_header(get_raw_page('pg_toast.pg_toast_2619', 'vm', 0)); lsn | checksum | flags | lower | upper | special | pagesize | version | prune_xid ---------------+----------+-------+-------+-------+---------+----------+---------+----------- 1201/1CAF84F0 | -6010 | 0 | 24 | 8192 | 8192 | 8192 | 4 | 0 I found this in waldump (note that you had a typoe - it's 16881). [pryzbyj@template0 ~]$ sudo /usr/pgsql-15/bin/pg_waldump -R 1663/16881/2840 -F vm -p /mnt/tmp/15/data/pg_wal 00000001000012010000001C rmgr: Heap2 len (rec/tot): 64/ 122, tx: 0, lsn: 1201/1CAF2658, prev 1201/1CAF2618, desc: VISIBLE cutoff xid 3681024856 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0 FPW, blkref #1: rel 1663/16881/2840 blk 54 rmgr: Heap2 len (rec/tot): 59/ 59, tx: 0, lsn: 1201/1CAF3AF8, prev 1201/1CAF2788, desc: VISIBLE cutoff xid 2 flags 0x03, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 0 rmgr: Heap2 len (rec/tot): 59/ 59, tx: 0, lsn: 1201/1CAF3B70, prev 1201/1CAF3B38, desc: VISIBLE cutoff xid 3671427998 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 2 rmgr: Heap2 len (rec/tot): 59/ 59, tx: 0, lsn: 1201/1CAF4DC8, prev 1201/1CAF3BB0, desc: VISIBLE cutoff xid 3672889900 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 4 rmgr: Heap2 len (rec/tot): 59/ 59, tx: 0, lsn: 1201/1CAF5FB0, prev 1201/1CAF4E08, desc: VISIBLE cutoff xid 3679743844 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 5 rmgr: Heap2 len (rec/tot): 59/ 59, tx: 0, lsn: 1201/1CAF7320, prev 1201/1CAF5FF0, desc: VISIBLE cutoff xid 3679743844 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 6 rmgr: Heap2 len (rec/tot): 59/ 59, tx: 0, lsn: 1201/1CAF7398, prev 1201/1CAF7360, desc: VISIBLE cutoff xid 3679751919 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 11 rmgr: Heap2 len (rec/tot): 59/ 59, tx: 0, lsn: 1201/1CAF7410, prev 1201/1CAF73D8, desc: VISIBLE cutoff xid 2 flags 0x03, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 17 rmgr: Heap2 len (rec/tot): 59/ 59, tx: 0, lsn: 1201/1CAF7488, prev 1201/1CAF7450, desc: VISIBLE cutoff xid 2 flags 0x03, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 19 rmgr: Heap2 len (rec/tot): 59/ 59, tx: 0, lsn: 1201/1CAF7500, prev 1201/1CAF74C8, desc: VISIBLE cutoff xid 3645406844 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 23 rmgr: Heap2 len (rec/tot): 59/ 59, tx: 0, lsn: 1201/1CAF7578, prev 1201/1CAF7540, desc: VISIBLE cutoff xid 3669978567 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 24 rmgr: Heap2 len (rec/tot): 59/ 59, tx: 0, lsn: 1201/1CAF75F0, prev 1201/1CAF75B8, desc: VISIBLE cutoff xid 0 flags 0x03, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 25 rmgr: Heap2 len (rec/tot): 59/ 59, tx: 0, lsn: 1201/1CAF7668, prev 1201/1CAF7630, desc: VISIBLE cutoff xid 3681024856 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 26 rmgr: Heap2 len (rec/tot): 59/ 59, tx: 0, lsn: 1201/1CAF76E0, prev 1201/1CAF76A8, desc: VISIBLE cutoff xid 3671911724 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 27 rmgr: Heap2 len (rec/tot): 59/ 59, tx: 0, lsn: 1201/1CAF7758, prev 1201/1CAF7720, desc: VISIBLE cutoff xid 2 flags 0x03, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 34 rmgr: Heap2 len (rec/tot): 59/ 59, tx: 0, lsn: 1201/1CAF77D0, prev 1201/1CAF7798, desc: VISIBLE cutoff xid 2 flags 0x03, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 35 rmgr: Heap2 len (rec/tot): 59/ 59, tx: 0, lsn: 1201/1CAF7EF8, prev 1201/1CAF7810, desc: VISIBLE cutoff xid 3672408544 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 37 rmgr: Heap2 len (rec/tot): 59/ 59, tx: 0, lsn: 1201/1CAF7F70, prev 1201/1CAF7F38, desc: VISIBLE cutoff xid 2 flags 0x03, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 38 rmgr: Heap2 len (rec/tot): 59/ 59, tx: 0, lsn: 1201/1CAF7FE8, prev 1201/1CAF7FB0, desc: VISIBLE cutoff xid 2 flags 0x03, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 39 rmgr: Heap2 len (rec/tot): 59/ 59, tx: 0, lsn: 1201/1CAF8078, prev 1201/1CAF8040, desc: VISIBLE cutoff xid 3678237783 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 41 rmgr: Heap2 len (rec/tot): 59/ 59, tx: 0, lsn: 1201/1CAF80F0, prev 1201/1CAF80B8, desc: VISIBLE cutoff xid 3672408544 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 42 rmgr: Heap2 len (rec/tot): 59/ 59, tx: 0, lsn: 1201/1CAF8168, prev 1201/1CAF8130, desc: VISIBLE cutoff xid 3680789266 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 43 rmgr: Heap2 len (rec/tot): 59/ 59, tx: 0, lsn: 1201/1CAF81E0, prev 1201/1CAF81A8, desc: VISIBLE cutoff xid 3667994218 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 44 rmgr: Heap2 len (rec/tot): 59/ 59, tx: 0, lsn: 1201/1CAF8258, prev 1201/1CAF8220, desc: VISIBLE cutoff xid 3680789266 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 45 rmgr: Heap2 len (rec/tot): 59/ 59, tx: 0, lsn: 1201/1CAF82D0, prev 1201/1CAF8298, desc: VISIBLE cutoff xid 3673830395 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 48 rmgr: Heap2 len (rec/tot): 59/ 59, tx: 0, lsn: 1201/1CAF8348, prev 1201/1CAF8310, desc: VISIBLE cutoff xid 0 flags 0x03, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 50 rmgr: Heap2 len (rec/tot): 59/ 59, tx: 0, lsn: 1201/1CAF83C0, prev 1201/1CAF8388, desc: VISIBLE cutoff xid 3681024856 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 51 rmgr: Heap2 len (rec/tot): 59/ 59, tx: 0, lsn: 1201/1CAF8438, prev 1201/1CAF8400, desc: VISIBLE cutoff xid 3681024856 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 52 rmgr: Heap2 len (rec/tot): 59/ 59, tx: 0, lsn: 1201/1CAF84B0, prev 1201/1CAF8478, desc: VISIBLE cutoff xid 3678741092 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 53 pg_waldump: error: error en registro de WAL en 1201/1CD90E48: invalid record length at 1201/1CD91010: wanted 24, got 0 I could send our WAL to you if that's desirable .. -- Justin