Hello Michael and Bertrand,

I'd also like to note that even with FREEZE added [1], I happened to see
the test failure:
5 # Failed test 'inactiveslot slot invalidation is logged with vacuum on pg_class'
5 # at t/035_standby_logical_decoding.pl line 222.
5
5 # Failed test 'activeslot slot invalidation is logged with vacuum on pg_class'
5 # at t/035_standby_logical_decoding.pl line 227.

where 035_standby_logical_decoding_primary.log contains:
...
2024-01-09 07:44:26.480 UTC [820142] 035_standby_logical_decoding.pl LOG: statement: DROP TABLE conflict_test;
2024-01-09 07:44:26.687 UTC [820142] 035_standby_logical_decoding.pl LOG: statement: VACUUM (VERBOSE, FREEZE) pg_class;
2024-01-09 07:44:26.687 UTC [820142] 035_standby_logical_decoding.pl INFO: aggressively vacuuming "testdb.pg_catalog.pg_class"
2024-01-09 07:44:27.099 UTC [820143] DEBUG: autovacuum: processing database "testdb"
2024-01-09 07:44:27.102 UTC [820142] 035_standby_logical_decoding.pl INFO: finished vacuuming "testdb.pg_catalog.pg_class": index scans: 1
        pages: 0 removed, 11 remain, 11 scanned (100.00% of total)
        tuples: 0 removed, 423 remain, 4 are dead but not yet removable
        removable cutoff: 762, which was 2 XIDs old when operation ended
        new relfrozenxid: 762, which is 2 XIDs ahead of previous value
        frozen: 1 pages from table (9.09% of total) had 1 tuples frozen
....

Thus just adding FREEZE is not enough, seemingly. It makes me wonder if
0174c2d21 should be superseded by a patch like the one discussed (or just
have autovacuum = off added)...

09.01.2024 07:59, Michael Paquier wrote:
Alexander, does the test gain in stability once you begin using the
patch posted on [2], mentioned by Bertrand?

(Also, perhaps we'd better move the discussion to the other thread
where the patch has been sent.)

[2]: https://www.postgresql.org/message-id/d40d015f-03a4-1cf2-6c1f-2b9aca860...@gmail.com

09.01.2024 08:29, Bertrand Drouvot wrote:
Alexander, please find attached v3 which is more or less a rebased version of it.

Bertrand, thank you for updating the patch!

Michael, it definitely increases stability of the test (tens of iterations
with 20 tests in parallel performed successfully), although I've managed to
see another interesting failure (twice):
13 # Failed test 'activeslot slot invalidation is logged with vacuum on pg_class'
13 # at t/035_standby_logical_decoding.pl line 227.

psql:<stdin>:1: INFO: vacuuming "testdb.pg_catalog.pg_class"
psql:<stdin>:1: INFO: finished vacuuming "testdb.pg_catalog.pg_class": index scans: 1
pages: 0 removed, 11 remain, 11 scanned (100.00% of total)
tuples: 4 removed, 419 remain, 0 are dead but not yet removable
removable cutoff: 754, which was 0 XIDs old when operation ended
...
Waiting for replication conn standby's replay_lsn to pass 0/403E6F8 on primary

But I see no VACUUM records in WAL:
rmgr: Transaction len (rec/tot): 222/ 222, tx: 0, lsn: 0/0403E468, prev 0/0403E370, desc: INVALIDATION ; inval msgs: catcache 55 catcache 54 catcache 55 catcache 54 catcache 55 catcache 54 catcache 55 catcache 54 relcache 2662 relcache 2663 relcache 3455 relcache 1259
rmgr: Standby len (rec/tot): 234/ 234, tx: 0, lsn: 0/0403E548, prev 0/0403E468, desc: INVALIDATIONS ; relcache init file inval dbid 16384 tsid 1663; inval msgs: catcache 55 catcache 54 catcache 55 catcache 54 catcache 55 catcache 54 catcache 55 catcache 54 relcache 2662 relcache 2663 relcache 3455 relcache 1259
rmgr: Heap len (rec/tot): 60/ 140, tx: 754, lsn: 0/0403E638, prev 0/0403E548, desc: INSERT off: 2, flags: 0x08, blkref #0: rel 1663/16384/16385 blk 0 FPW
rmgr: Transaction len (rec/tot): 46/ 46, tx: 754, lsn: 0/0403E6C8, prev 0/0403E638, desc: COMMIT 2024-01-09 13:40:59.873385 UTC
rmgr: Standby len (rec/tot): 50/ 50, tx: 0, lsn: 0/0403E6F8, prev 0/0403E6C8, desc: RUNNING_XACTS nextXid 755 latestCompletedXid 754 oldestRunningXid 755
rmgr: XLOG len (rec/tot): 30/ 30, tx: 0, lsn: 0/0403E730, prev 0/0403E6F8, desc: CHECKPOINT_REDO
rmgr: Standby len (rec/tot): 50/ 50, tx: 0, lsn: 0/0403E750, prev 0/0403E730, desc: RUNNING_XACTS nextXid 755 latestCompletedXid 754 oldestRunningXid 755
rmgr: XLOG len (rec/tot): 114/ 114, tx: 0, lsn: 0/0403E788, prev 0/0403E750, desc: CHECKPOINT_ONLINE redo 0/403E730; tli 1; prev tli 1; fpw true; xid 0:755; oid 24576; multi 1; offset 0; oldest xid 728 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 755; online
rmgr: Standby len (rec/tot): 50/ 50, tx: 0, lsn: 0/0403E800, prev 0/0403E788, desc: RUNNING_XACTS nextXid 755 latestCompletedXid 754 oldestRunningXid 755

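For anyone who wants to repeat the "no VACUUM records" check on their own dump without reading it by eye, here is a rough sketch in Python. It is only an illustration: it assumes the input is pg_waldump's default text output, and it assumes that records produced by vacuum pruning or freezing would show up under the Heap2 resource manager. The names summarize_wal and has_rmgr are made up for this example, and the sample is three records copied from the dump above.

```python
import re
from collections import Counter

# Matches the leading fields of one record line in pg_waldump text output.
RECORD_RE = re.compile(
    r"^rmgr:\s+(?P<rmgr>\S+)\s+len \(rec/tot\):\s*\d+/\s*\d+,"
    r"\s+tx:\s*(?P<xid>\d+),\s+lsn:\s*(?P<lsn>[0-9A-F]+/[0-9A-F]+),"
    r".*?desc:\s+(?P<op>[A-Z0-9_+]+)"
)


def summarize_wal(dump_text):
    """Count records per (resource manager, operation) pair."""
    counts = Counter()
    for line in dump_text.splitlines():
        m = RECORD_RE.match(line.strip())
        if m:  # lines that are not WAL records are skipped
            counts[(m.group("rmgr"), m.group("op"))] += 1
    return counts


def has_rmgr(counts, rmgr):
    """True if at least one counted record belongs to the given rmgr."""
    return any(name == rmgr for name, _ in counts)


# Three records taken from the dump in this message.
SAMPLE = """\
rmgr: Heap len (rec/tot): 60/ 140, tx: 754, lsn: 0/0403E638, prev 0/0403E548, desc: INSERT off: 2, flags: 0x08, blkref #0: rel 1663/16384/16385 blk 0 FPW
rmgr: Transaction len (rec/tot): 46/ 46, tx: 754, lsn: 0/0403E6C8, prev 0/0403E638, desc: COMMIT 2024-01-09 13:40:59.873385 UTC
rmgr: Standby len (rec/tot): 50/ 50, tx: 0, lsn: 0/0403E6F8, prev 0/0403E6C8, desc: RUNNING_XACTS nextXid 755 latestCompletedXid 754 oldestRunningXid 755
"""

if __name__ == "__main__":
    counts = summarize_wal(SAMPLE)
    for (rmgr, op), n in sorted(counts.items()):
        print(f"{rmgr:12} {op:20} {n}")
    # Assumption: vacuum pruning/freezing would be logged under Heap2.
    print("Heap2 records present:", has_rmgr(counts, "Heap2"))
```

Feeding it the whole dump instead of SAMPLE gives a per-rmgr tally, which makes the absence (or presence) of a whole record class easier to spot than scanning the raw lines.
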
(Full logs are attached.)

[1] https://www.postgresql.org/message-id/4fd52508-54d7-0202-5bd3-546c2295967f%40gmail.com

Best regards,
Alexander
035-failures-after-vacuum-on-pg_class.tar.gz
Description: application/gzip