Hello Amit and Kuroda-san,
03.07.2024 14:02, Amit Kapila wrote:
> Pushed 0002 and 0003. Let's wait for a discussion on 0001.
Please look at another failure of the test [1]:
[13:28:05.647](2.460s) not ok 26 - failover slot is synced
[13:28:05.648](0.001s) # Failed test 'failover slot is synced'
# at /home/bf/bf-build/skink-master/HEAD/pgsql/src/bin/pg_basebackup/t/040_pg_createsubscriber.pl line 307.
[13:28:05.648](0.000s) # got: ''
# expected: 'failover_slot'
with 040_pg_createsubscriber_node_s.log containing:
2024-07-08 13:28:05.369 UTC [3985464][client backend][0/2:0] LOG: statement: SELECT pg_sync_replication_slots()
2024-07-08 13:28:05.557 UTC [3985464][client backend][0/2:0] LOG: could not sync slot "failover_slot" as remote slot precedes local slot
2024-07-08 13:28:05.557 UTC [3985464][client backend][0/2:0] DETAIL: Remote slot has LSN 0/30047B8 and catalog xmin 743, but local slot has LSN 0/30047B8 and catalog xmin 744.
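For illustration, here is a minimal Python sketch of the comparison implied by the DETAIL line above: the slot is treated as "preceding" when either its LSN or its catalog xmin is behind the local one. The function names are hypothetical, and the modulo-2^32 xid comparison is an assumption modelled on how normal transaction IDs are usually compared; the real check lives in the server's C code and may differ in detail.

```python
def parse_lsn(text: str) -> int:
    """Convert an LSN in 'hi/lo' hexadecimal form to a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def xid_precedes(a: int, b: int) -> bool:
    """Wraparound-aware comparison of two normal transaction IDs.

    Assumption: special (non-normal) xids are ignored in this sketch.
    """
    diff = (a - b) & 0xFFFFFFFF
    # A difference with the top bit set is negative as a signed 32-bit value.
    return diff >= 0x80000000


def remote_precedes_local(remote_lsn: str, remote_xmin: int,
                          local_lsn: str, local_xmin: int) -> bool:
    """True if the remote slot is behind the local one in LSN or catalog xmin."""
    return (parse_lsn(remote_lsn) < parse_lsn(local_lsn)
            or xid_precedes(remote_xmin, local_xmin))


# Values taken from the DETAIL line in the log above.
print(remote_precedes_local("0/30047B8", 743, "0/30047B8", 744))
```

With the logged values the LSNs are equal, so under this sketch the catalog xmin difference (743 on the remote slot versus 744 on the local one) alone is enough to reject the sync.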
I could not reproduce it locally, but I've discovered that this subtest
somehow depends on the pg_createsubscriber run executed for the
'primary contains unmet conditions on node P' check. For example, with this
test modification:
@@ -249,7 +249,7 @@ command_fails(
$node_p->connstr($db1), '--socket-directory',
$node_s->host, '--subscriber-port',
$node_s->port, '--database',
- $db1, '--database',
+ 'XXX', '--database',
$db2
],
'primary contains unmet conditions on node P');
I see the same failure:
2024-07-09 10:19:43.284 UTC [938890] 040_pg_createsubscriber.pl LOG: statement: SELECT pg_sync_replication_slots()
2024-07-09 10:19:43.292 UTC [938890] 040_pg_createsubscriber.pl LOG: could not sync slot "failover_slot" as remote slot precedes local slot
2024-07-09 10:19:43.292 UTC [938890] 040_pg_createsubscriber.pl DETAIL: Remote slot has LSN 0/3004780 and catalog xmin 743, but local slot has LSN 0/3004780 and catalog xmin 744.
Thus maybe even a normal pg_createsubscriber run can affect the primary
server (its catalog xmin) differently?
One difference I found in the logs is that the skink failure's
regress_log_040_pg_createsubscriber contains:
pg_createsubscriber: error: publisher requires 2 wal sender processes, but only 1 remain
Though for a successful run I see locally (I can't find logs of
successful test runs on skink):
pg_createsubscriber: error: publisher requires 2 wal sender processes, but only 0 remain
[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2024-07-08%2013%3A16%3A35
Best regards,
Alexander