Hello everyone,

Since I moved some clusters from PostgreSQL 12 to 14, I have noticed random
failures in streaming replication. I say "random" mostly because I haven't
found the source of the issue.

I'm using the Ubuntu / encrypted ZFS / PostgreSQL combination. I'm using Ubuntu
LTS (20.04 and 22.04) with the ZFS and PostgreSQL versions provided by each LTS
release (PostgreSQL 12 on Ubuntu 20.04 and 14 on 22.04).

Streaming replication is configured with `primary_conninfo
'host=main_server port=5432 user=replicant password=a_very_secure_password
sslmode=require application_name=replication_postgresql_app'`, with neither a
replication slot nor a restore command, and the WAL is configured with
`full_page_writes = off`, `wal_init_zero = off` and `wal_recycle = off`.

While this works like a charm on PostgreSQL 12, it sometimes fails with
PostgreSQL 14. As we also changed the OS, the issue may lie somewhere else.

When the issue is detected, the WAL on the primary is correct, but a small
part of the same WAL file on the secondary is wrong: only a few bytes differ,
and a few bytes later the WAL is correct again. Stopping PostgreSQL on the
secondary, removing the wrong WAL file, and restarting PostgreSQL solves the
issue.
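
To make the "only some bytes" observation concrete, the comparison of the two
copies of one WAL segment can be sketched in Python as follows. This is a
minimal sketch, not a PostgreSQL tool: the function names are made up for
illustration, it assumes both copies of the segment have been saved somewhere
readable, and it assumes the default 8192-byte WAL block size when reporting
which WAL page a difference falls in.

```python
from pathlib import Path

# Assumption: default WAL block size (XLOG_BLCKSZ) of 8 kB.
WAL_BLOCK_SIZE = 8192


def diff_ranges(a: bytes, b: bytes):
    """Return half-open (start, end) byte ranges where a and b differ.

    If the lengths differ, the extra tail of the longer input is
    reported as one additional range.
    """
    ranges = []
    start = None
    common = min(len(a), len(b))
    for i in range(common):
        if a[i] != b[i]:
            if start is None:
                start = i  # a differing run begins here
        elif start is not None:
            ranges.append((start, i))  # the run ended at the previous byte
            start = None
    if start is not None:
        ranges.append((start, common))
    if len(a) != len(b):
        ranges.append((common, max(len(a), len(b))))
    return ranges


def compare_segments(primary_copy, secondary_copy):
    """Compare two saved copies of the same WAL segment file.

    Returns a list of (start, end, wal_page) tuples, where wal_page is the
    index of the WAL block containing the first differing byte of the range.
    """
    a = Path(primary_copy).read_bytes()
    b = Path(secondary_copy).read_bytes()
    return [(s, e, s // WAL_BLOCK_SIZE) for s, e in diff_ranges(a, b)]
```

Calling `compare_segments()` with the paths of the primary's and the
secondary's copy of the segment lists where the bad bytes start and end, which
helps to check whether they line up with a WAL page or a ZFS record boundary.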

We've added another secondary and noticed that the issue can appear on either
of the secondaries, but not on both at the same time.

What can I do to detect the origin of this issue?

Have a nice week,

Nicolas.
