On Mon, Apr 15, 2024 at 2:53 AM Nicolas Seinlet <nico...@seinlet.com> wrote:

> Hello everyone,
>
> Since I moved some clusters from PostgreSQL 12 to 14, I have noticed random
> failures in streaming replication. I say "random" mostly because I haven't
> found the source of the issue yet.
>
> I'm using the Ubuntu/encrypted ZFS/PostgreSQL combination. I'm using Ubuntu
> LTS (20.04 and 22.04) with the ZFS and PostgreSQL versions provided by each
> LTS release (PostgreSQL 12 on Ubuntu 20.04 and 14 on 22.04).
>
> Streaming replication is configured with
> `primary_conninfo 'host=main_server port=5432 user=replicant
> password=a_very_secure_password sslmode=require
> application_name=replication_postgresql_app'`, with no replication slot and
> no restore command, and the WAL is configured with `full_page_writes = off
> wal_init_zero = off wal_recycle = off`.
>
> While this works like a charm on PostgreSQL 12, it sometimes fails with
> PostgreSQL 14. As we also changed the OS, the issue may lie somewhere
> else.
>
> When the issue is detected, the WAL on the primary is correct. A piece of
> the WAL is wrong on the secondary, only a few bytes. Some bytes later, the
> WAL is correct again. Stopping PostgreSQL on the secondary, removing the
> wrong WAL file, and restarting PostgreSQL solves the issue.
>
> We've added another secondary and noticed the issue can appear on one of
> the secondaries, not both at the same time.
>
> What can I do to detect the origin of this issue?
>

1. Minor version number?
2. Using replication_slots?
3. Error message(s)?
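
To narrow down where the bad bytes sit, it may help to compare the same WAL segment from the primary and from the affected secondary and list the byte ranges that differ. Below is a minimal Python sketch, assuming both copies of a completed segment have been copied to one machine. The function name `diff_ranges`, the script name in the usage note, and the 8192-byte WAL page size (the default build value) are assumptions for illustration, not something taken from your setup.

```python
import sys

# Assumption: default WAL page size (XLOG_BLCKSZ) of 8192 bytes.
WAL_PAGE_SIZE = 8192


def diff_ranges(a: bytes, b: bytes) -> list[tuple[int, int]]:
    """Return half-open (start, end) byte ranges where a and b differ."""
    ranges = []
    start = None
    common = min(len(a), len(b))
    for i in range(common):
        if a[i] != b[i]:
            if start is None:
                start = i  # a differing run begins here
        elif start is not None:
            ranges.append((start, i))  # the run ended just before i
            start = None
    if start is not None:
        ranges.append((start, common))
    if len(a) != len(b):
        # Bytes present in only one of the two files count as a difference.
        ranges.append((common, max(len(a), len(b))))
    return ranges


if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1], "rb") as f:
        primary = f.read()
    with open(sys.argv[2], "rb") as f:
        secondary = f.read()
    for start, end in diff_ranges(primary, secondary):
        page = start // WAL_PAGE_SIZE
        print(f"bytes {start}..{end - 1} differ ({end - start} bytes, WAL page {page})")
```

It could be run as `python3 waldiff.py primary_segment secondary_segment` (hypothetical script and file names). Whether the differing ranges line up with WAL page boundaries or fall at arbitrary offsets might hint at which layer is involved.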
