Hi, I have a PostgreSQL (9.6.15) two node cluster setup with Patroni. The instances are configured to work in master <-> synchronous standby setup and both run in docker containers with pgdata volume mounted from host. When master is restarted the synchronous standby automatically takes the role of master and master starts operating as synchronous replica.
Everything works fine but whenever I restart master instance it creates a new wal file in pg_xlog/ however old wal files are not cleaned up. They pile up and in my reproduction environment, where there are no operations on the database, they currently occupy: du -sh pg_xlog/ 17G pg_xlog/ and the number of the files is more or less: ls pg_xlog/ | grep -v history | wc -l 1024 I've searched through the mailing lists and articles like https://dataegret.com/2018/04/pg_wal-is-too-big-whats-going-on/ and the main problems mentioned in most of the places are: 1. failing archive command 2. abandoned (inactive) replication slot 3. issues with checkpoints 3. too big wal_keep_segments value However, none of those problems seem to apply to my deployment: 1. I have `archive_mode` set to `off` 1a. I even tried enabling it and setting the `archive_command` to '/bin/true' just to confirm a suggestion found in one of the post on the mailing list (that didn't improve anything) 2. the only replication slot in `pg_replication_slots` is the one related to the synchronous replica and it is active 3. I've enabled `log_checkpoint` but doesn't see any errors or warnings related to checkpoints triggered either automatically or manually via `CHECKPOINT;` 4. `wal_keep_segments` is set to 8 and `max_wal_size` is set to 1GB Is there anything that I should check that could shed some light on this? A few configuration options taken from `pg_settings`: archive_command = (disabled) archive_mode = off archive_timeout = 0 check_function_bodies = on checkpoint_completion_target = 0.5 checkpoint_flush_after = 32 checkpoint_timeout = 300 hot_standby = on hot_standby_feedback = off max_replication_slots = 10 max_wal_senders = 10 max_wal_size = 64 min_wal_size = 5 synchronous_commit = on synchronous_standby_names = patroni1 wal_block_size = 8192 wal_buffers = 512 wal_compression = off wal_keep_segments = 8 wal_level = replica wal_log_hints = on wal_receiver_status_interval = 10 wal_receiver_timeout = 60000 wal_retrieve_retry_interval = 5000 wal_segment_size = 2048 wal_sender_timeout = 60000 wal_sync_method = fdatasync wal_writer_delay = 200 wal_writer_flush_after = 128 Kind regards