Hi folks,

The pgbackrest backup on one of my RepoServers fails intermittently with the error "WAL file cannot be archived before 60000 ms timeout".
The pgbackrest stanza check command sometimes succeeds and sometimes fails. I don't know why PG is unable to copy WAL files from pg_wal to /data/myarchive_dir in real time. I have always observed a delay of around 10 minutes before a WAL file in pg_wal appears in /data/my_archive_dir.

On investigation I've found that our DB admin has set checkpoint_timeout to 10 minutes in the postgresql.conf file. I think this causes the WAL archiving delay, and as a result my pgbackrest backup fails while trying to back up the DB to a remote RepoServer.

What is the ideal value for checkpoint_timeout to overcome this issue? I don't want the pgbackrest backup to fail because of this parameter. (Is it possible to set a very small value for checkpoint_timeout? What is the minimum value, or can I set it to 0?)

The archive command is:

    archive_command = 'pgbackrest --stanza=My_Repo archive-push %p && cp %p /data/archive/%f'

From the PostgreSQL logs I am seeing this:

    ERROR: [082]: unable to push WAL file '000000010000026300000002' to the archive asynchronously after 60 second(s)
    HINT: check '/var/log/pgbackrest/My_Repo-archive-push-async.log' for errors.
    INFO: archive-push command end: aborted with exception [082]
    2025-05-02 12:15:17 IST LOG: archive command failed with exit code 82
    2025-05-02 12:15:17 IST DETAIL: The failed archive command was: pgbackrest --stanza=My_Repo archive-push pg_wal/000000010000026300000002 && cp pg_wal/000000010000026300000002 /data/archive/000000010000026300000002
    INFO: archive-push command begin 2.52.1: [pg_wal/000000010000026300000002] --archive-async --compress-type=zst --exec-id=2848559-384cf49c --log-level-console=info --log-level-file=debug --log-level-stderr=info --pg1-path=/var/lib/postgres/16/data --pg-version-force=16 --process-max=6 --repo1-host=10.50.12.202 --repo1-host-user=pgbackrest --spool-path=/var/spool/pgbackrest --stanza=My_Repo

top output on the DB cluster:

    top - 12:37:00 up 66 days, 17:24, 2 users, load average: 4.04, 4.72, 4.56
    Tasks: 902 total, 4 running, 897 sleeping, 0 stopped, 1 zombie
    %Cpu(s): 7.4 us, 1.7 sy, 0.0 ni, 89.9 id, 0.4 wa, 0.2 hi, 0.4 si, 0.0 st
    MiB Mem : 31837.6 total, 706.1 free, 15243.0 used, 24741.0 buff/cache
    MiB Swap: 8060.0 total, 6634.0 free, 1426.0 used. 16608.9 avail Mem

        PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
    2839363 postgre+  20   0 8965608   7.2g   7.1g S  70.2  23.0   2:02.61 postgres
    2864108 postgre+  20   0 8967848   7.1g   7.1g S  64.9  22.8   0:30.04 postgres
    2865547 postgre+  20   0 8965432   7.1g   7.1g S  39.1  22.8   0:32.30 postgres
    2865752 postgre+  20   0 8964352   6.9g   6.9g S  16.6  22.3   0:32.94 postgres

CPU details:

    Model name:          Intel(R) Xeon(R) Gold 6430
    BIOS Model name:     Intel(R) Xeon(R) Gold 6430
    CPU family:          6
    Model:               143
    Thread(s) per core:  1
    Core(s) per socket:  16

These are vCPUs (16 in total). OS: RHEL 9, PostgreSQL 16.

Any hints on how to make pgbackrest take backups properly are much appreciated.

Thanks,
Krishane
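
P.S. For anyone comparing the archive_command setting with the DETAIL line in the log, here is a minimal Python sketch of how the %p and %f placeholders are expanded. It is only an illustration of the documented substitution rules (%p = path of the WAL file, %f = file name only, %% = literal percent sign), not PostgreSQL's actual code, and the function name expand_archive_command is made up for this example.

```python
import os
import re


def expand_archive_command(template: str, wal_path: str) -> str:
    """Expand %p, %f and %% in an archive_command template (illustrative helper)."""
    replacements = {
        "%p": wal_path,                    # path of the WAL file to archive
        "%f": os.path.basename(wal_path),  # file name only
        "%%": "%",                         # literal percent sign
    }
    # Single pass so that "%%" is never mistaken for the start of %p or %f.
    return re.sub(r"%[pf%]", lambda m: replacements[m.group(0)], template)


template = "pgbackrest --stanza=My_Repo archive-push %p && cp %p /data/archive/%f"
expanded = expand_archive_command(template, "pg_wal/000000010000026300000002")
print(expanded)
```

The printed command matches the one in the DETAIL log line. Because the two parts are joined with &&, the shell runs the cp step only if archive-push exits with status 0, so exit code 82 from pgbackrest fails the whole archive command for that segment.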