Hello,

While doing some work/research on the new incremental backup feature, I noticed that some limitations are not listed in the docs. Mainly the fact that pg_combinebackup works with the plain format and not with tar.
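To illustrate the tar limitation, here is a minimal shell sketch of how a tar-format backup might be unpacked into the plain layout before running pg_combinebackup. The helper name extract_tar_backup and all paths are hypothetical. It assumes an uncompressed backup taken with pg_basebackup --format=tar whose directory contains base.tar, optionally pg_wal.tar, and backup_manifest; tablespace archives are not handled.

```shell
#!/bin/sh
# Hypothetical helper (not part of PostgreSQL): unpack one tar-format
# base backup directory into a plain-format directory layout.
# Assumption: $1 holds base.tar, optionally pg_wal.tar, and backup_manifest.
extract_tar_backup() {
    src=$1
    dst=$2
    if [ -z "$src" ] || [ -z "$dst" ]; then
        echo "usage: extract_tar_backup SRC_DIR DST_DIR" >&2
        return 1
    fi
    # Create the target data directory and its pg_wal subdirectory.
    mkdir -p "$dst/pg_wal" || return 1
    # base.tar holds the contents of the main data directory.
    tar -xf "$src/base.tar" -C "$dst" || return 1
    # pg_wal.tar, when present, holds the WAL files that belong in pg_wal.
    if [ -f "$src/pg_wal.tar" ]; then
        tar -xf "$src/pg_wal.tar" -C "$dst/pg_wal" || return 1
    fi
    # In plain format the manifest sits at the top of the backup directory.
    cp "$src/backup_manifest" "$dst/backup_manifest"
}

# Possible usage (example paths, oldest backup first):
#   extract_tar_backup /backups/full_tar /backups/full
#   extract_tar_backup /backups/incr_tar /backups/incr
#   pg_combinebackup -o /backups/combined /backups/full /backups/incr
```

This only covers the simplest case; compressed archives or backups with extra tablespaces would need additional handling.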
Around the same time, Tomas Vondra tested incremental backups with a cluster where he enabled checksums after taking the previous full backup. After combining the backups, the synthetic backup had some pages with checksums and others without, which resulted in checksum errors.

I've attached two patches. The first one is just nit-picking things I found when I first read the docs. The second adds a note on the two limitations listed above.

For the limitation on incremental backups of a cluster that had checksums enabled after the previous backup, I was not sure whether it should go in the pg_basebackup or the pg_combinebackup reference documentation, or maybe somewhere else.

Kind regards, Martín

-- 
Martín Marqués
It’s not that I have something to hide,
it’s that I have nothing I want you to see
From eda9f0c811ba115edf47b4f81200073a41d10cc3 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Mart=C3=ADn=20Marqu=C3=A9s?= <martin.marq...@gmail.com>
Date: Sat, 6 Apr 2024 19:30:23 +0200
Subject: [PATCH 1/2] Remove unneeded wording in pg_combinebackup documentation

---
 doc/src/sgml/ref/pg_combinebackup.sgml | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/ref/pg_combinebackup.sgml b/doc/src/sgml/ref/pg_combinebackup.sgml
index 658e9a759c..19b6d159ce 100644
--- a/doc/src/sgml/ref/pg_combinebackup.sgml
+++ b/doc/src/sgml/ref/pg_combinebackup.sgml
@@ -37,10 +37,10 @@ PostgreSQL documentation
   </para>
 
   <para>
-   Specify all of the required backups on the command line from oldest to newest.
+   Specify all required backups on the command line from oldest to newest.
    That is, the first backup directory should be the path to the full backup,
    and the last should be the path to the final incremental backup
-   that you wish to restore. The reconstructed backup will be written to the
+   you wish to restore. The reconstructed backup will be written to the
    output directory specified by the <option>-o</option> option.
   </para>
 
@@ -48,7 +48,7 @@ PostgreSQL documentation
    Although <application>pg_combinebackup</application> will attempt to verify
    that the backups you specify form a legal backup chain from which a correct
    full backup can be reconstructed, it is not designed to help you keep track
-   of which backups depend on which other backups. If you remove the one or
+   of which backups depend on which other backups. If you remove one or
    more of the previous backups upon which your incremental backup relies,
    you will not be able to restore it.
   </para>
-- 
2.39.3
From 0fc5ea63d7a2700ea841c56dc766a11d8f4182ff Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Mart=C3=ADn=20Marqu=C3=A9s?= <martin.marq...@gmail.com>
Date: Tue, 9 Apr 2024 09:34:21 +0200
Subject: [PATCH 2/2] Add note of restrictions for combining incremental backups

When taking incremental backups the user must be warned that the backup
format has to be plain for pg_combinebackup to work properly.

Another thing to consider is if a cluster had checksums enabled after the
previous backup, an incremental backup will yield a possibly valid cluster
but with files from the previous backup that don't have checksums, giving
checksum errors when replaying subsequent changes to those blocks.

This behavior was brought up by Tomas Vondra while testing.
---
 doc/src/sgml/ref/pg_combinebackup.sgml | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/doc/src/sgml/ref/pg_combinebackup.sgml b/doc/src/sgml/ref/pg_combinebackup.sgml
index 19b6d159ce..1cafc0ab07 100644
--- a/doc/src/sgml/ref/pg_combinebackup.sgml
+++ b/doc/src/sgml/ref/pg_combinebackup.sgml
@@ -60,6 +60,27 @@ PostgreSQL documentation
    be specified on the command line in lieu of the chain of backups from which
    it was reconstructed.
   </para>
+
+  <para>
+   Note that there are limitations in combining backups:
+   <itemizedlist>
+    <listitem>
+     <para>
+      <application>pg_combinebackup</application> works with plain format only.
+      In order to combine backups in tar format, they need to be untarred first.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      If an incremental backup is taken from a cluster where checksums were enabled
+      after the reference backup finished, the resulting data may be valid, but
+      the checksums wouldn't validate for files from the reference backup.
+      In case of enabling checksums on an existing cluster, the next backup must be
+      a full backup.
+     </para>
+    </listitem>
+   </itemizedlist>
+  </para>
  </refsect1>
 
  <refsect1>
-- 
2.39.3