Hi Robert,

On 3/20/19 5:08 PM, David Steele wrote:

I'll revise the patch if Peter thinks this approach looks reasonable.

Hopefully Peter's silence can be interpreted as consent. Probably just busy, though.

I used your suggestions with minor editing. After some reflection, I agree that the inline warnings are likely to be more effective than something at the end, at least for those working on a new implementation.

Regards,
--
-David
da...@pgmasters.net
From bd1754696ca503795ca69386d1d29a7347a77276 Mon Sep 17 00:00:00 2001
From: David Steele <da...@pgmasters.net>
Date: Fri, 29 Mar 2019 10:12:09 +0000
Subject: [PATCH] Add exclusive backup deprecation notes to documentation.

Update the exclusive backup documentation to explain the limitations of
the exclusive backup mode and make it clear that the feature is
deprecated.

Update the log message when the backup_label is found but recovery
cannot proceed.
---
 doc/src/sgml/backup.sgml          | 52 +++++++++++++++++++++++--------
 src/backend/access/transam/xlog.c | 10 ++++--
 2 files changed, 47 insertions(+), 15 deletions(-)

diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index a73fd4d044..b67da8916a 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -948,13 +948,26 @@ SELECT * FROM pg_stop_backup(false, true);
    </sect3>
    <sect3 id="backup-lowlevel-base-backup-exclusive">
     <title>Making an exclusive low level backup</title>
+
+    <note>
+     <para>
+      The exclusive backup method is deprecated and should be avoided.
+      Prior to <productname>PostgreSQL</productname> 9.6, this was the only
+      low-level method available, but it is now recommended that all users
+      upgrade their scripts to use non-exclusive backups.
+     </para>
+    </note>
+
     <para>
      The process for an exclusive backup is mostly the same as for a
-     non-exclusive one, but it differs in a few key steps. This type of backup
-     can only be taken on a primary and does not allow concurrent backups.
-     Prior to <productname>PostgreSQL</productname> 9.6, this
-     was the only low-level method available, but it is now recommended that
-     all users upgrade their scripts to use non-exclusive backups if possible.
+     non-exclusive one, but it differs in a few key steps. This type of
+     backup can only be taken on a primary and does not allow concurrent
+     backups.  Moreover, because it writes a backup_label file on the
+     master, it can cause the master to fail to restart automatically after
+     a crash.  On the other hand, the erroneous removal of a backup_label
+     file from a backup or standby is a common mistake which can can result
+     in serious data corruption.  If it is necessary to use this method,
+     the following steps may be used.
     </para>
     <para>
   <orderedlist>
@@ -1011,9 +1024,17 @@ SELECT pg_start_backup('label', true);
      consider during this backup.
     </para>
     <para>
-      Note that if the server crashes during the backup it may not be
-      possible to restart until the <literal>backup_label</literal> file has 
been
-      manually deleted from the <envar>PGDATA</envar> directory.
+     As noted above, if the server crashes during the backup it may not be
+     possible to restart until the <literal>backup_label</literal> file has
+     been manually deleted from the <envar>PGDATA</envar> directory.  Note
+     that it is very important to never remove the
+     <literal>backup_label</literal> file when restoring a backup, because
+     this will result in corruption.  Confusion about when it is appropriate
+     to remove this file is a common cause of data corruption when using this
+     method; be very certain that you remove the file only on an existing
+     master and never when building a standby or restoring a backup, even if
+     you are building a standby that will subsequently be promoted to a new
+     master.
     </para>
    </listitem>
    <listitem>
@@ -1045,11 +1066,16 @@ SELECT pg_stop_backup();
      If the archive process has fallen behind
      because of failures of the archive command, it will keep retrying
      until the archive succeeds and the backup is complete.
-     If you wish to place a time limit on the execution of
-     <function>pg_stop_backup</function>, set an appropriate
-     <varname>statement_timeout</varname> value, but make note that if
-     <function>pg_stop_backup</function> terminates because of this your backup
-     may not be valid.
+    </para>
+
+    <para>
+     When using exclusive backup mode, it is absolutely imperative to ensure
+     that <function>pg_stop_backup</function> completes successfully at the
+     end of the backup.  Even if the backup itself fails, for example due to
+     lack of disk space, failure to call <function>pg_stop_backup</function>
+     will leave the server in backup mode indefinitely, causing future backups
+     to fail and increasing the risk of a restart failure during the time that
+     <literal>backup_label</literal> exists.
     </para>
    </listitem>
   </orderedlist>
diff --git a/src/backend/access/transam/xlog.c 
b/src/backend/access/transam/xlog.c
index 19d7911ec5..9840a55c10 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -6364,14 +6364,20 @@ StartupXLOG(void)
                                if (!ReadRecord(xlogreader, checkPoint.redo, 
LOG, false))
                                        ereport(FATAL,
                                                        (errmsg("could not find 
redo location referenced by checkpoint record"),
-                                                        errhint("If you are 
not restoring from a backup, try removing the file \"%s/backup_label\".", 
DataDir)));
+                                                        errhint("If you are 
restoring from a backup, touch \"%s/recovery.signal\" and add required recovery 
options.\n"
+                                                        "If you are not 
restoring from a backup, try removing the file \"%s/backup_label\".\n"
+                                                        "Be careful: removing 
\"%s/backup_label\" will result in a corrupt cluster if restoring from a 
backup.",
+                                                        DataDir, DataDir, 
DataDir)));
                        }
                }
                else
                {
                        ereport(FATAL,
                                        (errmsg("could not locate required 
checkpoint record"),
-                                        errhint("If you are not restoring from 
a backup, try removing the file \"%s/backup_label\".", DataDir)));
+                                        errhint("If you are restoring from a 
backup, touch \"%s/recovery.signal\" and add required recovery options.\n"
+                                                        "If you are not 
restoring from a backup, try removing the file \"%s/backup_label\".\n"
+                                                        "Be careful: removing 
\"%s/backup_label\" will result in a corrupt cluster if restoring from a 
backup.",
+                                                        DataDir, DataDir, 
DataDir)));
                        wasShutdown = false;    /* keep compiler quiet */
                }
 
-- 
2.17.2 (Apple Git-113)

Reply via email to