On Wed, Feb 24, 2021 at 14:52, Julien Rouhaud <rjuju...@gmail.com> wrote:

> Hi,
>
> On Wed, Feb 24, 2021 at 8:21 PM talk to ben <blo.tal...@gmail.com> wrote:
> >
> > The documentation describes how a return code > 125 on the
> restore_command would prevent the server from starting [1] :
> >
> > "
> > It is important that the command return nonzero exit status on failure.
> The command will be called requesting files that are not present in the
> archive; it must return nonzero when so asked. This is not an error
> condition. An exception is that if the command was terminated by a signal
> (other than SIGTERM, which is used as part of a database server shutdown)
> or an error by the shell (such as command not found), then recovery will
> abort and the server will not start up.
> > "
> >
> > But I don't see such a note on the archive_command side of things. [2]
> >
> > It could happen if the archive command is not checked beforehand
> or if the archive command becomes unavailable while PostgreSQL is running.
> rsync can also return 255 in some cases (bad ssh configuration or typos).
> In this case a fatal error is emitted, the archiver stops and is restarted
> by the postmaster.
> >
> > The view pg_stat_archiver is also not updated in this case. Is it on
> purpose? It could be problematic if someone uses it to check the archiver
> process health.
>
> That's on purpose, see for instance that discussion:
> https://www.postgresql.org/message-id/flat/55731BB8.1050605%40dalibo.com
>

Thanks for pointing that out, I should have checked.


> > Should we document this? (I can make a patch)
>
> I thought that this behavior was documented, especially for the lack
> of update of pg_stat_archiver.  If it's not the case then we should
> definitely fix that!
>

I tried to do it in the attached patch.
Building the doc worked fine on my computer.
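
To make the behaviour described in the patch concrete, here is a minimal Python sketch of how an archive command's result would be classified according to the wording proposed in the patch. It is not PostgreSQL's actual code: the function names, the outcome labels, and the constant name are all made up for illustration, and the 125 threshold and the SIGTERM exception are taken from the documentation text quoted in this thread.

```python
import signal
import subprocess

# Assumption (from the documentation text in this thread): exit statuses
# above 125 are treated as shell-level errors, not ordinary archive failures.
SHELL_ERROR_THRESHOLD = 125


def classify(returncode: int) -> str:
    """Map a subprocess-style return code to the outcome described in the patch.

    Negative values follow Python's subprocess convention: -N means the
    command was terminated by signal N. The returned labels are hypothetical.
    """
    if returncode == 0:
        return "archived"            # file considered archived
    if returncode < 0:
        if -returncode == int(signal.SIGTERM):
            return "shutdown"        # SIGTERM is part of a server shutdown
        return "archiver-restart"    # killed by any other signal
    if returncode > SHELL_ERROR_THRESHOLD:
        return "archiver-restart"    # e.g. 127, command not found
    return "retry"                   # ordinary failure, retried later


def run_and_classify(command: str) -> str:
    """Run a command through the shell and classify its result."""
    completed = subprocess.run(command, shell=True)
    return classify(completed.returncode)
```

With this sketch, run_and_classify("exit 1") falls in the "retry" case, while a mistyped command name, which a POSIX shell normally reports as status 127, falls in the "archiver-restart" case, which is the one that is not reported in pg_stat_archiver.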
From 350cd7c47d09754ae21f30f260a86e187054257f Mon Sep 17 00:00:00 2001
From: benoit <benoit.lobr...@dalibo.com>
Date: Thu, 25 Feb 2021 12:08:03 +0100
Subject: [PATCH] Document archive_command failures in more detail

Document that, if the command was terminated by a signal (other than SIGTERM, which
is used as part of a database server shutdown) or an error by the shell with an exit
status greater than 125 (such as command not found), then the archiver process will
abort and the postmaster will restart it. In such cases, the failure will not be
reported in pg_stat_archiver.
---
 doc/src/sgml/backup.sgml     | 8 +++++++-
 doc/src/sgml/monitoring.sgml | 3 ++-
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index 3c8aaed0b6..94d5dcbdf0 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -636,7 +636,13 @@ test ! -f /mnt/server/archivedir/00000001000000A900000065 &amp;&amp; cp pg_wal/0
     <productname>PostgreSQL</productname> will assume that the file has been
     successfully archived, and will remove or recycle it.  However, a nonzero
     status tells <productname>PostgreSQL</productname> that the file was not archived;
-    it will try again periodically until it succeeds.
+    it will try again periodically until it succeeds.
+    An exception is that if the command was terminated by
+    a signal (other than <systemitem>SIGTERM</systemitem>, which is used as
+    part of a database server shutdown) or an error by the shell with an exit
+    status greater than 125 (such as command not found), then the archiver
+    process will abort and the postmaster will restart it. In such cases,
+    the failure will not be reported in <xref linkend="pg-stat-archiver-view"/>.
    </para>
 
    <para>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 3513e127b7..391df3055b 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3251,7 +3251,8 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
        <structfield>failed_count</structfield> <type>bigint</type>
       </para>
       <para>
-       Number of failed attempts for archiving WAL files
+       Number of failed attempts for archiving WAL files (see <xref
+       linkend="continuous-archiving"/>)
       </para></entry>
      </row>
 
-- 
2.25.4
