On Wed, Sep 03, 2025 at 09:37:08AM +0200, Laurenz Albe wrote:
> I agree that it is a worthwhile goal to clarify the terms, and I
> think that the whole chapter should be reorganized:
> 
> Sections 26.2.5. to 26.2.9. should be moved to a new chapter
> 26.3. "Streaming Replication" (which will renumber the present 26.3.
> and 26.4.).

I would not disagree with that.  For one, the situation in the docs
can be confusing, as we mix file-based shipping of WAL files with
streaming over the replication protocol.

One interesting portion is about replication slots, where we rely on
XLogGetReplicationSlotMinimumLSN() to decide the retention threshold.
Physical slots are updated in WAL senders via
PhysicalConfirmReceivedLocation(), meaning that the replication
protocol is required.  Mixing that with the file-shipping part is a
mistake.
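
To illustrate the retention point, here is a rough sketch and not the
actual server code.  It assumes the threshold is simply the oldest
restart LSN across all slots, and ignores other factors such as
wal_keep_size or max_slot_wal_keep_size.  The function names
parse_lsn(), min_slot_lsn() and can_recycle() are made up for this
example:

```python
from typing import List, Optional


def parse_lsn(text: str) -> int:
    """Convert an LSN in 'hi/lo' hexadecimal form to a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def min_slot_lsn(restart_lsns: List[Optional[str]]) -> Optional[int]:
    """Oldest LSN still needed by any slot; None if no slot holds WAL back."""
    needed = [parse_lsn(lsn) for lsn in restart_lsns if lsn is not None]
    return min(needed) if needed else None


def can_recycle(segment_end_lsn: str, restart_lsns: List[Optional[str]]) -> bool:
    """A segment is removable only if it ends at or before the threshold."""
    threshold = min_slot_lsn(restart_lsns)
    return threshold is None or parse_lsn(segment_end_lsn) <= threshold
```

With slots at 0/3000060 and 1/0, a segment ending at 0/3000000 could go
away while one ending at 0/4000000 has to stay.  The slot positions
only advance when a standby reports back over the replication
protocol, which is why this does not belong to the file-shipping part.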

Just moving the contents to a new "Streaming" section sounds like an
improvement, but the "log-shipping" part would still suck.  So this
calls for a cleanup as well, with a better split.  Perhaps we
should embrace the term "file-based WAL shipping" or "file-based log
shipping" and use that, giving a structure of:
* WAL shipping methods, log-shipping methods or just "Log Shipping"
** File-based WAL shipping
** Streaming

Warm standbys can use both methods.  The part about planning,
operation and preparing may be worth splitting outside the "method"
portion.  The "continuous" archiving on standbys is not about
streaming, but about the file-based method, so it would need to be
inside the file-based subsection.  We could replace "Log" with just
"WAL", as well, if we're looking at more standardization of the whole
area, while on it.

> Perhaps "WAL shipping" would be a better term, with "WAL streaming"
> as alternative.

Perhaps that stands for improvement and more standardization.  This
term originates from 5e550acbc4d1 in 2006.  The industry has changed a
lot since and there may be standard terms which are much more adapted
for the "modern" user, even if there's a lot of Postgres-ism in the
architecture and how things are done.  There have been some proposals,
but nobody really stood up to commit something.

> But that would be a bigger endeavour that would require going over
> bigger parts of the documentation.  If you want to do that, I'd be
> happy to review it.
> 
> But I think that the factually wrong statement that my patch
> tries to address should get fixed first - who knows how long the
> bigger patch would take.
> 
> I am OK with Michael's suggestion to just remove the wrong line,
> although it wouldn't be bad to have an explanation of what we mean
> by "asynchronous" here.

Yeah, this statement is confusing as-is because there is no
dependency on the timing of a transaction commit: records may be
shipped before or after it, depending on how your system balances its
I/O and/or CPU.  I am not sure if this is worth applying on its own, TBH,
because this stuff needs much more rework than a simple sentence.  If
somebody takes the time to write a patch, I'd be OK to step in this
time for review and doing some reorganization of the whole section,
even if that would mean a HEAD-only change.  I had the attached staged
at some point, for reference.

Adding David Steele in CC, I recall that he may have made a proposal
around all that for the docs, and he's involved in pgBackRest.
--
Michael
From 542db9e02f5aaafe4c831797133acb1aff5d7828 Mon Sep 17 00:00:00 2001
From: Michael Paquier <[email protected]>
Date: Thu, 4 Sep 2025 10:22:08 +0900
Subject: [PATCH] doc: Remove confusing sentence about async log shipping

The original sentence is old, as of 5e550acbc4d1, and refers to a
dependency between transaction commit and the timing of the records
flushed, which may not always be true.

Reported-by: Artem Gavrilov <[email protected]>
Reviewed-by: Laurenz Albe <[email protected]>
Reviewed-by: Robert Treat <[email protected]>
Discussion: https://postgr.es/m/[email protected]
---
 doc/src/sgml/high-availability.sgml | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index b47d8b4106ef..ffeff3f2b247 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -527,9 +527,8 @@ protocol to make nodes agree on a serializable transactional order.
   </para>
 
   <para>
-   It should be noted that log shipping is asynchronous, i.e., the WAL
-   records are shipped after transaction commit. As a result, there is a
-   window for data loss should the primary server suffer a catastrophic
+   It should be noted that log shipping is asynchronous. As a result, there
+   is a window for data loss should the primary server suffer a catastrophic
    failure; transactions not yet shipped will be lost.  The size of the
    data loss window in file-based log shipping can be limited by use of the
    <varname>archive_timeout</varname> parameter, which can be set as low
-- 
2.51.0
