I noticed that the "WAL reliability" documentation says that we use CRC-32
for WAL records and two-phase state files, but we've actually used CRC-32C
since v9.5 (commit 5028f22).  I've attached a short patch to fix this that
I think should be back-patched to all supported versions.
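
To illustrate why the distinction matters, here is a minimal sketch showing
that the two algorithms produce different values for the same input.  This is
not PostgreSQL's implementation; the crc32c() helper below is a hypothetical,
unoptimized bitwise version written only for this comparison, and
zlib.crc32() stands in for plain CRC-32.

```python
import zlib


def crc32c(data: bytes) -> int:
    """Bitwise CRC-32C (Castagnoli), using the reflected polynomial 0x82F63B78.

    Illustrative helper only; real implementations use lookup tables or
    hardware instructions.
    """
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right one bit; apply the polynomial if the low bit was set.
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
    return crc ^ 0xFFFFFFFF


check = b"123456789"  # conventional check string for CRC catalogues
print(f"CRC-32 : {zlib.crc32(check):08X}")  # CBF43926
print(f"CRC-32C: {crc32c(check):08X}")      # E3069283
```

The two differ only in the generator polynomial, so a reader who implements
plain CRC-32 based on the current docs would fail to validate WAL records.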

I've attached a second patch that standardizes how we refer to these kinds
of algorithms in our docs.  Specifically, it adds hyphens (e.g., "CRC-32C"
instead of "CRC32C").  Wikipedia uses this style pretty consistently [0]
[1] [2], and so I think we should, too.  I don't think this one needs to be
back-patched.

Thoughts?

[0] https://en.wikipedia.org/wiki/SHA-1
[1] https://en.wikipedia.org/wiki/SHA-2
[2] https://en.wikipedia.org/wiki/Cyclic_redundancy_check

-- 
nathan
From 9bec83c8b78144216d7a80c978af741f51f8b8e3 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nat...@postgresql.org>
Date: Thu, 8 Aug 2024 12:42:30 -0500
Subject: [PATCH v1 1/2] note correct CRC algorithm in WAL reliability docs

---
 doc/src/sgml/wal.sgml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index 05e2a8f8be..d5df65bc69 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -186,7 +186,7 @@
    <itemizedlist>
     <listitem>
      <para>
-      Each individual record in a WAL file is protected by a CRC-32 (32-bit) check
+      Each individual record in a WAL file is protected by a CRC-32C (32-bit) check
       that allows us to tell if record contents are correct. The CRC value
       is set when we write each WAL record and checked during crash recovery,
       archive recovery and replication.
@@ -212,7 +212,7 @@
     </listitem>
     <listitem>
      <para>
-      Individual state files in <filename>pg_twophase</filename> are protected by CRC-32.
+      Individual state files in <filename>pg_twophase</filename> are protected by CRC-32C.
      </para>
     </listitem>
     <listitem>
-- 
2.39.3 (Apple Git-146)

From fdcbade1275df08a262451ca97b3ff033b22ab4b Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nat...@postgresql.org>
Date: Thu, 8 Aug 2024 12:43:04 -0500
Subject: [PATCH v1 2/2] standardize style of CRC/SHA algorithms in docs

---
 doc/src/sgml/backup-manifest.sgml   | 6 +++---
 doc/src/sgml/pgcrypto.sgml          | 4 ++--
 doc/src/sgml/ref/pg_basebackup.sgml | 2 +-
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/doc/src/sgml/backup-manifest.sgml b/doc/src/sgml/backup-manifest.sgml
index d5ec244834..594e216bcb 100644
--- a/doc/src/sgml/backup-manifest.sgml
+++ b/doc/src/sgml/backup-manifest.sgml
@@ -87,10 +87,10 @@
     <listitem>
      <para>
       This key is always present on the last line of the backup manifest file.
-      The associated value is a SHA256 checksum of all the preceding lines.
+      The associated value is a SHA-256 checksum of all the preceding lines.
       We use a fixed checksum method here to make it possible for clients
-      to do incremental parsing of the manifest. While a SHA256 checksum
-      is significantly more expensive than a CRC32C checksum, the manifest
+      to do incremental parsing of the manifest. While a SHA-256 checksum
+      is significantly more expensive than a CRC-32C checksum, the manifest
       should normally be small enough that the extra computation won't matter
       very much.
      </para>
diff --git a/doc/src/sgml/pgcrypto.sgml b/doc/src/sgml/pgcrypto.sgml
index b8b89696e7..396c67f0cd 100644
--- a/doc/src/sgml/pgcrypto.sgml
+++ b/doc/src/sgml/pgcrypto.sgml
@@ -106,7 +106,7 @@ hmac(data bytea, key bytea, type text) returns bytea
 
   <para>
    The algorithms in <function>crypt()</function> differ from the usual
-   MD5 or SHA1 hashing algorithms in the following respects:
+   MD5 or SHA-1 hashing algorithms in the following respects:
   </para>
 
   <orderedlist>
@@ -525,7 +525,7 @@ gen_salt(type text [, iter_count integer ]) returns text
    </listitem>
    <listitem>
     <para>
-     A SHA1 hash of the random prefix and data is appended.
+     A SHA-1 hash of the random prefix and data is appended.
     </para>
    </listitem>
    <listitem>
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 82d0c8e008..4f99340c1d 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -687,7 +687,7 @@ PostgreSQL documentation
        <para>
         Using a SHA hash function provides a cryptographically secure digest
         of each file for users who wish to verify that the backup has not been
-        tampered with, while the CRC32C algorithm provides a checksum that is
+        tampered with, while the CRC-32C algorithm provides a checksum that is
         much faster to calculate; it is good at catching errors due to accidental
         changes but is not resistant to malicious modifications.  Note that, to
         be useful against an adversary who has access to the backup, the backup
-- 
2.39.3 (Apple Git-146)
