Re: Non-text mode for pg_dumpall

Andrew Dunstan Sun, 30 Mar 2025 17:56:34 -0700


On 2025-03-30 Su 12:50 PM, Andrew Dunstan wrote:


On 2025-03-29 Sa 1:17 AM, Mahendra Singh Thalor wrote:

On Sat, 29 Mar 2025 at 03:50, Andrew Dunstan <and...@dunslane.net>wrote:


On 2025-03-27 Th 5:15 PM, Andrew Dunstan wrote:

On 2025-03-19 We 2:41 AM, Mahendra Singh Thalor wrote:

On Wed, 12 Mar 2025 at 21:18, Andrew Dunstan <and...@dunslane.net>
wrote:

On 2025-03-12 We 3:03 AM, jian he wrote:

On Wed, Mar 12, 2025 at 1:06 AM Álvaro Herrera
<alvhe...@alvh.no-ip.org> wrote:

Hello,

On 2025-Mar-11, Mahendra Singh Thalor wrote:

In map.dat file, I tried to fix this issue by adding number of
characters
in dbname but as per code comments, as of now, we are not
supporting \n\r
in dbnames so i removed handling.
I will do some more study to fix this issue.

Yeah, I think this is saying that you should not consider the
contents
of map.dat as a shell string.  After all, you're not going to
_execute_
that file via the shell.

Maybe for map.dat you need to escape such characters somehow,so that

they don't appear as literal newlines/carriage returns.

I am confused.

currently pg_dumpall plain format will abort when encounteringdbname

containing newline.
the left dumped plain file does not contain all the cluster
databases data.


if pg_dumpall non-text format aborts earlier,
it's aligned with pg_dumpall plain format?

it's also an improvement since aborts earlier, nothing will bedumped?



am i missing something?

I think we should fix that.

But for the current proposal, Álvaro and I were talking thismorning,

and we thought the simplest thing here would be to have the one line
format and escape NL/CRs in the database name.


cheers

Okay. As per discussions, we will keep one line entry for each
database into map.file.

Thanks all for feedback and review.

Here, I am attaching updated patches for review and testing. These
patches can be applied on commit a6524105d20b.



I'm working through this patch set with a view to committing it.
Attached is some cleanup which is where I got to today, although there
is more to do. One thing I am wondering is why not put the
SimpleDatabaseOidList stuff in fe_utils/simle_list.{c,h} ? That's
where all the similar stuff belongs, and it feels strange to have this
inline in pg_restore.c. (I also don't like the name much -
SimpleOidStringList or maybe SimpleOidPlusStringList might be better).


OK, I have done that, so here is the result. The first two are you
original patches. patch 3 adds the new list type to fe-utils, and patch
4 contains my cleanups and use of the new list type. Apart from some
relatively minor cleanup, the one thing I would like to change is how

dumps are named. If we are producing tar or custom format dumps, Ithink

the file names should reflect that (oid.dmp and oid.tar rather than a
bare oid as the filename), and pg_restore should look for those. I'm
going to work on that tomorrow - I don't think it will be terribly
difficult.

Thanks Andrew.

Here, I am attaching a delta patch for oid.tar and oid.dmp format.



OK, looks good, I have incorporated that.

There are a couple of rough edges, though.

First, I see this:

andrew@ub22arm:inst $ bin/pg_restore -C -d postgres--exclude-database=regression_dummy_seclabel--exclude-database=regression_test_extensions--exclude-database=regression_test_pg_dump destpg_restore: error: could not execute query: "ERROR: role "andrew"already exists

"
Command was: "

--
-- Roles
--

CREATE ROLE andrew;"
pg_restore: warning: errors ignored on global.dat file restore: 1

pg_restore: error: could not execute query: ERROR: database"template1" already existsCommand was: CREATE DATABASE template1 WITH TEMPLATE = template0ENCODING = 'SQL_ASCII' LOCALE_PROVIDER = libc LOCALE = 'C';



pg_restore: warning: errors ignored on database "template1" restore: 1

pg_restore: error: could not execute query: ERROR: database"postgres" already existsCommand was: CREATE DATABASE postgres WITH TEMPLATE = template0ENCODING = 'SQL_ASCII' LOCALE_PROVIDER = libc LOCALE = 'C';



pg_restore: warning: errors ignored on database "postgres" restore: 1
pg_restore: warning: errors ignored on restore: 3

It seems pointless to be trying to create the rolw that we areconnected as, and we also expect template1 and postgres to exist.

In a similar vein, I don't see why we are setting the --create flag inpg_dumpall for those databases. I'm attaching a patch that is designedto stop that, but it doesn't solve the above issues.


I also notice a bunch of these in globals.dat:


--
-- Databases
--

--
-- Database "template1" dump
--

--
-- Database "andrew" dump
--

--
-- Database "isolation_regression_brin" dump
--

--
-- Database "isolation_regression_delay_execution" dump
--

 ...


The patch also tries to fix this.

Lastly, this badly needs some TAP tests written.

I'm going to work on reviewing the documentation next.




I have reworked the documentation some. See attached.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com

diff --git a/doc/src/sgml/ref/pg_dumpall.sgml b/doc/src/sgml/ref/pg_dumpall.sgml
index 765b30a3a66..43fdab2d77e 100644
--- a/doc/src/sgml/ref/pg_dumpall.sgml
+++ b/doc/src/sgml/ref/pg_dumpall.sgml
@@ -16,7 +16,7 @@ PostgreSQL documentation
 
  <refnamediv>
   <refname>pg_dumpall</refname>
-  <refpurpose>extract a <productname>PostgreSQL</productname> database cluster 
into a script file</refpurpose>
+  <refpurpose>extract a <productname>PostgreSQL</productname> database cluster 
using a specified dump format</refpurpose>
  </refnamediv>
 
  <refsynopsisdiv>
@@ -33,7 +33,7 @@ PostgreSQL documentation
   <para>
    <application>pg_dumpall</application> is a utility for writing out
    (<quote>dumping</quote>) all <productname>PostgreSQL</productname> databases
-   of a cluster into one script file.  The script file contains
+   of a cluster into an archive.  The archive contains
    <acronym>SQL</acronym> commands that can be used as input to <xref
    linkend="app-psql"/> to restore the databases.  It does this by
    calling <xref linkend="app-pgdump"/> for each database in the cluster.
@@ -52,11 +52,16 @@ PostgreSQL documentation
   </para>
 
   <para>
-   The SQL script will be written to the standard output.  Use the
+   Plain text SQL scripts will be written to the standard output.  Use the
    <option>-f</option>/<option>--file</option> option or shell operators to
    redirect it into a file.
   </para>
 
+  <para>
+   Archives in other formats will be placed in a directory named using the
+   <option>-f</option>/<option>--file</option>, which is required in this case.
+  </para>
+
   <para>
   <application>pg_dumpall</application> needs to connect several
   times to the <productname>PostgreSQL</productname> server (once per
@@ -121,10 +126,85 @@ PostgreSQL documentation
        <para>
         Send output to the specified file.  If this is omitted, the
         standard output is used.
+        Note: This option can only be omitted when <option>--format</option> 
is plain
        </para>
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>-F <replaceable 
class="parameter">format</replaceable></option></term>
+      <term><option>--format=<replaceable 
class="parameter">format</replaceable></option></term>
+      <listitem>
+       <para>
+        Specify the format of dump files.  In plain format, all the dump data 
is
+        sent in a single text stream. This is the default.
+
+        In all other modes, <application>pg_dumpall</application> first 
creates two files:
+        <filename>global.dat</filename> and <filename>map.dat</filename>, in 
the directory
+        specified by <option>--file</option>.
+        The first file contains global data, such as roles and tablespaces. 
The second
+        contains a mapping between database oids and names. These files are 
used by
+        <application>pg_restore</application>. Data for individual databases 
is placed in
+        <filename>databases</filename> subdirectory, named using the 
database's <type>oid</type>.
+
+       <variablelist>
+        <varlistentry>
+         <term><literal>d</literal></term>
+         <term><literal>directory</literal></term>
+         <listitem>
+          <para>
+           Output directory-format archives for each database,
+           suitable for input into pg_restore. The directory
+           will have database <type>oid</type> as its name.
+          </para>
+         </listitem>
+        </varlistentry>
+
+        <varlistentry>
+         <term><literal>p</literal></term>
+         <term><literal>plain</literal></term>
+         <listitem>
+          <para>
+           Output a plain-text SQL script file (the default).
+          </para>
+         </listitem>
+        </varlistentry>
+
+        <varlistentry>
+         <term><literal>c</literal></term>
+         <term><literal>custom</literal></term>
+         <listitem>
+          <para>
+           Output a custom-format archive for each database,
+           suitable for input into pg_restore. The archive
+           will be named <filename>dboid.dmp</filename> where 
<type>dboid</type> is the
+           <type>oid</type> of the database.
+          </para>
+         </listitem>
+        </varlistentry>
+
+         <varlistentry>
+         <term><literal>t</literal></term>
+         <term><literal>tar</literal></term>
+         <listitem>
+          <para>
+           Output a tar-format archive for each database,
+           suitable for input into pg_restore. The archive
+           will be named <filename>dboid.tar</filename> where 
<type>dboid</type> is the
+           <type>oid</type> of the database.
+          </para>
+         </listitem>
+        </varlistentry>
+
+        </variablelist>
+
+       Note: see <xref linkend="app-pgdump"/> for details
+       of how the various non plain text archives work.
+
+        </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>--filter=<replaceable 
class="parameter">filename</replaceable></option></term>
       <listitem>
diff --git a/doc/src/sgml/ref/pg_restore.sgml b/doc/src/sgml/ref/pg_restore.sgml
index c840a807ae9..f14e5866f6c 100644
--- a/doc/src/sgml/ref/pg_restore.sgml
+++ b/doc/src/sgml/ref/pg_restore.sgml
@@ -18,8 +18,9 @@ PostgreSQL documentation
   <refname>pg_restore</refname>
 
   <refpurpose>
-   restore a <productname>PostgreSQL</productname> database from an
-   archive file created by <application>pg_dump</application>
+   restore a <productname>PostgreSQL</productname> database or cluster
+   from an archive created by <application>pg_dump</application> or
+   <application>pg_dumpall</application>
   </refpurpose>
  </refnamediv>
 
@@ -38,13 +39,14 @@ PostgreSQL documentation
 
   <para>
    <application>pg_restore</application> is a utility for restoring a
-   <productname>PostgreSQL</productname> database from an archive
-   created by <xref linkend="app-pgdump"/> in one of the non-plain-text
+   <productname>PostgreSQL</productname> database or cluster from an archive
+   created by <xref linkend="app-pgdump"/> or
+   <xref linkend="app-pg-dumpall"/> in one of the non-plain-text
    formats.  It will issue the commands necessary to reconstruct the
-   database to the state it was in at the time it was saved.  The
-   archive files also allow <application>pg_restore</application> to
+   database or cluster to the state it was in at the time it was saved. The
+   archives also allow <application>pg_restore</application> to
    be selective about what is restored, or even to reorder the items
-   prior to being restored. The archive files are designed to be
+   prior to being restored. The archive formats are designed to be
    portable across architectures.
   </para>
 
@@ -52,10 +54,17 @@ PostgreSQL documentation
    <application>pg_restore</application> can operate in two modes.
    If a database name is specified, <application>pg_restore</application>
    connects to that database and restores archive contents directly into
-   the database.  Otherwise, a script containing the SQL
-   commands necessary to rebuild the database is created and written
+   the database.
+   When restoring from a dump made by<application>pg_dumpall</application>,
+   each database will be created and then the restoration will be run in that
+   database.
+
+   Otherwise, when a database name is not specified, a script containing the 
SQL
+   commands necessary to rebuild the database or cluster is created and written
    to a file or standard output.  This script output is equivalent to
-   the plain text output format of <application>pg_dump</application>.
+   the plain text output format of <application>pg_dump</application> or
+   <application>pg_dumpall</application>.
+
    Some of the options controlling the output are therefore analogous to
    <application>pg_dump</application> options.
   </para>
@@ -140,6 +149,8 @@ PostgreSQL documentation
         commands that mention this database.
         Access privileges for the database itself are also restored,
         unless <option>--no-acl</option> is specified.
+        <option>--create</option> is required when restoring multiple databases
+        from an archive created by <application>pg_dumpall</application>.
        </para>
 
        <para>
@@ -166,6 +177,28 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>--exclude-database=<replaceable 
class="parameter">pattern</replaceable></option></term>
+      <listitem>
+       <para>
+        Do not restore databases whose name matches
+        <replaceable class="parameter">pattern</replaceable>.
+        Multiple patterns can be excluded by writing multiple
+        <option>--exclude-database</option> switches.  The
+        <replaceable class="parameter">pattern</replaceable> parameter is
+        interpreted as a pattern according to the same rules used by
+        <application>psql</application>'s <literal>\d</literal>
+        commands (see <xref linkend="app-psql-patterns"/>),
+        so multiple databases can also be excluded by writing wildcard
+        characters in the pattern.  When using wildcards, be careful to
+        quote the pattern if needed to prevent shell wildcard expansion.
+       </para>
+       <para>
+        This option is only relevant when restoring from an archive made using 
<application>pg_dumpall</application>.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>-e</option></term>
       <term><option>--exit-on-error</option></term>
@@ -315,6 +348,19 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>-g</option></term>
+      <term><option>--globals-only</option></term>
+      <listitem>
+       <para>
+        Restore only global objects (roles and tablespaces), no databases.
+       </para>
+       <para>
+        This option is only relevant when restoring from an archive made using 
<application>pg_dumpall</application>.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>-I <replaceable 
class="parameter">index</replaceable></option></term>
       <term><option>--index=<replaceable 
class="parameter">index</replaceable></option></term>

Re: Non-text mode for pg_dumpall

Reply via email to