On 2020/03/11 3:39, Magnus Hagander wrote:
On Tue, Mar 10, 2020 at 6:19 PM Fujii Masao <masao.fu...@oss.nttdata.com> wrote:
On 2020/03/10 22:43, Amit Langote wrote:
On Tue, Mar 10, 2020 at 6:09 PM Fujii Masao <masao.fu...@oss.nttdata.com> wrote:
So, I will make the patch adding support for --no-estimate-size option
in pg_basebackup.
Patch attached.
Like the idea and the patch looks mostly good.
Thanks for reviewing the patch!
+ total size. If the estimation is disabled in
+ <application>pg_basebackup</application>
+ (i.e., <literal>--no-estimate-size</literal> option is specified),
+ this is always <literal>0</literal>.
"always" seems unnecessary.
Fixed.
+ This option prevents the server from estimating the total
+ amount of backup data that will be streamed. In other words,
+ <literal>backup_total</literal> column in the
+ <structname>pg_stat_progress_basebackup</structname>
+ view always indicates <literal>0</literal> if this option is enabled.
Here too.
Fixed.
Attached is the updated version of the patch.
Would it perhaps be better to return NULL instead of 0 in the
statistics view if there is no data?
Also, should it really be the server version that decides how this
feature behaves, and not the pg_basebackup version? Given that the
implementation is entirely in the client, it seems that's more
logical?
Yeah, you're right. I changed the patch that way.
Attached is the updated version of the patch.
and a few docs nitpicks:
<para>
Whether this is enabled or not, the
<structname>pg_stat_progress_basebackup</structname> view
- report the progress of the backup in the server side. But note
- that the total amount of data that will be streamed is estimated
- and reported only when this option is enabled. In other words,
- <literal>backup_total</literal> column in the view always
- indicates <literal>0</literal> if this option is disabled.
+ report the progress of the backup in the server side.
+ </para>
+ <para>
+ This option is not allowed when using
+ <option>--no-estimate-size</option>.
</para>
I think you should just remove that whole paragraph. The details are
now listed under the disable parameter.
Fixed.
+ This option prevents the server from estimating the total
+ amount of backup data that will be streamed. In other words,
+ <literal>backup_total</literal> column in the
+ <structname>pg_stat_progress_basebackup</structname>
+ view indicates <literal>0</literal> if this option is enabled.
I suggest just "This option prevents the server from estimating the
total amount of backup data that will be streamed, resulting in the
ackup_total column in pg_stat_progress_basebackup to be (zero or NULL
depending on above)".
(Markup needed on that of course ,but you get the idea)
Yes, fixed.
+ When this is disabled, the backup will start by enumerating
I'd try to avoid the double negation, with something "without this
parameter, the backup will start..."
Fixed. I used "Without using this option ...".
+ <para>
+ <application>pg_basebackup</application> asks the server to estimate
+ the total amount of data that will be streamed by default (unless
+ <option>--no-estimate-size</option> is specified) in version 13 or later,
+ and does that only when <option>--progress</option> is specified in
+ the older versions.
+ </para>
That's an item for the release notes, not for the reference page, I
think. It's already explained under the --disable parameter, so I
suggest removing this paragraph as well.
Fixed.
Regards,
--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 987580d6df..2c9f51daf4 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -4388,10 +4388,7 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
<entry><structfield>backup_total</structfield></entry>
<entry><type>bigint</type></entry>
<entry>
- Total amount of data that will be streamed. If progress reporting
- is not enabled in <application>pg_basebackup</application>
- (i.e., <literal>--progress</literal> option is not specified),
- this is <literal>0</literal>. Otherwise, this is estimated and
+ Total amount of data that will be streamed. This is estimated and
reported as of the beginning of
<literal>streaming database files</literal> phase. Note that
this is only an approximation since the database
@@ -4399,7 +4396,10 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
and WAL log may be included in the backup later. This is always
the same value as <structfield>backup_streamed</structfield>
once the amount of data streamed exceeds the estimated
- total size.
+ total size. If the estimation is disabled in
+ <application>pg_basebackup</application>
+ (i.e., <literal>--no-estimate-size</literal> option is specified),
+ this is <literal>0</literal>.
</entry>
</row>
<row>
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml
b/doc/src/sgml/ref/pg_basebackup.sgml
index 29bf2f9b97..90638aad0e 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -460,21 +460,6 @@ PostgreSQL documentation
in this case the estimated target size will increase once it passes the
total estimate without WAL.
</para>
- <para>
- When this is enabled, the backup will start by enumerating the size of
- the entire database, and then go back and send the actual contents.
- This may make the backup take slightly longer, and in particular it
- will take longer before the first data is sent.
- </para>
- <para>
- Whether this is enabled or not, the
- <structname>pg_stat_progress_basebackup</structname> view
- report the progress of the backup in the server side. But note
- that the total amount of data that will be streamed is estimated
- and reported only when this option is enabled. In other words,
- <literal>backup_total</literal> column in the view always
- indicates <literal>0</literal> if this option is disabled.
- </para>
</listitem>
</varlistentry>
@@ -552,6 +537,30 @@ PostgreSQL documentation
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><option>--no-estimate-size</option></term>
+ <listitem>
+ <para>
+ This option prevents the server from estimating the total
+ amount of backup data that will be streamed, resulting in the
+ <literal>backup_total</literal> column in the
+ <structname>pg_stat_progress_basebackup</structname>
+ to be <literal>0</literal>.
+ </para>
+ <para>
+ Without this option, the backup will start by enumerating
+ the size of the entire database, and then go back and send
+ the actual contents. This may make the backup take slightly
+ longer, and in particular it will take longer before the first
+ data is sent. This option is useful to avoid such estimation
+ time if it's too long.
+ </para>
+ <para>
+ This option is not allowed when using <option>--progress</option>.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
</para>
diff --git a/src/bin/pg_basebackup/pg_basebackup.c
b/src/bin/pg_basebackup/pg_basebackup.c
index 48bd838803..c5d95958b2 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -121,6 +121,7 @@ static char *label = "pg_basebackup base backup";
static bool noclean = false;
static bool checksum_failure = false;
static bool showprogress = false;
+static bool estimatesize = true;
static int verbose = 0;
static int compresslevel = 0;
static IncludeWal includewal = STREAM_WAL;
@@ -386,6 +387,7 @@ usage(void)
printf(_(" --no-slot prevent creation of temporary
replication slot\n"));
printf(_(" --no-verify-checksums\n"
" do not verify checksums\n"));
+ printf(_(" --no-estimate-size do not estimate backup size in
server side\n"));
printf(_(" -?, --help show this help, then exit\n"));
printf(_("\nConnection options:\n"));
printf(_(" -d, --dbname=CONNSTR connection string\n"));
@@ -1741,7 +1743,7 @@ BaseBackup(void)
basebkp =
psprintf("BASE_BACKUP LABEL '%s' %s %s %s %s %s %s %s",
escaped_label,
- showprogress ? "PROGRESS" : "",
+ estimatesize ? "PROGRESS" : "",
includewal == FETCH_WAL ? "WAL" : "",
fastcheckpoint ? "FAST" : "",
includewal == NO_WAL ? "" : "NOWAIT",
@@ -2066,6 +2068,7 @@ main(int argc, char **argv)
{"waldir", required_argument, NULL, 1},
{"no-slot", no_argument, NULL, 2},
{"no-verify-checksums", no_argument, NULL, 3},
+ {"no-estimate-size", no_argument, NULL, 4},
{NULL, 0, NULL, 0}
};
int c;
@@ -2234,6 +2237,9 @@ main(int argc, char **argv)
case 3:
verify_checksums = false;
break;
+ case 4:
+ estimatesize = false;
+ break;
default:
/*
@@ -2356,6 +2362,14 @@ main(int argc, char **argv)
}
#endif
+ if (showprogress && !estimatesize)
+ {
+ pg_log_error("--progress and --no-estimate-size are
incompatible options");
+ fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
+ progname);
+ exit(1);
+ }
+
/* connection in replication mode to server */
conn = GetConnection();
if (!conn)