On 2020/02/06 11:35, Amit Langote wrote:
On Wed, Feb 5, 2020 at 4:29 PM Amit Langote <amitlangot...@gmail.com> wrote:
On Wed, Feb 5, 2020 at 3:36 PM Fujii Masao <masao.fu...@oss.nttdata.com> wrote:
Yeah, I understand your concern. The pg_basebackup document explains
the risk when --progress is specified, as follows. Since I imagined that
someone may explicitly disable --progress to avoid this risk, I made
the server estimate the total size only when --progress is specified.
But you think that this overhead by --progress is negligibly small?

--------------------
When this is enabled, the backup will start by enumerating the size of
the entire database, and then go back and send the actual contents.
This may make the backup take slightly longer, and in particular it will
take longer before the first data is sent.
--------------------

Sorry, I hadn't read this before.  So, my proposal would make this a lie.

Still, if "streaming database files" is the longest phase, then without
even an approximation of how much data is to be streamed, one can't
really estimate overall progress, at least as long as this view is the
only thing to look at.

That said, the overhead of checking the size before sending any data
may be worse for some people than others, so having the option to
avoid that might be good after all.
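As an aside, a monitoring client could derive an approximate completion
percentage from the view's backup_streamed and backup_total counters,
provided an estimate was taken. A minimal sketch in Python (the column
names follow the proposed pg_stat_progress_basebackup view; the helper
name is mine):

```python
def backup_percent(backup_streamed, backup_total):
    """Approximate completion percentage from the progress view's
    counters; returns None when no size estimate is available
    (backup_total is NULL/0, e.g. because estimation was skipped)."""
    if not backup_total:
        return None
    # The estimate can drift from the true total, since files may grow
    # while the backup runs, so clamp at 100%.
    return min(100.0, 100.0 * backup_streamed / backup_total)
```

This also illustrates why the estimate matters: without backup_total,
the client can only report raw bytes streamed, not a percentage.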

By the way, if calculating backup total size can take significantly
long in some cases, that is when requested by specifying --progress,
then it might be a good idea to define a separate phase for that, like
"estimating backup size" or some such.  Currently, it's part of
"starting backup", which covers both running the checkpoint and size
estimation which run one after another.

OK, I added this phase in the latest patch that I posted upthread.

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters

