Hi,
thanks for the review!
On Wed, Sep 19, 2018 at 05:17:05PM +0200, Fabien COELHO wrote:
> >>This optionally prints the progress of pg_verify_checksums via read
> >>kilobytes to the terminal with the new command line argument -P.
> >>
> >>If -P was forgotten and pg_verify_checksums operates on a large cluster,
> >>the caller can send SIGUSR1 to pg_verify_checksums to turn progress
> >>status reporting on during runtime.
> >
> >Version 2 of the patch is attached. This is rebased to master as of
> >422952ee and changes the signal handling to be a toggle as suggested by
> >Alvaro.
>
> Patch applies cleanly and compiles.
>
> About tests: "make check" is okay, but ITSM that the command is not started
> once, ever, in any test:-( It is unclear whether any test triggers checksums
> anyway...
Yeah, this was discussed on another thread and some basic tap tests for
pg_verify_checksums were proposed (also by me), but this hasn't been
further addressed.
Personally, I think this would be a good thing to add to v11 even.
In any case, this particular feature might not be very easy to tap test,
but I am open to suggestions, of course.
> The time() granularity to the second makes the result awkward on small
> tests:
>
> 8/1554552 kB (0%, 8 kB/s)
> 1040856/1554552 kB (66%, 1040856 kB/s)
> 1554552/1554552 kB (100%, 1554552 kB/s)
>
> I'd suggest to reuse the "portability/instr_time.h" stuff which allows a
> much better precision.
>
> The "kB" kilo-byte symbols stands for 1000 bytes, but you are displaying
> 1024-bytes kilos, which symbol is either "KiB" or "KB". Given that we are
> about storage which is usually measured in powers of 1,000, so I'd suggest
> to use thousands.
>
> Another reserve I have is that on any hardware it is likely that there will
> be a lot of kilos, so maybe megas (MB) should be used instead.
The display is exactly the same as in pg_basebackup (except the
bandwith is shown as well), so right now I think it is more useful to be
consistent here. So if we change that, pg_basebackup should be changed
as well I think.
Maybe the code could be factored out to some common file in the future.
> I'm wondering whether the bandwidth display should be local (from the last
> display) or global (from the start of the command), but for the last forced
> one. This is an open question.
> Maybe it would be nice to show elapsed time and expected completion time
> based on the current speed.
Maybe; this could be added to the factored out common code I mentioned
above.
> I would be in favor or having this turned on by default, and silenced with
> some option. I'm not sure other people would agree, though, so it is an open
> question as well.
If this runs in a script, it would be pretty annoying, so everybody
would have to add --no-progress so I am not convinced. Also, both
pg_basebackup and pgbench don't show progress by default.
> About the code:
>
> + if (show_progress)
> + show_progress = false;
> + else
> + show_progress = true;
>
> Why not a simple:
>
> /* invert current state */
> show_progress = !show_progress;
Fair enough.
> I do not see much advantage in using intermediate string buffers for values:
>
> + snprintf(totalstr, sizeof(totalstr), INT64_FORMAT,
> + total_size / 1024);
> + snprintf(currentstr, sizeof(currentstr), INT64_FORMAT,
> + current_size / 1024);
> + snprintf(currspeedstr, sizeof(currspeedstr), INT64_FORMAT,
> + (current_size / 1024) / (((time(NULL) - scan_started) == 0) ? 1 :
> (time(NULL) - scan_started)));
> + fprintf(stderr, "%s/%s kB (%d%%, %s kB/s)",
> + currentstr, totalstr, total_percent, currspeedstr);
>
> ISTM that just one simple fprintf would be fine:
>
> fprintf(stderr, INT64_FORMAT "/" INT64_FORMAT " ...",
> formulas for each slot...);
That code is adopted from pg_basebackup.c and the comment there says:
| * Separate step to keep platform-dependent format code out of
| * translatable strings. And we only test for INT64_FORMAT
| * availability in snprintf, not fprintf.
So that sounds legit to me.
> ISTM that this line overwriting trick deserves a comment:
>
> + if (isatty(fileno(stderr)))
> + fprintf(stderr, "\r");
> + else
> + fprintf(stderr, "\n");
I agree a comment should be in there. But that code is also taken
verbatim from pg_basebackup.c (but this time, there's no comment there,
either).
How about this:
| * If we are reporting to a terminal, send a carriage return so that
| * we stay on the same line. If not, send a newline.
> And it runs a little amok with "-v".
Do you suggest we should make those mutually exlusive? I agree that in
general, -P is not very useful if -v is on, but if you have a really big
table, it should still be, no?
> + memset(&act, '\0', sizeof(act));
>
> pg uses 0 instead of '\0' everywhere else.
Ok.
> + /* Make sure we just report at least once a second */
> + if ((now == last_progress_update) && !force) return;
>
> The comments seems contradictory, I understand the code makes sure that it
> is "at most" once a second.
I guess you're right as the identical code in pg_basebackup.c has a
comment saying "Max once per second".
> Pgbench uses "-P XXX" to strigger a progress
> report every XXX second. I'm unsure whether it would be useful to allow the
> user to change the 1 second display interval.
I think pgbench writes a new line for each report? In that case, having
it every second for longer runs might be annoying and I can see the
point in having in configurable.
In the case of pg_basebackup/pg_verify_checksums, I guess 1 second is
fine.
> I'm not sure why you removed one unrelated line:
>
> #include "storage/checksum.h"
> #include "storage/checksum_impl.h"
>
> -
> static int64 files = 0;
> static int64 blocks = 0;
> static int64 badblocks = 0;
Merge error on my side I guess, will fix.
> There is a problem in the scan_file code: The added sizeonly parameter is
> not used. It should be removed.
Right, this might have been a leftover from an earlier version of the
code, I'll check back with Bernd to make sure that was not a
porting/merge error on my side.
I've attached V3 of the patch, addressing some of your comments.
Michael
--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: [email protected]
credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz
diff --git a/src/bin/pg_verify_checksums/pg_verify_checksums.c b/src/bin/pg_verify_checksums/pg_verify_checksums.c
index 1bc020ab6c..2e8b14dcb7 100644
--- a/src/bin/pg_verify_checksums/pg_verify_checksums.c
+++ b/src/bin/pg_verify_checksums/pg_verify_checksums.c
@@ -10,6 +10,8 @@
#include "postgres_fe.h"
#include <dirent.h>
+#include <signal.h>
+#include <time.h>
#include <sys/stat.h>
#include <unistd.h>
@@ -29,9 +31,18 @@ static ControlFileData *ControlFile;
static char *only_relfilenode = NULL;
static bool verbose = false;
+static bool show_progress = false;
static const char *progname;
+/*
+ * Progress status information.
+ */
+int64 total_size = 0;
+int64 current_size = 0;
+pg_time_t last_progress_update;
+pg_time_t scan_started;
+
static void
usage(void)
{
@@ -42,6 +53,7 @@ usage(void)
printf(_(" [-D, --pgdata=]DATADIR data directory\n"));
printf(_(" -v, --verbose output verbose messages\n"));
printf(_(" -r RELFILENODE check only relation with specified relfilenode\n"));
+ printf(_(" -P, --progress show progress information\n"));
printf(_(" -V, --version output version information, then exit\n"));
printf(_(" -?, --help show this help, then exit\n"));
printf(_("\nIf no data directory (DATADIR) is specified, "
@@ -57,6 +69,71 @@ static const char *const skip[] = {
NULL,
};
+static void
+toggle_progress_report(int signo,
+ siginfo_t * siginfo,
+ void *context)
+{
+
+ /* we handle SIGUSR1 only, and toggle the value of show_progress */
+ if (signo == SIGUSR1)
+ show_progress = !show_progress;
+
+}
+
+/*
+ * Report current progress status. Parts borrowed from
+ * PostgreSQLs' src/bin/pg_basebackup.c
+ */
+static void
+report_progress(bool force)
+{
+ pg_time_t now = time(NULL);
+ int total_percent = 0;
+
+ char totalstr[32];
+ char currentstr[32];
+ char currspeedstr[32];
+
+ /* Make sure we report at most once a second */
+ if ((now == last_progress_update) && !force)
+ return;
+
+ /* Save current time */
+ last_progress_update = now;
+
+ /*
+ * Calculate current percent done, based on KiB...
+ */
+ total_percent = total_size ? (int64) ((current_size / 1024) * 100 / (total_size / 1024)) : 0;
+
+ /* Don't display larger than 100% */
+ if (total_percent > 100)
+ total_percent = 100;
+
+ /* The same for total size */
+ if (current_size > total_size)
+ total_size = current_size / 1024;
+
+ snprintf(totalstr, sizeof(totalstr), INT64_FORMAT,
+ total_size / 1024);
+ snprintf(currentstr, sizeof(currentstr), INT64_FORMAT,
+ current_size / 1024);
+ snprintf(currspeedstr, sizeof(currspeedstr), INT64_FORMAT,
+ (current_size / 1024) / (((time(NULL) - scan_started) == 0) ? 1 : (time(NULL) - scan_started)));
+ fprintf(stderr, "%s/%s kB (%d%%, %s kB/s)",
+ currentstr, totalstr, total_percent, currspeedstr);
+
+ /*
+ * If we are reporting to a terminal, send a carriage return so that we
+ * stay on the same line. If not, send a newline.
+ */
+ if (isatty(fileno(stderr)))
+ fprintf(stderr, "\r");
+ else
+ fprintf(stderr, "\n");
+}
+
static bool
skipfile(const char *fn)
{
@@ -110,6 +187,7 @@ scan_file(const char *fn, BlockNumber segmentno)
continue;
csum = pg_checksum_page(buf.data, blockno + segmentno * RELSEG_SIZE);
+ current_size += r;
if (csum != header->pd_checksum)
{
if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_VERSION)
@@ -117,6 +195,9 @@ scan_file(const char *fn, BlockNumber segmentno)
progname, fn, blockno, csum, header->pd_checksum);
badblocks++;
}
+
+ if (show_progress)
+ report_progress(false);
}
if (verbose)
@@ -126,9 +207,10 @@ scan_file(const char *fn, BlockNumber segmentno)
close(f);
}
-static void
-scan_directory(const char *basedir, const char *subdir)
+static int64
+scan_directory(const char *basedir, const char *subdir, bool sizeonly)
{
+ int64 dirsize = 0;
char path[MAXPGPATH];
DIR *dir;
struct dirent *de;
@@ -191,16 +273,22 @@ scan_directory(const char *basedir, const char *subdir)
/* Relfilenode not to be included */
continue;
- scan_file(fn, segmentno);
+ dirsize += st.st_size;
+
+ if (!sizeonly)
+ {
+ scan_file(fn, segmentno);
+ }
}
#ifndef WIN32
else if (S_ISDIR(st.st_mode) || S_ISLNK(st.st_mode))
#else
else if (S_ISDIR(st.st_mode) || pgwin32_is_junction(fn))
#endif
- scan_directory(path, de->d_name);
+ dirsize += scan_directory(path, de->d_name, sizeonly);
}
closedir(dir);
+ return dirsize;
}
int
@@ -208,6 +296,7 @@ main(int argc, char *argv[])
{
static struct option long_options[] = {
{"pgdata", required_argument, NULL, 'D'},
+ {"progress", no_argument, NULL, 'P'},
{"verbose", no_argument, NULL, 'v'},
{NULL, 0, NULL, 0}
};
@@ -217,6 +306,8 @@ main(int argc, char *argv[])
int option_index;
bool crc_ok;
+ struct sigaction act; /* to turn progress status info on */
+
set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_verify_checksums"));
progname = get_progname(argv[0]);
@@ -235,7 +326,7 @@ main(int argc, char *argv[])
}
}
- while ((c = getopt_long(argc, argv, "D:r:v", long_options, &option_index)) != -1)
+ while ((c = getopt_long(argc, argv, "D:r:vP", long_options, &option_index)) != -1)
{
switch (c)
{
@@ -253,6 +344,9 @@ main(int argc, char *argv[])
}
only_relfilenode = pstrdup(optarg);
break;
+ case 'P':
+ show_progress = true;
+ break;
default:
fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
exit(1);
@@ -306,10 +400,54 @@ main(int argc, char *argv[])
exit(1);
}
+
+ /*
+ * Assign SIGUSR1 signal handler to toggle progress status information
+ */
+ memset(&act, '0', sizeof(act));
+ act.sa_sigaction = &toggle_progress_report;
+ act.sa_flags = SA_SIGINFO;
+
+ /*
+ * Enable signal handler, but don't treat it as severe if we don't
+ * succeed here. Just give a message on STDERR.
+ */
+ if (sigaction(SIGUSR1, &act, NULL) < 0)
+ {
+ fprintf(stderr, _("%s: could not set signal handler to toggle progress\n"),
+ progname);
+ }
+
+ /*
+ * Iff progress status information is requested, we need to scan the
+ * directory tree(s) twice, once to get the idea how much data we need
+ * to scan and finally to do the real legwork.
+ */
+ total_size = scan_directory(DataDir, "global", true);
+ total_size += scan_directory(DataDir, "base", true);
+ total_size += scan_directory(DataDir, "pg_tblspc", true);
+
+ /*
+ * Remember start time. Required to calculate the current speed in
+ * report_progress()
+ */
+ scan_started = time(NULL);
+
/* Scan all files */
- scan_directory(DataDir, "global");
- scan_directory(DataDir, "base");
- scan_directory(DataDir, "pg_tblspc");
+ scan_directory(DataDir, "global", false);
+ scan_directory(DataDir, "base", false);
+ scan_directory(DataDir, "pg_tblspc", false);
+
+ /*
+ * Done. Move to next line in case progress information was shown.
+ * Otherwise we clutter the summary output.
+ */
+ if (show_progress)
+ {
+ report_progress(true);
+ if (isatty(fileno(stderr)))
+ fprintf(stderr, "\n");
+ }
printf(_("Checksum scan completed\n"));
printf(_("Data checksum version: %d\n"), ControlFile->data_checksum_version);