po 8. 2. 2021 v 19:35 odesílatel Matthias van de Meent
<boekewurm+postg...@gmail.com> napsal:
>
> Hi,
>
> With [0] we got COPY progress reporting. Before the column names of
> this newly added view are effectively set in stone with the release of
> pg14, I propose the following set of relatively small patches. These
> are v2, because it is a patchset that is based on a set of patches
> that I previously posted in [0].
>
> 0001 Adds a column to pg_stat_progress_copy which details the amount
> of tuples that were excluded from insertion by the WHERE clause of the
> COPY FROM command.
>
> 0002 alters pg_stat_progress_copy to use 'tuple'-terminology instead
> of 'line'-terminology. 'Line' doesn't make sense in the binary copy
> case, and only for the 'text' copy format there can be a guarantee
> that the source / output file actually contains the reported amount of
> lines, whereas the amount of data tuples (which is also what it's
> called internally) is guaranteed to equal for all data types.
>
> There was some discussion about this in [0] where the author thought
> 'line' is more consistent with the CSV documentation, and where I
> argued that 'tuple' is both more consistent with the rest of the
> progress reporting tables and more consistent with the actual counted
> items: these are the tuples serialized / inserted (as noted in the CSV
> docs; "Thus the files are not strictly one line per table row like
> text-format files.").
>
> Patch 0003 adds backlinks to the progress reporting docs from the docs
> of the commands that have progress reporting (re/index, cluster,
> vacuum, etc.) such that progress reporting is better discoverable from
> the relevant commands, and removes the datname column from the
> progress_copy view (that column was never committed). This too should
> be fairly trivial and uncontroversial.
>
> 0004 adds the 'command' column to the progress_copy view; which
> distinguishes between COPY FROM and COPY TO. The two commands are (in
> my opinion) significantly different enough to warrant this column;
> similar to the difference between CREATE INDEX/REINDEX [CONCURRENTLY]
> which also report that information. I believe that this change is
> appropriate; as the semantics of the columns change depending on the
> command being executed.
>
> Lastly, 0005 adds 'io_target' to the reported information, that is,
> FILE, PROGRAM, STDIO or CALLBACK. Although this can relatively easily
> be determined based on the commands in pg_stat_activity, it is
> reasonably something that a user would want to query on, as the
> origin/target of COPY has security and performance implications,
> whereas other options (e.g. format) are less interesting for clients
> that are not executing that specific COPY command.

I took a little deeper look and I'm not sure if I understand FILE and
STDIO. I have finally tried to finalize some initial regress testing
of COPY command progress using triggers. I have attached the initial
patch  applicable to your changes. As you can see COPY FROM STDIN is
reported as FILE. That's probably expected, but it is a little
confusing for me since STDIN and STDIO sound similar. What is the
purpose of STDIO? When is the COPY command reported with io_target of
STDIO?

> Of special interest in 0005 is that it reports the io_target for the
> logical replications' initial tablesyncs' internal COPY. This would
> otherwise be measured, but no knowledge about the type of copy (or its
> origin) would be available on the worker's side. I'm not married to
> this patch 0005, but I believe it could be useful, and therefore
> included it in the patchset.
>
>
> With regards,
>
> Matthias van de Meent.
>
>
> [0] 
> https://www.postgresql.org/message-id/flat/CAFp7Qwr6_FmRM6pCO0x_a0mymOfX_Gg%2BFEKet4XaTGSW%3DLitKQ%40mail.gmail.com
From d91fbe3a80c1858e9cbc126e4ade44945cea9ae9 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Josef=20=C5=A0im=C3=A1nek?= <josef.sima...@gmail.com>
Date: Tue, 9 Feb 2021 09:27:21 +0100
Subject: [PATCH] Add initial regress test for COPY progress reporting.

---
 src/test/regress/expected/progress.out | 43 ++++++++++++++++++++++++++
 src/test/regress/parallel_schedule     |  2 +-
 src/test/regress/sql/progress.sql      | 39 +++++++++++++++++++++++
 3 files changed, 83 insertions(+), 1 deletion(-)
 create mode 100644 src/test/regress/expected/progress.out
 create mode 100644 src/test/regress/sql/progress.sql

diff --git a/src/test/regress/expected/progress.out b/src/test/regress/expected/progress.out
new file mode 100644
index 000000000000..2c745b946366
--- /dev/null
+++ b/src/test/regress/expected/progress.out
@@ -0,0 +1,43 @@
+-- setup for COPY progress testing
+CREATE TEMP TABLE test_progress_with_trigger (
+  a int,
+  b text
+) ;
+CREATE OR REPLACE function notice_after_progress_reporting() RETURNS trigger AS
+$$
+DECLARE report record;
+BEGIN
+  SELECT INTO report * FROM pg_stat_progress_copy report WHERE pid = pg_backend_pid();
+  raise info 'progress datname: %', report.datname;
+  raise info 'progress command: %', report.command;
+  raise info 'progress io_target: %', report.io_target;
+  raise info 'progress bytes_processed: %', report.bytes_processed;
+  raise info 'progress bytes_total: %', report.bytes_total;
+  raise info 'progress tuples_processed: %', report.tuples_processed;
+  raise info 'progress tuples_excluded: %', report.tuples_excluded;
+
+  RETURN NEW;
+END;
+$$ LANGUAGE plpgsql;
+CREATE OR REPLACE TRIGGER check_after_progress_reporting
+    AFTER INSERT ON test_progress_with_trigger
+    FOR EACH ROW
+    EXECUTE FUNCTION notice_after_progress_reporting();
+-- simple COPY from STDIN
+COPY test_progress_with_trigger (a, b) FROM STDIN;
+INFO:  progress datname: regression
+INFO:  progress command: COPY FROM
+INFO:  progress io_target: FILE
+INFO:  progress bytes_processed: 9
+INFO:  progress bytes_total: 0
+INFO:  progress tuples_processed: 1
+INFO:  progress tuples_excluded: 0
+-- COPY from STDIN with WHERE skipping lines
+COPY test_progress_with_trigger (a, b) FROM STDIN WHERE a > 1;
+INFO:  progress datname: regression
+INFO:  progress command: COPY FROM
+INFO:  progress io_target: FILE
+INFO:  progress bytes_processed: 18
+INFO:  progress bytes_total: 0
+INFO:  progress tuples_processed: 1
+INFO:  progress tuples_excluded: 1
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 12bb67e49118..e445ad24f9ed 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -97,7 +97,7 @@ test: publication subscription
 # ----------
 # Another group of parallel tests
 # ----------
-test: select_views portals_p2 foreign_key cluster dependency guc bitmapops combocid tsearch tsdicts foreign_data window xmlmap functional_deps advisory_lock indirect_toast equivclass
+test: select_views portals_p2 foreign_key cluster dependency guc bitmapops combocid tsearch tsdicts foreign_data window xmlmap functional_deps advisory_lock indirect_toast equivclass progress
 
 # ----------
 # Another group of parallel tests (JSON related)
diff --git a/src/test/regress/sql/progress.sql b/src/test/regress/sql/progress.sql
new file mode 100644
index 000000000000..ef5d95056197
--- /dev/null
+++ b/src/test/regress/sql/progress.sql
@@ -0,0 +1,39 @@
+-- setup for COPY progress testing
+CREATE TEMP TABLE test_progress_with_trigger (
+  a int,
+  b text
+) ;
+
+CREATE OR REPLACE function notice_after_progress_reporting() RETURNS trigger AS
+$$
+DECLARE report record;
+BEGIN
+  SELECT INTO report * FROM pg_stat_progress_copy report WHERE pid = pg_backend_pid();
+  raise info 'progress datname: %', report.datname;
+  raise info 'progress command: %', report.command;
+  raise info 'progress io_target: %', report.io_target;
+  raise info 'progress bytes_processed: %', report.bytes_processed;
+  raise info 'progress bytes_total: %', report.bytes_total;
+  raise info 'progress tuples_processed: %', report.tuples_processed;
+  raise info 'progress tuples_excluded: %', report.tuples_excluded;
+
+  RETURN NEW;
+END;
+$$ LANGUAGE plpgsql;
+
+CREATE OR REPLACE TRIGGER check_after_progress_reporting
+    AFTER INSERT ON test_progress_with_trigger
+    FOR EACH ROW
+    EXECUTE FUNCTION notice_after_progress_reporting();
+
+-- simple COPY from STDIN
+COPY test_progress_with_trigger (a, b) FROM STDIN;
+1	test_1
+\.
+
+-- COPY from STDIN with WHERE skipping lines
+COPY test_progress_with_trigger (a, b) FROM STDIN WHERE a > 1;
+1	test_1
+2	test_2
+\.
+

Reply via email to