čt 7. 1. 2021 v 19:51 odesílatel Matthias van de Meent <boekewurm+postg...@gmail.com> napsal: > > On Wed, 6 Jan 2021 at 22:45, Tomas Vondra <tomas.von...@enterprisedb.com> > wrote: > > > > On 1/5/21 11:02 AM, Josef Šimánek wrote: > > > I'm attaching the whole patch since commitfest failed to ingest the > > > last incremental on CI. > > > > > > > Yeah, the whole patch needs to be attached for the commitfest tester to > > work correctly - it can't apply pieces from multiple messages, etc. > > > > Anyway, I pushed this last version of patch, after a couple more tweaks, > > mainly to the docs - one place used pg_stat_copy_progress, the section > > was not indexed properly, and so on. > > Thank you all, I'd love to use this in the future to keep track of > (e.g.) my backup/restore progress. > > For my previously mentioned extension to keep track of filtered tuples > that are excluded by the WHERE-clause, PFA patch 0001 that keeps track > of that, in line with the current column name style of lines.
If I understand it well, this column could be used on COPY TO to track skipped lines because of BEFORE TRIGGER, right? I can include this in my following patch keeping lines_processed incremented even for skipped lines as well. > If so desired, I'll split this off into a different thread & CF entry. > > > I see Matthias proposed to change "lines" to "tuples" - I only saw the > > message after pushing, but I probably wouldn't make that change anyway. > > The CSV docs seem to talk about lines, newlines etc. so it seems fine. > > If not, we can change that. > > The CSV docs, sure. But copy doesn't only process CSVs; it also has > text (which does have a # lines = # tuples / rows guarantee) and > binary (in which the 'line' vocabulary doesn't make sense, and in > which the 'tuples' vocabulary is used). Additionally, most mentions of > postgres' logical rows/tuples in the COPY documentation use the 'rows' > terminology ('tuples' for FORMAT BINARY), and use 'line' only for > external format's textual representation's strings delimited by > newlines (which I believe is not exactly what we're counting). > > One common user of COPY is the pg_dump tool, and that uses binary > format by default (iirc). > > Additionally, all comments surrounding the *LINES_PROCESSED updates > only mention 'tuples', so I'd like to strongly suggest (a variant of) > attached patch 0002 to keep the vocabulary consistent by using > 'tuples' instead of 'lines'. > > > With regards, > > Matthias van de Meent