On Fri, Jan 4, 2019 at 3:08 PM David Rowley <david.row...@2ndquadrant.com> wrote:
> On Mon, 31 Dec 2018 at 18:58, Surafel Temesgen <surafel3...@gmail.com> > wrote: > > On Fri, Dec 28, 2018 at 6:46 PM Fabien COELHO <coe...@cri.ensmp.fr> > wrote: > >> > At first i also try to do it like that but it seems the function will > >> > became long and more complex to me > >> > >> Probably. But calling it with size 100 should result in the same > behavior, > >> so it is really just an extension of the preceeding one? Or am I missing > >> something? > >> > > > > Specifying table data using single value insert statement and user > specified values insert statement > > have enough deference that demand to be separate function and they are > not the same thing that should implement > > with the same function. Regarding code duplication i think the solution > is making those code separate function > > and call at appropriate place. > > I don't really buy this. I've just hacked up a version of > dumpTableData_insert() which supports a variable number rows per > statement. It seems fairly clean and easy to me. Likely the fact that > this is very possible greatly increases the chances of this getting in > since it gets rid of the code duplication. I did also happen to move > the row building code out of the function into its own function, but > that's not really required. I just did that so I could see all the > code in charge of terminating each statement on my screen without > having to scroll. I've not touched any of the plumbing work to plug > the rows_per_statement variable into the command line argument. So > it'll need a bit of merge work with the existing patch. There will > need to be some code that ensures that the user does not attempt to > have 0 rows per statement. The code I wrote won't behave well if that > happens. > > The attache patch use your method mostly ... Checks existing patch ... > > I see you added a test, but missed checking for 0. That needs to be fixed. > > + if (dopt.dump_inserts_multiple < 0) > + { > + write_msg(NULL, "argument of --insert-multi must be positive number\n"); > + exit_nicely(1); > + } > > fixed I also didn't adopt passing the rows-per-statement into the FETCH. I > think that's a very bad idea and we should keep that strictly at 100. > I don't see any reason to tie the two together. If a user wants 10 > rows per statement, do we really want to FETCH 10 times more often? > And what happens when they want 1 million rows per statement? We've no > reason to run out of memory from this since we're just dumping the > rows out to the archive on each row. > > okay > > + Specify the number of values per <command>INSERT</command> > command. > + This will make the dump file smaller than > <option>--inserts</option> > + and it is faster to reload but lack per row data lost on error > + instead entire affected insert statement data lost. > > Unsure what you mean about "data lost". It also controls the number > of "rows" per <command>INSERT</command> statement, not the number of > values. > > I think it would be fine just to say: > > + When using <option>--inserts</option>, this allows the maximum > number > + of rows per <command>INSERT</command> statement to be specified. > + This setting defaults to 1. > > > i change it too except "This setting defaults to 1" because it doesn't have default value. 1 row per statement means --inserts option . > > 2. Is --insert-multi a good name? What if they do --insert-multi=1? > That's not very "multi". Is --rows-per-insert better? > > --rows-per-insert is better for me . regards Surafel
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml index 9e0bb93f08..4195fb81a2 100644 --- a/doc/src/sgml/ref/pg_dump.sgml +++ b/doc/src/sgml/ref/pg_dump.sgml @@ -775,6 +775,16 @@ PostgreSQL documentation </listitem> </varlistentry> + <varlistentry> + <term><option>--rows-per-insert</option></term> + <listitem> + <para> + When using <option>--rows-per-insert</option>, this allows the maximum number + of rows per <command>INSERT</command> statement to be specified. + </para> + </listitem> + </varlistentry> + <varlistentry> <term><option>--load-via-partition-root</option></term> <listitem> diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h index 4a2e122e2d..73a243ecb0 100644 --- a/src/bin/pg_dump/pg_backup.h +++ b/src/bin/pg_dump/pg_backup.h @@ -72,6 +72,7 @@ typedef struct _restoreOptions int dropSchema; int disable_dollar_quoting; int dump_inserts; + int dump_inserts_multiple; int column_inserts; int if_exists; int no_comments; /* Skip comments */ @@ -144,6 +145,7 @@ typedef struct _dumpOptions /* flags for various command-line long options */ int disable_dollar_quoting; int dump_inserts; + int dump_inserts_multiple; int column_inserts; int if_exists; int no_comments; diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c index 0e129f9654..e49e2206e7 100644 --- a/src/bin/pg_dump/pg_dump.c +++ b/src/bin/pg_dump/pg_dump.c @@ -313,6 +313,7 @@ main(int argc, char **argv) int plainText = 0; ArchiveFormat archiveFormat = archUnknown; ArchiveMode archiveMode; + char *p; static DumpOptions dopt; @@ -359,6 +360,7 @@ main(int argc, char **argv) {"exclude-table-data", required_argument, NULL, 4}, {"if-exists", no_argument, &dopt.if_exists, 1}, {"inserts", no_argument, &dopt.dump_inserts, 1}, + {"rows-per-insert", required_argument, NULL, 8}, {"lock-wait-timeout", required_argument, NULL, 2}, {"no-tablespaces", no_argument, &dopt.outputNoTablespaces, 1}, {"quote-all-identifiers", no_argument, "e_all_identifiers, 1}, @@ -557,6 +559,27 @@ main(int argc, char **argv) dosync = false; break; + case 8: /* inserts row number */ + errno = 0; + dopt.dump_inserts_multiple = strtol(optarg, &p, 10); + if (p == optarg || *p != '\0') + { + write_msg(NULL, "argument of --rows-per-insert must be a number\n"); + exit_nicely(1); + } + if (errno == ERANGE) + { + write_msg(NULL, "argument of --rows-per-insert exceeds integer range.\n"); + exit_nicely(1); + } + if (dopt.dump_inserts_multiple <= 0) + { + write_msg(NULL, "argument of --rows-per-insert must be positive number\n"); + exit_nicely(1); + } + + break; + default: fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname); exit_nicely(1); @@ -607,8 +630,9 @@ main(int argc, char **argv) if (dopt.if_exists && !dopt.outputClean) exit_horribly(NULL, "option --if-exists requires option -c/--clean\n"); - if (dopt.do_nothing && !(dopt.dump_inserts || dopt.column_inserts)) - exit_horribly(NULL, "option --on-conflict-do-nothing requires option --inserts or --column-inserts\n"); + if (dopt.do_nothing && !(dopt.dump_inserts || dopt.column_inserts || + dopt.dump_inserts_multiple)) + exit_horribly(NULL, "option --on-conflict-do-nothing requires option --inserts , --rows-per-insert or --column-inserts\n"); /* Identify archive format to emit */ archiveFormat = parseArchiveFormat(format, &archiveMode); @@ -877,6 +901,7 @@ main(int argc, char **argv) ropt->use_setsessauth = dopt.use_setsessauth; ropt->disable_dollar_quoting = dopt.disable_dollar_quoting; ropt->dump_inserts = dopt.dump_inserts; + ropt->dump_inserts_multiple = dopt.dump_inserts_multiple; ropt->no_comments = dopt.no_comments; ropt->no_publications = dopt.no_publications; ropt->no_security_labels = dopt.no_security_labels; @@ -967,6 +992,7 @@ help(const char *progname) printf(_(" --exclude-table-data=TABLE do NOT dump data for the named table(s)\n")); printf(_(" --if-exists use IF EXISTS when dropping objects\n")); printf(_(" --inserts dump data as INSERT commands, rather than COPY\n")); + printf(_(" --rows-per-insert number of row per INSERT command\n")); printf(_(" --load-via-partition-root load partitions via the root table\n")); printf(_(" --no-comments do not dump comments\n")); printf(_(" --no-publications do not dump publications\n")); @@ -1886,6 +1912,8 @@ dumpTableData_insert(Archive *fout, void *dcontext) int tuple; int nfields; int field; + int number_of_row = 1; + int end_of_statement = 0; appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR " "SELECT * FROM ONLY %s", @@ -1900,67 +1928,82 @@ dumpTableData_insert(Archive *fout, void *dcontext) res = ExecuteSqlQuery(fout, "FETCH 100 FROM _pg_dump_cursor", PGRES_TUPLES_OK); nfields = PQnfields(res); + + /* + * First time through, we build as much of the INSERT statement as + * possible in "insertStmt", which we can then just print for each + * line. If the table happens to have zero columns then this will + * be a complete statement, otherwise it will end in "VALUES(" and + * be ready to have the row's column values appended. + */ + if (insertStmt == NULL) + { + TableInfo *targettab; + + insertStmt = createPQExpBuffer(); + + /* + * When load-via-partition-root is set, get the root table + * name for the partition table, so that we can reload data + * through the root table. + */ + if (dopt->load_via_partition_root && tbinfo->ispartition) + targettab = getRootTableInfo(tbinfo); + else + targettab = tbinfo; + + appendPQExpBuffer(insertStmt, "INSERT INTO %s ", + fmtQualifiedDumpable(targettab)); + + /* corner case for zero-column table */ + if (nfields == 0) + { + appendPQExpBufferStr(insertStmt, "DEFAULT VALUES;\n"); + } + else + { + /* append the list of column names if required */ + if (dopt->column_inserts) + { + appendPQExpBufferChar(insertStmt, '('); + for (field = 0; field < nfields; field++) + { + if (field > 0) + appendPQExpBufferStr(insertStmt, ", "); + appendPQExpBufferStr(insertStmt, + fmtId(PQfname(res, field))); + } + appendPQExpBufferStr(insertStmt, ") "); + } + + if (tbinfo->needs_override) + appendPQExpBufferStr(insertStmt, "OVERRIDING SYSTEM VALUE "); + + appendPQExpBufferStr(insertStmt, "VALUES "); + } + } + for (tuple = 0; tuple < PQntuples(res); tuple++) { - /* - * First time through, we build as much of the INSERT statement as - * possible in "insertStmt", which we can then just print for each - * line. If the table happens to have zero columns then this will - * be a complete statement, otherwise it will end in "VALUES(" and - * be ready to have the row's column values appended. - */ - if (insertStmt == NULL) - { - TableInfo *targettab; - - insertStmt = createPQExpBuffer(); - - /* - * When load-via-partition-root is set, get the root table - * name for the partition table, so that we can reload data - * through the root table. - */ - if (dopt->load_via_partition_root && tbinfo->ispartition) - targettab = getRootTableInfo(tbinfo); - else - targettab = tbinfo; - - appendPQExpBuffer(insertStmt, "INSERT INTO %s ", - fmtQualifiedDumpable(targettab)); - - /* corner case for zero-column table */ - if (nfields == 0) - { - appendPQExpBufferStr(insertStmt, "DEFAULT VALUES;\n"); - } - else - { - /* append the list of column names if required */ - if (dopt->column_inserts) - { - appendPQExpBufferChar(insertStmt, '('); - for (field = 0; field < nfields; field++) - { - if (field > 0) - appendPQExpBufferStr(insertStmt, ", "); - appendPQExpBufferStr(insertStmt, - fmtId(PQfname(res, field))); - } - appendPQExpBufferStr(insertStmt, ") "); - } - - if (tbinfo->needs_override) - appendPQExpBufferStr(insertStmt, "OVERRIDING SYSTEM VALUE "); - - appendPQExpBufferStr(insertStmt, "VALUES ("); - } - } - - archputs(insertStmt->data, fout); /* if it is zero-column table then we're done */ if (nfields == 0) + { + archputs(insertStmt->data, fout); continue; + } + + if (number_of_row == 1 || dopt->dump_inserts || end_of_statement) + { + archputs(insertStmt->data, fout); + archputs("(", fout); + end_of_statement = 0; + } + + if (number_of_row > 1) + { + archputs(", ( ", fout); + } for (field = 0; field < nfields; field++) { @@ -2027,12 +2070,40 @@ dumpTableData_insert(Archive *fout, void *dcontext) } } + if (dopt->dump_inserts) + { if (!dopt->do_nothing) archputs(");\n", fout); else archputs(") ON CONFLICT DO NOTHING;\n", fout); + } + + if (dopt->dump_inserts_multiple) + { + if (number_of_row == dopt->dump_inserts_multiple) + { + number_of_row = 1; + end_of_statement = 1; + if (!dopt->do_nothing) + archputs(");\n", fout); + else + archputs(") ON CONFLICT DO NOTHING;\n", fout); + } + else + { + archputs(")\n", fout); + number_of_row++; + } + } } + if (number_of_row > 1 && PQntuples(res) == 0) + { + if (!dopt->do_nothing) + archputs(";\n", fout); + else + archputs(" ON CONFLICT DO NOTHING;\n", fout); + } if (PQntuples(res) <= 0) { PQclear(res); @@ -2091,7 +2162,7 @@ dumpTableData(Archive *fout, TableDataInfo *tdinfo) char *copyStmt; const char *copyFrom; - if (!dopt->dump_inserts) + if (!dopt->dump_inserts && !dopt->dump_inserts_multiple) { /* Dump/restore using COPY */ dumpFn = dumpTableData_copy; diff --git a/src/bin/pg_dump/t/001_basic.pl b/src/bin/pg_dump/t/001_basic.pl index a875d540b8..ebd83922dd 100644 --- a/src/bin/pg_dump/t/001_basic.pl +++ b/src/bin/pg_dump/t/001_basic.pl @@ -118,8 +118,8 @@ command_fails_like( command_fails_like( [ 'pg_dump', '--on-conflict-do-nothing' ], - qr/\Qpg_dump: option --on-conflict-do-nothing requires option --inserts or --column-inserts\E/, - 'pg_dump: option --on-conflict-do-nothing requires option --inserts or --column-inserts'); + qr/\Qpg_dump: option --on-conflict-do-nothing requires option --inserts , --rows-per-insert or --column-inserts\E/, + 'pg_dump: option --on-conflict-do-nothing requires option --inserts , --rows-per-insert or --column-inserts'); # pg_dumpall command-line argument checks command_fails_like(