Re: Typo in pgbench messages.
Bonjour Michaël, I think it's better to back-patch this to stable branches if there's no objection. Thoughts? That's only cosmetic, so I would just bother about HEAD if I were to change something like that (I would not bother at all, personally). One argument against a backpatch is that this would be disruptive for tools that parse and analyze the output generated by pgbench. Fabien, don't you have some tools and/or wrappers doing exactly that? Yep, I like to "| cut -d' ' -fx" and other "line.split(' ')" or whatever. I think that the break of typographical rules is intentional, to allow such simplistic low-level stream handling through pipes or scripts. I'd prefer that the format not be changed. Maybe a comment could be added to explain the reason behind it. -- Fabien.
Re: Typo in pgbench messages.
Hello Daniel, I think that the break of typographical rules is intentional to allow such simplistic low-level stream handling through pipes or scripts. I'd prefer that the format is not changed. Maybe a comment could be added to explain the reason behind it. That doesn't sound like an overwhelmingly convincing argument to print some messages with 'X %' and others with 'X%'. Indeed. The no-space '%' forms are in the database loading progress and the final report, which I happen not to process through pipes and which are more user-facing reports. The stream-piping need applies more to repeated lines such as those output by progress, which happen to avoid percentages anyway, so the question does not arise. So fine with me wrt having a more homogeneous report. -- Fabien.
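(For instance, the kind of low-level pipe handling alluded to above, on the per-second progress lines; the field positions are assumed from a typical run, where the 4th space-separated field is the tps value, and "bench" is a hypothetical database. Note progress reports go to stderr:
sh> pgbench -T 60 -P 1 bench 2>&1 | grep '^progress:' | cut -d' ' -f4 )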
Re: Typo in pgbench messages.
So fine with me wrt having a more homogeneous report. So are you fine with Kawamoto-san's patch? Yes. Patch applies cleanly (hmmm, it would have been better to have it as an attachment). Make & make check ok. Fine with me. -- Fabien.
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
Hello Tatsuo-san, It seems the patch is ready for committer except below. Do you guys want to do more on below? I'm planning a new review of this significant patch, possibly over the next weekend, or the one after. -- Fabien.
Re: Commitfest 2022-03 Patch Triage Part 1b
Hello Greg, Peter posted an updated version of Fabien's patch about a month ago (which at this point no longer applies) Attached a v15 which is a rebase, after some minor changes in the source and some new test cases added (good!). fixing a few issues, but also pointing at old review comments still unaddressed. ISTM that all comments have been addressed. However, the latest patch raises issues about working around libpq corner-case behaviors, which are really just that: corner cases. Since this was pushed, but had to be reverted, I assume there is a desire for the feature but it seems to need more work still. It looks like Peter and Fabien were debating the merits of a libpq change and probably that won't happen this release cycle. ISTM these are really very minor issues that could be resolved in this cycle. Is there a kernel of psql functionality that can be extracted from this without the libpq change in this release cycle or should it wait until we add the functionality to libpq? The patch can wait for the issues to be resolved one way or another before proceeding, *or* it can be applied, maybe with a small tweak, and the libpq issues fixed separately. As a reminder, there are two actual "issues", "features" or "bugs" with libpq, which are made visible by the patch, but are pre-existing: (1) under some circumstances an infinite stream of empty results is returned, which has to be filtered out manually. (2) under some special circumstances some error messages may be output twice because of when libpq decides to reset the error buffer. (1) has been there forever, and (2) is under investigation to possibly improve the situation, so as to remove a hack in the code to avoid it. The alternative which IMO would be ok is to admit that under some very special conditions the same error message may be output twice, and if it is resolved later on then fine. If it's the latter then perhaps we should move this to 16? I'm not that pessimistic! I may be proven wrong:-) -- Fabien.
diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out index e0abe34bb6..8f7f93172a 100644 --- a/contrib/pg_stat_statements/expected/pg_stat_statements.out +++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out @@ -50,8 +50,28 @@ BEGIN \; SELECT 2.0 AS "float" \; SELECT 'world' AS "text" \; COMMIT; + float +--- + 2.0 +(1 row) + + text +--- + world +(1 row) + -- compound with empty statements and spurious leading spacing \;\; SELECT 3 + 3 \;\;\; SELECT ' ' || ' !' \;\; SELECT 1 + 4 \;; + ?column? +-- +6 +(1 row) + + ?column? +-- + ! +(1 row) + ?column? -- 5 @@ -61,6 +81,11 @@ COMMIT; SELECT 1 + 1 + 1 AS "add" \gset SELECT :add + 1 + 1 AS "add" \; SELECT :add + 1 + 1 AS "add" \gset + add +- + 5 +(1 row) + -- set operator SELECT 1 AS i UNION SELECT 2 ORDER BY i; i
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index caabb06c53..f01adb1fd2 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -127,18 +127,11 @@ echo '\x \\ SELECT * FROM foo;' | psql commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) - Also, psql only prints the - result of the last SQL command in the string. - This is different from the behavior when the same string is read from - a file or fed to psql's standard input, - because then psql sends - each SQL command separately. - Because of this behavior, putting more than one SQL command in a - single -c string often has unexpected results. - It's better to use repeated -c commands or feed - multiple commands to psql's standard input, + If having several commands executed in one transaction is not desired, + use repeated -c commands or feed multiple commands to + psql's standard input, either using echo as illustrated above, or via a shell here-document, for example: @@ -3570,10 +3563,6 @@ select 1\; select 2\; select 3; commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) -psql prints only the last query result -it receives for each request; in this example, although all -three SELECTs are indeed executed, psql -only prints the 3. @@ -4160,6 +4149,18 @@ bar +SHOW_ALL_RESULTS + + +When this variable is set to off, only the last +result of a combined query (\;) is shown instead of +all of them. The default is on. The off behavior +is for compatibility with older versions of psql.
Re: psql - add SHOW_ALL_RESULTS option
Are you planning to send a rebased patch for this commit fest? Argh, I did it in a reply in another thread:-( Attached v15. So as to help move things forward, I'd suggest that we should not care too much about corner-case repetition of some error messages which are due to libpq internals, so I could remove the ugly buffer reset from the patch and have the repetition, and if/when the issue is fixed later in libpq then the repetition will be removed, fine! The issue is that we just expose the strange behavior of libpq, which is for libpq to solve, not psql. What do you think? -- Fabien.
diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out index e0abe34bb6..8f7f93172a 100644 --- a/contrib/pg_stat_statements/expected/pg_stat_statements.out +++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out @@ -50,8 +50,28 @@ BEGIN \; SELECT 2.0 AS "float" \; SELECT 'world' AS "text" \; COMMIT; + float +--- + 2.0 +(1 row) + + text +--- + world +(1 row) + -- compound with empty statements and spurious leading spacing \;\; SELECT 3 + 3 \;\;\; SELECT ' ' || ' !' \;\; SELECT 1 + 4 \;; + ?column? +-- +6 +(1 row) + + ?column? +-- + ! +(1 row) + ?column? -- 5 @@ -61,6 +81,11 @@ COMMIT; SELECT 1 + 1 + 1 AS "add" \gset SELECT :add + 1 + 1 AS "add" \; SELECT :add + 1 + 1 AS "add" \gset + add +- + 5 +(1 row) + -- set operator SELECT 1 AS i UNION SELECT 2 ORDER BY i; i
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index caabb06c53..f01adb1fd2 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -127,18 +127,11 @@ echo '\x \\ SELECT * FROM foo;' | psql commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) - Also, psql only prints the - result of the last SQL command in the string. - This is different from the behavior when the same string is read from - a file or fed to psql's standard input, - because then psql sends - each SQL command separately. - Because of this behavior, putting more than one SQL command in a - single -c string often has unexpected results. - It's better to use repeated -c commands or feed - multiple commands to psql's standard input, + If having several commands executed in one transaction is not desired, + use repeated -c commands or feed multiple commands to + psql's standard input, either using echo as illustrated above, or via a shell here-document, for example: @@ -3570,10 +3563,6 @@ select 1\; select 2\; select 3; commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) -psql prints only the last query result -it receives for each request; in this example, although all -three SELECTs are indeed executed, psql -only prints the 3. @@ -4160,6 +4149,18 @@ bar +SHOW_ALL_RESULTS + + +When this variable is set to off, only the last +result of a combined query (\;) is shown instead of +all of them. The default is on. The off behavior +is for compatibility with older versions of psql.
+ + + + + SHOW_CONTEXT diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c index d65b9a124f..3b2f6305b4 100644 --- a/src/bin/psql/common.c +++ b/src/bin/psql/common.c @@ -34,6 +34,8 @@ static bool DescribeQuery(const char *query, double *elapsed_msec); static bool ExecQueryUsingCursor(const char *query, double *elapsed_msec); static bool command_no_begin(const char *query); static bool is_select_command(const char *query); +static int SendQueryAndProcessResults(const char *query, double *pelapsed_msec, + bool is_watch, const printQueryOpt *opt, FILE *printQueryFout, bool *tx_ended); /* @@ -354,7 +356,7 @@ CheckConnection(void) * Returns true for valid result, false for error state. */ static bool -AcceptResult(const PGresult *result) +AcceptResult(const PGresult *result, bool show_error) { bool OK; @@ -385,7 +387,7 @@ AcceptResult(const PGresult *result) break; } - if (!OK) + if (!OK && show_error) { const char *error = PQerrorMessage(pset.db); @@ -473,6 +475,18 @@ ClearOrSaveResult(PGresult *result) } } +/* + * Consume all results + */ +static void +ClearOrSaveAllResults() +{ + PGresult *result; + + while ((result = PQgetResult(pset.db)) != NULL) + ClearOrSaveResult(result); +} + /* * Print microtiming output. Always print raw milliseconds; if the interval @@ -573,7 +587,7 @@ PSQLexec(const char *que
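(For reference, here is what the SHOW_ALL_RESULTS variable documented in the patch above would look like in use — a hypothetical session sketching the expected behavior; with the default on, both results would be printed:
sh> psql
  \set SHOW_ALL_RESULTS off
  SELECT 1 AS one \; SELECT 2 AS two;
 two
-----
   2
(1 row) )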
Re: Commitfest 2022-03 Patch Triage Part 1b
Just FYI. Better to follow up to the thread for the patch that's already in the CF. Otherwise your patch will be missed by someone who looks at the CF entry to see the latest patch. Indeed. Done. -- Fabien.
Re: cpluspluscheck complains about use of register
It seems we should just remove the use of register? I have a vague idea that it was once important to say "register" if you are going to use the variable in an asm snippet that requires it to be in a register. That might be wrong, or it might be obsolete even if once true. We could try taking these out and seeing if the buildfarm complains. We have several inline asm statements not using register despite using variables in a register (e.g. pg_atomic_compare_exchange_u32_impl()), so I wouldn't expect a problem with compilers we support. Should we make configure test for -Wregister? There's at least one additional use of register that we'd have to change (pg_regexec). From a compilation perspective, "register" tells the compiler that you cannot take a pointer to the variable, i.e. it generates an error if someone adds something like: void *p = &register_variable; Removing the "register" declaration means that such protection would be removed, and creating such a pointer could drastically reduce compiler optimization opportunities. -- Fabien.
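(To illustrate the protection, a minimal standalone C example, variable names made up:
register int counter = 0;
int *p = &counter;   /* C compilers reject this: address of register variable requested */ )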
Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)
Hello Justin, I hope to look at it over the weekend. -- Fabien Coelho - CRI, MINES ParisTech
Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)
Hello Justin, Review of v34, up from v32 which I reviewed in January. v34 is an updated version of v32, without the part about lstat at the end of the series. All 7 patches apply cleanly. # part 01 One-liner doc improvement about the directory flag. OK. # part 02 Add tests for various options on pg_ls_dir, and for pg_stat_file, which were not exercised before. "make check" is ok. OK. # part 03 This patch adds a new pg_ls_dir_metadata. Internally, this is an extension of the pg_ls_dir_files function which is used by other pg_ls functions. Doc ok. New tests are added which check that the result columns are configured as required, including error cases. "make check" is ok. OK. # part 04 Add a new "isdir" column to "pg_ls_tmpdir" output. This is a small behavioral change. I'm ok with that, however I must say that I'm still unsure why we would not jump directly to a "type" char column. What is wrong with outputting 'd' or '-' instead of true or false? This way, the interface need not change if "lstat" is used later. ISTM that the interface issue should be somehow independent of the implementation issue, and we should choose the right/best interface directly. Independently, the documentation could be clearer about what happens to "isdir" when the file is a link; it may say that the behavior may change in the future. About homogeneity, I note that some people may be against adding "isdir" to other ls functions. I must say that I cannot see a good argument not to do it, and that I hate dealing with systems which are not homogeneous, because it creates surprises and some loss of time. "make check" ok. OK. # part 05 Add isdir to other ls functions. Doc is updated. Same as above, I'd prefer a char instead of a bool, as it is more extendable and future-proof. OK. # part 06 This part extends and adds a test for pg_ls_logdir. "make check" is ok. OK. # part 07 This part extends pg_stat_file with more date information. "make check" is ok. OK. # doc Overall doc generation is OK. -- Fabien.
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
Hello Yugo-san, About Pgbench error handling v16: This patch set needs a minor rebase because of 506035b0. Otherwise, the patch compiles, and global and local "make check" are ok. Doc generation is ok. This patch is in good shape, the code and comments are clear. Some minor remarks below, including typos and a few small suggestions. ## About v16-1 This refactoring patch adds a struct for managing pgbench variables, instead of mixing fields into the client state (CState) struct. Patch compiles, global and local "make check" are both ok. Although this patch is not necessary to add the feature, I'm fine with it as it improves pgbench source code readability. ## About v16-2 This last patch adds handling of serialization and deadlock errors to pgbench transactions. This feature is desirable because it enlarges performance testing options, and makes pgbench behave more like a database client application. Possible future extensions enabled by this patch include handling disconnection errors by trying to reconnect, for instance. The documentation is clear and well written, at least for my non-native speaker eyes and ears. English: "he will be aborted" -> "it will be aborted". I'm fine with renaming --report-latencies to --report-per-command as the latter is clearer about what the option does. I'm still not sure I like the "failure detailed" option; ISTM that the report could always be detailed. That would remove some complexity and I do not think that people executing a bench with error handling would mind having the details. No big deal. printVerboseErrorMessages: I'd make the buffer static and initialized only once, so that there is no significant malloc/free cycle involved when calling the function; see the sketch below. advanceConnectionState: I'd really prefer not to add new variables (res, status) in the loop scope, and only declare them when actually needed in the state branches, so as to avoid any unwanted interaction between states. typo: "fullowing" -> "following" Pipeline cleaning: the advance function is already so long, I'd put that in a separate function and call it. I think that the report should not remove data when they are 0, otherwise it makes it harder to script around it (in failures_detailed on line 6284). The tests cover the different cases. I tried to suggest a simpler approach in a previous round, but it seems not so simple to do so. They could be simplified later, if possible. -- Fabien.
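(About the printVerboseErrorMessages suggestion, a minimal sketch of what I mean, assuming the usual PQExpBuffer utilities; the function signature is abridged:
static void
printVerboseErrorMessages(...)
{
    static PQExpBufferData buf;    /* allocated once, kept across calls */
    static bool initialized = false;
    if (!initialized)
    {
        initPQExpBuffer(&buf);     /* first call: allocate */
        initialized = true;
    }
    else
        resetPQExpBuffer(&buf);    /* later calls: just rewind, no malloc/free */
    /* ... build and print the message into/from buf ... */
} )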
Re: psql - add SHOW_ALL_RESULTS option
Are you planning to send a rebased patch for this commit fest? Argh, I did it in a reply in another thread:-( Attached v15. So as to help move things forward, I'd suggest that we should not care too much about corner-case repetition of some error messages which are due to libpq internals, so I could remove the ugly buffer reset from the patch and have the repetition, and if/when the issue is fixed later in libpq then the repetition will be removed, fine! The issue is that we just expose the strange behavior of libpq, which is for libpq to solve, not psql. See attached v16 which removes the libpq workaround. -- Fabien.
diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out index e0abe34bb6..8f7f93172a 100644 --- a/contrib/pg_stat_statements/expected/pg_stat_statements.out +++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out @@ -50,8 +50,28 @@ BEGIN \; SELECT 2.0 AS "float" \; SELECT 'world' AS "text" \; COMMIT; + float +--- + 2.0 +(1 row) + + text +--- + world +(1 row) + -- compound with empty statements and spurious leading spacing \;\; SELECT 3 + 3 \;\;\; SELECT ' ' || ' !' \;\; SELECT 1 + 4 \;; + ?column? +-- +6 +(1 row) + + ?column? +-- + ! +(1 row) + ?column? -- 5 @@ -61,6 +81,11 @@ COMMIT; SELECT 1 + 1 + 1 AS "add" \gset SELECT :add + 1 + 1 AS "add" \; SELECT :add + 1 + 1 AS "add" \gset + add +- + 5 +(1 row) + -- set operator SELECT 1 AS i UNION SELECT 2 ORDER BY i; i
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index caabb06c53..f01adb1fd2 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -127,18 +127,11 @@ echo '\x \\ SELECT * FROM foo;' | psql commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) - Also, psql only prints the - result of the last SQL command in the string. - This is different from the behavior when the same string is read from - a file or fed to psql's standard input, - because then psql sends - each SQL command separately. - Because of this behavior, putting more than one SQL command in a - single -c string often has unexpected results. - It's better to use repeated -c commands or feed - multiple commands to psql's standard input, + If having several commands executed in one transaction is not desired, + use repeated -c commands or feed multiple commands to + psql's standard input, either using echo as illustrated above, or via a shell here-document, for example: @@ -3570,10 +3563,6 @@ select 1\; select 2\; select 3; commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) -psql prints only the last query result -it receives for each request; in this example, although all -three SELECTs are indeed executed, psql -only prints the 3. @@ -4160,6 +4149,18 @@ bar +SHOW_ALL_RESULTS + + +When this variable is set to off, only the last +result of a combined query (\;) is shown instead of +all of them. The default is on. The off behavior +is for compatibility with older versions of psql.
+ + + + + SHOW_CONTEXT diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c index d65b9a124f..7903075975 100644 --- a/src/bin/psql/common.c +++ b/src/bin/psql/common.c @@ -34,6 +34,8 @@ static bool DescribeQuery(const char *query, double *elapsed_msec); static bool ExecQueryUsingCursor(const char *query, double *elapsed_msec); static bool command_no_begin(const char *query); static bool is_select_command(const char *query); +static int SendQueryAndProcessResults(const char *query, double *pelapsed_msec, + bool is_watch, const printQueryOpt *opt, FILE *printQueryFout, bool *tx_ended); /* @@ -354,7 +356,7 @@ CheckConnection(void) * Returns true for valid result, false for error state. */ static bool -AcceptResult(const PGresult *result) +AcceptResult(const PGresult *result, bool show_error) { bool OK; @@ -385,7 +387,7 @@ AcceptResult(const PGresult *result) break; } - if (!OK) + if (!OK && show_error) { const char *error = PQerrorMessage(pset.db); @@ -473,6 +475,18 @@ ClearOrSaveResult(PGresult *result) } } +/* + * Consume all results + */ +static void +ClearOrSaveAllResults() +{ + PGresult *result; + + while ((result = PQgetResult(pset.db)) != NULL) + ClearOrSaveResult(result); +} + /* * Print microtiming output. Always print raw milliseconds; if the interval @@ -573,7 +
Re: [PATCH] pgbench: add multiconnect option
Hello Greg, It looks like David sent a patch and Fabien sent a followup patch. But there hasn't been a whole lot of discussion or further patches. It sounds like there are some basic questions about what the right interface should be. Are there specific questions that would be helpful for moving forward? Review the designs and patches and tell us what you think? Personally, I think that allowing multiple connections is a good thing, especially if the code impact is reduced, which is the case with the version I sent. Then for me the next step would be to have reconnection on errors, so as to implement a client-side failover policy that could help test the performance impact of a server failover. I have done that internally, but it requires the "Pgbench Serialization and deadlock errors" patch to land, as reconnection would just be another error that can be handled. -- Fabien.
Re: psql - add SHOW_ALL_RESULTS option
Hello Peter, See attached v16 which removes the libpq workaround. I suppose this depends on https://www.postgresql.org/message-id/flat/ab4288f8-be5c-57fb-2400-e3e857f53e46%40enterprisedb.com getting committed, because right now this makes the psql TAP tests fail because of the duplicate error message. How should we handle that? Ok, it seems I got the patch wrong. Attached v17 is another try. The point is to record the current status, whatever it is, buggy or not, and to update the test when libpq fixes things, whenever this is done. In this part of the patch, there seems to be part of a sentence missing: Indeed! The missing part was put back in v17. -- Fabien.diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out index e0abe34bb6..8f7f93172a 100644 --- a/contrib/pg_stat_statements/expected/pg_stat_statements.out +++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out @@ -50,8 +50,28 @@ BEGIN \; SELECT 2.0 AS "float" \; SELECT 'world' AS "text" \; COMMIT; + float +--- + 2.0 +(1 row) + + text +--- + world +(1 row) + -- compound with empty statements and spurious leading spacing \;\; SELECT 3 + 3 \;\;\; SELECT ' ' || ' !' \;\; SELECT 1 + 4 \;; + ?column? +-- +6 +(1 row) + + ?column? +-- + ! +(1 row) + ?column? -- 5 @@ -61,6 +81,11 @@ COMMIT; SELECT 1 + 1 + 1 AS "add" \gset SELECT :add + 1 + 1 AS "add" \; SELECT :add + 1 + 1 AS "add" \gset + add +- + 5 +(1 row) + -- set operator SELECT 1 AS i UNION SELECT 2 ORDER BY i; i diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index caabb06c53..f01adb1fd2 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -127,18 +127,11 @@ echo '\x \\ SELECT * FROM foo;' | psql commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) - Also, psql only prints the - result of the last SQL command in the string. - This is different from the behavior when the same string is read from - a file or fed to psql's standard input, - because then psql sends - each SQL command separately. - Because of this behavior, putting more than one SQL command in a - single -c string often has unexpected results. - It's better to use repeated -c commands or feed - multiple commands to psql's standard input, + If having several commands executed in one transaction is not desired, + use repeated -c commands or feed multiple commands to + psql's standard input, either using echo as illustrated above, or via a shell here-document, for example: @@ -3570,10 +3563,6 @@ select 1\; select 2\; select 3; commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) -psql prints only the last query result -it receives for each request; in this example, although all -three SELECTs are indeed executed, psql -only prints the 3. @@ -4160,6 +4149,18 @@ bar +SHOW_ALL_RESULTS + + +When this variable is set to off, only the last +result of a combined query (\;) is shown instead of +all of them. The default is on. The off behavior +is for compatibility with older versions of psql. 
+ + + + + SHOW_CONTEXT diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c index d65b9a124f..2f3dd91602 100644 --- a/src/bin/psql/common.c +++ b/src/bin/psql/common.c @@ -34,6 +34,8 @@ static bool DescribeQuery(const char *query, double *elapsed_msec); static bool ExecQueryUsingCursor(const char *query, double *elapsed_msec); static bool command_no_begin(const char *query); static bool is_select_command(const char *query); +static int SendQueryAndProcessResults(const char *query, double *pelapsed_msec, + bool is_watch, const printQueryOpt *opt, FILE *printQueryFout, bool *tx_ended); /* @@ -354,7 +356,7 @@ CheckConnection(void) * Returns true for valid result, false for error state. */ static bool -AcceptResult(const PGresult *result) +AcceptResult(const PGresult *result, bool show_error) { bool OK; @@ -385,7 +387,7 @@ AcceptResult(const PGresult *result) break; } - if (!OK) + if (!OK && show_error) { const char *error = PQerrorMessage(pset.db); @@ -473,6 +475,18 @@ ClearOrSaveResult(PGresult *result) } } +/* + * Consume all results + */ +static void +ClearOrSaveAllResults() +{ + PGresult *result; + + while ((result = PQgetResult(pset.db)) != NULL) + ClearOrSaveResult(result); +} + /* * Print microtiming output. Always
Re: [PATCH] pgbench: add multiconnect option
Hi Sami, Pgbench is a simple benchmark tool by design, and I wonder if adding a multiconnect feature will cause pgbench to be used incorrectly. Maybe, but I do not see how it would be worse than what pgbench already allows. A real-world use case would be helpful for this thread. Basically, more versatile testing for non-single-host setups. For instance, it would allow directly testing a multi-master setup, such as Bucardo, SymmetricDS or CockroachDB. It would be a first step on the path to allow interesting features such as: - testing a failover setup: on connection error a client could connect to another host. - testing a primary/standby setup, with write transactions sent to the primary and read transactions sent to the standbys. Basically I have no doubt that it can be useful. For the current patch, should the report also cover per-database statistics (tps/latency/etc.)? That could be a "per-connection" option. If there is a reasonable use case I think that it would be an easy enough feature to implement. Attached a rebased version. -- Fabien.
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index be1896fa99..69bd5b76f1 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -29,7 +29,7 @@ PostgreSQL documentation pgbench option - dbname + dbname or conninfo @@ -160,6 +160,9 @@ pgbench options d not specified, the environment variable PGDATABASE is used. If that is not set, the user name specified for the connection is used. +Alternatively, the dbname can be +a standard connection information string. +Several connections can be provided. @@ -843,6 +846,21 @@ pgbench options d + + --connection-policy=policy + + +Set the connection policy when multiple connections are available. +Default is round-robin provided (ro). +Possible values are: +first (f), +random (ra), +round-robin (ro), +working (w). + + + + -h hostname --host=hostname
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index 000ffc4a5c..5006e21766 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -278,13 +278,39 @@ bool is_connect; /* establish connection for each transaction */ bool report_per_command; /* report per-command latencies */ int main_pid; /* main process id used in log filename */ +char *logfile_prefix = NULL; + +/* main connection definition */ const char *pghost = NULL; const char *pgport = NULL; const char *username = NULL; -const char *dbName = NULL; -char *logfile_prefix = NULL; const char *progname; +/* multi connections */ +typedef enum mc_policy_t +{ + MC_UNKNOWN = 0, + MC_FIRST, + MC_RANDOM, + MC_ROUND_ROBIN, + MC_WORKING +} mc_policy_t; + +/* connection info list */ +typedef struct connection_t +{ + const char *connection; /* conninfo or dbname */ + int errors; /* number of connection errors */ +} connection_t; + +static int n_connections = 0; +static connection_t *connections = NULL; +static mc_policy_t mc_policy = MC_ROUND_ROBIN; + +/* last used connection */ +// FIXME per thread? +static int current_connection = 0; + #define WSEP '@'/* weight separator */ volatile bool timer_exceeded = false; /* flag from signal handler */ @@ -694,7 +720,7 @@ usage(void) { printf("%s is a benchmarking tool for PostgreSQL.\n\n" "Usage:\n" - " %s [OPTION]... [DBNAME]\n" + " %s [OPTION]... 
[DBNAME or CONNINFO ...]\n" "\nInitialization options:\n" " -i, --initialize invokes initialization mode\n" " -I, --init-steps=[" ALL_INIT_STEPS "]+ (default \"" DEFAULT_INIT_STEPS "\")\n" @@ -749,6 +775,7 @@ usage(void) " -h, --host=HOSTNAME database server host or socket directory\n" " -p, --port=PORT database server port number\n" " -U, --username=USERNAME connect as specified database user\n" + " --connection-policy=Sset multiple connection policy (\"first\", \"rand\", \"round-robin\", \"working\")\n" " -V, --versionoutput version information, then exit\n" " -?, --help show this help, then exit\n" "\n" @@ -1323,13 +1350,89 @@ tryExecuteStatement(PGconn *con, const char *sql) PQclear(res); } +/* store a new connection information string */ +static void +push_connection(const char *c) +{ + connections = pg_realloc(connections, sizeof(connection_t) * (n_connections+1)); + connections[n_connections].connection = pg_strdup(c); + connections[n_connections].errors = 0; + n_connections++; +} + +/* switch connection */ +static int +next_connection(int *pci) +{ + int ci; + + ci = ((*pci) + 1) % n_connections; + *pci = ci; + + return ci; +} + +/* return the connection index to use for next attempt */ +static int +choose_con
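(For the record, a possible invocation under the attached patch — host names and database are hypothetical:
sh> pgbench -T 60 --connection-policy=round-robin "host=pg1 dbname=bench" "host=pg2 dbname=bench" )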
Re: psql - add SHOW_ALL_RESULTS option
Hello Peter, Attached v17 is another try. The point is to record the current status, whatever it is, buggy or not, and to update the test when libpq fixes things, whenever this is done. [...] The expected output (which passes) contains this line twice: psql::2: FATAL: terminating connection due to administrator command psql::2: FATAL: terminating connection due to administrator command If I paste this test case into current master without your patch, I only get this line once. So your patch is changing this output. The whole point of the libpq fixes was to not have this duplicate output. So I think something is still wrong somewhere. Hmmm. Yes and no:-) The previous code path inside libpq silently ignores intermediate results: it skips all results to keep only the last one. The new approach does not discard results silently, hence the duplicated output: the messages are actually there and have always been there in the first place, they were just ignored. The previous "good" result is really a side effect of a bad implementation in a corner case, which just becomes apparent when opening the list of results. So my opinion is still to dissociate the libpq "bug/behavior" fix from this feature, as they are only loosely connected, because it is a very corner case anyway. An alternative would be to remove the test case, but I'd prefer that it be kept. If you want to wait for libpq to provide a solution for this corner case, I'm afraid that "never" is the likely result, especially as no test case exercises this path to show that there is a problem somewhere, so nobody will care to fix it. I'm not sure it is even worth it, given the highly special situation which triggers the issue, which is not such an actual problem (ok, the user is told twice that there was a connection loss, no big deal). -- Fabien.
Re: PG vs LLVM 12 on seawasp, next round
Hello Thomas, Since seawasp's bleeding edge LLVM installation moved to "trunk 20201114 c8f4e06b 12.0.0" ~16 days ago, it has been red. Further updates didn't help it and it's now on "trunk 20201127 6ee22ca6 12.0.0". I wonder if there is something in Fabien's scripting that needs to be tweaked, perhaps a symlink name or similar. The compiler compilation script is quite straightforward (basically, get sources, configure and compile), even for such a moving target… The right approach is to wait for some time before looking at the issue, typically one week for the next recompilation, in case the problem evaporates, so you were right not to jump on it right away:-) Andres investigated a few days ago, managed to reproduce the issue locally, and has a one-line patch. I'm unsure if it should be preventively back-patched, though. -- Fabien.
Re: Add table access method as an option to pgbench
Hello David, Some feedback about v4. It looks like the option is *silently* ignored when creating a partitioned table, which currently results in an error, and not passed to partitions, which would accept them. This is pretty weird. The input check is added with an error message when both partitions and table access method are used. Hmmm. If you take the reset-the-default approach, I do not think that you should have to test anything? AFAICT the access method is valid on partitions, although not on the partitioned table declaration. So I'd say that you could remove the check. They should also trigger failures, eg "--table-access-method=no-such-table-am", to be added to the @errors list. Not sure how to address this properly, since the table access method potentially can be *any* name. I think it is pretty unlikely that someone would choose the name "no-such-table-am" when developing a new storage engine for Postgres inside Postgres, so that it would make this internal test fail. There are numerous such names used in tests, eg "no-such-database", "no-such-command". So I'd suggest to add such a test to check for the expected failure. I do not understand why you duplicated every possible option entry. A test with just the table access method looks redundant if the feature is made to work orthogonally to partitions, which should be the case. Only one positive test case added using *heap* as the table access method to verify the initialization. Yep, only "heap" is currently available for tables. -- Fabien.
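(The kind of failing test I have in mind — the database name is made up; the server error shown is PostgreSQL's usual complaint for an unknown access method, and the exact pgbench wrapping around it is elided:
sh> pgbench -i --table-access-method=no-such-table-am bench
ERROR:  access method "no-such-table-am" does not exist )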
Re: PG vs LLVM 12 on seawasp, next round
Hello Andres, I hadn't checked that before, but for the last few days there's been a different failure than the one I saw earlier: +ERROR: could not load library "/home/fabien/pg/build-farm-11/buildroot/HEAD/pgsql.build/tmp_install/home/fabien/pg/build-farm-11/buildroot/HEAD/inst/lib/postgresql/llvmjit.so": libLLVMOrcJIT.so.12git: cannot open shared object file: No such file or directory whereas what I fixed is about: +ERROR: could not load library "/home/fabien/pg/build-farm-11/buildroot/HEAD/pgsql.build/tmp_install/home/fabien/pg/build-farm-11/buildroot/HEAD/inst/lib/postgresql/llvmjit.so": /home/fabien/pg/build-farm-11/buildroot/HEAD/pgsql.build/tmp_install/home/fabien/pg/build-farm-11/buildroot/HEAD/inst/lib/postgresql/llvmjit.so: undefined symbol: LLVMInitializeX86Target Changed somewhere between https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=seawasp&dt=2020-11-20%2009%3A17%3A10 and https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=seawasp&dt=2020-11-21%2023%3A17%3A11 The "no such file" error seems more like a machine-local issue to me. I'll look into it when I have time, which may take some time. Hopefully over the holidays. -- Fabien.
Re: \gsetenv
Hello David, We have \gset to set some parameters, but not ones in the environment, so I fixed this with a new analogous command, \gsetenv. I considered refactoring SetVariable to include environment variables, but for a first cut, I just made a separate function and an extra if. My 0.02€: ISTM that you do not really need that, as it can already be achieved with \gset and \setenv, so I would not bother to add a \gsetenv.
sh> psql
  SELECT 'Calvin' AS foo \gset
  \setenv FOO :foo
  \! echo $FOO
Calvin
-- Fabien.
Re: pgbench failed when -f option contains a char '@'
Hello, pgbench uses -f filename[@weight] to receive an SQL script file with a weight. ISTM that I thought of this: "pgbench -f filen@me@1" does work.
sh> touch foo@bla
sh> pgbench -f foo@bla@1
pgbench: fatal: empty command list for script "foo@bla"
The documentation could point this out, though. -- Fabien.
Re: pgbench failed when -f option contains a char '@'
Hello Tom, I think we should just leave this as it is. The user can simply rename the file. Yeah. The assumption when we defined the script-weight syntax was that there's no particular reason to use "@" in a script file name, and I don't see why that's a bad assumption. The "parser" looks for the last @ in the argument, so the simple workaround is to append "@1". I suggest the attached doc update, or anything in better English. -- Fabien.
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 7180fedd65..bba3cf05b0 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -461,6 +461,8 @@ pgbench options d
 the list of executed scripts. An optional integer weight after
 @ allows to adjust the probability of drawing the test.
+If the filename includes a @ character, append a weight so
+that there is no ambiguity: filen@me@1.
 See below for details.
Re: [PATCH] Automatic HASH and LIST partition creation
CREATE TABLE foo(a int) PARTITION BY LIST(a) CONFIGURATION (FOR VALUES IN (1,2),(3,4) DEFAULT PARTITION foo_def); I would like to disagree with this syntactic approach because it would be very specific to each partition method. IMHO the syntax should be as generic as possible. I'd suggest (probably again) a keyword/value list, which would be quite adaptable without inducing any pressure on the parser. -- Fabien.
Re: [PATCH] Automatic HASH and LIST partition creation
Hello. CREATE TABLE foo(a int) PARTITION BY LIST(a) CONFIGURATION (FOR VALUES IN (1,2),(3,4) DEFAULT PARTITION foo_def); I would like to disagree with this syntactic approach because it would be very specific to each partition method. IMHO the syntax should be as generic as possible. I'd suggest (probably again) a keyword/value list, which would be quite adaptable without inducing any pressure on the parser. If I remember your proposal correctly it is something like CREATE TABLE foo(...) PARTITION BY HASH AUTOMATIC (MODULUS 10); Yep, that would be the spirit. It is still possible but there are some caveats: 1. We'll need to add keyword MODULUS (and probably AUTOMATIC) to the parser's list. Why? We could accept anything in the list? i.e.: (ident =? value[, ident =? value]*) I'm not against this, but as far as I've heard there is some opposition in the PG community against new keywords. Maybe I am wrong. The ident is a keyword that can be interpreted later on, not a "reserved keyword" from a parser perspective, which is the only real issue? The parser does not need to know about it, only the command interpreter which will have to interpret it. AUTOMATIC is a nice parser cue to introduce such an ident-value list. 2. The existing syntax for declarative partitioning is different from your proposal. Yep. I think that it was not so good a design choice from a language/extensibility perspective. It is still not a big problem, and your proposal makes the query shorter by several words. I'd just like to see some consensus on the syntax. Now I must admit there are too many contradictions in opinions, which makes progress slow. Also I think it is important to have a really convenient syntax. 2a Maybe we all who participated in the thread can vote for some variant? 2b Maybe the existing syntax for declarative partitioning should be given some priority as it is already committed into CREATE TABLE ... PARTITION OF ... FOR VALUES IN ... etc. I'd be happy if everyone would join some version of the proposed syntax in this thread and in the previous discussion [1]. If we have a variant with more than one supporter, sure we can develop a patch based on it. Thank you very much and Merry Christmas! [1] https://www.postgresql.org/message-id/flat/alpine.DEB.2.21.1907150711080.22273%40lancre -- Fabien.
Re: [PATCH] Automatic HASH and LIST partition creation
BTW could you tell me a couple of words about the pros and cons of C-code syntax parsing compared to parsing using gram.y trees? I'd rather use an automatic tool (lexer/parser) than do it by hand if I can. If you want a really nice syntax with clever tricks, then you may need to switch to manual, though, but pg/sql is not in that class. I think both are possible, but my predisposition was that we'd better use the latter if possible. I agree. -- Fabien.
Re: [PATCH] Automatic HASH and LIST partition creation
Fabien, do you consider it possible to change the syntax of declarative partitioning too? My 0.02 €: What I think does not matter much; what committers think is the way to get something passed. However, I do not think that such an idea would pass a committer:-) It is problematic as it is already committed, but it is also very tempting to have the same type of syntax both in automatic partitioning and in manual (PARTITION OF...) I think that if a "common" syntax, for a given meaning of common, can be thought of, and without breaking backward compatibility, then there may be an argument to provide such a syntax, but I would not put too much energy into that if I were you. I see 3 cases: - partition declaration but no actual table generated, the current version. - partition declaration with actual sub-tables generated, eg for hash where it is pretty straightforward to know what would be needed, or for a bounded range. - partition declaration without generated tables, but they are generated on demand, when needed; for a range one may want weekly or monthly partitions without creating tables in advance, esp. if it is unbounded. ISTM that the syntax should be clear and if possible homogeneous for all three use cases, even if they are not implemented yet. It should also allow easy extensibility, hence something without a strong syntax: key/value pairs to be interpreted later, as sketched below. -- Fabien.
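(To make the key/value idea concrete, a purely hypothetical sketch covering the three cases — none of this syntax exists, the keywords and values are made up:
CREATE TABLE foo(a int) PARTITION BY HASH (a) AUTOMATIC (modulus = 10);
CREATE TABLE log(t timestamptz) PARTITION BY RANGE (t) AUTOMATIC (every = '1 month', start = '2021-01-01', tables = 'on demand'); )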
timestamp bogus parser?
Hello, I tried: psql> SELECT TIMESTAMP '2020-12-23Z19:28:45'; The result of which is: 2020-12-23 00:00:00 This is disappointing. Ok, my fault, I should have written TIMESTAMPTZ, or used T instead of Z, or whatever. Anyway, is there a rationale for this behavior? I would have expected either the time to be taken into account or an error message, but certainly not half of my string being silently ignored. Skimming through the doc (Section 8.5.1.3) did not provide an answer. Any clues? Is this a bug? -- Fabien.
Re: Add table access method as an option to pgbench
Hello Justin, src/test/regress/sql/create_am.sql:CREATE ACCESS METHOD heap2 TYPE TABLE HANDLER heap_tableam_handler; ... src/test/regress/sql/create_am.sql:DROP ACCESS METHOD heap2; Or maybe using SET default_tablespace instead of modifying the CREATE sql. That'd need to be done separately for indexes, and RESET after creating the tables, to avoid accidentally affecting indexes, too. Why should it not affect indexes? @Fabien: I think the table should be created and populated within the same transaction, to allow this optimization in v13 to apply during the "initialization" phase. c6b92041d Skip WAL for new relfilenodes, under wal_level=minimal. Possibly. This would be a change of behavior: currently only filling tables is under an explicit transaction. That would be a matter for another patch, probably with an added option. As VACUUM is not transaction compatible, it might be a little bit tricky to add such a feature. Also I'm not sure about some constraint declarations. ISTM that I submitted a patch a long time ago to allow "()" to control transaction from the command line, but somehow it got lost or rejected. I redeveloped it, see attached. I cannot see reliable performance improvement by playing with (), though. -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index b03d0cc50f..8d9b642fdd 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -170,7 +170,7 @@ pgbench options d init_steps specifies the initialization steps to be performed, using one character per step. Each step is invoked in the specified order. -The default is dtgvp. +The default is dt(g)vp. The available steps are: @@ -226,6 +226,22 @@ pgbench options d + + ( (Begin) + + +Begin a transaction. + + + + + ) (Commit) + + +Commit a transaction. 
+ + + v (Vacuum) diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index 3057665bbe..f77e33056c 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -132,8 +132,8 @@ static int pthread_join(pthread_t th, void **thread_return); / * some configurable parameters */ -#define DEFAULT_INIT_STEPS "dtgvp" /* default -I setting */ -#define ALL_INIT_STEPS "dtgGvpf" /* all possible steps */ +#define DEFAULT_INIT_STEPS "dt(g)vp" /* default -I setting */ +#define ALL_INIT_STEPS "dtgGvpf()" /* all possible steps */ #define LOG_STEP_SECONDS 5 /* seconds between log messages */ #define DEFAULT_NXACTS 10 /* default nxacts */ @@ -3823,12 +3823,6 @@ initGenerateDataClientSide(PGconn *con) fprintf(stderr, "generating data (client-side)...\n"); - /* - * we do all of this in one transaction to enable the backend's - * data-loading optimizations - */ - executeStatement(con, "begin"); - /* truncate away any old data */ initTruncateTables(con); @@ -3940,8 +3934,6 @@ initGenerateDataClientSide(PGconn *con) } termPQExpBuffer(&sql); - - executeStatement(con, "commit"); } /* @@ -3958,12 +3950,6 @@ initGenerateDataServerSide(PGconn *con) fprintf(stderr, "generating data (server-side)...\n"); - /* - * we do all of this in one transaction to enable the backend's - * data-loading optimizations - */ - executeStatement(con, "begin"); - /* truncate away any old data */ initTruncateTables(con); @@ -3989,8 +3975,6 @@ initGenerateDataServerSide(PGconn *con) executeStatement(con, sql.data); termPQExpBuffer(&sql); - - executeStatement(con, "commit"); } /* @@ -4076,6 +4060,8 @@ initCreateFKeys(PGconn *con) static void checkInitSteps(const char *initialize_steps) { + bool in_tx = false; + if (initialize_steps[0] == '\0') { pg_log_fatal("no initialization steps specified"); @@ -4090,6 +4076,36 @@ checkInitSteps(const char *initialize_steps) pg_log_info("Allowed step characters are: \"" ALL_INIT_STEPS "\"."); exit(1); } + + if (*step == '(') + { + if (in_tx) + { +pg_log_fatal("opening a transaction inside a transaction"); +exit(1); + } + in_tx = true; + } + else if (*step == ')') + { + if (!in_tx) + { +pg_log_fatal("closing a transaction without opening it"); +exit(1); + } + in_tx = false; + } + else if (*step == 'v' && in_tx) + { + pg_log_fatal("cannot run vacuum within a transaction"); + exit(1); + } + } + + if (in_tx) + { + pg_log_fatal("uncommitted transaction"); + exit(1); } } @@ -4150,6 +4166,14 @@ runInitSteps(const char *initialize_steps) op = "foreign keys"; initCreateFKeys(con); break; + case '(': +op = "begin"; +executeStatement(con, "begin"); +break; + case ')': +op = "commit"; +
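(With the attached patch, grouping initialization steps could look like this — a sketch; "bench" is a hypothetical database, the parentheses wrap the grouped steps in a single transaction, and vacuum is kept outside since it cannot run inside one; the patched default is dt(g)vp:
sh> pgbench -i -I '(dtgp)v' bench )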
Re: pgsql: Add pg_alterckey utility to change the cluster key
Hello Bruce, I put the thread back on hackers. The first two keys are stored in pg_cryptokeys/ in the data directory, while the third one is retrieved using a GUC for validation at server startup for the other two. Do we necessarily have to store the first-level keys within the data directory? I guess that this choice has been made for performance, but is that really something that a user would want all the time? AES256 is the only option available for the data keys. What if somebody wants to roll their own encryption? To clarify, we encrypt the data keys using AES256, but the data keys themselves can be 128, 192, or 256 bits. Companies can have many requirements in terms of accepting the use of one option or another. I think ultimately we will need three commands to control the keys. First, there is the cluster_key_command, which we have now. Second, I think we will need an optional command which returns random bytes --- this would allow users to get random bytes from a different source than that used by the server code. Third, we will probably need a command that returns the data encryption keys directly, either heap/index or WAL keys, probably based on key number --- you pass the key number you want, and the command returns the data key. There would not be a cluster key in this case, but the command could still prompt the user for perhaps a password to the KMS server. It could not be used if any of the previous two commands are used. I assume an HMAC would still be stored in the pg_cryptokeys directory to check that the right key has been returned. Yep, my point is that it should be possible to have the whole key management outside of postgres. This said, postgres should provide a reasonable default implementation, obviously, preferably by using the provided mechanism (*NOT* a direct internal implementation and a possible switch to something else, IMHO, because then it would not be tested for whether it provides the right level of usability). I agree that keys need to be identified. I somehow disagree with the naming of the script and the implied usage. ISTM that there could be an external interface: - to initialize something. It may start a suid process, it may connect to a remote host, it may ask for a master password, who knows?
/path/to/init --options arguments…
The init process would return something which would be reused later on, eg an authentication token, or maybe a path to a socket for communication, or a file which contains something, or even a master/cluster key, but not necessarily. It may be anything. How it is passed to the next process/connection is an open question. Maybe on its stdin? - to start a process (?) which provides keys, either created (new) or existing (get), and possibly destroys them (or not?). The init process result should/could be passed somehow to this process, which may be suid something else. Another option would be to rely on some IPC mechanism. I'm not sure what the best choice is. ISTM that this process/connection could/should be persistent, with a simplistic text- or binary-based client/server interface. What this process/connection does is beyond postgres. In my mind, it could implement getting random data as well. I'd suggest that under no circumstances should the postgres process create cryptographic keys, although it should probably name them, with some predefined length limit.
/path/to/run --options arguments…
Then there should be a postgres-internal interface to store the results for local processing, retrieve them when needed, and so on, ok.
ISTM that there should also be an internal interface to load the cryptographic primitives. Possibly a so/dll would do, or maybe just an extension mechanism which would provide the necessary functions, but this raises the issue of bootstrapping, so maybe not so great an idea. The functions should probably be able to implement a counter mode, so that the actual key depends on the page position in the file, but what it really does is not postgres's concern. A cryptographic concern for me is whether it would be possible to have authentication/integrity checks associated with each page. This means having the ability to reserve some space somewhere, possibly 8-16 bytes, in a page. Different algorithms could have different space requirements. The same interface should be used by other back-end commands (pg_upgrade, whatever). Somehow, the design should be abstract, without implying much, so that very different interfaces could be provided in terms of whether there exists a master key, how keys are derived, what key sizes are, what algorithms are used, and so on. Postgres itself should not store keys, only key identifiers. I'm wondering whether replication should be able to work without some/all keys, so that a streaming replication could be implemented without the remote host being fully aware of the cryptographic keys. Another functional
Re: Proposed patch for key managment
I want to repeat here what I said in another thread: I think ultimately we will need three commands to control the keys. First, there is the cluster_key_command, which we have now. Second, I think we will need an optional command which returns random bytes --- this would allow users to get random bytes from a different source than that used by the server code. Third, we will probably need a command that returns the data encryption keys directly, either heap/index or WAL keys, probably based on key number --- you pass the key number you want, and the command returns the data key. There would not be a cluster key in this case, but the command could still prompt the user for perhaps a password to the KMS server. It could not be used if any of the previous two commands are used. I assume an HMAC would still be stored in the pg_cryptokeys directory to check that the right key has been returned. I thought we should implement the first command, because it will probably be the most common and easiest to use, and then see what people want added. There is also a fourth option where the command returns multiple keys, one per line of hex digits. That could be written in shell script, but it would be fragile and complex. It could be written in Perl, but that would add a new language requirement for this feature. It could be written in C, but that would limit its flexibility because changes would require a recompile, and you would probably need that C file to call external scripts to tailor input like we do now from the server. You could actually write a full implementation of what we do on the server side in client code, but pg_alterckey would not work, since it would not know the data format, so we would need another cluster key alter for that. It could be written as a C extension, but that would also have C's limitations. In summary, having the server do most of the complex work for the default case seems best, and eventually allowing the ability for the client to do everything seems ideal. I think we need more input before we go beyond what we do now. As I said in the commit thread, I disagree with this approach because it pushes for no or partial or maybe bad design. I think that an API should be carefully thought about, without assumptions about the underlying cryptography (algorithm, key lengths, modes, how keys are derived and stored, and so on), and its usefulness be demonstrated by actually being used for one implementation, which would be what is currently being proposed in the patch, and possibly others thrown in for free. The implementations should not have to be in any particular language: Shell, Perl, Python, C should be possible. After giving it more thought during the day, I think that only one command and a basic protocol is needed. Maybe something as simple as
/path/to/command --options arguments…
with a basic (text? binary?) protocol on stdin/stdout (?) for the different functions. What the command actually does (connect to a remote server, ask for a master password, open some other database, whatever) should be irrelevant to pg, which would just get and pass a bunch of bytes to functions, which could use them for keys, secrets, whatever, and be easily replaceable.
The API should NOT make assumptions about the cryptographic design: what depends on what, where things are stored… ISTM that Pg should only care about naming keys, holding them when created/retrieved (but not creating them), basically interacting with the key manager, and passing the stuff to functions for encryption/decryption seen as black boxes. I may have suggested something along these lines at the beginning of the key management thread, probably. Not going this way implicitly means making some assumptions which may or may not suit other use cases, so it makes them specific rather than generic. I do not think pg should do that. -- Fabien.
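As a concrete illustration of the single-command idea, here is a hypothetical invocation and line protocol; the command name, option, and message format are all invented for this sketch and do not exist in the patch:

    # postgresql.conf (hypothetical)
    key_command = '/usr/local/bin/pg_keyd --vault example-kms'

    # hypothetical text protocol on the command's stdin/stdout:
    # pg asks for a named key, the command answers with hex-encoded bytes
    > GET wal-key
    < OK 6b6579206d6174657269616c2e2e2e
    > GET relation-key-3
    < ERR no such key

Whether the command prompts for a passphrase, talks to a KMS, or reads a local file is invisible to pg, which only sees named bunches of bytes.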
Re: trailing junk in numeric literals
Hello Peter, My 0.02€: So strictly speaking this SQL code is nonstandard anyway. But our lexer has always been forgiving about not requiring space if it's not logically necessary to separate tokens. I doubt trying to change that would improve matters. Well, the idea is to diagnose potential typos better. But if there is no interest, then that's fine. ISTM that silently accepting bogus syntax hides bugs rather than helps users. I'm personally all for fixing these, especially when I'm said user. My latest catch was: SELECT TIMESTAMP '2020-12-29Z06:16:18'; # 2020-12-29 00:00:00 But: SELECT TIMESTAMPTZ '2020-12-29Z06:16:18'; # 2020-12-29 07:16:18+01 SELECT TIMESTAMP '2020-12-29T06:16:18'; # 2020-12-29 06:16:18 I happened to type an O, which is close to 0 and for which the shift key is also needed on the French keyboard. This yields the unhelpful: SELECT 12O; # 12 as O I think that the policy should be to help users by detecting mistyped entries, not trying to interpret them out of the norm and expectations. -- Fabien.
Re: Proposed patch for key management
Hello Bruce, The API should NOT make assumptions about the cryptographic design: what depends on what, where things are stored… ISTM that Pg should only care about naming keys, holding them when created/retrieved (but not creating them), basically interacting with the key manager, and passing the stuff to functions for encryption/decryption seen as black boxes. I may have suggested something along these lines at the beginning of the key management thread, probably. Not going this way implicitly means making some assumptions which may or may not suit other use cases, so it makes them specific rather than generic. I do not think pg should do that. I am not sure what else I can add to this discussion. Having something that is completely external might be a nice option, but I don't think it is the common use-case, and will make the most common use cases more complex. I'm unsure about what the "common use case" is. ISTM that having the postgres process hold the master key looks like a "no" for me in any use case with an actual security concern: as the database must be remotely accessible, a reasonable security assumption is that a pg account could be compromised, and the "master key" (if any, that is just one particular cryptographic design) should not be accessible in that case. The first barrier would be the pg admin account. With an external approach, the key is held by another id, so the host must be compromised as well, and it could even be managed remotely on another host (I agree that this latter option would be fragile because of the implied network connection, but it would be a tradeoff depending on the security concerns; it pretty much corresponds to the kerberos model). Adding a pre-defined system will not prevent future work on a completely external option. Yes and no. Having two side-by-side cluster-encryption schemes in core, the first of which could be implemented on top of the second without much effort, would look pretty weird. Moreover, the changes required in core to allow an API are the very same as the ones in this patch. The other concern I have with doing the cleaning work afterwards is that the API would be somehow in the middle of the existing functions if it has not been thought of before. I know archive_command is completely external, but that is much simpler than what would need to be done for key management, key rotation, and key verification. I'm not sure of the design of the key rotation stuff as I recall it from the patches I looked at, but the API design looks like the more pressing issue. First, the mere existence of a "master key" is a cryptographic choice which is debatable, because it creates issues if one must be able to change it, hence the whole "key rotation" stuff. ISTM that what core needs to know is that one particular key is replaced by a new one and the underlying file is rewritten. It should not need to know anything about master keys and key derivation at all. It should need to know that the user wants to change file keys, though. I will say that if the community feels external-only should be the only option, I will stop working on this feature because I feel the result would be too fragile to be reliable. I do not see why that would be the case. I'm just arguing to have key management in a separate, possibly suid something-else, process, which, given the security concerns that dictate the feature, looks like a must-have, or at least must be possible. From a line count point of view, it should be a small addition to the current code. 
and I would not want to be associated with it. You do as you want. I'm no one, you can ignore me and proceed to commit whatever you want. I'm only advising, to the best of my ability, on what I think should be the level of API pursued for such a feature in pg, at the design level. I feel that the current proposal is built around one particular use case with many implicit and unnecessary assumptions, without giving much thought to the generic API which should really be pg's concern, and I do not think that it can really be corrected once the ship has sailed, hence my attempt at having some thought given to the matter before that. IMHO, the *minimum* should be to have the API clearly designed, and the implementation compatible with it at some level, including not having assumptions about underlying cryptographic functions and key derivations (I mean at the API level, the code itself can do it obviously). If you want a special "key_mgt_command = internal" because you feel that processes are fragile and unreliable and do not bring security, so be it, but I think that the API should be designed beforehand nevertheless. Note that if people think that I'm wrong, then I'm wrong by definition. -- Fabien.
Re: Proposed patch for key managment
Hello Stephen, The implementations should not have to be in any particular language: Shell, Perl, Python, C should be possible. I disagree that it makes any sense to pass actual encryption out to a shell script. Yes, sure. I'm talking about key management. For actual crypto functions, ISTM that they should be "replaceable", i.e. if some institution does not believe in AES then they could switch to something else. After giving it more thought during the day, I think that only one command and a basic protocol is needed. Maybe something as simple as /path/to/command --options arguments… This is what's proposed - a command is run to acquire the KEK (as a hex-encoded set of bytes, making it possible to use a shell script, which is what the patch does too). Yep, but that is not what I'm trying to advocate. The "KEK" (if any) would stay in the process, not be given back to the database or command using it. Then the database/tool would interact with the command to get the actual per-file keys when needed. [...] The API to fetch the KEK doesn't care at all about where it's stored or how it's derived or anything like that. There's a relatively small change which could be made to have PG request all of the keys that it'll need on startup, if we want to go there as has been suggested elsewhere, but even if we do that, PG needs to be able to do that itself too, otherwise it's not a complete capability and there seems little point in doing something that's just a pass-thru to something else and isn't able to really be used. I do not understand. Postgres should provide a working implementation; whether the functions are directly inside pg or reached through an external process, they need to be there anyway. I'm trying to fix a clean, possibly external, API so that it is possible to have something different from the choices made in the patch. -- Fabien.
Re: Proposed patch for key managment
Hello, The API to fetch the KEK doesn't care at all about where it's stored or how it's derived or anything like that. There's a relatively small change which could be made to have PG request all of the keys that it'll need on startup, if we want to go there as has been suggested elsewhere, but even if we do that, PG needs to be able to do that itself too, otherwise it's not a complete capability and there seems little point in doing something that's just a pass-thru to something else and isn't able to really be used. Right now, the script returns a cluster key (KEK), and during initdb the server generates data encryption keys and wraps them with the KEK. During server start, the server validates the KEK and decrypts the data keys. pg_alterckey allows changing the KEK. I think Fabien is saying this all should _only_ be done using external tools --- that's what I don't agree with. Yep. I could compromise on "could be done using an external tool", but that requires designing the API and thinking about where and how things are done before everything is hardwired. Designing afterwards is too late. ISTM that the current patch does not separate API design and cryptographic design, so both are deeply entwined, and would be difficult to disentangle. -- Fabien.
Re: pgbench: option delaying queries till connections establishment?
It looks like macOS doesn't have pthread barriers (via cfbot 2021, now with more operating systems): Indeed:-( I'll look into that. -- Fabien.
Re: Proposed patch for key management
Hello Stephen, I'm unsure about what the "common use case" is. ISTM that having the postgres process hold the master key looks like a "no" for me in any use case with an actual security concern: as the database must be remotely accessible, a reasonable security assumption is that a pg account could be compromised, and the "master key" (if any, that is just one particular cryptographic design) should not be accessible in that case. The first barrier would be the pg admin account. With an external approach, the key is held by another id, so the host must be compromised as well, and it could even be managed remotely on another host (I agree that this latter option would be fragile because of the implied network connection, but it would be a tradeoff depending on the security concerns; it pretty much corresponds to the kerberos model). No, this isn't anything like the Kerberos model and there's no case where the PG account won't have access to the DEK (data encryption key) during normal operation (except with the possibility of off-loading to a hardware crypto device which, while interesting, is definitely something that's of very limited utility and which could be added on as a capability later anyway). This is also well understood by those who are interested in this capability and it's perfectly acceptable that PG will have, in memory, the DEK. I'm sorry that I'm not very clear. My ranting is about having a KEK "master key" in pg memory. I think I'm fine at this point with having DEKs available on the host to encrypt/decrypt files. What I meant about kerberos is that the setup I was describing was making the database dependent on a remote host, which is somewhat what kerberos does. The keys which are stored on disk with the proposed patch are encrypted using a KEK which is external to PG - that's all part of the existing patch and design. Yep. My point is that the patch hardwires a cryptographic design with many assumptions, whereas I think it should design an API compatible with a larger class of designs, and possibly implement one with a KEK and so on; I'm fine with that. Adding a pre-defined system will not prevent future work on a completely external option. Yes and no. Having two side-by-side cluster-encryption schemes in core, the first of which could be implemented on top of the second without much effort, would look pretty weird. Moreover, the changes required in core to allow an API are the very same as the ones in this patch. The other concern I have with doing the cleaning work afterwards is that the API would be somehow in the middle of the existing functions if it has not been thought of before. This has been considered and the functions which are proposed are intentionally structured to allow for multiple implementations already, Does it? This is unclear to me from looking at the code, and the limited documentation does not point to that. I can see that the "kek provider" and the "random provider" can be changed, but the overall cryptographic design seems hardwired. so it's unclear to me what the argument here is. The argument is that professional cryptographers get designs wrong, and I do not see why you would do differently. So instead of hardwiring choices that you think are THE good ones, ISTM that pg should design a reasonably flexible API, and also provide an implementation, obviously. Further, we haven't even gotten to actual cluster encryption yet - this is all just the key management side of things, With hardwired choices about 1 KEK and 2 DEKs that are debatable, see below. 
which is absolutely a necessary and important part but arguing that it doesn't address the cluster encryption approach in core simply doesn't make any sense as that's not a part of this patch. Let's have that discussion when we actually get to the point of talking about cluster encryption. I know archive_command is completely external, but that is much simpler than what would need to be done for key management, key rotation, and key verification. I'm not sure of the design of the key rotation stuff as I recall it from the patches I looked at, but the API design looks like the more pressing issue. First, the mere existence of a "master key" is a cryptographic choice which is debatable, because it creates issues if one must be able to change it, hence the whole "key rotation" stuff. ISTM that what core needs to know is that one particular key is replaced by a new one and the underlying file is rewritten. It should not need to know anything about master keys and key derivation at all. It should need to know that the user wants to change file keys, though. No, it's not debatable that a KEK is needed, I disagree entirely. ISTM that there is a cryptographic design which suits your needs and you seem to have decided that it is the only possible way to do it. It seems that you know. As a researcher, I know that I do not know:-) As a software engineer, I'm trying to suggest enabling more with the
Re: pgbench: option delaying queries till connections establishment?
It looks like macOS doesn't have pthread barriers (via cfbot 2021, now with more operating systems): Indeed:-( I'll look into that. Just for fun, the attached 0002 patch is a quick prototype of a replacement for that stuff that seems to work OK on a Mac here. (I'm not sure if the Windows part makes sense or works.) Thanks! That will definitely help because I do not have a Mac. I'll do some cleanup. -- Fabien.
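For reference, a portable replacement can be built from a mutex and a condition variable with a generation counter; this is a minimal sketch along those lines, not Thomas's 0002 patch:

    #include <pthread.h>

    typedef struct
    {
        pthread_mutex_t mutex;
        pthread_cond_t  cond;
        int             count;   /* threads still expected this round */
        int             round;   /* generation, guards against spurious wakeups */
        int             size;
    } barrier_t;

    static void
    barrier_init(barrier_t *b, int n)
    {
        pthread_mutex_init(&b->mutex, NULL);
        pthread_cond_init(&b->cond, NULL);
        b->count = b->size = n;
        b->round = 0;
    }

    static void
    barrier_wait(barrier_t *b)
    {
        pthread_mutex_lock(&b->mutex);
        if (--b->count == 0)
        {
            /* last arrival: open a new round and release everyone */
            b->round++;
            b->count = b->size;
            pthread_cond_broadcast(&b->cond);
        }
        else
        {
            int round = b->round;

            while (round == b->round)
                pthread_cond_wait(&b->cond, &b->mutex);
        }
        pthread_mutex_unlock(&b->mutex);
    }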
Re: PG vs LLVM 12 on seawasp, next round
Hello Thomas, Thanks for looking at this, I'm currently far behind on many things and not very responsive:-( Here is the creation of llvmjit.so: g++ ... -o llvmjit.so ... -L/home/fabien/clgtk/lib ... -lLLVMOrcJIT ... That'd be from llvm-config --ldflags or similar, from this binary: checking for llvm-config... (cached) /home/fabien/clgtk/bin/llvm-config So what does ls -slap /home/fabien/clgtk/lib show? Plenty of files and links, e.g.: 0 lrwxrwxrwx 1 fabien fabien 21 janv. 23 09:40 /home/fabien/clgtk/lib/libLLVMMCJIT.so -> libLLVMMCJIT.so.12git 2140 -rw-r--r-- 1 fabien fabien 2190824 janv. 23 09:28 /home/fabien/clgtk/lib/libLLVMMCJIT.so.12git 0 lrwxrwxrwx 1 fabien fabien 22 janv. 23 09:40 /home/fabien/clgtk/lib/libLLVMOrcJIT.so -> libLLVMOrcJIT.so.12git 40224 -rw-r--r-- 1 fabien fabien 41187208 janv. 23 09:34 /home/fabien/clgtk/lib/libLLVMOrcJIT.so.12git Hmmm, clang recompilation has been failing for some weeks. Argh, this is due to the recent "master" to "main" branch renaming. What does ldd /home/fabien/pg/build-farm-11/buildroot/HEAD/pgsql.build/tmp_install/home/fabien/pg/build-farm-11/buildroot/HEAD/inst/lib/postgresql/llvmjit.so No such file or directory:-) because the directory is cleaned up. show? What do you have in seawasp's LD_LIBRARY_PATH, and is the problem that you need to add /home/fabien/clgtk/lib to it? Argh. Would it be so stupid? :-( I thought the configuration stuff would manage the link path automatically, which may be quite naïve, indeed. PS Could you try blowing away the accache directory so we can rule out bad cached configure stuff? Hmmm. I've tried that before. I can do it again. I've added an explicit LD_LIBRARY_PATH, which will be triggered at some point later. -- Fabien Coelho - CRI, MINES ParisTech
Re: PG vs LLVM 12 on seawasp, next round
I've added an explicit LD_LIBRARY_PATH, which will be triggered at some point later. This seems to have fixed the issue. I'm sorry for the noise and quite baffled anyway, because according to my change logs it does not seem that I modified anything from my side about the dynamic library path when compiling with clang. At least I do not see a trace of that. You can also do something like LDFLAGS="$LDFLAGS -Wl,-rpath,$(llvm-config --libdir)" or such. I've resorted to just hardcoding LD_LIBRARY_PATH alongside PATH when compiling with clang in my buildfarm script. Thanks for the tip anyway. And thanks Thomas for pointing out the fix! -- Fabien.
Re: Using COPY FREEZE in pgbench
Hello Tatsuo-san, Currently pgbench uses plain COPY to populate the pgbench_accounts table. With the FREEZE option added to COPY, the time to perform "pgbench -i" will be significantly reduced. Current master: pgbench -i -s 100 done in 70.78 s (drop tables 0.21 s, create tables 0.02 s, client-side generate 12.42 s, vacuum 51.11 s, primary keys 7.02 s). Using FREEZE: done in 16.86 s (drop tables 0.20 s, create tables 0.01 s, client-side generate 11.86 s, vacuum 0.25 s, primary keys 4.53 s). That looks good! As COPY FREEZE was introduced in 9.3, this means that loading data would break with previous versions. Pgbench attempts to stay compatible with older versions. I'm wondering whether we should not care or if we should attempt some compatibility layer. It seems enough to test "PQserverVersion() >= 90300"? -- Fabien.
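The compatibility test could be as simple as the following sketch, which picks the COPY statement based on the server version reported by libpq (the exact statement text and function name are illustrative):

    #include "libpq-fe.h"

    /* FREEZE was added to COPY in 9.3, i.e. version number 90300 */
    static const char *
    account_copy_statement(PGconn *con)
    {
        if (PQserverVersion(con) >= 90300)
            return "copy pgbench_accounts from stdin with (freeze on)";
        return "copy pgbench_accounts from stdin";
    }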
Re: pgbench - add pseudo-random permutation function
What are your current thoughts? Thanks for prodding. I still think it's a useful feature. However I don't think I'll have time to get it done in the current commitfest. I suggest letting it sit in the commitfest to see if somebody else will pick it up -- and if not, we move it to the next one, with apologies to author and reviewers. I may have time to become familiar or at least semi-comfortable with all that weird math in it by then. Yep. Generating a parametric good-quality low-cost (but not cryptographically-secure) pseudo-random permutation on arbitrary sizes (not just power-of-two sizes) is not a trivial task, I had to be quite creative to achieve it, hence the "weird" maths. I had a lot of bad not-really-working ideas before the current status of the patch. The code could be simplified if we assume that PG_INT128_TYPE will be available on all relevant architectures, and accept that the feature is not available otherwise. -- Fabien.
Re: pgbench - add pseudo-random permutation function
Hello Dean, The implementation looks plausible too, though it adds quite a large amount of new code. A significant part of this new code is the multiply-modulo implementation, which can be dropped if we assume that the target has int128 available, and accept that the feature is not available otherwise. Also, there are quite a lot of comments which add to the code length. The main thing that concerns me is justifying the code. With this kind of thing, it's all too easy to overlook corner cases and end up with trivial sequences in certain special cases. I'd feel better about that if we were implementing a standard algorithm with known pedigree. Yep. I did not find anything convincing matching the requirements: generate a permutation, parametric, low constant cost, good quality, working on arbitrary sizes… Thinking about the use case for this, it seems that it's basically designed to turn a set of non-uniform random numbers (produced by random_exponential() et al.) into another set of non-uniform random numbers, where the non-uniformity is scattered so that the more/less common values aren't all clumped together. Yes. I'm wondering if that's something that can't be done more simply by passing the non-uniform random numbers through the uniform random number generator to scatter them uniformly across some range -- e.g., given an integer n, return the n'th value from the sequence produced by random(), starting from some initial seed -- i.e., implement nth_random(lb, ub, seed, n). That would actually be pretty straightforward to implement using O(log(n)) time to execute (see the attached python example), though it wouldn't generate a permutation, so it'd need a bit of thought to see if it met the requirements. Indeed, this violates two requirements: constant cost & permutation. -- Fabien.
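For the record, the O(log(n)) jump-ahead alluded to above can be done by composing affine maps over the bits of n. The following is a rough C rendering of the idea, not Dean's attached python example; it uses Knuth's MMIX LCG constants, and the final modulo mapping is slightly biased, which a real implementation would have to address:

    #include <stdint.h>

    static uint64_t
    nth_random(int64_t lb, int64_t ub, uint64_t seed, uint64_t n)
    {
        uint64_t a = 6364136223846793005ULL;  /* MMIX multiplier */
        uint64_t c = 1442695040888963407ULL;  /* MMIX increment */
        uint64_t am = 1, cm = 0;              /* accumulated map: x -> am*x + cm */

        /* all arithmetic is implicitly modulo 2^64 */
        for (; n > 0; n >>= 1)
        {
            if (n & 1)
            {
                cm = a * cm + c;   /* compose the current power into the map */
                am = a * am;
            }
            c = (a + 1) * c;       /* square the step: f(f(x)) = a²x + (a+1)c */
            a = a * a;
        }
        return lb + (am * seed + cm) % (uint64_t) (ub - lb + 1);
    }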
Re: pgbench: option delaying queries till connections establishment?
Hello Thomas, David Rowley kindly tested this for me on Windows and told me how to fix one of the macros that had incorrect error checking on that OS. So here's a new version. I'm planning to commit 0001 and 0002 soon, if there are no objections. 0003 needs some more review. I made a few mostly cosmetic changes, pgindented and pushed all these patches. Thanks a lot for pushing all that, and for fixing the pretty unexpected and strange failures raised by buildfarm animals… I must say that I'm not a big fan of the macro-based all-in-capitals API for threads because it exposes some platform-specific ugliness (e.g. THREAD_FUNC_CC) and it does not look much like clean C code when used. I liked the previous partial pthread implementation better, even if it was not the real thing, obviously. ISTM that with the current approach threads are always used on Windows, i.e. pgbench does not comply with the "ENABLE_THREAD_SAFETY" configuration on that platform. Not sure whether this is an issue that needs to be addressed, though. -- Fabien.
Re: pgbench - add pseudo-random permutation function
Hello Alvaro, That doesn't sound like a bad option to me, if it makes this much simpler. The main modern system without it seems to be MSVC. The Linux, BSD, Apple, illumos, AIX systems using Clang/GCC with Intel/AMD/ARM/PowerPC CPUs have it, and the Windows systems using open source compilers have it. Hmm. Can we go a bit further, and say that if you don't have 128-bit ints, then you can use pr_perm but only to a maximum of 32-bit ints? Then you can do the calculations in 64-bit ints. That's much less bad than desupporting the feature altogether. This looks like a good compromise, which can even be a little better because we still have 64 bits ints. As there is already a performance shortcut in the code relying on the fact that x is 24 bits and that no overflow occurs if y & m are up to 48 bits (we are using unsigned int), the simplification is just to abort if int128 is not available, because we would have called the function only if it was really needed. Attached a simplified patch which does that, removing 1/3 of the code. What do you think? Note that this creates one issue though: tests now fail if 128 bits ints are not available. I'm not sure how to detect & skip that from the tap tests. I'm not keen on removing the tests either. Will give it some thoughts if you want to proceed in that direction. -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 299d93b241..3fe06b1fe3 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -1057,7 +1057,7 @@ pgbench options d default_seed - seed used in hash functions by default + seed used in hash and pseudo-random permutation functions by default @@ -1874,6 +1874,20 @@ SELECT 4 AS four \; SELECT 5 AS five \aset + + +pr_perm ( i, size [, seed ] ) +integer + + +pseudo-random permutation in [0,size) + + +pr_perm(0, 4) +an integer between 0 and 3 + + + random ( lb, ub ) @@ -1960,6 +1974,20 @@ SELECT 4 AS four \; SELECT 5 AS five \aset shape of the distribution. + + + When designing a benchmark which selects rows non-uniformly, be aware + that using pgbench non-uniform random functions + directly leads to unduly correlations: rows with close ids come out with + similar probability, skewing performance measures because they also reside + in close pages. + + + You must use pr_perm to shuffle the selected ids, or + some other additional step with similar effect, to avoid these skewing impacts. + + + @@ -2054,24 +2082,54 @@ f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) / hash_fnv1a accept an input value and an optional seed parameter. In case the seed isn't provided the value of :default_seed is used, which is initialized randomly unless set by the command-line --D option. Hash functions can be used to scatter the -distribution of random functions such as random_zipfian or -random_exponential. For instance, the following pgbench -script simulates possible real world workload typical for social media and +-D option. + + + +Function pr_perm implements a pseudo-random permutation +on an arbitrary size. +It allows to shuffle output of non uniform random functions so that +values drawn more often are not trivially correlated. +It permutes integers in [0, size) using a seed by applying rounds of +simple invertible functions, similarly to an encryption function, +although beware that it is not at all cryptographically secure. 
+Compared to hash functions discussed above, the function +ensures that a perfect permutation is applied: there are neither collisions +nor holes in the output values. +Values outside the interval are interpreted modulo the size. +The function raises an error if size is not positive. +If no seed is provided, :default_seed is used. + +For instance, the following pgbench script +simulates possible real world workload typical for social media and blogging platforms where few accounts generate excessive load: -\set r random_zipfian(0, 1, 1.07) -\set k abs(hash(:r)) % 100 +\set size 100 +\set r random_zipfian(1, :size, 1.07) +\set k 1 + pr_perm(:r, :size) In some cases several distinct distributions are needed which don't correlate with each other and this is when implicit seed parameter comes in handy: -\set k1 abs(hash(:r, :default_seed + 123)) % 100 -\set k2 abs(hash(:r, :default_seed + 321)) % 100 +\set k1 1 + pr_perm(:r, :size, :default_seed + 123) +\set k2 1 + pr_perm(:r, :size, :default_seed + 321) + +An similar behavior can also be approximated with +hash: + + +\set size 100 +\set r random_zipfian(1, 100 * :size, 1.0
Re: pgbench: option delaying queries till connections establishment?
Hello Thomas, I must say that I'm not a big fan of the macro-based all-in-capitals API for threads because it exposes some platform specific uglyness (eg THREAD_FUNC_CC) and it does not look much like clean C code when used. I liked the previous partial pthread implementation better, even if it was not the real thing, obviously. But we were using macros already, to support --disable-thread-safety builds. Yep, but the look and feel was still C code. I just changed them to upper case and dropped the 'p', because I didn't like to pretend to do POSIX threads, but do it so badly. Hmmm. From the source code point of view it was just like actually using posix threads, even if the underlying machinery was not quite that on some systems. I value looking at "beautiful" and "standard" code if possible, even if there is some cheating involved, compared to exposing macros. I made some effort to remove the pretty ugly and inefficient INSTR_TIME macros from pgbench, replaced with straightforward arithmetic and inlined functions. Now some other macros just crept back in:-) Anyway, this is just "les goûts et les couleurs" (just a matter of taste), as we say here. Perhaps we should drop --disable-thread-safety soon, and perhaps it is nearly time to create a good thread abstraction in clean C code, for use in the server and here? Then we won't need any ugly macros. +1. ISTM that with the current approach threads are always used on Windows, i.e. pgbench does not comply to "ENABLE_THREAD_SAFETY" configuration on that platform. Not sure whether this is an issue that need to be addressed, though. The idea of that option, as I understand it, is that in ancient times there were Unix systems with no threads (that's of course the reason PostgreSQL is the way it is). I don't think that was ever the case for Windows NT, and we have no build option for that on Windows AFAICS. Ok, fine with me. -- Fabien.
Re: pgbench - add pseudo-random permutation function
Hello Alvaro, Clearly we got rid of a lot of code. About the tests, maybe the easiest thing to do is "skip ... if $Config{'osname'} eq 'MSWin32'". I tried that. One comment in pseudorandom_perm talks about "whitening" while the other talks about "scramble". I think they should both use the same term to ease understanding. Good point. You kept routine nbits() which is unneeded now. Indeed. pgbench.c gains too many includes for my taste. Can we put pseudorandom_perm() in a separate file? The refactoring I was thinking about for some time now is to separate expression evaluation entirely, not just one function, and possibly other parts as well. I suggest that this should wait for another submission. The function name pr_perm() is much too short; I think we should use a more descriptive name; maybe \prand_perm is sufficient? Though I admit the abbreviation "perm" evokes "permission" more than "permutation" to me, so maybe \prand_permutation is better? (If you're inclined to abbreviate permutation, maybe \prand_permut?) What about simply "permute"? Pgbench is unlikely to get another permute function anyway. I think the reference docs are not clear enough. What do you think of something like this? I agree that the documentation is not clear enough. The proposal would not help me to understand what it does, though. I've tried to improve the explanation based on wikipedia explanations about permutations. See attached v22. -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 299d93b241..c46272fd50 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -1057,7 +1057,7 @@ pgbench options d default_seed - seed used in hash functions by default + seed used in hash and pseudo-random permutation functions by default @@ -1842,6 +1842,22 @@ SELECT 4 AS four \; SELECT 5 AS five \aset + + +permute ( i, size [, seed ] ) +integer + + +Permuted value of i, in [0,size). +This is the new position of i in a pseudo-random rearrangement of +size first integers parameterized by seed. + + +permute(0, 4) +an integer between 0 and 3 + + + pi () @@ -1960,6 +1976,20 @@ SELECT 4 AS four \; SELECT 5 AS five \aset shape of the distribution. + + + When designing a benchmark which selects rows non-uniformly, be aware + that using pgbench non-uniform random functions + directly leads to unduly correlations: rows with close ids come out with + similar probability, skewing performance measures because they also reside + in close pages. + + + You must use pr_perm to shuffle the selected ids, or + some other additional step with similar effect, to avoid these skewing impacts. + + + @@ -2054,24 +2084,54 @@ f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) / hash_fnv1a accept an input value and an optional seed parameter. In case the seed isn't provided the value of :default_seed is used, which is initialized randomly unless set by the command-line --D option. Hash functions can be used to scatter the -distribution of random functions such as random_zipfian or -random_exponential. For instance, the following pgbench -script simulates possible real world workload typical for social media and +-D option. + + + +Function pr_perm implements a pseudo-random permutation +on an arbitrary size. +It allows to shuffle output of non uniform random functions so that +values drawn more often are not trivially correlated. 
+It permutes integers in [0, size) using a seed by applying rounds of +simple invertible functions, similarly to an encryption function, +although beware that it is not at all cryptographically secure. +Compared to hash functions discussed above, the function +ensures that a perfect permutation is applied: there are neither collisions +nor holes in the output values. +Values outside the interval are interpreted modulo the size. +The function raises an error if size is not positive. +If no seed is provided, :default_seed is used. + +For instance, the following pgbench script +simulates possible real world workload typical for social media and blogging platforms where few accounts generate excessive load: -\set r random_zipfian(0, 1, 1.07) -\set k abs(hash(:r)) % 100 +\set size 100 +\set r random_zipfian(1, :size, 1.07) +\set k 1 + pr_perm(:r, :size) In some cases several distinct distributions are needed which don't correlate with each other and this is when implicit seed parameter comes in handy: -\set k1 abs(hash(:r, :default_seed + 123)) % 100 -\set k2 abs(hash(:r, :default_seed + 321)) % 100 +\se
Re: Using COPY FREEZE in pgbench
Hello Tatsuo-san, So I think adding "freeze" to the copy statement should only happen in PostgreSQL 14 or later. Probably the test should be "PQserverVersion() >= 14" I think. Attached is the patch doing what you suggest. I have created a CommitFest entry for this. https://commitfest.postgresql.org/33/3034/ My 0.02 €: After giving it some thought, ISTM that there could also be a performance improvement with copy freeze from older versions, so I'd suggest adding it from 9.3, where the option was added, i.e. 90300. Also, I do not think it is worth failing on a 0 PQserverVersion, just keep the number and consider it a very old version? Question: should there be a word about copy using the freeze option? I'd say yes, in the section describing "g". -- Fabien.
Re: Using COPY FREEZE in pgbench
Hello, After giving it some thought, ISTM that there could also be a performance improvement with copy freeze from older versions, so I'd suggest adding it from 9.3, where the option was added, i.e. 90300. You misunderstand about COPY FREEZE. Pre-13 COPY FREEZE does not contribute a performance improvement. See discussions for more details. https://www.postgresql.org/message-id/camku%3d1w3osjj2fneelhhnrlxfzitdgp9fphee08nt2fqfmz...@mail.gmail.com https://www.postgresql.org/message-id/flat/CABOikdN-ptGv0mZntrK2Q8OtfUuAjqaYMGmkdU1dCKFtUxVLrg%40mail.gmail.com Ok. ISTM that the option should be triggered as soon as it is available, but you do as you wish. Also, I do not think it is worth failing on a 0 PQserverVersion, just keep the number and consider it a very old version? If PQserverVersion fails, then the following PQ calls are likely to fail too. What's the benefit of continuing after PQserverVersion returns 0? I'm unsure how this could ever happen. The benefit of not caring is fewer lines of code in pgbench, and a later call will catch the issue anyway if libpq is corrupted. Question: should there be a word about copy using the freeze option? I'd say yes, in the section describing "g". Can you elaborate about the "section describing "g""? I am not sure what you mean. The "g" item in the section describing initialization steps (i.e. option -I). I'd suggest just replacing "COPY" with "COPY FREEZE" in the sentence. -- Fabien.
Re: psql - add SHOW_ALL_RESULTS option
Hello Peter, This reply was two months ago, and nothing has happened, so I have marked the patch as RwF. Given the ongoing work on returning multiple result sets from stored procedures[0], I went to dust off this patch. Based on the feedback, I put back the titular SHOW_ALL_RESULTS option, but set the default to on. I fixed the test failure in 013_crash_restart.pl. I also trimmed back the test changes a bit so that the resulting test output changes are easier to see. (We could make those stylistic changes separately in another patch.) I'll put this back into the commitfest for another look. Thanks a lot for the fixes and for pushing it forward! My 0.02€: I tested this updated version and do not have any comment on it. From my point of view it could be committed. I would not bother separating the test style adjustments. -- Fabien.
Re: pgbench - add pseudo-random permutation function
See attached v22. v23: - replaces remaining occurences of "pr_perm" in the documentation - removes duplicated includes -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 299d93b241..9f49a6a78d 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -1057,7 +1057,7 @@ pgbench options d default_seed - seed used in hash functions by default + seed used in hash and pseudo-random permutation functions by default @@ -1842,6 +1842,22 @@ SELECT 4 AS four \; SELECT 5 AS five \aset + + +permute ( i, size [, seed ] ) +integer + + +Permuted value of i, in [0,size). +This is the new position of i in a pseudo-random rearrangement of +size first integers parameterized by seed. + + +permute(0, 4) +an integer between 0 and 3 + + + pi () @@ -1960,6 +1976,20 @@ SELECT 4 AS four \; SELECT 5 AS five \aset shape of the distribution. + + + When designing a benchmark which selects rows non-uniformly, be aware + that using pgbench non-uniform random functions + directly leads to unduly correlations: rows with close ids come out with + similar probability, skewing performance measures because they also reside + in close pages. + + + You must use permute to shuffle the selected ids, or + some other additional step with similar effect, to avoid these skewing impacts. + + + @@ -2054,24 +2084,54 @@ f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) / hash_fnv1a accept an input value and an optional seed parameter. In case the seed isn't provided the value of :default_seed is used, which is initialized randomly unless set by the command-line --D option. Hash functions can be used to scatter the -distribution of random functions such as random_zipfian or -random_exponential. For instance, the following pgbench -script simulates possible real world workload typical for social media and +-D option. + + + +Function permute implements a parametric pseudo-random permutation +on an arbitrary size. +It allows to shuffle output of non uniform random functions so that +values drawn more often are not trivially correlated. +It permutes integers in [0, size) using a seed by applying rounds of +simple invertible functions, similarly to an encryption function, +although beware that it is not at all cryptographically secure. +Compared to hash functions discussed above, the function +ensures that a perfect permutation is applied: there are neither collisions +nor holes in the output values. +Values outside the interval are interpreted modulo the size. +The function raises an error if size is not positive. +If no seed is provided, :default_seed is used. 
+ +For instance, the following pgbench script +simulates possible real world workload typical for social media and blogging platforms where few accounts generate excessive load: -\set r random_zipfian(0, 1, 1.07) -\set k abs(hash(:r)) % 100 +\set size 100 +\set r random_zipfian(1, :size, 1.07) +\set k 1 + permute(:r, :size) In some cases several distinct distributions are needed which don't correlate with each other and this is when implicit seed parameter comes in handy: -\set k1 abs(hash(:r, :default_seed + 123)) % 100 -\set k2 abs(hash(:r, :default_seed + 321)) % 100 +\set k1 1 + permute(:r, :size, :default_seed + 123) +\set k2 1 + permute(:r, :size, :default_seed + 321) + +An similar behavior can also be approximated with +hash: + + +\set size 100 +\set r random_zipfian(1, 100 * :size, 1.07) +\set k 1 + abs(hash(:r)) % :size + + +As this hash-modulo version generates collisions, some +id would not be reachable and others would come more often +than the target distribution. diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y index 4d529ea550..73514a0a47 100644 --- a/src/bin/pgbench/exprparse.y +++ b/src/bin/pgbench/exprparse.y @@ -19,6 +19,7 @@ #define PGBENCH_NARGS_VARIABLE (-1) #define PGBENCH_NARGS_CASE (-2) #define PGBENCH_NARGS_HASH (-3) +#define PGBENCH_NARGS_PERMUTE (-4) PgBenchExpr *expr_parse_result; @@ -370,6 +371,9 @@ static const struct { "hash_fnv1a", PGBENCH_NARGS_HASH, PGBENCH_HASH_FNV1A }, + { + "permute", PGBENCH_NARGS_PERMUTE, PGBENCH_PERMUTE + }, /* keep as last array element */ { NULL, 0, 0 @@ -482,6 +486,19 @@ make_func(yyscan_t yyscanner, int fnumber, PgBenchExprList *args) } break; + /* pseudo-random permutation function with optional seed argument */ + case PGBENCH_NARGS_PERMUTE: + if (len < 2 || len > 3) +expr_yyerror_more(yyscanner, "unexpected number of arguments"
Re: pgbench - add pseudo-random permutation function
Hello Alvaro, + /*- +* Apply 4 rounds of bijective transformations using a key updated +* at each stage: +* +* (1) whiten: partial xors on overlapping power-of-2 subsets +* for instance with v in 0 .. 14 (i.e. with size == 15): +* if v is in 0 .. 7 do v = (v ^ k) % 8 +* if v is in 7 .. 14 do v = 14 - ((14-v) ^ k) % 8 +* note that because of the overlap (here 7), v may be changed twice. +* this transformation is bijective because the condition to apply it +* is still true after applying it, and xor itself is bijective on a +* power-of-2 size. +* +* (2) scatter: linear modulo +* v = (v * p + k) % size +* this transformation is bijective if p & size are coprime, which is +* ensured in the code by the while loop which discards primes that +* divide size. +* +*/ My main question on this now is, do you have a scholarly reference for this algorithm? Nope, otherwise I would have put a reference. I'm a scholar though, if it helps:-) I could not find any algorithm that fitted the bill. The usual approach (e.g. in benchmarking designs) is to use some hash function and assume that it is not a permutation, too bad. Basically the algorithm mimics a few rounds of cryptographic encryption adapted to any size and simple operators, whereas encryption functions are restricted to power-of-two blocks (e.g. the Feistel network). The structure is the same as AES: AddRoundKey is the xor-ing stage (adapted to non-power-of-two sizes in whiten above), MixColumns does the scattering, and for key expansion I used Donald Knuth's generator. Basically one could say that permute is an inexpensive and insecure AES:-) We could add a reference to AES for the structure of the algorithm itself, but otherwise I just iterated designs till I was satisfied with the result (again: inexpensive and constant cost, any size, permutation…). -- Fabien.
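To make the quoted comment concrete, here is a loose sketch of one whiten-then-scatter round; it follows the formulas in the comment but is not the patch's exact code, and the names are illustrative (mask is the largest 2^k - 1 below size, k a round key, p a multiplier coprime with size; the v * p product is where int128 becomes useful for large sizes):

    #include <stdint.h>

    static uint64_t
    permute_round(uint64_t v, uint64_t size, uint64_t mask,
                  uint64_t k, uint64_t p)
    {
        /* whiten: xor on two overlapping power-of-2 ranges, each bijective */
        if (v <= mask)
            v = (v ^ k) & mask;
        if (v >= size - 1 - mask)
            v = (size - 1) - (((size - 1 - v) ^ k) & mask);

        /* scatter: affine map, bijective because gcd(p, size) == 1 */
        return (v * p + k) % size;
    }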
RE: pgbench: option delaying queries till connections establishment?
Hello, Please complete the fixes for the documentation. At least the following sentence should be fixed: ``` The last two lines report the number of transactions per second, figured with and without counting the time to start database sessions. ``` Indeed. I scanned the file but did not find other places that needed updating. -starting vacuum...end. I think any other options should be disabled in the example, therefore please leave this line. Yes. + for (int i = 0; i < nstate; i++) + { + state[i].state = CSTATE_CHOOSE_SCRIPT; + } I'm not sure, but I think braces should be removed per our coding conventions. Not sure either. I'm not for having too many braces anyway, so I removed them. + /* GO */ + pthread_barrier_wait(&barrier); The current implementation is too simple. If nthreads >= 2 and the connection fails in one thread, the other one will wait forever. Some special treatment is needed in the `done` code block and here. Indeed. I took your next patch with an added explanation. I'm unclear whether proceeding makes much sense though, that is, some threads would run the test while others would have aborted. Hmmm. (that is, it can be disabled) On reflection, I'm not sure I see a use case for not running the barrier if it is available. If the start point changes and there is no way to disable this feature, backward compatibility will be completely violated. It means that tps cannot be compared to older versions under the same conditions, and it may hide performance-related issues. I think it's not good. ISTM that there is another patch in the queue which needs barriers to delay some initialization so as to fix a corner case bug, in which case the behavior would be mandatory. The current submission could add an option to skip the barrier synchronization, but I'm not enthusiastic about adding an option and removing it shortly after. Moreover, the "compatibility" would be with nothing well defined, which does not make much sense. When testing with many threads and clients, the current implementation makes threads start when they are ready, which means that you can have threads issuing transactions while others are not yet connected or not even started, so that the actually measured performance is quite awkward for a short bench. ISTM that allowing such a backward-compatible strange behavior does not serve pg users. If users want the old unreliable behavior, they can use an old pgbench, and obtain unreliable figures. For these two reasons, I'm inclined not to add an option to skip these barriers, but this can be debated further. Attached 2 updated patches. -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 7180fedd65..f02721da0d 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -58,8 +58,10 @@ number of clients: 10 number of threads: 1 number of transactions per client: 1000 number of transactions actually processed: 1/1 -tps = 85.184871 (including connections establishing) -tps = 85.296346 (excluding connections establishing) +latency average = 11.013 ms +latency stddev = 7.351 ms +initial connection time = 45.758 ms +tps = 896.967014 (without initial connection establishing) The first six lines report some of the most important parameter @@ -68,8 +70,7 @@ tps = 85.296346 (excluding connections establishing) and number of transactions per client); these will be equal unless the run failed before completion. (In -T mode, only the actual number of transactions is printed.) 
- The last two lines report the number of transactions per second, - figured with and without counting the time to start database sessions. + The last line reports the number of transactions per second. @@ -2234,22 +2235,22 @@ number of clients: 10 number of threads: 1 number of transactions per client: 1000 number of transactions actually processed: 1/1 -latency average = 15.844 ms -latency stddev = 2.715 ms -tps = 618.764555 (including connections establishing) -tps = 622.977698 (excluding connections establishing) +latency average = 10.870 ms +latency stddev = 7.341 ms +initial connection time = 30.954 ms +tps = 907.949122 (without initial connection establishing) statement latencies in milliseconds: -0.002 \set aid random(1, 10 * :scale) -0.005 \set bid random(1, 1 * :scale) -0.002 \set tid random(1, 10 * :scale) -0.001 \set delta random(-5000, 5000) -0.326 BEGIN; -0.603 UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid; -0.454 SELECT abalance FROM pgbench_accounts WHERE aid = :aid; -5.528 UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid; -7.335 UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid; -0.371 INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bi
Re: pgbench stopped supporting large number of client connections on Windows
Hello Marina, While trying to test a patch that adds a synchronization barrier in pgbench [1] on Windows, Thanks for trying that, I do not have a windows setup for testing, and the sync code I wrote for Windows is basically blind coding:-( I found that since the commit "Use ppoll(2), if available, to wait for input in pgbench." [2] I cannot use a large number of client connections in pgbench on my Windows virtual machines (Windows Server 2008 R2 and Windows 2019), for example: bin\pgbench.exe -c 90 -S -T 3 postgres starting vacuum...end. ISTM that 1 thread with 90 clients is a bad idea, see below. Almost the same thing happens with reindexdb and vacuumdb (built on commit [3]): The Windows fd implementation is somewhat buggy because it does not return the smallest number available, while the code assumes that select uses a dense array indexed by fd (true on linux, less so on Windows, which probably uses a sparse array), so the fd value can get over the limit even if fewer fds are actually in use; hence the catch, as you noted. Another point is that windows has a hardcoded number of objects one thread can really wait for, typically 64, so that waiting for more requires actually forking threads to do the waiting. But if you are ready to fork threads just to wait, then probably you could have started pgbench with more threads in the first place. Now it would probably not make the problem go away because fd numbers would be per process, not per thread, but it really suggests that one should not load a thread with more than 64 clients. IIUC the checks below are not correct on Windows, since on this system sockets can have values equal to or greater than FD_SETSIZE (see Windows documentation [4] and pgbench debug output in attached pgbench_debug.txt). Okay. But then, how may one detect that there are too many fds in the set? I think that an earlier version of the code needed to make assumptions about the internal implementation of windows (there is a counter somewhere in the windows fd_set struct), which was rejected because it was breaking the interface. Now your patch is basically resurrecting that. Why not, if there is no other solution, but this is quite depressing, and because it breaks the interface it would be broken if windows changed its internals for some reason:-( Doesn't windows have "ppoll"? Should we implement the stuff above windows polling capabilities and coldly skip its failed posix portability attempts? This raises again the issue that you should not have more than 64 clients per thread anyway, because it is an intrinsic limit on windows. I think that at one point it was suggested to error or warn if nclients/nthreads is too great, but that was not kept in the end. I tried to fix this, see attached fix_max_client_conn_on_Windows.patch (based on commit [3]). I checked it for reindexdb and vacuumdb, and it works for simple databases (1025 jobs are not allowed and 1024 jobs is ok). Unfortunately, pgbench was getting connection errors when it tried to use 1000 jobs on my virtual machines, although there were no errors for fewer jobs (500) and the same number of clients (1000)... It seems that the max number of threads you can start depends on available memory, because each thread is given its own stack, so it would depend on your vm settings? Any suggestions are welcome! Use ppoll, and start more threads but not too many? -- Fabien.
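To spell out the platform difference: on POSIX, fd_set is a bitmap indexed by the fd value, so the value itself must be below FD_SETSIZE; on Winsock, fd_set is an array of up to FD_SETSIZE arbitrary SOCKET values, so only the count matters. A correct check would thus look something like this sketch:

    #ifndef WIN32
    #include <sys/select.h>
    #else
    #include <winsock2.h>
    #endif
    #include <stdbool.h>

    static bool
    socket_usable_with_select(int sock, int num_already_in_set)
    {
    #ifndef WIN32
        /* the fd value indexes a bitmap, so it must be small enough */
        return sock >= 0 && sock < FD_SETSIZE;
    #else
        /* socket values are arbitrary handles; only the count is limited */
        return num_already_in_set < FD_SETSIZE;
    #endif
    }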
RE: pgbench: option delaying queries till connections establishment?
Hello, Indeed. I took your next patch with an added explanation. I'm unclear whether proceeding makes much sense though, that is, some threads would run the test while others would have aborted. Hmmm. Your comment looks good, thanks. In the previous version pgbench starts benchmarking even if some connections fail, and users can notice the connection failure from the stderr output. Hence the current specification may be enough. Usually I run many pgbench instances through scripts, so I'm probably not there to check a lone stderr failure at the beginning if performance figures are actually reported. If you agree, please remove the following lines: ``` +* It is unclear whether it is worth doing anything rather than +* coldly exiting with an error message. ``` I can remove the line, but I strongly believe that reporting performance figures when some client connection failed, so that the bench could not run as prescribed, is bad behavior. The good news is that it is probably quite unlikely. So I'd prefer to keep it and possibly submit a patch to change the behavior. ISTM that there is another patch in the queue which needs barriers to delay some initialization so as to fix a corner case bug, in which case the behavior would be mandatory. The current submission could add an option to skip the barrier synchronization, but I'm not enthusiastic about adding an option and removing it shortly after. Could you tell me which patch you mention? Basically I agree with what you say, but I want to check it. Should be this one: https://commitfest.postgresql.org/30/2624/ -- Fabien.
Re: pgbench stopped supporting large number of client connections on Windows
Hello Tom, Use ppoll, and start more threads but not too many? Does ppoll exist on Windows? Some g*gling suggests that the answer is no. There was a prior thread on this topic, which seems to have drifted off into the sunset: Indeed. I may have contributed to this dwindling by not adding a CF entry for this thread, so that there was no reminder anywhere. https://www.postgresql.org/message-id/flat/BL0PR1901MB1985F219C46C61EDE7036C34ED8E0%40BL0PR1901MB1985.namprd19.prod.outlook.com It seems that there is no simple good solution around windows wait event implementations: - the timeout seems to be in milliseconds in everything I saw - 64 is an intrinsic limit, probably because of the underlying n² implementations Maybe we could provide a specific windows implementation limited to 64 fds (which is not a bad thing bench-wise) but with a rounded-down timeout, so that it could end up in an active spinning wait in some cases, which is probably not a big issue, all things considered. -- Fabien.
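A sketch of what such a Windows-specific wait could look like, under the usual WaitForMultipleObjects limits (at most MAXIMUM_WAIT_OBJECTS, i.e. 64, handles, and a millisecond timeout, hence the rounding); the wrapper name is invented:

    #include <windows.h>
    #include <stdint.h>

    static DWORD
    wait_on_handles(HANDLE *handles, DWORD nhandles, int64_t timeout_us)
    {
        if (nhandles > MAXIMUM_WAIT_OBJECTS)
            return WAIT_FAILED;       /* caller should use more threads */
        /* rounding the timeout down may degrade into a short spin */
        return WaitForMultipleObjects(nhandles, handles, FALSE,
                                      (DWORD) (timeout_us / 1000));
    }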
Re: Useless string ouput in error message
I think I found a typo in the output of an error message which may cause a build warning. Please refer to the attachment for details. Indeed. Thanks for the fix! -- Fabien.
RE: pgbench: option delaying queries till connections establishment?
Hello, I can remove the line, but I strongly believe that reporting performance figures when some client connection failed, so that the bench could not run as prescribed, is bad behavior. The good news is that it is probably quite unlikely. So I'd prefer to keep it and possibly submit a patch to change the behavior. I agree such a situation is very bad, and I understood you have a plan to submit patches to fix it. If so, leaving the lines as a TODO is OK. Thanks. Should be this one: https://commitfest.postgresql.org/30/2624/ This discussion is still ongoing, but I can see that the starting time may be delayed by looking up all pgbench variables. Yep, that's it. (I think the status of this thread might be wrong. It should be 'Needs review', but it is now 'Waiting on Author.') I changed it to "Needs review". This patch is mostly good and can change review status soon; however, I think it should wait for that related patch. Hmmm. Please discuss how to fix it with Tom. I would not have the presumption to pressure Tom's agenda in any way! and this will be committed soon. And this will wait till its time comes. In the meantime, I think that you should put the patch status as you see fit, independently of the other patch: it seems unlikely that they would be committed together, and I'll have to merge the remaining one anyway. -- Fabien.
Re: pgbench: option delaying queries till connections establishment?
Hello Marina, 1) It looks like pgbench will no longer support Windows XP due to the function DeleteSynchronizationBarrier. From https://docs.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-deletesynchronizationbarrier Minimum supported client: Windows 8 [desktop apps only] Minimum supported server: Windows Server 2012 [desktop apps only] Thanks for the test and precise analysis! Sigh. I do not think that putting such version requirements are worth it just for the sake of pgbench. In the attached version, I just comment out the call and add an explanation about why it is commented out. If pg overall version requirements are changed on windows, then it could be reinstated. -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 7180fedd65..f02721da0d 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -58,8 +58,10 @@ number of clients: 10 number of threads: 1 number of transactions per client: 1000 number of transactions actually processed: 1/1 -tps = 85.184871 (including connections establishing) -tps = 85.296346 (excluding connections establishing) +latency average = 11.013 ms +latency stddev = 7.351 ms +initial connection time = 45.758 ms +tps = 896.967014 (without initial connection establishing) The first six lines report some of the most important parameter @@ -68,8 +70,7 @@ tps = 85.296346 (excluding connections establishing) and number of transactions per client); these will be equal unless the run failed before completion. (In -T mode, only the actual number of transactions is printed.) - The last two lines report the number of transactions per second, - figured with and without counting the time to start database sessions. + The last line reports the number of transactions per second. 
@@ -2234,22 +2235,22 @@ number of clients: 10 number of threads: 1 number of transactions per client: 1000 number of transactions actually processed: 1/1 -latency average = 15.844 ms -latency stddev = 2.715 ms -tps = 618.764555 (including connections establishing) -tps = 622.977698 (excluding connections establishing) +latency average = 10.870 ms +latency stddev = 7.341 ms +initial connection time = 30.954 ms +tps = 907.949122 (without initial connection establishing) statement latencies in milliseconds: -0.002 \set aid random(1, 10 * :scale) -0.005 \set bid random(1, 1 * :scale) -0.002 \set tid random(1, 10 * :scale) -0.001 \set delta random(-5000, 5000) -0.326 BEGIN; -0.603 UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid; -0.454 SELECT abalance FROM pgbench_accounts WHERE aid = :aid; -5.528 UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid; -7.335 UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid; -0.371 INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP); -1.212 END; +0.001 \set aid random(1, 10 * :scale) +0.001 \set bid random(1, 1 * :scale) +0.001 \set tid random(1, 10 * :scale) +0.000 \set delta random(-5000, 5000) +0.046 BEGIN; +0.151 UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid; +0.107 SELECT abalance FROM pgbench_accounts WHERE aid = :aid; +4.241 UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid; +5.245 UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid; +0.102 INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP); +0.974 END; diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index 3057665bbe..b8a3e28690 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -292,9 +292,9 @@ typedef struct SimpleStats */ typedef struct StatsData { - time_t start_time; /* interval start time, for aggregates */ - int64 cnt; /* number of transactions, including skipped */ - int64 skipped; /* number of transactions skipped under --rate + pg_time_usec_t start_time; /* interval start time, for aggregates */ + int64 cnt; /* number of transactions, including skipped */ + int64 skipped; /* number of transactions skipped under --rate * and --latency-limit */ SimpleStats latency; SimpleStats lag; @@ -417,11 +417,11 @@ typedef struct int nvariables; /* number of variables */ bool vars_sorted; /* are variables sorted by name? */ - /* various times about current transaction */ - int64 txn_scheduled; /* scheduled start time of transaction (usec) */ - int64 sleep_until; /* scheduled start time of next cmd (usec) */ - instr_time txn_begin; /* used for measuring schedule lag times */ - instr_time stmt_begin; /* used for measuring statement latencies */ + /* various times about current transaction i
Re: pgbench: option delaying queries till connections establishment?
I think the issue really is that, independent of PG lock contention, it'll take a while to establish all connections, and that starting to benchmark with only some connections established will create pretty pointless numbers. Yes. This is why I think that if we have some way to synchronize, it should always be used, i.e. not be an option. -- Fabien.
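To make the synchronization concrete, a minimal sketch (assuming POSIX barriers; this is not the actual patch, and names are hypothetical): every benchmark thread first opens its connections, then blocks on a shared barrier, so the measured run only starts once all connections are established.

    #include <pthread.h>

    static pthread_barrier_t start_barrier;     /* pthread_barrier_init()'d
                                                 * once with nthreads */

    static void *
    thread_main(void *arg)
    {
        /* ... establish this thread's client connections ... */

        pthread_barrier_wait(&start_barrier);   /* everyone is connected */

        /* ... run and time transactions from here on ... */
        return NULL;
    }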
Re: parsing pg_ident.conf
Hello Andrew, I noticed somewhat to my surprise as I was prepping the tests for the "match the whole DN" patch that pg_ident.conf is parsed using the same routines used for pg_hba.conf, and as a result the DN almost always needs to be quoted, because they almost all contain a comma e.g. "O=PGDG,OU=Testing". Even if we didn't break on commas we would probably need to quote most of them, because it's very common to include spaces e.g. "O=Acme Corp,OU=Marketing". Nevertheless it seems rather odd to break on commas, since nothing in the file is meant to be a list - this is a file with exactly three single-valued fields. Not sure if it's worth doing anything about this, or we should just live with it the way it is. My 0.02 €: ISTM that having to quote long strings which may contain spaces or other separators is a good thing from a readability point of view, even if it would be possible to parse them without the quotes. So I'm in favor of keeping it that way. Note that since 8f8154a503, continuations are allowed in "pg_hba.conf" and "pg_ident.conf", and that a continuation may occur within a quoted string, which may be of interest when considering LDAP links. -- Fabien.
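For instance, a hypothetical map line (the map name, DN, and role are invented here) shows why quoting matters: without the quotes, the commas and the space in the DN would split the second field.

    # MAPNAME   SYSTEM-USERNAME (the client DN, quoted)   PG-USERNAME
    dnmap       "CN=app,O=Acme Corp,OU=Marketing"          appuser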
Re: parsing pg_ident.conf
I noticed somewhat to my surprise as I was prepping the tests for the "match the whole DN" patch that pg_ident.conf is parsed using the same routines used for pg_hba.conf, and as a result the DN almost always needs to be quoted, because they almost all contain a comma e.g. "O=PGDG,OU=Testing". Even if we didn't break on commas we would probably need to quote most of them, because it's very common to include spaces. Maybe we should add a comment at the top of the file about when quoting is needed. Yes, that is a good place to point that out. Possibly it would also be worth adding something in section 20.2, including an example? -- Fabien.
Re: Add table access method as an option to pgbench
Hello David, The previous patch was based on branch "REL_13_STABLE". Now, the attached new patch v2 is based on the master branch. I followed the new code structure using appendPQExpBuffer to append the new clause "using TABLEAM" in a proper position and tested. In the meantime, I also updated the pgbench.sgml file to reflect the changes. My 0.02€: I'm fine with the added feature. The patch lacks minimal test coverage. Consider adding something to the pgbench TAP tests, including failures (i.e. a non-existing method). The option in the help string is not at the right alphabetical place. I would merge the tableam declaration into the previous one with an extended comment, e.g. "tablespace and access method selection". escape_tableam -> escaped_tableam ? -- Fabien.
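A rough sketch of the sort of code being discussed (a helper with hypothetical names, not the patch itself): escape the user-supplied access method as an identifier and append the clause, mirroring how the existing tablespace option is handled.

    #include <string.h>
    #include "libpq-fe.h"
    #include "pqexpbuffer.h"

    /* append " using <am>" to a CREATE TABLE statement being built */
    static void
    append_tableam_clause(PGconn *con, PQExpBuffer query, const char *tableam)
    {
        if (tableam != NULL)
        {
            char *escaped_tableam =
                PQescapeIdentifier(con, tableam, strlen(tableam));

            appendPQExpBuffer(query, " using %s", escaped_tableam);
            PQfreemem(escaped_tableam);
        }
    }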
Re: pgbench and timestamps (bounced)
CFM reminder. Hi, this entry is "Waiting on Author" and the thread was inactive for a while. I see this discussion still has some open questions. Are you going to continue working on it, or should I mark it as "returned with feedback" until a better time? IMHO the proposed fix is reasonable and addresses the issue. I have responded to Tom's remarks about it, and it is waiting for his answer to my counter arguments. So ISTM that the patch is currently still waiting for some feedback. -- Fabien.
Re: Add table access method as an option to pgbench
Hello David, Some feedback about v3: In the doc I find TABLEACCESSMETHOD quite hard to read. Use TABLEAM instead? Also, the next entry uses lowercase (tablespace), why change the style? Remove the space after the comma in the help string. I'd use the more readable TABLEAM in the help string rather than the mouthful. It looks like the option is *silently* ignored when creating a partitioned table (where specifying it would currently result in an error), and not passed to the partitions, which would accept it. This is pretty weird. I'd suggest that the table am option should rather change the default instead, so that it would apply implicitly to everything relevant when appropriate. About tests: they should also trigger failures, e.g. "--table-access-method=no-such-table-am", to be added to the @errors list. I do not understand why you duplicated every possible option entry. A test with just the table access method looks redundant if the feature is made to work orthogonally to partitions, which should be the case. By the way, I saw the same style for other variables, such as escape_tablespace, should this be fixed as well? Nope, let it be. -- Fabien.
Re: pgbench - test whether a variable exists
The new status of this patch is: Waiting on Author This patch was inactive during the commitfest, so I am going to mark it as "Returned with Feedback". Fabien, are you planning to continue working on it? Not in the short term, but probably for the next CF. Can you park it there? -- Fabien.
Re: pgbench - test whether a variable exists
Sure, I'll move it to the next CF then. I also noticed that the first message mentions the idea of refactoring so that some code is shared between pgbench and psql. Can you please share a link to the thread, if it exists? Unsure. I'll try to find it if it exists. -- Fabien.
Re: pgbench - test whether a variable exists
Bonjour Michaël, On Mon, Nov 30, 2020 at 12:53:20PM +0100, Fabien COELHO wrote: Not in the short term, but probably for the next CF. Can you park it there? IMO, I think that it would be better to leave this item marked as RwF for now if you are not sure, and register it again in the CF once there is an actual new patch. This approach makes for less bloat in the CF app. Yep. Although I was underwater this Fall, I should have more time available in January, so I think I'll do something about it. If it is RwF by then, I'll resubmit it. -- Fabien.
Re: pgbench and timestamps (bounced)
Hello Tom, Hi, this entry is "Waiting on Author" and the thread was inactive for a while. I see this discussion still has some open questions. Are you going to continue working on it, or should I mark it as "returned with feedback" until a better time? IMHO the proposed fix is reasonable and addresses the issue. I have responded to Tom's remarks about it, and it is waiting for his answer to my counter-arguments. So ISTM that the patch is currently still waiting for some feedback. It looks like my unhappiness with injecting a pthread dependency into pgbench is going to be overtaken by events in the "option delaying queries" thread [1]. However, by the same token there are some conflicts between the two patchsets, and also I prefer the other thread's approach to portability (i.e. do it honestly, not with a private portability layer in pgbench.c). So I'm inclined to put the parts of this patch that require pthreads on hold till that lands. Ok. This is fair enough. Portability is a pain thanks to the Windows vs MacOS vs Linux approaches to implementing (or not) a standard. What remains that we could do now, and perhaps back-patch, is point (2), i.e. disallowing digits as the first character of a pgbench variable name. I'm fine with that. That would be enough to "solve" the original bug report, and it does seem like it could be back-patched, while we're certainly not going to risk back-patching anything as portability-fraught as introducing mutexes. Sure. I'm unable to do much pg work at the moment, but this should ease quite soon. [1] https://www.postgresql.org/message-id/flat/20200227180100.zyvjwzcpiokfsqm2%40alap3.anarazel.de -- Fabien Coelho - CRI, MINES ParisTech
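A minimal sketch of the rule in point (2), as an illustration under stated assumptions rather than the committed code: reject variable names whose first character is a digit, so that a number embedded in a timestamp cannot be mistaken for a :variable reference.

    #include <ctype.h>
    #include <stdbool.h>

    static bool
    valid_variable_name(const char *name)
    {
        const unsigned char *ptr = (const unsigned char *) name;

        if (*ptr == '\0' || isdigit(*ptr))
            return false;           /* empty or starts with a digit */
        for (; *ptr; ptr++)
            if (!isalnum(*ptr) && *ptr != '_')
                return false;
        return true;
    }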
Re: PG vs LLVM 12 on seawasp, next round
Hello Alvaro, The "no such file" error seems more like a machine-local issue to me. I'll look into it when I have time, which may take some time. Hopefully over the holidays. This is still happening ... Any chance you can have a look at it? Indeed. I'll try to look (again) into it soon. I had a look but did not find anything obvious in the short time frame I had. The last two months were a little overloaded for me, so I let quite a few things slip. If you want to disable the animal as Tom suggests, do as you want. -- Fabien.
Re: pgbench: option delaying queries till connections establishment?
Hello Thomas, 3 . Decide if it's sane for the Windows-based emulation to be in here too, or if it should stay in pgbench.c. Or alternatively, if we're emulating pthread stuff on Windows, why not also put the other pthread emulation stuff from pgbench.c into a "ports" file; that seems premature and overkill for your project. I dunno. I decided to solve only the macOS problem for now. So in this version, the A and B patches are exactly as you had them in your v7, except that B includes “port/pg_pthread.h” instead of . Maybe it’d make sense to move the Win32 pthread emulation stuff out of pgbench.c into src/port too (the pre-existing stuff, and the new barrier stuff you added), but that seems like a separate patch, one that I’m not best placed to write, and it’s not clear to me that we’ll want to be using pthread APIs as our main abstraction if/when thread usage increases in the PG source tree anyway. Other opinions welcome. I think it would be much more consistent to move all the thread complement stuff there directly: Currently (v8) the windows implementation is in pgbench and the MacOS implementation in port, which is quite messy. Attached is a patch set which does that. I cannot test it neither on Windows nor on MacOS. Path 1 & 2 are really independent. -- Fabien.diff --git a/configure b/configure index e202697bbf..54c5a7963f 100755 --- a/configure +++ b/configure @@ -11668,6 +11668,62 @@ if test "$ac_res" != no; then : fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for library containing pthread_barrier_wait" >&5 +$as_echo_n "checking for library containing pthread_barrier_wait... " >&6; } +if ${ac_cv_search_pthread_barrier_wait+:} false; then : + $as_echo_n "(cached) " >&6 +else + ac_func_search_save_LIBS=$LIBS +cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ + +/* Override any GCC internal prototype to avoid an error. + Use char because int might match the return type of a GCC + builtin and then its argument prototype would still apply. */ +#ifdef __cplusplus +extern "C" +#endif +char pthread_barrier_wait (); +int +main () +{ +return pthread_barrier_wait (); + ; + return 0; +} +_ACEOF +for ac_lib in '' pthread; do + if test -z "$ac_lib"; then +ac_res="none required" + else +ac_res=-l$ac_lib +LIBS="-l$ac_lib $ac_func_search_save_LIBS" + fi + if ac_fn_c_try_link "$LINENO"; then : + ac_cv_search_pthread_barrier_wait=$ac_res +fi +rm -f core conftest.err conftest.$ac_objext \ +conftest$ac_exeext + if ${ac_cv_search_pthread_barrier_wait+:} false; then : + break +fi +done +if ${ac_cv_search_pthread_barrier_wait+:} false; then : + +else + ac_cv_search_pthread_barrier_wait=no +fi +rm conftest.$ac_ext +LIBS=$ac_func_search_save_LIBS +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_search_pthread_barrier_wait" >&5 +$as_echo "$ac_cv_search_pthread_barrier_wait" >&6; } +ac_res=$ac_cv_search_pthread_barrier_wait +if test "$ac_res" != no; then : + test "$ac_res" = "none required" || LIBS="$ac_res $LIBS" + +fi + # Solaris: { $as_echo "$as_me:${as_lineno-$LINENO}: checking for library containing fdatasync" >&5 $as_echo_n "checking for library containing fdatasync... 
" >&6; } @@ -15853,6 +15909,19 @@ esac fi +ac_fn_c_check_func "$LINENO" "pthread_barrier_wait" "ac_cv_func_pthread_barrier_wait" +if test "x$ac_cv_func_pthread_barrier_wait" = xyes; then : + $as_echo "#define HAVE_PTHREAD_BARRIER_WAIT 1" >>confdefs.h + +else + case " $LIBOBJS " in + *" pthread_barrier_wait.$ac_objext "* ) ;; + *) LIBOBJS="$LIBOBJS pthread_barrier_wait.$ac_objext" + ;; +esac + +fi + ac_fn_c_check_func "$LINENO" "pwrite" "ac_cv_func_pwrite" if test "x$ac_cv_func_pwrite" = xyes; then : $as_echo "#define HAVE_PWRITE 1" >>confdefs.h diff --git a/configure.ac b/configure.ac index a5ad072ee4..23ad861b27 100644 --- a/configure.ac +++ b/configure.ac @@ -1152,6 +1152,7 @@ AC_SEARCH_LIBS(getopt_long, [getopt gnugetopt]) AC_SEARCH_LIBS(shm_open, rt) AC_SEARCH_LIBS(shm_unlink, rt) AC_SEARCH_LIBS(clock_gettime, [rt posix4]) +AC_SEARCH_LIBS(pthread_barrier_wait, pthread) # Solaris: AC_SEARCH_LIBS(fdatasync, [rt posix4]) # Required for thread_test.c on Solaris @@ -1734,6 +1735,7 @@ AC_REPLACE_FUNCS(m4_normalize([ mkdtemp pread preadv + pthread_barrier_wait pwrite pwritev random diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index a4a3f40048..0ac0e186ea 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -66,6 +66,7 @@ #include "libpq-fe.h" #include "pgbench.h" #include "portability/instr_time.h" +#include "port/pg_pthread.h" #ifndef M_PI #define M_PI 3.14159265358979323846 @@ -109,26 +110,6 @@ typedef struct socket_set #endif /* POLL_USING_SELECT */ -/* - * Multi-platform pthread implementations - */ - -#ifdef WIN32 -/* Use native win32 threads on Windows */ -typedef struct win32_pthread *pthread_t; -typedef int pthread_attr_t; - -static int pthread_create(pthread_t
Re: psql - add SHOW_ALL_RESULTS option
Hello Peter, I finally took some time to look at this. Attached v11 is a rebase. This patch still has a few of the problems reported earlier this year. In [0], it was reported that certain replication commands result in infinite loops because of faulty error handling. This still happens. I wrote a test for it, attached here. (I threw in a few more basic tests, just to have some more coverage that was lacking, and to have a file to put the new test in.) Hmmm… For some unclear reason on errors on a PGRES_COPY_* state PQgetResult keeps on returning an empty result. PQexec manually ignores it, so I did the same with a comment, but for me the real bug is somehow in PQgetResult behavior… In [1], it was reported that server crashes produce duplicate error messages. This also still happens. I didn't write a test for it, maybe you have an idea. (Obviously, we could check whether the error message is literally there twice in the output, but that doesn't seem very general.) But it's easy to test manually: just have psql connect, shut down the server, then run a query. This is also a feature/bug of libpq which happens to be hidden by PQexec: when one command crashes PQgetResult actually returns *2* results. First one with the FATAL message, second one when libpq figures out that the connection was lost with the second message appended to the first. PQexec just happen to silently ignore the first result. I added a manual reset of the error message when first shown so that it is not shown twice. It is unclear to me whether the reset should be somewhere in libpq instead. I added a voluntary crash at the end of the psql test. Attached v12 somehow fixes these issues in "psql" code rather than in libpq. -- Fabien.diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out index e0abe34bb6..8f7f93172a 100644 --- a/contrib/pg_stat_statements/expected/pg_stat_statements.out +++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out @@ -50,8 +50,28 @@ BEGIN \; SELECT 2.0 AS "float" \; SELECT 'world' AS "text" \; COMMIT; + float +--- + 2.0 +(1 row) + + text +--- + world +(1 row) + -- compound with empty statements and spurious leading spacing \;\; SELECT 3 + 3 \;\;\; SELECT ' ' || ' !' \;\; SELECT 1 + 4 \;; + ?column? +-- +6 +(1 row) + + ?column? +-- + ! +(1 row) + ?column? -- 5 @@ -61,6 +81,11 @@ COMMIT; SELECT 1 + 1 + 1 AS "add" \gset SELECT :add + 1 + 1 AS "add" \; SELECT :add + 1 + 1 AS "add" \gset + add +- + 5 +(1 row) + -- set operator SELECT 1 AS i UNION SELECT 2 ORDER BY i; i diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index ae38d3dcc3..1d411ae124 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -127,18 +127,11 @@ echo '\x \\ SELECT * FROM foo;' | psql commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) - Also, psql only prints the - result of the last SQL command in the string. - This is different from the behavior when the same string is read from - a file or fed to psql's standard input, - because then psql sends - each SQL command separately. - Because of this behavior, putting more than one SQL command in a - single -c string often has unexpected results. 
- It's better to use repeated -c commands or feed - multiple commands to psql's standard input, + If having several commands executed in one transaction is not desired, + use repeated -c commands or feed multiple commands to + psql's standard input, either using echo as illustrated above, or via a shell here-document, for example: @@ -3564,10 +3557,6 @@ select 1\; select 2\; select 3; commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) -psql prints only the last query result -it receives for each request; in this example, although all -three SELECTs are indeed executed, psql -only prints the 3. @@ -4154,6 +4143,18 @@ bar +SHOW_ALL_RESULTS + + +When this variable is set to off, only the last +result of a combined query (\;) is shown instead of +all of them. The default is on. The off behavior +is for compatibility with older versions of psql. + + + + + SHOW_CONTEXT diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c index ec975c3e2a..e06699878b 100644 --- a/src/bin/psql/common.c +++ b/src/bin/psql/common.c @@ -33,6 +33,8 @@ static bool DescribeQue
Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)
Hello Justin, It seems that the v31 patch does not apply anymore: postgresql> git apply ~/v31-0001-Document-historic-behavior-of-links-to-directori.patch error: patch failed: doc/src/sgml/func.sgml:27410 error: doc/src/sgml/func.sgml: patch does not apply -- Fabien.
Re: psql - add SHOW_ALL_RESULTS option
[...] I agree that these two behaviors in libpq are dubious, especially the second one. I want to spend some time analyzing this more and see if fixes in libpq might be appropriate. Ok. My analysis is that fixing libpq behavior is not in the scope of a psql patch, and that if I were to do that it would surely delay the patch even further. Also, these issues/features are corner cases that probably very few people have bumped into. -- Fabien.
Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)
Hello Justin, Happy new year! I think the 2nd half of the patches is still waiting for fixes to lstat() on Windows. Not your problem? Here is my review about v32: All patches apply cleanly.
# part 01 One-liner doc improvement to tell that creation time is only available on Windows. It is indeed not available on Linux. OK.
# part 02 Add tests for various options on pg_ls_dir, and for pg_stat_file, which were not exercised before. "make check" is ok. OK.
# part 03 This patch adds a new pg_ls_dir_metadata. Internally, this is an extension of the pg_ls_dir_files function which is used by other pg_ls functions. Doc ok. About the code: ISTM that the "if (1) { if (2) continue; } else if (3) { if (4) continue; }" structure may be simplified to "if (1 && 2) continue; if (3 && 4) continue;", at least if IS_DIR and IS_REG are incompatible? Otherwise, at least "else if (3 && 4) continue"? See the sketch after this review. The ifdef WIN32 (which probably detects windows 64 bits…) overwrites values[3]. ISTM it could be reordered so that there is no overwrite, with simpler single assignments: #ifndef WIN32 v = ...; #else v = ... ? ... : ...; #endif New tests are added which check that the result columns are configured as required, including error cases. "make check" is ok. OK.
# part 04 Add a new "isdir" column to "pg_ls_tmpdir" output. This is a small behavioral change. I'm ok with it, however I'm unsure why we would not jump directly to the "type" char column done later in the patch series. ISTM all such functions should be extended the same way for better homogeneity? That would also impact the "waldir", "archive_status", "logical_*", and "replslot" variants. "make check" ok. OK.
# part 05 This patch applies my previous advice:-) ISTM that parts 4 and 5 should be one single patch. The test changes show that only waldir has a test. Would it be possible to add minimal tests to the other variants as well? "make check" ok. I'd consider adding such tests with part 02. OK.
# part 06 This part extends and adds a test for pg_ls_logdir. ISTM that it should be merged with the previous patches. "make check" is ok. OK.
# part 07 This part extends pg_stat_file with more date information. ISTM that the documentation should be clear about the Windows vs unix/cygwin specific data provided (change/creation). The code adds a new value_from_stat function to avoid code duplication. Fine with me. All pg_ls_*dir functions are impacted. Fine with me. "make check" is ok. OK.
# part 08 This part substitutes lstat for stat. Fine with me. "make check" is ok. I guess that lstat does something under Windows even if the concept of link is somehow different there. Maybe the doc should say so somewhere? OK.
# part 09 This part switches the added "isdir" to a "type" column. "make check" is ok. This is a definite improvement. OK.
# part 10 This part adds a redundant "isdir" column. I do not see the point. "make check" is ok. NOT OK.
# part 11 This part adds a recurse option. Why not. However, the true value does not seem to be tested? "make check" is ok. My opinion is unclear.
Overall, ignoring part 10, this makes a definite improvement to postgres ls capabilities. I do not see any reason not to add those. -- Fabien.
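For the part 03 comment, a hedged illustration of the suggested simplification (a loop-body fragment; the flag names are hypothetical), assuming a directory entry cannot be both a directory and a regular file:

    /* inside the loop over directory entries, attrib from [l]stat() */
    if (skip_dirs && S_ISDIR(attrib.st_mode))
        continue;
    if (skip_files && S_ISREG(attrib.st_mode))
        continue;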
Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)
Here is my review about v32: I forgot to say that doc generation for the cumulative changes also works. -- Fabien.
Re: psql - add SHOW_ALL_RESULTS option
Hello Peter, With this "voluntary crash", the regression test output is now psql ... ok (test process exited with exit code 2) 281 ms Normally, I'd expect this during development if there was a crash somewhere, but showing this during a normal run now, and moreover still saying "ok", Well, from a testing perspective, the crash is voluntary and it is indeed ok:-) is quite weird and confusing. Maybe this type of test should be done in the TAP framework instead. It could. Another simpler option: add a "psql_voluntary_crash.sql" with just that test instead of modifying the "psql.sql" test script? That would keep the test exit code information, but the name of the script would make things clearer? Also, if non-zero statuses do not look so ok, should they be noted as bad? -- Fabien.
Re: psql - add SHOW_ALL_RESULTS option
Hello Peter, quite weird and confusing. Maybe this type of test should be done in the TAP framework instead. Attached v13 where the crash test is moved to tap. -- Fabien.diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out index e0abe34bb6..8f7f93172a 100644 --- a/contrib/pg_stat_statements/expected/pg_stat_statements.out +++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out @@ -50,8 +50,28 @@ BEGIN \; SELECT 2.0 AS "float" \; SELECT 'world' AS "text" \; COMMIT; + float +--- + 2.0 +(1 row) + + text +--- + world +(1 row) + -- compound with empty statements and spurious leading spacing \;\; SELECT 3 + 3 \;\;\; SELECT ' ' || ' !' \;\; SELECT 1 + 4 \;; + ?column? +-- +6 +(1 row) + + ?column? +-- + ! +(1 row) + ?column? -- 5 @@ -61,6 +81,11 @@ COMMIT; SELECT 1 + 1 + 1 AS "add" \gset SELECT :add + 1 + 1 AS "add" \; SELECT :add + 1 + 1 AS "add" \gset + add +- + 5 +(1 row) + -- set operator SELECT 1 AS i UNION SELECT 2 ORDER BY i; i diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index 1ab200a4ad..0a22850912 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -127,18 +127,11 @@ echo '\x \\ SELECT * FROM foo;' | psql commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) - Also, psql only prints the - result of the last SQL command in the string. - This is different from the behavior when the same string is read from - a file or fed to psql's standard input, - because then psql sends - each SQL command separately. - Because of this behavior, putting more than one SQL command in a - single -c string often has unexpected results. - It's better to use repeated -c commands or feed - multiple commands to psql's standard input, + If having several commands executed in one transaction is not desired, + use repeated -c commands or feed multiple commands to + psql's standard input, either using echo as illustrated above, or via a shell here-document, for example: @@ -3570,10 +3563,6 @@ select 1\; select 2\; select 3; commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) -psql prints only the last query result -it receives for each request; in this example, although all -three SELECTs are indeed executed, psql -only prints the 3. @@ -4160,6 +4149,18 @@ bar +SHOW_ALL_RESULTS + + +When this variable is set to off, only the last +result of a combined query (\;) is shown instead of +all of them. The default is on. The off behavior +is for compatibility with older versions of psql. + + + + + SHOW_CONTEXT diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c index f210ccbde8..b8e8c2b245 100644 --- a/src/bin/psql/common.c +++ b/src/bin/psql/common.c @@ -33,6 +33,8 @@ static bool DescribeQuery(const char *query, double *elapsed_msec); static bool ExecQueryUsingCursor(const char *query, double *elapsed_msec); static bool command_no_begin(const char *query); static bool is_select_command(const char *query); +static int SendQueryAndProcessResults(const char *query, double *pelapsed_msec, + bool is_watch, const printQueryOpt *opt, FILE *printQueryFout, bool *tx_ended); /* @@ -353,7 +355,7 @@ CheckConnection(void) * Returns true for valid result, false for error state. 
*/ static bool -AcceptResult(const PGresult *result) +AcceptResult(const PGresult *result, bool show_error) { bool OK; @@ -384,7 +386,7 @@ AcceptResult(const PGresult *result) break; } - if (!OK) + if (!OK && show_error) { const char *error = PQerrorMessage(pset.db); @@ -472,6 +474,18 @@ ClearOrSaveResult(PGresult *result) } } +/* + * Consume all results + */ +static void +ClearOrSaveAllResults() +{ + PGresult *result; + + while ((result = PQgetResult(pset.db)) != NULL) + ClearOrSaveResult(result); +} + /* * Print microtiming output. Always print raw milliseconds; if the interval @@ -572,7 +586,7 @@ PSQLexec(const char *query) ResetCancelConn(); - if (!AcceptResult(res)) + if (!AcceptResult(res, true)) { ClearOrSaveResult(res); res = NULL; @@ -595,11 +609,8 @@ int PSQLexecWatch(const char *query, const printQueryOpt *opt, FILE *printQueryFout) { bool timing = pset.timing; - PGresult *res; double elapsed_msec = 0; - instr_time before; - instr_time after; - FILE *fout; + int res; if (!pset.db) { @@ -608,77 +619
Re: psql - add SHOW_ALL_RESULTS option
Hello Andres, The reason this test constantly fails on cfbot windows is a use-after-free bug. Indeed! Thanks a lot for the catch and the debug! The ClearOrSaveResult function is quite annoying because it may or may not clear the result as a side effect. Attached v14 moves the status extraction before the possible clear. I've added a couple of results = NULL after such calls in the code. -- Fabien.diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out index e0abe34bb6..8f7f93172a 100644 --- a/contrib/pg_stat_statements/expected/pg_stat_statements.out +++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out @@ -50,8 +50,28 @@ BEGIN \; SELECT 2.0 AS "float" \; SELECT 'world' AS "text" \; COMMIT; + float +--- + 2.0 +(1 row) + + text +--- + world +(1 row) + -- compound with empty statements and spurious leading spacing \;\; SELECT 3 + 3 \;\;\; SELECT ' ' || ' !' \;\; SELECT 1 + 4 \;; + ?column? +-- +6 +(1 row) + + ?column? +-- + ! +(1 row) + ?column? -- 5 @@ -61,6 +81,11 @@ COMMIT; SELECT 1 + 1 + 1 AS "add" \gset SELECT :add + 1 + 1 AS "add" \; SELECT :add + 1 + 1 AS "add" \gset + add +- + 5 +(1 row) + -- set operator SELECT 1 AS i UNION SELECT 2 ORDER BY i; i diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index 1ab200a4ad..0a22850912 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -127,18 +127,11 @@ echo '\x \\ SELECT * FROM foo;' | psql commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) - Also, psql only prints the - result of the last SQL command in the string. - This is different from the behavior when the same string is read from - a file or fed to psql's standard input, - because then psql sends - each SQL command separately. - Because of this behavior, putting more than one SQL command in a - single -c string often has unexpected results. - It's better to use repeated -c commands or feed - multiple commands to psql's standard input, + If having several commands executed in one transaction is not desired, + use repeated -c commands or feed multiple commands to + psql's standard input, either using echo as illustrated above, or via a shell here-document, for example: @@ -3570,10 +3563,6 @@ select 1\; select 2\; select 3; commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) -psql prints only the last query result -it receives for each request; in this example, although all -three SELECTs are indeed executed, psql -only prints the 3. @@ -4160,6 +4149,18 @@ bar +SHOW_ALL_RESULTS + + +When this variable is set to off, only the last +result of a combined query (\;) is shown instead of +all of them. The default is on. The off behavior +is for compatibility with older versions of psql. 
+ + + + + SHOW_CONTEXT diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c index 3503605a7d..47eabcbb8e 100644 --- a/src/bin/psql/common.c +++ b/src/bin/psql/common.c @@ -34,6 +34,8 @@ static bool DescribeQuery(const char *query, double *elapsed_msec); static bool ExecQueryUsingCursor(const char *query, double *elapsed_msec); static bool command_no_begin(const char *query); static bool is_select_command(const char *query); +static int SendQueryAndProcessResults(const char *query, double *pelapsed_msec, + bool is_watch, const printQueryOpt *opt, FILE *printQueryFout, bool *tx_ended); /* @@ -354,7 +356,7 @@ CheckConnection(void) * Returns true for valid result, false for error state. */ static bool -AcceptResult(const PGresult *result) +AcceptResult(const PGresult *result, bool show_error) { bool OK; @@ -385,7 +387,7 @@ AcceptResult(const PGresult *result) break; } - if (!OK) + if (!OK && show_error) { const char *error = PQerrorMessage(pset.db); @@ -473,6 +475,18 @@ ClearOrSaveResult(PGresult *result) } } +/* + * Consume all results + */ +static void +ClearOrSaveAllResults() +{ + PGresult *result; + + while ((result = PQgetResult(pset.db)) != NULL) + ClearOrSaveResult(result); +} + /* * Print microtiming output. Always print raw milliseconds; if the interval @@ -573,7 +587,7 @@ PSQLexec(const char *query) ResetCancelConn(); - if (!AcceptResult(res)) + if (!AcceptResult(res, true)) { ClearOrSaveResult(res); res = NULL; @@ -596,11 +610,8 @@ int PSQLexecWatch(const char *query, c
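To make the use-after-free fix pattern above concrete, a minimal hedged sketch (a fragment; ClearOrSaveResult() is psql's helper shown in the diffs, and the surrounding code is elided): capture the status while the PGresult is still alive, then clear the pointer so it cannot be reused.

    ExecStatusType status = PQresultStatus(result); /* before any free */

    ClearOrSaveResult(result);  /* may PQclear(result) as a side effect */
    result = NULL;              /* guard against accidental reuse */

    if (status == PGRES_COPY_IN || status == PGRES_COPY_OUT)
    {
        /* ... handle COPY from the saved status, not the freed result ... */
    }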
Re: pgsql: Add libpq pipeline mode support to pgbench
Bonjour Daniel, Ola Alvaro, Add libpq pipeline mode support to pgbench New metacommands \startpipeline and \endpipeline allow the user to run queries in libpq pipeline mode. Author: Daniel Vérité Reviewed-by: Álvaro Herrera Discussion: https://postgr.es/m/b4e34135-2bd9-4b8a-94ca-27d760da2...@manitou-mail.org I did not notice that the libpq pipeline mode thread had a pgbench patch attached, otherwise I would have looked at it. Some minor post-commit comments: For consistency with the existing \if … \endif, ISTM that it could have been named \batch … \endbatch or \pipeline … \endpipeline? ISTM that the constraint checks (nesting, no \[ag]set) could be added to CheckConditional, which could be renamed to CheckScript. That would allow simplifying the checks in the command execution into mere Asserts. -- Fabien.
Re: Using COPY FREEZE in pgbench
Hello Tatsuo-san, I have looked in the code of PQprotocolVersion. The only case in which it returns 0 is when there's no connection. Yes, you are right. Once the connection has been successfully established, there's no chance it fails. So I agree with you. Attached v3 patch addresses this. The "g" item in the section describing initialization steps (i.e. option -I): I'd suggest just replacing "COPY" with "COPY FREEZE" in the sentence. Ok. The section needed to be modified; this is also addressed in the patch. V3 works for me and looks ok. I changed it to ready in the CF app. -- Fabien.
Re: Using COPY FREEZE in pgbench
V3 works for me and looks ok. I changed it to ready in the CF app. Thank you for your review! Unfortunately it seems cfbot is not happy with the patch. Argh. Indeed, I did not think of testing on a partitioned table:-( ISTM I did "make check" in pgbench to trigger the TAP tests, but possibly it was in a dream:-( The feature is a little disappointing because if someone has partitioned tables then they probably have a lot of data and would probably like freeze to work there. Maybe freeze would work on the table partitions themselves, but I do not think it is worth the effort to do that. About v4, there is a typo in the doc ("portioning"): pgbench uses FREEZE option with 14 or later version of PostgreSQL to speed up subsequent VACUUM if portioning option is not specified. I'd suggest: pgbench uses the FREEZE option with 14 or later version of PostgreSQL to speed up subsequent VACUUM, unless partitions are enabled. -- Fabien.
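A rough sketch of the intended logic (a hypothetical helper, not the patch itself): gate FREEZE on both the server version and the absence of partitions, since COPY FREEZE fails on a partitioned table.

    #include "libpq-fe.h"

    static const char *
    accounts_copy_statement(PGconn *con, int partitions)
    {
        /* FREEZE is only worthwhile on v14+ and fails on partitioned tables */
        if (PQserverVersion(con) >= 140000 && partitions == 0)
            return "copy pgbench_accounts from stdin with (freeze on)";

        return "copy pgbench_accounts from stdin";
    }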
Re: Using COPY FREEZE in pgbench
Attached is the v5 patch. About v5: doc gen ok, global and local make check ok. I did a few tests on my laptop. It seems that copying takes a little more time, say about 10%, but vacuum is indeed very significantly reduced, so that the total time for copying and vacuuming is reduced by 10% overall. So it is okay for me. -- Fabien.
Re: Using COPY FREEZE in pgbench
Hello Tatsuo-san, 13.2 pgbench + master branch server: done in 15.47 s (drop tables 0.19 s, create tables 0.01 s, client-side generate 9.07 s, vacuum 2.07 s, primary keys 4.13 s). With patch on master branch: done in 13.38 s (drop tables 0.19 s, create tables 0.01 s, client-side generate 9.68 s, vacuum 0.23 s, primary keys 3.27 s). Yes, this is the kind of figures I got on my laptop. This time current pgbench performs much faster than I wrote (15.47 s vs. 70.78 s). I don't know why. Dunno. Anyway, this time the total pgbench time is reduced by 14% overall here. I hope people agree that the patch is worth the gain. Yes, because (1) why not take +10%, and (2) it exercises an option. -- Fabien.
Re: pgbench - add pseudo-random permutation function
Hello Dean, Thanks a lot for this work. This version looks much better than the previous one you sent, and has significant advantages over the one I sent, in particular avoiding the prime number stuff and the large modular multiply. So this looks good! I'm happy that halves-xoring is back because it was an essential part of the design. ISTM that the previous version you sent only had linear/affine transforms, which was a bad idea. The linear transform is moved inside halves, why not, and the trick that any odd multiplier is coprime with a power of two is neat on these. However I still have some reservations:

First, I have a thing against erand48. The erand48 48-bit design looks like something that made sense when computer architectures were 16/32 bits and large integers were 32 bits, so a 16-bit margin looked good enough. Using an int16[3] type now is really depressing and probably costly on modern architectures. With 64-bit architectures, and given that we are manipulating 64-bit integers (well, 63 really), ISTM that the minimal state size for a PRNG should be at least 64 bits. Is there a good reason NOT to use a 64-bit-state PRN generator? I took Knuth's because it is cheap and 64 bits. I'd accept anything which is at least 64 bits. I looked at xor-shift things, but there was some controversy around these so I kept it simple and just shifted small values to get rid of the obvious cycles on Knuth's generator.

Second, I have a significant reservation about the very structure of the transformation in this version:

    loop 4 times:
        // FIRST HALF STEER
        m/r = pseudo randoms
        if v in first "half":
            v = ((v * m) ^ r) & mask;
            rotate1(v)
        // FULL SHIFT 1
        r = pseudo random
        v = (v + r) % size
        // SECOND HALF STEER
        m/r = pseudo randoms
        if v in second "half":
            same as previous on second half
        // FULL SHIFT 2
        r = pseudo random
        v = (v + r) % size

I'm really at odds with FULL SHIFT 1, because it means that up to 1/256 of values are kept out of STEERING. Whole chunks of values could be kept unshuffled because they would only have SHIFTS applied to them and each time fall in the non-steered half. It should be an essential part of the design that at least one steer is applied to a value at each round, and if two are applied then fine, but certainly not zero. So basically I think that the design would be significantly improved by removing "FULL SHIFT 1".

Third, I think that the rotate code can be simplified, in particular the ?: should be avoided because it may induce branches quite damaging to processor performance.

I'll give it some more thought. -- Fabien.
Re: pgbench - add pseudo-random permutation function
Hello Dean, First, I have a thing against erand48. Yeah, that's probably a fair point. However, all the existing pgbench random functions are using it, so I think it's fair enough for permute() to do the same (and actually 2^48 is pretty huge). Switching to a 64-bit PRNG might not be a bad idea, but I think that's something we'd want to do across the board, and so I think it should be out of scope for this patch. But less likely to pass, whereas here we have an internal function that we can set as we want. Also, there is a 64-bit seed provided to the function, which instantly ignores 16 of its bits; this looks pretty silly to me. Also, the function is named everywhere erand48 with its hardcoded int16[3] state, which makes a poor abstraction. At least, I suggest that two 48-bit PRNGs could be initialized with parts of the seed and used in different places, e.g. for r & m. Also, the seed could be used to adjust the rotation, maybe. I'm really at odds with FULL SHIFT 1, because it means that up to 1/256 of values are kept out of STEERING. [...] Ah, that's a good point. Something else that also concerned me there was that it might lead to 2 consecutive full shifts with nothing in between, which would lead to less uniform randomness (like the Irwin-Hall distribution). I just did a quick test without the first full shift, and the results do appear to be better, Indeed, it makes sense to me. so removing that looks like a good idea. Third, I think that the rotate code can be simplified, in particular the ?: should be avoided because it may induce branches quite damaging to processor performance. Yeah, I wondered about that. Perhaps there's a "trick" that can be used to simplify it. Pre-computing the number of bits in the mask would probably help. See pg_popcount64(). -- Fabien.
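A hedged sketch of a branch-free rotate along those lines (an illustration, not the committed code), assuming mask has the form 2^bits - 1 and v <= mask, with bits precomputed once, e.g. via pg_popcount64(mask):

    #include <stdint.h>

    /* rotate v left by one position within the low "bits" bits */
    static inline uint64_t
    rotl1_masked(uint64_t v, int bits, uint64_t mask)
    {
        return ((v << 1) | (v >> (bits - 1))) & mask;
    }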
Re: pgbench - add pseudo-random permutation function
Hello Dean, OK, attached is an update making this change and simplifying the rotate code, which hopefully just leaves the question of what (if anything) to do with pg_erand48(). Yep. While looking at it, I have some doubts on this part:

    m = (uint64) (pg_erand48(random_state.xseed) * (mask + 1)) | 1;
    r = (uint64) (pg_erand48(random_state.xseed) * (mask + 1));
    r = (uint64) (pg_erand48(random_state.xseed) * size);

I do not understand why the random values are multiplied by anything in the first place… This one looks like a no-op:

    r = (uint64) (pg_erand48(random_state.xseed) * size);
    v = (v + r) % size;

    v = (v + r) % size
      = (v + rand * size) % size
      =? (v % size + rand * size % size) % size
      =? (v % size + 0) % size
      = v % size
      = v

I'm also skeptical about this one:

    r = (uint64) (pg_erand48(random_state.xseed) * (mask + 1));
    if (v <= mask)
        v = ((v * m) ^ r) & mask;

    v = ((v * m) ^ r) & mask
      = ((v * m) ^ r) % (mask+1)
      = ((v * m) ^ (rand * (mask+1))) % (mask+1)
      =? ((v * m) % (mask+1)) ^ (rand * (mask+1) % (mask+1))
      =? ((v * m) % (mask+1)) ^ 0
      = (v * m) & mask

Or possibly I'm missing something obvious and I'm wrong with my arithmetic? -- Fabien.
Re: pgbench - add pseudo-random permutation function
r = (uint64) (pg_erand48(random_state.xseed) * size); I do not understand why the random values are multiplied by anything in the first place… These are just random integers in the range [0,mask] and [0,size-1], formed in exactly the same way as getrand(). Indeed, erand returns a double; this was the part I was missing. I did not realize that you had switched to doubles in your approach. I think that permute should only use integer operations. I'd suggest using one of the integer variants instead of going through a double computation and casting back to int. The internal state is based on integers; I do not see the added value of going through floats, possibly enduring floating point issues (underflow, rounding, normalization, whatever) on the way, whereas from start to finish we just need ints. See attached v27 proposal. I still think that *rand48 is a poor (relatively small state) and inefficient (the implementation includes packing and unpacking 16-bit ints to build a 64-bit int) choice. -- Fabien.
Re: pgbench - add pseudo-random permutation function
See attached v27 proposal. As usual, it is easier to see with the actual attachement:-) -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 50cf22ba6b..84d9566f49 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -1057,7 +1057,7 @@ pgbench options d default_seed - seed used in hash functions by default + seed used in hash and pseudorandom permutation functions by default @@ -1864,6 +1864,24 @@ SELECT 4 AS four \; SELECT 5 AS five \aset + + +permute ( i, size [, seed ] ) +integer + + +Permuted value of i, in the range +[0, size). This is the new position of +i (modulo size) in a +pseudorandom permutation of the integers 0...size-1, +parameterized by seed. + + +permute(0, 4) +an integer between 0 and 3 + + + pi () @@ -2071,29 +2089,70 @@ f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) / + + + When designing a benchmark which selects rows non-uniformly, be aware + that the rows chosen may be correlated with other data such as IDs from + a sequence or the physical row ordering, which may skew performance + measurements. + + + To avoid this, you may wish to use the permute + function, or some other additional step with similar effect, to shuffle + the selected rows and remove such correlations. + + + Hash functions hash, hash_murmur2 and hash_fnv1a accept an input value and an optional seed parameter. In case the seed isn't provided the value of :default_seed is used, which is initialized randomly unless set by the command-line --D option. Hash functions can be used to scatter the -distribution of random functions such as random_zipfian or -random_exponential. For instance, the following pgbench -script simulates possible real world workload typical for social media and -blogging platforms where few accounts generate excessive load: +-D option. + + + +permute accepts an input value, a size, and an optional +seed parameter. It generates a pseudorandom permutation of integers in +the range [0, size), and returns the index of the input +value in the permuted values. The permutation chosen is parameterized by +the seed, which defaults to :default_seed, if not +specified. Unlike the hash functions, permute ensures +that there are no collisions or holes in the output values. Input values +outside the interval are interpreted modulo the size. The function raises +an error if the size is not positive. permute can be +used to scatter the distribution of non-uniform random functions such as +random_zipfian or random_exponential +so that values drawn more often are not trivially correlated. 
For +instance, the following pgbench script +simulates a possible real world workload typical for social media and +blogging platforms where a few accounts generate excessive load: -\set r random_zipfian(0, 1, 1.07) -\set k abs(hash(:r)) % 100 +\set size 100 +\set r random_zipfian(1, :size, 1.07) +\set k 1 + permute(:r, :size) In some cases several distinct distributions are needed which don't correlate -with each other and this is when implicit seed parameter comes in handy: +with each other and this is when the optional seed parameter comes in handy: -\set k1 abs(hash(:r, :default_seed + 123)) % 100 -\set k2 abs(hash(:r, :default_seed + 321)) % 100 +\set k1 1 + permute(:r, :size, :default_seed + 123) +\set k2 1 + permute(:r, :size, :default_seed + 321) + +A similar behavior can also be approximated with hash: + + +\set size 100 +\set r random_zipfian(1, 100 * :size, 1.07) +\set k 1 + abs(hash(:r)) % :size + + +However, since hash generates collisions, some values +will not be reachable and others will be more frequent than expected from +the original distribution. diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y index 4d529ea550..56f75ccd25 100644 --- a/src/bin/pgbench/exprparse.y +++ b/src/bin/pgbench/exprparse.y @@ -19,6 +19,7 @@ #define PGBENCH_NARGS_VARIABLE (-1) #define PGBENCH_NARGS_CASE (-2) #define PGBENCH_NARGS_HASH (-3) +#define PGBENCH_NARGS_PERMUTE (-4) PgBenchExpr *expr_parse_result; @@ -370,6 +371,9 @@ static const struct { "hash_fnv1a", PGBENCH_NARGS_HASH, PGBENCH_HASH_FNV1A }, + { + "permute", PGBENCH_NARGS_PERMUTE, PGBENCH_PERMUTE + }, /* keep as last array element */ { NULL, 0, 0 @@ -482,6 +486,19 @@ make_func(yyscan_t yyscanner, int fnumber, PgBenchExprList *args) } break; + /* pseudorandom permutation function with optional seed argument */ + case PGBENCH_NARGS_PERMUTE: + if (len < 2 |
Re: [PATCH] Implement motd for PostgreSQL
Hello Joel, This patch is one day late, my apologies for missing the deadline this year. PostgreSQL has long been suffering from the lack of a proper UNIX-style motd (message of the day). My 0.02€: apart from the April Fools' Day joke dimension, I'd admit that I would not mind actually having such a fun feature in pg, possibly disabled by default. -- Fabien.
Re: [PATCH] Implement motd for PostgreSQL
Perhaps the configuration-file parser has been fixed since to support embedded newlines? If so, then maybe it would actually be an idea to support newlines by escaping them? Dunno. If such a feature gets considered, I'm not sure I'd like to actually edit the pg configuration file to change the message. The actual source looks pretty straightforward. I'm wondering whether pg style would suggest writing motd != NULL instead of just motd. I'm wondering whether it should be possible to designate (1) a file, the content of which would be shown, or (2) a command, the output of which would be shown [ok, there might be security implications on this one]. -- Fabien.
Re: [PATCH] Implement motd for PostgreSQL
Hello Joel, My 0.02€: If such a feature gets considered, I'm not sure I'd like to actually edit pg configuration file to change the message. For the ALTER SYSTEM case, the value would be written to postgresql.auto.conf, and that file we shouldn't edit manually anyway, right? Sure. I meant changing the configuration in any way: through manual editing, alter system, whatever. The actual source looks pretty straightforward. I'm wondering whether pg style would suggest to write motd != NULL instead of just motd. That's what I had originally, but when reviewing my code for style, I noticed the other case is more common: If the other cases are indeed pointers. For pgbench, all direct "if (xxx &&" cases are simple booleans or integers; pointers seem to have "!= NULL". While looking quickly at the grep output, ISTM that most obvious pointers have "!= NULL" and other cases often look like booleans: catalog/pg_operator.c: if (isDelete && t->oprcom == baseId) catalog/toasting.c: if (check && lockmode != AccessExclusiveLock) commands/async.c: if (amRegisteredListener && listenChannels == NIL) commands/explain.c: if (es->analyze && es->timing) … I'm sure there are exceptions, but ISTM that the local style is "!= NULL". I'm wondering whether it should be possible to designate (1) a file the content of which would be shown, or (2) a command, the output of which would be shown [ok, there might be security implications on this one]. Can't we just do that via plpgsql and EXECUTE somehow? Hmmm. Should we really want to forcibly execute some PL/pgSQL on any new connection? I'm not sure this is really desirable. I was thinking of something more trivial, like having the "motd" directive designate a file instead of the message itself. There could be a hook system to execute some user code on new connections and other events. It could be a new kind of event trigger, e.g. with connection_start, connection_end… That could be nice for other purposes, i.e. auditing. Now, event triggers are not global, they work inside a database, which would suggest that, if extended, a new connection event would be fired per database connection, not just once per connection. Not sure that would be a bad thing. -- Fabien.
Re: pgbench - add pseudo-random permutation function
Hello Dean, I think that permute should only use integer operations. I'd suggest using one of the integer variants instead of going through a double computation and casting back to int. The internal state is based on integers, I do not see the added value of going through floats, possibly enduring floating point issues (underflow, rounding, normalization, whatever) on the way, whereas from start to finish we just need ints. This is the already-established coding pattern used in getrand() to pick a random number uniformly in some range that's not necessarily a power of 2. Indeed. I'm not particularly happy with that one either. Floating point underflow and normalisation issues are not possible because erand48() takes a 48-bit integer N and uses ldexp() to store N/2^48 in a double, which is an exact operation since IEEE doubles have something like 56-bit mantissas. Double mantissa size is 52 bits. This is then turned back into an integer in the required range by multiplying by the desired maximum value, so there's never any risk of underflow or normalisation issues. ISTM that there are significant issues when multiplying with an integer, because the integer is cast to a double before multiplying, so if the int is over 52 bits then it is coldly truncated and some values are just lost in the process and will never be drawn. Probably not too many of them, but some of them anyway. I guess that there may be rounding variations once the required maximum value exceeds something like 2^56 (although the comment in getrand() is much more conservative than that), so it's possible that a pgbench script calling random() with (ub-lb) larger than that might give different results on different platforms. Dunno. This may be the same issue I'm pointing out above. For the non-uniform random functions, that effect might well kick in sooner. I'm not aware of any field complaints about that though, possibly because real-world data sizes are significantly smaller than that. In practice, permute() is likely to take its input from one of the non-uniform random functions, so it won't be permute() that first introduces rounding issues. Sure. I'd like permute to be immune to that. See attached v27 proposal. This update has a number of flaws. For example, this: Indeed:-)

    +static uint64
    +randu64(RandomState * state)
    +{
    +	uint64	r1 = pg_jrand48((*state).xseed),
    +			r2 = pg_jrand48((*state).xseed);
    +	return r1 << 51 | r2 << 13 | r1 >> 13;
    +}

It still uses a 48-bit RandomState, so it doesn't improve on getrand() in that respect. Sure. I'm pretty unhappy with that one, but I was not trying to address that. My idea was that randu64 would be replaced with something better at some point. My intention was "64-bit pseudo-random"; my implementation does not work, ok:-) It replaces a single erand48() call with 2 jrand48() calls, which comes at a cost in terms of performance. (In fact, changing the number of rounds in the previous version of permute() from 4 to 6 has a smaller performance impact than this -- more about that below.) Sure, same remark as above, I was not trying to address that point. jrand48() returns a signed 32-bit integer, which has a 50% chance of being negative, so when that is cast to a uint64, there is a 50% chance that the 32 most significant bits will be 1. Argh. When the various parts are OR'ed together, that will then mask out any randomness from the other parts. For example, 50% of the time, the jrand48() value used for r1 will be negative, and so 32 bits in the middle of the final result will all be set. Argh.
I hesitated to use xor. I should not have:-) So overall, the results will be highly non-uniform, with less randomness and poorer performance than erand48(). Indeed, a bad choice. I wanted to use the unsigned version but it is not implemented, and switched to the signed version without thinking of some of the implications. In addition, it returns a result in the range [0,2^64), which is not really what's wanted. For example:

    +/* Random offset */
    +r = randu64(&random_state2);
    +v = (v + r) % size;

The previous code used r = getrand(0, size-1), which gave a uniformly random offset. However, the new code risks introducing additional non-uniformity when size is not a power of 2. ISTM that the overall non-uniformity is worse with the float approach as opposed to the int approach. Conceptually, the same kind of bias is expected whether you go through floats or through ints, because the underlying power-of-two maths is the same, so what makes the difference in reducing non-uniformity is using more bits. Basically, when enough bits are used, the same number of values should appear n vs n+1 times. When not enough bits are provided, things get ugly: for instance, with size = 2^53, even if the floats fully used the 52-bit pseudo-random mantissa (it is really 48 bits with erand48), it would result in only even
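For reference, a minimal sketch of how the sign-extension pitfall could be avoided (an illustration, not a committed fix; it assumes PG's pg_jrand48() from src/port/erand48.c): widen each draw through uint32 before combining.

    #include <stdint.h>

    /* pg_jrand48() returns a signed 32-bit value in a long; casting it
     * to uint32_t first keeps the high 32 bits of the result from being
     * polluted by sign extension */
    static uint64_t
    two_draws_to_uint64(unsigned short xseed[3])
    {
        uint64_t hi = (uint32_t) pg_jrand48(xseed);
        uint64_t lo = (uint32_t) pg_jrand48(xseed);

        return (hi << 32) | lo;
    }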
Re: psql - add SHOW_ALL_RESULTS option
Hello Peter,

My 0.02€: I tested this updated version and have no further comments on it. From my point of view it could be committed. I would not bother to separate the test style adjustments.

Committed. The last thing I fixed was the diff in the copy2.out regression test. The order of the notices with respect to the error messages was wrong. I fixed that by switching back to the regular notice processor during COPY handling. Btw., not sure if that was mentioned before, but a cool use of this is to \watch multiple queries at once, like select current_date \; select current_time \watch

Indeed, that was one of the things I tested on the patch. I'm wondering whether the documentation should point this out explicitly. Anyway, thanks for the push!

-- Fabien.
Re: pgbench - add pseudo-random permutation function
Hello Dean,

Pushed. I decided not to go with the "joke" randu64() function, but instead used getrand() directly. This at least removes any *direct* use of doubles in permute() (though of course they're still used indirectly), and means that if getrand() is improved in the future, permute() will benefit too.

Good idea, at least it is hidden and in one place. Thanks for the push!

-- Fabien.
RE: psql - add SHOW_ALL_RESULTS option
Hello,

I met a problem after commit 3a51306722. While executing a SQL statement with psql, I can't interrupt it by pressing ctrl+c. For example:

postgres=# insert into test select generate_series(1,1000);
^C^CINSERT 0 1000

Press ctrl+c before the INSERT finishes, and psql still continues the INSERT.

I can confirm this unexpected change of behavior on this commit. This is indeed a bug.

Is it the result expected?

Obviously not.

And I think maybe it is better to allow users to interrupt by pressing ctrl+c.

Obviously yes. The problem is that the cancellation handling is reset too early, right after sending an asynchronous request. Attached is a patch which attempts to fix this by moving the cancellation reset to after the results are processed.

-- Fabien.

diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 028a357991..0482e57d45 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -1117,7 +1117,6 @@ SendQueryAndProcessResults(const char *query, double *pelapsed_msec, bool is_wat
 	INSTR_TIME_SET_CURRENT(before);
 
 	success = PQsendQuery(pset.db, query);
-	ResetCancelConn();
 
 	if (!success)
 	{
@@ -1257,6 +1256,9 @@ SendQueryAndProcessResults(const char *query, double *pelapsed_msec, bool is_wat
 	if (!CheckConnection())
 		return -1;
 
+	/* all results are processed, nothing to cancel anymore */
+	ResetCancelConn();
+
 	return success ? 1 : -1;
 }
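To see why the reset has to wait, here is a minimal standalone libpq sketch (a hypothetical illustration with error handling elided, not the actual psql code) of the asynchronous pattern at stake: the connection must remain cancellable until all results have been drained.

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
	/* assumes a reachable database via default connection settings */
	PGconn	   *conn = PQconnectdb("");
	PGresult   *res;

	/* psql arms its ctrl-c handling around here (SetCancelConn) */
	PQsendQuery(conn, "SELECT 1; SELECT 2");

	/* the query may still be running server-side during this loop, so
	 * cancellation must stay possible until PQgetResult() returns NULL;
	 * resetting it just after PQsendQuery() is what broke ctrl-c */
	while ((res = PQgetResult(conn)) != NULL)
	{
		printf("status: %s\n", PQresStatus(PQresultStatus(res)));
		PQclear(res);
	}

	/* only now is there nothing left to cancel (ResetCancelConn) */
	PQfinish(conn);
	return 0;
}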
Re: psql - add SHOW_ALL_RESULTS option
Bonjour Julien,

Attached is a patch which attempts to fix this by moving the cancellation reset to after the results are processed.

Thank you for the fix. I tested it and the problem is solved after applying your patch.

Thanks for the patch Fabien. I've hit this issue multiple times and it is indeed unwelcome. Should we add it as an open item?

It is definitely an open item. I'm not sure where you want to add it… possibly the "Pg 14 Open Items" wiki page? I tried but I do not have enough privileges; if you can do it, please proceed. I added an entry in the next CF in the bugfix section.

-- Fabien.
Re: psql - add SHOW_ALL_RESULTS option
Bonjour Michaël,

I was running a long query this morning and wondered why the cancellation was suddenly broken. So I am not alone, and here you are with already a solution :) So, studying through 3a51306, this stuff has changed the query execution from a sync PQexec() to an async PQsendQuery().

Yes, because we want to handle all results, whereas PQexec jumps to the last one.

And the proposed fix changes back to the behavior where the cancellation reset happens after getting a result, as there is no need to cancel anything.

Yep. ISTM that this is what happens internally in PQexec.

No strong objections from here if the consensus is to make SendQueryAndProcessResults() handle the cancel reset properly, though I am not sure if this is the cleanest way to do things,

I was wondering as well. I did a quick fix because the regression is irritating, and put off looking at it more closely until the week-end.

but let's make at least the whole business consistent in the code for all those code paths.

There are quite a few of them, some which reset the stuff and some which do not depending on various conditions, some with early exits, all of which required brain cells and a little time to investigate…

For example, PSQLexecWatch() does an extra ResetCancelConn() that would be useless once we are done with SendQueryAndProcessResults(). Also, I can see that SendQueryAndProcessResults() would not issue a cancel reset if the query fails, for \watch when cancel is pressed, and for \watch with COPY. So, my opinion here would be to keep ResetCancelConn() within PSQLexecWatch(), just add an extra one in SendQuery() to make all three code paths printing results consistent, and leave SendQueryAndProcessResults() out of the cancellation logic.

Yep, it looks much better. I found it strange that the latter did a reset but was not doing the set. Attached v2 does as you suggest.

That's strange, I don't think you need special permission there. It's working for me so I added an item with a link to the patch! As long as you have a community account, you should have the possibility to edit the page.

After login as "calvin", I have "Want to edit, but don't see an edit button when logged in? Click here.". So if you feel that any change is required, please feel free to do so, of course.

-- Fabien.

diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 028a357991..5355086fe2 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -1117,7 +1117,6 @@ SendQueryAndProcessResults(const char *query, double *pelapsed_msec, bool is_wat
 	INSTR_TIME_SET_CURRENT(before);
 
 	success = PQsendQuery(pset.db, query);
-	ResetCancelConn();
 
 	if (!success)
 	{
@@ -1486,6 +1485,8 @@ SendQuery(const char *query)
 
 sendquery_cleanup:
 
+	ResetCancelConn();
+
 	/* reset \g's output-to-filename trigger */
 	if (pset.gfname)
 	{
Re: psql - add SHOW_ALL_RESULTS option
Attached v2 does as you suggest.

There is not a single test of "ctrl-c" which would have caught this trivial and irritating regression. ISTM that a TAP test is doable. Should one be added?

-- Fabien.
Re: psql - add SHOW_ALL_RESULTS option
Yep, it looks much better. I found it strange that the latter did a reset but was not doing the set. Attached v2 does as you suggest.

Close enough. I was thinking about this position of the attached, which is more consistent with the rest.

Given the structural complexity of the function, the end of the function seemed like a good place to have an all-path-guaranteed reset. I find it a little bit strange to have the Set at the upper level and the Reset in many… but not all branches, though. For instance, the on_error_rollback_savepoint/svptcmd branch includes a reset long after many other conditional resets; I cannot guess whether the initial set is still active or has been long wiped out and this query is just not cancellable. Also, ISTM that in the worst case a cancellation request is sent to a server which is idle, in which case it will be ignored, so the code should be in no hurry to clean it up, at least not at the price of code clarity. Anyway, the place you suggest seems ok.

-- Fabien.
Re: psql - add SHOW_ALL_RESULTS option
Hello Tom,

It's right: this is dead code because all paths through the if-nest starting at line 1373 now leave results = NULL. Hence, this patch has broken the autocommit logic;

Do you mean yet another feature without a single non-regression test? :-( I tend to rely on non-regression tests to catch bugs in complex multi-purpose hard-to-maintain functions when the code is modified. I have submitted a patch to improve psql coverage to about 90%, but given the lack of enthusiasm, I simply dropped it. Not sure I was right not to insist.

it's no longer possible to tell whether we should do anything with our savepoint. Between this and the known breakage of control-C, it seems clear to me that this patch was nowhere near ready for prime time. I think shoving it in on the last day before feature freeze was ill-advised, and it ought to be reverted. We can try again later.

The patch has been asleep for quite a while, and was resurrected, possibly too late in the process. ISTM that fixing it for 14 is manageable, but this is not my call.

-- Fabien.