Re: Enhanced error message to include hint messages for redundant options error
On Sun, May 2, 2021 at 4:27 AM Alvaro Herrera wrote:
>
> On 2021-May-01, Bharath Rupireddy wrote:
>
> > IMO, it's not good to change the function API just for showing up
> > parse_position (which is there for cosmetic reasons I feel) in an error
> > which actually has the option name clearly mentioned in the error message.
>
> Why not?  We've done it before, I'm sure you can find examples in the
> git log.  The function API is not that critical -- these functions are
> mostly only called from ProcessUtility and friends.

I feel it is better to include it wherever possible.

Regards,
Vignesh
Re: Enhanced error message to include hint messages for redundant options error
On Sun, May 2, 2021 at 18:31, vignesh C wrote:
> On Sun, May 2, 2021 at 4:27 AM Alvaro Herrera wrote:
> >
> > On 2021-May-01, Bharath Rupireddy wrote:
> >
> > > IMO, it's not good to change the function API just for showing up
> > > parse_position (which is there for cosmetic reasons I feel) in an error
> > > which actually has the option name clearly mentioned in the error message.
> >
> > Why not?  We've done it before, I'm sure you can find examples in the
> > git log.  The function API is not that critical -- these functions are
> > mostly only called from ProcessUtility and friends.
>
> I feel it is better to include it wherever possible.

+1
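For illustration, a hedged sketch of the behavior under discussion: with a
ParseState available, the redundant-option error can carry a cursor
position. The message text follows the patch; the cursor placement shown
here is an assumption, not captured output.

    CREATE PUBLICATION mypub FOR ALL TABLES
        WITH (publish = 'insert', publish = 'update');
    ERROR:  option "publish" specified more than once
    LINE 2:     WITH (publish = 'insert', publish = 'update');
                                          ^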
Re: Enhanced error message to include hint messages for redundant options error
On Sat, May 1, 2021 at 10:54 PM Alvaro Herrera wrote:
>
> On 2021-May-01, vignesh C wrote:
>
> > On Thu, Apr 29, 2021 at 10:44 PM Alvaro Herrera wrote:
> > >
> > > On 2021-Apr-29, vignesh C wrote:
> > >
> > > > Thanks for the comments, please find the attached v3 patch which has
> > > > the change for the first part.
> > >
> > > Looks good to me.  I would only add parser_errposition() to the few
> > > error sites missing that.
> >
> > I have not included parser_errposition as ParseState was not available
> > for these errors.
>
> Yeah, it's tough to do that in a few of those such as validator
> functions, and I don't think we'd want to do that.  However there are
> some cases where we can easily add the parsestate as an argument -- for
> example CreatePublication can get it in ProcessUtilitySlow and pass it
> down to parse_publication_options; likewise for ExecuteDoStmt.  I didn't
> check other places.

Thanks for the comments. I have made this change in most places, except
for a few like plugin functions, internal commands, and changes that
would have required modifying more levels of function callers. The
attached patch has the changes for the same. Thoughts?

Regards,
Vignesh

From 7c56cea92947b82107e2298f6d48d934df6fd7d8 Mon Sep 17 00:00:00 2001
From: vignesh
Date: Mon, 26 Apr 2021 18:40:36 +0530
Subject: [PATCH v5] Enhance error message.

Enhanced error message, so that the user can easily identify the error.
---
 contrib/file_fdw/file_fdw.c                 |  20 +--
 src/backend/catalog/aclchk.c                |  22 ++--
 src/backend/commands/copy.c                 |  66 --
 src/backend/commands/dbcommands.c           |  90 +-
 src/backend/commands/extension.c            |  27 ++--
 src/backend/commands/foreigncmds.c          |  30 +++--
 src/backend/commands/functioncmds.c         |  55
 src/backend/commands/publicationcmds.c      |  39 +++---
 src/backend/commands/sequence.c             |  57 +++--
 src/backend/commands/subscriptioncmds.c     |  75 +--
 src/backend/commands/tablecmds.c            |   2 +-
 src/backend/commands/typecmds.c             |  38 +++---
 src/backend/commands/user.c                 | 131 +++-
 src/backend/parser/parse_utilcmd.c          |   2 +-
 src/backend/replication/pgoutput/pgoutput.c |  31 +++--
 src/backend/replication/walsender.c         |  23 ++--
 src/backend/tcop/utility.c                  |  20 +--
 src/include/commands/defrem.h               |   6 +-
 src/include/commands/publicationcmds.h      |   4 +-
 src/include/commands/subscriptioncmds.h     |   4 +-
 src/include/commands/typecmds.h             |   2 +-
 src/include/commands/user.h                 |   2 +-
 src/test/regress/expected/copy2.out         |  24 ++--
 src/test/regress/expected/foreign_data.out  |   8 +-
 src/test/regress/expected/publication.out   |   4 +-
 25 files changed, 352 insertions(+), 430 deletions(-)

diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index 2c2f149fb0..f857d5af97 100644
--- a/contrib/file_fdw/file_fdw.c
+++ b/contrib/file_fdw/file_fdw.c
@@ -200,6 +200,7 @@ file_fdw_validator(PG_FUNCTION_ARGS)
 	char	   *filename = NULL;
 	DefElem    *force_not_null = NULL;
 	DefElem    *force_null = NULL;
+	DefElem    *def;
 	List	   *other_options = NIL;
 	ListCell   *cell;
 
@@ -209,7 +210,7 @@ file_fdw_validator(PG_FUNCTION_ARGS)
 	 */
 	foreach(cell, options_list)
 	{
-		DefElem    *def = (DefElem *) lfirst(cell);
+		def = (DefElem *) lfirst(cell);
 
 		if (!is_valid_option(def->defname, catalog))
 		{
@@ -290,10 +291,7 @@ file_fdw_validator(PG_FUNCTION_ARGS)
 		else if (strcmp(def->defname, "force_not_null") == 0)
 		{
 			if (force_not_null)
-				ereport(ERROR,
-						(errcode(ERRCODE_SYNTAX_ERROR),
-						 errmsg("conflicting or redundant options"),
-						 errhint("Option \"force_not_null\" supplied more than once for a column.")));
+				goto duplicate_error;
 			force_not_null = def;
 			/* Don't care what the value is, as long as it's a legal boolean */
 			(void) defGetBoolean(def);
@@ -302,10 +300,7 @@ file_fdw_validator(PG_FUNCTION_ARGS)
 		else if (strcmp(def->defname, "force_null") == 0)
 		{
 			if (force_null)
-				ereport(ERROR,
-						(errcode(ERRCODE_SYNTAX_ERROR),
-						 errmsg("conflicting or redundant options"),
-						 errhint("Option \"force_null\" supplied more than once for a column.")));
+				goto duplicate_error;
 			force_null = def;
 			(void) defGetBoolean(def);
 		}
@@ -328,6 +323,13 @@ file_fdw_validator(PG_FUNCTION_ARGS)
 				 errmsg("either filename or program is required for file_fdw foreign tables")));
 
 	PG_RETURN_VOID();
+
+duplicate_error:
+	ereport(ERROR,
+			(errcode(ERRCODE_SYNTAX_ERROR),
+			 errmsg("option \"%s\" specified more than once", def->defname)));
+	PG_RETURN_VOID();			/* keep compiler quiet */
+
 }
 
 /*
diff --git a/src/backend/catalog/aclchk.c b/src/backend/catalog/aclchk.c
index e1573eb398..be38488934 100644
--- a/src/backend/catalog/aclchk.c
+++ b/src/b
Re: Enhanced error message to include hint messages for redundant options error
On Sat, May 1, 2021 at 9:02 PM Bharath Rupireddy wrote:
>
> On Sat, May 1, 2021 at 7:25 PM vignesh C wrote:
> > > > I'm not attaching above one line change as a patch, maybe Vignesh can
> > > > merge this into the main patch.
> >
> > Thanks for the comments. I have merged the change into the attached patch.
> > Thoughts?
>
> Thanks! v4 basically LGTM. Can we park this in the current commitfest
> if not done already?
>
> Upon looking at the number of places where we have the "option \"%s\"
> specified more than once" error, I now strongly feel that we should
> use the goto duplicate_error approach, like in compute_common_attribute,
> so that we will have only one ereport(ERROR. We can change it in the
> following files: copy.c, dbcommands.c, extension.c,
> compute_function_attributes, sequence.c, subscriptioncmds.c,
> typecmds.c, user.c, walsender.c, pgoutput.c. This will reduce the LOC
> greatly.
>
> Thoughts?

I have made these changes; they are in the v5 patch posted in my
earlier mail.

Regards,
Vignesh
Re: Identify missing publications from publisher while create/alter subscription.
On Sat, May 1, 2021 at 7:58 PM Bharath Rupireddy wrote:
>
> On Sat, May 1, 2021 at 12:49 PM vignesh C wrote:
> > Thanks for the comments, the attached patch has the fixes for the same.
> > Thoughts?
>
> Few more comments on v5:
>
> 1) Deletion of the below empty new line is spurious:
> -
>  /*
>   * Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
>   *

Modified.

> 2) I think we could just do as below to save indentation of the code
> for validate_publication == true.
> static void
> +connect_and_check_pubs(Subscription *sub, List *publications,
> +                       bool validate_publication)
> +{
> +    char *err;
> +
> +    if (validate_publication == false)
> +        return;
> +
> +    /* Load the library providing us libpq calls. */
> +    load_file("libpqwalreceiver", false);

Modified.

> 3) To be consistent, either we pass in validate_publication to both
> connect_and_check_pubs and check_publications, return immediately from
> them if it is false, or do the checks outside. I suggest passing the
> bool parameter to check_publications like you did for
> connect_and_check_pubs. Or remove validate_publication from
> connect_and_check_pubs and do the check outside.
> +    if (validate_publication)
> +        check_publications(wrconn, publications);
> +    if (check_pub)
> +        check_publications(wrconn, sub->publications);

Modified.

> 4) Below line of code is above the 80-char limit:
> +    else if (strcmp(defel->defname, "validate_publication") == 0
> && validate_publication)

Modified.

> 5) Instead of adding a new file 021_validate_publications.pl for
> tests, spawning a new test database which would make the overall
> regression slower, can't we add with the existing database nodes in
> 001_rep_changes.pl? I would suggest adding the tests in there even if
> the number of tests are many, I don't mind.

001_rep_changes.pl has the changes mainly for checking the replicated
data. I did not find an appropriate file in the current tap tests; I
preferred these tests to be in a separate file. Thoughts?

Thanks for the comments. The attached patch has the fixes for the same.

Regards,
Vignesh

From 89a1e47c8be015fac520d2f94d16f86bf78a481c Mon Sep 17 00:00:00 2001
From: vignesh
Date: Wed, 7 Apr 2021 22:05:53 +0530
Subject: [PATCH v6] Identify missing publications from publisher while
 create/alter subscription.

Creating/altering a subscription is successful even when we specify a
publication which does not exist in the publisher. This patch checks
if the specified publications are present in the publisher and throws
an error if any of the publications is missing in the publisher.
---
 doc/src/sgml/ref/alter_subscription.sgml  |  13 +
 doc/src/sgml/ref/create_subscription.sgml |  13 +
 src/backend/commands/subscriptioncmds.c   | 232 +++---
 src/bin/psql/tab-complete.c               |   7 +-
 .../t/021_validate_publications.pl        |  82 +++
 5 files changed, 315 insertions(+), 32 deletions(-)
 create mode 100644 src/test/subscription/t/021_validate_publications.pl

diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 367ac814f4..81e156437b 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -160,6 +160,19 @@ ALTER SUBSCRIPTION name RENAME TO <
+
+    validate_publication (boolean)
+
+      When true, the command verifies if all the specified publications
+      that are being subscribed to are present in the publisher and throws
+      an error if any of the publication doesn't exist. The default is
+      false.
+
+
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index e812beee37..2f1f541253 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -239,6 +239,19 @@ CREATE SUBSCRIPTION subscription_name
+
+    validate_publication (boolean)
+
+      When true, the command verifies if all the specified publications
+      that are being subscribed to are present in the publisher and throws
+      an error if any of the publication doesn't exist. The default is
+      false.
+
+
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 517c8edd3b..d731ba3e14 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -69,7 +69,9 @@ parse_subscription_options(List *options,
 						   char **synchronous_commit,
 						   bool *refresh,
 						   bool *binary_given, bool *binary,
-						   bool *streaming_given, bool *streaming)
+						   bool *streaming_given, bool *streaming,
+
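For reference, a usage sketch of the proposed option, based on the
documentation hunks above and the error message used in the patch's TAP
test (the subscription and publication names are illustrative):

    -- Fails at CREATE time if no_such_pub is absent on the publisher:
    CREATE SUBSCRIPTION mysub
        CONNECTION 'host=publisher dbname=postgres'
        PUBLICATION no_such_pub
        WITH (validate_publication = true);
    ERROR:  publication "no_such_pub" does not exist in the publisher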
Re: Regex performance regression induced by match-all code
"Joel Jacobson" writes: > On Sat, May 1, 2021, at 21:46, Tom Lane wrote: >> I found a nasty performance problem in commit 824bf7190: given the >> right sort of regex, checkmatchall() takes an unreasonable amount >> of time. > Nice catch. > I've successfully tested the patch on the regex corpus: Thanks for testing! >> The main thing I find a bit ugly about this solution is that >> I'm abusing the state->tmp fields (which are declared struct state *) >> to hold the memoization results (which are "bool *"). It'd be >> possible to avoid that by allocating an additional "bool **" array >> indexed by state number, but whether it's worth that depends on how >> allergic you are to weird pointer casts. I tried rewriting it like that, and I have to say I do like it better that way. I think it's clearer, which seems well worth one extra malloc. Also, I poked a little more at the question of the heuristic limit on number of states, by checking the actual numbers of states in various ways of writing matchall regexes. It looks to me like we can cut the limit to DUPINF*2 and still have lots of daylight, because reasonable (and even not so reasonable) ways to write a pattern that can match up to K characters seem to come out with not much more than K states. Hence, PFA v2. I also added a couple of test cases based on looking at code coverage in this area, as well as a case that takes an unreasonably long time without this fix. regards, tom lane diff --git a/src/backend/regex/regc_nfa.c b/src/backend/regex/regc_nfa.c index 77b860cb0f..6d77c59e12 100644 --- a/src/backend/regex/regc_nfa.c +++ b/src/backend/regex/regc_nfa.c @@ -3032,96 +3032,189 @@ analyze(struct nfa *nfa) static void checkmatchall(struct nfa *nfa) { - bool hasmatch[DUPINF + 1]; - int minmatch, -maxmatch, -morematch; + bool **haspaths; + struct state *s; + int i; /* - * hasmatch[i] will be set true if a match of length i is feasible, for i - * from 0 to DUPINF-1. hasmatch[DUPINF] will be set true if every match - * length of DUPINF or more is feasible. + * If there are too many states, don't bother trying to detect matchall. + * This limit serves to bound the time and memory we could consume below. + * Note that even if the graph is all-RAINBOW, if there are significantly + * more than DUPINF states then it's likely that there are paths of length + * more than DUPINF, which would force us to fail anyhow. In practice, + * plausible ways of writing a matchall regex with maximum finite path + * length K tend not to have very many more than K states. */ - memset(hasmatch, 0, sizeof(hasmatch)); + if (nfa->nstates > DUPINF * 2) + return; /* - * Recursively search the graph for all-RAINBOW paths to the "post" state, - * starting at the "pre" state. The -1 initial depth accounts for the - * fact that transitions out of the "pre" state are not part of the - * matched string. We likewise don't count the final transition to the - * "post" state as part of the match length. (But we still insist that - * those transitions have RAINBOW arcs, otherwise there are lookbehind or - * lookahead constraints at the start/end of the pattern.) + * First, scan all the states to verify that only RAINBOW arcs appear, + * plus pseudocolor arcs adjacent to the pre and post states. This lets + * us quickly eliminate most cases that aren't matchall NFAs. 
*/ - if (!checkmatchall_recurse(nfa, nfa->pre, false, -1, hasmatch)) - return; + for (s = nfa->states; s != NULL; s = s->next) + { + struct arc *a; + + for (a = s->outs; a != NULL; a = a->outchain) + { + if (a->type != PLAIN) +return; /* any LACONs make it non-matchall */ + if (a->co != RAINBOW) + { +if (nfa->cm->cd[a->co].flags & PSEUDO) +{ + /* + * Pseudocolor arc: verify it's in a valid place (this + * seems quite unlikely to fail, but let's be sure). + */ + if (s == nfa->pre && + (a->co == nfa->bos[0] || a->co == nfa->bos[1])) + /* okay BOS/BOL arc */ ; + else if (a->to == nfa->post && + (a->co == nfa->eos[0] || a->co == nfa->eos[1])) + /* okay EOS/EOL arc */ ; + else + return; /* unexpected pseudocolor arc */ + /* We'll check these arcs some more below. */ +} +else + return; /* any other color makes it non-matchall */ + } + } + /* Also, assert that the tmp fields are available for use. */ + assert(s->tmp == NULL); + } /* - * We found some all-RAINBOW paths, and not anything that we couldn't - * handle. Now verify that pseudocolor arcs adjacent to the pre and post - * states match the RAINBOW arcs there. (We could do this while - * recursing, but it's expensive and unlikely to fail, so do it last.) + * The next cheapest check we can make is to verify that the BOS/BOL + * outarcs of the pre state reach the same states as its RAINBOW outarcs. + * If they don't, the NFA expresses some constraints on the character + * before th
Re: websearch_to_tsquery() returns queries that don't match to_tsvector()
Hi!

On Mon, Apr 19, 2021 at 9:57 AM Valentin Gatien-Baron wrote:
> Looking at the tsvector and tsquery, we can see that the problem is
> that the ":" counts as one position for the ts_query but not the
> ts_vector:
>
> select to_tsvector('english', 'aaa: bbb'), websearch_to_tsquery('english', '"aaa: bbb"');
>    to_tsvector   | websearch_to_tsquery
> -----------------+----------------------
>  'aaa':1 'bbb':2 | 'aaa' <2> 'bbb'
> (1 row)

It seems there is another bug with phrase search and query parsing.
It seems to me that since 0c4f355c6a websearch_to_tsquery() should
just parse text in quotes as a single token.  Besides fixing this bug,
it simplifies the code.

Trying to fix this bug before 0c4f355c6a doesn't seem to be worth the
effort.

I propose to push the attached patch to v14.  Objections?

--
Regards,
Alexander Korotkov

0001-Make-websearch_to_tsquery-parse-text-in-quotes-as-a-.patch
Description: Binary data
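The mismatch can be checked directly; a minimal sketch (the pre-fix
result follows from the report above; the post-fix result is the expected
behavior under the proposed patch, not verified output):

    select to_tsvector('english', 'aaa: bbb') @@
           websearch_to_tsquery('english', '"aaa: bbb"');
    -- false before the fix ('aaa' <2> 'bbb' cannot match positions 1 and 2);
    -- expected to be true once quoted text is parsed as a single token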
Re: websearch_to_tsquery() returns queries that don't match to_tsvector()
Alexander Korotkov writes:
> It seems there is another bug with phrase search and query parsing.
> It seems to me that since 0c4f355c6a websearch_to_tsquery() should
> just parse text in quotes as a single token.  Besides fixing this bug,
> it simplifies the code.

OK ...

> Trying to fix this bug before 0c4f355c6a doesn't seem to be worth the effort.

Agreed, plus it doesn't sound like the sort of behavior change that
we want to push out in minor releases.

> I propose to push the attached patch to v14.  Objections?

This patch seems to include some unrelated fooling around in GiST?

			regards, tom lane
Re: websearch_to_tsquery() returns queries that don't match to_tsvector()
On Sun, May 2, 2021 at 8:52 PM Tom Lane wrote:
> Alexander Korotkov writes:
> > It seems there is another bug with phrase search and query parsing.
> > It seems to me that since 0c4f355c6a websearch_to_tsquery() should
> > just parse text in quotes as a single token.  Besides fixing this bug,
> > it simplifies the code.
>
> OK ...
>
> > Trying to fix this bug before 0c4f355c6a doesn't seem to be worth the effort.
>
> Agreed, plus it doesn't sound like the sort of behavior change that
> we want to push out in minor releases.

+1

> > I propose to push the attached patch to v14.  Objections?
>
> This patch seems to include some unrelated fooling around in GiST?

Ooops, I've included this by oversight.  The next revision is attached.

Anything besides that?

--
Regards,
Alexander Korotkov

0001-Make-websearch_to_tsquery-parse-text-in-quotes-as-v2.patch
Description: Binary data
Re: websearch_to_tsquery() returns queries that don't match to_tsvector()
Alexander Korotkov writes:
> Ooops, I've included this by oversight.  The next revision is attached.
> Anything besides that?

Some quick eyeball review:

+		/* Everything is quotes is processed as a single token */

Should read "Everything in quotes ..."

-		/* or else gettoken_tsvector() will raise an error */
+		/* or else ƒtsvector() will raise an error */

Looks like an unintentional change?

@@ -846,7 +812,6 @@ parse_tsquery(char *buf,
 	state.buffer = buf;
 	state.buf = buf;
 	state.count = 0;
-	state.in_quotes = false;
 	state.state = WAITFIRSTOPERAND;
 	state.polstr = NIL;

This change seems wrong/unsafe too.

			regards, tom lane
Re: websearch_to_tsquery() returns queries that don't match to_tsvector()
On Sun, May 2, 2021 at 10:57 AM Alexander Korotkov wrote:
> On Sun, May 2, 2021 at 8:52 PM Tom Lane wrote:
> > Alexander Korotkov writes:
> > > It seems there is another bug with phrase search and query parsing.
> > > It seems to me that since 0c4f355c6a websearch_to_tsquery() should
> > > just parse text in quotes as a single token.  Besides fixing this bug,
> > > it simplifies the code.
> >
> > OK ...
> >
> > > Trying to fix this bug before 0c4f355c6a doesn't seem to be worth the effort.
> >
> > Agreed, plus it doesn't sound like the sort of behavior change that
> > we want to push out in minor releases.
>
> +1
>
> > > I propose to push the attached patch to v14.  Objections?
> >
> > This patch seems to include some unrelated fooling around in GiST?
>
> Ooops, I've included this by oversight.  The next revision is attached.
>
> Anything besides that?
>
> --
> Regards,
> Alexander Korotkov

Hi,

+		/* Everything is quotes is processed as a single token */

is quotes -> in quotes

+		/* iterate to the closing quotes or end of the string*/

closing quotes -> closing quote

+		/* or else ƒtsvector() will raise an error */

The character before tsvector() seems to be special.

Cheers
Re: websearch_to_tsquery() returns queries that don't match to_tsvector()
On Sun, May 2, 2021 at 9:04 PM Tom Lane wrote:
> Alexander Korotkov writes:
> > Ooops, I've included this by oversight.  The next revision is attached.
> > Anything besides that?
>
> Some quick eyeball review:
>
> +		/* Everything is quotes is processed as a single token */
>
> Should read "Everything in quotes ..."
>
> -		/* or else gettoken_tsvector() will raise an error */
> +		/* or else ƒtsvector() will raise an error */
>
> Looks like an unintentional change?

Thank you for catching this!

> @@ -846,7 +812,6 @@ parse_tsquery(char *buf,
>  	state.buffer = buf;
>  	state.buf = buf;
>  	state.count = 0;
> -	state.in_quotes = false;
>  	state.state = WAITFIRSTOPERAND;
>  	state.polstr = NIL;
>
> This change seems wrong/unsafe too.

It seems OK, because this patch removes the in_quotes field altogether.
We don't have to know in the state whether we are in quotes, since we
process everything in quotes as a single token.

--
Regards,
Alexander Korotkov

0001-Make-websearch_to_tsquery-parse-text-in-quotes-as-v3.patch
Description: Binary data
Re: websearch_to_tsquery() returns queries that don't match to_tsvector()
On Sun, May 2, 2021 at 9:06 PM Zhihong Yu wrote:
> +		/* Everything is quotes is processed as a single token */
>
> is quotes -> in quotes
>
> +		/* iterate to the closing quotes or end of the string*/
>
> closing quotes -> closing quote
>
> +		/* or else ƒtsvector() will raise an error */
>
> The character before tsvector() seems to be special.

Thank you for catching.  Fixed in v3.

--
Regards,
Alexander Korotkov
Re: websearch_to_tsquery() returns queries that don't match to_tsvector()
On Sun, May 2, 2021 at 11:12 AM Alexander Korotkov wrote:
> On Sun, May 2, 2021 at 9:06 PM Zhihong Yu wrote:
> > +		/* Everything is quotes is processed as a single token */
> >
> > is quotes -> in quotes
> >
> > +		/* iterate to the closing quotes or end of the string*/
> >
> > closing quotes -> closing quote
> >
> > +		/* or else ƒtsvector() will raise an error */
> >
> > The character before tsvector() seems to be special.
>
> Thank you for catching.  Fixed in v3.
>
> --
> Regards,
> Alexander Korotkov

Hi,

One minor comment:

+		/* iterate to the closing quotes or end of the string*/

closing quotes -> closing quote

Cheers
Re: websearch_to_tsquery() returns queries that don't match to_tsvector()
On Sun, May 2, 2021 at 9:17 PM Zhihong Yu wrote:
> One minor comment:
>
> +		/* iterate to the closing quotes or end of the string*/
>
> closing quotes -> closing quote

Yep, I've missed the third place to change from plural to singular form :)

--
Regards,
Alexander Korotkov

0001-Make-websearch_to_tsquery-parse-text-in-quotes-as-v4.patch
Description: Binary data
Re: websearch_to_tsquery() returns queries that don't match to_tsvector()
Alexander Korotkov writes:
> On Sun, May 2, 2021 at 9:04 PM Tom Lane wrote:
>> -	state.in_quotes = false;
>>
>> This change seems wrong/unsafe too.

> It seems OK, because this patch removes the in_quotes field altogether.

Oh, sorry, I misread the patch --- I thought that earlier hunk
was removing a local variable.  Agreed, if you can do without this
state field altogether, that's fine.

			regards, tom lane
Re: websearch_to_tsquery() returns queries that don't match to_tsvector()
On Sun, May 2, 2021 at 9:37 PM Tom Lane wrote:
> Alexander Korotkov writes:
> > On Sun, May 2, 2021 at 9:04 PM Tom Lane wrote:
> >> -	state.in_quotes = false;
> >>
> >> This change seems wrong/unsafe too.
>
> > It seems OK, because this patch removes the in_quotes field altogether.
>
> Oh, sorry, I misread the patch --- I thought that earlier hunk
> was removing a local variable.  Agreed, if you can do without this
> state field altogether, that's fine.

OK, thank you for the review!

--
Regards,
Alexander Korotkov
pg_upgrade not preserving comments on predefined roles
Hi hackers,

I recently did a pg_upgrade to 13 at $work, and noticed it did not
preserve the comments I had added locally on the pg_* predefined roles.

We have a bgworker that runs periodically and makes a report of existing
roles, memberships, and grants, showing the comments on the roles, so I
had added comments on the predefined ones so they would not look naked
and unexplained in the report.  All I had to do was go back to a
pre-upgrade version of the report and re-add the comments.

Is there an inherent technical or policy reason for pg_upgrade not to
preserve comments on predefined roles (or on predefined objects
generally)?

For that matter, would it be objectionable for the predefined roles to
come with comments right out of the box?

I guess one possible objection could be "what next? comments on
everything in pg_catalog?", but perhaps there is a way to distinguish
the case of predefined roles: they are a relatively recent, um,
encroachment into a namespace traditionally managed by the admin, so
maybe there's that much extra reason for them to come with explanations
attached.

Another objection might be that they'd presumably be subject to
translation, and would need some way for initdb to install the proper
localized versions.  So maybe it is simpler to leave them uncommented by
default, but perhaps desirable for pg_upgrade to preserve comments
locally added.

I've appended the comments we use for them at $work, anyway.

Regards,
-Chap


COMMENT ON ROLE pg_execute_server_program IS
  'Allow executing programs on the database server as the user the
   database runs as with COPY and other functions which allow executing
   a server-side program. Since PG 11.';

COMMENT ON ROLE pg_monitor IS
  'Read/execute various monitoring views and functions. This role is a
   member of pg_read_all_settings, pg_read_all_stats and
   pg_stat_scan_tables. Since PG 10.';

COMMENT ON ROLE pg_read_all_settings IS
  'Read all configuration variables, even those normally visible only
   to superusers. Since PG 10.';

COMMENT ON ROLE pg_read_all_stats IS
  'Read all pg_stat_* views and use various statistics related
   extensions, even those normally visible only to superusers.
   Since PG 10.';

COMMENT ON ROLE pg_read_server_files IS
  'Allow reading files from any location the database user can access
   on the server with COPY and other file-access functions.
   Since PG 11.';

COMMENT ON ROLE pg_signal_backend IS
  'Send signals to other backends (eg: cancel query, terminate).
   Since PG 9.6.';

COMMENT ON ROLE pg_stat_scan_tables IS
  'Execute monitoring functions that may take ACCESS SHARE locks on
   tables, potentially for a long time. Since PG 10.';

COMMENT ON ROLE pg_write_server_files IS
  'Allow writing to files in any location the database user can access
   on the server with COPY and other file-access functions.
   Since PG 11.';
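For anyone reproducing the report, a hedged sketch of how such comments
can be listed (role comments live in the shared description catalog, so
shobj_description() is the relevant lookup function):

    -- List predefined roles together with any comments attached to them.
    SELECT rolname,
           pg_catalog.shobj_description(oid, 'pg_authid') AS comment
    FROM pg_catalog.pg_roles
    WHERE rolname LIKE 'pg\_%'
    ORDER BY rolname;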
Re: pg_upgrade not preserving comments on predefined roles
Chapman Flack writes:
> Is there an inherent technical or policy reason for pg_upgrade not to
> preserve comments on predefined roles (or on predefined objects generally)?

I think this is absolutely out of scope for pg_dump.  We generally
expect that system objects' properties are not dumped, because they
might be different in a newer version, and overwriting the system
definition with a possibly-obsolete version would be a bad thing.
You could quibble about comments being a different matter, but I
don't buy it.

Also, our one venture into this space (allowing custom modifications
of system-object privileges to be propagated by pg_dump) has IMV been
an unmitigated disaster.  Years later, it *still* has unresolved bugs
and definitional issues.  So I'm going to run away screaming from any
proposal to do likewise for other object properties.

> For that matter, would it be objectionable for the predefined roles to
> come with comments right out of the box?

That, however, seems reasonable enough.  We deliver built-in functions
and operators with comments, so why not roles?

> Another objection might be that they'd presumably be subject to translation,
> and would need some way for initdb to install the proper localized versions.

We've not worried about that for functions/operators.

> I've appended the comments we use for them at $work, anyway.

IMO these would have to be shortened quite a bit to be friendly for
"\du+" displays.  I'm not against the concept though.

			regards, tom lane
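For illustration only, hypothetical shortened forms of the kind suggested
above (the wording is an assumption, not a proposal from the thread):

    COMMENT ON ROLE pg_monitor IS
      'Read/execute various monitoring views and functions. Since PG 10.';
    COMMENT ON ROLE pg_signal_backend IS
      'Send signals to other backends (e.g. cancel query). Since PG 9.6.';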
Re: WIP: WAL prefetch (another approach)
On Thu, Apr 29, 2021 at 12:24 PM Tom Lane wrote:
> Andres Freund writes:
> > On 2021-04-28 19:24:53 -0400, Tom Lane wrote:
> >> IOW, we've spent over twice as many CPU cycles shipping data to the
> >> standby as we did in applying the WAL on the standby.
>
> > I don't really know how the time calculation works on mac. Is there a
> > chance it includes time spent doing IO?

For comparison, on a modern Linux system I see numbers like this, while
running that 025_stream_rep_regress.pl test I posted in a nearby thread:

USER     PID     %CPU %MEM   VSZ   RSS TTY STAT START TIME COMMAND
tmunro   2150863 22.5  0.0 55348  6752 ?   Ss   12:59 0:07 postgres: standby_1: startup recovering 00010002003C
tmunro   2150867 17.5  0.0 55024  6364 ?   Ss   12:59 0:05 postgres: standby_1: walreceiver streaming 2/3C675D80
tmunro   2150868 11.7  0.0 55296  7192 ?   Ss   12:59 0:04 postgres: primary: walsender tmunro [local] streaming 2/3C675D80

Those ratios are better but it's still hard work, and perf shows the
CPU time is all in page cache schlep:

  22.44%  postgres  [kernel.kallsyms]  [k] copy_user_enhanced_fast_string
  20.12%  postgres  [kernel.kallsyms]  [k] __add_to_page_cache_locked
   7.30%  postgres  [kernel.kallsyms]  [k] iomap_set_page_dirty

That was with all three patches reverted, so it's nothing new.
Definitely room for improvement... there have been a few discussions
about not using a buffered file for high-frequency data exchange and
relaxing various timing rules, which we should definitely look into,
but I wouldn't be at all surprised if HFS+ was just much worse at this.

Thinking more about good old HFS+... I guess it's remotely possible
that there might have been coherency bugs in it that could be exposed
by our usage pattern, but then that doesn't fit too well with the clues
I have from light reading: this is a non-SMP system, and it's said that
HFS+ used to serialise pretty much everything on big filesystem locks
anyway.
Re: Remove redundant variable from transformCreateStmt
On Fri, Apr 30, 2021 at 10:49 AM Bharath Rupireddy wrote:
>
> On Fri, Apr 30, 2021 at 7:07 AM Justin Pryzby wrote:
> >
> > On Thu, Apr 29, 2021 at 02:39:42PM -0400, Alvaro Herrera wrote:
> > > On 2021-Apr-29, Tom Lane wrote:
> > > > Alvaro Herrera writes:
> > > > > I'd do it like this.  Note I removed an if/else block in addition
> > > > > to your changes.
> > > > > I couldn't convince myself that this is worth pushing though;
> > > > > either we push it to all branches (which seems unwarranted) or we
> > > > > create back-patching hazards.
> > > >
> > > > Yeah ... an advantage of the if/else coding is that it'd likely be
> > > > simple to extend to cover additional statement types, should we ever
> > > > wish to do that.  The rendering you have here is nice and compact,
> > > > but it would not scale up well.
> > >
> > > That makes sense.  But that part is not in Amul's patch -- he was only
> > > on about removing the is_foreign_table Boolean.  If I remove the
> > > if/else block change, does the rest of the patch look like something
> > > we'd want to have?  I kinda agree that the redundant variable is
> > > "ugly".  Is it worth removing?  My hunch is no.
> >
> > Getting rid of a redundant, boolean variable is good not because it's
> > more efficient but because it's one fewer LOC to read and maintain (and
> > an opportunity for inconsistency, I suppose).
>
> Yes.
>
> > Also, this is a roundabout and too-verbose way to invert a boolean:
> > | transformCheckConstraints(&cxt, !is_foreign_table ? true : false);
>
> I agree to remove only the redundant variable, is_foreign_table, but
> not the if/else block, as Tom said it's not scalable.

+1.

Regards,
Amul
Logical Replication - behavior of TRUNCATE ... CASCADE
Hi,

In apply_handle_truncate, the following comment before
ExecuteTruncateGuts says that it defaults to RESTRICT even if the
CASCADE option has been specified in the publisher's TRUNCATE command:

    /*
     * Even if we used CASCADE on the upstream primary we explicitly default
     * to replaying changes without further cascading. This might be later
     * changeable with a user specified option.
     */

I tried the following use case to see if that's actually true (a
runnable sketch follows this message):

1) Created two tables, tbl_pk (primary key) and tbl_fk (references
   tbl_pk's primary key via a foreign key), on both publisher and
   subscriber.
2) In general, TRUNCATE tbl_pk; or TRUNCATE tbl_pk RESTRICT; would fail
   because tbl_fk is dependent on tbl_pk.
3) TRUNCATE tbl_pk, tbl_fk; would work because the dependent table is
   specified in the command.
4) TRUNCATE tbl_pk CASCADE; would work because of the CASCADE option,
   and both tbl_pk and tbl_fk are truncated.  When this command is run
   on the publisher, the CASCADE option is sent to the subscriber (see
   DecodeTruncate), but the apply worker ignores it and passes
   DROP_RESTRICT to ExecuteTruncateGuts.  Therefore, the expectation
   (per the comment) is that on the subscriber the behavior should be
   equivalent to TRUNCATE tbl_pk;, so an error is expected.  But we are
   also receiving tbl_fk in the remote rels along with tbl_pk, so the
   behavior is equivalent to (3) and both tbl_pk and tbl_fk are
   truncated.

Does the comment still hold true?  Does ignoring the CASCADE option
make sense in apply_handle_truncate, as we are receiving all the
dependent relations in the remote rels from the publisher?  Am I
missing something?

The commit id of the feature "Logical replication support for TRUNCATE"
is 039eb6e92f, and I'm adding the relevant people in cc.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
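A minimal sketch of the scenario described above, assuming a working
publication/subscription pair already replicates both tables (object
names are illustrative):

    -- On both publisher and subscriber:
    CREATE TABLE tbl_pk (a int PRIMARY KEY);
    CREATE TABLE tbl_fk (a int PRIMARY KEY REFERENCES tbl_pk);

    -- On the publisher:
    TRUNCATE tbl_pk;          -- (2) fails: tbl_fk references tbl_pk
    TRUNCATE tbl_pk, tbl_fk;  -- (3) works: the dependent table is listed
    TRUNCATE tbl_pk CASCADE;  -- (4) works, and both relations are sent
                              --     to the subscriber in the remote rels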
Re: Identify missing publications from publisher while create/alter subscription.
On Sun, May 2, 2021 at 10:04 PM vignesh C wrote:
>
> Thanks for the comments.
> The attached patch has the fixes for the same.

I was reviewing the documentation part; I think the paragraph below
should mention validate_publication as well:

    connect (boolean)

        Specifies whether the CREATE SUBSCRIPTION should connect to the
        publisher at all.  Setting this to false will change default
        values of enabled, create_slot and copy_data to false.

I will review/test the other parts of the patch and let you know.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Re: Replication slot stats misgivings
On Thu, Apr 29, 2021 at 10:37 AM Amit Kapila wrote:
>
> On Wed, Apr 28, 2021 at 7:43 PM Masahiko Sawada wrote:
> >
> > On Wed, Apr 28, 2021 at 3:25 PM Amit Kapila wrote:
> > >
> > > I am not sure if any of these alternatives are a good idea. What do
> > > you think? Do you have any other ideas for this?
> >
> > I've been considering some ideas but haven't come up with a good one
> > yet.  It's just an idea and not tested, but how about having
> > CreateDecodingContext() register a before_shmem_exit() callback with
> > the decoding context to ensure that we send slot stats even on
> > interruption, and FreeDecodingContext() cancel the callback?
>
> Is it a good idea to send stats while exiting and rely on the same?  I
> think before_shmem_exit is mostly used for cleanup purposes, so I'm not
> sure if we can rely on it for this purpose.  I think we can't be sure
> that in all cases we will send all stats, so maybe Vignesh's patch is
> sufficient to cover the cases where we would otherwise lose a large
> amount of data.

Sawada-San, any thoughts on this point?

Apart from this, I think you suggested somewhere in this thread to
slightly update the description of stream_bytes.  I would like to update
the descriptions of stream_bytes and total_bytes as below:

stream_bytes
    Amount of transaction data decoded for streaming in-progress
    transactions to the decoding output plugin while decoding changes
    from WAL for this slot.  This and other streaming counters for this
    slot can be used to tune logical_decoding_work_mem.

total_bytes
    Amount of transaction data decoded for sending transactions to the
    decoding output plugin while decoding changes from WAL for this
    slot.  Note that this includes data that is streamed and/or spilled.

This update considers two points:
a. we don't send this data across the network, because the plugin might
   decide to filter it out, e.g., based on publications.
b. not all of the decoded changes are sent to the plugin; consider
   REORDER_BUFFER_CHANGE_INTERNAL_COMMAND_ID,
   REORDER_BUFFER_CHANGE_INTERNAL_SNAPSHOT, etc.

--
With Regards,
Amit Kapila.
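For reference, these counters are exposed in the pg_stat_replication_slots
view; a sketch of how one might compare them when tuning
logical_decoding_work_mem:

    -- Compare streamed/spilled volume against the total decoded volume.
    SELECT slot_name, total_bytes, stream_bytes, spill_bytes
    FROM pg_stat_replication_slots;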
Re: Replication slot stats misgivings
On Fri, Apr 30, 2021 at 1:47 PM Amit Kapila wrote:
>
> LGTM. I have slightly edited the comments in the attached. I'll push
> this early next week unless there are more comments.

Pushed.

--
With Regards,
Amit Kapila.
Re: Identify missing publications from publisher while create/alter subscription.
On Sun, May 2, 2021 at 10:04 PM vignesh C wrote:
> > 5) Instead of adding a new file 021_validate_publications.pl for
> > tests, spawning a new test database which would make the overall
> > regression slower, can't we add with the existing database nodes in
> > 001_rep_changes.pl? I would suggest adding the tests in there even if
> > the number of tests are many, I don't mind.
>
> 001_rep_changes.pl has the changes mainly for checking the replicated
> data. I did not find an appropriate file in the current tap tests; I
> preferred these tests to be in a separate file. Thoughts?

If 001_rep_changes.pl is not the right place, how about adding them
into 007_ddl.pl?  That file seems to be only for DDL changes, and since
the feature's test cases are for CREATE/ALTER SUBSCRIPTION, it's the
right place.  I strongly feel that we don't need a new file for these
tests.

Comment on the tests:

1) Instead of the "pub_doesnt_exist" name, how about "non_existent_pub",
just "pub_non_existent", or some other name?

+    "CREATE SUBSCRIPTION testsub2 CONNECTION '$publisher_connstr' PUBLICATION pub_doesnt_exist WITH (VALIDATE_PUBLICATION = TRUE)"

The error message with this name looks a bit odd to me.

+    /ERROR: publication "pub_doesnt_exist" does not exist in the publisher/,

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
Re: WIP: WAL prefetch (another approach)
On Sun, May 2, 2021 at 3:16 PM Tom Lane wrote:
> That last point means that there was some hard-to-hit problem even
> before any of the recent WAL-related changes.  However, 323cbe7c7
> (Remove read_page callback from XLogReader) increased the failure
> rate by at least a factor of 5, and 1d257577e (Optionally prefetch
> referenced data) seems to have increased it by another factor of 4.
> But it looks like f003d9f87 (Add circular WAL decoding buffer)
> didn't materially change the failure rate.

Oh, wow.  There are several surprising results there.  Thanks for
running those tests for so long so that we could see the rarest
failures.

Even if there are somehow *two* causes of corruption, one preexisting
and one added by the refactoring or decoding patches, I'm struggling
to understand how the chance increases with 1d2575, since that only
adds code that isn't reached when not enabled (though I'm going to
re-review that).

> Considering that 323cbe7c7 was supposed to be just refactoring,
> and 1d257577e is allegedly disabled-by-default, these are surely
> not the results I was expecting to get.

+1

> It seems like it's still an open question whether all this is
> a real bug, or flaky hardware.  I have seen occasional kernel
> freezeups (or so I think -- machine stops responding to keyboard
> or network input) over the past year or two, so I cannot in good
> conscience rule out the flaky-hardware theory.  But it doesn't
> smell like that kind of problem to me.  I think what we're looking
> at is a timing-sensitive bug that was there before (maybe long
> before?) and these commits happened to make it occur more often
> on this particular hardware.  This hardware is enough unlike
> anything made in the past decade that it's not hard to credit
> that it'd show a timing problem that nobody else can reproduce.

Hmm, yeah, that does seem plausible.  It would be nice to see a report
from any other system though.  I'm still trying, and reviewing...
Re: Dump public schema ownership & seclabels
On Sat, May 01, 2021 at 09:43:36AM -0700, Zhihong Yu wrote:
> On Sat, May 1, 2021 at 8:16 AM Asif Rehman wrote:
> > The following review has been posted through the commitfest application:
> > make installcheck-world:  tested, passed
> > Implements feature:       tested, passed
> > Spec compliant:           tested, passed
> > Documentation:            not tested
> >
> > Hi,
> >
> > I have tested this patch. This patch still applies and it works well.
> >
> > Regards,
> > Asif
> >
> > The new status of this patch is: Ready for Committer

Thanks.  Later, I saw that "pg_dump --schema=public" traditionally has
yielded "CREATE SCHEMA public" and "COMMENT ON SCHEMA public".  I've
updated the patches to preserve that behavior.  I'll push this when v15
branches.  I do think it's a bug fix and could argue for including it
in v14.  On the other hand, I mailed three total patch versions now
known to be wrong, so it would be imprudent to count on no surprises
remaining.

> For public-schema-comment-dump-v2.patch :
>
> +    if (ncomments == 0)
> +    {
> +        comments = &empty_comment;
> +        ncomments = 1;
> +    }
> +    else if (strcmp(comments->descr, (fout->remoteVersion >= 80300 ?
> +                                      "standard public schema" :
> +                                      "Standard public schema")) == 0)
> +    {
> +        ncomments = 0;
>
> Is it possible that, in the case ncomments > 0, there are more than one
> comment?

Yes, I think that's normal when the search terms include an objsubid
(subid != InvalidOid).

Author:     Noah Misch
Commit:     Noah Misch

    Dump public schema ownership and security labels.

    As a side effect, this corrects dumps of public schema ACLs in
    databases where the DBA dropped and recreated that schema.

    Reviewed by Asif Rehman.

    Discussion: https://postgr.es/m/20201229134924.ga1431...@rfd.leadboat.com

diff --git a/src/bin/pg_dump/dumputils.c b/src/bin/pg_dump/dumputils.c
index 60d306e..ea67e52 100644
--- a/src/bin/pg_dump/dumputils.c
+++ b/src/bin/pg_dump/dumputils.c
@@ -725,6 +725,7 @@ void
 buildACLQueries(PQExpBuffer acl_subquery, PQExpBuffer racl_subquery,
 				PQExpBuffer init_acl_subquery, PQExpBuffer init_racl_subquery,
 				const char *acl_column, const char *acl_owner,
+				const char *initprivs_expr,
 				const char *obj_kind, bool binary_upgrade)
 {
 	/*
@@ -765,23 +766,25 @@ buildACLQueries(PQExpBuffer acl_subquery, PQExpBuffer racl_subquery,
 					  "WITH ORDINALITY AS perm(acl,row_n) "
 					  "WHERE NOT EXISTS ( "
 					  "SELECT 1 FROM "
-					  "pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault(%s,%s))) "
+					  "pg_catalog.unnest(coalesce(%s,pg_catalog.acldefault(%s,%s))) "
 					  "AS init(init_acl) WHERE acl = init_acl)) as foo)",
 					  acl_column,
 					  obj_kind,
 					  acl_owner,
+					  initprivs_expr,
 					  obj_kind,
 					  acl_owner);
 
 	printfPQExpBuffer(racl_subquery,
 					  "(SELECT pg_catalog.array_agg(acl ORDER BY row_n) FROM "
 					  "(SELECT acl, row_n FROM "
-					  "pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault(%s,%s))) "
+					  "pg_catalog.unnest(coalesce(%s,pg_catalog.acldefault(%s,%s))) "
 					  "WITH ORDINALITY AS initp(acl,row_n) "
 					  "WHERE NOT EXISTS ( "
 					  "SELECT 1 FROM "
 					  "pg_catalog.unnest(coalesce(%s,pg_catalog.acldefault(%s,%s))) "
 					  "AS permp(orig_acl) WHERE acl = orig_acl)) as foo)",
+					  initprivs_expr,
 					  obj_kind,
 					  acl_owner,
 					  acl_column,
@@ -807,12 +810,13 @@ buildACLQueries(PQExpBuffer acl_subquery, PQExpBuffer racl_subquery,
 	printfPQExpBuffer(init_acl_subquery,
 					  "CASE WHEN privtype = 'e' THEN "
 					  "(SELECT pg_catalog.array_agg(acl ORDER BY row_n) FROM "
-					  "(SELECT acl, row_n FROM pg_catalog.unnest(pip.initprivs) "
+
Re: Logical Replication - behavior of TRUNCATE ... CASCADE
On Mon, May 3, 2021 at 10:42 AM Bharath Rupireddy wrote:
>
> Hi,
>
> In apply_handle_truncate, the following comment before
> ExecuteTruncateGuts says that it defaults to RESTRICT even if the
> CASCADE option has been specified in the publisher's TRUNCATE command:
>
>     /*
>      * Even if we used CASCADE on the upstream primary we explicitly default
>      * to replaying changes without further cascading. This might be later
>      * changeable with a user specified option.
>      */
>
> I tried the following use case to see if that's actually true:
> 1) Created two tables, tbl_pk (primary key) and tbl_fk (references
>    tbl_pk's primary key via a foreign key), on both publisher and
>    subscriber.
> 2) In general, TRUNCATE tbl_pk; or TRUNCATE tbl_pk RESTRICT; would fail
>    because tbl_fk is dependent on tbl_pk.
> 3) TRUNCATE tbl_pk, tbl_fk; would work because the dependent table is
>    specified in the command.
> 4) TRUNCATE tbl_pk CASCADE; would work because of the CASCADE option,
>    and both tbl_pk and tbl_fk are truncated.  When this command is run
>    on the publisher, the CASCADE option is sent to the subscriber (see
>    DecodeTruncate), but the apply worker ignores it and passes
>    DROP_RESTRICT to ExecuteTruncateGuts.  Therefore, the expectation
>    (per the comment) is that on the subscriber the behavior should be
>    equivalent to TRUNCATE tbl_pk;, so an error is expected.  But we are
>    also receiving tbl_fk in the remote rels along with tbl_pk, so the
>    behavior is equivalent to (3) and both tbl_pk and tbl_fk are
>    truncated.
>
> Does the comment still hold true?  Does ignoring the CASCADE option
> make sense in apply_handle_truncate, as we are receiving all the
> dependent relations in the remote rels from the publisher?  Am I
> missing something?
>
> The commit id of the feature "Logical replication support for TRUNCATE"
> is 039eb6e92f, and I'm adding the relevant people in cc.

Assume this case:

publisher:  tbl_pk -> tbl_fk_pub
subscriber: tbl_pk -> tbl_fk_sub

Now, in this case the comment is true, right?  Because we are not
supposed to truncate tbl_fk_sub on the subscriber side, and this should
error out.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
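A sketch of that asymmetric setup (table names follow the example above;
only tbl_pk is assumed to be in the publication):

    -- Publisher: the dependent table is known to the publisher.
    CREATE TABLE tbl_pk (a int PRIMARY KEY);
    CREATE TABLE tbl_fk_pub (a int REFERENCES tbl_pk);

    -- Subscriber: a different dependent table, unknown to the publisher,
    -- references the subscribed copy of tbl_pk.
    CREATE TABLE tbl_pk (a int PRIMARY KEY);
    CREATE TABLE tbl_fk_sub (a int REFERENCES tbl_pk);

    -- TRUNCATE tbl_pk CASCADE on the publisher truncates tbl_pk and
    -- tbl_fk_pub there; replaying with RESTRICT on the subscriber leaves
    -- tbl_fk_sub alone, so the apply should error out rather than
    -- silently cascading to it.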
Re: Enhanced error message to include hint messages for redundant options error
On Sun, May 2, 2021 at 8:44 PM vignesh C wrote:
>
> On Sat, May 1, 2021 at 9:02 PM Bharath Rupireddy wrote:
> >
> > On Sat, May 1, 2021 at 7:25 PM vignesh C wrote:
> > > > > I'm not attaching above one line change as a patch, maybe Vignesh can
> > > > > merge this into the main patch.
> > >
> > > Thanks for the comments. I have merged the change into the attached patch.
> > > Thoughts?
> >
> > Thanks! v4 basically LGTM. Can we park this in the current commitfest
> > if not done already?
> >
> > Upon looking at the number of places where we have the "option \"%s\"
> > specified more than once" error, I now strongly feel that we should
> > use the goto duplicate_error approach, like in compute_common_attribute,
> > so that we will have only one ereport(ERROR. We can change it in the
> > following files: copy.c, dbcommands.c, extension.c,
> > compute_function_attributes, sequence.c, subscriptioncmds.c,
> > typecmds.c, user.c, walsender.c, pgoutput.c. This will reduce the LOC
> > greatly.
> >
> > Thoughts?
>
> I have made these changes; they are in the v5 patch posted in my
> earlier mail.

Thanks! The v5 patch looks good to me.  Let's see if all agree on the
goto duplicate_error approach, which could reduce the LOC by ~80.

I don't see it in the current commitfest; can we park it there so that
the patch will get tested on the cfbot systems?

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com