Re: proposal: possibility to read dumped table's name from file
On Sat, Nov 12, 2022 at 09:35:59PM +0100, Pavel Stehule wrote: Thanks for the updated patch. Apart from the function comment it looks good to me. Justin, did you have any other comment on the patch? > > I don't fully understand the part about subpatterns, but is that necessary > > to > > describe it? Simply saying that any valid and possibly-quoted identifier > > can > > be parsed should make it clear that identifiers containing \n characters > > should > > work too. Maybe also just mention that whitespaces are removed and special > > care is taken to output routines in exactly the same way calling code will > > expect it (that is comma-and-single-space type delimiter). > > > > In this case I hit the limits of my English language skills. > > I rewrote this comment, but it needs more care. Please, can you look at it? I'm also not a native English speaker so I'm far for writing perfect comments myself :) Maybe something like /* * read_pattern - reads on object pattern from input * * This function will parse any valid identifier (quoted or not, qualified or * not), which can also includes the full signature for routines. * Note that this function takes special care to sanitize the detected * identifier (removing extraneous whitespaces or other unnecessary * characters). This is necessary as most backup/restore filtering functions * only recognize identifiers if they are written exactly way as they are * regenerated. * Returns a pointer to next character after the found identifier, or NULL on * error. */
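For reference, a lightly copy-edited version of that suggested wording (assuming the function keeps the name read_pattern and the behaviour described above) could read:

    /*
     * read_pattern - read one object pattern from the input
     *
     * Parses any valid identifier (quoted or not, qualified or not), which
     * for routines can also include the full signature.  The detected
     * identifier is sanitized by removing extraneous whitespace and other
     * unnecessary characters, because the backup/restore filtering functions
     * only recognize identifiers that are written exactly the way they
     * regenerate them (for example the comma-and-single-space delimiter in
     * routine signatures).
     *
     * Returns a pointer to the character following the identifier, or NULL
     * on error.
     */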
Re: libpq support for NegotiateProtocolVersion
On 11.11.22 23:28, Jacob Champion wrote: Consider the case where the server sends a NegotiateProtocolVersion with a reasonable length, but then runs over its own message (either by sending an unterminated string as one of the extension names, or by sending a huge extension number). When I test that against a client on my machine, it churns CPU and memory waiting for the end of a message that will never come, even though it had already decided that the maximum length of the message should have been less than 2K. Put another way, why do we loop around and poll for more data when we hit the end of the connection buffer, if we've already checked at this point that we should have the entire message buffered locally? Isn't that the same behavior for other message types? I don't see anything in the handling of the early 'E' and 'R' messages that would handle this. If we want to address this, maybe this should be handled in the polling loop before we pass off the input buffer to the per-message-type handlers.
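As a rough illustration of the guard being suggested here (msgLength, the control flow, and the error string are placeholders, not the actual libpq code; only the PGconn input-buffer fields are real): once the declared length of the 'v' message is known to be fully buffered, a handler that still reports "need more data" is reading past its own message and should fail the connection instead of polling again.

    /* hypothetical check after a per-message handler asks for more data */
    if (conn->inEnd - conn->inStart >= msgLength)
    {
        /* the whole declared message is already buffered, so it is
         * malformed; fail instead of waiting for bytes that never come */
        appendPQExpBufferStr(&conn->errorMessage,
                             libpq_gettext("received invalid protocol negotiation message\n"));
        goto error_return;
    }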
Re: refactor ownercheck and aclcheck functions
On 09.11.22 19:12, Corey Huinker wrote: After considering this again, I decided to brute-force this and get rid of all the trivial wrapper functions and also several of the special cases. That way, there is less confusion at the call sites about why this or that style is used in a particular case. Also, it now makes sure you can't accidentally use the generic functions when a particular one should be used. +1 committed
Re: libpq error message refactoring
On 09.11.22 13:29, Alvaro Herrera wrote: +/* + * Append a formatted string to the given buffer, after translation. A + * newline is automatically appended; the format should not end with a + * newline. + */ I find the "after translation" bit unclear -- does it mean that the caller should have already translated, or is it the other way around? I would say "Append a formatted string to the given buffer, after translating it", which (to me) conveys more clearly that translation occurs here. ok + /* Loop in case we have to retry after enlarging the buffer. */ + do + { + errno = save_errno; + va_start(args, fmt); + done = appendPQExpBufferVA(errorMessage, libpq_gettext(fmt), args); I wonder if it makes sense to do libpq_gettext() just once, instead of redoing it on each iteration. I wonder whether that would expose us to potential compiler warnings about the format string not being constant. As long as the compiler can trace that the string comes from gettext, it knows what's going on. Also, most error strings in practice don't need the loop, so maybe it's not a big issue. +/* + * appendPQExpBufferVA + * Shared guts of printfPQExpBuffer/appendPQExpBuffer. + * Attempt to format data and append it to str. Returns true if done + * (either successful or hard failure), false if need to retry. + * + * Caution: callers must be sure to preserve their entry-time errno + * when looping, in case the fmt contains "%m". + */ +extern bool appendPQExpBufferVA(PQExpBuffer str, const char *fmt, va_list args) pg_attribute_printf(2, 0); As an API user, I don't care that this is shared guts for something else, I just care about what it does. I think deleting that line is a sufficient fix. ok @@ -420,7 +418,8 @@ pqsecure_raw_write(PGconn *conn, const void *ptr, size_t len) snprintf(msgbuf, sizeof(msgbuf), libpq_gettext("server closed the connection unexpectedly\n" "\tThis probably means the server terminated abnormally\n" - "\tbefore or while processing the request.\n")); + "\tbefore or while processing the request.")); + strlcat(msgbuf, "\n", sizeof(msgbuf)); conn->write_err_msg = strdup(msgbuf); /* Now claim the write succeeded */ n = len; In these two places, we're writing the error message manually to a separate variable, so the extra \n is necessary. It looks a bit odd to do it with strlcat() after the fact, but AFAICT it's necessary as it keeps the \n out of the translation catalog, which is good. This is nonobvious, so perhaps add a comment about it. ok
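To make the %m/errno subtlety concrete, the conn-level wrapper under discussion ends up with roughly the shape below (a sketch only; details such as broken-buffer checks are omitted, see the posted patch for the real thing). The entry-time errno is saved once and restored on every retry, so "%m" keeps expanding to the original error even if the buffer-enlarging retry path clobbers errno, and the newline is appended outside the translated format string.

    void
    libpq_append_conn_error(PGconn *conn, const char *fmt, ...)
    {
        int         save_errno = errno;
        bool        done;
        va_list     args;

        /* Loop in case we have to retry after enlarging the buffer. */
        do
        {
            errno = save_errno;
            va_start(args, fmt);
            done = appendPQExpBufferVA(&conn->errorMessage,
                                       libpq_gettext(fmt), args);
            va_end(args);
        } while (!done);

        appendPQExpBufferChar(&conn->errorMessage, '\n');
    }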
Re: New docs chapter on Transaction Management and related changes
On Thu, 2022-11-10 at 12:17 +0100, Alvaro Herrera wrote: > On 2022-Nov-10, Laurenz Albe wrote: > > On Wed, 2022-11-09 at 09:16 -0500, Robert Treat wrote: > > > > > - If AND CHAIN is specified, a new > > > > > transaction is > > > > > + If AND CHAIN is specified, a new unaborted > > > > > transaction is > > > > > immediately started with the same transaction characteristics > > > > > (see > > > > linkend="sql-set-transaction"/>) as the just finished one. > > > > > Otherwise, > > > > > no new transaction is started. > > > > A new transaction is never aborted in my understanding. Being aborted > > is not a characteristic of a transaction, but a state. > > I agree, but maybe it's good to make the point explicit, because it > doesn't seem obvious. Perhaps something like > > "If X is specified, a new transaction (never in aborted state) is > immediately started with the same transaction characteristics (see X) as > the just finished one. Otherwise ..." > > Getting the wording of that parenthical comment right is tricky, though. > What I propose above is not great, but I don't know how to make it > better. Other ideas that seem slightly worse but may inspire someone: > > ... a new transaction (which is never in aborted state) is ... > ... a new transaction (not in aborted state) is ... > ... a new transaction (never aborted, even if the previous is) is ... > ... a new (not-aborted) transaction is ... > ... a new (never aborted) transaction is ... > ... a new (never aborted, even if the previous is) transaction is ... > ... a new (never aborted, regardless of the status of the previous one) > transaction is ... > > > Maybe there's a way to reword the entire phrase that leads to a better > formulation of the idea. Any of your auggestions is better than "unaborted". How about: If AND CHAIN is specified, a new transaction is immediately started with the same transaction characteristics (see ) as the just finished one. This new transaction won't be in the aborted state, even if the old transaction was aborted. Yours, Laurenz Albe
Re: Making Bitmapsets be valid Nodes
On 11.11.22 21:05, Tom Lane wrote: Per the discussion at [1], it seems like it'd be a good idea to make Bitmapsets into full-fledged, tagged Nodes, so that we could do things like print or copy lists of them without special-case logic. The extra space for the NodeTag is basically free due to alignment considerations, at least on 64-bit hardware. Attached is a cleaned-up version of Amit's patch v24-0003 at [2]. I fixed the problems with not always tagging Bitmapsets, and changed the outfuncs/readfuncs logic so that Bitmapsets still print exactly as they did before (thus, this doesn't require a catversion bump). This looks good to me.
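For readers following along, the change essentially amounts to giving the struct a leading NodeTag (sketch below; see bitmapset.h for the authoritative definition). On 64-bit builds the tag costs nothing, because the bitmapword array already forces 8-byte alignment and the old 4-byte header was followed by padding anyway.

    typedef struct Bitmapset
    {
        NodeTag     type;       /* == T_Bitmapset */
        int         nwords;     /* number of words in array */
        bitmapword  words[FLEXIBLE_ARRAY_MEMBER];   /* really [nwords] */
    } Bitmapset;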
Re: Schema variables - new implementation for Postgres 15
pá 4. 11. 2022 v 15:28 odesílatel Dmitry Dolgov <9erthali...@gmail.com> napsal: > > On Fri, Nov 04, 2022 at 03:17:18PM +0100, Pavel Stehule wrote: > > > I've stumbled upon something that looks weird to me (inspired by the > > > example from tests): > > > > > > =# create variable v2 as int; > > > =# let v2 = 3; > > > =# create view vv2 as select coalesce(v2, 0) + 1000 as result > > > > > > =# select * from vv2; > > > result > > > > > > 1003 > > > > > > =# set force_parallel_mode to on; > > > =# select * from vv2; > > > result > > > > > > 1000 > > > > > > In the second select the actual work is done from a worker backend. > > > Since values of session variables are stored in the backend local > > > memory, it's not being shared with the worker and the value is not > found > > > in the hash map. Does this suppose to be like that? > > > > > > > It looks like a bug, but parallel queries should be supported. > > > > The value of the variable is passed as parameter to the worker backend. > But > > probably somewhere the original reference was not replaced by parameter > > > > On Fri, Nov 04, 2022 at 10:17:13PM +0800, Julien Rouhaud wrote: > > Hi, > > > > There's code to serialize and restore all used variables for parallel > workers > > (see code about PARAM_VARIABLE and queryDesc->num_session_variables / > > queryDesc->plannedstmt->sessionVariables). I haven't reviewed that part > yet, > > but it's supposed to be working. Blind guess would be that it's missing > > something in expression walker. > > I see, thanks. I'll take a deeper look, my initial assumption was due to > the fact that in the worker case create_sessionvars_hashtables is > getting called for every query. > It should be fixed in today's patch The problem was in missing pushing the hasSessionVariables flag to the upper subquery in pull_up_simple_subquery. --- a/src/backend/optimizer/prep/prepjointree.c +++ b/src/backend/optimizer/prep/prepjointree.c @@ -1275,6 +1275,9 @@ pull_up_simple_subquery(PlannerInfo *root, Node *jtnode, RangeTblEntry *rte, /* If subquery had any RLS conditions, now main query does too */ parse->hasRowSecurity |= subquery->hasRowSecurity; + /* If subquery had session variables, now main query does too */ + parse->hasSessionVariables |= subquery->hasSessionVariables; + Thank you for the check and bug report. Your example was added to regress tests Regards Pavel
Re: Typo about subxip in comments
On Sat, 12 Nov 2022 at 12:12, Amit Kapila wrote: > On Fri, Nov 11, 2022 at 2:45 PM Japin Li wrote: >> Maybe a wrong plural in XidInMvccSnapshot(). >> >> * Make a quick range check to eliminate most XIDs without looking at the >> * xip arrays. >> >> I think we should use "xip array" instead of "xip arrays". >> > > I think here the comment is referring to both xip and subxip array, so > it looks okay to me. > Yeah, it means xip in normal case, and subxip in recovery case. >> Furthermore, if the snapshot is taken during recovery, the xip array is >> empty, and we should check subxip array. How about changing "xip arrays" >> to "xip or subxip array"? >> > > I don't know if that is an improvement. I think we should stick to > your initial proposed change. Agreed. Let's focus on the initial proposed change. -- Regrads, Japin Li. ChengDu WenWu Information Technology Co.,Ltd.
Re: Making Bitmapsets be valid Nodes
Peter Eisentraut writes: > On 11.11.22 21:05, Tom Lane wrote: >> Attached is a cleaned-up version of Amit's patch v24-0003 at [2]. >> I fixed the problems with not always tagging Bitmapsets, and changed >> the outfuncs/readfuncs logic so that Bitmapsets still print exactly >> as they did before (thus, this doesn't require a catversion bump). > This looks good to me. Pushed, thanks for looking. regards, tom lane
Re: Reducing power consumption on idle servers
On Thu, 24 Mar 2022 at 16:02, Simon Riggs wrote: > > On Thu, 24 Mar 2022 at 15:39, Robert Haas wrote: > > > > On Thu, Mar 24, 2022 at 6:59 AM Simon Riggs > > wrote: > > > The proposals of this patch are the following, each of which can be > > > independently accepted/rejected: > > > 1. fix the sleep pattern of bgwriter, walwriter and logical worker > > > (directly affects powersave) > > > 2. deprecate promote_trigger_file, which then allows us to fix the > > > sleep for startup process (directly affects powersave) > > > 3. treat hibernation in all procs the same, for simplicity, and to > > > make sure we don't forget one later > > > 4. provide a design pattern for background worker extensions to > > > follow, so as to encourage powersaving > > > > Unfortunately, the patch isn't split in a way that corresponds to this > > division. I think I'm +1 on change #2 -- deprecating > > promote_trigger_file seems like a good idea to me independently of any > > power-saving considerations. But I'm not sure that I am on board with > > any of the other changes. > > OK, so change (2) is good. Separate patch attached. Thanks to Ian for prompting me to pick up this thread again; apologies for getting distracted. The attached patch is a reduced version of the original. It covers only: * deprecation of the promote_trigger_file - there are no tests that use that, hence why there is no test coverage for the patch * changing the sleep time of the startup process to 60s * docs and comments Other points will be discussed in another branch of this thread. -- Simon Riggshttp://www.EnterpriseDB.com/ hibernate_startup.v6.patch Description: Binary data
Re: Data is copied twice when specifying both child and parent table in publication
On Fri, 11 Nov 2022 at 11:13, wangw.f...@fujitsu.com wrote: > > On Fri, Oct 21, 2022 at 17:02 PM Peter Smith wrote: > > > > Thanks for your comments. Sorry for not replying in time. > > > On Mon, Oct 17, 2022 at 4:49 PM wangw.f...@fujitsu.com > > wrote: > > > > > > On Wed, Oct 5, 2022 at 11:08 AM Peter Smith > > wrote: > > > > Hi Wang-san. Here are my review comments for HEAD_v12-0001 patch. > > > > > ... > > > > > > > > 3. QUESTION - pg_get_publication_tables / fetch_table_list > > > > > > > > When the same table is published by different publications (but there > > > > are other differences like row-filters/column-lists in each > > > > publication) the result tuple of this function does not include the > > > > pubid. Maybe the SQL of pg_publication_tables/fetch_table_list() is OK > > > > as-is but how does it manage to associate each table with the correct > > > > tuple? > > > > > > > > I know it apparently all seems to work but I’m not how does that > > > > happen? Can you explain why a puboid is not needed for the result > > > > tuple of this function? > > > > > > Sorry, I am not sure I understand your question. > > > I try to answer your question by explaining the two functions you > > > mentioned: > > > > > > First, the function pg_get_publication_tables gets the list (see > > > table_infos) > > > that included published table and the corresponding publication. Then > > > based > > > on this list, the function pg_get_publication_tables returns information > > > (scheme, relname, row filter and column list) about the published tables > > > in the > > > publications list. It just doesn't return pubid. > > > > > > Then, the SQL in the function fetch_table_list will get the columns in the > > > column list from pg_attribute. (This is to return all columns when the > > > column > > > list is not specified) > > > > > > > I meant, for example, if the different publications specified > > different col-lists for the same table then IIUC the > > fetch_table_lists() is going to return 2 list elements > > (schema,rel_name,row_filter,col_list). But when the schema/rel_name > > are the same for 2 elements then (without also a pubid) how are you > > going to know where the list element came from, and how come that is > > not important? > > > > > > ~~ > > > > > > > > test_pub=# create table t1(a int, b int, c int); > > > > CREATE TABLE > > > > test_pub=# create publication pub1 for table t1(a) where (a > 99); > > > > CREATE PUBLICATION > > > > test_pub=# create publication pub2 for table t1(a,b) where (b < 33); > > > > CREATE PUBLICATION > > > > > > > > Following seems OK when I swap orders of publication names... > > > > > > > > test_pub=# SELECT gpt.relid, gpt.attrs, pg_get_expr(gpt.qual, > > > > gpt.relid) AS rowfilter from pg_get_publication_tables(VARIADIC > > > > ARRAY['pub2','pub1']) gpt(relid, attrs, qual); > > > > relid | attrs | rowfilter > > > > ---+---+--- > > > > 16385 | 1 2 | (b < 33) > > > > 16385 | 1 | (a > 99) > > > > (2 rows) > > > > > > > > test_pub=# SELECT gpt.relid, gpt.attrs, pg_get_expr(gpt.qual, > > > > gpt.relid) AS rowfilter from pg_get_publication_tables(VARIADIC > > > > ARRAY['pub1','pub2']) gpt(relid, attrs, qual); > > > > relid | attrs | rowfilter > > > > ---+---+--- > > > > 16385 | 1 | (a > 99) > > > > 16385 | 1 2 | (b < 33) > > > > (2 rows) > > > > > > > > But what about this (this is similar to the SQL fragment from > > > > fetch_table_list); I swapped the pub names but the results are the > > > > same... 
> > > > > > > > test_pub=# SELECT pg_get_publication_tables(VARIADIC > > > > array_agg(p.pubname)) from pg_publication p where pubname > > > > IN('pub2','pub1'); > > > > > > > > pg_get_publication_tables > > > > > > > > - > > > > > > - > > > > - > > > > > > - > > > > --- > > > > (16385,1,"{OPEXPR :opno 521 :opfuncid 147 :opresulttype 16 :opretset > > > > false :opcollid 0 :inputcollid 0 :args ({VAR :varno 1 :varattno 1 > > > > :vartype 23 :vartypmod -1 :var > > > > collid 0 :varlevelsup 0 :varnosyn 1 :varattnosyn 1 :location 47} > > > > {CONST :consttype 23 :consttypmod -1 :constcollid 0 :constlen 4 > > > > :constbyval true :constisnull false : > > > > location 51 :constvalue 4 [ 99 0 0 0 0 0 0 0 ]}) :location 49}") > > > > (16385,"1 2","{OPEXPR :opno 97 :opfuncid 66 :opresulttype 16 > > > > :opretset false :opcollid 0 :inputcollid 0 :args ({VAR :varno 1 > > > > :varattno 2 :vartype 23 :vartypmod -1 :v > > > > arcollid 0 :varlevelsup 0 :varnosyn 1 :varattnosyn 2 :location 49} > > > > {CONST :consttype 23 :constty
Re: proposal: possibility to read dumped table's name from file
ne 13. 11. 2022 v 9:58 odesílatel Julien Rouhaud napsal: > On Sat, Nov 12, 2022 at 09:35:59PM +0100, Pavel Stehule wrote: > > Thanks for the updated patch. Apart from the function comment it looks > good to > me. > > Justin, did you have any other comment on the patch? > > > > I don't fully understand the part about subpatterns, but is that > necessary > > > to > > > describe it? Simply saying that any valid and possibly-quoted > identifier > > > can > > > be parsed should make it clear that identifiers containing \n > characters > > > should > > > work too. Maybe also just mention that whitespaces are removed and > special > > > care is taken to output routines in exactly the same way calling code > will > > > expect it (that is comma-and-single-space type delimiter). > > > > > > > In this case I hit the limits of my English language skills. > > > > I rewrote this comment, but it needs more care. Please, can you look at > it? > > I'm also not a native English speaker so I'm far for writing perfect > comments > myself :) > far better than mine :) Thank you very much updated patch attached Regards Pavel > > Maybe something like > > /* > * read_pattern - reads on object pattern from input > * > * This function will parse any valid identifier (quoted or not, qualified > or > * not), which can also includes the full signature for routines. > * Note that this function takes special care to sanitize the detected > * identifier (removing extraneous whitespaces or other unnecessary > * characters). This is necessary as most backup/restore filtering > functions > * only recognize identifiers if they are written exactly way as they are > * regenerated. > * Returns a pointer to next character after the found identifier, or NULL > on > * error. > */ > diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml index 8b9d9f4cad..d5a6e2c7ee 100644 --- a/doc/src/sgml/ref/pg_dump.sgml +++ b/doc/src/sgml/ref/pg_dump.sgml @@ -779,6 +779,80 @@ PostgreSQL documentation + + --filter=filename + + +Specify a filename from which to read patterns for objects to include +or exclude from the dump. The patterns are interpreted according to the +same rules as the corresponding options: +-t/--table for tables, +-n/--schema for schemas, +--include-foreign-data for data on foreign servers and +--exclude-table-data for table data. +To read from STDIN, use - as the +filename. The --filter option can be specified in +conjunction with the above listed options for including or excluding +objects, and can also be specified more than once for multiple filter +files. + + + +The file lists one object pattern per row, with the following format: + +{ include | exclude } { table | schema | foreign_data | table_data } PATTERN + + + + +The first keyword specifies whether the objects matched by the pattern +are to be included or excluded. The second keyword specifies the type +of object to be filtered using the pattern: + + + + table: tables, works like + -t/--table + + + + + schema: schemas, works like + -n/--schema + + + + + foreign_data: data on foreign servers, works like + --include-foreign-data. This keyword can only be + used with the include keyword. + + + + + table_data: table data, works like + --exclude-table-data. This keyword can only be + used with the exclude keyword. + + + + + + +Lines starting with # are considered comments and +ignored. Comments can be placed after filter as well. Blank lines +are also ignored. See for how to +perform quoting in patterns. 
+ + + +Example files are listed below in the +section. + + + + + --if-exists @@ -1119,6 +1193,7 @@ PostgreSQL documentation schema (-n/--schema) and table (-t/--table) qualifier match at least one extension/schema/table in the database to be dumped. +This also applies to filters used with --filter. Note that if none of the extension/schema/table qualifiers find matches, pg_dump will generate an error even without --strict-names. @@ -1528,6 +1603,19 @@ CREATE DATABASE foo WITH TEMPLATE template0; $ pg_dump -t "\"MixedCaseName\"" mydb > mytab.sql + + + + To dump all tables with names starting with mytable, except for table + mytable2, specify a filter file + filter.txt like: + +include table mytable* +exclude table mytable2 + + + +$ pg_dump --filter=filter.txt mydb > db.sql diff --git a/doc/src
Re: Reducing power consumption on idle servers
On Mon, Nov 14, 2022 at 5:52 AM Simon Riggs wrote: > The attached patch is a reduced version of the original. It covers only: > * deprecation of the promote_trigger_file - there are no tests that > use that, hence why there is no test coverage for the patch > * changing the sleep time of the startup process to 60s > * docs and comments LPGTM. If we also fix the bogus SIGALRM wakeups[1], then finally a completely idle recovery process looks like: kevent(8,0x0,0,{ },1,{ 60.0 }) = 0 (0x0) kevent(8,0x0,0,{ },1,{ 60.0 }) = 0 (0x0) kevent(8,0x0,0,{ },1,{ 60.0 }) = 0 (0x0) Presumably it would have no timeout at all in the next release. [1] https://www.postgresql.org/message-id/calj2acuiyn+zmpguvmgeoy1u7ino2qsvqrnufk8swpvk3a8...@mail.gmail.com
Re: Suppressing useless wakeups in walreceiver
Thomas Munro writes: > And with that change and a pgindent, pushed. There is something very seriously wrong with this patch. On my machine, running "make -j10 check-world" (with compilation already done) has been taking right about 2 minutes for some time. Since this patch, it's taking around 2:45 --- I did a bisect run to confirm that this patch is where it changed. The buildfarm is showing a hit, too. Comparing the step runtimes at https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=longfin&dt=2022-11-08%2005%3A29%3A28 https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=longfin&dt=2022-11-08%2007%3A49%3A31 it'd seem that most tests involving walreceivers got much slower: pg_basebackup-check from 00:29 to 00:39, pg_rewind-check went from 00:56 to 01:26, and recovery-check went from 03:56 to 04:45. Curiously, subscription-check only went from 03:26 to 03:29. I've not dug into it further than that, but my bet is that some required wakeup condition was not accounted for. regards, tom lane
Re: Suppressing useless wakeups in walreceiver
On Mon, Nov 14, 2022 at 11:08 AM Tom Lane wrote: > There is something very seriously wrong with this patch. > > On my machine, running "make -j10 check-world" (with compilation > already done) has been taking right about 2 minutes for some time. > Since this patch, it's taking around 2:45 --- I did a bisect run > to confirm that this patch is where it changed. > > The buildfarm is showing a hit, too. Comparing the step runtimes at > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=longfin&dt=2022-11-08%2005%3A29%3A28 > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=longfin&dt=2022-11-08%2007%3A49%3A31 > > it'd seem that most tests involving walreceivers got much slower: > pg_basebackup-check from 00:29 to 00:39, > pg_rewind-check went from 00:56 to 01:26, > and recovery-check went from 03:56 to 04:45. > Curiously, subscription-check only went from 03:26 to 03:29. > > I've not dug into it further than that, but my bet is that some > required wakeup condition was not accounted for. Looking...
Re: Suppressing useless wakeups in walreceiver
On Sun, Nov 13, 2022 at 05:08:04PM -0500, Tom Lane wrote: > There is something very seriously wrong with this patch. > > On my machine, running "make -j10 check-world" (with compilation > already done) has been taking right about 2 minutes for some time. > Since this patch, it's taking around 2:45 --- I did a bisect run > to confirm that this patch is where it changed. I've been looking into this. I wrote a similar patch for logical/worker.c before noticing that check-world was taking much longer. The problem in that case seems to be that process_syncing_tables() isn't called as often. It wouldn't surprise me if there's also something in walreceiver.c that depends upon the frequent wakeups. I suspect this will require a revert. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: Reducing power consumption on idle servers
On Sun, 13 Nov 2022 at 21:28, Thomas Munro wrote: > > On Mon, Nov 14, 2022 at 5:52 AM Simon Riggs > wrote: > > The attached patch is a reduced version of the original. It covers only: > > * deprecation of the promote_trigger_file - there are no tests that > > use that, hence why there is no test coverage for the patch > > * changing the sleep time of the startup process to 60s > > * docs and comments > > LPGTM. If we also fix the bogus SIGALRM wakeups[1], then finally a > completely idle recovery process looks like: > > kevent(8,0x0,0,{ },1,{ 60.0 }) = 0 (0x0) > kevent(8,0x0,0,{ },1,{ 60.0 }) = 0 (0x0) > kevent(8,0x0,0,{ },1,{ 60.0 }) = 0 (0x0) > > Presumably it would have no timeout at all in the next release. > > [1] > https://www.postgresql.org/message-id/calj2acuiyn+zmpguvmgeoy1u7ino2qsvqrnufk8swpvk3a8...@mail.gmail.com Clearly, I haven't been watching Hackers! Thanks for the nudge. See if this does the trick? -- Simon Riggshttp://www.EnterpriseDB.com/ hibernate_startup.v7.patch Description: Binary data
Re: Suppressing useless wakeups in walreceiver
On Mon, Nov 14, 2022 at 11:26 AM Nathan Bossart wrote: > On Sun, Nov 13, 2022 at 05:08:04PM -0500, Tom Lane wrote: > > There is something very seriously wrong with this patch. > > > > On my machine, running "make -j10 check-world" (with compilation > > already done) has been taking right about 2 minutes for some time. > > Since this patch, it's taking around 2:45 --- I did a bisect run > > to confirm that this patch is where it changed. > > I've been looking into this. I wrote a similar patch for logical/worker.c > before noticing that check-world was taking much longer. The problem in > that case seems to be that process_syncing_tables() isn't called as often. > It wouldn't surprise me if there's also something in walreceiver.c that > depends upon the frequent wakeups. I suspect this will require a revert. In the case of "meson test pg_basebackup/020_pg_receivewal" it looks like wait_for_catchup hangs around for 10 seconds waiting for HS feedback. I'm wondering if we need to go back to high frequency wakeups until it's caught up, or (probably better) arrange for a proper event for progress. Digging...
Re: Suppressing useless wakeups in walreceiver
On Mon, Nov 14, 2022 at 12:11 PM Thomas Munro wrote: > On Mon, Nov 14, 2022 at 11:26 AM Nathan Bossart > wrote: > > On Sun, Nov 13, 2022 at 05:08:04PM -0500, Tom Lane wrote: > > > There is something very seriously wrong with this patch. > > > > > > On my machine, running "make -j10 check-world" (with compilation > > > already done) has been taking right about 2 minutes for some time. > > > Since this patch, it's taking around 2:45 --- I did a bisect run > > > to confirm that this patch is where it changed. > > > > I've been looking into this. I wrote a similar patch for logical/worker.c > > before noticing that check-world was taking much longer. The problem in > > that case seems to be that process_syncing_tables() isn't called as often. > > It wouldn't surprise me if there's also something in walreceiver.c that > > depends upon the frequent wakeups. I suspect this will require a revert. > > In the case of "meson test pg_basebackup/020_pg_receivewal" it looks > like wait_for_catchup hangs around for 10 seconds waiting for HS > feedback. I'm wondering if we need to go back to high frequency > wakeups until it's caught up, or (probably better) arrange for a > proper event for progress. Digging... Maybe there is a better way to code this (I mean, who likes global variables?) and I need to test some more, but I suspect the attached is approximately what we missed. diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c index 8bd2ba37dd..fed2cc6e6f 100644 --- a/src/backend/replication/walreceiver.c +++ b/src/backend/replication/walreceiver.c @@ -1080,6 +1080,9 @@ XLogWalRcvClose(XLogRecPtr recptr, TimeLineID tli) recvFile = -1; } +static XLogRecPtr writePtr = 0; +static XLogRecPtr flushPtr = 0; + /* * Send reply message to primary, indicating our current WAL locations, oldest * xmin and the current time. @@ -1096,8 +1099,6 @@ XLogWalRcvClose(XLogRecPtr recptr, TimeLineID tli) static void XLogWalRcvSendReply(bool force, bool requestReply) { - static XLogRecPtr writePtr = 0; - static XLogRecPtr flushPtr = 0; XLogRecPtr applyPtr; TimestampTz now; @@ -1334,6 +1335,9 @@ WalRcvComputeNextWakeup(WalRcvWakeupReason reason, TimestampTz now) case WALRCV_WAKEUP_REPLY: if (wal_receiver_status_interval <= 0) wakeup[reason] = PG_INT64_MAX; + else if (writePtr != LogstreamResult.Write || + flushPtr != LogstreamResult.Flush) +wakeup[reason] = now + 10; /* frequent replies, not yet caught up */ else wakeup[reason] = now + wal_receiver_status_interval * INT64CONST(100); break;
Re: Suppressing useless wakeups in walreceiver
On Mon, Nov 14, 2022 at 12:35 PM Thomas Munro wrote: > Maybe there is a better way to code this (I mean, who likes global > variables?) and I need to test some more, but I suspect the attached > is approximately what we missed. Erm, ENOCOFFEE, sorry that's not quite right, it needs the apply LSN, but it demonstrates what the problem is...
Re: CI and test improvements
On Fri, Nov 04, 2022 at 06:59:46PM -0700, Andres Freund wrote: > Hi, > > On 2022-11-04 18:54:12 -0500, Justin Pryzby wrote: > > Subject: [PATCH 1/8] meson: PROVE is not required > > Subject: [PATCH 3/8] meson: rename 'main' tasks to 'regress' and 'isolation' > > Pushed, thanks for the patches. Thanks. > > diff --git a/.cirrus.yml b/.cirrus.yml > > index 9f2282471a9..99ac09dc679 100644 > > --- a/.cirrus.yml > > +++ b/.cirrus.yml > > @@ -451,12 +451,20 @@ task: > > > >build_script: | > > vcvarsall x64 > > -ninja -C build > > +ninja -C build |tee build/meson-logs/build.txt > > +REM Since pipes lose exit status of the preceding command, rerun > > compilation, > > +REM without the pipe exiting now if it fails, rather than trying to > > run checks > > +ninja -C build > nul > > This seems mighty grotty :(. but I guess it's quick enough not worry about, > and I can't come up with a better plan. > > It doesn't seem quite right to redirect into meson-logs/ to me, my > interpretation is that that's "meson's namespace". Why not just store it in > build/? I put it there so it'd be included with the build artifacts. Maybe it's worth adding a separate line to artifacts for stuff like this, and ccache log ? > > From e85fe83fd8a4b4c79a96d2bf66cd6a5e1bdfcd1e Mon Sep 17 00:00:00 2001 > > From: Justin Pryzby > > Date: Sat, 26 Feb 2022 19:34:35 -0600 > > Subject: [PATCH 5/8] cirrus: build docs as a separate task.. > > + # Exercise HTML and other docs: > > + ninja_docs_build_script: | > > +mkdir build.ninja > > +cd build.ninja > > Perhaps build-ninja instead? build.ninja is the filename for ninja's build > instructions, so it seems a bit confusing. Sure. Do you think building docs with both autoconf and meson is what's desirable here ? I'm not sure if this ought to be combined with/before/after your "move compilerwarnings task to meson" patch? (Regarding that patch: I mentioned that it shouldn't use ccache -C, and it should use meson_log_artifacts.) > > From: Justin Pryzby > > Date: Tue, 26 Jul 2022 20:30:02 -0500 > > Subject: [PATCH 6/8] cirrus/ccache: add explicit cache keys.. > > > > Since otherwise, building with ci-os-only will probably fail to use the > > normal cache, since the cache key is computed using both the task name > > and its *index* in the list of caches (internal/executor/cache.go:184). > > Seems like this would potentially better addressed by reporting a bug to the > cirrus folks? You said that before, but I don't think so - since they wrote code to do that, it's odd to file a bug that says that the behavior is wrong. I am curious why, but it seems delibrate. https://www.postgresql.org/message-id/20220828171029.GO2342%40telsasoft.com > There's enough copies of this to make it worth deduplicating. If we > use something like fingerprint_script: echo > ccache/$CIRRUS_BRANCH/$CIRRUS_OS we can use a yaml ref? > > > I'll look into it... > I think you experimented with creating a 'base' ccache dir (e.g. on the master > branch) and then using branch specific secondar caches? I have to revisit that sometime. That's a new feaure in ccache 4.4, which is currently only in macos. This is another thing that it'd be easier to test by having cfbot clobber the cirrus.yml rather than committing to postgres repo. (Technically, it should probably only do use the in-testing cirrus.yml if the patch it's testing doesn't itself modify .cirrus.yml) > How did that turn out? I think cfbot's caches constantly get removed > due to overrunning the global space. 
For cfbot, I don't know if there's much hope that any patch-specific build artifacts will be cached from the previous run, typically ~24h prior. One idea I have, for the "Warnings" task (and maybe linux too), is to defer pruning until after all the compilations. To avoid LRU pruning during early tasks causing bad hit ratios of later tasks. Another thing that probably happens is that task1 starts compiling patch1, and then another instance of task1 starts compiling patch2. A bit later, the first instance will upload its ccache result for patch1, which will be summarily overwritten by the second instance's compilation result, which doesn't include anything from the first instance. Also, whenever ccache hits its MAXSIZE threshold, it prunes the cache down to 80% of the configured size, which probably wipes away everything from all but the most recent ~20 builds. I also thought about having separate caches for each compilation in the warnings task - but that requires too much repeated yaml just for that.. > > From: Justin Pryzby > > Date: Sun, 3 Apr 2022 00:10:20 -0500 > > Subject: [PATCH 7/8] cirrus/ccache: disable compression and show stats > > > > linux/debian/bullseye has 4.2; 99MB cirrus cache; cache_siz
Re: Suppressing useless wakeups in walreceiver
On Mon, Nov 14, 2022 at 12:51:14PM +1300, Thomas Munro wrote: > On Mon, Nov 14, 2022 at 12:35 PM Thomas Munro wrote: >> Maybe there is a better way to code this (I mean, who likes global >> variables?) and I need to test some more, but I suspect the attached >> is approximately what we missed. > > Erm, ENOCOFFEE, sorry that's not quite right, it needs the apply LSN, > but it demonstrates what the problem is... Yeah, I'm able to sort-of reproduce the problem by sending fewer replies, but it's not clear to me why that's the problem. AFAICT all of the code paths that write/flush are careful to send a reply shortly afterward, which should keep writePtr/flushPtr updated. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
RE: logical replication restrictions
On Tuesday, November 8, 2022 2:27 PM Kuroda, Hayato/黒田 隼人 wrote: > If you could not allocate a time to discuss this problem because of other > important tasks or events, we would like to take over the thread and modify > your patch. > > We've planned that we will start to address comments and reported bugs if > you would not respond by the end of this week. Hi, I've simply rebased the patch to make it applicable on top of HEAD and make the tests pass. Note there are still open pending comments and I'm going to start to address those. I've written Euler as the original author in the commit message to note his credit. Best Regards, Takamichi Osumi v8-0001-Time-delayed-logical-replication-subscriber.patch Description: v8-0001-Time-delayed-logical-replication-subscriber.patch
Re: Bitmapsets as Nodes
On Mon, Oct 17, 2022 at 6:30 PM Amit Langote wrote: > For a couple of patches that I am working on ([1], [2]), I have needed > to put Bitmapsets into a List that is in turn part of a Plan tree or a > Node tree that may be written (using outNode) and read (using > nodeRead). Bitmapsets not being a Node themselves causes the > write/read of such Plan/Node trees (containing Bitmapsets in a List) > to not work properly. > > So, I included a patch in both of those threads to add minimal support > for Bitmapsets to be added into a Plan/Node tree without facing the > aforementioned problem, though Peter E suggested [3] that it would be > a good idea to discuss it more generally in a separate thread, so this > email. Attached a patch to make the infrastructure changes necessary > to allow adding Bitmapsets as Nodes, though I have not changed any of > the existing Bitmapset that are added either to Query or to > PlannedStmt to use that infrastructure. That is, by setting their > NodeTag and changing gen_node_support.pl to emit > WRITE/READ_NODE_FIELD() instead of WRITE/READ_BITMAPSET_FIELD() for > any Bitmapsets encountered in a Node tree. One thing I am not quite > sure about is who would be setting the NodeTag, the existing routines > in bitmapset.c, or if we should add wrappers that do. > > Actually, Tom had posted about exactly the same thing last year [4], > though trying to make Bitmapset Nodes became unnecessary after he > resolved the problem that required making Bitmapsets Nodes by other > means -- by getting rid of the field that was a List of Bitmapset > altogether. Maybe I should try to do the same in the case of both [1] > and [2]? In fact, I have tried getting rid of the need for List of > Bitmapset for [1], and I like the alternative better in that case, but > for [2], it still seems that a List of Bitmapset may be better than > List of some-new-Node-containing-the-Bitmapset. FTR, this has been taken care of in 5e1f3b9ebf6e5. -- Thanks, Amit Langote EDB: http://www.enterprisedb.com
Re: Suppressing useless wakeups in walreceiver
On Mon, Nov 14, 2022 at 1:05 PM Nathan Bossart wrote: > Yeah, I'm able to sort-of reproduce the problem by sending fewer replies, > but it's not clear to me why that's the problem. AFAICT all of the code > paths that write/flush are careful to send a reply shortly afterward, which > should keep writePtr/flushPtr updated. Ahh, I think I might have it. In the old coding, sendTime starts out as 0, which I think caused it to send its first reply message after the first 100ms sleep, and only after that fall into a cadence of wal_receiver_status_interval (10s) while idle. Our new coding won't send its first reply until start time + wal_receiver_status_interval. If I have that right, think we can get back to the previous behaviour by explicitly setting the first message time, like: @@ -433,6 +433,9 @@ WalReceiverMain(void) for (int i = 0; i < NUM_WALRCV_WAKEUPS; ++i) WalRcvComputeNextWakeup(i, now); + /* XXX start with a reply after 100ms */ + wakeup[WALRCV_WAKEUP_REPLY] = now + 10; + /* Loop until end-of-streaming or error */ Obviously that's bogus and racy (it races with wait_for_catchup, and it's slow, actually both sides are not great and really should be event-driven, in later work)...
Re: Add palloc_aligned() to allow arbitrary power of 2 memory alignment
On Tue, Nov 8, 2022 at 8:57 AM David Rowley wrote: > On Tue, 8 Nov 2022 at 05:24, Andres Freund wrote: > > I doubtthere's ever a need to realloc such a pointer? Perhaps we could just > > elog(ERROR)? > > Are you maybe locked into just thinking about your current planned use > case that we want to allocate BLCKSZ bytes in each case? It does not > seem impossible to me that someone will want something more than an > 8-byte alignment and also might want to enlarge the allocation at some > point. I thought it might be more dangerous not to implement repalloc. > It might not be clear to someone using palloc_aligned() that there's > no code path that can call repalloc on the returned pointer. I can imagine a use case for arrays of cacheline-sized objects. > TYPEALIGN() will not work correctly unless the alignment is a power of > 2. We could modify it to, but that would require doing some modular > maths instead of bitmasking. That seems pretty horrible when the macro > is given a value that's not constant at compile time as we'd end up > with a (slow) divide in the code path. I think the restriction is a > good idea. I imagine there will never be any need to align to anything > that's not a power of 2. +1 > > Should we handle the case where we get a suitably aligned pointer from > > MemoryContextAllocExtended() differently? > > Maybe it would be worth the extra check. I'm trying to imagine future > use cases. Maybe if someone wanted to ensure that we're aligned to > CPU cache line boundaries then the chances of the pointer already > being aligned to 64 bytes is decent enough. The problem is it that > it's too late to save any memory, it just saves a bit of boxing and > unboxing of the redirect headers. To my mind the main point of detecting this case is to save memory, so if that's not possible/convenient, special-casing doesn't seem worth it. - Assert((char *) chunk > (char *) block); + Assert((char *) chunk >= (char *) block); Is this related or independent? -- John Naylor EDB: http://www.enterprisedb.com
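To spell out the power-of-2 point: TYPEALIGN-style rounding is a single add-and-mask, which only works when ALIGNVAL is a power of two; an arbitrary alignment would need modular arithmetic (a divide) instead. The existing macro in c.h is essentially:

    #define TYPEALIGN(ALIGNVAL,LEN)  \
        (((uintptr_t) (LEN) + ((ALIGNVAL) - 1)) & ~((uintptr_t) ((ALIGNVAL) - 1)))

    /* e.g. TYPEALIGN(64, ptr) rounds ptr up to the next 64-byte boundary,
     * because 64 - 1 = 0x3f and ~0x3f masks off the low six bits */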
Re: Add test module for Custom WAL Resource Manager feature
On Sat, Nov 12, 2022 at 4:40 AM Jeff Davis wrote: > > On Fri, 2022-11-11 at 17:01 +0530, Bharath Rupireddy wrote: > > Hi, > > > > Inspired by recent commits 9fcdf2c, e813e0e and many small test > > modules/extensions under src/test/modules, I would like to propose > > one > > such test module for Custom WAL Resource Manager feature introduced > > by > > commit 5c279a6. It not only covers the code a bit, but it also > > demonstrates usage of the feature. > > > > I'm attaching a patch herewith. Thoughts? > > Good idea. Thanks. > Can we take it a little further to exercise the decoding > path, as well? For instance, we can do something like a previous proposal[1], > except > it can now be done as an extension. If it's useful, we could even put > it in contrib with a real RMGR ID. > > [1] > https://www.postgresql.org/message-id/20ee0b0ae6958804a88fe9580157587720faf664.ca...@j-davis.com We have tests/modules defined for testing logical decoding, no? If the intention is to define rm_redo in this test module, I think it's not required. > Though I'm also fine just adding a test module to start with. Thanks. I would like to keep it simple. I've added some more comments and attached v2 patch herewith. Please review. I've also added a CF entry - https://commitfest.postgresql.org/41/4009/. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com v2-0001-Add-test-module-for-Custom-WAL-Resource-Manager-f.patch Description: Binary data
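The attached patch is not reproduced here; for readers, the registration part of such a module looks roughly like the sketch below (all names are placeholders and the callbacks are deliberately no-op stubs; RM_EXPERIMENTAL_ID is the RmgrId reserved for experimentation):

    #include "postgres.h"
    #include "access/rmgr.h"
    #include "access/xlog_internal.h"
    #include "fmgr.h"
    #include "lib/stringinfo.h"
    #include "miscadmin.h"

    PG_MODULE_MAGIC;

    void _PG_init(void);

    /* minimal no-op callbacks, just enough to register the resource manager */
    static void test_rmgr_redo(XLogReaderState *record) {}
    static void test_rmgr_desc(StringInfo buf, XLogReaderState *record) {}
    static const char *test_rmgr_identify(uint8 info) { return NULL; }

    static const RmgrData test_rmgr = {
        .rm_name = "test_custom_rmgr",
        .rm_redo = test_rmgr_redo,
        .rm_desc = test_rmgr_desc,
        .rm_identify = test_rmgr_identify,
    };

    void
    _PG_init(void)
    {
        /* custom rmgrs can only be registered at shared library preload time */
        if (!process_shared_preload_libraries_in_progress)
            ereport(ERROR,
                    (errmsg("this module must be loaded via shared_preload_libraries")));

        RegisterCustomRmgr(RM_EXPERIMENTAL_ID, &test_rmgr);
    }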
RE: [PATCH] Support % wildcard in extension upgrade filenames
> Re: Sandro Santilli > > I'm attaching an updated version of the patch. This time the patch is > > tested. Nothing changes unless the .control file for the subject > > extension doesn't have a "wildcard_upgrades = true" statement. > > > > When wildcard upgrades are enabled, a file with a "%" symbol as the > > "source" part of the upgrade path will match any version and > > Fwiw I believe wildcard_upgrades isn't necessary in the .control file. > If there are no % files, none would be used anyway, and if there are, it's clear > it's meant as wildcard since % won't appear in any remotely sane version > number. > > Christoph I also like the idea of skipping the wildcard_upgrades syntax. Then there is no need to have a conditional control file for PG 16 vs. older versions. Thanks, Regina
Re: Allow file inclusion in pg_hba and pg_ident files
On Sat, Nov 12, 2022 at 04:13:53PM +0800, Julien Rouhaud wrote: > It's looks good to me. I agree that file name and line number should be > enough > to diagnose any unexpected error. Thanks for checking. I have looked at 0001 and 0002 again with a fresh mind, and applied both of them this morning. This makes the remaining bits of the patch much easier to follow in hba.c. Here are more comments after a closer review of the whole for the C logic. -MemoryContext -tokenize_auth_file(const char *filename, FILE *file, List **tok_lines, - int elevel, int depth) +static void +tokenize_file_with_context(MemoryContext linecxt, const char *filename, I really tend to prefer having one routine rather than two for the tokenization entry point. Switching to the line context after setting up the callback is better, and tokenize_file_with_context() does so. Anyway, what do you think about having one API that gains a "MemoryContext *" argument, as of the following: void tokenize_auth_file(const char *filename, FILE *file, List **tok_lines, int depth, int elevel, MemoryContext *linectx) If the caller passes NULL for *linectx as the initial line context, just create it as we do now. If *linectx is not NULL, just reuse it. That may be cleaner than returning the created MemoryContext as returned result from tokenize_auth_file(). +/* Cumulate errors if any. */ +if (err_msg) +{ +if (err_buf.len > 0) +appendStringInfoChar(&err_buf, '\n'); +appendStringInfoString(&err_buf, err_msg); +} This aggregates all the error messages for all the files included in a given repository. As the patch stands, it seems to me that we would get errors related to an include_dir clause for two cases: - The specified path does not exist, in which case we have only one err_msg to consume and report back. - Multiple failures in opening and/or reading included files. In the second case, aggregating the reports would provide a full set of information, but that's not something a user would be able to act on directly as this is system-related. Or there is a case to know a full list of files in the case of multiple files that cannot be read because of permission issues? We may be fine with just the first system error here. Note that in the case where files can be read and opened, these would have their own TokenizedAuthLines for each line parsed, meaning one line in the SQL views once translated to an HbaLine or an IdentLine. This line of thoughts brings an interesting point, actually: there is an inconsistency between "include_if_exists" and "include" compared to the GUC processing. As of the patch, if we use "include" on a file that does not exist, the tokenization logic would jump over it and continue processing the follow-up entries anyway. This is a different policy than the GUCs, where we would immediately stop looking at parameters after an "include" if it fails because its file does not exist, working as a immediate stop in the processing. The difference that the patch brings between "include_if_exists" and "include" is that we report an error in one case but not the other, still skip the files in both cases and move on with the rest. Hence my question, shouldn't we do like the GUC processing for the hba and ident files, aka stop immediately when we fail to find a file on an "include" clause? This would be equivalent to doing a "break" in tokenize_file_with_context() after failing an include file. -- Michael signature.asc Description: PGP signature
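For clarity, the single-entry-point idea sketched above would look something like this (the context name and flags are illustrative): the outermost caller passes a NULL context pointer, and recursive calls for included files reuse the context created on the first call.

    void
    tokenize_auth_file(const char *filename, FILE *file, List **tok_lines,
                       int depth, int elevel, MemoryContext *linectx)
    {
        if (*linectx == NULL)
            *linectx = AllocSetContextCreate(CurrentMemoryContext,
                                             "tokenize_auth_file",
                                             ALLOCSET_SMALL_SIZES);

        /* ... tokenize each line, recursing with the same linectx when an
         * include, include_if_exists or include_dir clause is found ... */
    }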
Time delayed LR (WAS Re: logical replication restrictions)
Hi, The thread title doesn't really convey the topic under discussion, so changed it. IIRC, this has been mentioned by others as well in the thread. On Sat, Nov 12, 2022 at 7:21 PM vignesh C wrote: > > Few comments: > 1) I feel if the user has specified a long delay there is a chance > that replication may not continue if the replication slot falls behind > the current LSN by more than max_slot_wal_keep_size. I feel we should > add this reference in the documentation of min_apply_delay as the > replication will not continue in this case. > This makes sense to me. > 2) I also noticed that if we have to shut down the publisher server > with a long min_apply_delay configuration, the publisher server cannot > be stopped as the walsender waits for the data to be replicated. Is > this behavior ok for the server to wait in this case? If this behavior > is ok, we could add a log message as it is not very evident from the > log files why the server could not be shut down. > I think for this case, the behavior should be the same as for physical replication. Can you please check what is behavior for the case you are worried about in physical replication? Note, we already have a similar parameter for recovery_min_apply_delay for physical replication. -- With Regards, Amit Kapila.
Re: Schema variables - new implementation for Postgres 15
On 13.11.2022 20:59, Pavel Stehule wrote: fresh rebase Hello, Sorry, I haven't been following this thread, but I'd like to report a memory management bug. I couldn't apply the latest patches, so I tested with v20221104-1-* patches applied atop of commit b0284bfb1db. postgres=# create variable s text default 'abc'; create function f() returns text as $$ begin return g(s); end; $$ language plpgsql; create function g(t text) returns text as $$ begin let s = 'BOOM!'; return t; end; $$ language plpgsql; select f(); CREATE VARIABLE CREATE FUNCTION CREATE FUNCTION server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. LOG: server process (PID 55307) was terminated by signal 11: Segmentation fault DETAIL: Failed process was running: select f(); I believe it's a use-after-free error, triggered by assigning a new value to s in g(), thus making t a dangling pointer. After reconnecting I get a scary error: postgres=# select f(); ERROR: compressed pglz data is corrupt Best regards, -- Sergey Shinderukhttps://postgrespro.com/
Re: Time delayed LR (WAS Re: logical replication restrictions)
On Mon, Nov 14, 2022 at 12:14 PM Amit Kapila wrote: > > On Sat, Nov 12, 2022 at 7:21 PM vignesh C wrote: > > > > Few comments: > > 1) I feel if the user has specified a long delay there is a chance > > that replication may not continue if the replication slot falls behind > > the current LSN by more than max_slot_wal_keep_size. I feel we should > > add this reference in the documentation of min_apply_delay as the > > replication will not continue in this case. > > > > This makes sense to me. > > > 2) I also noticed that if we have to shut down the publisher server > > with a long min_apply_delay configuration, the publisher server cannot > > be stopped as the walsender waits for the data to be replicated. Is > > this behavior ok for the server to wait in this case? If this behavior > > is ok, we could add a log message as it is not very evident from the > > log files why the server could not be shut down. > > > > I think for this case, the behavior should be the same as for physical > replication. Can you please check what is behavior for the case you > are worried about in physical replication? Note, we already have a > similar parameter for recovery_min_apply_delay for physical > replication. > I don't understand the reason for the below change in the patch: + /* + * If this subscription has been disabled and it has an apply + * delay set, wake up the logical replication worker to finish + * it as soon as possible. + */ + if (!opts.enabled && sub->applydelay > 0) + logicalrep_worker_wakeup(sub->oid, InvalidOid); + It seems to me Kuroda-San has proposed this change [1] to fix the test but it is not clear to me why such a change is required. Why can't CHECK_FOR_INTERRUPTS() after waiting, followed by the existing below code [2] in LogicalRepApplyLoop() sufficient to handle parameter updates? [2] if (!in_remote_transaction && !in_streamed_transaction) { /* * If we didn't get any transactions for a while there might be * unconsumed invalidation messages in the queue, consume them * now. */ AcceptInvalidationMessages(); maybe_reread_subscription(); ... [1] - https://www.postgresql.org/message-id/TYAPR01MB5866F9716A18DA0C68A2CDB3F5469%40TYAPR01MB5866.jpnprd01.prod.outlook.com -- With Regards, Amit Kapila.
Re: Non-decimal integer literals
On Mon, Oct 10, 2022 at 9:17 PM Peter Eisentraut < peter.eisentr...@enterprisedb.com> wrote: > Taking another look around ecpg to see how this interacts with C-syntax > integer literals. I'm not aware of any particular issues, but it's > understandably tricky. I can find no discussion in the archives about the commit that added "xch": commit 6fb3c3f78fbb2296894424f6e3183d339915eac7 Author: Michael Meskes Date: Fri Oct 15 19:02:08 1999 + *** empty log message *** ...and I can't think of why bounds checking a C literal was done like this. Regarding the patch, it looks good overall. My only suggestion would be to add a regression test for just below and just above overflow, at least for int2. Minor nits: - * Process {integer}. Note this will also do the right thing with {decimal}, + * Process {*integer}. Note this will also do the right thing with {numeric}, I scratched my head for a while, thinking this was Flex syntax, until I realized my brain was supposed to do shell-globbing first, at which point I could see it was referring to multiple Flex rules. I'd try to rephrase. +T661 Non-decimal integer literals YES SQL:202x draft Is there an ETA yet? Also, it's not this patch's job to do it, but there are at least a half dozen places that open-code turning a hex char into a number. It might be a good easy "todo item" to unify that. -- John Naylor EDB: http://www.enterprisedb.com
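Regarding that last point, the open-coded call sites all do some variant of this conversion; a shared inline helper along these lines (the name and placement are only a suggestion) would be enough to unify them:

    static inline int
    hexval(unsigned char c)
    {
        if (c >= '0' && c <= '9')
            return c - '0';
        if (c >= 'a' && c <= 'f')
            return c - 'a' + 10;
        if (c >= 'A' && c <= 'F')
            return c - 'A' + 10;
        return -1;              /* not a hex digit */
    }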
Re: Making Bitmapsets be valid Nodes
On Mon, Nov 14, 2022 at 12:23 AM Tom Lane wrote: > Peter Eisentraut writes: > > On 11.11.22 21:05, Tom Lane wrote: > >> Attached is a cleaned-up version of Amit's patch v24-0003 at [2]. > >> I fixed the problems with not always tagging Bitmapsets, and changed > >> the outfuncs/readfuncs logic so that Bitmapsets still print exactly > >> as they did before (thus, this doesn't require a catversion bump). > > > This looks good to me. > > Pushed, thanks for looking. Thanks a lot for this. I agree that it may not be worthwhile to add an extra function call by changing COPY_BITMAPSET_FIELD, etc. that is currently emitted by gen_node_support.pl for any Bitmapset * / Relid struct members to COPY_NODE_FIELD, etc. -- Thanks, Amit Langote EDB: http://www.enterprisedb.com
Re: ExecRTCheckPerms() and many prunable partitions
On Sat, Nov 12, 2022 at 1:46 AM Tom Lane wrote: > Alvaro Herrera writes: > > On 2022-Oct-06, Amit Langote wrote: > >> Actually, List of Bitmapsets turned out to be something that doesn't > >> just-work with our Node infrastructure, which I found out thanks to > >> -DWRITE_READ_PARSE_PLAN_TREES. So, I had to go ahead and add > >> first-class support for copy/equal/write/read support for Bitmapsets, > > > Hmm, is this related to what Tom posted as part of his 0004 in > > https://postgr.es/m/2901865.1667685...@sss.pgh.pa.us > > It could be. For some reason I thought that Amit had withdrawn > his proposal to make Bitmapsets be Nodes. I think you are referring to [1] that I had forgotten to link to here. I did reimplement a data structure in my patch on the "Re: generic plans and initial pruning" thread to stop using a List of Bitmapsets, so the Bitmapset as Nodes functionality became unnecessary there, though I still need it for the proposal here to move extraUpdatedColumns (patch 0004) into ModifyTable node. > The code I was using that for would rather have fixed-size arrays > of Bitmapsets than variable-size Lists, mainly because it always > knows ab initio what the max length of the array will be. But > I don't think that the preference is so strong that it justifies > a private data structure. > > The main thing I was wondering about in connection with that > was whether to assume that there could be other future applications > of the logic to perform multi-bitmapset union, intersection, > etc. If so, then I'd be inclined to choose different naming and > put those functions in or near to bitmapset.c. It doesn't look > like Amit's code needs anything like that, but maybe somebody > has an idea about other applications? Yes, simple storage of multiple Bitmapsets in a List somewhere in a parse/plan tree sounded like that would have wider enough use to add proper node support for. Assuming you mean trying to generalize VarAttnoSet in your patch 0004 posted at [2], I wonder if you want to somehow make its indexability by varno / RT index a part of the interface of the generic code you're thinking for it? For example, varattnoset_*_members collection of routines in that patch seem to assume that the Bitmapsets at a given index in the provided pair of VarAttnoSets are somehow related -- covering to the same base relation in this case. That does not sound very generalizable but maybe that is not what you are thinking at all. > Anyway, I concur with Peter's upthread comment that making > Bitmapsets be Nodes is probably justifiable all by itself. > The lack of a Node tag in them now is just because in a 32-bit > world it seemed like unnecessary bloat. But on 64-bit machines > it's free, and we aren't optimizing for 32-bit any more. > > I do not like the details of v24-0003 at all though, because > it seems to envision that a "node Bitmapset" is a different > thing from a raw Bitmapset. That can only lead to bugs --- > why would we not make it the case that every Bitmapset is > properly labeled with the node tag? Yeah, I just didn't think hard enough to realize that having bitmapset.c itself set the node tag is essentially free, and it looks like a better design anyway than the design where callers get to choose to make the bitmapset they are manipulating a Node or not. > Also, although I'm on board with making Bitmapsets be Nodes, > I don't think I'm on board with changing their dump format. 
> Planner node dumps would get enormously bulkier and less
> readable if we changed things like
>
> :relids (b 1 2)
>
> to
>
> :relids
> {BITMAPSET
> (b 1 2)
> }
>
> or whatever the output would look like as the patch stands.
> So that needs a bit more effort, but it's surely manageable.

Agreed on leaving the dump format unchanged so as not to bloat it.  Thanks a lot for 5e1f3b9ebf6e5.

--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com

[1] https://www.postgresql.org/message-id/CA+HiwqG8L3DVoZauJi1-eorLnnoM6VcfJCCauQX8=ofi-qm...@mail.gmail.com
[2] https://www.postgresql.org/message-id/2901865.1667685211%40sss.pgh.pa.us
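To illustrate the kind of multi-bitmapset helper being discussed, here is a hypothetical sketch -- not taken from either patch; the function name and the assumption that the two Lists are positionally aligned are illustrative only:

/*
 * Hypothetical sketch of a multi-bitmapset union helper of the sort that
 * could live near bitmapset.c.  Assumes both Lists hold Bitmapsets and are
 * positionally aligned, i.e. the i'th members belong together.
 */
static List *
multi_bms_union(List *a, List *b)
{
	List	   *result = NIL;
	ListCell   *lca,
			   *lcb;

	Assert(list_length(a) == list_length(b));
	forboth(lca, a, lcb, b)
		result = lappend(result,
						 bms_union((Bitmapset *) lfirst(lca),
								   (Bitmapset *) lfirst(lcb)));
	return result;
}

A matching intersection helper would look the same with bms_intersect() substituted for bms_union().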
Re: Allow file inclusion in pg_hba and pg_ident files
Hi,

On Mon, Nov 14, 2022 at 02:40:37PM +0900, Michael Paquier wrote:
> On Sat, Nov 12, 2022 at 04:13:53PM +0800, Julien Rouhaud wrote:
> > It looks good to me.  I agree that file name and line number should be
> > enough to diagnose any unexpected error.
>
> Thanks for checking.  I have looked at 0001 and 0002 again with a
> fresh mind, and applied both of them this morning.

Thanks a lot!

> This makes the remaining bits of the patch much easier to follow in
> hba.c.  Here are more comments after a closer review of the whole C
> logic.

Agreed.

> -MemoryContext
> -tokenize_auth_file(const char *filename, FILE *file, List **tok_lines,
> -                   int elevel, int depth)
> +static void
> +tokenize_file_with_context(MemoryContext linecxt, const char *filename,
>
> I really tend to prefer having one routine rather than two for the
> tokenization entry point.  Switching to the line context after setting
> up the callback is better, and tokenize_file_with_context() does so.
> Anyway, what do you think about having one API that gains a
> "MemoryContext *" argument, as of the following:
> void tokenize_auth_file(const char *filename, FILE *file,
>                         List **tok_lines,
>                         int depth, int elevel, MemoryContext *linectx)
>
> If the caller passes NULL for *linectx as the initial line context,
> just create it as we do now.  If *linectx is not NULL, just reuse it.
> That may be cleaner than returning the created MemoryContext as the
> return value of tokenize_auth_file().

I originally used two functions as it's a common pattern (e.g. ReadBuffer / ReadBufferExtended, or all the *_internal versions of functions) and to avoid an unnecessary branch, but I agree that here an extra branch is unlikely to make any measurable difference.  It would only matter with -DEXEC_BACKEND / win32, and in that case the already expensive new-backend mechanism costs far more than that branch, so +1 to keep things simple here.

> +        /* Cumulate errors if any. */
> +        if (err_msg)
> +        {
> +            if (err_buf.len > 0)
> +                appendStringInfoChar(&err_buf, '\n');
> +            appendStringInfoString(&err_buf, err_msg);
> +        }
>
> This aggregates all the error messages for all the files included in a
> given directory.  As the patch stands, it seems to me that we would
> get errors related to an include_dir clause for two cases:
> - The specified path does not exist, in which case we have only one
>   err_msg to consume and report back.
> - Multiple failures in opening and/or reading included files.
> In the second case, aggregating the reports would provide a full set
> of information, but that's not something a user would be able to act
> on directly as this is system-related.  Or is there a case for knowing
> the full list of files when multiple files cannot be read because of
> permission issues?  We may be fine with just the first system error
> here.  Note that in the case where files can be opened and read, these
> would have their own TokenizedAuthLines for each line parsed, meaning
> one line in the SQL views once translated to an HbaLine or an
> IdentLine.

If you have an include_dir directive and multiple files have wrong permissions (or maybe a broken symlink or something like that), you will get multiple errors when trying to process that single directive.  I think it's friendlier to report as much detail as we can, so users can make sure they fix everything at once rather than iterating one error at a time.  That's especially helpful if the fix is done in some external tooling (puppet or whatever) rather than directly on the server.
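To make the create-or-reuse contract discussed above concrete, here is a minimal sketch of the proposed single entry point -- the body and the memory-context details are assumptions for illustration, not the actual patch:

void
tokenize_auth_file(const char *filename, FILE *file, List **tok_lines,
				   int depth, int elevel, MemoryContext *linectx)
{
	MemoryContext oldcxt;

	/* Create the line context on the first call, reuse it when recursing. */
	if (*linectx == NULL)
		*linectx = AllocSetContextCreate(CurrentMemoryContext,
										 "tokenize_auth_file",
										 ALLOCSET_SMALL_SIZES);
	oldcxt = MemoryContextSwitchTo(*linectx);

	/* ... read each line, tokenize it, and recurse on include directives ... */

	MemoryContextSwitchTo(oldcxt);
}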
> This line of thought brings up an interesting point, actually: there is
> an inconsistency between "include_if_exists" and "include" compared to
> the GUC processing.  As of the patch, if we use "include" on a file
> that does not exist, the tokenization logic would jump over it and
> continue processing the follow-up entries anyway.  This is a different
> policy from the GUCs, where we would immediately stop looking at
> parameters after an "include" if it fails because its file does not
> exist, working as an immediate stop in the processing.  The difference
> that the patch brings between "include_if_exists" and "include" is
> that we report an error in one case but not the other, while still
> skipping the file in both cases and moving on with the rest.  Hence my
> question: shouldn't we do the same as the GUC processing for the hba
> and ident files, i.e. stop immediately when we fail to find a file on
> an "include" clause?  This would be equivalent to doing a "break" in
> tokenize_file_with_context() after failing on an include file.

I think the problem is that we have the same interface for processing the files at startup/reload and for filling the views, so if we want the views to be helpful and report all errors, we have to allow a bogus "include" to continue in the reload case too.  The same problem doe