Re: Allow file inclusion in pg_hba and pg_ident files
On Tue, Oct 25, 2022 at 03:43:21PM +0900, Michael Paquier wrote:
>
> Another advantage is that it minimizes the presence of the hardcoded
> HbaFileName and IdentFileName in hba.c, which is one thing we are
> trying to achieve here for the inclusion of more files.  I found it a
> bit strange that IdentLine had no sourcefile, actually.  We track the
> line number but use it nowhere, and it seems to me that having more
> symmetry between both would be a good thing.

If IdentLine->linenumber is useless, why not get rid of it rather than
tracking another piece of useless information?  Hba is much more
structured, so we need a more specialized struct, but I don't think
ident will ever go that way.

> So, the key of the logic is how we are going to organize the
> tokenization of the HBA and ident lines through all the inclusions.
> As far as I get it, tokenize_auth_file() is the root call and
> tokenize_file_with_context() with its depth is able to work on each
> individual file, and it can optionally recurse depending on what's
> included.  Why do you need to switch to the old context in
> tokenize_file_with_context()?  Could it be simpler to switch once to
> linecxt outside of the internal routine?

It just seemed the cleanest way to go.  We could do without, but then we
would have to duplicate MemoryContextSwitchTo() calls all over the
place, and probably handle an optional memory context creation in the
function.

> It looks like GetDirConfFiles() is another piece that can be
> refactored and reviewed on its own, as we use it in
> ParseConfigDirectory()@guc.c.

I'm fine with it.
Re: fixing typo in comment for restriction_is_or_clause
On 2022-Oct-25, Richard Guo wrote:

> Agree with your point.  Do you think we can further make the one-line
> function a macro or an inline function in the .h file?

We can, but should we?

> I think this function is called quite frequently during planning, so
> maybe doing that would bring a little bit of efficiency.

You'd need to measure it and show some gains.

-- 
Álvaro Herrera         Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"You always have to feed the gods, even when the land is dry" (Orual)
Re: Issue in GIN fast-insert: XLogBeginInsert + Read/LockBuffer ordering
On 2022-Oct-25, Tom Lane wrote:

> Michael Paquier writes:
> > On Mon, Oct 24, 2022 at 02:22:16PM +0200, Alvaro Herrera wrote:
> >> I confess I don't understand why it is important that XLogBeginInsert
> >> is called inside the critical section.  It seems to me that that part
> >> is only a side-effect of having to acquire the buffer locks in the
> >> critical section.  Right?
>
> > Yeah, you are right that it would not matter for XLogBeginInsert(),
> > though I'd like to think that this is a good practice on consistency
> > grounds with anywhere else, and we respect what's documented in the
> > README.
>
> Yeah --- it's documented that way, and there doesn't seem to be
> a good reason not to honor that here.

Okay, so if we follow this argument, then the logical conclusion is that
this *should* be backpatched, after all.

-- 
Álvaro Herrera         Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
Maybe there's lots of data loss but the records of data loss are also lost.
(Lincoln Yeoh)
Re: fixing typo in comment for restriction_is_or_clause
On Tue, Oct 25, 2022 at 2:25 PM John Naylor wrote:
>
> On Tue, Oct 25, 2022 at 9:48 AM Richard Guo wrote:
> >
> > On Tue, Oct 25, 2022 at 10:05 AM John Naylor <john.nay...@enterprisedb.com> wrote:
> >>
> >> It's perfectly clear and simple now, even if it doesn't win at "code golf".
> >
> > Agree with your point.  Do you think we can further make the one-line
> > function a macro or an inline function in the .h file?  I think this
> > function is called quite frequently during planning, so maybe doing that
> > would bring a little bit of efficiency.
>
> My point was misunderstood, which is: I don't think we need to do anything
> at all here if the goal was purely about aesthetics.
>
> If the goal has now changed to efficiency, I have no opinion about that
> yet since no evidence has been presented.

Now I think I've got your point.  Sorry for the misread.  Your concern
makes sense.  When talking about efficiency we'd better attach some
concrete proof, such as benchmark tests.

Thanks
Richard
Re: fixing typo in comment for restriction_is_or_clause
On Tue, Oct 25, 2022 at 3:37 PM Alvaro Herrera wrote:

> On 2022-Oct-25, Richard Guo wrote:
>
> > Agree with your point.  Do you think we can further make the one-line
> > function a macro or an inline function in the .h file?
>
> We can, but should we?
>
> > I think this function is called quite frequently during planning, so
> > maybe doing that would bring a little bit of efficiency.
>
> You'd need to measure it and show some gains.

Yeah, that is what has to be done to make it happen.

Thanks
Richard
Re: Proposal: Adding isbgworker column to pg_stat_activity
Hello,

I just noticed that this proposal from 2020 didn't get any backers:

On 2020-Dec-01, Paul Martinez wrote:

> It is currently slightly difficult to determine how many background worker
> processes are currently running, which is useful when trying to manage
> the max_worker_processes parameter.  It seems the best bet is to use the
> backend_type column and filter out the many types that are defined in
> miscinit.c:
>
> https://github.com/postgres/postgres/blob/REL_13_1/src/backend/utils/init/miscinit.c#L201-L253
>
> I would like to propose adding a simple column isbgworker, that simply
> stores the value of the expression `beentry->st_backendType == B_BG_WORKER`,
> which is used in pg_stat_get_activity.

Having recently had the chance to interact with this --and not at all
happy about how it went-- I think we should do this, or something like
it.  It should be easy to filter processes that correspond to bgworkers
in or out of pg_stat_activity.  One simple idea would be to prefix
something to backend_type (IIRC we used to do that), but a separate flag
is probably more appropriate.

Are you up for writing a patch?  I'm guessing it should be easy.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"You can live only once, but if you do it right, once is enough"
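[A minimal sketch of the difference under discussion; the isbgworker
column is only proposed here and does not exist, and the excluded type
strings below are an abridged, hand-maintained list:]

    -- Today: count background workers by excluding every known core
    -- backend type string (must be kept in sync with miscinit.c).
    SELECT count(*)
    FROM pg_stat_activity
    WHERE backend_type NOT IN ('client backend', 'autovacuum launcher',
                               'autovacuum worker', 'background writer',
                               'checkpointer', 'startup', 'walreceiver',
                               'walsender', 'walwriter',
                               'logical replication launcher');

    -- With the proposed flag, the same question becomes:
    SELECT count(*) FROM pg_stat_activity WHERE isbgworker;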
Confused about TransactionIdSetTreeStatus
Hi, hackers

I'm a bit confused about TransactionIdSetTreeStatus.  The comment says
that if subtransactions cross multiple CLOG pages, it will mark the
subxids that are on the same page as the main transaction as
sub-committed, and then set the main transaction and subtransactions to
committed (step 2).

 * Example:
 *	TransactionId t commits and has subxids t1, t2, t3, t4
 *	t is on page p1, t1 is also on p1, t2 and t3 are on p2, t4 is on p3
 *	1. update pages2-3:
 *		page2: set t2,t3 as sub-committed
 *		page3: set t4 as sub-committed
 *	2. update page1:
 *		set t1 as sub-committed,
 *		then set t as committed,
			then set t1 as committed
 *	3. update pages2-3:
 *		page2: set t2,t3 as committed
 *		page3: set t4 as committed

However, the code marks the main transaction and subtransactions
directly as committed:

	/*
	 * If this is a commit then we care about doing this correctly (i.e.
	 * using the subcommitted intermediate status).  By here, we know
	 * we're updating more than one page of clog, so we must mark entries
	 * that are *not* on the first page so that they show as subcommitted
	 * before we then return to update the status to fully committed.
	 *
	 * To avoid touching the first page twice, skip marking subcommitted
	 * for the subxids on that first page.
	 */
	if (status == TRANSACTION_STATUS_COMMITTED)
		set_status_by_pages(nsubxids - nsubxids_on_first_page,
							subxids + nsubxids_on_first_page,
							TRANSACTION_STATUS_SUB_COMMITTED, lsn);

	/*
	 * Now set the parent and subtransactions on same page as the parent,
	 * if any
	 */
	pageno = TransactionIdToPage(xid);
	TransactionIdSetPageStatus(xid, nsubxids_on_first_page, subxids,
							   status, lsn, pageno, false);

	/*
	 * Now work through the rest of the subxids one clog page at a time,
	 * starting from the second page onwards, like we did above.
	 */
	set_status_by_pages(nsubxids - nsubxids_on_first_page,
						subxids + nsubxids_on_first_page,
						status, lsn);

Is the comment correct?  If not, should we remove it?

-- 
Regards,
Japin Li.
ChengDu WenWu Information Technology Co., Ltd.
Re: Make EXPLAIN generate a generic plan for a parameterized query
On Wed, 2022-10-12 at 00:03 +0800, Julien Rouhaud wrote:
> On Tue, Oct 11, 2022 at 09:49:14AM -0400, Tom Lane wrote:
> > I think it might be better to drive it off an explicit EXPLAIN option,
> > perhaps
> >
> > 	EXPLAIN (GENERIC_PLAN) SELECT * FROM tab WHERE col = $1;
> >
> > If you're trying to investigate custom-plan behavior, then you
> > need to supply concrete parameter values somewhere, so I think
> > this approach is fine for that case.  (Shoehorning parameter
> > values into EXPLAIN options seems like it'd be a bit much.)
> > However, investigating generic-plan behavior this way is tedious,
> > since you have to invent irrelevant parameter values, plus mess
> > with plan_cache_mode or else run the explain half a dozen times.
> > So I can get behind having a more convenient way for that.
>
> One common use case is tools identifying a slow query using
> pg_stat_statements, identifying some missing indexes and then wanting
> to check whether the index should be useful using some hypothetical
> index.
>
> FTR I'm working on such a project and for now we have to go to great
> lengths trying to "unjumble" such queries, so having a way to easily
> get the answer for a generic plan would be great.

Thanks for the suggestions and the encouragement.  Here is a patch that
implements it with an EXPLAIN option named GENERIC_PLAN.

Yours,
Laurenz Albe

From 85991f35f0de6e4e0a0b5843373e2ba3d5976c85 Mon Sep 17 00:00:00 2001
From: Laurenz Albe
Date: Tue, 25 Oct 2022 11:01:53 +0200
Subject: [PATCH] Add EXPLAIN option GENERIC_PLAN

This allows EXPLAIN to generate generic plans for parameterized
statements (that have parameter placeholders like $1 in the statement
text).

Author: Laurenz Albe
Discussion: https://postgr.es/m/0a29b954b10b57f0d135fe12aa0909bd41883eb0.camel%40cybertec.at
---
 doc/src/sgml/ref/explain.sgml         | 15 ++
 src/backend/commands/explain.c        |  9 +
 src/backend/parser/analyze.c          | 13 +
 src/backend/parser/parse_coerce.c     | 15 ++
 src/backend/parser/parse_expr.c       | 16 +++
 src/include/commands/explain.h        |  1 +
 src/include/parser/parse_node.h       |  2 ++
 src/test/regress/expected/explain.out | 28 +++
 src/test/regress/sql/explain.sql      | 16 +++
 9 files changed, 115 insertions(+)

diff --git a/doc/src/sgml/ref/explain.sgml b/doc/src/sgml/ref/explain.sgml
index d4895b9d7d..659d5c51b6 100644
--- a/doc/src/sgml/ref/explain.sgml
+++ b/doc/src/sgml/ref/explain.sgml
@@ -40,6 +40,7 @@ EXPLAIN [ ANALYZE ] [ VERBOSE ] statement
     boolean ]
     COSTS [ boolean ]
     SETTINGS [ boolean ]
+    GENERIC_PLAN [ boolean ]
     BUFFERS [ boolean ]
     WAL [ boolean ]
     TIMING [ boolean ]
@@ -167,6 +168,20 @@ ROLLBACK;
 
+
+    GENERIC_PLAN
+
+
+     Generate a generic plan for the statement (see 
+     for details about generic plans).  The statement can contain parameter
+     placeholders like $1 and must be a statement that can
+     use parameters.  This option cannot be used together with
+     ANALYZE, since a statement with unknown parameters
+     cannot be executed.
+
+
+
+
     BUFFERS

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index f86983c660..7b7ca3f90a 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -190,6 +190,8 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt,
 			es->wal = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "settings") == 0)
 			es->settings = defGetBoolean(opt);
+		else if (strcmp(opt->defname, "generic_plan") == 0)
+			es->generic = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "timing") == 0)
 		{
 			timing_set = true;
@@ -227,6 +229,13 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt,
 					 parser_errposition(pstate, opt->location)));
 	}

+	/* check that GENERIC_PLAN is not used with EXPLAIN ANALYZE */
+	if (es->generic && es->analyze)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("EXPLAIN ANALYZE cannot be used with GENERIC_PLAN")));
+
+	/* check that WAL is used with EXPLAIN ANALYZE */
 	if (es->wal && !es->analyze)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),

diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index 6688c2a865..c849765151 100644
--- a/src/backend/parser/analyze.c
+++ b/src/backend/parser/analyze.c
@@ -27,6 +27,7 @@
 #include "access/sysattr.h"
 #include "catalog/pg_proc.h"
 #include "catalog/pg_type.h"
+#include "commands/defrem.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
@@ -2894,6 +2895,18 @@ static Query *
 transformExplainStmt(ParseState *pstate, ExplainStmt *stmt)
 {
 	Query	   *result;
+	ListCell   *lc;
+	bool		generic_plan;
+
+	foreach(lc, stmt->options)
+	{
+		DefElem    *opt = (DefElem *) lfirst(lc);
+
+		if (strcmp(opt->defname, "generic_plan") == 0)
+
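[For readers following the thread, a minimal usage sketch of the option
the patch adds; the table is invented and plan output is omitted:]

    CREATE TABLE tab (col integer);

    -- Plan the statement generically, without supplying a value for $1:
    EXPLAIN (GENERIC_PLAN) SELECT * FROM tab WHERE col = $1;

    -- Rejected by the patch, since a statement with unknown parameters
    -- cannot actually be executed:
    EXPLAIN (ANALYZE, GENERIC_PLAN) SELECT * FROM tab WHERE col = $1;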
Re: Make EXPLAIN generate a generic plan for a parameterized query
Hi,

On Tue, Oct 25, 2022 at 11:08:27AM +0200, Laurenz Albe wrote:
>
> Here is a patch that implements it with an EXPLAIN option named
> GENERIC_PLAN.

I have only had a quick look at the patch for now.  Any reason why you
don't rely on the existing explain_filter() function for emitting stable
output (without having to remove the costs)?  It would also take care of
checking that it works in plpgsql.
Re: Question about "compound" queries.
Thanks a lot for the reply and the timely help!

On 25.10.2022 01:36, David G. Johnston wrote:
> I suspect they came about out of simplicity - being able to simply take
> a text file with a bunch of SQL commands in a script and send them
> as-is to the server without any client-side parsing and let the server
> just deal with it.  It works because the system needs to do those kinds
> of things anyway, so why not make it user-facing, even if most users
> would find its restrictions make it undesirable to use.
>
> David J.

All the best,

-- 
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: Allow file inclusion in pg_hba and pg_ident files
On Tue, Oct 25, 2022 at 03:08:59PM +0800, Julien Rouhaud wrote:
> On Tue, Oct 25, 2022 at 03:43:21PM +0900, Michael Paquier wrote:
>> Another advantage is that it minimizes the presence of the hardcoded
>> HbaFileName and IdentFileName in hba.c, which is one thing we are
>> trying to achieve here for the inclusion of more files.  I found it a
>> bit strange that IdentLine had no sourcefile, actually.  We track the
>> line number but use it nowhere, and it seems to me that having more
>> symmetry between both would be a good thing.
>
> If IdentLine->linenumber is useless, why not get rid of it rather than
> tracking another piece of useless information?  Hba is much more
> structured, so we need a more specialized struct, but I don't think
> ident will ever go that way.

Hmm.  I would be tempted to keep track of the file name and the line
number as well in IdentLine.  One reason is that this can become useful
for debugging.  A second is that this can reduce a bit the arguments of
fill_ident_line() and fill_hba_line() in hbafuncs.c once we track these
in HbaLine and IdentLine.  And HEAD is slightly overdoing it in its
interface for the line number, actually, as we pass the line number
twice: from {Ident,Hba}Line and the respective field from
TokenizedAuthLine.
--
Michael
Re: explain_regress, explain(MACHINE), and default to explain(BUFFERS) (was: BUFFERS enabled by default in EXPLAIN (ANALYZE))
On Thu, 2022-10-20 at 21:09 -0500, Justin Pryzby wrote:
> Rebased.

I had a look at the patch set.  It applies and builds cleanly and passes
the regression tests.

0001: Add GUC: explain_regress

I like the idea of the "explain_regress" GUC.  That should simplify the
regression tests.

--- a/src/test/regress/pg_regress.c
+++ b/src/test/regress/pg_regress.c
@@ -625,7 +625,7 @@ initialize_environment(void)
 	 * user's ability to set other variables through that.
 	 */
 	{
-		const char *my_pgoptions = "-c intervalstyle=postgres_verbose";
+		const char *my_pgoptions = "-c intervalstyle=postgres_verbose -c explain_regress=on";
 		const char *old_pgoptions = getenv("PGOPTIONS");
 		char	   *new_pgoptions;

@@ -2288,6 +2288,7 @@ regression_main(int argc, char *argv[],
 		fputs("log_lock_waits = on\n", pg_conf);
 		fputs("log_temp_files = 128kB\n", pg_conf);
 		fputs("max_prepared_transactions = 2\n", pg_conf);
+		// fputs("explain_regress = yes\n", pg_conf);

 		for (sl = temp_configs; sl != NULL; sl = sl->next)
 		{

This code comment is a leftover and should go.

0002: exercise explain_regress

This is the promised simplification.  Looks good.

0003: Make explain default to BUFFERS TRUE

Yes, please!  This patch is independent of the previous patches.  I'd
like this to be applied even if the GUC is rejected.  (I understand that
that would cause more changes in the regression test output if we
changed that without introducing "explain_regress".)

The patch changes the default value of "auto_explain.log_buffers" to
"on", which makes sense.  However, the documentation in
doc/src/sgml/auto-explain.sgml should be updated to reflect that.

--- a/doc/src/sgml/perform.sgml
+++ b/doc/src/sgml/perform.sgml
@@ -731,8 +731,8 @@ EXPLAIN ANALYZE SELECT * FROM polygon_tbl WHERE f1 @> polygon '(0.5,2.0)';

-EXPLAIN has a BUFFERS option that can be used with
-ANALYZE to get even more run time statistics:
+EXPLAIN ANALYZE has a BUFFERS option which
+provides even more run time statistics:

 EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM tenk1 WHERE unique1 < 100 AND unique2 > 9000;

This is not enough.  The patch would have to update all the examples
that use EXPLAIN ANALYZE.  I see two options:

1. Change the output of all the examples and move this explanation to
   the first example with EXPLAIN ANALYZE:

     The numbers in the Buffers: line help to identify which parts of
     the query are the most I/O-intensive.

2. Change all examples that currently do *not* use BUFFERS to
   EXPLAIN (BUFFERS OFF).  Then you could change the example that shows
   BUFFERS to something like

     If you do not suppress the BUFFERS option that can be used with
     EXPLAIN (ANALYZE), you get even more run time statistics:

0004, 0005, 0006, 0007: EXPLAIN (MACHINE)

I think it is confusing that these are included in this patch set.
EXPLAIN (MACHINE OFF) is similar to "explain_regress = on", only it goes
even further: no query ID, no Heap Fetches, no Sort details, ...
Why not add this functionality to the GUC?

0005 suppresses "rows removed by filter", but how is that machine
dependent?

> BTW, I think it may be that the GUC should be marked PGDLLIMPORT ?

I think it is project policy to apply this mark wherever it is needed.
Do you think that third-party extensions might need to use this in C
code?

Yours,
Laurenz Albe
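[To make option 2 above concrete, a sketch of how a documentation
example would change if 0003 made BUFFERS default to on; statements
only, plan output omitted:]

    -- today, buffer statistics must be requested explicitly:
    EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM tenk1 WHERE unique1 < 100 AND unique2 > 9000;

    -- with 0003 they appear by default, so examples that want the
    -- shorter output would instead be written as:
    EXPLAIN (ANALYZE, BUFFERS OFF) SELECT * FROM tenk1 WHERE unique1 < 100 AND unique2 > 9000;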
Re: [PATCH] Add `verify-system` sslmode to use system CA pool for server cert
On Tue, 25 Oct 2022 01:03:23 +0100, Jacob Champion said:
> I'd like to try to get this conversation started again.  To pique
> interest I've attached a new version of 0001, which implements
> `sslrootcert=system` instead as suggested upthread.  In 0002 I went
> further and switched the default sslmode to `verify-full` when using
> the system CA roots, because I feel pretty strongly that anyone
> interested in using public CA systems is also interested in verifying
> hostnames.  (Otherwise, why make the switch?)

Yeah, I agree that not forcing verify-full when using system CAs is a
giant foot-gun, and many will stop configuring as soon as it works.

Is there any argument for not checking the hostname when using a CA pool
for which literally anyone can create a cert that passes?

It makes sense for self-signed, or "don't care", since that provides at
least protection against passive attacks, but if someone went out of
their way to get a third-party-signed cert, then it doesn't.

One downside to this approach is that now one option will change the
value of another option.  For SSL mode (my rejected patch :-) ) that
makes maybe some more sense.

For users, what is more surprising: a foot-gun that sounds safe, or one
option that overrides another?

-- 
typedef struct me_s {
  char name[]      = { "Thomas Habets" };
  char email[]     = { "tho...@habets.se" };
  char kernel[]    = { "Linux" };
  char *pgpKey[]   = { "http://www.habets.pp.se/pubkey.txt" };
  char pgp[] = { "9907 8698 8A24 F52F 1C2E 87F6 39A4 9EEA 460A 0169" };
  char coolcmd[]   = { "echo '. ./_&. ./_'>_;. ./_" };
} me_t;
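[For context, a sketch of what the proposal means at the connection
level; host and user names are invented, and under 0002 the explicit
sslmode in the second command would become the default:]

    # trust the platform's CA list instead of ~/.postgresql/root.crt
    psql "host=db.example.com user=app sslrootcert=system"

    # the same, with hostname verification spelled out explicitly
    psql "host=db.example.com user=app sslrootcert=system sslmode=verify-full"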
Re: Fix gin index cost estimation
Hi, Ronan!

On Wed, Oct 12, 2022 at 10:15 AM Ronan Dunklau wrote:
> > > You're right, I was too eager to try to raise the CPU cost
> > > proportionally to the number of array scans (scalararrayop).  I'd
> > > really like to understand where this equation comes from though...
> >
> > So, what's the latest update here?
>
> Thanks Michael for reviving this thread.
>
> Before proceeding any further with this, I'd like to understand where
> we stand.  Tom argued my way of charging cost per entry pages visited
> boils down to charging per tuple, which I expressed disagreement with.
>
> If we can come to a consensus whether that's a bogus way of thinking
> about it I'll proceed with what we agree on.

I briefly read the thread.  I think this line is a copy-paste from other
index access methods, trying to estimate the whole index scan CPU cost
while bypassing all the details.

    *indexTotalCost += (numTuples * *indexSelectivity) *
                       (cpu_index_tuple_cost + qual_op_cost);

I think Tom's point was that it's wrong to add a separate entry-tree CPU
cost estimation to another estimation, which tries (very inadequately)
to estimate the whole scan cost.  Instead, I propose writing better
estimations for both the entry-tree CPU cost and the data-trees CPU cost
and replacing the existing CPU estimation altogether.

--
Regards,
Alexander Korotkov
Re: parse partition strategy string in gram.y
It will often happen that some hash keys are more frequently referenced
than others.  Consider a scenario where customer_id is the hash key, and
one customer, like IBM, is very large in terms of activity while other
keys have much less activity.  This asymmetry creates a noisy neighbor
problem.  Some partitions may have more than one noisy neighbor, and in
general it would be more flexible to be able to divide the work evenly
in terms of activity instead of evenly with respect to the encoding of
the keys.

On 10/24/22, 8:50 PM, "Tom Lane" wrote:

    Alvaro Herrera writes:
    > On 2022-Oct-24, Finnerty, Jim wrote:
    >> The advantage of hash partition bounds is that they are not
    >> domain-specific, as they are for ordinary RANGE partitions, but they
    >> are more flexible than MODULUS/REMAINDER partition bounds.

    I'm more than a bit skeptical of that claim.  Under what circumstances
    (other than a really awful hash function, perhaps) would it make sense
    to not use equi-sized hash partitions?

    			regards, tom lane
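[For reference, the modulus/remainder syntax under discussion divides
keys by hash value only, with no way to size a bucket by activity; a
minimal sketch with invented names:]

    CREATE TABLE events (customer_id int, payload text)
        PARTITION BY HASH (customer_id);
    CREATE TABLE events_p0 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 0);
    CREATE TABLE events_p1 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 1);
    CREATE TABLE events_p2 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 2);
    CREATE TABLE events_p3 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 3);
    -- A very active customer_id lands in exactly one of these
    -- partitions; this syntax offers no way to give it a narrower
    -- hash range.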
Re: Fix gin index cost estimation
Alexander Korotkov writes:
> I think Tom's point was that it's wrong to add a separate entry-tree CPU
> cost estimation to another estimation, which tries (very inadequately) to
> estimate the whole scan cost.  Instead, I propose writing better estimations
> for both entry-tree CPU cost and data-trees CPU cost and replacing existing
> CPU estimation altogether.

Great idea, if someone is willing to do it ...

			regards, tom lane
Re: [PATCH] Add `verify-system` sslmode to use system CA pool for server cert
On 2022-10-25 Tu 07:01, tho...@habets.se wrote:
> On Tue, 25 Oct 2022 01:03:23 +0100, Jacob Champion said:
>> I'd like to try to get this conversation started again.  To pique
>> interest I've attached a new version of 0001, which implements
>> `sslrootcert=system` instead as suggested upthread.  In 0002 I went
>> further and switched the default sslmode to `verify-full` when using
>> the system CA roots, because I feel pretty strongly that anyone
>> interested in using public CA systems is also interested in verifying
>> hostnames.  (Otherwise, why make the switch?)
>
> Yeah I agree that not forcing verify-full when using system CAs is a
> giant foot-gun, and many will stop configuring just until it works.
>
> Is there any argument for not checking hostname when using a CA pool
> for which literally anyone can create a cert that passes?
>
> It makes sense for self-signed, or "don't care", since that provides
> at least protection against passive attacks, but if someone went out
> of their way to get a third party signed cert, then it doesn't.
>
> One downside to this approach is that now one option will change the
> value of another option.  For SSL mode (my rejected patch :-) ) that
> makes maybe some more sense.
>
> For users, what is more surprising: A foot-gun that sounds safe, or
> one option that overrides another?

I don't find too much difficulty in having one option's default depend
on another's value, as long as it's documented.

cheers

andrew

-- 
Andrew Dunstan
EDB: https://www.enterprisedb.com
Re: Confused about TransactionIdSetTreeStatus
On 25/10/2022 12:02, Japin Li wrote:
> I'm a bit confused about TransactionIdSetTreeStatus.  The comment says
> that if subtransactions cross multiple CLOG pages, it will mark the
> subxids that are on the same page as the main transaction as
> sub-committed, and then set the main transaction and subtransactions
> to committed (step 2).
>
>  * Example:
>  *	TransactionId t commits and has subxids t1, t2, t3, t4
>  *	t is on page p1, t1 is also on p1, t2 and t3 are on p2, t4 is on p3
>  *	1. update pages2-3:
>  *		page2: set t2,t3 as sub-committed
>  *		page3: set t4 as sub-committed
>  *	2. update page1:
>  *		set t1 as sub-committed,
>  *		then set t as committed,
> 			then set t1 as committed
>  *	3. update pages2-3:
>  *		page2: set t2,t3 as committed
>  *		page3: set t4 as committed
>
> However, the code marks the main transaction and subtransactions
> directly as committed.

Hmm, yeah, step 2 in this example doesn't match reality.  We actually
set t and t1 directly as committed.  The explanation above that comment
is correct, but the example is not.  It used to work the way the example
says, but that was changed in commit
06da3c570f21394003fc392d80f54862f7dec19f.  Ironically, that commit also
added the outdated comment.

The correct example would be:

	TransactionId t commits and has subxids t1, t2, t3, t4
	t is on page p1, t1 is also on p1, t2 and t3 are on p2, t4 is on p3
	1. update pages2-3:
		page2: set t2,t3 as sub-committed
		page3: set t4 as sub-committed
	2. update page1:
		page1: set t,t1 as committed
	3. update pages2-3:
		page2: set t2,t3 as committed
		page3: set t4 as committed

- Heikki
Re: [PATCHES] Post-special page storage TDE support
> > Explicitly locking (assuming you stay in your lane) should only need
> > to guard against access from other backends of this type if using
> > shared buffers, so will be use-case dependent.
>
> I'm not sure what you mean here?

I'm mainly pointing out that the specific code that manages this feature
is the only code that has to worry about modifying said page region.

> > This does have a runtime overhead due to moving some offset
> > calculations from compile time to runtime.  It is thought that the
> > utility of this feature will outweigh the costs here.
>
> Have you done some benchmarking to give an idea of how much overhead
> we're talking about?

Not yet, but I am going to work on this.  I suspect the current code
could be improved, but I will try to get some sort of measurement of the
additional overhead.

> > Candidates for page features include 32-bit or 64-bit checksums,
> > encryption tags, or additional per-page metadata.
> >
> > While we are not currently getting rid of the pd_checksum field, this
> > mechanism could be used to free up those 16 bits for some other
> > purpose.
>
> IIUC there's a hard requirement of initdb-time initialization, as
> there's otherwise no guarantee that you will find enough free space in
> each page at runtime.  It seems like a very hard requirement for a full
> replacement of the current checksum approach (even if I agree that the
> current implementation limitations are far from ideal), especially
> since there's no technical reason that would prevent us from
> dynamically enabling data-checksums without doing all the work when the
> cluster is down.

As implemented, that is correct; we are currently assuming this specific
feature mechanism is set at initdb time only.  Checksums are not the
primary motivation here, but were something that I could use for an
immediate illustration of the feature.

That said, presumably you could define a way to set the features
per-relation (say with a template field in pg_class) which would
propagate to a relation on rewrite, so there could be ways to handle
things incrementally, were this an overall goal.

Thanks for looking,

David
Re: Confused about TransactionIdSetTreeStatus
On Tue, 25 Oct 2022 at 22:46, Heikki Linnakangas wrote:
> On 25/10/2022 12:02, Japin Li wrote:
>> However, the code marks the main transaction and subtransactions
>> directly as committed.
>
> Hmm, yeah, step 2 in this example doesn't match reality.  We actually
> set t and t1 directly as committed.  The explanation above that comment
> is correct, but the example is not.  It used to work the way the
> example says, but that was changed in commit
> 06da3c570f21394003fc392d80f54862f7dec19f.  Ironically, that commit also
> added the outdated comment.
>
> The correct example would be:
>
> 	TransactionId t commits and has subxids t1, t2, t3, t4
> 	t is on page p1, t1 is also on p1, t2 and t3 are on p2, t4 is on p3
> 	1. update pages2-3:
> 		page2: set t2,t3 as sub-committed
> 		page3: set t4 as sub-committed
> 	2. update page1:
> 		page1: set t,t1 as committed
> 	3. update pages2-3:
> 		page2: set t2,t3 as committed
> 		page3: set t4 as committed

Thanks for your explanation.  Attached is a patch to remove the outdated
comment.

-- 
Regards,
Japin Li.
ChengDu WenWu Information Technology Co., Ltd.

From c079d8f33a0eb65b8ee9fc1f53c6c358e7ea1516 Mon Sep 17 00:00:00 2001
From: Japin Li
Date: Tue, 25 Oct 2022 23:00:22 +0800
Subject: [PATCH v1 1/1] Remove outdated comment for TransactionIdSetTreeStatus

Commit 06da3c570f eliminates the marking of subtransactions as
SUBCOMMITTED in pg_clog during their commit; however, it introduces an
outdated comment.
---
 src/backend/access/transam/clog.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/src/backend/access/transam/clog.c b/src/backend/access/transam/clog.c
index a7dfcfb4da..77d9894dab 100644
--- a/src/backend/access/transam/clog.c
+++ b/src/backend/access/transam/clog.c
@@ -146,9 +146,7 @@ static void TransactionIdSetPageStatusInternal(TransactionId xid, int nsubxids,
  *		page2: set t2,t3 as sub-committed
  *		page3: set t4 as sub-committed
  *	2. update page1:
- *		set t1 as sub-committed,
- *		then set t as committed,
-			then set t1 as committed
+ *		page1: set t,t1 as committed
  *	3. update pages2-3:
  *		page2: set t2,t3 as committed
  *		page3: set t4 as committed
-- 
2.25.1
Re: [PATCH] Fix build with LLVM 15 or above
Hi,

On Tue, 2022-10-18 at 22:06 +1300, Thomas Munro wrote:
> Will do first thing tomorrow.

Just wanted to confirm that I pushed Fedora RPMs built against LLVM 15
by adding these patches.  Thanks Thomas.

Regards,

-- 
Devrim Gündüz
Open Source Solution Architect, Red Hat Certified Engineer
Twitter: @DevrimGunduz , @DevrimGunduzTR
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Hi Arne,

On Mon, 2022-10-24 at 15:27 +0000, Arne Roland wrote:
> Hello Reid,
>
> could you rebase the patch again?  It doesn't apply currently
> (http://cfbot.cputube.org/patch_40_3867.log).  Thanks!

Rebased patches attached.

> You mention that you want to prevent the compiler from getting cute.
> I don't think these comments are exactly helpful in the current state.
> I think it's probably fine to just omit them.

I attempted to follow previous convention when adding code, and these
comments have been consistently applied throughout backend_status.c
where a volatile pointer is being used.

> I don't understand the purpose of the result variable in
> exceeds_max_total_bkend_mem.  What purpose does it serve?
>
> I really like the simplicity of the suggestion here to prevent OOM.

If max_total_backend_memory is configured, exceeds_max_total_bkend_mem()
will return true if an allocation request would push total backend
memory allocated over the configured value.
exceeds_max_total_bkend_mem() is implemented in the various allocators
along the lines of:

    ...snip...
    /* Do not exceed maximum allowed memory allocation */
    if (exceeds_max_total_bkend_mem('new request size'))
        return NULL;
    ...snip...

That is, do not allocate the memory requested; return NULL instead.  PG
already had code in place to handle NULL returns from allocation
requests.  The allocation code in aset.c, slab.c, generation.c, and
dsm_impl.c utilizes exceeds_max_total_bkend_mem().

max_total_backend_memory (integer)

    Specifies a limit to the amount of memory (MB) that may be allocated
    to backends in total (i.e. this is not a per-user or per-backend
    limit).  If unset, or set to 0, it is disabled.  A backend request
    that would push the total over the limit will be denied with an out
    of memory error, causing that backend's current query/transaction to
    fail.  Due to the dynamic nature of memory allocations, this limit
    is not exact.  If within 1.5MB of the limit and two backends request
    1MB each at the same time, both may be allocated, and exceed the
    limit.  Further requests will not be allocated until dropping below
    the limit.  Keep this in mind when setting this value.  This limit
    does not affect auxiliary backend processes.  Backend memory
    allocations (backend_mem_allocated) are displayed in the
    pg_stat_activity view.

> I intend to play around with a lot of backends, once I get a rebased
> patch.
>
> Regards
> Arne

From ab654a48ec7bfbc3bc377c5757a04f1756e72e79 Mon Sep 17 00:00:00 2001
From: Reid Thompson
Date: Thu, 11 Aug 2022 12:01:25 -0400
Subject: [PATCH 1/2] Add tracking of backend memory allocated to
 pg_stat_activity

This new field displays the current bytes of memory allocated to the
backend process.  It is updated as memory for the process is
malloc'd/free'd.  Memory allocated to items on the freelist is included
in the displayed value.  Dynamic shared memory allocations are included
only in the value displayed for the backend that created them, they are
not included in the value for backends that are attached to them to
avoid double counting.  On occasion, orphaned memory segments may be
cleaned up on postmaster startup.  This may result in decreasing the sum
without a prior increment.  We limit the floor of backend_mem_allocated
to zero.  Updated pg_stat_activity documentation for the new column.
---
 doc/src/sgml/monitoring.sgml                |  12 +++
 src/backend/catalog/system_views.sql        |   1 +
 src/backend/storage/ipc/dsm_impl.c          |  80 +++
 src/backend/utils/activity/backend_status.c | 105 
 src/backend/utils/adt/pgstatfuncs.c         |   4 +-
 src/backend/utils/mmgr/aset.c               |  18 
 src/backend/utils/mmgr/generation.c         |  15 +++
 src/backend/utils/mmgr/slab.c               |  21 
 src/include/catalog/pg_proc.dat             |   6 +-
 src/include/utils/backend_status.h          |   7 +-
 src/test/regress/expected/rules.out         |   9 +-
 11 files changed, 269 insertions(+), 9 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index e5d622d514..4983bbc814 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -947,6 +947,18 @@ postgres  27093  0.0  0.0  30096  2752 ?  Ss  11:34  0:00 postgres: ser
+
+
+      backend_mem_allocated bigint
+
+
+      The byte count of memory allocated to this backend.  Dynamic shared memory
+      allocations are included only in the value displayed for the backend that
+      created them, they are not included in the value for backends that are
+      attached to them to avoid double counting.
+
+
 
      query text

diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_vie
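[A sketch of how the two patches would be used together from the SQL
level; the values are illustrative, and the GUC and column names are
taken from the uncommitted patches, so they exist only if these patches
are applied:]

    -- patch 2: cap total backend allocations at 2048 MB (0 disables);
    -- whether a reload or a restart is needed depends on the patch
    ALTER SYSTEM SET max_total_backend_memory = 2048;
    SELECT pg_reload_conf();

    -- patch 1: inspect per-backend allocations
    SELECT pid, backend_type, backend_mem_allocated
    FROM pg_stat_activity
    ORDER BY backend_mem_allocated DESC;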
Re: parse partition strategy string in gram.y
Or if you know the frequencies of the highly frequent values of the partitioning key at the time the partition bounds are defined, you could define hash ranges that contain approximately the same number of rows in each partition. A parallel sequential scan of all partitions would then perform better because data skew is minimized.
Re: Add tracking of backend memory allocated to pg_stat_activity
patch rebased to current master

-- 
Reid Thompson
Senior Software Engineer
Crunchy Data, Inc.

reid.thomp...@crunchydata.com
www.crunchydata.com
Re: Add tracking of backend memory allocated to pg_stat_activity
On Tue, 2022-10-25 at 14:51 -0400, Reid Thompson wrote:
> patch rebased to current master
>

actually attach the patch

-- 
Reid Thompson
Senior Software Engineer
Crunchy Data, Inc.

reid.thomp...@crunchydata.com
www.crunchydata.com

From ab654a48ec7bfbc3bc377c5757a04f1756e72e79 Mon Sep 17 00:00:00 2001
From: Reid Thompson
Date: Thu, 11 Aug 2022 12:01:25 -0400
Subject: [PATCH] Add tracking of backend memory allocated to
 pg_stat_activity

This new field displays the current bytes of memory allocated to the
backend process.  It is updated as memory for the process is
malloc'd/free'd.  Memory allocated to items on the freelist is included
in the displayed value.  Dynamic shared memory allocations are included
only in the value displayed for the backend that created them, they are
not included in the value for backends that are attached to them to
avoid double counting.  On occasion, orphaned memory segments may be
cleaned up on postmaster startup.  This may result in decreasing the sum
without a prior increment.  We limit the floor of backend_mem_allocated
to zero.  Updated pg_stat_activity documentation for the new column.
---
 doc/src/sgml/monitoring.sgml                |  12 +++
 src/backend/catalog/system_views.sql        |   1 +
 src/backend/storage/ipc/dsm_impl.c          |  80 +++
 src/backend/utils/activity/backend_status.c | 105 
 src/backend/utils/adt/pgstatfuncs.c         |   4 +-
 src/backend/utils/mmgr/aset.c               |  18 
 src/backend/utils/mmgr/generation.c         |  15 +++
 src/backend/utils/mmgr/slab.c               |  21 
 src/include/catalog/pg_proc.dat             |   6 +-
 src/include/utils/backend_status.h          |   7 +-
 src/test/regress/expected/rules.out         |   9 +-
 11 files changed, 269 insertions(+), 9 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index e5d622d514..4983bbc814 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -947,6 +947,18 @@ postgres  27093  0.0  0.0  30096  2752 ?  Ss  11:34  0:00 postgres: ser
+
+
+      backend_mem_allocated bigint
+
+
+      The byte count of memory allocated to this backend.  Dynamic shared memory
+      allocations are included only in the value displayed for the backend that
+      created them, they are not included in the value for backends that are
+      attached to them to avoid double counting.
+
+
 
      query text

diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 2d8104b090..cbf804625c 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -865,6 +865,7 @@ CREATE VIEW pg_stat_activity AS
             S.backend_xid,
             s.backend_xmin,
             S.query_id,
+            S.backend_mem_allocated,
             S.query,
             S.backend_type
     FROM pg_stat_get_activity(NULL) AS S

diff --git a/src/backend/storage/ipc/dsm_impl.c b/src/backend/storage/ipc/dsm_impl.c
index e1b90c5de4..3356bb65b5 100644
--- a/src/backend/storage/ipc/dsm_impl.c
+++ b/src/backend/storage/ipc/dsm_impl.c
@@ -66,6 +66,7 @@
 #include "postmaster/postmaster.h"
 #include "storage/dsm_impl.h"
 #include "storage/fd.h"
+#include "utils/backend_status.h"
 #include "utils/guc.h"
 #include "utils/memutils.h"
@@ -232,6 +233,13 @@ dsm_impl_posix(dsm_op op, dsm_handle handle, Size request_size,
 						name)));
 		return false;
 	}
+	/*
+	 * Detach and destroy pass through here, only decrease the memory
+	 * shown allocated in pg_stat_activity when the creator destroys the
+	 * allocation.
+	 */
+	if (op == DSM_OP_DESTROY)
+		pgstat_report_backend_mem_allocated_decrease(*mapped_size);
 	*mapped_address = NULL;
 	*mapped_size = 0;
 	if (op == DSM_OP_DESTROY && shm_unlink(name) != 0)
@@ -332,6 +340,36 @@ dsm_impl_posix(dsm_op op, dsm_handle handle, Size request_size,
 						name)));
 		return false;
 	}
+
+	/*
+	 * Attach and create pass through here, only update backend memory
+	 * allocated in pg_stat_activity for the creator process.
+	 */
+	if (op == DSM_OP_CREATE)
+	{
+		/*
+		 * Posix creation calls dsm_impl_posix_resize implying that resizing
+		 * occurs or may be added in the future.  As implemented
+		 * dsm_impl_posix_resize utilizes fallocate or truncate, passing the
+		 * whole new size as input, growing the allocation as needed (only
+		 * truncate supports shrinking).  We update by replacing the old
+		 * allocation with the new.
+		 */
+#if defined(HAVE_POSIX_FALLOCATE) && defined(__linux__)
+		/*
+		 * posix_fallocate does not shrink allocations, adjust only on
+		 * allocation increase.
+		 */
+		if (request_size > *mapped_size)
+		{
+			pgstat_report_backend_mem_allocated_decrease(*mapped_size);
+			pgstat_report_backend_mem_allocated_increase(request_size);
+		}
+#else
+		pgstat_report_backend_mem_allocated_decrease(*mapped_size);
+		pgstat_report_backend
Re: Confused about TransactionIdSetTreeStatus
On 25/10/2022 18:09, Japin Li wrote:
> On Tue, 25 Oct 2022 at 22:46, Heikki Linnakangas wrote:
>> On 25/10/2022 12:02, Japin Li wrote:
>>> However, the code marks the main transaction and subtransactions
>>> directly as committed.
>>
>> Hmm, yeah, step 2 in this example doesn't match reality.  We actually
>> set t and t1 directly as committed.  The explanation above that
>> comment is correct, but the example is not.  It used to work the way
>> the example says, but that was changed in commit
>> 06da3c570f21394003fc392d80f54862f7dec19f.  Ironically, that commit
>> also added the outdated comment.
>>
>> The correct example would be:
>>
>> 	TransactionId t commits and has subxids t1, t2, t3, t4
>> 	t is on page p1, t1 is also on p1, t2 and t3 are on p2, t4 is on p3
>> 	1. update pages2-3:
>> 		page2: set t2,t3 as sub-committed
>> 		page3: set t4 as sub-committed
>> 	2. update page1:
>> 		page1: set t,t1 as committed
>> 	3. update pages2-3:
>> 		page2: set t2,t3 as committed
>> 		page3: set t4 as committed
>
> Thanks for your explanation.  Attached is a patch to remove the
> outdated comment.

Applied, thanks!

- Heikki
Re: GUC values - recommended way to declare the C variables?
+#ifdef USE_ASSERT_CHECKING
+	sanity_check_GUC_C_var(hentry->gucvar);
+#endif

=> You can conditionally define that as an empty function so #ifdefs
aren't needed in the caller:

void
sanity_check_GUC_C_var()
{
#ifdef USE_ASSERT_CHECKING
...
#endif
}

+	/* Skip checking for dynamic (compiler-dependent) GUCs. */

=> This should say that the GUC's default is determined at compile-time.

But actually, I don't think you should use my patch.  You needed to
exclude update_process_title:

src/backend/utils/misc/ps_status.c:bool	update_process_title = true;
...
src/backend/utils/misc/guc_tables.c-#ifdef WIN32
src/backend/utils/misc/guc_tables.c-		false,
src/backend/utils/misc/guc_tables.c-#else
src/backend/utils/misc/guc_tables.c-		true,
src/backend/utils/misc/guc_tables.c-#endif
src/backend/utils/misc/guc_tables.c-		NULL, NULL, NULL

My patch would also exclude the 16 other GUCs with compile-time defaults
from your check.  It'd be better not to exclude them; I think the right
solution is to change the C variable initialization to a compile-time
constant:

#ifdef WIN32
bool		update_process_title = false;
#else
bool		update_process_title = true;
#endif

Or something more indirect like:

#ifdef WIN32
#define DEFAULT_PROCESS_TITLE false
#else
#define DEFAULT_PROCESS_TITLE true
#endif

bool		update_process_title = DEFAULT_PROCESS_TITLE;

I suspect there are not many GUCs that would need to change - this might
be the only one.  If this GUC were defined in the inverse
(bool skip_process_title), it wouldn't need special help, either.

-- 
Justin
Re: [PATCH] Add `verify-system` sslmode to use system CA pool for server cert
On Tue, Oct 25, 2022 at 4:01 AM wrote:
> Yeah I agree that not forcing verify-full when using system CAs is a
> giant foot-gun, and many will stop configuring just until it works.
>
> Is there any argument for not checking hostname when using a CA pool
> for which literally anyone can create a cert that passes?

I don't think so.  For verify-ca to make any sense, the system CA pool
would need to be very strictly curated, and IMO we already have that use
case covered today.

If there are no valuable use cases for weaker checks, then we could go
even further than my 0002 and just reject any weaker sslmodes outright.
That'd be nice.

--Jacob
Re: [PATCH] Add `verify-system` sslmode to use system CA pool for server cert
On Tue, Oct 25, 2022 at 7:26 AM Andrew Dunstan wrote:
> I don't find too much difficulty in having one option's default depend
> on another's value, as long as it's documented.

My patch is definitely missing the documentation for that part right
now; I wanted to get feedback on the approach before wordsmithing too
much.

Thanks,
--Jacob
Reducing duplicativeness of EquivalenceClass-derived clauses
While fooling with my longstanding outer-join variables changes (I am
making progress on that, honest), I happened to notice that equivclass.c
is leaving some money on the table by generating redundant RestrictInfo
clauses.  It already attempts to not generate the same clause twice,
which can save a nontrivial amount of work because we cache selectivity
estimates and so on per-RestrictInfo.  I realized though that it will
build and return a clause like "a.x = b.y" even if it already has
"b.y = a.x".  This is just wasteful.

It's always been the case that equivclass.c will produce clauses that
are ordered according to its own whims.  Consumers that need the
operands in a specific order, such as index scans or hash joins, are
required to commute the clause to be the way they want it while building
the finished plan.  Therefore, it shouldn't matter which order of the
operands we return, and giving back the commutator clause if available
could potentially save as much as half of the selectivity-estimation
work we do with these clauses subsequently.

Hence, PFA a patch that adjusts create_join_clause() to notice commuted
as well as exact matches among the EquivalenceClass's existing clauses.
This results in a number of changes visible in regression test cases,
but they're all clearly inconsequential.

The only thing that I think might be controversial here is that I
dropped the check for matching operator OID.  To preserve that, we'd
have needed to use get_commutator() in the reverse-match cases, which it
seemed to me would be a completely unjustified expenditure of cycles.
The operators we select for freshly-generated clauses will certainly
always match those of previously-generated clauses.  Maybe there's a
chance that they'd not match those of ec_sources clauses (that is, the
user-written clauses we started from), but if they don't and that
actually makes any difference then surely we are talking about a buggy
opclass definition.

I've not bothered to make any performance tests to see if there's
actually an easily measurable gain here.  Saving some duplicative
selectivity estimates could be down in the noise ... but it's surely
worth the tiny number of extra tests added here.

Comments?

			regards, tom lane

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index b3c8ce0131..558e94b845 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -526,7 +526,7 @@ EXPLAIN (VERBOSE, COSTS OFF)
          ->  Foreign Scan
                Output: t3.c1
                Relations: (public.ft1 t2) INNER JOIN (public.ft2 t3)
-               Remote SQL: SELECT r3."C 1" FROM ("S 1"."T 1" r2 INNER JOIN "S 1"."T 1" r3 ON (((r2."C 1" = r3."C 1")))) ORDER BY r2."C 1" ASC NULLS LAST
+               Remote SQL: SELECT r3."C 1" FROM ("S 1"."T 1" r2 INNER JOIN "S 1"."T 1" r3 ON (((r3."C 1" = r2."C 1")))) ORDER BY r2."C 1" ASC NULLS LAST
    ->  Index Only Scan using t1_pkey on "S 1"."T 1" t1
          Output: t1."C 1"
 (12 rows)

@@ -740,7 +740,7 @@ EXPLAIN (VERBOSE, COSTS OFF)
          Index Cond: (a."C 1" = 47)
    ->  Foreign Scan on public.ft2 b
          Output: b.c1, b.c2, b.c3, b.c4, b.c5, b.c6, b.c7, b.c8
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (($1::integer = "C 1"))
+         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = $1::integer))
 (8 rows)

 SELECT * FROM ft2 a, ft2 b WHERE a.c1 = 47 AND b.c1 = a.c2;
@@ -763,7 +763,7 @@
          Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c2 = 6))
    ->  Foreign Scan on public.ft2 b
          Output: b.c1, b.c2, b.c3, b.c4, b.c5, b.c6, b.c7, b.c8
-         Filter: (upper((a.c7)::text) = (b.c7)::text)
+         Filter: ((b.c7)::text = upper((a.c7)::text))
          Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (($1::integer = "C 1"))
 (10 rows)

@@ -1220,7 +1220,7 @@ SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t
  Foreign Scan
    Output: t1.c1, t2.c1, t1.c3
    Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
-   Remote SQL: SELECT r1."C 1", r2."C 1", r1.c3 FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (((r1."C 1" = r2."C 1")))) ORDER BY r1.c3 ASC NULLS LAST, r1."C 1" ASC NULLS LAST LIMIT 10::bigint OFFSET 100::bigint
+   Remote SQL: SELECT r1."C 1", r2."C 1", r1.c3 FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (((r2."C 1" = r1."C 1")))) ORDER BY r1.c3 ASC NULLS LAST, r1."C 1" ASC NULLS LAST LIMIT 10::bigint OFFSET 100::bigint
 (4 rows)

 SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
@@ -1246,7 +1246,7 @@ SELECT t1.c1, t2.c2, t3.c3 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) JOIN ft4 t
  Foreign Scan
    Output: t1.c1, t2.c2, t3.c3, t1.c3
    Relation
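[To make the redundancy concrete, a small sketch with invented tables;
the WHERE clause puts a.x, b.y and c.z into one equivalence class, and
while planning the join orders, create_join_clause() may be asked for a
clause joining a and b as "a.x = b.y" and elsewhere as "b.y = a.x" --
before the patch, each spelling got its own RestrictInfo and its own
cached selectivity estimate:]

    CREATE TABLE a (x int);
    CREATE TABLE b (y int);
    CREATE TABLE c (z int);

    EXPLAIN SELECT * FROM a, b, c WHERE a.x = b.y AND b.y = c.z;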
Re: [PATCH] Improve tab completion for ALTER TABLE on identity columns
> Hi Hackers,
>
> I noticed that psql has no tab completion around identity columns in
> ALTER TABLE, so here's some patches for that.
>
> In passing, I also added completion for ALTER SEQUENCE … START, which
> was missing for some reason.
>
> - ilmari

Hi ilmari,

I've tested all 4 of your patches, and all of them seem to work as
expected.  This is my first time reviewing a patch, so let's see if more
experienced hackers have anything more to say about these patches, but
at first glance they seem correct to me.

-- 
Matheus Alcantara
Re: [PATCH] CF app: add "Returned: Needs more interest"
On Mon, Aug 8, 2022 at 8:45 AM Andres Freund wrote:
> On 2022-08-08 08:37:41 -0700, Jacob Champion wrote:
> > Agreed.  This probably bleeds over into the other documentation thread
> > a bit -- how do we want to communicate the subtle points to people in
> > a CF?
>
> We should write a docs patch for it and then reference it from a bunch
> of places.  I started down that road a few years back [1] but
> unfortunately lost steam.

As we approach a new CF, I'm reminded of this patch again.  Are there
any concerns preventing a consensus here that I can help with?  I can
draft the docs patch that Andres has suggested, if that's seen as a
prerequisite.

Thanks,
--Jacob
Re: parse partition strategy string in gram.y
On 2022-Oct-25, Finnerty, Jim wrote:

> Or if you know the frequencies of the highly frequent values of the
> partitioning key at the time the partition bounds are defined, you
> could define hash ranges that contain approximately the same number of
> rows in each partition.  A parallel sequential scan of all partitions
> would then perform better because data skew is minimized.

This sounds very much like list partitioning to me.

-- 
Álvaro Herrera         Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"The problem with the future is that it keeps turning into the present"
(Hobbes)
Re: parse partition strategy string in gram.y
On 2022-Oct-26, Alvaro Herrera wrote:
> On 2022-Oct-25, Finnerty, Jim wrote:
>
> > Or if you know the frequencies of the highly frequent values of the
> > partitioning key at the time the partition bounds are defined, you
> > could define hash ranges that contain approximately the same number
> > of rows in each partition.  A parallel sequential scan of all
> > partitions would then perform better because data skew is minimized.
>
> This sounds very much like list partitioning to me.

... or maybe you mean "if the value is X then use this specific
partition, otherwise use hash partitioning".  It's a bit like
multi-level partitioning, but not really.

(You could test this idea by using two levels: list partitioning on top
with a default partition which is in turn partitioned by hash; but this
is unlikely to work well at large scale in practice.  Or does it?)

-- 
Álvaro Herrera 48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Saddened, Wutra (song of Las Barreras) sets Freyr rolling, and us into the sea"
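[A sketch of the two-level experiment described above; the names and the
hot key value are invented:]

    CREATE TABLE t (customer_id int, payload text)
        PARTITION BY LIST (customer_id);

    -- the known-noisy key gets a partition of its own
    CREATE TABLE t_hot PARTITION OF t FOR VALUES IN (42);

    -- everything else falls into a default partition that is itself
    -- partitioned by hash
    CREATE TABLE t_rest PARTITION OF t DEFAULT
        PARTITION BY HASH (customer_id);
    CREATE TABLE t_rest_p0 PARTITION OF t_rest FOR VALUES WITH (MODULUS 2, REMAINDER 0);
    CREATE TABLE t_rest_p1 PARTITION OF t_rest FOR VALUES WITH (MODULUS 2, REMAINDER 1);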
Re: [PATCH] Fix build with LLVM 15 or above
On Wed, Oct 26, 2022 at 4:28 AM Devrim Gündüz wrote:
> On Tue, 2022-10-18 at 22:06 +1300, Thomas Munro wrote:
> > Will do first thing tomorrow.
>
> Just wanted to confirm that I pushed Fedora RPMs built against LLVM 15
> by adding these patches.
>
> Thanks Thomas.

Cool.  FTR I still have to finish the 'real' fixes for LLVM 16.  Their
cadence is one major release every 6 months, putting it at about April
'23, but I'll try to get it ready quite soon on our master branch.  BF
animal seawasp is green again for now, but I expect it will turn back to
red pretty soon when they start ripping out the deprecated stuff on
their master branch...
Re: Issue in GIN fast-insert: XLogBeginInsert + Read/LockBuffer ordering
On Tue, Oct 25, 2022 at 09:37:08AM +0200, Alvaro Herrera wrote:
> Okay, so if we follow this argument, then the logical conclusion is
> that this *should* be backpatched, after all.

After sleeping on it and looking at all the stable branches involved,
backpatched down to v10.
--
Michael
Re: Allow WindowFuncs prosupport function to use more optimal WindowClause options
On Sun, 23 Oct 2022 at 03:03, Vik Fearing wrote: > Shouldn't it be able to detect that these two windows are the same and > only do one WindowAgg pass? > > > explain (verbose, costs off) > select row_number() over w1, > lag(amname) over w2 > from pg_am > window w1 as (order by amname), > w2 as (w1 rows unbounded preceding) > ; Good thinking. I think the patch should also optimise that case. It requires re-doing a similar de-duplication phase the same as what's done in transformWindowFuncCall(). I've added code to do that in the attached version. This got me wondering if the support function, instead of returning some more optimal versions of the frameOptions, I wondered if it should just return which aspects of the WindowClause it does not care about. For example, SELECT row_number() over (), lag(relname) over (order by relname) from pg_class; could, in theory, have row_number() reuse the WindowAgg for lag. Here because the WindowClause for row_number() has an empty ORDER BY clause, I believe it could just reuse the lag's WindowClause. It wouldn't be able to do that if row_number() had an ORDER BY, or if row_number() were some other WindowFunc that cared about peer rows. I'm currently thinking this might not be worth the trouble as it seems a bit unlikely that someone would use row_number() and not care about the ORDER BY. However, maybe the row_number() could reuse some other WindowClause with a more strict ordering. My current thoughts are that this feels a bit too unlikely to apply in enough cases for it to be worthwhile. I just thought I'd mention it for the sake of the archives. David Thanks for taking it for a spin. David diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c index 78a8174534..43468081f3 100644 --- a/src/backend/optimizer/plan/planner.c +++ b/src/backend/optimizer/plan/planner.c @@ -38,6 +38,7 @@ #include "miscadmin.h" #include "nodes/makefuncs.h" #include "nodes/nodeFuncs.h" +#include "nodes/supportnodes.h" #ifdef OPTIMIZER_DEBUG #include "nodes/print.h" #endif @@ -207,6 +208,8 @@ static PathTarget *make_partial_grouping_target(PlannerInfo *root, PathTarget *grouping_target, Node *havingQual); static List *postprocess_setop_tlist(List *new_tlist, List *orig_tlist); +static void optimize_window_clauses(PlannerInfo *root, + WindowFuncLists *wflists); static List *select_active_windows(PlannerInfo *root, WindowFuncLists *wflists); static PathTarget *make_window_input_target(PlannerInfo *root, PathTarget *final_target, @@ -1422,7 +1425,16 @@ grouping_planner(PlannerInfo *root, double tuple_fraction) wflists = find_window_functions((Node *) root->processed_tlist, list_length(parse->windowClause)); if (wflists->numWindowFuncs > 0) + { + /* +* See if any modifications can be made to each WindowClause +* to allow the executor to execute the WindowFuncs more +* quickly. +*/ + optimize_window_clauses(root, wflists); + activeWindows = select_active_windows(root, wflists); + } else parse->hasWindowFuncs = false; } @@ -5391,6 +5403,150 @@ postprocess_setop_tlist(List *new_tlist, List *orig_tlist) return new_tlist; } +/* + * optimize_window_clauses + * Call each WindowFunc's prosupport function to see if we're able to + * make any adjustments to any of the WindowClause's so that the executor + * can execute the window functions in a more optimal way. + * + * Currently we only allow adjustments to the WindowClause's frameOptions. We + * may allow more things to be done here in the future. 
+ */ +static void +optimize_window_clauses(PlannerInfo *root, WindowFuncLists *wflists) +{ + List *windowClause = root->parse->windowClause; + ListCell *lc; + + foreach(lc, windowClause) + { + WindowClause *wc = lfirst_node(WindowClause, lc); + ListCell *lc2; + int optimizedFrameOptions = 0; + + Assert(wc->winref <= wflists->maxWinRef); + + /* skip any WindowClauses that have no WindowFuncs */ + if (wflists->windowFuncs[wc->winref] == NIL)
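For the archives, a sketch of the stricter-ordering reuse mentioned above. The optimization is hypothetical and not part of the attached patch; the query below only illustrates the shape of the case, using ordinary pg_class columns:

-- row_number()'s ORDER BY (relnamespace) is a prefix of lag's ORDER BY
-- (relnamespace, relname), and row_number() does not care about peer
-- rows, so both could in theory share the stricter WindowClause and be
-- evaluated in a single WindowAgg pass.
SELECT row_number() OVER (ORDER BY relnamespace),
       lag(relname) OVER (ORDER BY relnamespace, relname)
FROM pg_class;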
Re: Allow file inclusion in pg_hba and pg_ident files
On Tue, Oct 25, 2022 at 08:59:57PM +0900, Michael Paquier wrote: > > Hmm. I would be tempted to keep track of the file name and the line > number as well in IdentLine. One reason is that this can become > useful for debugging. A second is that this can reduce a bit the > arguments of fill_ident_line() and fill_hba_line() in hbafuncs.c once > we track these in HbaLine and IdentLine. Ok, I guess something like the attached v14 is what you want. > And HEAD is slightly > overdoing it in its interface for the line number, actually, as we > pass the line number twice: from {Ident,Hba}Line and the respective > field from TokenizedAuthLine. That wouldn't be overdoing anymore if we remove the line number / filename from the fill_*_line prototypes, right? From ecc27b9101acf40f4888da8be033c70e6f21358a Mon Sep 17 00:00:00 2001 From: Julien Rouhaud Date: Tue, 25 Oct 2022 15:17:27 +0900 Subject: [PATCH v14 1/5] Refactor knowledge of origin file in hba.c This limits the footprint of HbaFileName and IdentFileName to their entry loading point, easing the introduction of the inclusion logic. Discussion: https://postgr.es/m/20220223045959.35ipdsvbxcstrhya%40jrouhaud --- src/backend/libpq/hba.c | 114 +--- src/include/libpq/hba.h | 3 ++ 2 files changed, 63 insertions(+), 54 deletions(-) diff --git a/src/backend/libpq/hba.c b/src/backend/libpq/hba.c index ea92f02a47..56bbe31dfd 100644 --- a/src/backend/libpq/hba.c +++ b/src/backend/libpq/hba.c @@ -641,6 +641,7 @@ tokenize_auth_file(const char *filename, FILE *file, List **tok_lines, tok_line = (TokenizedAuthLine *) palloc(sizeof(TokenizedAuthLine)); tok_line->fields = current_line; + tok_line->file_name = pstrdup(filename); tok_line->line_num = line_number; tok_line->raw_line = pstrdup(buf.data); tok_line->err_msg = err_msg; @@ -984,7 +985,7 @@ do { \ errmsg("authentication option \"%s\" is only valid for authentication methods %s", \ optname, _(validmethods)), \ errcontext("line %d of configuration file \"%s\"", \ - line_num, HbaFileName))); \ + line_num, file_name))); \ *err_msg = psprintf("authentication option \"%s\" is only valid for authentication methods %s", \ optname, validmethods); \ return false; \ @@ -1004,7 +1005,7 @@ do { \ errmsg("authentication method \"%s\" requires argument \"%s\" to be set", \ authname, argname), \ errcontext("line %d of configuration file \"%s\"", \ - line_num, HbaFileName))); \ + line_num, file_name))); \ *err_msg = psprintf("authentication method \"%s\" requires argument \"%s\" to be set", \ authname, argname); \ return NULL; \ @@ -1027,7 +1028,7 @@ do { \ (errcode(ERRCODE_CONFIG_FILE_ERROR), \ errmsg("missing entry at end of line"), \ errcontext("line %d of configuration file \"%s\"", \ - line_num, IdentFileName))); \ + line_num, file_name))); \ *err_msg = pstrdup("missing entry at end of line"); \ return NULL; \ } \ @@ -1040,7 +1041,7 @@ do { \ (errcode(ERRCODE_CONFIG_FILE_ERROR), \ errmsg("multiple values in ident field"), \ errcontext("line %d of configuration file \"%s\"", \ - line_num, IdentFileName))); \ + line_num, file_name))); \ *err_msg = pstrdup("multiple values in ident field"); \ return NULL; \ } \ @@ -1063,6 +1064,7 @@ HbaLine * parse_hba_line(TokenizedAuthLine *tok_line, int elevel) { int line_num = tok_line->line_num; + char *file_name = tok_line->file_name; char **err_msg = &tok_line->err_msg; char *str; struct addrinfo *gai_result; @@ -1077,6 +1079,7 @@ parse_hba_line(TokenizedAuthLine *tok_line, int elevel) HbaLine*parsedline; parsedline = palloc0(sizeof(HbaLine)); + parsedline->sourcefile = 
pstrdup(file_name); parsedline->linenumber = line_num; parsedline->rawline = pstrdup(tok_line->raw_line); @@
Re: Simplify standby state machine a bit in WaitForWALToBecomeAvailable()
On Tue, Oct 18, 2022 at 1:03 PM Amul Sul wrote: > > On Tue, Oct 18, 2022 at 12:01 PM Bharath Rupireddy > wrote: > > > > Hi, > > > > In standby mode, the state machine in WaitForWALToBecomeAvailable() > > reads WAL from pg_wal after failing to read from the archive. This is > > currently implemented in XLogFileReadAnyTLI() by calling > > XLogFileRead() with source XLOG_FROM_PG_WAL after it fails with source > > XLOG_FROM_ARCHIVE, and the current source isn't changed at all. > > Also, passing the source to XLogFileReadAnyTLI() in > > WaitForWALToBecomeAvailable() isn't straightforward, i.e. it's not necessary > > to pass in XLOG_FROM_ANY at all. These things make the state machine a > > bit complicated and hard to understand. > > > > The attached patch attempts to simplify the code a bit by changing the > > current source to XLOG_FROM_PG_WAL after failing in > > XLOG_FROM_ARCHIVE so that the state machine can move smoothly to > > read from pg_wal. And we can just pass the current source to > > XLogFileReadAnyTLI(). It also enables us to reduce a bit of extra > > XLogFileRead() code in XLogFileReadAnyTLI(). > > > > Thoughts? > > +1 Thanks. Let's see what others think about it. I will add a CF entry in a while. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
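To make the proposed transition concrete, here is a minimal, self-contained sketch of the control flow being described. The enum values mirror the ones in the backend, but the reader routines are stand-ins, so treat this as an illustration of the state change rather than the patch itself:

#include <stdbool.h>
#include <stdio.h>

typedef enum
{
	XLOG_FROM_ANY = 0,			/* request to read WAL from any source */
	XLOG_FROM_ARCHIVE,			/* restored using restore_command */
	XLOG_FROM_PG_WAL,			/* existing file in pg_wal */
	XLOG_FROM_STREAM			/* streamed from primary */
} XLogSource;

/* stand-ins for the real file-reading routines */
static bool read_from_archive(void) { return false; }
static bool read_from_pg_wal(void)  { return true; }

/*
 * With the proposed change, failing to restore from the archive flips
 * currentSource to XLOG_FROM_PG_WAL explicitly, so XLogFileReadAnyTLI()
 * only ever sees a concrete source instead of XLOG_FROM_ANY.
 */
static bool
try_read_wal(XLogSource *currentSource)
{
	if (*currentSource == XLOG_FROM_ANY ||
		*currentSource == XLOG_FROM_ARCHIVE)
	{
		if (read_from_archive())
			return true;
		*currentSource = XLOG_FROM_PG_WAL;	/* explicit state transition */
	}
	if (*currentSource == XLOG_FROM_PG_WAL)
		return read_from_pg_wal();
	return false;
}

int
main(void)
{
	XLogSource	src = XLOG_FROM_ARCHIVE;

	printf("read ok: %d, final source: %d\n", try_read_wal(&src), (int) src);
	return 0;
}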
Re: Allow file inclusion in pg_hba and pg_ident files
On Wed, Oct 26, 2022 at 11:19:48AM +0800, Julien Rouhaud wrote: > That wouldn't be overdoing anymore if we remove the line number / filename > from > the fill_*_line prototypes, right? Yeah, but there is a twist: HbaLine or IdentLine can be passed as NULL when entering fill_hba_line() or fill_ident_line() in the event of an error when tokenizing the line, or when the creation of their Line fails, so attempting to read the line number from either of them when filling in a tuple for their respective view would cause a crash. So, for now, I have taken the minimalistic approach with the addition of the source file name in HbaLine and in TokenizedAuthLine. The former is actually necessary anyway as auth.c wants it in two locations (improved auth_failed() and set_authn_id()). It is still true that the line number in IdentLine remains unused. Hence, do you think that the addition of the source file name and its line number could be useful for the error reporting done in check_ident_usermap()? The contents of the HbaLines are used in auth.c for its own reporting magic, and it looks like this would be the only location where what's in the IdentLines is useful, i.e. to provide more details about what is happening. Once users are able to include ident files, that would likely help in debugging issues. -- Michael
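For the archives, a tiny self-contained model of the crash being avoided. The struct shapes are heavily simplified stand-ins for the real HbaLine and TokenizedAuthLine, so this only illustrates why the view-filling code must fall back to the tokenized line:

#include <stddef.h>
#include <stdio.h>

/* simplified stand-ins for the parsed and tokenized line structs */
typedef struct { int linenumber; } HbaLine;
typedef struct { int line_num; const char *err_msg; } TokenizedAuthLine;

/*
 * On a tokenizer or parser error the HbaLine is NULL while the
 * TokenizedAuthLine still exists, so the line number reported by the
 * view has to come from the latter.
 */
static int
view_line_number(const HbaLine *hba, const TokenizedAuthLine *tok)
{
	return (hba != NULL) ? hba->linenumber : tok->line_num;
}

int
main(void)
{
	TokenizedAuthLine tok = {42, "mangled entry"};

	printf("line reported on error: %d\n", view_line_number(NULL, &tok));
	return 0;
}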
Re: proposal: possibility to read dumped table's name from file
Hi On Tue, Oct 18, 2022 at 11:33, Julien Rouhaud wrote: > On Thu, Oct 13, 2022 at 11:46:34AM -0700, Andres Freund wrote: > > Hi, > > > > On 2022-10-07 07:26:08 +0200, Pavel Stehule wrote: > > > I am sending a version with a hand-written parser and meson support > > > > Given this is a new approach it seems inaccurate to have the CF entry > marked > > ready-for-committer. I've updated it to needs-review. > > I just had a quick look at the rest of the patch. > > For the parser, it seems that filter_get_pattern is reimplementing an > identifier parsing function but isn't entirely correct. It can correctly > parse > quoted non-qualified identifiers and non-quoted qualified identifiers, but > not > quoted and qualified ones. For instance: > > $ echo 'include table nsp.tbl' | pg_dump --filter - >/dev/null > $ echo $? > 0 > > $ echo 'include table "TBL"' | pg_dump --filter - >/dev/null > $ echo $? > 0 > > $ echo 'include table "NSP"."TBL"' | pg_dump --filter - >/dev/null > pg_dump: error: invalid format of filter on line 1: unexpected extra data > after pattern > fixed > > This should also be covered in the regression tests. > done > > I'm wondering if psql's parse_identifier() could be exported and reused > here > rather than creating yet another version. > I looked there, and I don't think this parser is usable for this purpose. It is very sensitive to white space, and doesn't support multi-line input. It is designed to support readline tab completion; it is designed for simplicity, not for correctness. > > Nitpicking: the comments need some improvements: > > + /* > + * Simple routines - just don't repeat same code > + * > + * Returns true, when filter's file is opened > + */ > + bool > + filter_init(FilterStateData *fstate, const char *filename) > done > > also, is there any reason why this function doesn't call exit_nicely in > case of > error rather than letting each caller do it without any other cleanup? > It is commented a few lines up /* * Following routines are called from pg_dump, pg_dumpall and pg_restore. * Unfortunately, the implementation of exit_nicely in pg_dump and pg_restore * is different from the implementation of this routine in pg_dumpall. So instead * of calling exit_nicely directly we have to return some error flag (in this * case NULL), and exit_nicely will be executed from the caller's routine. */ > > + /* > + * Release allocated sources for filter > + */ > + void > + filter_free_sources(FilterStateData *fstate) > > I'm assuming "resources" not "sources"? > changed > > + /* > + * log_format_error - Emit error message > + * > + * This is mostly a convenience routine to avoid duplicating file > closing code > + * in multiple callsites. > + */ > + void > + log_invalid_filter_format(FilterStateData *fstate, char *message) > > mismatch between comment and function name (same for filter_read_item) > fixed > > + static const char * > + filter_object_type_name(FilterObjectType fot) > > No description. > > fixed > /* > * Helper routine to reduce duplicated code > */ > void > log_unsupported_filter_object_type(FilterStateData *fstate, > > const char *appname, > > FilterObjectType fot) > > Need more helpful comment. 
> fixed Thank you for the comments, attached is an updated patch. Regards Pavel diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml index 8b9d9f4cad..955bfcfdad 100644 --- a/doc/src/sgml/ref/pg_dump.sgml +++ b/doc/src/sgml/ref/pg_dump.sgml @@ -779,6 +779,80 @@ PostgreSQL documentation + + --filter=filename + + +Specify a filename from which to read patterns for objects to include +or exclude from the dump. The patterns are interpreted according to the +same rules as the corresponding options: +-t/--table for tables, +-n/--schema for schemas, +--include-foreign-data for data on foreign servers and +--exclude-table-data for table data. +To read from STDIN, use - as the +filename. The --filter option can be specified in +conjunction with the above listed options for including or excluding +objects, and can also be specified more than once for multiple filter +files. + + + +The file lists one object pattern per line, with the following format: +{ include | exclude } { table | schema | foreign_data | data } PATTERN + + + +The first keyword specifies whether the objects matched by the pattern +are to be included or excluded. The second keyword specifies the type +of object to be filtered using the pattern: + + + + table: tables, works like + -t/--table + + + + + schema: schemas, works like + -n/--schema + + + + + foreign_data: data on foreign servers, works like + --include-foreign-data. This
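To make the documented format concrete, a small hypothetical filter file could look like the following; every object name here is made up, and the third line exercises the quoted-and-qualified case fixed above:

include schema public
include table nsp.tbl
include table "NSP"."TBL"
exclude data measurement_raw

Assuming it is saved as dump_filter.txt, it would be passed as pg_dump --filter=dump_filter.txt mydb, or fed on stdin with --filter - as in the shell examples above.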
Re: Some regression tests for the pg_control_*() functions
On Tue, Oct 25, 2022 at 11:07 AM Michael Paquier wrote: > > Hi all, > > As mentioned in [1], there are no regression tests for the SQL control > functions: pg_control_checkpoint, pg_control_recovery, > pg_control_system and pg_control_init. > > It would be minimal to check their execution, as of a "SELECT FROM > func()", still some validation can be done on its output as long as > the test is portable enough (needs transparency for wal_level, commit > timestamps, etc.). > > Attached is a proposal to provide some coverage. Some of the checks > could be just removed, like the ones for non-NULL fields, but I have > written out everything to show how much could be done. > > Thoughts? > > [1]: https://www.postgresql.org/message-id/yzy0ilxnbmaxh...@paquier.xyz +1 for improving the test coverage. Is there a strong reason to validate individual output columns rather than "select count(*) > 0 from pg_control_*();" sort of tests? If the intention is to validate the pg_controlfile contents, we have pg_controldata to look at and pg_control_*() functions doing CRC checks. If this isn't enough, we can have a pg_control_validate() function to do all the necessary checks and simplify the tests, no? -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
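For reference, the minimal style of test being suggested would look something like this; the function names are the existing pg_control_* SQL functions, and the assertion style is only illustrative:

-- Bare execution checks: each function should return exactly one row,
-- regardless of wal_level, commit timestamps, or other settings.
SELECT count(*) > 0 AS ok FROM pg_control_system();
SELECT count(*) > 0 AS ok FROM pg_control_checkpoint();
SELECT count(*) > 0 AS ok FROM pg_control_recovery();
SELECT count(*) > 0 AS ok FROM pg_control_init();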
Re: Move backup-related code to xlogbackup.c/.h
On Mon, Oct 24, 2022 at 1:00 PM Michael Paquier wrote: > > On Wed, Oct 19, 2022 at 09:07:04PM +0530, Bharath Rupireddy wrote: > > XLogBackupResetRunning() seemed better. +1 for the above function names. > > I see what you are doing here. XLogCtl would still live in xlog.c, > but we want to have functions that are able to manipulate some of its > fields. Right. > I am not sure I like that much because it introduces a > circular dependency between xlog.c and xlogbackup.c. As of HEAD, > xlog.c calls build_backup_content() from xlogbackup.c, which is fine > as xlog.c is kind of a central piece that feeds on the insert and > recovery pieces. However your patch makes some code paths of > xlogbackup.c call routines from xlog.c, and I don't think that we > should do that. If you're talking about header file dependency, there's already a header file dependency between them - xlog.c includes xlogbackup.h for build_backup_content() and xlogbackup.c includes xlog.h for wal_segment_size. And, I think the same kind of dependency exists between xlog.c and xlogrecovery.c. Please note that we're trying to reduce the xlog.c file size apart from centralizing backup-related code. > > I'm okay either way. > > > > Please see the attached v8 patch set. > > Among all that, CleanupBackupHistory() is different, still it has a > dependency with some of the archiving pieces.. Is there a problem with that? This function is used solely by backup functions and it happens to use one of the archiving utility functions. Please see the other archiving utility functions being used elsewhere in the code, not only in xlog.c - for instance, KeepFileRestoredFromArchive() and XLogArchiveNotify(). I'm attaching the v9 patch set after rebasing. Please review it further. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com From 385f42ebf6b85fd4a0ad6211428f4dd92e53b269 Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy Date: Wed, 26 Oct 2022 05:15:37 + Subject: [PATCH v9] Add functions for xlogbackup.c to call back into xlog.c --- src/backend/access/transam/xlog.c | 175 - src/include/access/xlog.h | 1 + src/include/access/xlog_internal.h | 8 ++ 3 files changed, 131 insertions(+), 53 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 8f10effe3a..34260a2434 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -8311,9 +8311,7 @@ do_pg_backup_start(const char *backupidstr, bool fast, List **tablespaces, * runningBackups, to ensure adequate interlocking against * XLogInsertRecord(). */ - WALInsertLockAcquireExclusive(); - XLogCtl->Insert.runningBackups++; - WALInsertLockRelease(); + XLogBackupSetRunning(); /* * Ensure we decrement runningBackups if we fail below. NB -- for this to @@ -8383,12 +8381,8 @@ do_pg_backup_start(const char *backupidstr, bool fast, List **tablespaces, * to restore starting from the checkpoint is precisely the REDO * pointer. 
*/ - LWLockAcquire(ControlFileLock, LW_SHARED); - state->checkpointloc = ControlFile->checkPoint; - state->startpoint = ControlFile->checkPointCopy.redo; - state->starttli = ControlFile->checkPointCopy.ThisTimeLineID; - checkpointfpw = ControlFile->checkPointCopy.fullPageWrites; - LWLockRelease(ControlFileLock); + GetCheckpointLocation(&state->checkpointloc, &state->startpoint, + &state->starttli, &checkpointfpw); if (backup_started_in_recovery) { @@ -8399,9 +8393,7 @@ do_pg_backup_start(const char *backupidstr, bool fast, List **tablespaces, * (i.e., since last restartpoint used as backup starting * checkpoint) contain full-page writes. */ -SpinLockAcquire(&XLogCtl->info_lck); -recptr = XLogCtl->lastFpwDisableRecPtr; -SpinLockRelease(&XLogCtl->info_lck); +recptr = XLogGetLastFPWDisableRecptr(); if (!checkpointfpw || state->startpoint <= recptr) ereport(ERROR, @@ -8434,13 +8426,7 @@ do_pg_backup_start(const char *backupidstr, bool fast, List **tablespaces, * taking a checkpoint right after another is not that expensive * either because only few buffers have been dirtied yet. */ - WALInsertLockAcquireExclusive(); - if (XLogCtl->Insert.lastBackupStart < state->startpoint) - { -XLogCtl->Insert.lastBackupStart = state->startpoint; -gotUniqueStartpoint = true; - } - WALInsertLockRelease(); + gotUniqueStartpoint = XLogBackupSetLastStart(state->startpoint); } while (!gotUniqueStartpoint); /* @@ -8549,6 +8535,15 @@ get_backup_status(void) return sessionBackupState; } +/* + * Utility routine to reset the session-level status of a backup running. + */ +void +reset_backup_status(void) +{ + sessionBackupState = SESSION_BACKUP_NONE; +} + /* * do_pg_backup_stop * @@ -8590,33 +8585,16 @@ do_pg_backup_stop(BackupState *state, bool waitforarchive)
Re: Allow file inclusion in pg_hba and pg_ident files
On Wed, Oct 26, 2022 at 11:19:48AM +0800, Julien Rouhaud wrote: > That wouldn't be overdoing anymore if we remove the line number / filename > from > the fill_*_line prototypes, right? So, I have spent a good portion of today looking at what you have here, applying 0001 and 0003 while fixing, rebasing and testing the whole, and discarding 0002 (we could do more for the line number and source file in terms of the LOGs reported for a regexec failure). Now remains 0004, which is the core of the proposal; while it needs a rebase, I have not been able to spend much time looking at its internals. In order to help with the follow-up steps, I have spotted a few areas that could be done first: - The diffs in guc.h/c for the introduction of GetDirConfFiles() to get a set of files in a directory (got to think a bit more about ParseConfigFp() when it comes to the keywords, but that comes into play with the parsing logic, which is different). - The TAP test, which is half the size of the patch in line count. Could it be possible to make it more digestible, introducing a basic infrastructure to check a set of rules in pg_hba.conf and pg_ident.conf without the inclusion logic? Checks for error patterns (which I agree we sorely lack tests for) look like something we'd want to tackle independently of the inclusion logic, and it should be built on top of a basic test infra, at least. -- Michael
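As a possible starting point for that basic infrastructure, a TAP helper along these lines could work. The node-management calls are the standard PostgreSQL::Test::Cluster API, but the helper and the rule being checked are only illustrative:

use strict;
use warnings;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;

my $node = PostgreSQL::Test::Cluster->new('hba_rules');
$node->init;
$node->start;

# Install a single pg_hba.conf rule and reload; pg_hba.conf is re-read
# on SIGHUP, so no restart is needed.
sub set_hba_rule
{
	my ($node, $rule) = @_;

	unlink($node->data_dir . '/pg_hba.conf');
	$node->append_conf('pg_hba.conf', $rule);
	$node->reload;
	return;
}

set_hba_rule($node, 'local all all trust');
$node->connect_ok('dbname=postgres', 'trust rule accepts connections');

done_testing();

Checks for the error patterns could then be layered on top with connect_fails() and log_like, and the inclusion logic tested separately once that base exists.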