Re: BitmapHeapScan streaming read user and prelim refactoring
Hi,

On 2025-03-22 16:14:11 -0400, Andres Freund wrote:
> On 2025-03-22 22:00:00 +0200, Alexander Lakhin wrote:
> > @@ -24,14 +24,14 @@
> > SELECT count(*) FROM bmscantest WHERE a = 1 AND b = 1;
> > count
> > ---
> > - 23
> > + 18
> > (1 row)
>
> Is it possible that this is the bug reported here:
> https://postgr.es/m/873c33c5-ef9e-41f6-80b2-2f5e11869f1c%40garret.ru
>
> I.e. that the all-visible logic in bitmap heapscans is fundamentally broken?
>
> I can reproduce different results on a fast machine by just putting a VACUUM
> FREEZE bmscantest after the CREATE INDEXes in bitmapops.sql. After I disable
> the all-visible logic in bitmapheap_stream_read_next(), I can't observe such a
> difference anymore.

Hm, it's clearly related to the all-visible path, but I think the bug is actually independent of what was reported there. The bug you reported is perfectly reproducible, without any concurrency, after all.

Here's a smaller reproducer:

CREATE TABLE bmscantest (a int, t text);
INSERT INTO bmscantest SELECT (r%53), 'foo' FROM generate_series(1,7) r;
CREATE INDEX i_bmtest_a ON bmscantest(a);
set enable_indexscan=false;
set enable_seqscan=false;
-- Lower work_mem to trigger use of lossy bitmaps
set work_mem = 64;
SELECT count(*) FROM bmscantest WHERE a = 1;
vacuum freeze bmscantest;
SELECT count(*) FROM bmscantest WHERE a = 1;
-- clean up
DROP TABLE bmscantest;

The first SELECT reports 1321, the second 572 tuples.

Greetings,

Andres Freund
Re: Update LDAP Protocol in fe-connect.c to v3
Andrew Jackson writes: > Currently the LDAP usage in fe-connect.c does not explicitly set the > protocol version to v3. This causes issues with many LDAP servers as they > will often require clients to use the v3 protocol and disallow any use of > the v2 protocol. This is the first complaint I can recall hearing about that, so exactly which ones are "many"? Also, are we really sufficiently compliant with v3 that just adding this bit is enough? > One further note is that I do not currently see any test coverage over the > LDAP functionality in `fe-connect.c`. I am happy to add that to this patch > if needed. src/test/ldap/ doesn't do it for you? regards, tom lane
Re: Proposal - Allow extensions to set a Plan Identifier
> On Sat, Mar 22, 2025 at 11:50:06PM +0100, Andrei Lepikhov wrote:
> > planId actually looks less controversial than queryId or plan_node_id. At
> > the same time, it is not free from the different logic that extensions may
> > incorporate into this value: I can imagine, for example, the attempt of
> > uniquely numbering plans related to the same queryId, plain hashing of the
> > plan tree, hashing without constants, etc.

I think plan_node_id is probably the least controversial because that value comes straight from core, and different extensions cannot have their own interpretation of what that value could be. Computing a plan_id is even more open to interpretation than it is for query_id perhaps, which is why giving this ability to extensions will be useful. Different extensions may choose to compute a plan_id in different ways, but they all should be able to push this value into core, and this is what this patch will allow for.

FWIW, Lukas did start a Wiki [0] to open the discussion for what parts of the plan should be used to compute a plan_id, and maybe we can in the future compute a plan_id in core by default.

> > So, it seems that extensions may conflict with the same field. Are we sure
> > that will happen if they are loaded in different orders in neighbouring
> > backends?
>
> Depends on what kind of computation one wants to do, and surely
> without some coordination across the extensions you are using these
> cannot be shared, but it's no different from the concept of a query
> ID.

Correct. This was also recently raised in the "making Explain extensible" thread [1]. That is the nature of extensions, and coordination is required.

[0] https://wiki.postgresql.org/wiki/Plan_ID_Jumbling
[1] https://www.postgresql.org/message-id/CA%2BTgmoZ8sTodL-orXQm51_npNxuDAS86BS5kC8t0LULneShRbg%40mail.gmail.com

--
Sami Imseih
Amazon Web Services (AWS)
Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment
After considering the opinions expressed in this discussion, I tend to agree more with David. If we add info about estimated unique keys and estimated capacity, then any additional information - such as evict_ratio and hit_ratio - can also be calculated, as EXPLAIN ANALYZE provides all the necessary details to compute these values. For now, I’m attaching a rebased v4 patch, which includes the fix for ExplainPropertyText. -- Best regards, Ilia Evdokimov, Tantor Labs LLC. From b60e508583b1cb7f30814cef6f39c4b570693350 Mon Sep 17 00:00:00 2001 From: Ilia Evdokimov Date: Fri, 21 Mar 2025 11:52:29 +0300 Subject: [PATCH v4] Show ndistinct and est_entries in EXPLAIN for Memoize --- src/backend/commands/explain.c | 16 src/backend/optimizer/path/costsize.c | 3 +++ src/backend/optimizer/plan/createplan.c | 10 +++--- src/backend/optimizer/util/pathnode.c | 3 +++ src/include/nodes/pathnodes.h | 2 ++ src/include/nodes/plannodes.h | 6 ++ 6 files changed, 37 insertions(+), 3 deletions(-) diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c index 391b34a2af2..cb7e295ca8a 100644 --- a/src/backend/commands/explain.c +++ b/src/backend/commands/explain.c @@ -3628,6 +3628,22 @@ show_memoize_info(MemoizeState *mstate, List *ancestors, ExplainState *es) ExplainPropertyText("Cache Key", keystr.data, es); ExplainPropertyText("Cache Mode", mstate->binary_mode ? "binary" : "logical", es); + if (es->costs) + { + if (es->format == EXPLAIN_FORMAT_TEXT) + { + ExplainIndentText(es); + appendStringInfo(es->str, "Estimated Capacity: %u Estimated Distinct Lookup Keys: %0.0f\n", + ((Memoize *) plan)->est_entries, + ((Memoize *) plan)->est_unique_keys); + } + else + { + ExplainPropertyUInteger("Estimated Capacity", "", ((Memoize *) plan)->est_entries, es); + ExplainPropertyFloat("Estimated Distinct Lookup Keys", "", ((Memoize *) plan)->est_unique_keys, 0, es); + } + } + pfree(keystr.data); if (!es->analyze) diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c index f6f77b8fe19..e3abf9ccc26 100644 --- a/src/backend/optimizer/path/costsize.c +++ b/src/backend/optimizer/path/costsize.c @@ -2604,6 +2604,9 @@ cost_memoize_rescan(PlannerInfo *root, MemoizePath *mpath, mpath->est_entries = Min(Min(ndistinct, est_cache_entries), PG_UINT32_MAX); + /* Remember ndistinct for a potential EXPLAIN later */ + mpath->est_unique_keys = ndistinct; + /* * When the number of distinct parameter values is above the amount we can * store in the cache, then we'll have to evict some entries from the diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c index 75e2b0b9036..596b82c5dc9 100644 --- a/src/backend/optimizer/plan/createplan.c +++ b/src/backend/optimizer/plan/createplan.c @@ -284,7 +284,8 @@ static Material *make_material(Plan *lefttree); static Memoize *make_memoize(Plan *lefttree, Oid *hashoperators, Oid *collations, List *param_exprs, bool singlerow, bool binary_mode, - uint32 est_entries, Bitmapset *keyparamids); + uint32 est_entries, Bitmapset *keyparamids, + double est_unique_keys); static WindowAgg *make_windowagg(List *tlist, WindowClause *wc, int partNumCols, AttrNumber *partColIdx, Oid *partOperators, Oid *partCollations, int ordNumCols, AttrNumber *ordColIdx, Oid *ordOperators, Oid *ordCollations, @@ -1703,7 +1704,8 @@ create_memoize_plan(PlannerInfo *root, MemoizePath *best_path, int flags) plan = make_memoize(subplan, operators, collations, param_exprs, best_path->singlerow, best_path->binary_mode, - best_path->est_entries, 
keyparamids); + best_path->est_entries, keyparamids, + best_path->est_unique_keys); copy_generic_path_info(&plan->plan, (Path *) best_path); @@ -6636,7 +6638,8 @@ materialize_finished_plan(Plan *subplan) static Memoize * make_memoize(Plan *lefttree, Oid *hashoperators, Oid *collations, List *param_exprs, bool singlerow, bool binary_mode, - uint32 est_entries, Bitmapset *keyparamids) + uint32 est_entries, Bitmapset *keyparamids, + double est_unique_keys) { Memoize*node = makeNode(Memoize); Plan *plan = &node->plan; @@ -6654,6 +6657,7 @@ make_memoize(Plan *lefttree, Oid *hashoperators, Oid *collations, node->binary_mode = binary_mode; node->est_entries = est_entries; node->keyparamids = keyparamids; + node->est_unique_keys = est_unique_keys; return node; } diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c index 93e73cb44db..1fbcda99067 100644 --- a/src/backend/optimizer/util/pathnode.c +++ b/src/backend/optimizer/util/pathnode.c @@ -1701,6 +1701,9 @@ create_memoize_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath, Assert(enable_memoize); pathnode->path.disabled_nodes = subpath->disabled_nodes; + /* Estimated numb
Re: query_id: jumble names of temp tables for better pg_stat_statement UX
On Sat, Mar 22, 2025 at 09:04:19PM -0400, Tom Lane wrote: > Right. I'm arguing that that's good. The proposed patch already > obscures the difference between similar table names in different > (temp) schemas, and I'm suggesting that taking that a bit further > would be fine. > > Note that if the tables we're considering don't have identical > rowtypes, the queries would likely jumble differently anyway > due to differences in Vars' varattno and vartype. Not for the types AFAIK, the varattnos count in, but perhaps for the same argument as previously it's just kind of OK? Please see the tests in the attached about that. I've spent a few hours looking at the diffs of a pgss dump before and after the fact. The reduction in the number of entries seem to come mainly from tests where we do a successive creates and drops of the same table name. There are quite a few of them for updatable views, but it's not the only one. A good chunk comes from tables with simpler and rather generic names. So your idea to use the relation name in eref while skipping the column list looks kind of promising. Per se the attached. Thoughts? -- Michael From fd2bc0e26134b7d74fe33da99eab6623a9859f22 Mon Sep 17 00:00:00 2001 From: Michael Paquier Date: Sun, 23 Mar 2025 15:33:05 +0900 Subject: [PATCH v4 1/2] Add support for custom_query_jumble at field-level This option gives the possibility for query jumbling to force custom jumbling policies specific to a field of a node. When dealing with complex node structures, this is simpler than having to enforce a custom function across the full node. Custom functions need to be defined as _jumble${node}_${field}, are given in input the parent node and the field value. The field value is not really required if we have the parent Node, but it makes custom implementations easier to follow with field-specific definitions and values. The code generated by gen_node_support.pl uses a macro called JUMBLE_CUSTOM(), that hides the objects specific to queryjumblefuncs.c. --- src/include/nodes/nodes.h | 4 src/backend/nodes/gen_node_support.pl | 15 +-- src/backend/nodes/queryjumblefuncs.c | 6 ++ 3 files changed, 23 insertions(+), 2 deletions(-) diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h index d18044b4e650..adec68a45ab8 100644 --- a/src/include/nodes/nodes.h +++ b/src/include/nodes/nodes.h @@ -54,6 +54,7 @@ typedef enum NodeTag * readfuncs.c. * * - custom_query_jumble: Has custom implementation in queryjumblefuncs.c. + * Available as well as a per-field attribute. * * - no_copy: Does not support copyObject() at all. * @@ -101,6 +102,9 @@ typedef enum NodeTag * - equal_ignore_if_zero: Ignore the field for equality if it is zero. * (Otherwise, compare normally.) * + * - custom_query_jumble: Has custom implementation in queryjumblefuncs.c + * applied for the field of a node. Available as well as a node attribute. + * * - query_jumble_ignore: Ignore the field for the query jumbling. Note * that typmod and collation information are usually irrelevant for the * query jumbling. 
diff --git a/src/backend/nodes/gen_node_support.pl b/src/backend/nodes/gen_node_support.pl index 7e3f335ac09d..2d62ecb1d9d6 100644 --- a/src/backend/nodes/gen_node_support.pl +++ b/src/backend/nodes/gen_node_support.pl @@ -471,6 +471,7 @@ foreach my $infile (@ARGV) && $attr !~ /^read_as\(\w+\)$/ && !elem $attr, qw(copy_as_scalar +custom_query_jumble equal_as_scalar equal_ignore equal_ignore_if_zero @@ -1283,12 +1284,17 @@ _jumble${n}(JumbleState *jstate, Node *node) my $t = $node_type_info{$n}->{field_types}{$f}; my @a = @{ $node_type_info{$n}->{field_attrs}{$f} }; my $query_jumble_ignore = $struct_no_query_jumble; + my $query_jumble_custom = 0; my $query_jumble_location = 0; my $query_jumble_squash = 0; # extract per-field attributes foreach my $a (@a) { + if ($a eq 'custom_query_jumble') + { +$query_jumble_custom = 1; + } if ($a eq 'query_jumble_ignore') { $query_jumble_ignore = 1; @@ -1304,8 +1310,13 @@ _jumble${n}(JumbleState *jstate, Node *node) } # node type - if (($t =~ /^(\w+)\*$/ or $t =~ /^struct\s+(\w+)\*$/) - and elem $1, @node_types) + if ($query_jumble_custom) + { + # Custom function that applies to one field of a node. + print $jff "\tJUMBLE_CUSTOM($n, $f);\n"; + } + elsif (($t =~ /^(\w+)\*$/ or $t =~ /^struct\s+(\w+)\*$/) + and elem $1, @node_types) { # Squash constants if requested. if ($query_jumble_squash) diff --git a/src/backend/nodes/queryjumblefuncs.c b/src/backend/nodes/queryjumblefuncs.c index 189bfda610aa..9b9cf6f34381 100644 --- a/src/backend/nodes/queryjumblefuncs.c +++ b/src/backend/nodes/queryjumblefuncs.c @@ -333,6 +333,12 @@ do { \ if (expr->str) \ AppendJumble(jstate, (const unsigned char *) (expr->str), strlen(expr->str) + 1); \ } while(0)
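To make the new per-field hook more concrete, here is a rough sketch of what a custom function following the _jumble${node}_${field} naming convention could look like, as it might appear in queryjumblefuncs.c, using the RangeTblEntry.eref case discussed later in this digest. The exact signature expected by JUMBLE_CUSTOM() and the use of AppendJumble() here are assumptions based on the descriptions in this thread, not code taken from the patch:

    /*
     * Hypothetical custom per-field jumble function, named per the
     * _jumble${node}_${field} convention.  It receives the parent node and
     * the field value; here it jumbles only the relation's alias name and
     * ignores the expanded column list.
     */
    static void
    _jumbleRangeTblEntry_eref(JumbleState *jstate, Node *node, Node *field)
    {
        RangeTblEntry *rte = (RangeTblEntry *) node;
        Alias      *expanded_ref = (Alias *) field;

        (void) rte;             /* parent node available if needed */
        AppendJumble(jstate, (const unsigned char *) expanded_ref->aliasname,
                     strlen(expanded_ref->aliasname) + 1);
    }

The point of the per-field attribute is exactly this kind of case: only one field of a large node needs special treatment, so a whole-node custom function would be overkill.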
Re: Add Postgres module info
I spent awhile reviewing the v5 patch, and here's a proposed v6. Some notes: * I didn't like depending on offsetof(Pg_magic_struct, module_extra) to determine which parts of the struct are checked for compatibility. It just seems way too easy to break that with careless insertion of new fields, and such breakage might not cause obvious failures. I think the right thing is to break out the ABI-checking fields as their own sub-struct, rather than breaking out the new fields as a sub-struct. * I renamed the inquiry function to pg_get_loaded_modules, since it only works on loaded modules but that's hardly clear from the previous name. * It is not clear to me what permission restrictions we should put on pg_get_loaded_modules, but it is clear that "none" is the wrong answer. In particular, exposing the full file path of loaded modules is against our rules: unprivileged users are not supposed to be able to learn anything about the filesystem underneath the server. (This is why for instance an unprivileged user can't read the data_directory GUC.) In the attached I made the library path read as NULL unless the user has pg_read_server_files, but I'm not attached to that specific solution. One thing not to like is that it's very likely that you'd just get a row of NULLs and no useful info about a module at all. Another idea perhaps could be to strip off the directory path and maybe the filename extension if the user doesn't have privilege. Or we could remove the internal permission check and instead gate access to the function altogether with grantable EXECUTE privilege. (This might be the right answer, since it's not clear that Joe Unprivileged User should be able to know what modules are loaded; some of them might have security implications.) In any case, requiring pg_read_server_files feels a little too strong, but I don't see an alternative role I like better. The EXECUTE-privilege answer would at least let installations adjust the function's availability to their liking. * I didn't like anything about the test setup. Making test_misc dependent on other modules is a recipe for confusion, and perhaps for failures in parallel builds. (Yes, I see somebody already made it depend on injection_points. But doubling down on a bad idea doesn't make it less bad.) Also, the test would fail completely in an installation that came with any preloaded modules, which hardly seems like an improbable future situation. I think we need to restrict what modules we're looking at with a WHERE clause to prevent that. After some thought I went back to the upthread idea of just having auto_explain as a test case. Still TBD: * I'm not happy with putting pg_get_loaded_modules into dfmgr.c. It feels like the wrong layer to have a SQL-callable function, and the large expansion in its #include list is evidence that we're adding functionality that doesn't belong there. But I'm not quite sure where to put it instead. Also, the naive way to do that would require exporting DynamicFileList which doesn't feel nice either. Maybe we could make dfmgr.c export some sort of iterator function? * Should we convert our existing modules to use PG_MODULE_MAGIC_EXT? I'm mildly in favor of that, but I think we'd need some automated way to manage their version strings, and I don't know what that ought to look like. Maybe it'd be enough to make all the in-core modules use PG_VERSION as their version string, but I think that might put a dent in the idea of the version strings following semantic versioning rules. 
regards, tom lane From 2f8e182dbd26e03a9568393307b64cb26b3842fd Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Sat, 22 Mar 2025 18:39:15 -0400 Subject: [PATCH v6] Introduce PG_MODULE_MAGIC_EXT macro. This macro allows dynamically loaded shared libraries (modules) to provide a wired-in module name and version, and possibly other compile-time-constant fields in future. This information can be retrieved with the new pg_get_loaded_modules() function. This feature is expected to be particularly useful for modules that do not have any exposed SQL functionality and thus are not associated with a SQL-level extension object. But even for modules that do belong to extensions, being able to verify the actual code version can be useful. Author: Andrei Lepikhov Reviewed-by: Yurii Rashkovskii Reviewed-by: Tom Lane Discussion: https://postgr.es/m/dd4d1b59-d0fe-49d5-b28f-1e463b68f...@gmail.com --- contrib/auto_explain/auto_explain.c| 5 +- contrib/auto_explain/t/001_auto_explain.pl | 11 +++ doc/src/sgml/func.sgml | 24 +++ doc/src/sgml/xfunc.sgml| 24 +++ src/backend/utils/fmgr/dfmgr.c | 81 -- src/include/catalog/pg_proc.dat| 7 ++ src/include/fmgr.h | 62 ++--- src/tools/pgindent/typedefs.list | 1 + 8 files changed, 196 insertions(+), 19 deletions(-) diff --git a/
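As a rough illustration of how a module author would be expected to use the new macro, a sketch might look like the following. The designated-initializer form and the exact field names are assumptions drawn from the commit message (a wired-in module name and version), not necessarily the final syntax:

    #include "postgres.h"
    #include "fmgr.h"

    /*
     * Hypothetical usage sketch: declare module metadata so that
     * pg_get_loaded_modules() can report it.  Whether in-core modules
     * should simply use PG_VERSION here is one of the open questions above.
     */
    PG_MODULE_MAGIC_EXT(
        .name = "auto_explain",
        .version = PG_VERSION
    );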
Update LDAP Protocol in fe-connect.c to v3
Currently the LDAP usage in fe-connect.c does not explicitly set the protocol version to v3. This causes issues with many LDAP servers, as they will often require clients to use the v3 protocol and disallow any use of the v2 protocol. Further, the other usage of LDAP in postgres (in `backend/libpq/auth.c`) uses the v3 protocol. This patch changes fe-connect.c so that it uses the v3 protocol, similarly to `backend/libpq/auth.c`. One further note is that I do not currently see any test coverage of the LDAP functionality in `fe-connect.c`. I am happy to add that to this patch if needed.
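For reference, the kind of change being proposed is a call through the standard OpenLDAP client API, the same call that `backend/libpq/auth.c` already makes. The helper below is a minimal sketch; its name and placement in fe-connect.c are illustrative assumptions, not the patch itself:

    #include <ldap.h>

    /* After ldap_initialize() succeeds, request LDAPv3 explicitly. */
    static bool
    set_ldap_protocol_v3(LDAP *ld)
    {
        int         ldapversion = LDAP_VERSION3;

        /* Many servers reject v2 binds, so force v3 before the first operation. */
        if (ldap_set_option(ld, LDAP_OPT_PROTOCOL_VERSION, &ldapversion) != LDAP_SUCCESS)
            return false;
        return true;
    }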
Re: Proposal - Allow extensions to set a Plan Identifier
On Fri, Mar 21, 2025 at 09:25:24AM -0400, Sami Imseih wrote: >> planner() is the sole place in the core code where the planner hook >> can be called. Shouldn't we have at least a call to >> pgstat_report_plan_id() after planning a query? At least that should >> be the behavior I'd expect, where a module pushes a planId to a >> PlannedStmt, then core publishes it to the backend entry in non-force >> mode. > > I agree. I was just thinking we rely on the exec_ routines to report the > plan_id > at the start. But, besides the good reason you give, reporting > (slightly) earlier is > better for monitoring tools; as it reduces the likelihood they find an empty > plan_id. Yep. pgstat_report_plan_id() is not something that extensions should do, but they should report the planId in the PlannedStmt and let the backend do the rest. > Overall, v3 LGTM Thanks. If anybody has any objections and/or comments, now would be a good time. I'll revisit this thread at the beginning of next week. -- Michael signature.asc Description: PGP signature
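As a rough illustration of the flow described above (the extension computes a plan identifier in its planner hook and stores it in the PlannedStmt, and core then publishes it via pgstat_report_plan_id()), a sketch might look like this. The planId field name, its type, and the compute_plan_id() helper are assumptions drawn from this thread rather than a final API:

    #include "postgres.h"
    #include "fmgr.h"
    #include "optimizer/planner.h"

    PG_MODULE_MAGIC;

    static planner_hook_type prev_planner_hook = NULL;

    /* Hypothetical helper: derive an identifier from the finished plan tree. */
    extern uint64 compute_plan_id(PlannedStmt *stmt);

    static PlannedStmt *
    my_planner(Query *parse, const char *query_string, int cursorOptions,
               ParamListInfo boundParams)
    {
        PlannedStmt *result;

        if (prev_planner_hook)
            result = prev_planner_hook(parse, query_string, cursorOptions,
                                       boundParams);
        else
            result = standard_planner(parse, query_string, cursorOptions,
                                      boundParams);

        /* Push the extension's identifier into core; core reports it later. */
        result->planId = compute_plan_id(result);

        return result;
    }

    void
    _PG_init(void)
    {
        prev_planner_hook = planner_hook;
        planner_hook = my_planner;
    }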
Re: meson and check-tests
Hi, please note, same file `/src/tools/testwrap` on the same line number is being changed in this patch [1] in the same commitfest. [1] https://commitfest.postgresql.org/patch/5602/ Regards, Rustam Allakov
Re: speedup COPY TO for partitioned table.
On Tue, 11 Mar 2025 at 18:24, jian he wrote:
>
> after my change:
>
> COPY TO can be used only with plain tables, not views, and does not
> copy rows from child tables, however COPY TO can be used with partitioned tables.
> For example, in a table inheritance hierarchy, COPY table TO copies
> the same rows as SELECT * FROM ONLY table.
> The syntax COPY (SELECT * FROM table) TO ... can be used to dump all
> of the rows in an inheritance hierarchy, or view.
>

I found an issue with the patch:

-- Setup
CREATE SERVER myserver FOREIGN DATA WRAPPER postgres_fdw OPTIONS (dbname 'testdb', port '5432');
CREATE TABLE t1(id int) PARTITION BY RANGE(id);
CREATE TABLE part1 PARTITION OF t1 FOR VALUES FROM (0) TO (5);
CREATE TABLE part2 PARTITION OF t1 FOR VALUES FROM (5) TO (15) PARTITION BY RANGE(id);
CREATE FOREIGN TABLE part2_1 PARTITION OF part2 FOR VALUES FROM (10) TO (15) SERVER myserver;

-- Create table in testdb
create table part2_1(id int);

-- Copy partitioned table data
postgres=# copy t1 to stdout(header);
server closed the connection unexpectedly
This probably means the server terminated abnormally before or while processing the request.

Stack trace for the same is:
#0 table_beginscan (rel=0x72b109f9aad8, snapshot=0x5daafa77e000, nkeys=0, key=0x0) at ../../../src/include/access/tableam.h:883
#1 0x5daadf89eb9b in DoCopyTo (cstate=0x5daafa71e278) at copyto.c:1105
#2 0x5daadf8913f4 in DoCopy (pstate=0x5daafa6c5fc0, stmt=0x5daafa6f20c8, stmt_location=0, stmt_len=25, processed=0x7ffd3799c2f0) at copy.c:316
#3 0x5daadfc7a770 in standard_ProcessUtility (pstmt=0x5daafa6f21e8, queryString=0x5daafa6f15c0 "copy t1 to stdout(header);", readOnlyTree=false, context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x5daafa6f25a8, qc=0x7ffd3799c660) at utility.c:738

(gdb) f 0
#0 table_beginscan (rel=0x72b109f9aad8, snapshot=0x5daafa77e000, nkeys=0, key=0x0) at ../../../src/include/access/tableam.h:883
883 return rel->rd_tableam->scan_begin(rel, snapshot, nkeys, key, NULL, flags);

The table access method is not available in this case:
(gdb) p *rel->rd_tableam
Cannot access memory at address 0x0

This failure happens when we do table_beginscan on the part2_1 table.

Regards,
Vignesh
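The crash follows from rd_tableam being NULL for a foreign table, so the COPY TO code ends up calling through a NULL table access method when it reaches the foreign partition. One plausible direction, purely an assumption on my part and not a fix proposed in this thread, would be to reject (or otherwise special-case) non-heap children before scanning them:

    #include "postgres.h"
    #include "utils/rel.h"

    /*
     * Hypothetical check before scanning a child relation during COPY TO on
     * a partitioned table.  Foreign partitions have no table access method
     * (rd_tableam is NULL), so handing them to table_beginscan() crashes.
     */
    static void
    check_child_copyable(Relation scan_rel)
    {
        if (scan_rel->rd_rel->relkind == RELKIND_FOREIGN_TABLE)
            ereport(ERROR,
                    (errcode(ERRCODE_WRONG_OBJECT_TYPE),
                     errmsg("cannot copy from foreign table \"%s\"",
                            RelationGetRelationName(scan_rel))));
    }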
Re: Next commitfest app release is planned for March 18th
On Fri, 21 Mar 2025 at 18:53, Tom Lane wrote: > > Peter Eisentraut writes: > > - If I'm the committer for a patch but not a reviewer, and the patch is > > in "needs review" status, then the patch is formally speaking not > > actionable by me and should not be under "Patches that are ready for > > your review". Perhaps it should be under "Blocked on others" [namely > > the reviewers], or in a different category. I think I agree with this... but is that actually a common situation? I'd expect people to mostly "claim a patch as committer" when it's in the "ready for committer" state. > > - Conversely, if I'm the reviewer for a patch but not the committer, and > > the patch is in "ready for committer" status, then it's also not > > "Patches that are ready for your review". This might similarly be > > "Blocked on others" [namely the committer]. > > Both of these things would work correctly only for patches that you > have already claimed as committer. I'm not sure about other people, > but I rarely claim a patch as committer until I'm actually on the > verge of committing it. I might mark myself as reviewer sometime > sooner than that, in which case your second proposal would be a > net negative for me. I think probably a nice intermediary solution would be to handle patches with an assigned committer differently. If a committer assigned themselves as committer and it's "ready for committer", then I don't think it should be marked as "ready for your review" for other committers. If there's no committer, I think it should be, because *some committer* needs to review the patch but it's unknown which one. > I don't think we should encourage committers > to claim patches early, because then they are a single point of > failure (work-stoppage) in a way that a reviewer is not. Maybe > we need a way for committers to mark patches as things they want > to pay attention to, without thereby blocking other committers > from taking up the patch? Yeah I think we need a way to "star" a patch (not just for committers, but for anyone). After creating this initial version of this dashboard I realized it's also possible to "subscribe to updates" for a patch, which means you'll get emails about it. I think if you "subscribe for updates" you should also see it in your dashboard (which you currently don't). But I also think there should be a way to "star" something, so you see it in your dashboard without receiving email updates about it. > > - Also, my dashboard shows patches from past and future commitfests. I > > don't know what I'm supposed to do with that. The purpose of having a > > "current" commitfest is that you work on that one when it's current. > > This might make sense for the patch author, but I agree it's not > appropriate for anyone else. I'd love to hear some more thoughts on this. Especially from Peter G, because he explicitly requested to see patches from different commitfests in this dashboard. So I'm curious about his reasoning. Ofcourse we can make this configurable or have two separate pages, but I'm wondering if there's a way to change it in a way that works for everyone. Instead of showing all patches from all commitfests, how about we do the following: 1. Show all open patches where you are the author, no matter which commitfest they are in. 2. If there's currently no commitfest "In Progress", then we show all patches (where you are not the author) both from the Previous commitfest and the Open commitfest. 
I think it would be a shame if everyone's dashboard is empty in the first week after a commitfest, just because authors haven't moved their patches over to the Open commitfest. 3. If there's a commitfest "In Progress", then we only show patches (where you are not the author) from the "In Progress" commitfest to keep reviews focussed. 4. Have a separate section at the bottom of the page with all the patches that you are tracking, but are not matching one of the criteria from the three rules above.
Improve error reporting for few options in pg_createsubscriber
Hi, Currently, error reports for database, publication, subscription, and replication slots do not include the option name. This has been addressed by including the option name in the error messages, ensuring consistency similar to remove option. Additionally, pg_log_error and exit(1) have been replaced with a single pg_fatal statement which means the same. The attached patch implements these changes. Regards, Vignesh From 4d1c8ebcb5ad8c2f17b027d5bf23d063da419c9d Mon Sep 17 00:00:00 2001 From: Vignesh Date: Sat, 22 Mar 2025 18:56:15 +0530 Subject: [PATCH] Improve error reporting consistency in pg_createsubscriber Ensure consistent error reporting in pg_createsubscriber by including the option name in error messages. Additionally, replace pg_log_error and exit calls with pg_fatal. --- src/bin/pg_basebackup/pg_createsubscriber.c | 20 1 file changed, 4 insertions(+), 16 deletions(-) diff --git a/src/bin/pg_basebackup/pg_createsubscriber.c b/src/bin/pg_basebackup/pg_createsubscriber.c index e2d6b7544bf..2b9ec24efc2 100644 --- a/src/bin/pg_basebackup/pg_createsubscriber.c +++ b/src/bin/pg_basebackup/pg_createsubscriber.c @@ -2063,10 +2063,7 @@ main(int argc, char **argv) num_dbs++; } else -{ - pg_log_error("database \"%s\" specified more than once", optarg); - exit(1); -} + pg_fatal("database \"%s\" specified more than once for --database", optarg); break; case 'D': subscriber_dir = pg_strdup(optarg); @@ -2113,10 +2110,7 @@ main(int argc, char **argv) num_pubs++; } else -{ - pg_log_error("publication \"%s\" specified more than once", optarg); - exit(1); -} + pg_fatal("publication \"%s\" specified more than once for --publication", optarg); break; case 3: if (!simple_string_list_member(&opt.replslot_names, optarg)) @@ -2125,10 +2119,7 @@ main(int argc, char **argv) num_replslots++; } else -{ - pg_log_error("replication slot \"%s\" specified more than once", optarg); - exit(1); -} + pg_fatal("replication slot \"%s\" specified more than once for --replication-slot", optarg); break; case 4: if (!simple_string_list_member(&opt.sub_names, optarg)) @@ -2137,10 +2128,7 @@ main(int argc, char **argv) num_subs++; } else -{ - pg_log_error("subscription \"%s\" specified more than once", optarg); - exit(1); -} + pg_fatal("subscription \"%s\" specified more than once for --subscription", optarg); break; default: /* getopt_long already emitted a complaint */ -- 2.43.0
Re: Enhance 'pg_createsubscriber' to retrieve databases automatically when no database is provided.
On Sat, Mar 22, 2025 at 6:23 PM vignesh C wrote: > > On Fri, 21 Mar 2025 at 18:59, Shubham Khanna > wrote: > > > > > > During the recent testing, I observed that the tests were failing when > > using wait_for_slot_catchup(). To address this, I reverted to using > > wait_for_subscription_sync(), which was employed previously and has > > proven to be more reliable in ensuring test stability. > > Please let me know if there are any additional adjustments you would > > suggest or if you would like me to investigate further into > > wait_for_slot_catchup(). > > > > I have created a separate patch for the synopsis of '--all' option as > > suggested by Amit at [1]. The attached patch contains the suggested > > changes. > > I believe you added the following because pattern matching is > difficult for the $db1 and $db2 variables having special names: > +# Create a new database on node_p > +$node_p->safe_psql( > + "postgres", qq( > + CREATE DATABASE db1; > +)); > > How about we change it to verify the count of occurrences instead for > this case like below: > # Verify that the required logical replication objects are created. The > # expected count 3 refers to postgres, $db1 and $db2 databases. > is(scalar(() = $stderr =~ /creating publication/g), > 3, "verify publications are created for all databases"); > is(scalar(() = $stderr =~ /creating the replication slot/g), > 3, "verify replication slots are created for all databases"); > is(scalar(() = $stderr =~ /creating subscription/g), > 3, "verify subscriptions are created for all databases"); > > If you are ok, you can merge the changes attached. > I agree that verifying the count of occurrences is a more straightforward and effective approach, especially given the challenges with pattern matching for $db1 and $db2 variables with special names. This method simplifies validation and enhances robustness by explicitly ensuring the expected number of logical replication objects are created. I have reviewed and merged the proposed changes into the patch. The attached patches contain the suggested changes. Thanks and regards, Shubham Khanna. v18-0003-Synopsis-for-all-option.patch Description: Binary data v18-0001-Enhance-pg_createsubscriber-to-fetch-and-append-.patch Description: Binary data v18-0002-Additional-test-cases.patch Description: Binary data
Re: Next commitfest app release is planned for March 18th
Peter Geoghegan writes: > On Sat, Mar 22, 2025 at 7:15 AM Jelte Fennema-Nio wrote: >> Yeah I think we need a way to "star" a patch (not just for committers, >> but for anyone). > Right. In my view "starring" something should definitely not appear in > the "History Log" of the patch; it should be private. +1 > Personally, I think it'd be most useful for the dashboard to show all > open patches. Surely not "all"? We have that display already. Should be more like "patches that I have expressed an interest in or have reason to take current or future action on". In the case of stale patches that haven't been moved forward from a closed commitfest, perhaps we could compromise on Jelte's suggestion of putting those in a separate section at the bottom of the page. However, I still think such patches should be treated differently for the author than other people. The author does have current action to take on the patch, namely moving it forward or withdrawing it. regards, tom lane
Re: Remove redundant if-else in EXPLAIN by using ExplainPropertyText
On Fri, 21 Mar 2025 at 09:24, Ilia Evdokimov wrote:
> Thanks to David [0], we found that the if and else branches contained
> equivalent code. Since ExplainPropertyText already handles non-text
> formats, the extra condition is unnecessary.
>
> I reviewed other files related to EXPLAIN where similar patterns might
> exist, but this was the only instance I found.

I looked at:

git grep -E "appendStringInfo.*\\n" -- src/backend/commands/explain.c

and all of those results append multiple options; there are no single-option ones. The patch looks good to me and seems worth applying to master. I feel this should be fixed to avoid misleading future patches that add new properties to Memoize into copying the same pattern.

David
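For context, the redundancy being removed looks roughly like the following inside show_memoize_info(); this is an illustrative sketch, not the exact lines from the patch. ExplainPropertyText() already performs the indentation and "label: value" output for EXPLAIN_FORMAT_TEXT, so hand-rolling the text-format branch for a single property buys nothing:

    /* Redundant pattern: text format handled separately by hand. */
    if (es->format == EXPLAIN_FORMAT_TEXT)
    {
        ExplainIndentText(es);
        appendStringInfo(es->str, "Cache Mode: %s\n",
                         mstate->binary_mode ? "binary" : "logical");
    }
    else
        ExplainPropertyText("Cache Mode",
                            mstate->binary_mode ? "binary" : "logical", es);

    /* Equivalent: ExplainPropertyText() handles the text format too. */
    ExplainPropertyText("Cache Mode",
                        mstate->binary_mode ? "binary" : "logical", es);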
Re: BitmapHeapScan streaming read user and prelim refactoring
Hello Melanie, 15.03.2025 16:43, Melanie Plageman wrote: On Thu, Mar 13, 2025 at 5:41 PM Melanie Plageman wrote: Overall, I feel pretty good about merging this once Thomas merges the read stream patches. This was committed in 944e81bf99db2b5b70b, 2b73a8cd33b745c, and c3953226a07527a1e2. I've marked it as committed in the CF app. It looks like that change made the bitmapops test unstable (on slow animals?): [1], [2], [3]. I've reproduced such test failures when running TESTS=$(printf 'bitmapops %.0s' `seq 50`) make -s check-tests under Valgrind: ... ok 17 - bitmapops 52719 ms not ok 18 - bitmapops 57566 ms ok 19 - bitmapops 60179 ms ok 20 - bitmapops 32927 ms ok 21 - bitmapops 45127 ms ok 22 - bitmapops 42924 ms ok 23 - bitmapops 61035 ms ok 24 - bitmapops 56316 ms ok 25 - bitmapops 52874 ms not ok 26 - bitmapops 67468 ms ok 27 - bitmapops 55605 ms ok 28 - bitmapops 24021 ms ... diff -U3 /home/vagrant/postgresql/src/test/regress/expected/bitmapops.out /home/vagrant/postgresql/src/test/regress/results/bitmapops.out --- /home/vagrant/postgresql/src/test/regress/expected/bitmapops.out 2025-03-16 01:37:52.716885600 -0700 +++ /home/vagrant/postgresql/src/test/regress/results/bitmapops.out 2025-03-22 03:47:54.014702406 -0700 @@ -24,14 +24,14 @@ SELECT count(*) FROM bmscantest WHERE a = 1 AND b = 1; count --- - 23 + 18 (1 row) -- Test bitmap-or. SELECT count(*) FROM bmscantest WHERE a = 1 OR b = 1; count --- - 2485 + 1044 (1 row) -- clean up diff -U3 /home/vagrant/postgresql/src/test/regress/expected/bitmapops.out /home/vagrant/postgresql/src/test/regress/results/bitmapops.out --- /home/vagrant/postgresql/src/test/regress/expected/bitmapops.out 2025-03-16 01:37:52.716885600 -0700 +++ /home/vagrant/postgresql/src/test/regress/results/bitmapops.out 2025-03-22 03:54:53.129549597 -0700 @@ -31,7 +31,7 @@ SELECT count(*) FROM bmscantest WHERE a = 1 OR b = 1; count --- - 2485 + 1044 (1 row) git bisect for this deviation has pointed at 2b73a8cd3. [1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2025-03-16%2010%3A34%3A17 [2] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=flaviventris&dt=2025-03-17%2020%3A07%3A43 [3] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2025-03-17%2001%3A16%3A02 Best regards, Alexander Lakhin Neon (https://neon.tech)
Re: Parallel heap vacuum
Hi,

On 2025-03-20 01:35:42 -0700, Masahiko Sawada wrote:
> One plausible solution would be that we don't use ReadStream in
> parallel heap vacuum cases but directly use
> table_block_parallelscan_xxx() instead. It works but we end up having
> two different scan methods for parallel and non-parallel lazy heap
> scan. I've implemented this idea in the attached v12 patches.

I think that's a bad idea - this means we'll never be able to use direct IO for parallel VACUUMs, despite:

a) The CPU overhead of buffered reads being a problem for VACUUM.

b) Data ending up in the kernel page cache being rather wasteful for VACUUM, as that's often data that won't otherwise be used again soon. I.e. these reads would particularly benefit from using direct IO.

c) Even disregarding DIO, losing the ability to do larger reads, as provided by read streams, loses a fair bit of efficiency (just try doing a pg_prewarm of a large relation with io_combine_limit=1 vs. a larger io_combine_limit).

Greetings,

Andres Freund
Re: BitmapHeapScan streaming read user and prelim refactoring
Hi, On 2025-03-22 22:00:00 +0200, Alexander Lakhin wrote: > 15.03.2025 16:43, Melanie Plageman wrote: > > On Thu, Mar 13, 2025 at 5:41 PM Melanie Plageman > > wrote: > > > Overall, I feel pretty good about merging this once Thomas merges the > > > read stream patches. > > This was committed in 944e81bf99db2b5b70b, 2b73a8cd33b745c, and > > c3953226a07527a1e2. > > I've marked it as committed in the CF app. > > It looks like that change made the bitmapops test unstable (on slow > animals?): [1], [2], [3]. > I've reproduced such test failures when running > TESTS=$(printf 'bitmapops %.0s' `seq 50`) make -s check-tests > under Valgrind: > ... > ok 17 - bitmapops 52719 ms > not ok 18 - bitmapops 57566 ms > ok 19 - bitmapops 60179 ms > ok 20 - bitmapops 32927 ms > ok 21 - bitmapops 45127 ms > ok 22 - bitmapops 42924 ms > ok 23 - bitmapops 61035 ms > ok 24 - bitmapops 56316 ms > ok 25 - bitmapops 52874 ms > not ok 26 - bitmapops 67468 ms > ok 27 - bitmapops 55605 ms > ok 28 - bitmapops 24021 ms > ... > > diff -U3 /home/vagrant/postgresql/src/test/regress/expected/bitmapops.out > /home/vagrant/postgresql/src/test/regress/results/bitmapops.out > --- /home/vagrant/postgresql/src/test/regress/expected/bitmapops.out > 2025-03-16 01:37:52.716885600 -0700 > +++ /home/vagrant/postgresql/src/test/regress/results/bitmapops.out > 2025-03-22 03:47:54.014702406 -0700 > @@ -24,14 +24,14 @@ > SELECT count(*) FROM bmscantest WHERE a = 1 AND b = 1; > count > --- > - 23 > + 18 > (1 row) Is it possible that this is the bug reported here: https://postgr.es/m/873c33c5-ef9e-41f6-80b2-2f5e11869f1c%40garret.ru I.e. that the all-visible logic in bitmap heapscans is fundamentally broken? I can reproduce different results on a fast machien by just putting a VACUUM FREEZE bmscantest after the CREATE INDEXes in bitmapops.sql. After I disable the all-visible logic in bitmapheap_stream_read_next(), I can't observe such a difference anymore. Greetings, Andres Freund
Re: RFC: Allow EXPLAIN to Output Page Fault Information
On Wed, 19 Mar 2025 at 14:15, torikoshia wrote: > BTW based on your discussion, I thought this patch could not be merged > anytime soon. Does that align with your understanding? Yeah, that aligns with my understanding. I don't think it's realistic to get this merged before the code freeze, but I think both of the below issues could be resolved. > - With bgworker-based AIO, this patch could mislead users into > underestimating the actual storage I/O load, which is undesirable. To resolve this, I think the patch would need to change to not report anything if bgworker-based AIO is used. So I moved this patch to the next commitfest, and marked it as "waiting for author" there. > - With io_uring-based AIO, this patch could provide meaningful values, > but it may take some time before io_uring sees widespread adoption. I submitted this patch to help make io_uring-based AIO more of a reality: https://commitfest.postgresql.org/patch/5570/
Re: query_id: jumble names of temp tables for better pg_stat_statement UX
Michael Paquier writes: > On Sat, Mar 22, 2025 at 10:43:00AM +0100, Christoph Berg wrote: >> Are we at the point where the patch is already Ready for Committer? > I'll think a bit more about how to document all that. Anyway, yes, > I'm OK with the per-field custom_query_jumble, so let's move on with > that, so I will do something about that. I'm not terribly happy with the entire proposal. (1) I know it was asserted upthread that there was no performance impact, but I find that hard to believe. (2) This patch inserts catalog lookups into query ID computation, which AFAIK there never were before. This means you can't compute a query ID outside a transaction or in an already-failed transaction. Surely that's going to bite us eventually. (3) I think having query jumbling work differently for temp tables than other tables is a fundamentally bad idea. So my feeling is: if we think this is the behavior we want, let's do it across the board. I suggest that we simply drop the relid from the jumble and use the table alias (that is, eref->aliasname) instead. ISTM this fits well with the general trend in pg_stat_statements to merge statements together more aggressively than the original concept envisioned. regards, tom lane
Re: making EXPLAIN extensible
On Sat, Mar 22, 2025 at 4:46 AM Andrei Lepikhov wrote: > I skimmed through the code and tested how it works. > It looks good and has no apparent architectural dependencies. > But I haven't scrutinised it line-by-line and do not intend to do so. > I wanna say I hate the design of this module. Having a strong necessity > for extra explain tools (in the daily routine, all I have is the only > single explain analyse verbose output to find out planner/executor bug, > reproduce it and make a patch), I don't see a single case when I would > use this module. It adds a burden to fix its output on a node change > (you don't care, but it adds work to Postgres fork maintainers, too, for > nothing). Also, it breaks my understanding of the principles of the > Postgres code design - to start the discussion on how to show more, we > need only the bare minimum of code and output lines. > In my opinion, it should show as few parameters as possible to > demonstrate principles and test the code on a single planner node. It > only deserves src/test/modules because it is not helpful for a broad > audience. Gee, thanks for the ringing endorsement. :-) I think *I* will use it pretty regularly; I already have. In my experience, using debug_query_plan for this sort of thing sucks quite a lot. Finding the information you need in the output takes a long time because the output is quite long. This is much more understandable, at least for me. I agree with you that a trivial test module could demonstrate that the hook works, but a trivial example would not demonstrate that the hook can be used to do something actually useful. It sounds like what I've written also fails the "actually useful" test for you, but it doesn't for me. I'm not going to insist on shipping this if I'm the ONLY one who would ever get any use out of it, but I doubt that's the case. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Reduce TupleHashEntryData struct size by half
On Tue, 2025-03-04 at 17:28 -0800, Jeff Davis wrote:
> My results (with wide tables):
>
>                    GROUP BY   EXCEPT
> master:                2151     1732
> entire v8 series:      2054     1740

I'm not sure what I did with the EXCEPT test, above, perhaps I had filled the test table twice by mistake or something? New numbers below:

Data:

create table c (a int not null);
insert into c select a from generate_Series(1,1000) a;
create table big(i int, j int);
insert into big select g, 1 from generate_series(1,1000) g;
create table wide (t text not null);
insert into wide select repeat(md5(a::text),10) from generate_Series(1,1000) a;
create table wide_copy (t text not null);
insert into wide_copy select * from wide;
vacuum freeze analyze;

Results:

                   c(tps)  wide(tps)  except(tps)  big(ms)  big(MiB)
91ecb5e0bc           4768       2091         2317     3853       768
master               4942       2063         2323     3831       768
0001                 4932       2038         2361     3712       616
0002                 4601       2034         2365     3753       616
0003                 4850       2040         2418     3749       616
0004                 4744       2065         2282     3665       488
0005                 4630       1994         2307     3665       488
0006                 4809       2063         2394     3646       488
0007                 4752       2070         2479     3659       488

Note: c.sql, wide.sql, and except.sql are measured in tps, higher is better. big.sql is measured in ms and MiB, lower is better.

For some reason I'm getting a decline of about 3% in the c.sql test that seems to be associated with the accessor functions, even when inlined. I'm also not seeing as much benefit from the inlining of the MemoryContextMemAllocated(). But these mixed test results are minor compared with the memory savings of 35% and the more consistently improved performance of 5% on the larger test (which is also integers), so I plan to commit it.

Regards,
Jeff Davis

From f19e8363d52755d54d31f34778c841f97e6edfe2 Mon Sep 17 00:00:00 2001
From: Jeff Davis
Date: Wed, 8 Jan 2025 11:46:35 -0800
Subject: [PATCH v9 1/7] HashAgg: use Bump allocator for hash TupleHashTable entries.

The entries aren't freed until the entire hash table is destroyed, so use the Bump allocator to improve allocation speed, avoid wasting space on the chunk header, and avoid wasting space due to the power-of-two allocations.
Discussion: https://postgr.es/m/caaphdvqv1anb4cm36fzrwivxrevbo_lsg_eq3nqdxtjecaa...@mail.gmail.com Reviewed-by: David Rowley --- src/backend/executor/execUtils.c | 17 +++-- src/backend/executor/nodeAgg.c | 111 ++- src/include/nodes/execnodes.h| 5 +- 3 files changed, 104 insertions(+), 29 deletions(-) diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c index 55ab18fb826..772c86e70e9 100644 --- a/src/backend/executor/execUtils.c +++ b/src/backend/executor/execUtils.c @@ -322,19 +322,18 @@ CreateExprContext(EState *estate) ExprContext * CreateWorkExprContext(EState *estate) { - Size minContextSize = ALLOCSET_DEFAULT_MINSIZE; - Size initBlockSize = ALLOCSET_DEFAULT_INITSIZE; Size maxBlockSize = ALLOCSET_DEFAULT_MAXSIZE; - /* choose the maxBlockSize to be no larger than 1/16 of work_mem */ - while (maxBlockSize > work_mem * (Size) 1024 / 16) - maxBlockSize >>= 1; + maxBlockSize = pg_prevpower2_size_t(work_mem * (Size) 1024 / 16); - if (maxBlockSize < ALLOCSET_DEFAULT_INITSIZE) - maxBlockSize = ALLOCSET_DEFAULT_INITSIZE; + /* But no bigger than ALLOCSET_DEFAULT_MAXSIZE */ + maxBlockSize = Min(maxBlockSize, ALLOCSET_DEFAULT_MAXSIZE); - return CreateExprContextInternal(estate, minContextSize, - initBlockSize, maxBlockSize); + /* and no smaller than ALLOCSET_DEFAULT_INITSIZE */ + maxBlockSize = Max(maxBlockSize, ALLOCSET_DEFAULT_INITSIZE); + + return CreateExprContextInternal(estate, ALLOCSET_DEFAULT_MINSIZE, + ALLOCSET_DEFAULT_INITSIZE, maxBlockSize); } /* diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c index b4a7698a0b3..fb73e6f991d 100644 --- a/src/backend/executor/nodeAgg.c +++ b/src/backend/executor/nodeAgg.c @@ -406,6 +406,7 @@ static void build_hash_tables(AggState *aggstate); static void build_hash_table(AggState *aggstate, int setno, long nbuckets); static void hashagg_recompile_expressions(AggState *aggstate, bool minslot, bool nullcheck); +static void hash_create_memory(AggState *aggstate); static long hash_choose_num_buckets(double hashentrysize, long ngroups, Size memory); static int hash_choose_num_partitions(double input_groups, @@ -1509,7 +1510,7 @@ build_hash_table(AggState *aggstate, int setno, long nbuckets) { AggStatePerHash perhash = &aggstate->perhash[setno]; MemoryContext metacxt = aggstate->hash_metacxt; - MemoryContext hashcxt = aggstate->hashcontext->ecxt_per_tuple_memory; + Mem
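To make the commit message's approach concrete, here is a minimal sketch of the technique it describes, allocating the hash entries in a dedicated bump context. The function name and the use of the default block sizes are illustrative assumptions, not the patch's exact code:

    #include "postgres.h"
    #include "utils/memutils.h"

    /*
     * Sketch: create a bump context for allocations that are only ever freed
     * in bulk, by resetting or deleting the whole context.  Bump chunks carry
     * no per-chunk header and cannot be pfree'd individually, which matches
     * the usage pattern of TupleHashTable entries in hash aggregation.
     */
    static MemoryContext
    make_hash_entry_context(MemoryContext parent)
    {
        return BumpContextCreate(parent,
                                 "HashAgg hash entries",
                                 ALLOCSET_DEFAULT_SIZES);
    }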
Bug - DoS - Handler function lookups consider non-handler functions
Today: create extension tsm_system_rows ; create schema s1; create function s1.system_rows(internal) returns void language c as 'tsm_system_rows.so', 'tsm_system_rows_handler'; set search_path to s1,public,pg_catalog; select count(*) from pg_class tablesample system_rows(0); ERROR: function system_rows must return type tsm_handler LINE 1: select count(*) from pg_class tablesample system_rows(0); The above is a denial-of-service due to our decision to lookup handler functions by name regardless of return type and consider it an error if a function with the wrong return type shows up (in particular, even though one with the correct signature exists and otherwise would have been found). The attached POC fixes this by allowing callers to also specify the OID of the handler type as part of their function lookup criteria. Tablesample is fixed to use this new call though possibly others exist. I'm not particularly fond of what I ended up with naming conventions but figure it's good enough for now. Patch applied and re-running the above: select count(*) from pg_class tablesample system_rows(0); count --- 0 (1 row) I noticed this when reviewing the extensible copy formats patch which used tablesample as a reference. David J. From 09fd47f241859a716374f5b7926f758a5ca8fd3e Mon Sep 17 00:00:00 2001 From: "David G. Johnston" Date: Sat, 22 Mar 2025 09:45:49 -0700 Subject: [PATCH] Handler function lookups ignore non-handler functions --- src/backend/catalog/namespace.c | 35 +++ src/backend/parser/parse_clause.c | 10 +-- src/backend/parser/parse_func.c | 47 +++ src/include/catalog/namespace.h | 7 + src/include/parser/parse_func.h | 5 5 files changed, 78 insertions(+), 26 deletions(-) diff --git a/src/backend/catalog/namespace.c b/src/backend/catalog/namespace.c index d97d632a7e..0cb66261a3 100644 --- a/src/backend/catalog/namespace.c +++ b/src/backend/catalog/namespace.c @@ -1115,9 +1115,20 @@ TypeIsVisibleExt(Oid typid, bool *is_missing) return visible; } - +FuncCandidateList +FuncnameGetCandidates(List *names, int nargs, List *argnames, + bool expand_variadic, bool expand_defaults, + bool include_out_arguments, + bool missing_ok) +{ + return HandlerGetCandidates(names, nargs, argnames, + expand_variadic, expand_defaults, + include_out_arguments, + NULL, + missing_ok); +} /* - * FuncnameGetCandidates + * HandlerGetCandidates * Given a possibly-qualified function name and argument count, * retrieve a list of the possible matches. * @@ -1184,14 +1195,20 @@ TypeIsVisibleExt(Oid typid, bool *is_missing) * The caller might end up discarding such an entry anyway, but if it selects * such an entry it should react as though the call were ambiguous. * + * The presence of a handlerkey further restricts the search to functions whose + * return data type matches the handlerkey. These are pseudo-types known to + * the system are are never schema-qualified. + * * If missing_ok is true, an empty list (NULL) is returned if the name was * schema-qualified with a schema that does not exist. Likewise if no * candidate is found for other reasons. 
*/ FuncCandidateList -FuncnameGetCandidates(List *names, int nargs, List *argnames, - bool expand_variadic, bool expand_defaults, - bool include_out_arguments, bool missing_ok) +HandlerGetCandidates(List *names, int nargs, List *argnames, + bool expand_variadic, bool expand_defaults, + bool include_out_arguments, + Oid *handlerkey, + bool missing_ok) { FuncCandidateList resultList = NULL; bool any_special = false; @@ -1374,6 +1391,14 @@ FuncnameGetCandidates(List *names, int nargs, List *argnames, continue; } + if (handlerkey) + { + { +if (procform->prorettype != handlerkey) + continue; + } + } + /* * We must compute the effective argument list so that we can easily * compare it to earlier results. We waste a palloc cycle if it gets diff --git a/src/backend/parser/parse_clause.c b/src/backend/parser/parse_clause.c index 2e64fcae7b..dbea8c2fda 100644 --- a/src/backend/parser/parse_clause.c +++ b/src/backend/parser/parse_clause.c @@ -924,7 +924,7 @@ transformRangeTableSample(ParseState *pstate, RangeTableSample *rts) */ funcargtypes[0] = INTERNALOID; - handlerOid = LookupFuncName(rts->method, 1, funcargtypes, true); + handlerOid = LookupHandlerName(rts->method, 1, funcargtypes, (Oid *) TSM_HANDLEROID, true); /* we want error to complain about no-such-method, not no-such-function */ if (!OidIsValid(handlerOid)) @@ -934,14 +934,6 @@ transformRangeTableSample(ParseState *pstate, RangeTableSample *rts) NameListToString(rts->method)), parser_errposition(pstate, rts->location))); - /* check that handler has correct return type */ - if (get_func_rettype(handlerOid) != TSM_HANDLEROID) - ereport(ERROR, -(errcode(ERRCODE_WRONG_
Re: Make COPY format extendable: Extract COPY TO format implementations
On Fri, Mar 21, 2025 at 10:23 PM David G. Johnston < david.g.johns...@gmail.com> wrote: > Then this would benefit from the new function I suggest creating since it > apparently has the same, IMO, bug. > > Concretely like I posted here: https://www.postgresql.org/message-id/cakfquwybtck+uw-byfchhp8hyj0r5+upytgmdqevp9phcsz...@mail.gmail.com David J.
Re: Bug - DoS - Handler function lookups consider non-handler functions
"David G. Johnston" writes: > create extension tsm_system_rows ; > create schema s1; > create function s1.system_rows(internal) returns void language c as > 'tsm_system_rows.so', 'tsm_system_rows_handler'; > set search_path to s1,public,pg_catalog; > select count(*) from pg_class tablesample system_rows(0); > ERROR: function system_rows must return type tsm_handler > LINE 1: select count(*) from pg_class tablesample system_rows(0); > The above is a denial-of-service due to our decision to lookup handler > functions by name regardless of return type and consider it an error if a > function with the wrong return type shows up (in particular, even though > one with the correct signature exists and otherwise would have been found). I do not think this is a bug: it's superuser error. Yeah, it'd be great if we could prevent people from creating incorrect support functions, but that'd require solving the halting problem. > The attached POC fixes this by allowing callers to also specify the OID of > the handler type as part of their function lookup criteria. I'm not on board with that at all. The law of unintended consequences comes into play the moment you start messing with function lookup rules. And I'm not at all convinced that "ignore it" is an improvement over "throw an error". And please, let's stop with the scare tactics of calling this sort of thing a denial-of-service. regards, tom lane
Re: Next commitfest app release is planned for March 18th
On Sat, Mar 22, 2025 at 10:49 AM Tom Lane wrote: > > Personally, I think it'd be most useful for the dashboard to show all > > open patches. > > Surely not "all"? We have that display already. Should be more like > "patches that I have expressed an interest in or have reason to take > current or future action on". What I meant was "patches that I have expressed an interest in or have reason to take current or future action on" should be interpreted in the broadest/most permissive way possible by the dashboard. In particular, entries should only *fully* disappear when somebody (some individual human) has actively chosen to make a representation that the patch is closed out. The rest (how we present that information) is details. > In the case of stale patches that haven't been moved forward from a > closed commitfest, perhaps we could compromise on Jelte's suggestion > of putting those in a separate section at the bottom of the page. If there is any gray area, then I very much want us to err on the side of still showing *something*. I really strongly object to having things vanish, absent a 100% unambiguous signal that that's what should happen. In general, the personal dashboard is (quite usefully) oriented around what actions you as a user (or some other CF app user) needs to take -- it is workflow oriented. It seems natural to me to handle stale/in limbo patches (patches that are not officially closed but also aren't in the current or next CF) in just the same way -- by presenting the information in terms of actions that need to be taken by some individual stakeholder. > However, I still think such patches should be treated differently for > the author than other people. The author does have current action to > take on the patch, namely moving it forward or withdrawing it. Right. So maybe for the patch author it appears either under "Your patches that need changes from you", or in a similar section that's just for stale patches that need to either be officially dropped or officially moved forward to an open CF. Whereas it'd be a little different (but not too different) for somebody who sees a patch because they're the reviewer/committer of record (the action item for such a person, if any, is to actually fully close the patch, or to nag the patch author). It probably makes sense to make stale/in limbo entries stick out like a sore thumb. They're *supposed* to be annoying. -- Peter Geoghegan
Re: query_id: jumble names of temp tables for better pg_stat_statement UX
I wrote: > So my feeling is: if we think this is the behavior we want, let's do > it across the board. I suggest that we simply drop the relid from the > jumble and use the table alias (that is, eref->aliasname) instead. I experimented with this trivial fix (shown in-line to keep the cfbot from thinking this is the patch-of-record): diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h index 23c9e3c5abf..a54bbdc18b7 100644 --- a/src/include/nodes/parsenodes.h +++ b/src/include/nodes/parsenodes.h @@ -1051,7 +1051,7 @@ typedef struct RangeTblEntry /* user-written alias clause, if any */ Alias *alias pg_node_attr(query_jumble_ignore); /* expanded reference names */ - Alias *eref pg_node_attr(query_jumble_ignore); + Alias *eref; RTEKind rtekind;/* see above */ @@ -1094,7 +1094,7 @@ typedef struct RangeTblEntry * tables to be invalidated if the underlying table is altered. */ /* OID of the relation */ - Oid relid; + Oid relid pg_node_attr(query_jumble_ignore); /* inheritance requested? */ boolinh; /* relation kind (see pg_class.relkind) */ This caused just one diff in existing regression test cases: diff --git a/contrib/pg_stat_statements/expected/planning.out b/contrib/pg_stat_statements/expected/planning.out index 3ee1928cbe9..c25b8b946fd 100644 --- a/contrib/pg_stat_statements/expected/planning.out +++ b/contrib/pg_stat_statements/expected/planning.out @@ -75,8 +75,9 @@ SELECT plans >= 2 AND plans <= calls AS plans_ok, calls, rows, query FROM pg_sta WHERE query LIKE 'SELECT COUNT%' ORDER BY query COLLATE "C"; plans_ok | calls | rows |query --+---+--+-- - t| 4 |4 | SELECT COUNT(*) FROM stats_plan_test -(1 row) + f| 1 |1 | SELECT COUNT(*) FROM stats_plan_test + f| 3 |3 | SELECT COUNT(*) FROM stats_plan_test +(2 rows) -- Cleanup DROP TABLE stats_plan_test; What's happening there is that there's an ALTER TABLE ADD COLUMN in the test, so the executions after the first one see more entries in eref->colnames and come up with a different jumble. I think we probably don't want that behavior; we only want to jumble the table name. So we'd still need the v3-0001 patch in some form to allow annotating RangeTblEntry.eref with a custom jumble method that'd only jumble the aliasname. regards, tom lane
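For illustration, such a custom per-field jumble method could be very small (a rough sketch only; it follows the _jumbleNodefoo_field() convention discussed elsewhere in this thread, and the exact AppendJumble() signature in queryjumblefuncs.c is assumed):

static void
_jumbleRangeTblEntry_eref(JumbleState *jstate,
                          RangeTblEntry *expr,
                          Alias *expr_eref)
{
    /*
     * Jumble only the table's visible name.  eref->colnames is deliberately
     * ignored, so an ALTER TABLE ADD COLUMN no longer changes the query ID,
     * per the planning.out diff above.
     */
    AppendJumble(jstate,
                 (const unsigned char *) expr_eref->aliasname,
                 strlen(expr_eref->aliasname) + 1);
}

Whether anything beyond the bare alias name should feed the jumble is exactly the policy question that remains open here.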
Re: Proposal: Progressive explain
On Wed, Mar 19, 2025 at 6:53 PM Rafael Thofehrn Castro wrote: > The strategy I used here is to use a MemoryContextCallback > (ProgressiveExplainReleaseFunc), configured in the memory context > where the query is being executed, being responsible for calling > ProgressiveExplainCleanup() if the query doesn't end gracefully. Thanks for the pointer. I'm a bit skeptical about what's going on here in ProgressiveExplainReleaseFunc(). It seems like we're depending on shared memory to tell us whether we need to do purely backend-local cleanup, like calling disable_timeout() and resetting ProgressiveExplainPending and activeQueryDesc. I would have expected us to keep track in local memory of whether this kind of work needs to be done. It seems roundabout to take an LWLock, do a hash table lookup to see if there's an entry there, release the LWLock, and then very shortly thereafter take the lock a second time to release the entry that we now know is there. The comment in ProgressiveExplainCleanup about only detaching from the DSA if the query ended gracefully is not ideal from my point of view because it says what the code does instead of why the code does that thing. Also, the function is seemingly called with queryDesc as an argument not because you need it for anything but because you're going to test whether it's null. In that case, you could just pass a Boolean. Even then, something seems odd about this: why do we have to be holding ProgressiveExplainHashLock to dsa_detach the somewhat-inscrutably named area pe_a? And why are we not detaching it in case of error? I am wondering why you chose this relatively unusual error cleanup strategy. What I would have done is invent AtEOXact_ProgressiveExplain and AtSubEOXact_ProgressiveExplain. In some sense this looks simpler, because it doesn't need separate handling for transactions and subtransactions, but it's so unlike what we do in most places that it's hard for me to tell whether it's correct. I feel like the approach you've chosen here would make sense if what we wanted to do was basically release memory or some memory-like resource associated closely with the context -- e.g. expandedrecord.c releases a TupleDesc, but this is doing more than that. I think the effect of this choice is that cleanup of the progressive-EXPLAIN stuff happens much later than it normally would. Most of the time, in the case of an abort, we would want AbortTransaction() to release as many resources as possible, leaving basically no work to do at CleanupTransaction() time. This is so that if a user types "SELECT 1/0;" we release resources, as far as possible, right away, and don't wait for them to subsequently type "ROLLBACK;". The transaction lives on only as a shell. But these resources, if I'm reading this right, are going to stick around until the transaction is actually rolled back, because memory is not freed until CleanupTransaction() time. I wonder what happens if a query inside of an explicit transaction aborts after putting an entry in the progressive-explain view. My guess is that the entry will stick around until the actual rollback happens. In fact, now that I think about it, this is probably why we don't dsa_detach() in ProgressiveExplainCleanup() in cases of error -- the resource owner cleanup will have already released the DSA segments long before the memory context is deleted. I'm sort of inclined to say that this should be rewritten to do error cleanup in a more normal way. 
It's probably more code to do it that way, but I think having it be more similar to other subsystems is probably worth quite a bit. > > It seems like when we replace a longer entry with a shorter one, we > > forget that it was originally longer. Hence, if the length of a > > progressive EXPLAIN is alternately 2922 characters and 2923 > > characters, we'll reallocate on every other progressive EXPLAIN > > instead of at most once. > > Are you talking about re-printing the plan in the same query execution? Yes. > The logic for the code, using your example, would be to allocate 2922 + > PROGRESSIVE_EXPLAIN_FREE_SIZE (4096, currently) initially. If next plans > alternate between 2922 and 2923 no additional allocation will be done. > A reallocation will be needed only if the plan length ends up exceeding > 2922+4096. At the end of query execution (or cancellation) that DSA will > be freed and a next query execution will have to allocate again using the > same logic. It seems to me that if ProgressiveExplainPrint() reaches /* Update shared memory with new data */ without reallocating, strlen(pe_data->plan) can be reduced. On the next trip through the function, we don't know whether the string we're seeing is the original string -- for which strlen()+PROGRESSIVE_EXPLAIN_FREE_SIZE) gives us the original allocation size -- or whether the string we're seeing is a shorter one that was copied over the original, longer string. PROGRESSIVE_EXPLAIN_FREE_SIZE is big enough that this probably isn't much of
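To make the suggested AtEOXact_ProgressiveExplain() approach a bit more concrete, a minimal sketch might look like the following (hypothetical only: the timeout id and the Boolean argument to ProgressiveExplainCleanup() are assumptions, and a real version would still need the subtransaction variant plus the shared-hash/DSA handling discussed above):

/* Hypothetical sketch: called from CommitTransaction()/AbortTransaction() */
void
AtEOXact_ProgressiveExplain(bool isCommit)
{
    if (!ProgressiveExplainPending)
        return;

    /* purely backend-local state first; no shared-memory lookup needed */
    disable_timeout(PROGRESSIVE_EXPLAIN_TIMEOUT, false);
    activeQueryDesc = NULL;
    ProgressiveExplainPending = false;

    /* then drop our entry from the shared hash and detach from the DSA */
    ProgressiveExplainCleanup(isCommit);
}

The point of doing it this way is that the entry disappears at AbortTransaction() time rather than lingering until CleanupTransaction(), which addresses the "SELECT 1/0; ... ROLLBACK;" scenario above.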
Re: BitmapHeapScan streaming read user and prelim refactoring
Hi, On 2025-03-22 16:42:42 -0400, Andres Freund wrote: > Hm, it's clearly related to the all-visible path, but I think the bug is > actually independent of what was reported there. The bug you reported is > perfectly reproducible, without any concurrency, after all. > > Here's a smaller reproducer: > > CREATE TABLE bmscantest (a int, t text); > > INSERT INTO bmscantest > SELECT (r%53), > 'foo' > FROM generate_series(1,7) r; > > CREATE INDEX i_bmtest_a ON bmscantest(a); > set enable_indexscan=false; > set enable_seqscan=false; > > -- Lower work_mem to trigger use of lossy bitmaps > set work_mem = 64; > > SELECT count(*) FROM bmscantest WHERE a = 1; > vacuum freeze bmscantest; > SELECT count(*) FROM bmscantest WHERE a = 1; > > -- clean up > DROP TABLE bmscantest; > > > The first SELECT reports 1321, the second 572 tuples. The problem is that sometimes recheck is performed for pending empty tuples. The comment about empty tuples says: /* * Bitmap is exhausted. Time to emit empty tuples if relevant. We emit * all empty tuples at the end instead of emitting them per block we * skip fetching. This is necessary because the streaming read API * will only return TBMIterateResults for blocks actually fetched. * When we skip fetching a block, we keep track of how many empty * tuples to emit at the end of the BitmapHeapScan. We do not recheck * all NULL tuples. */ *recheck = false; return bscan->rs_empty_tuples_pending > 0; But we actually emit empty tuples at each page boundary: /* * Out of range? If so, nothing more to look at on this page */ while (hscan->rs_cindex >= hscan->rs_ntuples) { /* * Emit empty tuples before advancing to the next block */ if (bscan->rs_empty_tuples_pending > 0) { /* * If we don't have to fetch the tuple, just return nulls. */ ExecStoreAllNullTuple(slot); bscan->rs_empty_tuples_pending--; return true; } /* * Returns false if the bitmap is exhausted and there are no further * blocks we need to scan. */ if (!BitmapHeapScanNextBlock(scan, recheck, lossy_pages, exact_pages)) return false; } So we don't just process tuples at the end of the scan, but at each page boundary. This leads to wrong results whenever there is a rs_empty_tuples_pending > 0 after a previous page that needed rechecks, because nothing will reset *recheck = false. And indeed, if I add a *recheck = false in that inside the /* * Emit empty tuples before advancing to the next block */ if (bscan->rs_empty_tuples_pending > 0) { the problem goes away. A fix should do slightly more, given that the "at the end" comment is wrong. But I'm wondering if it's worth fixing it, or if we should just rip the logic out alltogether, due to the whole VM checking being unsound as we learned in the thread I referenced earlie, without even bothering to fix the bug here. Greetings, Andres Freund
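Spelled out, the minimal fix described above is a single extra line in the branch quoted earlier (a sketch; as noted, a committed fix should also correct the misleading "at the end" comment, if the all-visible logic is kept at all):

    while (hscan->rs_cindex >= hscan->rs_ntuples)
    {
        /*
         * Emit empty tuples before advancing to the next block
         */
        if (bscan->rs_empty_tuples_pending > 0)
        {
            /*
             * If we don't have to fetch the tuple, just return nulls, and
             * make sure a recheck left over from a previous lossy page is
             * not applied to these all-visible tuples.
             */
            *recheck = false;
            ExecStoreAllNullTuple(slot);
            bscan->rs_empty_tuples_pending--;
            return true;
        }

        /*
         * Returns false if the bitmap is exhausted and there are no further
         * blocks we need to scan.
         */
        if (!BitmapHeapScanNextBlock(scan, recheck, lossy_pages, exact_pages))
            return false;
    }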
Re: Using read_stream in index vacuum
> On 22 Mar 2025, at 00:23, Melanie Plageman wrote: > > > I've committed the btree and gist read stream users. Cool! Thanks! > I think we can > come back to the test after feature freeze and make sure it is super > solid. +1. > On 22 Mar 2025, at 02:54, Melanie Plageman wrote: > > On Fri, Mar 21, 2025 at 3:23 PM Melanie Plageman > wrote: >> >> I've committed the btree and gist read stream users. I think we can >> come back to the test after feature freeze and make sure it is super >> solid. > > I've now committed the spgist vacuum user as well. I'll mark the CF > entry as completed. That's great! Thank you! > I wonder if we should do GIN? GIN vacuum is a logical scan. Back in 2017 I was starting to work on it, but made some mistakes, that were reverted by fd83c83 from the released version. And I decided to back off for some time. Perhaps, now I can implement physical scan for GIN, that could benefit from read stream. But I doubt I will find committer for this in 19, let alone 18. We can add some support for read stream for hashbulkdelete(): it's not that linear as B-tree, GiST and SP-GiST, it scans only beginning of hash buckets, but if buckets are small it might be more efficient. >> Looking at the spgist read stream user, I see you didn't convert >> spgprocesspending(). It seems like you could write a callback that >> uses the posting list and streamify this as well. > > It's probably not worth it -- since we process the pending list for > each page of the index. My understanding is that pending lists should be small on real workloads. Thank you! Best regards, Andrey Borodin.
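The read stream side of a streamified hashbulkdelete() would mostly come down to a block-number callback over the bucket primary pages; a bare-bones sketch (hypothetical struct and function names; only the callback signature is the real read stream API):

typedef struct HashVacuumStreamState
{
    BlockNumber *bucket_blocks;    /* primary page of each bucket, in order */
    int          nbuckets;
    int          next;
} HashVacuumStreamState;

static BlockNumber
hash_vacuum_stream_next(ReadStream *stream,
                        void *callback_private_data,
                        void *per_buffer_data)
{
    HashVacuumStreamState *state = callback_private_data;

    if (state->next >= state->nbuckets)
        return InvalidBlockNumber;    /* exhausts the stream */

    return state->bucket_blocks[state->next++];
}

Whether jumping straight to bucket heads like this actually beats plain readahead is the small-bucket question raised above.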
Re: [PATCH] PGSERVICEFILE as part of a normal connection string
On Thu, Mar 20, 2025 at 06:16:44PM +0900, Ryo Kanbayashi wrote: > Sorry, I found a miss on 006_service.pl. > Fixed patch is attached... Please note that the commit fest app needs all the patches of a set to be posted in the same message. In this case, v2-0001 is not going to get automatic test coverage. Your patch naming policy is also a bit confusing. I would suggest to use `git format-patch -vN -2`, where N is your version number. 0001 would be the new tests for service files, and 0002 the new feature, with its own tests. +if ($windows_os) { + +# Windows: use CRLF +print $fh "[my_srv]", "\r\n"; +print $fh join( "\r\n", split( ' ', $node->connstr ) ), "\r\n"; +} +else { +# Non-Windows: use LF +print $fh "[my_srv]", "\n"; +print $fh join( "\n", split( ' ', $node->connstr ) ), "\n"; +} +close $fh; That's duplicated. Let's perhaps use a $newline variable and print into the file using the $newline? Question: you are doing things this way in the test because fgets() is what is used by libpq to retrieve the lines of the service file, is that right? Please note that the CI is failing. It seems to me that you are missing a done_testing() at the end of the script. If you have a github account, I'd suggest to set up a CI in your own fork of Postgres, this is really helpful to double-check the correctness of a patch before posting it to the lists, and saves in round trips between author and reviewer. Please see src/tools/ci/README in the code tree for details. +# Copyright (c) 2023-2024, PostgreSQL Global Development Group These dates are incorrect. Should be 2025, as it's a new file. +++ b/src/interfaces/libpq/t/007_servicefile_opt.pl @@ -0,0 +1,100 @@ +# Copyright (c) 2023-2024, PostgreSQL Global Development Group Incorrect date again in the second patch with the new feature. I'd suggest to merge all the tests in a single script, with only one node initialized and started. -- Michael
Re: why there is not VACUUM FULL CONCURRENTLY?
Alvaro Herrera wrote: > I rebased this patch series; here's v09. No substantive changes from v08. > I made sure the tree still compiles after each commit. Thanks. > I did look at 0002 again (and renamed the members of the new struct by > adding a p_ prefix, as well as fixing the references to the old names > that were in a few code comments here and there; I don't think these > changes are "substantive"), and ended up wondering why do we need that > change in the first place. According to the comment where the progress > restore function is called, it's because reorderbuffer.c uses a > subtransaction internally. But I went to look at reorderbuffer.c and > noticed that the subtransaction is only used "when using the SQL > function interface, because that creates a transaction already". So > maybe we should look into making REPACK use reorderbuffer without having > to open a transaction block. Which part of reorderbuffer.c do you mean? ISTM that the use of subtransactions is more extensive. At least ReorderBufferImmediateInvalidation() appears to rely on it, which in turn is called by xact_decode(). (I don't claim that saving and restoring the progress state is perfect, but I don't have a better idea right now.) -- Antonin Houska Web: https://www.cybertec-postgresql.com
Re: query_id: jumble names of temp tables for better pg_stat_statement UX
Re: To Michael Paquier > > >> +#define JUMBLE_CUSTOM(nodetype, item) \ > > >> +_jumble##nodetype##_##item(jstate, expr, expr->item) > > > > In this one, I want to mean that we require a custom per-field > > function to look like that: > > _jumbleNodefoo_field(JumbleState *jstate, NodeFoo *expr, FieldType field); > > Perhaps this: Or actually more explicit: /* * Per-field custom jumble functions have this signature: * _jumbleNodefoo_field(JumbleState *jstate, NodeFoo *expr, FieldType field); */ Christoph
Re: query_id: jumble names of temp tables for better pg_stat_statement UX
Re: Michael Paquier > >> + * Note that the argument types are enforced for the per-field custom > >> + * functions. > >> + */ > >> +#define JUMBLE_CUSTOM(nodetype, item) \ > >> + _jumble##nodetype##_##item(jstate, expr, expr->item) > > In this one, I want to mean that we require a custom per-field > function to look like that: > _jumbleNodefoo_field(JumbleState *jstate, NodeFoo *expr, FieldType field); > > Rather than having more generic shape like that: > _jumbleNodefoo_field(JumbleState *jstate, Node *exp, > const unsigned char *item, Size size); > > So a custom function is defined so as the node type and field type are > arguments. Perhaps this comment would be better if reworded like > that: > "The arguments of this function use the node type and the field type, > rather than a generic argument like AppendJumble() and the other > _jumble() functions." Perhaps this: /* * The per-field custom jumble functions get jstate, the node, and the * field as arguments. */ They are not actually different from _jumbleList and _jumbleA_Const which also get the node (and just not the field). AppendJumble is a different layer, the output, so it's not surprising its signature is different. Are we at the point where the patch is already Ready for Committer? Thanks, Christoph
Re: why there is not VACUUM FULL CONCURRENTLY?
Robert Haas wrote: > On Thu, Mar 20, 2025 at 2:09 PM Antonin Houska wrote: > > Robert Haas wrote: > > > Is there a README or a long comment in here someplace that is a good > > > place to read to understand the overall design of this feature? > > > > I tried to explain how it works in the commit messages. The one in 0004 is > > probably the most important one. > > Thanks. A couple of comments/questions: > > - I don't understand why this commit message seems to think that we > can't acquire a stronger lock while already holding a weaker one. We > can do that, and in some cases we do precisely that. Can you please give me an example? I don't recall seeing a lock upgrade in the tree. That's the reason I tried rather hard to avoid that. > Such locking > patterns can result in deadlock e.g. if I take AccessShareLock and you > take AccessShareLock and then I tried to upgrade to > AccessExclusiveLock and then you try to upgrade to > AccessExclusiveLock, somebody is going to have to ERROR out. But that > doesn't keep us from doing that in some places where it seems better > than the alternatives, and the alternative chosen by the patch > (possibly discovering at the very end that all our work has been in > vain) does not seem better than risking a deadlock. I see. Only the backends that do upgrade their lock are exposed to the risk of deadlock, e.g. two backends running REPACK CONCURRENTLY on the same table, and that should not happen too often. I'll consider your objection - it should make the patch a bit simpler. > - On what basis do you make the statement in the last paragraph that > the decoding-related lag should not exceed one WAL segment? I guess > logical decoding probably keeps up pretty well most of the time but > this seems like a very strong guarantee for something I didn't know we > had any kind of guarantee about. The patch itself does guarantee that by checking the amount of unprocessed WAL regularly when it's copying the data into the new table. If too much WAL appears to be unprocessed, it enforces the decoding before the copying is resumed. The WAL decoding during the "initial load" phase can actually be handled by a background worker (not sure it's necessary in the initial implementation), which would make a significant lag rather unlikely. But even then we should probably enforce certain limit on the lag (e.g. because background worker is not guaranteed to start). > - What happens if we crash? The replication slot we create is RS_TEMPORARY, so it disappears after restart. Everything else is as if the current implementation of CLUSTER ends due to crash. -- Antonin Houska Web: https://www.cybertec-postgresql.com
Re: Adding extension default version to \dx
On Tue, 25 Feb 2025 at 17:11, Nathan Bossart wrote: > > On Tue, Feb 25, 2025 at 04:42:37PM +0100, Magnus Hagander wrote: > > Thanks goes to both you and the previous responders - I did manage to mute > > this thread away and missed the early replies, but got Jeltes the other day > > which brought it back up on my list to get to within the Not Too Long > > Future (TM). > > Got it, sounds good. The commitfest is close to ending, but this hasn't been committed yet. I think it would be great if either of you could commit it before the commitfest ends. It would be a shame to have this quality of life improvement wait another year.
Re: Enhance 'pg_createsubscriber' to retrieve databases automatically when no database is provided.
On Fri, 21 Mar 2025 at 18:59, Shubham Khanna wrote: > > > During the recent testing, I observed that the tests were failing when > using wait_for_slot_catchup(). To address this, I reverted to using > wait_for_subscription_sync(), which was employed previously and has > proven to be more reliable in ensuring test stability. > Please let me know if there are any additional adjustments you would > suggest or if you would like me to investigate further into > wait_for_slot_catchup(). > > I have created a separate patch for the synopsis of '--all' option as > suggested by Amit at [1]. The attached patch contains the suggested > changes. I believe you added the following because pattern matching is difficult for the $db1 and $db2 variables having special names: +# Create a new database on node_p +$node_p->safe_psql( + "postgres", qq( + CREATE DATABASE db1; +)); How about we change it to verify the count of occurrences instead for this case like below: # Verify that the required logical replication objects are created. The # expected count 3 refers to postgres, $db1 and $db2 databases. is(scalar(() = $stderr =~ /creating publication/g), 3, "verify publications are created for all databases"); is(scalar(() = $stderr =~ /creating the replication slot/g), 3, "verify replication slots are created for all databases"); is(scalar(() = $stderr =~ /creating subscription/g), 3, "verify subscriptions are created for all databases"); If you are ok, you can merge the changes attached. Regards, Vignesh diff --git a/src/bin/pg_basebackup/t/040_pg_createsubscriber.pl b/src/bin/pg_basebackup/t/040_pg_createsubscriber.pl index 7e275473eb9..9355cb0ade9 100644 --- a/src/bin/pg_basebackup/t/040_pg_createsubscriber.pl +++ b/src/bin/pg_basebackup/t/040_pg_createsubscriber.pl @@ -419,12 +419,6 @@ command_fails_like( qr/--publication cannot be used with --all/, 'fail if --publication is used with --all'); -# Create a new database on node_p -$node_p->safe_psql( - "postgres", qq( - CREATE DATABASE db1; -)); - # run pg_createsubscriber with '--all' option my ($stdout, $stderr) = run_command( [ @@ -440,20 +434,14 @@ my ($stdout, $stderr) = run_command( ], 'run pg_createsubscriber with --all'); -like( - $stderr, - qr/.*pg_createsubscriber: creating publication .* in database \"postgres\"/, - "expanded commands for all databases"); -like( - $stderr, - qr/.*pg_createsubscriber: creating publication .* in database \"db1\"/, - "expanded commands for all databases"); - -# Drop the newly created database on node_p -$node_p->safe_psql( - "postgres", qq( - DROP DATABASE db1; -)); +# Verify that the required logical replication objects are created. The +# expected count 3 refers to postgres, $db1 and $db2 databases. +is(scalar(() = $stderr =~ /creating publication/g), + 3, "verify publications are created for all databases"); +is(scalar(() = $stderr =~ /creating the replication slot/g), + 3, "verify replication slots are created for all databases"); +is(scalar(() = $stderr =~ /creating subscription/g), + 3, "verify subscriptions are created for all databases"); # Run pg_createsubscriber on node S. --verbose is used twice # to show more information.
Re: RFC: Additional Directory for Extensions
On 2025-03-21 Fr 12:42 PM, Tom Lane wrote: Matheus Alcantara writes: On Thu, Mar 20, 2025 at 7:38 PM Andrew Dunstan wrote: (wondering if this another of these cases where the "path includes postgres" thing bites us, and we're looking in the wrong place) Nope, testing shows it's not that, so I am rather confused about what was going on. I'm not sure if I'm checking on the right place [1] but it seems that the Contrib and ContribInstall is executed after Check step which causes this test failure? No, this is not failing in Check. I did just notice a clue though: on snakefly, the failing step's log [1] includes make[1]: Leaving directory `/opt/postgres/build-farm-18/HEAD/pgsql.build/src/backend' rm -rf '/opt/postgres/build-farm-18/HEAD/pgsql.build'/tmp_install /usr/bin/mkdir -p '/opt/postgres/build-farm-18/HEAD/pgsql.build'/tmp_install/log make -C '../../../..' DESTDIR='/opt/postgres/build-farm-18/HEAD/pgsql.build'/tmp_install install >'/opt/postgres/build-farm-18/HEAD/pgsql.build'/tmp_install/log/install.log 2>&1 make -j1 checkprep >>'/opt/postgres/build-farm-18/HEAD/pgsql.build'/tmp_install/log/install.log 2>&1 PATH="/opt/postgres/build-farm-18/HEAD/pgsql.build/tmp_install/opt/postgres/build-farm-18/HEAD/inst/bin:/opt/postgres/build-farm-18/HEAD/pgsql.build/src/test/modules/test_extensions:$PATH" LD_LIBRARY_PATH="/opt/postgres/build-farm-18/HEAD/pgsql.build/tmp_install/opt/postgres/build-farm-18/HEAD/inst/lib:$LD_LIBRARY_PATH" INITDB_TEMPLATE='/opt/postgres/build-farm-18/HEAD/pgsql.build'/tmp_install/initdb-template initdb --auth trust --no-sync --no-instructions --lc-messages=C --no-clean '/opt/postgres/build-farm-18/HEAD/pgsql.build'/tmp_install/initdb-template >>'/opt/postgres/build-farm-18/HEAD/pgsql.build'/tmp_install/log/initdb-template.log 2>&1 echo "# +++ regress check in src/test/modules/test_extensions +++" && PATH="/opt/postgres/build-farm-18/HEAD/pgsql.build/tmp_install/opt/postgres/build-farm-18/HEAD/inst/bin:/opt/postgres/build-farm-18/HEAD/pgsql.build/src/test/modules/test_extensions:$PATH" LD_LIBRARY_PATH="/opt/postgres/build-farm-18/HEAD/pgsql.build/tmp_install/opt/postgres/build-farm-18/HEAD/inst/lib:$LD_LIBRARY_PATH" INITDB_TEMPLATE='/opt/postgres/build-farm-18/HEAD/pgsql.build'/tmp_install/initdb-template ../../../../src/test/regress/pg_regress --temp-instance=./tmp_check --inputdir=. --bindir= --temp-config=/opt/postgres/build-farm-18/tmp/buildfarm-C9Iy3s/bfextra.conf --no-locale --port=5678 --dbname=contrib_regression test_extensions test_extdepend # +++ regress check in src/test/modules/test_extensions +++ # initializing database system by running initdb showing that the step made its own tmp_install, and that only the core "install" process was executed, so the lack of amcheck in that install tree is not surprising. But concurrent runs on other animals, eg [2], don't show a tmp_install rebuild happening. So those are using an installation tree that *does* include contrib modules. So what this comes down to is "why is snakefly doing a fresh install here?". I don't know the buildfarm client well enough to identify probable causes. 
I do note that Makefile.global.in conditionalizes tmp_install rebuild on several variables: temp-install: | submake-generated-headers ifndef NO_TEMP_INSTALL ifneq ($(abs_top_builddir),) ifeq ($(MAKELEVEL),0) rm -rf '$(abs_top_builddir)'/tmp_install $(MKDIR_P) '$(abs_top_builddir)'/tmp_install/log $(MAKE) -C '$(top_builddir)' DESTDIR='$(abs_top_builddir)'/tmp_install install >'$(abs_top_builddir)'/tmp_install/log/install.log 2>&1 $(MAKE) -j1 $(if $(CHECKPREP_TOP),-C $(CHECKPREP_TOP),) checkprep >>'$(abs_top_builddir)'/tmp_install/log/install.log 2>&1 Good catch. This is happening because the owner hasn't updated the animal to REL_19_1. In 19 and 19.1 we fixed detection of existing installs to take account of the 'Is there postgres or pgsql in the prefix' issue. So it was looking in the wrong place. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: making EXPLAIN extensible
On 3/21/25 20:54, Robert Haas wrote: On Fri, Mar 21, 2025 at 2:37 PM Tom Lane wrote: Here's v9, which also adds 'SET debug_parallel_query = off' to the pg_overexplain tests, per CI, because the test results are not (and cannot realistically be made) stable under under that option. I skimmed through the code and tested how it works. It looks good and has no apparent architectural dependencies. But I haven't scrutinised it line-by-line and do not intend to do so. I wanna say I hate the design of this module. Having a strong necessity for extra explain tools (in the daily routine, all I have is the only single explain analyse verbose output to find out planner/executor bug, reproduce it and make a patch), I don't see a single case when I would use this module. It adds a burden to fix its output on a node change (you don't care, but it adds work to Postgres fork maintainers, too, for nothing). Also, it breaks my understanding of the principles of the Postgres code design - to start the discussion on how to show more, we need only the bare minimum of code and output lines. In my opinion, it should show as few parameters as possible to demonstrate principles and test the code on a single planner node. It only deserves src/test/modules because it is not helpful for a broad audience. -- regards, Andrei Lepikhov
Re: Parallel heap vacuum
On Thu, Mar 20, 2025 at 4:36 AM Masahiko Sawada wrote: > > When testing the multi passes of table vacuuming, I found an issue. > With the current patch, both leader and parallel workers process stop > the phase 1 as soon as the shared TidStore size reaches to the limit, > and then the leader resumes the parallel heap scan after heap > vacuuming and index vacuuming. Therefore, as I described in the patch, > one tricky part of this patch is that it's possible that we launch > fewer workers than the previous time when resuming phase 1 after phase > 2 and 3. In this case, since the previous parallel workers might have > already allocated some blocks in their chunk, newly launched workers > need to take over their parallel scan state. That's why in the patch > we store workers' ParallelBlockTableScanWorkerData in shared memory. > However, I found my assumption is wrong; in order to take over the > previous scan state properly we need to take over not only > ParallelBlockTableScanWorkerData but also ReadStream data as parallel > workers might have already queued some blocks for look-ahead in their > ReadStream. Looking at ReadStream codes, I find that it's not > realistic to store it into the shared memory. It seems like one way to solve this would be to add functionality to the read stream to unpin the buffers it has in the buffers queue without trying to continue calling the callback until the stream is exhausted. We have read_stream_reset(), but that is to restart streams that have already been exhausted. Exhausted streams are where the callback has returned InvalidBlockNumber. In the read_stream_reset() cases, the read stream user knows there are more blocks it would like to scan or that it would like to restart the scan from the beginning. Your case is you want to stop trying to exhaust the read stream and just unpin the remaining buffers. As long as the worker which paused phase I knows exactly the last block it processed and can communicate this to whatever worker resumes phase I later, it can initialize vacrel->current_block to the last block processed. > One plausible solution would be that we don't use ReadStream in > parallel heap vacuum cases but directly use > table_block_parallelscan_xxx() instead. It works but we end up having > two different scan methods for parallel and non-parallel lazy heap > scan. I've implemented this idea in the attached v12 patches. One question is which scenarios will parallel vacuum phase I without AIO be faster than read AIO-ified vacuum phase I. Without AIO writes, I suppose it would be trivial for phase I parallel vacuum to be faster without using read AIO. But it's worth thinking about the tradeoff. - Melanie
Re: Next commitfest app release is planned for March 18th
On Sat, Mar 22, 2025 at 7:15 AM Jelte Fennema-Nio wrote: > Yeah I think we need a way to "star" a patch (not just for committers, > but for anyone). After creating this initial version of this dashboard > I realized it's also possible to "subscribe to updates" for a patch, > which means you'll get emails about it. I think if you "subscribe for > updates" you should also see it in your dashboard (which you currently > don't). But I also think there should be a way to "star" something, so > you see it in your dashboard without receiving email updates about it. Right. In my view "starring" something should definitely not appear in the "History Log" of the patch; it should be private. > > > - Also, my dashboard shows patches from past and future commitfests. I > > > don't know what I'm supposed to do with that. The purpose of having a > > > "current" commitfest is that you work on that one when it's current. > > > > This might make sense for the patch author, but I agree it's not > > appropriate for anyone else. > > I'd love to hear some more thoughts on this. Especially from Peter G, > because he explicitly requested to see patches from different > commitfests in this dashboard. So I'm curious about his reasoning. > Ofcourse we can make this configurable or have two separate pages, but > I'm wondering if there's a way to change it in a way that works for > everyone. Personally, I think it'd be most useful for the dashboard to show all open patches. I believe that that implies that it'll only show me patches from either the current CF, or the next CF. But thinking about it some more...I guess that there are edge-cases, where patches can be in limbo between being "fully open" and "fully closed". Like when a patch from an old CF isn't brought forward, and also isn't withdrawn/returned. I think that we should apply an expansive definition of "active", that still shows me things that are less than 100% officially closed. Possibly with additional visual cues that they're in this in between/limbo state. > Instead of showing all patches from all commitfests, how about we do > the following: I think that we should show all patches from all commitfests in the "Personal Dashboard", provided that they're not closed. I probably don't have enough experience with the current "Personal Dashboard" yet, but so far it seems like exactly what I had in mind. If I see something in the "Personal Dashboard" that seems like it shouldn't be there, due to not being in either the current or next CF, and am bothered by that, then maybe I should then take it as an opportunity to fix the problem. As I said, maybe there should be some additional visual cue that shows certain "Personal Dashboard" entries are "in limbo". But please don't make entries in this state just vanish, on the grounds that they're theoretically closed -- experience suggests that they're probably not really closed at all. If something appears on my "Personal Dashboard" it appears there because I actively chose to participate, and hiding things from me on bureaucratic grounds seems paternalistic. -- Peter Geoghegan
Re: [PATCH] PGSERVICEFILE as part of a normal connection string
On Sat, Mar 22, 2025 at 4:46 PM Michael Paquier wrote: > > On Thu, Mar 20, 2025 at 06:16:44PM +0900, Ryo Kanbayashi wrote: > > Sorry, I found a miss on 006_service.pl. > > Fixed patch is attached... > > Please note that the commit fest app needs all the patches of a a set > to be posted in the same message. In this case, v2-0001 is not going > to get automatic test coverage. > > Your patch naming policy is also a bit confusing. I would suggest to > use `git format-patch -vN -2`, where N is your version number. 0001 > would be the new tests for service files, and 0002 the new feature, > with its own tests. All right. I attached patches generated with your suggested command :) > +if ($windows_os) { > + > +# Windows: use CRLF > +print $fh "[my_srv]", "\r\n"; > +print $fh join( "\r\n", split( ' ', $node->connstr ) ), "\r\n"; > +} > +else { > +# Non-Windows: use LF > +print $fh "[my_srv]", "\n"; > +print $fh join( "\n", split( ' ', $node->connstr ) ), "\n"; > +} > +close $fh; > > That's duplicated. Let's perhaps use a $newline variable and print > into the file using the $newline? OK. I reflected above comment. > Question: you are doing things this way in the test because fgets() is > what is used by libpq to retrieve the lines of the service file, is > that right? No. I'm doing above way simply because line ending code of service file wrote by users may become CRLF in Windows platform. > Please note that the CI is failing. It seems to me that you are > missing a done_testing() at the end of the script. If you have a > github account, I'd suggest to set up a CI in your own fork of > Postgres, this is really helpful to double-check the correctness of a > patch before posting it to the lists, and saves in round trips between > author and reviewer. Please see src/tools/ci/README in the code tree > for details. Sorry. I'm using Cirrus CI with GitHub and I checked passing the CI. But there were misses when I created patch files... > +# Copyright (c) 2023-2024, PostgreSQL Global Development Group > > These dates are incorrect. Should be 2025, as it's a new file. OK. > +++ b/src/interfaces/libpq/t/007_servicefile_opt.pl > @@ -0,0 +1,100 @@ > +# Copyright (c) 2023-2024, PostgreSQL Global Development Group > > Incorrect date again in the second path with the new feature. I'd > suggest to merge all the tests in a single script, with only one node > initialized and started. OK. Additional test scripts have been merged to a single script ^^ b --- Great regards, Ryo Kanbayashi From 69c4f4eb8e0ed40c434867fbb740a68383623da9 Mon Sep 17 00:00:00 2001 From: Ryo Kanbayashi Date: Sun, 23 Mar 2025 11:41:06 +0900 Subject: [PATCH v3 1/2] add regression test of service connection option. 
--- src/interfaces/libpq/meson.build | 1 + src/interfaces/libpq/t/006_service.pl | 79 +++ 2 files changed, 80 insertions(+) create mode 100644 src/interfaces/libpq/t/006_service.pl diff --git a/src/interfaces/libpq/meson.build b/src/interfaces/libpq/meson.build index 19f4a52a97a..292fecf3320 100644 --- a/src/interfaces/libpq/meson.build +++ b/src/interfaces/libpq/meson.build @@ -122,6 +122,7 @@ tests += { 't/003_load_balance_host_list.pl', 't/004_load_balance_dns.pl', 't/005_negotiate_encryption.pl', + 't/006_service.pl', ], 'env': { 'with_ssl': ssl_library, diff --git a/src/interfaces/libpq/t/006_service.pl b/src/interfaces/libpq/t/006_service.pl new file mode 100644 index 000..045e25a05d3 --- /dev/null +++ b/src/interfaces/libpq/t/006_service.pl @@ -0,0 +1,79 @@ +# Copyright (c) 2025, PostgreSQL Global Development Group +use strict; +use warnings FATAL => 'all'; +use Config; +use PostgreSQL::Test::Utils; +use PostgreSQL::Test::Cluster; +use Test::More; + +# This tests "service" connection options. + +# Cluster setup which is shared for testing both load balancing methods +my $node = PostgreSQL::Test::Cluster->new('node'); + +# Create a data directory with initdb +$node->init(); + +# Start the PostgreSQL server +$node->start(); + +my $td = PostgreSQL::Test::Utils::tempdir; +my $srvfile = "$td/pgsrv.conf"; + +# Windows: use CRLF +# Non-Windows: use LF +my $newline = $windows_os ? "\r\n" : "\n"; + +# Create a service file +open my $fh, '>', $srvfile or die $!; +print $fh "[my_srv]", $newline; +print $fh join( $newline, split( ' ', $node->connstr ) ), $newline; + +close $fh; + +# Check that service option works as expected +{ +local $ENV{PGSERVICEFILE} = $srvfile; +$node->connect_ok( +'service=my_srv', +'service=my_srv', +sql => "SELECT 'connect1'", +expected_stdout => qr/connect1/ +); + +$node->connect_ok( +'postgres://?service=my_srv', +'postgres://?service=my_srv', +sql => "SELECT 'connect2'", +expected_stdout =
Re: Proposal - Allow extensions to set a Plan Identifier
On Sat, Mar 22, 2025 at 11:50:06PM +0100, Andrei Lepikhov wrote: > planId actually looks less controversial than queryId or plan_node_id. At > the same time, it is not free from the different logic that extensions may > incorporate into this value: I can imagine, for example, the attempt of > uniquely numbering plans related to the same queryId, plain hashing of the > plan tree, hashing without constants, etc. One plan ID should have one single query ID, but the opposite may be not necessarily true: a query ID could be associated with multiple plan patterns (aka multiple plan IDs). What this offers is that we know about which plan one query is currently using in live, for a given query ID. > So, it seems that extensions may conflict with the same field. Are we sure > that will happen if they are loaded in different orders in neighbouring > backends? Depends on what kind of computation one wants to do, and surely without some coordination across the extensions you are using these cannot be shared, but it's no different from the concept of a query ID. The use cases I've seen where this field is useful is when splitting code that uses the planner hook for several purposes across more than one extension. Three of them, which are independent, still want to know about a plan ID: - Set/get an existing node tree plan based on a specific ID. - Hash calculation of a tree (like Lukas proposal). - Statistics related to the plan tree used (aka custom cumulative stats play here). > I'm not very good at stat collector guts, but would it be possible to allow > extensions to report their state in standard tuple format? In that case, > doubts about queryId or planId may be resolved. I am not exactly sure what you mean here. We are only going to have one plan ID set for each query execution. Any stats plan data related to a plan ID surely needs to be upper-bounded if you don't want to bloat pgstats, with the query ID of the query string it relates to stored in it and the plan ID used as hash key, but it does not change that we're only going to always have one single plan ID in a PlannedStmt or in a backend entry doing a query execution, like a query ID. -- Michael
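As a concrete illustration of that split, an extension that only wants to tag plans could do everything from its planner hook (a sketch; compute_plan_id() stands in for whatever hashing the extension chooses, and planId is the PlannedStmt field proposed by this patch):

static planner_hook_type prev_planner_hook = NULL;

static PlannedStmt *
myext_planner(Query *parse, const char *query_string,
              int cursorOptions, ParamListInfo boundParams)
{
    PlannedStmt *result;

    if (prev_planner_hook)
        result = prev_planner_hook(parse, query_string,
                                   cursorOptions, boundParams);
    else
        result = standard_planner(parse, query_string,
                                  cursorOptions, boundParams);

    /* respect a plan ID another extension may have set already */
    if (result->planId == 0)
        result->planId = compute_plan_id(result);    /* hypothetical */

    return result;
}

Core then publishes the value to the backend entry, so cooperating extensions only need to agree on who computes it.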
Re: pg_stat_statements and "IN" conditions
> Sure, it's important and I'm planning to tackle this next. If you want, > we can collaborate on a patch for that. I spent some time looking some more at this, and I believe all that needs to be done is to check for a Param node with a type of PARAM_EXTERN. During planning the planner turns the Param into a Const during eval_const_expressions_mutator. If it's as simple as I think it is, I hope we can get this committed for 18. If not, and a longer discussion is needed, a new thread can be started for this. -- Sami Imseih Amazon Web Services (AWS)
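In terms of code, the check being described is tiny; roughly (a sketch only, and the name of the helper inside the squashing logic is paraphrased):

static bool
IsSquashableElement(Node *element)
{
    /* plain constants were already squashable */
    if (IsA(element, Const))
        return true;

    /*
     * External parameters ($1, $2, ...) are turned into Consts by
     * eval_const_expressions_mutator() at planning time, so treat them
     * the same way when jumbling.
     */
    if (IsA(element, Param) &&
        ((Param *) element)->paramkind == PARAM_EXTERN)
        return true;

    return false;
}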
Re: AIO v2.5
On Thu, Mar 20, 2025 at 09:58:37PM -0400, Andres Freund wrote: > Attached v2.11, with the following changes: > - Added an error check for FileStartReadV() failing > > FileStartReadV() actually can fail, if the file can't be re-opened. I > thought it'd be important for the error message to differ from the one > that's issued for read actually failing, so I went with: > > "could not start reading blocks %u..%u in file \"%s\": %m" > > but I'm not sure how good that is. Message looks good. > - Improved error message if io_uring_queue_init() fails > > Added errhint()s for likely cases of failure. > > Added errcode(). I was tempted to use errcode_for_file_access(), but that > doesn't support ENOSYS - perhaps I should add that instead? Either way is fine with me. ENOSYS -> ERRCODE_FEATURE_NOT_SUPPORTED is a good general mapping to have in errcode_for_file_access(), but it's also not a problem to keep it the way v2.11 has it. > - Disable io_uring method when using EXEC_BACKEND, they're not compatible > > I chose to do this with a define aio.h, but I guess we could also do it at > configure time? That seems more complicated though - how would we even know > that EXEC_BACKEND is used on non-windows? Agreed, "make PROFILE=-DEXEC_BACKEND" is a valid way to get EXEC_BACKEND. > Not sure yet how to best disable testing io_uring in this case. We can't > just query EXEC_BACKEND from pg_config.h unfortunately. I guess making the > initdb not fail and checking the error log would work, but that doesn't work > nicely with Cluster.pm. How about "postgres -c io_method=io_uring -C ": --- a/src/test/modules/test_aio/t/001_aio.pl +++ b/src/test/modules/test_aio/t/001_aio.pl @@ -29,7 +29,13 @@ $node_worker->stop(); # Test io_method=io_uring ### -if ($ENV{with_liburing} eq 'yes') +sub have_io_uring +{ + local %ENV = $node_worker->_get_env(); # any node works + return run_log [qw(postgres -c io_method=io_uring -C io_method)]; +} + +if (have_io_uring()) { my $node_uring = create_node('io_uring'); $node_uring->start(); > Questions: > > > - We only "look" at BM_IO_ERROR for writes, isn't that somewhat weird? > > See AbortBufferIO(Buffer buffer) > > It doesn't really matter for the patchset, but it just strikes me as an > oddity. That caught my attention in an earlier review round, but I didn't find it important enough to raise. It's mildly unfortunate to be setting BM_IO_ERROR for reads when the only thing BM_IO_ERROR drives is message "Multiple failures --- write error might be permanent." It's minor, so let's leave it that way for the foreseeable future. > Subject: [PATCH v2.11 01/27] aio, bufmgr: Comment fixes Ready to commit, though other comment fixes might come up in later reviews. One idea so far is to comment on valid states after some IoMethodOps callbacks: --- a/src/include/storage/aio_internal.h +++ b/src/include/storage/aio_internal.h @@ -310,6 +310,9 @@ typedef struct IoMethodOps /* * Start executing passed in IOs. * +* Shall advance state to PGAIO_HS_SUBMITTED. (By the time this returns, +* other backends might have advanced the state further.) +* * Will not be called if ->needs_synchronous_execution() returned true. * * num_staged_ios is <= PGAIO_SUBMIT_BATCH_SIZE. @@ -321,6 +324,12 @@ typedef struct IoMethodOps /* * Wait for the IO to complete. Optional. * +* On return, state shall be PGAIO_HS_COMPLETED_IO, +* PGAIO_HS_COMPLETED_SHARED or PGAIO_HS_COMPLETED_LOCAL. (The callback +* need not change the state if it's already one of those.) 
If state is +* PGAIO_HS_COMPLETED_IO, state will reach PGAIO_HS_COMPLETED_SHARED +* without further intervention. +* * If not provided, it needs to be guaranteed that the IO method calls * pgaio_io_process_completion() without further interaction by the * issuing backend. > Subject: [PATCH v2.11 02/27] aio: Change prefix of PgAioResultStatus values to > PGAIO_RS_ Ready to commit > Subject: [PATCH v2.11 03/27] Redefine max_files_per_process to control > additionally opened files Ready to commit > Subject: [PATCH v2.11 04/27] aio: Add liburing dependency > --- a/meson.build > +++ b/meson.build > @@ -944,6 +944,18 @@ endif > > > > +### > +# Library: liburing > +### > + > +liburingopt = get_option('liburing') > +liburing = dependency('liburing', required: liburingopt) > +if liburing.found() > + cdata.set('USE_LIBURING', 1) > +endif This is a different style from other deps; is it equivalent to our standard style? Example for lz4: lz4opt = get_option('lz4') if not lz4opt.disabled() lz4 = dependency('liblz4', required: false) # Unfortunately the dependency is named differently wit
Re: Update LDAP Protocol in fe-connect.c to v3
> This is the first complaint I can recall hearing about that, so exactly which ones are "many"? I've tested a 2 before figuring out about the v3 issue. lldap[0] and the docker image osixia/docker-openldap[1]. - lldap gives the following error message when I attempt to connect without the patch "Service Error: while handling incoming messages: while receiving LDAP op: Bind request version is not equal to 3. This is a serious client bug.". With the attached patch this error message does not appear - osixia/docker-openlap gives the following error message without the patch "67df745e conn=1001 op=0 RESULT tag=97 err=2 text=historical protocol version requested, use LDAPv3 instead". " > Also, are we really sufficiently compliant with v3 that just adding this bit is enough? I believe that this bit is all that is needed. Per the man page for ldap_set_option [2]: "The protocol version used by the library defaults to LDAPv2 (now historic), which corresponds to the LDAP_VERSION2 macro. Application developers are encouraged to explicitly set LDAP_OPT_PROTOCOL_VERSION to LDAPv3, using the LDAP_VERSION3 macro, or to allow users to select the protocol version." > src/test/ldap/ doesn't do it for you? Looking through the tests here it seems like they are all tests for the serverside auth functionality that is configurable in pg_hba.conf. I don't see any tests that test the client side "LDAP Lookup of Connection Parameters" described in [3] [0] https://github.com/lldap/lldap [1] https://github.com/osixia/docker-openldap [2] https://linux.die.net/man/3/ldap [3] https://www.postgresql.org/docs/current/libpq-ldap.html On Sat, Mar 22, 2025 at 6:10 PM Tom Lane wrote: > Andrew Jackson writes: > > Currently the LDAP usage in fe-connect.c does not explicitly set the > > protocol version to v3. This causes issues with many LDAP servers as they > > will often require clients to use the v3 protocol and disallow any use of > > the v2 protocol. > > This is the first complaint I can recall hearing about that, so > exactly which ones are "many"? Also, are we really sufficiently > compliant with v3 that just adding this bit is enough? > > > One further note is that I do not currently see any test coverage over > the > > LDAP functionality in `fe-connect.c`. I am happy to add that to this > patch > > if needed. > > src/test/ldap/ doesn't do it for you? > > regards, tom lane >
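For reference, the change under discussion is essentially one extra call before binding in fe-connect.c's ldapServiceLookup(), along these lines (a sketch; the error-reporting helper and the return value are assumptions meant to mirror the neighbouring error paths):

    int         ldapversion = LDAP_VERSION3;

    if (ldap_set_option(ld, LDAP_OPT_PROTOCOL_VERSION,
                        &ldapversion) != LDAP_OPT_SUCCESS)
    {
        libpq_append_error(errorMessage,
                           "could not set LDAP protocol version");
        ldap_unbind(ld);
        return 3;
    }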
Re: Update Unicode data to Unicode 16.0.0
On Fri, 21 Mar 2025 13:45:24 -0700 Jeff Davis wrote: > > Maybe we should actually move in the direction of having encodings > > that are essentially specific versions of Unicode. Instead of just > > having UTF-8 that accepts whatever, you could have UTF-8.v16.0.0 or > > whatever, which would only accept code points known to that version > > of > > Unicode. Or maybe this shouldn't be entirely new encodings but > > something vaguely akin to a typmod, so that you could have columns > > of type text[limited_to_unicode_v16_0_0] or whatever. If we actually > > exclude unassigned code points, then we know they aren't there, and > > we > > can make deductions about what is safe to do based on that > > information. > > I like this line of thinking, vaguely similar to my STRICT_UNICODE > database option proposal. Maybe these aren't exactly the right things > to do, but I think there are some possibilities here, and we shouldn't > give up and assume there's a problem when usually there is not. There is "the iPhone paradox" here; if we reject unassigned code points, then websites are going to start throwing database errors for anyone with the latest iPhone who uses a new emoji. (Unless the database is updated very quickly, which is atypical.) Apple tends to get new emojis into consumers hands a year or less after the new Unicode release. -Jeremy
Re: Proposal - Allow extensions to set a Plan Identifier
On 3/22/25 06:48, Michael Paquier wrote: On Fri, Mar 21, 2025 at 09:25:24AM -0400, Sami Imseih wrote: planner() is the sole place in the core code where the planner hook can be called. Shouldn't we have at least a call to pgstat_report_plan_id() after planning a query? At least that should be the behavior I'd expect, where a module pushes a planId to a PlannedStmt, then core publishes it to the backend entry in non-force mode. I agree. I was just thinking we rely on the exec_ routines to report the plan_id at the start. But, besides the good reason you give, reporting (slightly) earlier is better for monitoring tools; as it reduces the likelihood they find an empty plan_id. Yep. pgstat_report_plan_id() is not something that extensions should do, but they should report the planId in the PlannedStmt and let the backend do the rest. Overall, v3 LGTM Thanks. If anybody has any objections and/or comments, now would be a good time. I'll revisit this thread at the beginning of next week. The last time I wrote the email, I mistakenly thought about plan_node_id. I apologize for that. planId actually looks less controversial than queryId or plan_node_id. At the same time, it is not free from the different logic that extensions may incorporate into this value: I can imagine, for example, the attempt of uniquely numbering plans related to the same queryId, plain hashing of the plan tree, hashing without constants, etc. So, it seems that extensions may conflict with the same field. Are we sure that will happen if they are loaded in different orders in neighbouring backends? I'm not very good at stat collector guts, but would it be possible to allow extensions to report their state in standard tuple format? In that case, doubts about queryId or planId may be resolved. -- regards, Andrei Lepikhov
Re: query_id: jumble names of temp tables for better pg_stat_statement UX
Michael Paquier writes: > Alias.aliasname is not qualified, so it means that we'd begin to > assign the same query ID even if using two relations from two schemas > depending on what search_path assigns, no? Right. I'm arguing that that's good. The proposed patch already obscures the difference between similar table names in different (temp) schemas, and I'm suggesting that taking that a bit further would be fine. Note that if the tables we're considering don't have identical rowtypes, the queries would likely jumble differently anyway due to differences in Vars' varattno and vartype. regards, tom lane
Re: query_id: jumble names of temp tables for better pg_stat_statement UX
On Sat, Mar 22, 2025 at 12:24:43PM -0400, Tom Lane wrote: > I experimented with this trivial fix (shown in-line to keep the cfbot > from thinking this is the patch-of-record): > > What's happening there is that there's an ALTER TABLE ADD COLUMN in > the test, so the executions after the first one see more entries > in eref->colnames and come up with a different jumble. I think > we probably don't want that behavior; we only want to jumble the > table name. So we'd still need the v3-0001 patch in some form to > allow annotating RangeTblEntry.eref with a custom jumble method > that'd only jumble the aliasname. Alias.aliasname is not qualified, so it means that we'd begin to assign the same query ID even if using two relations from two schemas depending on what search_path assigns, no? Say: create schema popo1; create schema popo2; create table popo1.aa (a int, b int); create table popo2.aa (a int, b int); set search_path = 'popo1'; select count(*) from aa; set search_path = 'popo2'; select count(*) from aa; =# select query, calls from pg_stat_statements where query ~ 'select count'; query | calls -+--- select count(*) from aa | 2 (1 row) Perhaps that's OK because such queries use the same query string, but just silencing the relid means that we'd lose the namespace reference entirely, making the stats potentially fuzzier depending on the workload. On HEAD, one can guess the query ID with an EXPLAIN and a search_path, as well, so currently it's possible to cross-check the contents of pgss. But we'd lose this possibility here.. -- Michael
Re: AIO v2.5
On Thu, Mar 20, 2025 at 09:58:37PM -0400, Andres Freund wrote: > Attached v2.11 > Subject: [PATCH v2.11 05/27] aio: Add io_method=io_uring Apart from some isolated cosmetic points, this is ready to commit: > + ereport(ERROR, > + errcode(err), > + errmsg("io_uring_queue_init failed: > %m"), > + hint != NULL ? errhint("%s", hint) : 0); https://www.postgresql.org/docs/current/error-style-guide.html gives the example: BAD:open() failed: %m BETTER: could not open file %s: %m Hence, this errmsg should change, perhaps to: "could not setup io_uring queues: %m". > + pgaio_debug_io(DEBUG3, ioh, > +"wait_one io_gen: %llu, ref_gen: > %llu, cycle %d", > +(long long unsigned) ref_generation, > +(long long unsigned) ioh->generation, In the message string, io_gen appears before ref_gen. In the subsequent args, the order is swapped relative to the message string. > --- a/src/backend/utils/activity/wait_event_names.txt > +++ b/src/backend/utils/activity/wait_event_names.txt > @@ -192,6 +192,8 @@ ABI_compatibility: > > Section: ClassName - WaitEventIO > > +AIO_IO_URING_SUBMIT "Waiting for IO submission via io_uring." > +AIO_IO_URING_COMPLETION "Waiting for IO completion via io_uring." > AIO_IO_COMPLETION"Waiting for IO completion." I'm wondering if there's an opportunity to enrich the last two wait event names and/or descriptions. The current descriptions suggest to me more similarity than is actually there. Inputs to the decision: - AIO_IO_COMPLETION waits for an IO in PGAIO_HS_DEFINED, PGAIO_HS_STAGED, or PGAIO_HS_COMPLETED_IO to reach PGAIO_HS_COMPLETED_SHARED. The three starting states are the states where some other backend owns the next action, so the current backend can only wait to be signaled. - AIO_IO_URING_COMPLETION waits for the kernel to do enough so we can move from PGAIO_HS_SUBMITTED to PGAIO_HS_COMPLETED_IO. Possible names and descriptions, based on PgAioHandleState enum names and comments: AIO_IO_URING_COMPLETED_IO "Waiting for IO result via io_uring." AIO_COMPLETED_SHARED"Waiting for IO shared completion callback." If "shared completion callback" is too internals-focused, perhaps this: AIO_IO_URING_COMPLETED_IO "Waiting for IO result via io_uring." AIO_COMPLETED_SHARED"Waiting for IO completion to update shared memory." > --- a/doc/src/sgml/config.sgml > +++ b/doc/src/sgml/config.sgml > @@ -2710,6 +2710,12 @@ include_dir 'conf.d' > worker (execute asynchronous I/O using worker > processes) > > > + > + > +io_uring (execute asynchronous I/O using > +io_uring, if available) I feel the "if available" doesn't quite fit, since we'll fail if unavailable. Maybe just "(execute asynchronous I/O using Linux io_uring)" with "Linux" there to reduce surprise on other platforms. > Subject: [PATCH v2.11 06/27] aio: Implement support for reads in smgr/md/fd (Still reviewing this one.)
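Putting the error-style point and the existing err/hint structure together, the report would end up roughly as (a sketch of the suggested wording only):

    ereport(ERROR,
            errcode(err),
            errmsg("could not setup io_uring queues: %m"),
            hint != NULL ? errhint("%s", hint) : 0);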
Re: Test to dump and restore objects left behind by regression
On Fri, Mar 21, 2025 at 6:04 PM Alvaro Herrera wrote: > > On 2025-Mar-21, Ashutosh Bapat wrote: > > > On Thu, Mar 20, 2025 at 8:37 PM vignesh C wrote: > > > > Should the copyright be only 2025 in this case: > > > The patch was posted in 2024 to this mailing list. So we better > > protect the copyright since then. I remember a hackers discussion > > where a senior member of the community mentioned that there's not harm > > in mentioning longer copyright periods than being stricter about it. I > > couldn't find the discussion though. > > On the other hand, my impression is that we do update copyright years to > current year, when committing new files of patches that have been around > for long. > > And there's always > https://liferay.dev/blogs/-/blogs/how-and-why-to-properly-write-copyright-statements-in-your-code Right. So shouldn't the copyright notice be 2024-2025 and not just only 2025? - Next year it will be changed to 2024-2026. -- Best Wishes, Ashutosh Bapat