Fixup some appendPQExpBuffer() calls

2025-04-14 Thread David Rowley
In a similar effort to what I did in [1], there's a bunch of appendPQExpBuffer() calls that could use appendPQExpBufferStr() or appendPQExpBufferChar(). With [1], I've been keeping those appendStringInfo calls in check at this time of year for a few years now. With appendPQExpBuffer I've not been,

Get rid of integer divide in FAST_PATH_REL_GROUP() macro

2025-04-13 Thread David Rowley
I noticed a while ago that the new fast-path locking code uses integer division to figure out the fast-path locking group slot. To me, this seems a bit unnecessary as FastPathLockGroupsPerBackend is always a power-of-two value, so we can use bitwise-AND instead. I don't think FAST_PATH_REL_GROUP

pgsql: Doc: use "an SQL" consistently rather than "a SQL"

2025-04-13 Thread David Rowley
Doc: use "an SQL" consistently rather than "a SQL" Per the precedent set by 04539e73f, adjust article prefixes for "SQL" to use "an" consistently rather than "a", i.e., "an es-que-ell" rather than "a sequel". Both of these are new to v18. Also see b1b13d2b5, d866f0374 and 7bdd489d3. Branch -

Re: Can we use Statistics Import and Export feature to perforamance testing?

2025-04-12 Thread David Rowley
On Sat, 12 Apr 2025 at 20:29, Corey Huinker wrote: >> >> at the *actual size* of the relation and takes that into account when >> scaling the statistics (see table_block_relation_estimate_size() in >> tableam.c). If the table sizes don't match between the two servers >> then there's no guarantees

Re: n_ins_since_vacuum stats for aborted transactions

2025-04-10 Thread David Rowley
On Fri, 11 Apr 2025 at 02:01, Sami Imseih wrote: > I created an entry for the July CF > https://commitfest.postgresql.org/patch/5691/ > > ... and I realized I forgot to include David's code comment patch yesterday, > Reattaching both patches. I've pushed the code comment patch. For the docs patc

pgsql: Add code comment explaining ins_since_vacuum and aborted inserts

2025-04-10 Thread David Rowley
ami Imseih Author: David Rowley Reviewed-by: Sami Imseih Discussion: https://postgr.es/m/caaphdvpgv3a-r2egmpoh0l-x3phbzpm3y4dyswfy+uquazw...@mail.gmail.com Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/530050d8d2850d0453bb56e2bfa7cae216ee8a18 Modi

Re: pgsql: Convert 'x IN (VALUES ...)' to 'x = ANY ...' then appropriate

2025-04-10 Thread David Rowley
On Mon, 7 Apr 2025 at 19:39, Melanie Plageman wrote: > +++ > C:/cirrus/build/testrun/pg_upgrade/002_pg_upgrade/data/results/subselect.out > 2025-04-04 14:47:20.358393500 + > @@ -2769,15 +2769,16 @@ > EXPLAIN (COSTS OFF) > SELECT c.oid,c.relname FROM pg_class c JOIN pg_am a USING (oid) > WH

Re: Improve a few appendStringInfo calls new to v18

2025-04-10 Thread David Rowley
On Fri, 11 Apr 2025 at 02:51, Nathan Bossart wrote: > This probably isn't v18 material, but this reminds me of my idea to change > appendStringInfoString() into a macro for appendBinaryStringInfo() so that > the compiler can remove the runtime strlen() calls for string literals [0]. > In most case

Re: Improve a few appendStringInfo calls new to v18

2025-04-10 Thread David Rowley
On Thu, 10 Apr 2025 at 20:24, Heikki Linnakangas wrote: > > On 10/04/2025 06:51, David Rowley wrote: > > Any objections to doing this soonish? Or in a few weeks? > > Sure, let's do it. Why would we wait? Great. Pushed. Was considering waiting as I didn't know if the

pgsql: Improve various new-to-v18 appendStringInfo calls

2025-04-10 Thread David Rowley
Improve various new-to-v18 appendStringInfo calls Similar to 8461424fd, here we adjust a few new locations which were not using the most suitable appendStringInfo* function for the intended purpose. Author: David Rowley https://postgr.es/m/CAApHDvqJnNjueb=eoj8k+8n0g7nj_acpwsicj5rnv4fdeja

Re: Possible documentation inaccuracy in optimizer README

2025-04-10 Thread David Rowley
On Wed, 9 Apr 2025 at 14:33, Tom Lane wrote: > Maybe better: > > Other possibilities will be excluded for lack of join clauses. > (In reality, use of EquivalenceClasses would allow us to > deduce additional join clauses that allow more join > combinations, but here

Re: Add .DS_Store to .gitignore

2025-04-09 Thread David Rowley
On Thu, 10 Apr 2025 at 18:05, Dianjin Wang wrote: > Here is a patch to add the `.DS_Store` to the .gitignore to ignore the > `.DS_Store` file tracking by git when doing some development work in > the macOS terminal. It helps me and my community (Apache Cloudberry) a > lot. I hope this small patch

Re: Possible api miuse bms_next_member

2025-04-09 Thread David Rowley
On Thu, 10 Apr 2025 at 02:18, Tom Lane wrote: > If we did want to do something about this warning, rather than > hacking up the call sites I'd be inclined to invent something like > "bms_first_member()" which does the same thing but additionally > asserts on empty input. Not really convinced it's

Re: [PoC] Reducing planning time when tables have many partitions

2025-04-09 Thread David Rowley
On Thu, 10 Apr 2025 at 16:57, Amit Langote wrote: > > On Thu, Apr 10, 2025 at 12:03 PM David Rowley wrote: > > -Most operations on EquivalenceClasses should ignore child members. > > +Most operations on EquivalenceClasses needn't look at child members. > > > >

pgsql: Update wording in optimizer/README for EquivalenceClasses

2025-04-09 Thread David Rowley
to reflect the new storage location for child members. Reported-by: Amit Langote Author: Amit Langote Author: David Rowley Discussion: https://postgr.es/m/CA+HiwqE8v=EuAP_3F_A2xn8zWx+nG_etW_Fe_DvKO-Fkx=+d...@mail.gmail.com Branch -- master Details --- https

Improve a few appendStringInfo calls new to v18

2025-04-09 Thread David Rowley
Looks like v18 has grown a few appendStringInfo misusages, e.g. using appendStringInfo() when no formatting is needed or just using format "%s" instead of using appendStringInfoString(). I've attached a couple of patches. The 0001 is just my method for finding these, not for commit. 0002 contains

Re: [PoC] Reducing planning time when tables have many partitions

2025-04-09 Thread David Rowley
On Wed, 9 Apr 2025 at 17:38, Amit Langote wrote: > Still, maybe a tiny tweak to the last line could help steer readers > right without diving into storage. How about: > > Most operations on EquivalenceClasses should ignore child members, > which are stored separately from normal members. I think

Re: n_ins_since_vacuum stats for aborted transactions

2025-04-09 Thread David Rowley
On Thu, 10 Apr 2025 at 10:54, Sami Imseih wrote: > Fair enough, and I think I got my answers from this thread. This > design decision was not > discussed in the thread for b07642dbcd8. This discussion is mostly settled down now, but... I also went through that thread to see if it was mentioned a

Re: Memoize ANTI and SEMI JOIN inner

2025-04-09 Thread David Rowley
On Wed, 9 Apr 2025 at 18:48, Richard Guo wrote: > > On Thu, Mar 20, 2025 at 3:02 PM David Rowley wrote: > > For making this work, I think the attached should be about the guts of > > the code changes. I didn't look at the comments. Right now I can't > > think of

Re: [PoC] Reducing planning time when tables have many partitions

2025-04-08 Thread David Rowley
On Wed, 9 Apr 2025 at 17:09, Amit Langote wrote: > Should the following paragraph in src/backend/optimizer/README be > updated to reflect the new reality after recent changes? > > An EquivalenceClass can contain "em_is_child" members, which are copies > of members that contain appendrel pa

Re: Possible documentation inaccuracy in optimizer README

2025-04-08 Thread David Rowley
On Tue, 8 Apr 2025 at 16:28, Tom Lane wrote: > We could possibly avoid the inaccuracy by making the examples use > some other operators that are not equijoins. But I wonder if that > would not be more confusing rather than less so. I don't think it'd hurt to mention that we're just ignoring the

Re: pgsql: Rename argument in pg_get_process_memory_contexts().

2025-04-08 Thread David Rowley
On Wed, 9 Apr 2025 at 09:22, Daniel Gustafsson wrote: > Rename argument in pg_get_process_memory_contexts(). > src/include/catalog/pg_proc.dat | 2 +- I think this should have done a catversion bump. I suspect we'll get one soon enough, but you never know... David

Re: Remove unnecessary static type qualifiers

2025-04-08 Thread David Rowley
On Wed, 9 Apr 2025 at 03:46, Peter Eisentraut wrote: > To avoid creating an array on the stack, you could maybe write it with a > pointer instead, like: > > const char * const query = "..."; > > I haven't tested whether that results in different or better compiled > code. The original code is pro

Re: Feature freeze

2025-04-08 Thread David Rowley
On Wed, 9 Apr 2025 at 03:54, Bruce Momjian wrote: > We did have this discussion when AoE was chosen for PG 18 and the idea > was that as long as it is before April 18 midnight wherever you are, it > is not feature freeze yet. I think it maybe once made sense for the moment to stop accepting new p

Re: [PoC] Reducing planning time when tables have many partitions

2025-04-08 Thread David Rowley
On Wed, 9 Apr 2025 at 02:24, Tom Lane wrote: > > David Rowley writes: > > I've pushed the patch now. Thanks for all the reviews of my adjustments. > > Shouldn't the CF entry be marked committed? I've done that now. 88f55bc97 added code to do faster lookups

Re: Can we use Statistics Import and Export feature to perforamance testing?

2025-04-08 Thread David Rowley
On Tue, 8 Apr 2025 at 12:21, Ryohei Takahashi (Fujitsu) wrote: > By using Statistics Import and Export feature, is it possible to achieve the > above request by following procedure? > (1) Export the statistics from production environment by using pg_dump > --statistics-only. > (2) On the staging

Re: [PoC] Reducing planning time when tables have many partitions

2025-04-07 Thread David Rowley
On Tue, 8 Apr 2025 at 17:41, Ashutosh Bapat wrote: > Thanks for listing all the patterns. Creating four different iterators > is going to affect functionality and might require duplicate code. But > each of the patterns is using exactly one BMS operation on em_relids > and relids being used as sea

pgsql: Speedup child EquivalenceMember lookup in planner

2025-04-07 Thread David Rowley
: David Rowley Reviewed-by: David Rowley Reviewed-by: Tom Lane Reviewed-by: Andrey Lepikhov Reviewed-by: Alena Rybakina Reviewed-by: Dmitry Dolgov <9erthali...@gmail.com> Reviewed-by: Amit Langote Reviewed-by: Ashutosh Bapat Tested-by: Thom Brown Tested-by: newtglobal postgresql_contri

Re: pgsql: Transfer statistics during pg_upgrade.

2025-04-07 Thread David Rowley
On Thu, 20 Feb 2025 at 22:29, Jeff Davis wrote: > Add support to pg_dump for dumping stats, and use that during > pg_upgrade so that statistics are transferred during upgrade. In most > cases this removes the need for a costly re-analyze after upgrade. I was surprised to see when I did pg_dump -T

Re: [PoC] Reducing planning time when tables have many partitions

2025-04-07 Thread David Rowley
On Tue, 8 Apr 2025 at 04:54, Ashutosh Bapat wrote: > - foreach(lc2, cur_ec->ec_members) > + setup_eclass_member_iterator(&it, cur_ec, rel); > + while ((cur_em = eclass_member_iterator_next(&it)) != NULL) > { > - EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc2); > - > /* > * Ignore chi

Re: pgsql: Convert 'x IN (VALUES ...)' to 'x = ANY ...' then appropriate

2025-04-07 Thread David Rowley
On Mon, 7 Apr 2025 at 19:39, Melanie Plageman wrote: > +++ > C:/cirrus/build/testrun/pg_upgrade/002_pg_upgrade/data/results/subselect.out > 2025-04-04 14:47:20.358393500 + > @@ -2769,15 +2769,16 @@ > EXPLAIN (COSTS OFF) > SELECT c.oid,c.relname FROM pg_class c JOIN pg_am a USING (oid) > WH

Re: [PoC] Reducing planning time when tables have many partitions

2025-04-06 Thread David Rowley
On Sat, 5 Apr 2025 at 16:55, David Rowley wrote: > I am still thinking about the duplicate members being returned from > the iterator for child join rels due to them being duplicated into > each component relid element in ec_childmembers. I did consider if > these could just not be du

Re: Hashed IN only applied to first encountered IN

2025-04-05 Thread David Rowley
On Wed, 2 Apr 2025 at 00:51, David Geier wrote: > The hashed IN optimization is only applied to the first encountered > ScalarArrayOpExpr during the expression tree traversal in > convert_saop_to_hashed_saop_walker(). Reason being that the walker > returns true which aborts the traversal. > I've

Re: Add mention in docs about locking all partitions for generic plans

2025-04-05 Thread David Rowley
On Mon, 24 Mar 2025 at 19:50, Tender Wang wrote: > > David Rowley 于2025年3月24日周一 05:28写道: >> This is no longer true in master, so if we do something here it's only >> v17 and earlier. > > In the case of [1], we still have AccessShareLock on entity_2, even though i

pgsql: Fix planner's failure to identify multiple hashable ScalarArrayO

2025-04-05 Thread David Rowley
Fix planner's failure to identify multiple hashable ScalarArrayOpExprs 50e17ad28 (v14) and 29f45e299 (v15) made it so the planner could identify IN and NOT IN clauses which have Const lists as right-hand arguments and when an appropriate hash function is available for the data types, mark the Scal

Re: BTScanOpaqueData size slows down tests

2025-04-05 Thread David Rowley
On Thu, 3 Apr 2025 at 04:21, Andres Freund wrote: > I was mildly > surprised to see how expensive the new compact attribute checks are. Is this a fairly deform-heavy workload? FWIW, before adding that Assert, I did test to see if there was any measurable impact on the time it took to run all the

Re: [PoC] Reducing planning time when tables have many partitions

2025-04-05 Thread David Rowley
ry for eclass_indexes_array when root->append_rel_list is empty? 12. join_rel_list_index isn't being initialized in all the places that do makeNode(RelOptInfo); Maybe -1 is a good initial value? (See my next point) 13. RelOptInfo.join_rel_list_index is an index into PlannerInfo.join_rel_list. I

Re: pgsql: Add vacuum_truncate configuration parameter.

2025-04-05 Thread David Rowley
On Fri, 21 Mar 2025 at 04:30, Tom Lane wrote: > > Nathan Bossart writes: > > Since there's presently no way to determine whether a Boolean > > storage parameter is explicitly set or has just picked up the > > default value, this commit also introduces an isset_offset member > > to relopt_parse_el

pgsql: Doc: add information about partition locking

2025-04-05 Thread David Rowley
Doc: add information about partition locking The documentation around locking of partitions for the executor startup phase of run-time partition pruning wasn't clear about which partitions were being locked. Fix that. Reviewed-by: Tender Wang Discussion: https://postgr.es/m/CAApHDvp738G75HfkKc

Re: Add mention in docs about locking all partitions for generic plans

2025-04-05 Thread David Rowley
On Mon, 24 Mar 2025 at 22:19, Tender Wang wrote: >> Maybe I was wrong about writing nothing in master's docs. It might >> still be important to detail this. I don't know the best way to phrase >> that, but maybe something along the lines of: "The query planner >> obtains locks for all partitions w

Re: Querying one partition in a function takes locks on all partitions

2025-04-05 Thread David Rowley
On Sat, 29 Mar 2025 at 06:00, Evgeny Morozov wrote: > > On 23/03/2025 2:35 pm, David Rowley wrote: > >> alter table entity_2 add column new_column text; > > Is this just an example command? You can't add a column to a > > partition directly. > > Yes, it was

Re: [PoC] Reducing planning time when tables have many partitions

2025-04-04 Thread David Rowley
On Sat, 5 Apr 2025 at 04:05, Tom Lane wrote: > This patchset has a distinct whiff of unseemly haste. hmm, yes. I would like to give this patch as good a chance at making v18 as I can, and I admit to having optimised for that. Seemingly, we've got a few other good partitioning performance patches

Re: Reducing memory consumed by RestrictInfo list translations in partitionwise join planning

2025-04-04 Thread David Rowley
On Sat, 29 Mar 2025 at 05:46, Ashutosh Bapat wrote: > PFA patches. 0001 and 0002 are the same as the previous set. 0003 > changes the initial hash table size to the length of ec_derives. I'm just not following the logic in making it the length of the ec_derives List. If you have 32 buckets and tr

Re: Postgres Query Plan using wrong index

2025-04-04 Thread David Rowley
On Thu, 3 Apr 2025 at 18:07, Tom Lane wrote: > A simple-minded approach could be to just be pessimistic, and > increase our estimate of how many rows would need to be scanned as a > consequence of noticing that the columns have significant correlation. > The shape of that penalty function would be

Re: [PoC] Reducing planning time when tables have many partitions

2025-04-04 Thread David Rowley
Thank you for having a look at this. On Fri, 4 Apr 2025 at 21:47, Amit Langote wrote: > It looks to me like the following hunks in 0002 probably belong in > 0001, unless you’re planning to commit the patches together anyway: Ah, yeah. Unsure about that as yet, but I've moved it over. > The comm

Re: [PoC] Reducing planning time when tables have many partitions

2025-04-04 Thread David Rowley
On Sat, 5 Apr 2025 at 02:54, Ashutosh Bapat wrote: > I haven't measured if the patches improve performance of simple scans > with thousands of partitions. Have you tried measuring that? I just tried 10k partitions on my Zen4 laptop. create table lp (a int) partition by list(a); select 'create ta

Re: [PoC] Reducing planning time when tables have many partitions

2025-04-03 Thread David Rowley
On Fri, 4 Apr 2025 at 00:34, David Rowley wrote: > I've attached 2 patches, which I think addresses most of this, aside > from the last point. > > These do need more work. I've just attached what I have so far before > I head off for the day. I am planning on running

Re: pgsql: Improve accounting for memory used by shared hash tables

2025-04-03 Thread David Rowley
On Thu, 3 Apr 2025 at 04:16, Tomas Vondra wrote: > Improve accounting for memory used by shared hash tables I've not looked into why, but this is causing an issue in the join_rel_hash during add_join_rel(). See the attached script. ERROR: pfree called with invalid pointer 0x60a15edc44e0 (header

Re: [PoC] Reducing planning time when tables have many partitions

2025-04-03 Thread David Rowley
On Tue, 25 Mar 2025 at 06:49, Tom Lane wrote: > I finally made some time to look at this patchset, and I'm pretty > disappointed, because after 35 versions I'd expect to see something > that looks close to committable. This doesn't really. I like the > basic idea of taking child EC members out o

Re: Postgres Query Plan using wrong index

2025-04-02 Thread David Rowley
On Thu, 3 Apr 2025 at 16:24, Manikandan Swaminathan wrote: > Since you mentioned the planner not knowing about the correlation between the > columns, I’m curious, why doesn’t making a multivariate statistic make a > difference? > > > CREATE STATISTICS col_a_col_b_stats (dependencies) ON col_a, c

pgsql: Doc: add information about partition locking

2025-04-01 Thread David Rowley
Doc: add information about partition locking The documentation around locking of partitions for the executor startup phase of run-time partition pruning wasn't clear about which partitions were being locked. Fix that. Reviewed-by: Tender Wang Discussion: https://postgr.es/m/CAApHDvp738G75HfkKc

pgsql: Doc: add information about partition locking

2025-04-01 Thread David Rowley
Doc: add information about partition locking The documentation around locking of partitions for the executor startup phase of run-time partition pruning wasn't clear about which partitions were being locked. Fix that. Reviewed-by: Tender Wang Discussion: https://postgr.es/m/CAApHDvp738G75HfkKc

pgsql: Doc: add information about partition locking

2025-04-01 Thread David Rowley
Doc: add information about partition locking The documentation around locking of partitions for the executor startup phase of run-time partition pruning wasn't clear about which partitions were being locked. Fix that. Reviewed-by: Tender Wang Discussion: https://postgr.es/m/CAApHDvp738G75HfkKc

Re: Add mention in docs about locking all partitions for generic plans

2025-04-01 Thread David Rowley
On Mon, 31 Mar 2025 at 12:19, David Rowley wrote: > I'll push these in the next few days unless anyone else wants to voice > their opinions. Thanks for the review. Pushed. David

pgsql: Doc: add information about partition locking

2025-04-01 Thread David Rowley
Doc: add information about partition locking The documentation around locking of partitions for the executor startup phase of run-time partition pruning wasn't clear about which partitions were being locked. Fix that. Reviewed-by: Tender Wang Discussion: https://postgr.es/m/CAApHDvp738G75HfkKc

pgsql: Doc: add information about partition locking

2025-04-01 Thread David Rowley
Doc: add information about partition locking The documentation around locking of partitions for the executor startup phase of run-time partition pruning wasn't clear about which partitions were being locked. Fix that. Reviewed-by: Tender Wang Discussion: https://postgr.es/m/CAApHDvp738G75HfkKc

Re: Hashed IN only applied to first encountered IN

2025-04-01 Thread David Rowley
On Wed, 2 Apr 2025 at 01:31, David Rowley wrote: > > On Wed, 2 Apr 2025 at 00:51, David Geier wrote: > > The hashed IN optimization is only applied to the first encountered > > ScalarArrayOpExpr during the expression tree traversal in > > convert_saop_to_hashed_saop_wal

pgsql: Fix planner's failure to identify multiple hashable ScalarArrayO

2025-04-01 Thread David Rowley
Fix planner's failure to identify multiple hashable ScalarArrayOpExprs 50e17ad28 (v14) and 29f45e299 (v15) made it so the planner could identify IN and NOT IN clauses which have Const lists as right-hand arguments and when an appropriate hash function is available for the data types, mark the Scal

pgsql: Fix planner's failure to identify multiple hashable ScalarArrayO

2025-04-01 Thread David Rowley
Fix planner's failure to identify multiple hashable ScalarArrayOpExprs 50e17ad28 (v14) and 29f45e299 (v15) made it so the planner could identify IN and NOT IN clauses which have Const lists as right-hand arguments and when an appropriate hash function is available for the data types, mark the Scal

pgsql: Fix planner's failure to identify multiple hashable ScalarArrayO

2025-04-01 Thread David Rowley
Fix planner's failure to identify multiple hashable ScalarArrayOpExprs 50e17ad28 (v14) and 29f45e299 (v15) made it so the planner could identify IN and NOT IN clauses which have Const lists as right-hand arguments and when an appropriate hash function is available for the data types, mark the Scal

pgsql: Fix planner's failure to identify multiple hashable ScalarArrayO

2025-04-01 Thread David Rowley
Fix planner's failure to identify multiple hashable ScalarArrayOpExprs 50e17ad28 (v14) and 29f45e299 (v15) made it so the planner could identify IN and NOT IN clauses which have Const lists as right-hand arguments and when an appropriate hash function is available for the data types, mark the Scal

Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment

2025-04-01 Thread David Rowley
On Tue, 1 Apr 2025 at 20:52, Ilia Evdokimov wrote: > With the feature freeze coming up soon, I’d like to ask: do we plan to > include this patch in v18? I don't think there's a clear enough consensus yet on what EXPLAIN should display. We'd need that beforehand. There are now less than 7 days to

pgsql: Fix failing regression test on x86-32 machines

2025-03-31 Thread David Rowley
Fix failing regression test on x86-32 machines 95d6e9af0 added code to display the tuplestore storage type for WindowAgg nodes and added a test to ensure the "Disk" storage method was working correctly by setting work_mem to 64 and running a test which caused the WindowAgg to go to disk. Seemingl

Re: pgsql: Add memory/disk usage for Window aggregate nodes in EXPLAIN.

2025-03-31 Thread David Rowley
On Tue, 1 Apr 2025 at 09:40, Christoph Berg wrote: > =# explain (analyze,buffers off,costs off) select sum(n) over() from > generate_series(1,2048) a(n); > QUERY PLAN >

Re: pgsql: Add memory/disk usage for Window aggregate nodes in EXPLAIN.

2025-03-31 Thread David Rowley
On Tue, 1 Apr 2025 at 04:40, Christoph Berg wrote: > - Storage: Disk Maximum Storage: NkB > + Storage: Memory Maximum Storage: NkB > -> Function Scan on generate_series a (actual time=N.N..N.N rows=N.N > loops=N) We'll probably just need to bump that 2000 row count to something a bit

Re: Memoize ANTI and SEMI JOIN inner

2025-03-31 Thread David Rowley
On Mon, 31 Mar 2025 at 22:03, Richard Guo wrote: > I reviewed this patch and I have some concerns about the following > code: > > if (extra->inner_unique && > (inner_path->param_info == NULL || > bms_num_members(inner_path->param_info->ppi_serials) < > list_length(ext

Re: Memoize ANTI and SEMI JOIN inner

2025-03-30 Thread David Rowley
On Mon, 31 Mar 2025 at 15:33, Alena Rybakina wrote: > I believe it's worth asserting that both inner_unique and single_mode are not > true at the same time — just as a safety check. add_paths_to_joinrel() just chooses not to populate inner_unique for SEMI and ANTI joins because, as of today's ma

Re: Memoize ANTI and SEMI JOIN inner

2025-03-30 Thread David Rowley
On Mon, 31 Mar 2025 at 16:21, Alena Rybakina wrote: > However, is it necessary to check that extra->inner_unique must be false for > SEMI/ANTI joins here, or am I missing something? It looks a little confusing > at this point. If it is necessary, I don't see the reason for it. It was me that wo

Re: Querying one partition in a function takes locks on all partitions

2025-03-30 Thread David Rowley
On Sat, 29 Mar 2025 at 10:30, Renan Alves Fonseca wrote: > Currently, in the SQL function path the plan is always generic. The > planner ignores the function arguments. The plan_cache_mode setting > has no effect in this path. > > I agree that the docs should be more explicit about this. There is

Re: Query ID Calculation Fix for DISTINCT / ORDER BY and LIMIT / OFFSET

2025-03-26 Thread David Rowley
On Thu, 27 Mar 2025 at 14:51, Michael Paquier wrote: > One habit that I've found really useful to do when it comes to this > area of the code is to apply the tests first to show what kind of > behavior we had before changing the jumbling, then apply the update to > the query jumbling. This has tw

Re: Query ID Calculation Fix for DISTINCT / ORDER BY and LIMIT / OFFSET

2025-03-26 Thread David Rowley
On Thu, 27 Mar 2025 at 18:03, Sami Imseih wrote: > Maybe I am missing something, but when I applied the > v9-0001 patch alone, it did not solve the problem it > claims to and the pg_s_s regression test fails: That was an artifact of me not having made the split quite in the right place when divid

pgsql: Optimize Query jumble

2025-03-26 Thread David Rowley
Optimize Query jumble f31aad9b0 adjusted query jumbling so it no longer ignores NULL nodes during the jumble. This added some overhead. Here we tune a few things to make jumbling faster again. This makes jumbling perform similar or even slightly faster than prior to that change. Author: David

pgsql: Fix query jumbling to account for NULL nodes

2025-03-26 Thread David Rowley
jumble buffer. This fixes the issue as the order that the NULL is recorded isn't the same in the above two queries. Author: Bykov Ivan Author: Michael Paquier Author: David Rowley Discussion: https://postgr.es/m/aafce7966e234372b2ba876c0193f1e9%40localhost.localdomain Branch -- master De

Re: Partition pruning on parameters grouped into an array does not prune properly

2025-03-26 Thread David Rowley
On Thu, 27 Mar 2025 at 04:19, Andrei Lepikhov wrote: > But if we partition on HASH(x,y) it is not working (see > incorrect-pruning-example.sql): > > PREPARE test2 (int,int) AS > SELECT 1 FROM array_prune > WHERE id1 = ANY(ARRAY[$1]) AND id2 = ANY(ARRAY[$2]); > EXPLAIN (COSTS OFF) EXECUTE test2

Re: Query ID Calculation Fix for DISTINCT / ORDER BY and LIMIT / OFFSET

2025-03-26 Thread David Rowley
On Wed, 26 Mar 2025 at 14:04, Michael Paquier wrote: > > On Wed, Mar 26, 2025 at 11:56:50AM +1300, David Rowley wrote: > > Here is the v8 version with the bug fix and performance stuff > > separated out. > > Why not. I assume that you would merge these together? I th

Re: Reduce "Var IS [NOT] NULL" quals during constant folding

2025-03-26 Thread David Rowley
On Wed, 26 Mar 2025 at 19:31, Richard Guo wrote: > > On Wed, Mar 26, 2025 at 3:06 PM Tender Wang wrote: > > The comment about notnullattnums in struct RangeTblEntry says that: > > * notnullattnums is zero-based set containing attnums of NOT NULL > > * columns. > > > > But in get_relation_notnull

Re: Query ID Calculation Fix for DISTINCT / ORDER BY and LIMIT / OFFSET

2025-03-25 Thread David Rowley
On Wed, 26 Mar 2025 at 01:11, David Rowley wrote: > I'm happy to proceed with the v7 version. I'm also happy to credit you > as the primary author of it, given that you came up with the v2b > patch. It might be best to extract the performance stuff that's in v7 > and

Re: Query ID Calculation Fix for DISTINCT / ORDER BY and LIMIT / OFFSET

2025-03-25 Thread David Rowley
On Tue, 25 Mar 2025 at 21:14, Bykov Ivan wrote: > As I can see, your patch has the same idea as my > v2-0001-Query-ID-Calculation-Fix-Variant-B.patch from [1]. > I think it would be better to extract the jumble buffer update with hash > calculation into a function > (CompressJumble in my patch).

Add mention in docs about locking all partitions for generic plans

2025-03-24 Thread David Rowley
Over in [1], there was some uncertainty about whether locking an unrelated partition was expected behaviour or not for the particular use-case which used a generic plan to scan a partitioned table and all of the partitions. I noticed that we don't mention this in the docs and though that perhaps w

Re: Query ID Calculation Fix for DISTINCT / ORDER BY and LIMIT / OFFSET

2025-03-24 Thread David Rowley
On Fri, 21 Mar 2025 at 18:45, Michael Paquier wrote: > Potential ideas about optimizing the branches seem worth > investigating, I am not sure that I would be able to dig into that > until the feature freeze gong, unfortunately, and it would be nice to > fix this issue for this release. I'll see

Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment

2025-03-24 Thread David Rowley
On Tue, 25 Mar 2025 at 11:15, Tom Lane wrote: > FWIW, I share these doubts about whether these values are useful > enough to include in the default EXPLAIN output. My main beef > with them though is that they are basically numbers derived along > the way to producing a cost estimate, and I don't

Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment

2025-03-24 Thread David Rowley
On Tue, 25 Mar 2025 at 10:23, Andrei Lepikhov wrote: > > On 3/23/25 22:16, David Rowley wrote: > > Once again, I'm not necessarily objecting to hit and evict ratios > > being shown, I just want to know they're actually useful enough to > > show and don't j

pgsql: Add tests for POSITION(bytea, bytea)

2025-03-24 Thread David Rowley
Add tests for POSITION(bytea, bytea) Previously there was no coverage for this function. Author: Aleksander Alekseev Reviewed-by: Peter Smith Reviewed-by: Rustam ALLAKOV Discussion: https://postgr.es/m/caj7c6tmt6xcoomvkncd_tr2obdgcnjefsecdcv8jzky9vkw...@mail.gmail.com Branch -- master D

Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment

2025-03-23 Thread David Rowley
On Fri, 21 Mar 2025 at 22:02, Andrei Lepikhov wrote: > In cases > I have seen before, the main question is how effective was (or maybe) a > Memoize node == how often the incoming key fits the cache. In that case, > the hit ratio fraction is more understandable for a broad audience. > That's why ac

Re: Query ID Calculation Fix for DISTINCT / ORDER BY and LIMIT / OFFSET

2025-03-23 Thread David Rowley
On Mon, 24 Mar 2025 at 15:23, Michael Paquier wrote: > And here we get 18564.06 (head) vs 16667.92 (patch) so a 10.7% > difference in this run. Hence the automatic addition of NULL to the > jumbling is disappointing here, even if this is what I'd see this as a > worst case scenario, unlikely what

Re: Patch: Cover POSITION(bytea,bytea) with tests

2025-03-23 Thread David Rowley
On Tue, 18 Mar 2025 at 03:14, Aleksander Alekseev wrote: > Here is the corrected patch. I had a look at this and it all seems good to me. Pushed. David

Re: [PoC] Reducing planning time when tables have many partitions

2025-03-23 Thread David Rowley
Thank you for addressing those comments. On Mon, 24 Mar 2025 at 12:24, Yuya Watari wrote: > It is true that currently, indexes for EquivalenceMembers do not store > information about base rels. However, the subsequent commit (v35-0004) > introduces indexes for base rels to enable faster RestrictI

Re: pgsql: Add vacuum_truncate configuration parameter.

2025-03-23 Thread David Rowley
On Sat, 22 Mar 2025 at 05:40, David G. Johnston wrote: > Not seeing much point in trying to get rid of the on/off switch. It just > won't make sense to choose a tunable value of zero to disable something, and > probably should be prohibited. Can you explain what does not make sense about it?

Re: Querying one partition in a function takes locks on all partitions

2025-03-23 Thread David Rowley
On Sat, 22 Mar 2025 at 05:27, Evgeny Morozov wrote: > select read_partition(1); -- This takes shared locks on entity_1 AND > entity_2 > > -- select count(*) from entity where part_id = 1; -- but this would only > take a shared lock only on entity_1 > > If another session tries something that takes

Re: Remove redundant if-else in EXPLAIN by using ExplainPropertyText

2025-03-22 Thread David Rowley
On Fri, 21 Mar 2025 at 09:24, Ilia Evdokimov wrote: > Thanks to David [0], we found that the if and else branches contained > equivalent code. Since ExplainPropertyText already handles non-text > formats, the extra condition is unnecessary. > > I reviewed other files related to EXPLAIN where simil

Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment

2025-03-20 Thread David Rowley
On Fri, 21 Mar 2025 at 07:54, Andrei Lepikhov wrote: > I have some doubts here. > The number of distinct values says something only when it has taken > together with the number of calls. Couldn't the reader just look at the Nested Loop's outer side row estimate for that? > Frequently, one of the

pgsql: Simplify EXPLAIN code for Memoize

2025-03-20 Thread David Rowley
code. It seems like a good idea to fix this to help prevent future changes in this area from copying the same pattern. Author: Ilia Evdokimov Reported-by: David Rowley Discussion: https://postgr.es/m/88a71bcd-0b5c-4d0b-8107-757e96f40...@tantorlabs.com Branch -- master Details ---

Re: Remove redundant if-else in EXPLAIN by using ExplainPropertyText

2025-03-20 Thread David Rowley
On Fri, 21 Mar 2025 at 10:12, David Rowley wrote: > The patch looks good to me and seems worth applying to master. Ok. Pushed. David

Re: pgsql: pg_logicalinspect: Fix possible crash when passing a directory p

2025-03-20 Thread David Rowley
On Thu, 13 Mar 2025 at 21:33, Peter Eisentraut wrote: > Ok, this is weird, because we have pg_unreachable() support for MSVC: > > #if defined(HAVE__BUILTIN_UNREACHABLE) && !defined(USE_ASSERT_CHECKING) > #define pg_unreachable() __builtin_unreachable() > #elif defined(_MSC_VER) && !defined(USE_ASS

Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment

2025-03-20 Thread David Rowley
On Thu, 20 Mar 2025 at 21:48, Ilia Evdokimov wrote: > -> Memoize (cost=0.30..0.41 rows=1 width=4) >Cache Key: t2.thousand >Cache Mode: logical >Cache Estimated Entries: 655 >Cache Estimated NDistinct: 721 >-

Re: Memoize ANTI and SEMI JOIN inner

2025-03-19 Thread David Rowley
On Thu, 20 Mar 2025 at 06:16, Andrei Lepikhov wrote: > How can we be sure that semi or anti-join needs only one tuple? I think > it would be enough to control the absence of join clauses and filters in > the join. Unfortunately, we only have such a guarantee in the plan > creation stage (maybe eve

Re: Add missing tab completion for VACUUM and ANALYZE with ONLY option

2025-03-19 Thread David Rowley
On Wed, 19 Mar 2025 at 23:22, Ilia Evdokimov wrote: > Everything seems fine with the ANALYZE command autocompletion. Regarding > VACUUM, I'm not entirely convinced we should provide autocompletion in every > case. I'd prefer to keep the behavior from the v3 patch because the original > intentio

Re: Optimize truncation logic to reduce AccessExclusive lock impact

2025-03-18 Thread David Rowley
On Tue, 18 Mar 2025 at 19:04, Stepan Neretin wrote: > We propose modifying the truncation condition in should_attempt_truncation to > avoid unnecessary full buffer scans. The new formula ensures we only attempt > truncation when we can free at least 3% of the relation size + 2 pages. This > cha

Re: Query ID Calculation Fix for DISTINCT / ORDER BY and LIMIT / OFFSET

2025-03-18 Thread David Rowley
On Tue, 18 Mar 2025 at 21:00, Michael Paquier wrote: > If we make the whole cheaper with only extra entropy added for NULLs > in nodes and strings, I'd rather have an insurance policy for the > custom functions. Something like that: > - Forbid a size of 0 in AppendJumble(). > - When dealing with

Re: maintenance_work_mem = 64kB doesn't work for vacuum

2025-03-18 Thread David Rowley
On Tue, 18 Mar 2025 at 19:34, Masahiko Sawada wrote: > I've attached the updated patch. Looks good to me. David

  1   2   3   4   5   6   7   8   9   10   >