Re: Add a bound check to TidRangeEval

2025-06-08 Thread David Rowley
On Sun, 8 Jun 2025 at 21:41, Junwang Zhao wrote: > The comments of TidRangeEval saids: > > ``` > Returns false if we detect the range cannot contain any tuples. > ``` > > After narrowing the upper and lower bounds, we can add an > additional check to verify if the lower bound exceeds the > upper b

Re: [PATCH] Re: Proposal to Enable/Disable Index using ALTER INDEX

2025-06-08 Thread David Rowley
On Mon, 9 Jun 2025 at 06:53, Robert Treat wrote: > > On Sat, Jun 7, 2025 at 9:17 PM David Rowley wrote: > > What are your thoughts on cached plans? In this scenario, do you > > assume that waiting a few days means that connections get reset and > > prepared statements

Re: [PATCH] Re: Proposal to Enable/Disable Index using ALTER INDEX

2025-06-07 Thread David Rowley
On Sun, 8 Jun 2025 at 01:35, Robert Treat wrote: > > On Fri, Jun 6, 2025 at 8:04 PM David Rowley wrote: > > Can you list your proposed series of steps you'd recommend to a DBA > > wishing to remove an index, assuming this feature exists in core as > > you'd li

Re: [PATCH] Re: Proposal to Enable/Disable Index using ALTER INDEX

2025-06-06 Thread David Rowley
On Fri, 6 Jun 2025 at 14:32, Robert Treat wrote: > In production, you aren't watching to see what happen with > pg_stat_all_indexes, because you will first be watching pg_stat_activity to > see if the plans have flipped in some way that leads to an overloaded server > (extra latency, poor cachi

Re: [PATCH] Re: Proposal to Enable/Disable Index using ALTER INDEX

2025-06-05 Thread David Rowley
On Fri, 6 Jun 2025 at 08:14, Robert Treat wrote: > I know this thread is rather old and there doesn't seem to be full > agreement on the ALTER vs GUC implementation idea, and even though I > agree with the sentiment that the GUC system is little more than the > "half-baked take on planner hints",

Re: Question Regarding Merge Append and Parallel Execution of Index Scans on Partitioned Table

2025-06-05 Thread David Rowley
On Fri, 6 Jun 2025 at 07:49, Ayush Vatsa wrote: > That said, I’m wondering if this might not be necessary, given that > Gather Merge already accomplishes similar functionality. Would > love to hear your thoughts on whether there’s a distinct advantage to > adding parallelism to Merge Append or if

Re: Question Regarding Merge Append and Parallel Execution of Index Scans on Partitioned Table

2025-06-05 Thread David Rowley
On Fri, 6 Jun 2025 at 01:47, Ayush Vatsa wrote: > A small follow-up question - Gather merge won't gather and merge the > output from child in sorted order, but will always need an explicit Sort > node beneath it to do so. Correct? Incorrect. The input node to the Gather Merge needs to be sorted b

Re: Question Regarding Merge Append and Parallel Execution of Index Scans on Partitioned Table

2025-06-04 Thread David Rowley
On Thu, 5 Jun 2025 at 07:31, Ayush Vatsa wrote: > Are these Index Scans executed sequentially (one after the other as the Merge > Append requests tuples)? It's a fairly simple answer: Merge Append does not support parallelism. > Or are they possibly executed in parallel, in advance, or concurre

Re: Proposals for EXPLAIN: rename ANALYZE to EXECUTE and extend VERBOSE

2025-05-30 Thread David Rowley
On Thu, 29 May 2025 at 16:28, Fujii Masao wrote: > I agree with changing the default value of auto_explain.log_buffers to true. > I think that users who know EXPLAIN ANALYZE includes buffers info in > the result by default since v18 would expect the buffer info also to > be included in auto_explai

Re: Optimize shared LWLock acquisition for high-core-count systems

2025-05-30 Thread David Rowley
On Fri, 30 May 2025 at 23:31, Zhou, Zhiguo wrote: > Please review this patch for consideration in upcoming PostgreSQL releases. Please add the patch to https://commitfest.postgresql.org/53/ David

Re: Add comment explaining why queryid is int64 in pg_stat_statements

2025-05-30 Thread David Rowley
On Fri, 30 May 2025 at 11:51, David Rowley wrote: > > On Fri, 30 May 2025 at 11:35, Michael Paquier wrote: > > Thanks, Nathan. Let's proceed with the change then. David, would you > > prefer handling the patch you have written by yourself for the query > > ID par

Re: Add comment explaining why queryid is int64 in pg_stat_statements

2025-05-30 Thread David Rowley
Thanks for the review. On Wed, 21 May 2025 at 03:35, Sami Imseih wrote: > 2/ Should "DatumGetInt64(hash_any_extended(.." be turned into a > macro since it's used in > several places? Also, we can add a comment in the macro such as > " > Output the queryId as an int64 rather than a uint64, to

Re: Add comment explaining why queryid is int64 in pg_stat_statements

2025-05-29 Thread David Rowley
On Fri, 30 May 2025 at 11:35, Michael Paquier wrote: > Thanks, Nathan. Let's proceed with the change then. David, would you > prefer handling the patch you have written by yourself for the query > ID part? Yes. I'll look at that again shortly. David

Re: Speed up JSON escape processing with SIMD plus other optimisations

2025-05-27 Thread David Rowley
On Wed, 28 May 2025 at 11:24, John Naylor wrote: > https://lemire.me/blog/2025/04/13/detect-control-characters-quotes-and-backslashes-efficiently-using-swar/ > > I don't find this use of SWAR that bad for readability, and there's > only one obtuse clever part that merits a comment. Plus, it seems

Re: JIT works only partially with meson build?

2025-05-26 Thread David Rowley
On Mon, 26 May 2025 at 18:08, Yugo Nagata wrote: > While building PostgreSQL 17 on Windows, I noticed bitcode files (.bc) are not > generated with meson. Does this means that "inlining" of JIT doesn't > work when PostgreSQL is build with meson? Yes. I believe [1] aims to fix this. David [1] ht

Re: Add comment explaining why queryid is int64 in pg_stat_statements

2025-05-21 Thread David Rowley
On Thu, 22 May 2025 at 02:58, Peter Eisentraut wrote: > > On 20.05.25 08:38, Michael Paquier wrote: > > On Tue, May 20, 2025 at 05:51:51PM +1200, David Rowley wrote: > >> Given the planId stuff is new and has the same issue, I think that > >> pushes me towards thi

Re: Add comment explaining why queryid is int64 in pg_stat_statements

2025-05-20 Thread David Rowley
On Tue, 20 May 2025 at 18:12, Julien Rouhaud wrote: > I don't think it was discussed, but why not go the other way, keep everything > as uint64 and expose an uint64 datatype at the SQL level instead? That's > something I actually want pretty often so I would be happy to have that > natively, whet

Re: wrong query results on bf leafhopper

2025-05-19 Thread David Rowley
On Tue, 20 May 2025 at 16:07, Tom Lane wrote: > Failures like this one [1]: > > @@ -340,9 +340,13 @@ > create function myinthash(myint) returns integer strict immutable language >internal as 'hashint4'; > NOTICE: argument type myint is only a shell > +ERROR: ROWS is not applicable when fun

Re: Add comment explaining why queryid is int64 in pg_stat_statements

2025-05-19 Thread David Rowley
On Tue, 20 May 2025 at 17:09, Michael Paquier wrote: > > On Mon, May 19, 2025 at 08:43:25PM -0700, Lukas Fittl wrote: > > Yeah, +1 to making this consistent across both query ID and the new plan > > ID, and changing both to int64 internally seems best. > > Any thoughts from others? I'm OK to take

Re: wrong query results on bf leafhopper

2025-05-19 Thread David Rowley
On Sat, 17 May 2025 at 01:19, Andres Freund wrote: > @@ -42,7 +42,7 @@ > -> Nested Loop (actual rows=1000.00 loops=N) > -> Seq Scan on tenk1 t2 (actual rows=1000.00 loops=N) > Filter: (unique1 < 1000) > - Rows Removed by Filter: 9000 > +

Re: Add comment explaining why queryid is int64 in pg_stat_statements

2025-05-19 Thread David Rowley
On Tue, 20 May 2025 at 03:05, Peter Eisentraut wrote: > Or why not store query IDs as int64_t > internally, too? I had the same thought. Changing to int64 seems like a good and less bug-prone tidy-up. I expected we ended up with uint64 as the previous type was uint32, and uint64 is the natural se

Re: Add explicit initialization for all PlannerGlobal fields

2025-05-13 Thread David Rowley
On Tue, 13 May 2025 at 04:03, Richard Guo wrote: > While adding a new field to PlannerGlobal in another patch, I noticed > that although most fields are explicitly initialized, a few are not. > This doesn't cause any functional issues, since makeNode() zeroes all > fields by default. However, the

Re: PG 18 release notes draft committed

2025-05-09 Thread David Rowley
On Fri, 2 May 2025 at 14:44, Bruce Momjian wrote: > I will continue improving it until beta 1, and until the final release. > I will probably add markup in 1-3 weeks. Let the feedback begin. ;-) Minor detail, but "Improve the performance and reduce memory usage of hash joins and GROUP BY" proba

Re: Improve docs for n_distinct_inherited

2025-05-07 Thread David Rowley
On Thu, 8 May 2025 at 15:23, David G. Johnston wrote: > Not liking the proposal, not sure it is even correct. Somehow "children of > inheritance parent tables" are omitted. I don't see the quoted text anywhere in this area, so I'm not sure I follow what you mean with the omission. > In both c

Re: PG 18 release notes draft committed

2025-05-07 Thread David Rowley
On Thu, 8 May 2025 at 13:11, Bruce Momjian wrote: > Ah, I see those now. It was two commits. For some reason the edits are > hard for me; applied patch attached. Thanks. There's been lots of work done in the last year, so lots of work to understand it all to a level where you can write meanin

Re: PG 18 release notes draft committed

2025-05-07 Thread David Rowley
On Thu, 8 May 2025 at 12:36, Bruce Momjian wrote: > +Change pg_backend_memory_contexts.level and > pg_log_backend_memory_contexts()() to be one-based (Melih Mutlu, Fujii Masao) There's an extra set of parentheses on that function name. There's also an inconsistency between the commit link you ad

Improve docs for n_distinct_inherited

2025-05-07 Thread David Rowley
In [1] there's a bug report about ALTER TABLE ... ALTER COLUMN SET (n_distinct = N) not working for partitioned tables. Of course, you need to use n_distinct_inherited for partitioned tables, but the docs don't say that. I went through a few iterations of the wording to make this clearer and lande

Re: PG 18 release notes draft committed

2025-05-07 Thread David Rowley
On Fri, 2 May 2025 at 14:44, Bruce Momjian wrote: > You can see the most current HTML-built version here: > > https://momjian.us/pgsql_docs/release-18.html You might have left it out on purpose as the output isn't likely to be read by a machine, but for "Change pg_backend_memory_contexts.

Re: PG 18 release notes draft committed

2025-05-06 Thread David Rowley
On Wed, 7 May 2025 at 13:17, Bruce Momjian wrote: > > On Wed, May 7, 2025 at 09:27:14AM +1200, David Rowley wrote: > > On Wed, 7 May 2025 at 07:44, Bruce Momjian wrote: > > A query such as: SELECT * FROM table_with_lots_of_partitions ORDER BY > > col; will plan much fa

Re: PG 18 release notes draft committed

2025-05-06 Thread David Rowley
On Wed, 7 May 2025 at 07:44, Bruce Momjian wrote: > I think what you are saying is that this has to do with partition > processing of joins, but not the pruning process. I don't think a > non-partition joins are likely to hit 32 EquivalenceClasses. A query such as: SELECT * FROM table_with_lots_

Re: PG 18 release notes draft committed

2025-05-05 Thread David Rowley
On Tue, 6 May 2025 at 03:59, Bruce Momjian wrote: > > On Mon, May 5, 2025 at 09:42:10PM +1200, David Rowley wrote: > > I agree that 88f55bc97 and d69d45a5a should be in their own item. > > Likely no need to go into detail about the speed up being about > > "Equival

Re: PG 18 release notes draft committed

2025-05-05 Thread David Rowley
On Fri, 2 May 2025 at 14:44, Bruce Momjian wrote: > You can see the most current HTML-built version here: > > https://momjian.us/pgsql_docs/release-18.html Thanks for working on these. For "Improve the performance of hash joins (David Rowley)", 0f5738202 did the same

Re: PG 18 release notes draft committed

2025-05-05 Thread David Rowley
On Fri, 2 May 2025 at 16:01, Amit Langote wrote: > +Allow partitions to be pruned earlier and quicker, and skipped in > more places (Amit Langote, Ashutosh Bapat, Yuya Watari, David Rowley) > > Alternatively, 2 and 3 can be combined as: > 2. Speed up partition planni

Re: Introduce some randomness to autovacuum

2025-04-30 Thread David Rowley
On Thu, 1 May 2025 at 17:35, Junwang Zhao wrote: > > On Thu, May 1, 2025 at 8:12 AM David Rowley wrote: > > It sounds like the aim is to fix the problem with autovacuum vacuuming > > the same table over and over and being unable to remove enough dead > > tuples due to so

Re: Introduce some randomness to autovacuum

2025-04-30 Thread David Rowley
On Thu, 1 May 2025 at 03:29, Nathan Bossart wrote: > That being said, I am -1 for this proposal. Autovacuum parameters and > scheduling are already quite complicated, and making it nondeterministic > would add an additional layer of complexity (and may introduce its own > problems). But more imp

Re: Typo in multixact.c and jsonfuncs.c documentation

2025-04-30 Thread David Rowley
On Wed, 30 Apr 2025 at 19:13, Fujii Masao wrote: > This commit seems to have caused an indent-check failure on the buildfarm > member koel. > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=koel&dt=2025-04-30%2002%3A19%3A00 Thanks. Fixed. David

Re: Typo in multixact.c and jsonfuncs.c documentation

2025-04-29 Thread David Rowley
On Wed, 30 Apr 2025 at 13:25, Junwang Zhao wrote: > Found a trivial typo in multixact.c, after searching the code base > there is another such typo in jsonfuncs.c. Thanks. Pushed. David

Re: Get rid of integer divide in FAST_PATH_REL_GROUP() macro

2025-04-26 Thread David Rowley
On Sun, 27 Apr 2025 at 08:44, Tomas Vondra wrote: > Thanks. Those changes seem fine to me to. Thanks for checking. > Do you intend to push these, or do you want me to do it? I made a few tweaks to the comments and pushed. David

Re: Disallow redundant indexes

2025-04-24 Thread David Rowley
On Thu, 24 Apr 2025 at 21:27, Japin Li wrote: > I propose that PostgreSQL prevent redundant index creation by: > > - Checking for identical existing indexes during CREATE INDEX. > - Failing with an error (like Oracle's ORA-01408) if a duplicate is found. > - Providing a GUC parameter (allow_redund

Re: [PATCH] Documentation: Fix minor grammatical and formatting issues

2025-04-21 Thread David Rowley
On Tue, 22 Apr 2025 at 12:27, David G. Johnston wrote: > On Monday, April 21, 2025, David Rowley wrote: >> Does anyone have any opinion on the wording I'm proposing in the attached? > > I like it. It removes the problematic wording and moves the reference to > —all clo

Re: [PATCH] Documentation: Fix minor grammatical and formatting issues

2025-04-21 Thread David Rowley
On Tue, 22 Apr 2025 at 08:53, Noboru Saito wrote: > Regarding "to to" and "that that", I agree that they might be > technically acceptable. > However, I personally find them a bit harder to parse and they > increase the chance of misreading for me. > Your suggested alternative for "to to" is much

Re: [PATCH] Documentation: Fix minor grammatical and formatting issues

2025-04-20 Thread David Rowley
On Mon, 21 Apr 2025 at 15:06, Noboru Saito wrote: > 1. Remove unnecessary blank lines (blankline.diff) Looks good. > 2. Fix repeated "to to" in several command reference files (toto.diff) -Specifies the name of the database to connect to to discover which +Specifies the name of

Re: Typos in the code and README

2025-04-20 Thread David Rowley
On Mon, 21 Apr 2025 at 11:21, Michael Paquier wrote: > Thanks for sharing the command able to spot all these. I just pushed a few more. The previous regex didn't account for the duplicate word being on the next line. I dug up the following to find the ones just committed in 78eda9e26. ag "\s([a-

Re: Typos in the code and README

2025-04-20 Thread David Rowley
On Sun, 20 Apr 2025 at 13:32, David Rowley wrote: > Here are a few more fixes of a similar ilk. All new in 2025, > predominantly from the last few days before feature freeze. I've pushed those ones. David

Re: Typos in the code and README

2025-04-19 Thread David Rowley
On Sat, 19 Apr 2025 at 20:00, Alexander Lakhin wrote: > I've gathered the following collection of typos and inconsistencies > appeared since 2025-01-01: Here are a few more fixes of a similar ilk. All new in 2025, predominantly from the last few days before feature freeze. The few I wasn't a ful

Re: Align memory context level numbering in pg_log_backend_memory_contexts()

2025-04-18 Thread David Rowley
On Fri, 18 Apr 2025 at 20:54, Fujii Masao wrote: > Shouldn't the example output of pg_log_backend_memory_contexts() in > the documentation also be updated to use 1-based numbering for consistency? > Patch attached. Yeah. I failed to notice we had an example of the output. Want to take care of it

Re: Align memory context level numbering in pg_log_backend_memory_contexts()

2025-04-18 Thread David Rowley
On Fri, 18 Apr 2025 at 00:25, Rahila Syed wrote: > Regarding v2 patch, > - int level = 0; > > Retaining the level variable will enhance the code readability, IMO. When I read that, I suspected it might have been leftover from a refactor during the develop

Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment

2025-04-18 Thread David Rowley
On Wed, 16 Apr 2025 at 10:09, Ilia Evdokimov wrote: > I've prepared the updated patches as discussed, including the addition > of estimated lookups in the EXPLAIN output for Memoize. Please let me > know if you have any feedback or further suggestions. While this is fresh, as I might forget befor

Re: Fixup some appendPQExpBuffer() calls

2025-04-18 Thread David Rowley
On Thu, 17 Apr 2025 at 19:50, Daniel Gustafsson wrote: > > On 17 Apr 2025, at 01:44, David Rowley wrote: > > 1) Commit the attached to master > > 2) Do nothing. > > > > I'd like to do #1. > > I vote for #1 as well. Thanks for the judgment sense check. I

Re: n_ins_since_vacuum stats for aborted transactions

2025-04-18 Thread David Rowley
On Sat, 12 Apr 2025 at 07:33, Sami Imseih wrote: > What do you think of the attached? I looked at the v3 patch and I'm having trouble getting excited about it. I'd say this part is misleading: @@ -3956,7 +3961,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage

Re: Fixup some appendPQExpBuffer() calls

2025-04-16 Thread David Rowley
On Mon, 14 Apr 2025 at 19:39, David Rowley wrote: > 1) I should commit the attached 0002 patch now, or; > 2) Should commit only the new-to-v18 ones now, or; > 3) do nothing. > > I think #3 isn't a good option as we (or I) have made efforts in the > past to keep these in

Re: Align memory context level numbering in pg_log_backend_memory_contexts()

2025-04-16 Thread David Rowley
On Thu, 17 Apr 2025 at 02:20, torikoshia wrote: > Regarding the implementation: > In the initial patch attached, I naïvely incremented the level just > before emitting the log line. > However, it might be cleaner to simply initialize the level variable to > 1 from the start. This could help avoid

Re: Align memory context level numbering in pg_log_backend_memory_contexts()

2025-04-15 Thread David Rowley
On Wed, 16 Apr 2025 at 01:23, torikoshia wrote: >=# select name, level from pg_backend_memory_contexts; > name | level >+--- > TopMemoryContext | 1 >=# select pg_log_backend_memory_contexts(pg_backend_pid());

Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment

2025-04-15 Thread David Rowley
On Wed, 16 Apr 2025 at 04:25, Robert Haas wrote: > > On Mon, Apr 14, 2025 at 8:23 PM David Rowley wrote: > > "Estimates: capacity=N distinct keys=N lookups=N hit ratio=N.N%" > > Is lookups=N here the estimated number of lookups i.e. what we think > nloops will end

Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment

2025-04-15 Thread David Rowley
On Tue, 15 Apr 2025 at 21:44, Ilia Evdokimov wrote: > Wrapping the line in costs or verbose would help with test stability, but > doing that only to simplify test output doesn't look like the right reason. > Maybe it makes more sense to mask these values in other tests that use > Memoize too. W

Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment

2025-04-14 Thread David Rowley
On Tue, 15 Apr 2025 at 11:14, Ilia Evdokimov wrote: > On 14.04.2025 23:53, David Rowley wrote: > If we can't get consensus for everything people want to add at once > then maybe the patch could be broken into two, with 0001 being pretty > much the v4 patch and then have 0002 add

Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment

2025-04-14 Thread David Rowley
On Tue, 15 Apr 2025 at 04:31, Robert Haas wrote: > I don't think we should use ANALYZE for this because, IME, that should > just be about whether the query gets executed. Since this looks like > information that is available at plan time, I think it should be > displayed all the time or made conti

Fixup some appendPQExpBuffer() calls

2025-04-14 Thread David Rowley
In a similar effort to what I did in [1], there's a bunch of appendPQExpBuffer() calls that could use appendPQExpBufferStr() or appendPQExpBufferChar(). With [1], I've been keeping those appendStringInfo calls in check at this time of year for a few years now. With appendPQExpBuffer I've not been,

Get rid of integer divide in FAST_PATH_REL_GROUP() macro

2025-04-13 Thread David Rowley
I noticed a while ago that the new fast-path locking code uses integer division to figure out the fast-path locking group slot. To me, this seems a bit unnecessary as FastPathLockGroupsPerBackend is always a power-of-two value, so we can use bitwise-AND instead. I don't think FAST_PATH_REL_GROUP

Re: Can we use Statistics Import and Export feature to perforamance testing?

2025-04-12 Thread David Rowley
On Sat, 12 Apr 2025 at 20:29, Corey Huinker wrote: >> >> at the *actual size* of the relation and takes that into account when >> scaling the statistics (see table_block_relation_estimate_size() in >> tableam.c). If the table sizes don't match between the two servers >> then there's no guarantees

Re: n_ins_since_vacuum stats for aborted transactions

2025-04-10 Thread David Rowley
On Fri, 11 Apr 2025 at 02:01, Sami Imseih wrote: > I created an entry for the July CF > https://commitfest.postgresql.org/patch/5691/ > > ... and I realized I forgot to include David's code comment patch yesterday, > Reattaching both patches. I've pushed the code comment patch. For the docs patc

Re: pgsql: Convert 'x IN (VALUES ...)' to 'x = ANY ...' then appropriate

2025-04-10 Thread David Rowley
On Mon, 7 Apr 2025 at 19:39, Melanie Plageman wrote: > +++ > C:/cirrus/build/testrun/pg_upgrade/002_pg_upgrade/data/results/subselect.out > 2025-04-04 14:47:20.358393500 + > @@ -2769,15 +2769,16 @@ > EXPLAIN (COSTS OFF) > SELECT c.oid,c.relname FROM pg_class c JOIN pg_am a USING (oid) > WH

Re: Improve a few appendStringInfo calls new to v18

2025-04-10 Thread David Rowley
On Fri, 11 Apr 2025 at 02:51, Nathan Bossart wrote: > This probably isn't v18 material, but this reminds me of my idea to change > appendStringInfoString() into a macro for appendBinaryStringInfo() so that > the compiler can remove the runtime strlen() calls for string literals [0]. > In most case

Re: Improve a few appendStringInfo calls new to v18

2025-04-10 Thread David Rowley
On Thu, 10 Apr 2025 at 20:24, Heikki Linnakangas wrote: > > On 10/04/2025 06:51, David Rowley wrote: > > Any objections to doing this soonish? Or in a few weeks? > > Sure, let's do it. Why would we wait? Great. Pushed. Was considering waiting as I didn't know if the

Re: Add .DS_Store to .gitignore

2025-04-09 Thread David Rowley
On Thu, 10 Apr 2025 at 18:05, Dianjin Wang wrote: > Here is a patch to add the `.DS_Store` to the .gitignore to ignore the > `.DS_Store` file tracking by git when doing some development work in > the macOS terminal. It helps me and my community (Apache Cloudberry) a > lot. I hope this small patch

Re: Possible api miuse bms_next_member

2025-04-09 Thread David Rowley
On Thu, 10 Apr 2025 at 02:18, Tom Lane wrote: > If we did want to do something about this warning, rather than > hacking up the call sites I'd be inclined to invent something like > "bms_first_member()" which does the same thing but additionally > asserts on empty input. Not really convinced it's

Re: [PoC] Reducing planning time when tables have many partitions

2025-04-09 Thread David Rowley
On Thu, 10 Apr 2025 at 16:57, Amit Langote wrote: > > On Thu, Apr 10, 2025 at 12:03 PM David Rowley wrote: > > -Most operations on EquivalenceClasses should ignore child members. > > +Most operations on EquivalenceClasses needn't look at child members. > > > >

Improve a few appendStringInfo calls new to v18

2025-04-09 Thread David Rowley
Looks like v18 has grown a few appendStringInfo misusages, e.g. using appendStringInfo() when no formatting is needed or just using format "%s" instead of using appendStringInfoString(). I've attached a couple of patches. The 0001 is just my method for finding these, not for commit. 0002 contains

Re: [PoC] Reducing planning time when tables have many partitions

2025-04-09 Thread David Rowley
On Wed, 9 Apr 2025 at 17:38, Amit Langote wrote: > Still, maybe a tiny tweak to the last line could help steer readers > right without diving into storage. How about: > > Most operations on EquivalenceClasses should ignore child members, > which are stored separately from normal members. I think

Re: n_ins_since_vacuum stats for aborted transactions

2025-04-09 Thread David Rowley
On Thu, 10 Apr 2025 at 10:54, Sami Imseih wrote: > Fair enough, and I think I got my answers from this thread. This > design decision was not > discussed in the thread for b07642dbcd8. This discussion is mostly settled down now, but... I also went through that thread to see if it was mentioned a

Re: Memoize ANTI and SEMI JOIN inner

2025-04-09 Thread David Rowley
On Wed, 9 Apr 2025 at 18:48, Richard Guo wrote: > > On Thu, Mar 20, 2025 at 3:02 PM David Rowley wrote: > > For making this work, I think the attached should be about the guts of > > the code changes. I didn't look at the comments. Right now I can't > > think of

Re: [PoC] Reducing planning time when tables have many partitions

2025-04-08 Thread David Rowley
On Wed, 9 Apr 2025 at 17:09, Amit Langote wrote: > Should the following paragraph in src/backend/optimizer/README be > updated to reflect the new reality after recent changes? > > An EquivalenceClass can contain "em_is_child" members, which are copies > of members that contain appendrel pa

Re: Remove unnecessary static type qualifiers

2025-04-08 Thread David Rowley
On Wed, 9 Apr 2025 at 03:46, Peter Eisentraut wrote: > To avoid creating an array on the stack, you could maybe write it with a > pointer instead, like: > > const char * const query = "..."; > > I haven't tested whether that results in different or better compiled > code. The original code is pro

Re: Feature freeze

2025-04-08 Thread David Rowley
On Wed, 9 Apr 2025 at 03:54, Bruce Momjian wrote: > We did have this discussion when AoE was chosen for PG 18 and the idea > was that as long as it is before April 18 midnight wherever you are, it > is not feature freeze yet. I think it maybe once made sense for the moment to stop accepting new p

Re: [PoC] Reducing planning time when tables have many partitions

2025-04-08 Thread David Rowley
On Wed, 9 Apr 2025 at 02:24, Tom Lane wrote: > > David Rowley writes: > > I've pushed the patch now. Thanks for all the reviews of my adjustments. > > Shouldn't the CF entry be marked committed? I've done that now. 88f55bc97 added code to do faster lookups

Re: Can we use Statistics Import and Export feature to perforamance testing?

2025-04-08 Thread David Rowley
On Tue, 8 Apr 2025 at 12:21, Ryohei Takahashi (Fujitsu) wrote: > By using Statistics Import and Export feature, is it possible to achieve the > above request by following procedure? > (1) Export the statistics from production environment by using pg_dump > --statistics-only. > (2) On the staging

Re: [PoC] Reducing planning time when tables have many partitions

2025-04-07 Thread David Rowley
On Tue, 8 Apr 2025 at 17:41, Ashutosh Bapat wrote: > Thanks for listing all the patterns. Creating four different iterators > is going to affect functionality and might require duplicate code. But > each of the patterns is using exactly one BMS operation on em_relids > and relids being used as sea

Re: [PoC] Reducing planning time when tables have many partitions

2025-04-07 Thread David Rowley
On Tue, 8 Apr 2025 at 04:54, Ashutosh Bapat wrote: > - foreach(lc2, cur_ec->ec_members) > + setup_eclass_member_iterator(&it, cur_ec, rel); > + while ((cur_em = eclass_member_iterator_next(&it)) != NULL) > { > - EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc2); > - > /* > * Ignore chi

Re: [PoC] Reducing planning time when tables have many partitions

2025-04-06 Thread David Rowley
On Sat, 5 Apr 2025 at 16:55, David Rowley wrote: > I am still thinking about the duplicate members being returned from > the iterator for child join rels due to them being duplicated into > each component relid element in ec_childmembers. I did consider if > these could just not be du

Re: Hashed IN only applied to first encountered IN

2025-04-05 Thread David Rowley
On Wed, 2 Apr 2025 at 00:51, David Geier wrote: > The hashed IN optimization is only applied to the first encountered > ScalarArrayOpExpr during the expression tree traversal in > convert_saop_to_hashed_saop_walker(). Reason being that the walker > returns true which aborts the traversal. > I've

Re: Add mention in docs about locking all partitions for generic plans

2025-04-05 Thread David Rowley
On Mon, 24 Mar 2025 at 19:50, Tender Wang wrote: > > David Rowley 于2025年3月24日周一 05:28写道: >> This is no longer true in master, so if we do something here it's only >> v17 and earlier. > > In the case of [1], we still have AccessShareLock on entity_2, even though i

Re: BTScanOpaqueData size slows down tests

2025-04-05 Thread David Rowley
On Thu, 3 Apr 2025 at 04:21, Andres Freund wrote: > I was mildly > surprised to see how expensive the new compact attribute checks are. Is this a fairly deform-heavy workload? FWIW, before adding that Assert, I did test to see if there was any measurable impact on the time it took to run all the

Re: [PoC] Reducing planning time when tables have many partitions

2025-04-05 Thread David Rowley
ry for eclass_indexes_array when root->append_rel_list is empty? 12. join_rel_list_index isn't being initialized in all the places that do makeNode(RelOptInfo); Maybe -1 is a good initial value? (See my next point) 13. RelOptInfo.join_rel_list_index is an index into PlannerInfo.join_rel_list. I

Re: Add mention in docs about locking all partitions for generic plans

2025-04-05 Thread David Rowley
On Mon, 24 Mar 2025 at 22:19, Tender Wang wrote: >> Maybe I was wrong about writing nothing in master's docs. It might >> still be important to detail this. I don't know the best way to phrase >> that, but maybe something along the lines of: "The query planner >> obtains locks for all partitions w

Re: [PoC] Reducing planning time when tables have many partitions

2025-04-04 Thread David Rowley
On Sat, 5 Apr 2025 at 04:05, Tom Lane wrote: > This patchset has a distinct whiff of unseemly haste. hmm, yes. I would like to give this patch as good a chance at making v18 as I can, and I admit to having optimised for that. Seemingly, we've got a few other good partitioning performance patches

Re: Reducing memory consumed by RestrictInfo list translations in partitionwise join planning

2025-04-04 Thread David Rowley
On Sat, 29 Mar 2025 at 05:46, Ashutosh Bapat wrote: > PFA patches. 0001 and 0002 are the same as the previous set. 0003 > changes the initial hash table size to the length of ec_derives. I'm just not following the logic in making it the length of the ec_derives List. If you have 32 buckets and tr

Re: [PoC] Reducing planning time when tables have many partitions

2025-04-04 Thread David Rowley
Thank you for having a look at this. On Fri, 4 Apr 2025 at 21:47, Amit Langote wrote: > It looks to me like the following hunks in 0002 probably belong in > 0001, unless you’re planning to commit the patches together anyway: Ah, yeah. Unsure about that as yet, but I've moved it over. > The comm

Re: [PoC] Reducing planning time when tables have many partitions

2025-04-04 Thread David Rowley
On Sat, 5 Apr 2025 at 02:54, Ashutosh Bapat wrote: > I haven't measured if the patches improve performance of simple scans > with thousands of partitions. Have you tried measuring that? I just tried 10k partitions on my Zen4 laptop. create table lp (a int) partition by list(a); select 'create ta

Re: [PoC] Reducing planning time when tables have many partitions

2025-04-03 Thread David Rowley
On Fri, 4 Apr 2025 at 00:34, David Rowley wrote: > I've attached 2 patches, which I think addresses most of this, aside > from the last point. > > These do need more work. I've just attached what I have so far before > I head off for the day. I am planning on running

Re: [PoC] Reducing planning time when tables have many partitions

2025-04-03 Thread David Rowley
On Tue, 25 Mar 2025 at 06:49, Tom Lane wrote: > I finally made some time to look at this patchset, and I'm pretty > disappointed, because after 35 versions I'd expect to see something > that looks close to committable. This doesn't really. I like the > basic idea of taking child EC members out o

Re: Add mention in docs about locking all partitions for generic plans

2025-04-01 Thread David Rowley
On Mon, 31 Mar 2025 at 12:19, David Rowley wrote: > I'll push these in the next few days unless anyone else wants to voice > their opinions. Thanks for the review. Pushed. David

Re: Hashed IN only applied to first encountered IN

2025-04-01 Thread David Rowley
On Wed, 2 Apr 2025 at 01:31, David Rowley wrote: > > On Wed, 2 Apr 2025 at 00:51, David Geier wrote: > > The hashed IN optimization is only applied to the first encountered > > ScalarArrayOpExpr during the expression tree traversal in > > convert_saop_to_hashed_saop_wal

Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment

2025-04-01 Thread David Rowley
On Tue, 1 Apr 2025 at 20:52, Ilia Evdokimov wrote: > With the feature freeze coming up soon, I’d like to ask: do we plan to > include this patch in v18? I don't think there's a clear enough consensus yet on what EXPLAIN should display. We'd need that beforehand. There are now less than 7 days to

Re: pgsql: Add memory/disk usage for Window aggregate nodes in EXPLAIN.

2025-03-31 Thread David Rowley
On Tue, 1 Apr 2025 at 09:40, Christoph Berg wrote: > =# explain (analyze,buffers off,costs off) select sum(n) over() from > generate_series(1,2048) a(n); > QUERY PLAN >

Re: pgsql: Add memory/disk usage for Window aggregate nodes in EXPLAIN.

2025-03-31 Thread David Rowley
On Tue, 1 Apr 2025 at 04:40, Christoph Berg wrote: > - Storage: Disk Maximum Storage: NkB > + Storage: Memory Maximum Storage: NkB > -> Function Scan on generate_series a (actual time=N.N..N.N rows=N.N > loops=N) We'll probably just need to bump that 2000 row count to something a bit

Re: Memoize ANTI and SEMI JOIN inner

2025-03-31 Thread David Rowley
On Mon, 31 Mar 2025 at 22:03, Richard Guo wrote: > I reviewed this patch and I have some concerns about the following > code: > > if (extra->inner_unique && > (inner_path->param_info == NULL || > bms_num_members(inner_path->param_info->ppi_serials) < > list_length(ext

Re: Memoize ANTI and SEMI JOIN inner

2025-03-30 Thread David Rowley
On Mon, 31 Mar 2025 at 15:33, Alena Rybakina wrote: > I believe it's worth asserting that both inner_unique and single_mode are not > true at the same time — just as a safety check. add_paths_to_joinrel() just chooses not to populate inner_unique for SEMI and ANTI joins because, as of today's ma

Re: Memoize ANTI and SEMI JOIN inner

2025-03-30 Thread David Rowley
On Mon, 31 Mar 2025 at 16:21, Alena Rybakina wrote: > However, is it necessary to check that extra->inner_unique must be false for > SEMI/ANTI joins here, or am I missing something? It looks a little confusing > at this point. If it is necessary, I don't see the reason for it. It was me that wo

Re: Query ID Calculation Fix for DISTINCT / ORDER BY and LIMIT / OFFSET

2025-03-26 Thread David Rowley
On Thu, 27 Mar 2025 at 14:51, Michael Paquier wrote: > One habit that I've found really useful to do when it comes to this > area of the code is to apply the tests first to show what kind of > behavior we had before changing the jumbling, then apply the update to > the query jumbling. This has tw

Re: Query ID Calculation Fix for DISTINCT / ORDER BY and LIMIT / OFFSET

2025-03-26 Thread David Rowley
On Thu, 27 Mar 2025 at 18:03, Sami Imseih wrote: > Maybe I am missing something, but when I applied the > v9-0001 patch alone, it did not solve the problem it > claims to and the pg_s_s regression test fails: That was an artifact of me not having made the split quite in the right place when divid

  1   2   3   4   5   6   7   8   9   10   >