Re: On login trigger: take three

2023-09-30 Thread Alexander Korotkov
On Fri, Sep 29, 2023 at 1:15 PM Daniel Gustafsson  wrote:
> > On 28 Sep 2023, at 23:50, Alexander Korotkov  wrote:
>
> > I don't think I can reproduce the performance regression pointed out
> > by Pavel Stehule [1].
>
> > I can't confirm the measurable overhead.
>
>
> Running the same pgbench command on my laptop, looking at the average
> connection times and averaging those over five runs (low/avg/high), I see a ~5%
> increase over master with the patched version (compiled without assertions and debug):
>
> Patched event_triggers on:  6.858 ms/7.038 ms/7.434 ms
> Patched event_triggers off: 6.601 ms/6.958 ms/7.539 ms
> Master: 6.676 ms/6.697 ms/6.760 ms
>
> This is all quite unscientific with a lot of jitter so grains of salt are to be
> applied, but I find it odd that you don't see any measurable effect.  Are you
> seeing the same/similar connection times between master and with this patch
> applied?

Thank you for doing experiments on your side. I've rechecked. It
appears that I didn't do enough runs, so the overhead was lost in the
measurement error. Now I can also confirm the ~5% overhead.

I spent some time thinking about how to overcome this overhead, but I
didn't find a brilliant option.  Previously a pg_database flag was
proposed, but it was criticized as complex and error-prone.  I can also
imagine a shmem caching mechanism, but that would require overcoming
possible race conditions between shared cache invalidation and
transaction commit etc., so it would also be complex and
error-prone.  Any better ideas?
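
For anyone following along, a minimal sketch of the kind of trigger whose mere
possibility causes the per-connection check being measured here (assuming the
"login" event name and the event_triggers GUC as used in the patch; function
and trigger names are made up):

CREATE FUNCTION notify_on_login() RETURNS event_trigger AS $$
BEGIN
    RAISE NOTICE 'welcome, %', current_user;
END;
$$ LANGUAGE plpgsql;

CREATE EVENT TRIGGER login_notice ON login
    EXECUTE FUNCTION notify_on_login();

-- If such a trigger throws an error, login is blocked; per the review comment
-- below, the docs should point at disabling event triggers via the
-- event_triggers GUC rather than at single-user mode.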

> A few small comments on the patch:
>
> + prevent successful login to the system. Such bugs may be fixed by
> + restarting the system in single-user mode (as event triggers are
> This paragraph should be reworded to recommend the GUC instead of single-user
> mode (while retaining mention of single-user mode, just not as the primary
> option).
>
>
> + Also, it's recommended to evade long-running queries in
> s/evade/avoid/ perhaps?

Fixed.

> Thanks for working on this!

Thank you as well!

--
Regards,
Alexander Korotkov


0001-Add-support-of-event-triggers-on-authenticated-l-v42.patch
Description: Binary data


commitfest app down for repairs

2023-09-30 Thread Joe Conway
The commitfest app is down for repairs. We will follow up here once it
is back up.


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: commitfest app down for repairs

2023-09-30 Thread Joe Conway

On 9/30/23 08:47, Joe Conway wrote:

The commitfest app is down for repairs. We will follow up here once it
is back up.

The commitfest app is back up.

We restored from a backup taken one day prior. We will take a look at what
changed in between, but it might be up to folks to redo some things.


A cooling-off period was added to the commitfest app for new community
accounts, similar to what was done with the wiki a few years ago.


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: bug: ANALYZE progress report with inheritance tables

2023-09-30 Thread Heikki Linnakangas

On 28/09/2023 19:06, Heikki Linnakangas wrote:

On 22/01/2023 18:23, Justin Pryzby wrote:

pg_stat_progress_analyze was added in v13 (a166d408e).

For tables with inheritance children, do_analyze_rel() and
acquire_sample_rows() are called twice.  The first time through,
pgstat_progress_start_command() has memset() the progress array to zero.

But the 2nd time, ANALYZE_BLOCKS_DONE is already set from the previous
call, and BLOCKS_TOTAL can be set to some lower value (and in any case a
value unrelated to the pre-existing value of BLOCKS_DONE).  So the
progress report briefly shows a bogus combination of values and, with
these assertions, fails regression tests in master and v13, unless
BLOCKS_DONE is first zeroed.


Good catch!

I think the counts need to be reset even earlier, in
acquire_inherited_sample_rows(), at the same time that we update
PROGRESS_ANALYZE_CURRENT_CHILD_TABLE_RELID. See attached patch.
Otherwise, there's a brief moment where we have already updated the
child table ID, but PROGRESS_ANALYZE_BLOCKS_TOTAL and
PROGRESS_ANALYZE_BLOCKS_DONE still show the counts from the previous
child table. And if it's a foreign table, the FDW's sampling function
might not update the progress report at all, in which case the old
values will be displayed until the table is fully processed.
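
To see the window in question, one can poll the progress view from a second
session while an inheritance parent is being analyzed (table name here is
hypothetical; the view and its columns are as documented):

-- session 1
ANALYZE parent;

-- session 2, repeatedly
SELECT relid::regclass, current_child_table_relid::regclass,
       sample_blks_total, sample_blks_scanned,
       child_tables_total, child_tables_done
FROM pg_stat_progress_analyze;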


Committed and backported. Thank you!

--
Heikki Linnakangas
Neon (https://neon.tech)





Remove ParallelReadyList and worker_spi_state from typedefs.list

2023-09-30 Thread vignesh C
Hi,

Remove ParallelReadyList and worker_spi_state from typedefs.list;
these structures were removed by earlier commits:
ParallelReadyList was removed as part of
9bfd44bbde4261181bf94738f3b041c629c65a7e and worker_spi_state was removed
as part of af720b4c50a122647182f4a030bb0ea8f750fe2f.
Attached is a patch fixing this.

Regards,
Vignesh
From 8f5b1514cbbbddbd3cd42ba0d847c9cc3d619f6c Mon Sep 17 00:00:00 2001
From: Vignesh C 
Date: Sat, 30 Sep 2023 20:11:08 +0530
Subject: [PATCH] Remove ParallelReadyList and worker_spi_state from
 typedefs.list

Remove ParallelReadyList and worker_spi_state from typedefs.list, these
structures have been removed as part of an earlier commit.
---
 src/tools/pgindent/typedefs.list | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 8de90c4958..d5a26c946f 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1913,7 +1913,6 @@ ParallelHashJoinBatch
 ParallelHashJoinBatchAccessor
 ParallelHashJoinState
 ParallelIndexScanDesc
-ParallelReadyList
 ParallelSlot
 ParallelSlotArray
 ParallelSlotResultHandler
@@ -3868,7 +3867,6 @@ wchar2mb_with_len_converter
 wchar_t
 win32_deadchild_waitinfo
 wint_t
-worker_spi_state
 worker_state
 worktable
 wrap
-- 
2.34.1



Re: Remove ParallelReadyList and worker_spi_state from typedefs.list

2023-09-30 Thread Tom Lane
vignesh C  writes:
> Remove ParallelReadyList and worker_spi_state from typedefs.list;
> these structures were removed by earlier commits:
> ParallelReadyList was removed as part of
> 9bfd44bbde4261181bf94738f3b041c629c65a7e and worker_spi_state was removed
> as part of af720b4c50a122647182f4a030bb0ea8f750fe2f.
> Attached is a patch fixing this.

I don't think we need to trouble with removing such entries by hand.
I still anticipate updating typedefs.list from the buildfarm at least
once per release cycle, and that will take care of cleaning out
obsolete entries.

regards, tom lane




Re: [feature]COPY FROM enable FORCE_NULL/FORCE_NOT_NULL on all columns

2023-09-30 Thread Andrew Dunstan


On 2023-07-31 Mo 20:46, Zhang Mingli wrote:




On Aug 1, 2023, at 03:35, Andrew Dunstan  wrote:

I was hoping to be able to get to it today but that's not happening.
If you want to submit a revised patch as above that will be good. I
hope to get to it later this week.


Hi Andrew,

Patch rebased and updated as above, thanks.



Pushed at last, thanks.
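
For anyone skimming the archive, the shorthand the patch adds looks roughly
like this (table and file name are hypothetical; the '*' stands in for the
previously required explicit column list, and FORCE_NOT_NULL accepts it the
same way):

COPY items FROM '/tmp/items.csv'
    WITH (FORMAT csv, FORCE_NULL *);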


cheers


andrew


--
Andrew Dunstan
EDB:https://www.enterprisedb.com


Re: Eliminate redundant tuple visibility check in vacuum

2023-09-30 Thread Andres Freund
Hi,

On 2023-09-28 11:25:04 -0400, Robert Haas wrote:
> I went ahead and committed 0001. If Andres still wants to push for
> more renaming there, that can be a follow-up patch.

Agreed.

> And we can see if he or anyone else has any comments on this new version of
> 0002. To me we're down into the level of details that probably don't matter
> very much one way or the other, but others may disagree.

The only thought I have is that it might be worth amending the comment in
lazy_scan_prune() to mention that such a tuple won't need to be frozen,
because it was visible to another session when vacuum started.

Greetings,

Andres Freund




Re: Annoying build warnings from latest Apple toolchain

2023-09-30 Thread Andres Freund
Hi,

On 2023-09-29 12:14:40 -0400, Tom Lane wrote:
> I wrote:
> > Andres Freund  writes:
> >> On 2023-09-28 16:46:08 -0400, Tom Lane wrote:
> >>> Well, it's only important on platforms where we can't restrict
> >>> libpq.so from exporting all symbols.  I don't know how close we are
> >>> to deciding that such cases are no longer interesting to worry about.
> >>> Makefile.shlib seems to know how to do it everywhere except Windows,
> >>> and I imagine we know how to do it over in the MSVC scripts.
> 
> >> Hm, then I'd argue that we don't need to care about it anymore. The meson
> >> build does the necessary magic on windows, as do the current msvc scripts.
> 
> > If we take that argument seriously, then I'm inclined to adjust my
> > upthread patch for Makefile.global.in so that it removes the extra
> > inclusions of libpgport/libpgcommon everywhere, not only macOS.
> > The rationale would be that it's not worth worrying about ABI
> > stability details on any straggler platforms.
> 
> Looking closer, it's only since v16 that we have export list support
> on all officially-supported platforms.

Oh, right, before that Solaris didn't support it. I guess we could backpatch
that commit, it's simple enough, but I don't think I care enough about Solaris
to do so.


> Therefore, I think the prudent thing to do in the back branches is use the
> patch I posted before, to suppress the duplicate -l switches only on macOS.
> In HEAD, I propose we simplify life by doing it everywhere, as attached.

Makes sense.

Greetings,

Andres Freund




Re: Annoying build warnings from latest Apple toolchain

2023-09-30 Thread Tom Lane
Andres Freund  writes:
> On 2023-09-29 12:14:40 -0400, Tom Lane wrote:
>> Looking closer, it's only since v16 that we have export list support
>> on all officially-supported platforms.

> Oh, right, before that Solaris didn't support it. I guess we could backpatch
> that commit, it's simple enough, but I don't think I care enough about Solaris
> to do so.

HPUX would be an issue too, as we never did figure out how to do
export control on that.  However, I doubt it would be a great idea
to back-patch export control in minor releases, even if we had
the patch at hand.  That would be an ABI break of its own, although
it'd only affect clients that hadn't done things quite correctly.

>> Therefore, I think the prudent thing to do in the back branches is use the
>> patch I posted before, to suppress the duplicate -l switches only on macOS.
>> In HEAD, I propose we simplify life by doing it everywhere, as attached.

> Makes sense.

Done that way.  Were you going to push the -Wl,-exported_symbols_list
change?

regards, tom lane




Re: Annoying build warnings from latest Apple toolchain

2023-09-30 Thread Andres Freund
Hi,

On 2023-09-30 13:28:01 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2023-09-29 12:14:40 -0400, Tom Lane wrote:
> >> Looking closer, it's only since v16 that we have export list support
> >> on all officially-supported platforms.
> 
> > Oh, right, before that Solaris didn't support it. I guess we could backpatch
> > that commit, it's simple enough, but I don't think I care enough about
> > Solaris to do so.
> 
> HPUX would be an issue too, as we never did figure out how to do
> export control on that.

Ah, right.


> However, I doubt it would be a great idea
> to back-patch export control in minor releases, even if we had
> the patch at hand.  That would be an ABI break of its own, although
> it'd only affect clients that hadn't done things quite correctly.

Agreed.


> >> Therefore, I think the prudent thing to do in the back branches is use the
> >> patch I posted before, to suppress the duplicate -l switches only on macOS.
> >> In HEAD, I propose we simplify life by doing it everywhere, as attached.
> 
> > Makes sense.
> 
> Done that way.  Were you going to push the -Wl,-exported_symbols_list
> change?

Done now.

Greetings,

Andres Freund




Re: SHARED locks barging behaviour

2023-09-30 Thread Andres Freund
Hi,

On 2023-09-30 00:50:11 +0200, Laurenz Albe wrote:
> On Fri, 2023-09-29 at 17:45 -0400, Bruce Momjian wrote:
> > On Tue, Jan 17, 2023 at 12:18:28PM -0500, Arul Ajmani wrote:
> > > I'm trying to better understand the following barging behaviour with
> > > SHARED locks.
> > ...
> > > Given there is a transaction waiting to acquire a FOR UPDATE lock, I was
> > > surprised to see the second FOR SHARE transaction return immediately
> > > instead of waiting. I have two questions:
> > > 
> > > 1) Could this barging behaviour potentially starve out the transaction
> > > waiting to acquire the FOR UPDATE lock, if there is a continuous queue of
> > > transactions that acquire a FOR SHARE lock briefly?
> > 
> > Yes, see below.
> > 
> > > 2) Assuming this is by design, I couldn't find (in code) where this
> > > explicit policy choice is made. I was looking around LockAcquireExtended,
> > > but it seems like the decision is made above this layer. Could someone
> > > more familiar with this code point me at the right place?
> > 
> > I know this is from January, but I do have an answer.  [...]
> 
> You answer the question of where this is implemented.  But the more important
> question is whether this is intentional.  This code was added by 0ac5ad5134f
> (introducing FOR KEY SHARE and FOR NO KEY UPDATE).  My feeling is that it is
> not intentional that a continuous stream of share row locks can starve out an
> exclusive row lock, since PostgreSQL behaves differently with other locks.
> 
> On the other hand, if nobody has complained about it in these ten years,
> perhaps it is just fine the way it is, whether by design or not.

I'd be very hesitant to change the behaviour at this point - the likelihood of
existing workloads slowing down substantially, or even breaking due to an
additional source of deadlocks, seems substantial.
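
For readers who want to see the behaviour being discussed, a minimal
reproduction with three sessions (hypothetical table t with one row):

-- session 1
BEGIN;
SELECT * FROM t WHERE id = 1 FOR SHARE;    -- acquires the row lock

-- session 2
BEGIN;
SELECT * FROM t WHERE id = 1 FOR UPDATE;   -- blocks behind session 1

-- session 3
BEGIN;
SELECT * FROM t WHERE id = 1 FOR SHARE;    -- returns immediately, "barging"
                                           -- past the waiting FOR UPDATE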

Greetings,

Andres Freund




Re: Annoying build warnings from latest Apple toolchain

2023-09-30 Thread Andres Freund
Hi,

On 2023-09-29 11:11:49 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2023-09-28 22:53:09 -0400, Tom Lane wrote:
> >> (Perhaps we should apply the above to HEAD alongside the meson.build fix,
> >> to get more test coverage?)
> 
> > The macOS BF animals seem to run Ventura, so I think it'd not really provide
> > additional coverage beyond what CI and your manual testing already have. So
> > probably not worth it from that angle?
> 
> My thought was that if it's in the tree we'd get testing from
> non-buildfarm sources.

I'm not against it, but it also doesn't quite seem necessary, given your
testing on Catalina.


> FWIW, I'm going to update sifaka/indri to Sonoma before too much longer
> (they're already using Xcode 15.0 which is the Sonoma toolchain).
> I recently pulled longfin up to Ventura, and plan to leave it on that
> for the next year or so.  I don't think anyone else is running macOS
> animals.

It indeed looks like nobody is. I wonder if it's worth setting up a gcc macOS
animal; it's not too hard to imagine that breaking.

Greetings,

Andres Freund




Fix receiving large legal tsvector from binary format

2023-09-30 Thread Ерохин Денис Владимирович
Hello,

 

There is a problem with receiving a large tsvector from binary format: it fails
with the error "invalid tsvector: maximum total lexeme length exceeded".
Simple steps to reproduce the problem:

- Create a table with one column of type 'tsvector'

- Add a row with a large tsvector (900 kB will be enough)

- Save the table to a file with "copy to with binary"

- Restore the table from the file with "copy from with binary"

 

At the last step we can't restore a legal tsvector, because the required
memory size for the tsvector is miscalculated while reading its binary
representation.
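
A concrete version of those steps (names are made up; the tsvector only needs
to be large but still legal, i.e. under the maximum total lexeme length):

CREATE TABLE tsv_test (v tsvector);
INSERT INTO tsv_test
    SELECT string_agg(md5(i::text) || ':1', ' ')::tsvector
    FROM generate_series(1, 28000) AS i;     -- roughly 900 kB of lexemes

\copy tsv_test TO 'tsv.bin' WITH (FORMAT binary)
TRUNCATE tsv_test;
\copy tsv_test FROM 'tsv.bin' WITH (FORMAT binary)
-- the last step is where the reported error appears:
-- invalid tsvector: maximum total lexeme length exceeded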

 

It seems function "tsvectorrecv" can be improved:

v1-0001-Fix-receiving-of-big-tsvector-from-binary - patch fixes problem of
required size calculation for tsvector where wrong type is used (WordEntry
instead of WordEntryPos). Size of wrong type is bigger than type of needed
type therefore total size turns out larger than needed. So next test of
that size fails during maximum size check.

v1-0002-Replace-magic-one-in-tsvector-binary-receiving - patch removes
magic ones from code. Those ones are used to add "sizeof(uint16)" to
required size where number of WordEntryPos'es will be stored. Now it works
only because size of WordEntryPos is equal to size of uint16. But it is
not obviously during code reading and causes question "what for this magic
one is?"

v1-0003-Optimize-memory-allocation-during-tsvector-binary - patch makes
memory allocation more accuracy and rarely. It seems memory to store
tsvector's data is allocated redundant and reallocations happen more often
than necessary.

 

Best regards,

Denis Erokhin

Jatoba

 



v1-0001-Fix-receiving-of-big-tsvector-from-binary.patch
Description: Binary data


v1-0002-Replace-magic-one-in-tsvector-binary-receiving.patch
Description: Binary data


v1-0003-Optimize-memory-allocation-during-tsvector-binary.patch
Description: Binary data


Doc: Minor update for enable_partitionwise_aggregate

2023-09-30 Thread Andy Atkinson
Hello. While reading the docs for the enable_partitionwise_aggregate
parameter on the Query Planning page, I thought the description had a small
mistake and could be improved.

The current wording is: "which allows grouping or aggregation on a
partitioned tables performed separately "

Page: https://www.postgresql.org/docs/current/runtime-config-query.html

I think possible better alternatives could be:

   - (Option 1) a "partitioned table's partitions" (the possessive form of
   "it's"). The "enable_partition_pruning" parameter uses "the partitioned
   table's partitions" in this form. I think this option is good, but I had a
   slight preference for option 2.
   - (Option 2) Or to just cut out the first part and say "to be performed
   separately for each partition", which seemed simpler. So the sentence
   reads: "which allows grouping or aggregation to be performed separately for
   each partition"
   - (Option 3) dropping the "a" so it says "which allows grouping or
   aggregation on partitioned tables performed separately". I don't think this
   is as good though because the aggregation happens on the partitions, so it
   feels slightly off to me to say the "partitioned tables" instead of the
   partitions.

I tested toggling this parameter on and off with a test partitioned table,
and looked at the query execution plan, and saw how the aggregation
happened on the partitions first when it was enabled.
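
For illustration, the behaviour being described (a minimal sketch with a
hypothetical partitioned table; only the parameter name comes from the docs):

CREATE TABLE measurements (city_id int, reading numeric) PARTITION BY HASH (city_id);
CREATE TABLE measurements_p0 PARTITION OF measurements FOR VALUES WITH (MODULUS 2, REMAINDER 0);
CREATE TABLE measurements_p1 PARTITION OF measurements FOR VALUES WITH (MODULUS 2, REMAINDER 1);

SET enable_partitionwise_aggregate = on;
EXPLAIN (COSTS OFF)
SELECT city_id, avg(reading) FROM measurements GROUP BY city_id;
-- with the parameter on, the planner can aggregate each partition separately
-- (an Append over per-partition aggregate nodes), i.e. the grouping is
-- performed separately for each partition rather than on the appended rows
-- as a whole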

This is my first ever submission to pgsql-hackers. :) I used this guide
from Lætitia to prepare the patch file for Option 2 above, which is
attached. I am having a problem with the "make STYLE=website html" step, so
I haven't seen the preview yet (still fixing this up).
https://mydbanotebook.org/post/patching-doc/

Let me know what you think!

Thanks.
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 924309af26..167248a5da 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -5271,9 +5271,8 @@ ANY num_sync ( 

Re: Various small doc improvements; plpgsql, schemas, permissions, oidvector

2023-09-30 Thread Karl O. Pinc
Version 4.

Added: v4-0012-Explain-role-management.patch

On Mon, 25 Sep 2023 23:37:44 -0500
"Karl O. Pinc"  wrote:

> > On Mon, 25 Sep 2023 17:55:59 -0500
> > "Karl O. Pinc"  wrote:
> >   
> > > On Mon, 25 Sep 2023 14:14:37 +0200
> > > Daniel Gustafsson  wrote:
> >   
> > > > Once done you can do "git format-patch origin/master -v 1" which
> > > > will generate a set of n patches named v1-0001 through v1-000n.

> > > I am not particularly confident in the top-line commit
> > > descriptions.
> >   
> > > The bulk of the commit descriptions are very wordy
> >   
> > > Listing all the attachments here for future discussion: 

v4-0001-Change-section-heading-to-better-reflect-saving-a.patch
v4-0002-Change-section-heading-to-better-describe-referen.patch
v4-0003-Better-section-heading-for-plpgsql-exception-trap.patch
v4-0004-Describe-how-to-raise-an-exception-in-the-excepti.patch
v4-0005-Improve-sentences-in-overview-of-system-configura.patch
v4-0006-Provide-examples-of-listing-all-settings.patch
v4-0007-Cleanup-summary-of-role-powers.patch
v4-0008-Explain-the-difference-between-role-attributes-an.patch
v4-0009-Document-the-oidvector-type.patch
v4-0010-Improve-sentences-about-the-significance-of-the-s.patch
v4-0011-Add-a-sub-section-to-describe-schema-resolution.patch
v4-0012-Explain-role-management.patch

Regards,

Karl 
Free Software:  "You don't pay back, you pay forward."
 -- Robert A. Heinlein
From 122665c4155698abe88e2bd17639a991791b94e3 Mon Sep 17 00:00:00 2001
From: "Karl O. Pinc" 
Date: Sun, 24 Sep 2023 15:49:30 -0500
Subject: [PATCH v4 01/12] Change section heading to better reflect saving a
 result in variable(s)

The current section title of "Executing a Command with a Single-Row
Result" does not reflect what the section is really about.  Other
sections make clear how to _execute_ commands, single-row result or not.
What this section is about is how to _save_ a single row of results into
variable(s).

It would be nice to talk about saving results into variables in the
section heading but I couldn't come up with anything pithy.  "Saving a
Single-Row of a Command's Result" seems good enough, especially since
there are few other places to save results other than in variables.
---
 doc/src/sgml/plpgsql.sgml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml
index f55e901c7e..8747e84245 100644
--- a/doc/src/sgml/plpgsql.sgml
+++ b/doc/src/sgml/plpgsql.sgml
@@ -1126,7 +1126,7 @@ PERFORM create_mv('cs_session_page_requests_mv', my_query);

 

-Executing a Command with a Single-Row Result
+Saving a Single-Row of a Command's Result
 
 
  SELECT INTO
-- 
2.30.2

From 4de4a31d41124dfa793cc5cce0516673811ea414 Mon Sep 17 00:00:00 2001
From: "Karl O. Pinc" 
Date: Sun, 24 Sep 2023 15:52:21 -0500
Subject: [PATCH v4 02/12] Change section heading to better describe reference
 of existing types

The section heading of "Copying Types" does not reflect what the
section is about.  It is not about making copies of data types but
about using the data type of existing columns (or rows) in new type
declarations without having to know what the existing type is.

"Re-Using the Type of Columns and Variables" seems adequate.  Getting
something in there about declartions seems too wordy.  I thought
perhaps "Referencing" instead of "Re-Using", but "referencing" isn't
perfect and "re-using" is generic enough, shorter, and simpler to read.
---
 doc/src/sgml/plpgsql.sgml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml
index 8747e84245..874578265e 100644
--- a/doc/src/sgml/plpgsql.sgml
+++ b/doc/src/sgml/plpgsql.sgml
@@ -672,7 +672,7 @@ DECLARE

 
   
-   Copying Types
+   Re-Using the Type of Columns and Variables
 
 
 variable%TYPE
-- 
2.30.2
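
For context, the kind of declaration the renamed section documents — re-using
an existing column's type — looks like this (a minimal PL/pgSQL sketch; the
users table is hypothetical):

DO $$
DECLARE
    -- my_user_id automatically follows whatever type users.id currently has
    my_user_id users.id%TYPE;
BEGIN
    SELECT id INTO my_user_id FROM users LIMIT 1;
    RAISE NOTICE 'first id: %', my_user_id;
END;
$$;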

From 80c2b8ef7ad6e610f5c7bdc61b827983a87110e2 Mon Sep 17 00:00:00 2001
From: "Karl O. Pinc" 
Date: Sun, 24 Sep 2023 16:03:29 -0500
Subject: [PATCH v4 03/12] Better section heading for plpgsql exception
 trapping

The docs seem to use "error" and "exception" interchangeably, perhaps
50% each.  But they never say that they are the same thing, and in the
larger world they are not.  Errors tend to be something that drops on
the floor and usually halts execution, whereas exceptions can be trapped
and give the programmer more control over the flow of the program.
(Although, to be fair, exceptions are a subset of errors.)  "Trapping
Errors" is not a good section title for these reasons, and because
when it comes to programmatically raising errors in Pl/PgSQL you
don't raise errors, you raise exceptions.  The current section heading
does not stand out in a scan of the table of contents when you're
looking for exception handling, IMHO.

"Error Processing and Trapping Exceptions" is a little long but it
does accurately reflect that the section is about how Pl/PgSQL behaves
under error conditions with quite a lot about how

Re: POC: GROUP BY optimization

2023-09-30 Thread Andrei Lepikhov
New version of the patch. Fixed minor inconsistencies and rebased onto 
current master.
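
For readers skimming the thread, a sketch of what the patch targets (the GUC
name enable_group_by_reordering comes from the patch; the table is
hypothetical):

CREATE TABLE t (a int, b text);
SET enable_group_by_reordering = on;
EXPLAIN SELECT a, b, count(*) FROM t GROUP BY b, a;
-- with the optimization enabled, a sorted grouping may compare the keys in a
-- cheaper order (e.g. the int column before the text column) instead of
-- strictly the order written in the GROUP BY clause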


--
regards,
Andrey Lepikhov
Postgres Professional
From 2f5a42c8a53286f830e3376ff4d3f8b7d4215b4b Mon Sep 17 00:00:00 2001
From: "Andrey V. Lepikhov" 
Date: Wed, 13 Sep 2023 11:20:03 +0700
Subject: [PATCH] Explore alternative orderings of group-by pathkeys during
 optimization.

When evaluating a query with a multi-column GROUP BY clause using sort,
the cost may depend heavily on the order in which the keys are compared when
building the groups. Grouping does not imply any ordering, so we can compare
the keys in arbitrary order, and a Hash Agg leverages this. But for Group Agg,
we simply compared keys in the order specified in the query. This commit
explores alternative ordering of the keys, trying to find a cheaper one.

In principle, we might generate grouping paths for all permutations of the keys
and leave the rest to the optimizer. But that might get very expensive, so we
try to pick only a couple interesting orderings based on both local and global
information.

When planning the grouping path, we explore statistics (number of distinct
values, cost of the comparison function) for the keys and reorder them
to minimize comparison costs. Intuitively, it may be better to perform more
expensive comparisons (for complex data types, etc.) last because maybe
the cheaper comparisons will be enough. Similarly, the higher the cardinality
of a key, the lower the probability we'll need to compare more keys. The patch
generates and costs various orderings, picking the cheapest ones.

The ordering of group keys may interact with other parts of the query, some of
which may not be known while planning the grouping. For example, there may be
an explicit ORDER BY clause or some other ordering-dependent operation higher up
in the query, and using the same ordering may allow using either incremental
sort or even eliminating the sort entirely.

The patch generates orderings and picks those, minimizing the comparison cost
(for various path keys), and then adds orderings that might be useful for
operations higher up in the plan (ORDER BY, etc.). Finally, it always keeps
the ordering specified in the query, assuming the user might have additional
insights.

This introduces a new GUC enable_group_by_reordering so that the optimization
may be disabled if needed.
---
 .../postgres_fdw/expected/postgres_fdw.out|  36 +-
 src/backend/optimizer/path/costsize.c | 363 ++-
 src/backend/optimizer/path/equivclass.c   |  13 +-
 src/backend/optimizer/path/pathkeys.c | 600 ++
 src/backend/optimizer/plan/planner.c  | 465 --
 src/backend/optimizer/util/pathnode.c |   4 +-
 src/backend/utils/adt/selfuncs.c  |  38 +-
 src/backend/utils/misc/guc_tables.c   |  10 +
 src/backend/utils/misc/postgresql.conf.sample |   1 +
 src/include/nodes/pathnodes.h |  10 +
 src/include/optimizer/cost.h  |   4 +-
 src/include/optimizer/paths.h |   7 +
 src/include/utils/selfuncs.h  |   5 +
 src/test/regress/expected/aggregates.out  | 244 ++-
 .../regress/expected/incremental_sort.out |   2 +-
 src/test/regress/expected/join.out|  51 +-
 src/test/regress/expected/merge.out   |  15 +-
 .../regress/expected/partition_aggregate.out  |  44 +-
 src/test/regress/expected/partition_join.out  |  75 +--
 src/test/regress/expected/sysviews.out|   3 +-
 src/test/regress/expected/union.out   |  60 +-
 src/test/regress/sql/aggregates.sql   |  99 +++
 src/test/regress/sql/incremental_sort.sql |   2 +-
 23 files changed, 1769 insertions(+), 382 deletions(-)

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 144c114d0f..63af7feabe 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -2319,18 +2319,21 @@ SELECT t1."C 1" FROM "S 1"."T 1" t1, LATERAL (SELECT DISTINCT t2.c1, t3.c1 FROM
 -- join with pseudoconstant quals
 EXPLAIN (VERBOSE, COSTS OFF)
 SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1 AND CURRENT_USER = SESSION_USER) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
-   
QUERY PLAN  
 
-
+  QUERY PLAN   

+--