On Mon, May 5, 2025 at 10:14:22PM +1200, David Rowley wrote:
> On Fri, 2 May 2025 at 14:44, Bruce Momjian <br...@momjian.us> wrote:
> > You can see the most current HTML-built version here:
> >
> > https://momjian.us/pgsql_docs/release-18.html
>
> Thanks for working on these.
>
> For "Improve the performance of hash joins (David Rowley)", 0f5738202
> did the same thing for GROUP BY and hashed subplans too. It might be
> worth adjusting this to some more generic text which covers all of
> these. Something like "Speed up hash value generation in Hash Join,
> GROUP BY, hashed Subplan and hashed set operations</p><p>This change
> also allows JIT compilation for obtaining hash values for these
> operations". The set operations I likely should have mentioned in the
> commit message.

Okay, text added.

> There's also Jeff's work in cc721c459, 4d143509c, a0942f441, 626df47ad
> which does work to reduce the memory overheads of hashed GROUP BY,
> hashed Subplans and hashed set operations. I think Jeff might have
> understated the possible performance gains from these commits. I very
> much think this is worth something like "Reduce memory overheads for
> hashed GROUP BY, subplans and set operation processing (Jeff Davis)".
>
> A quick test with: explain analyze select a from
> generate_series(1,1000000) a group by a;
>
> v17: Batches: 1 Memory Usage: 90145kB
> v18: Batches: 1 Memory Usage: 57385kB
>
> A 37% reduction for this case. Not insignificant.

Commits added and Jeff's name added, patch attached.

-- 
Bruce Momjian <br...@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Do not let urgent matters crowd out time for investment in the future.

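For anyone who wants to check the memory figures quoted above, here is a minimal sketch of the arithmetic. The function name percent_reduction is made up for illustration; the two inputs are the "Memory Usage" values from the quoted EXPLAIN ANALYZE output, and nothing else is taken from PostgreSQL itself.

```python
def percent_reduction(before_kb: float, after_kb: float) -> float:
    """Return the relative drop from before_kb to after_kb, in percent.

    Hypothetical helper for illustration only; not part of any PostgreSQL tool.
    """
    if before_kb <= 0:
        raise ValueError("before_kb must be positive")
    return (before_kb - after_kb) / before_kb * 100.0


if __name__ == "__main__":
    # Values copied from the quoted test: v17 vs. v18 hashed GROUP BY.
    v17_kb = 90145
    v18_kb = 57385
    print(f"saved {v17_kb - v18_kb} kB, "
          f"{percent_reduction(v17_kb, v18_kb):.1f}% less memory")
```

With these inputs the saving is 32760 kB, a bit over 36% of the v17 figure, which is in line with the reduction described in the quoted message.
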
diff --git a/doc/src/sgml/release-18.sgml b/doc/src/sgml/release-18.sgml
index 86c4a231684..b281e210aae 100644
--- a/doc/src/sgml/release-18.sgml
+++ b/doc/src/sgml/release-18.sgml
@@ -363,12 +363,28 @@ Allow merge joins to use incremental sorts (Richard Guo)
 
 <!--
 Author: David Rowley <drow...@postgresql.org>
 2024-08-20 [adf97c156] Speed up Hash Join by making ExprStates support hashing
+Author: David Rowley <drow...@postgresql.org>
+2024-12-11 [0f5738202] Use ExprStates for hashing in GROUP BY and SubPlans
+Author: Jeff Davis <jda...@postgresql.org>
+2025-03-24 [4d143509c] Create accessor functions for TupleHashEntry.
+Author: Jeff Davis <jda...@postgresql.org>
+2025-03-24 [a0942f441] Add ExecCopySlotMinimalTupleExtra().
+Author: Jeff Davis <jda...@postgresql.org>
+2025-03-24 [626df47ad] Remove 'additional' pointer from TupleHashEntryData.
 -->
 
 <listitem>
 <para>
-Improve the performance of hash joins (David Rowley)
+Improve the performance and reduce memory usage of hash joins and GROUP BY (David Rowley, Jeff Davis)
 <ulink url="&commit_baseurl;adf97c156">§</ulink>
+<ulink url="&commit_baseurl;0f5738202">§</ulink>
+<ulink url="&commit_baseurl;4d143509c">§</ulink>
+<ulink url="&commit_baseurl;a0942f441">§</ulink>
+<ulink url="&commit_baseurl;626df47ad">§</ulink>
+</para>
+
+<para>
+This also improves hash set operations used by EXCEPT, and hash lookups of subplan values.
 </para>
 </listitem>