Hi All,

When we implemented partitionwise join we disabled it by default (by
setting enable_partitionwise_join to false by default) since it
consumed a lot of memory and took a lot of time [1]. We also set
enable_partitionwise_aggregate to false by default since partitionwise
aggregates require partitionwise join. I have come across at least a
few cases in the field where users didn't know about these GUCs and
suffered from bad performance when queries involved partitioned
tables. Their queries improved a lot by just turning these GUCs ON.
Given that the partitionwise operations improve performance of queries
on partitioned tables, I think these GUCs need to be turned ON by
default.

Irrespective of whether the GUC is turned ON or OFF by default, we
need to reduce the memory consumed by the partitionwise planning and
the time those strategies take during planning. In this email I will
discuss the planner memory consumption problem.

Please find below the memory consumption measurements on master.

Experiment
----------
Created two tables t1 and t1_parted respectively. t1 is an
unpartitioned table whereas t1_parted is a partitioned table with 1000
partitions. Both of them are empty. Being empty does not affect
planning time and memory consumption much, but it takes far less time
to set up and then run EXPLAIN ANALYZE. I am using a self-join on the
partition key to measure memory and planning time. Again it's
something that makes setup easier yet good enough for measurements.
Attached are two scripts. setup.sql should be run to create tables,
partitions and helper functions. queries.sql executes and measures
time and memory of 2-way, 3-way, 4-way and 5-way self joins
respectively. It runs each query four times, once to warm up caches
and then thrice for measurement. Below I report averages of three
runs.

The first attached patch is used to measure memory consumption and
report it as part of EXPLAIN ANALYZE. It simply takes the difference
between memory used in the CurrentMemoryContext before and after
calling pg_plan_query() in ExplainOneQuery(). It does not account for
the memory consumed outside the CurrentMemoryContext while planning
the query e.g. caches. The memory consumed in those contexts does not
necessarily contribute to planning of a given query. Hence they are
ignored. I think the patch by itself makes a good improvement to
EXPLAIN ANALYZE, so we may want to consider it to be accepted in the
core code.

Measurements
------------
In all the tables below, the first column shows the number of tables
joined. Second column is for queries on unpartitioned table. Third
column for queries on partitioned table with partitionwise join turned
OFF. Fourth column is for queries on partitioned table with
partitionwise join turned ON.

Table 1: Planning time in ms.

Number of | unpartitioned | partitioned with  | partitioned   |
tables    |               | PWJ off           | PWJ on        |
---------------------------------------------------------------
        2 |          0.10 |            464.39 |        501.75 |
        3 |          0.17 |          2,714.28 |      3,035.07 |
        4 |          0.34 |          8,856.82 |     10,416.31 |
        5 |          0.84 |         22,519.02 |     29,769.19 |

Table 2: Memory consumption

Number of | unpartitioned | partitioned with  | partitioned   |
tables    |               | PWJ off           | PWJ on        |
---------------------------------------------------------------
        2 |    29.056 KiB |        19.520 MiB |    40.298 MiB |
        3 |    79.088 KiB |        45.227 MiB |   146.877 MiB |
        4 |   208.552 KiB |        83.540 MiB |   445.453 MiB |
        5 |   561.592 KiB |       149.283 MiB |  1563.253 MiB |

The time required to plan queries on partitioned table is very high
compared to the time taken to plan queries on unpartitioned tables. In
fact it's way higher than the factor of 1000 expected when there 1000
partitions. But the difference between planning time when
partitionwise join is turned OFF vs ON is less compared to difference
between unpartitioned vs partitioned case. The problem of reducing
planning time for queries on partitioned tables is being worked on by
Yuya and David in [2]. That work focuses on queries without
partitionwise join. But it might fix the problem for partitionwise
join as well.

I am focusing on reducing the memory consumption when partitionwise
join is enabled. As visible from table 2 when partitionwise join is
*not* used, the memory consumed in planning is proportional to the
number of partitions as expected. But when partitionwise join is
enabled the memory consumption increases exponentially with the number
of tables being joined. This is because the number of ways a join can
be computed is exponentially proportional to the number of tables
being joined. Table 3 has a breakup of memory consumed.

The memory consumption is broken by the objects that consume memory
during planning. The second attached patch is used to measure breakup
by functionality . Here's a brief explanation of the rows in the
table.

1. Restrictlist translations: Like other expressions the Restrictinfo
lists of parent are translated to obtain Restrictinfo lists to be
applied to child partitions (base as well as join). The first row
shows the memory consumed by the translated RestrictInfos. We can't
avoid these translations but closer examination reveals that a given
RestrictInfo gets translated multiple times proportional to the join
orders. These repeated translations can be avoided. I will start a
separate thread to discuss this topic.

2. Paths: this is the memory consumed when creating child join paths
and the Append paths in parent joins. It includes memory consumed by
the paths as well as translated expressions. I don't think we can
avoid creating these paths. But once the best paths are chosen for the
lower level relations, the unused paths can be freed. I will start a
separate thread to discuss this topic.

3. targetlist translation: child join relations' targetlists are
created by translating parent relations' targetlist. This row shows
the memory consumed by the translated targetlists. This translation
can't be avoided.

4. child SpecialJoinInfo: This is memory consumed in child joins'
SpecialJoinInfos translated from SpecialJoinInfo applicable to parent
joins. The child SpecialJoinInfos are translated on the fly when
computing child joins but are never freed. May be we can free them on
the fly as well or even better save them somewhere and fetch as and
when required. I will start a separate thread to discuss this topic.

5. Child join RelOptInfos: memory consumed by child join relations.
This is unavoidable as we need the RelOptInfos representing the child
joins.

Table 3: Partitionwise join planning memory breakup
Num joins          |        2   |        3   |        4   |        5   |
------------------------------------------------------------------------
1. translated      |    1.8 MiB |   13.1 MiB |   58.0 MiB |  236.5 MiB |
restrictlists      |            |            |            |            |
------------------------------------------------------------------------
2. creating child  |   11.6 MiB |   59.4 MiB |  207.6 MiB |  768.2 MiB |
join paths         |            |            |            |            |
------------------------------------------------------------------------
3. translated      |  723.5 KiB |    3.3 MiB |   10.6 MiB |   28.5 MiB |
targetlists        |            |            |            |            |
------------------------------------------------------------------------
4. child           |  926.8 KiB |    9.0 MiB |   45.7 MiB |  245.5 MiB |
SpecialJoinInfo    |            |            |            |            |
------------------------------------------------------------------------
5. Child join rels |    1.6 MiB |    7.9 MiB |   23.8 MiB |   67.5 MiB |
------------------------------------------------------------------------

Rows 1, 2 and 4 show most of the memory consumption. Those are also
the ones where there are opportunities to save memory. As said above,
we will discuss them in their respective email threads since each of
these fixes are independent and require separate discussion.

Here's memory consumption numbers after applying all the POC patches
mentioned above.
Number of tables  | no         | all POC    | %         | Absolute   |
joined            | patches    | patches    | reduction | reduction  |
----------------------------------------------------------------------
                2 |   40.3 MiB |   36.5 MiB |     9.43% |    3.8 MiB |
                3 |  146.9 MiB |  120.2 MiB |    18.13% |   26.6 MiB |
                4 |  445.5 MiB |  328.8 MiB |    26.19% |  116.7 MiB |
                5 | 1563.3 MiB |  953.2 MiB |    39.03% |  610.1 MiB |

Even if we are able to recover some memory as mentioned above, a lot
of memory is still consumed by the translated objects (expressions,
RestrictLists etc.). In case of partitioned tables, these translated
object trees differ only in leaf Var nodes. Thus if we device a way to
use the same Var node for parents as well as its children, we won't
need to translate the expressions. But that's a much larger effort and
probably something not even feasible.

While subproblems and their solutions will be discussed in separate
email threads, this thread is to discuss
1. generic memory consumption problems not covered above but seen by others
2. Is the memory consumption reduction after the POC patches good
enough for us to make partitionwise operations turned ON by default?
3. Is there anything else that stops us from turning the partitionwise
operations ON by default?

[1] 
https://www.postgresql.org/message-id/ca+tgmoyggdp6k-hxnagrykzh79w6nv2fevpyr_jembrorda...@mail.gmail.com
[2] 
https://www.postgresql.org/message-id/flat/CAJ2pMkZNCgoUKSE%2B_5LthD%2BKbXKvq6h2hQN8Esxpxd%2Bcxmgomg%40mail.gmail.com

--
Best Wishes,
Ashutosh Bapat

Attachment: setup.sql
Description: application/sql

Attachment: queries.sql
Description: application/sql

From a2ff55d53033a3bc00d587d1ad4354b54bc2a998 Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>
Date: Mon, 19 Jun 2023 10:34:41 +0530
Subject: [PATCH 1/3] Partitionwise join memory consumption measurement

Add function to measure memory consumed by various function used during
partitionwise join. The memory consumed is reported at the end of
standard_planner() execution.
---
 src/backend/optimizer/path/allpaths.c |  8 +++
 src/backend/optimizer/path/joinrels.c | 31 +++++++++++-
 src/backend/optimizer/plan/planner.c  |  9 ++++
 src/backend/optimizer/util/pathnode.c |  5 ++
 src/backend/optimizer/util/relnode.c  | 18 ++++++-
 src/backend/utils/mmgr/mcxt.c         | 73 ++++++++++++++++++++++++++-
 src/include/utils/memutils.h          |  7 +++
 7 files changed, 146 insertions(+), 5 deletions(-)

diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 9bdc70c702..d82a2ac509 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -50,6 +50,8 @@
 #include "port/pg_bitutils.h"
 #include "rewrite/rewriteManip.h"
 #include "utils/lsyscache.h"
+#include "utils/memutils.h"
+#include "nodes/memnodes.h"
 
 
 /* Bitmask flags for pushdown_safety_info.unsafeFlags */
@@ -4302,6 +4304,10 @@ generate_partitionwise_join_paths(PlannerInfo *root, RelOptInfo *rel)
 	int			num_parts;
 	RelOptInfo **part_rels;
 
+	MemoryContextCounters mem_start;
+ 
+	MemoryContextFuncStatsStart(CurrentMemoryContext, &mem_start, __FUNCTION__);
+
 	/* Handle only join relations here. */
 	if (!IS_JOIN_REL(rel))
 		return;
@@ -4366,6 +4372,8 @@ generate_partitionwise_join_paths(PlannerInfo *root, RelOptInfo *rel)
 	/* Build additional paths for this rel from child-join paths. */
 	add_paths_to_append_rel(root, rel, live_children);
 	list_free(live_children);
+
+	MemoryContextFuncStatsEnd(CurrentMemoryContext, &mem_start, __FUNCTION__);
 }
 
 
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index 2feab2184f..c340a83c89 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -21,6 +21,7 @@
 #include "optimizer/paths.h"
 #include "partitioning/partbounds.h"
 #include "utils/memutils.h"
+#include "nodes/memnodes.h"
 
 
 static void make_rels_by_clause_joins(PlannerInfo *root,
@@ -895,6 +896,13 @@ populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
 							RelOptInfo *rel2, RelOptInfo *joinrel,
 							SpecialJoinInfo *sjinfo, List *restrictlist)
 {
+	MemoryContextCounters mem_start;
+	char *label;
+
+	label = joinrel->reloptkind == RELOPT_OTHER_JOINREL ? "child_join_path_creation" : "parent_join_path_creation";
+
+	MemoryContextFuncStatsStart(CurrentMemoryContext, &mem_start, label);
+
 	/*
 	 * Consider paths using each rel as both outer and inner.  Depending on
 	 * the join type, a provably empty outer or inner rel might mean the join
@@ -1045,6 +1053,8 @@ populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
 
 	/* Apply partitionwise join technique, if possible. */
 	try_partitionwise_join(root, rel1, rel2, joinrel, sjinfo, restrictlist);
+
+	MemoryContextFuncStatsEnd(CurrentMemoryContext, &mem_start, label);
 }
 
 
@@ -1487,6 +1497,10 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
 	ListCell   *lcr1 = NULL;
 	ListCell   *lcr2 = NULL;
 	int			cnt_parts;
+	MemoryContextCounters start_mem;
+
+	/* Start measuring memory */
+	MemoryContextFuncStatsStart(CurrentMemoryContext, &start_mem, __FUNCTION__);
 
 	/* Guard against stack overflow due to overly deep partition hierarchy. */
 	check_stack_depth();
@@ -1546,6 +1560,7 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
 		Relids		child_joinrelids;
 		AppendRelInfo **appinfos;
 		int			nappinfos;
+		MemoryContextCounters restrict_mem;
 
 		if (joinrel->partbounds_merged)
 		{
@@ -1648,6 +1663,8 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
 		/* Find the AppendRelInfo structures */
 		appinfos = find_appinfos_by_relids(root, child_joinrelids, &nappinfos);
 
+		MemoryContextFuncStatsStart(CurrentMemoryContext, &restrict_mem, "restrictlist translation");
+
 		/*
 		 * Construct restrictions applicable to the child join from those
 		 * applicable to the parent join.
@@ -1656,6 +1673,9 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
 			(List *) adjust_appendrel_attrs(root,
 											(Node *) parent_restrictlist,
 											nappinfos, appinfos);
+
+		MemoryContextFuncStatsEnd(CurrentMemoryContext, &restrict_mem, "restrictlist translation");
+
 		pfree(appinfos);
 
 		child_joinrel = joinrel->part_rels[cnt_parts];
@@ -1676,6 +1696,10 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
 									child_joinrel, child_sjinfo,
 									child_restrictlist);
 	}
+
+
+	/* Stop measuring memory and print the stats. */
+	MemoryContextFuncStatsEnd(CurrentMemoryContext, &start_mem, __FUNCTION__);
 }
 
 /*
@@ -1687,12 +1711,15 @@ static SpecialJoinInfo *
 build_child_join_sjinfo(PlannerInfo *root, SpecialJoinInfo *parent_sjinfo,
 						Relids left_relids, Relids right_relids)
 {
-	SpecialJoinInfo *sjinfo = makeNode(SpecialJoinInfo);
 	AppendRelInfo **left_appinfos;
 	int			left_nappinfos;
 	AppendRelInfo **right_appinfos;
 	int			right_nappinfos;
+	MemoryContextCounters mem_start;
+	SpecialJoinInfo *sjinfo;
 
+	MemoryContextFuncStatsStart(CurrentMemoryContext, &mem_start, __FUNCTION__);
+	sjinfo = makeNode(SpecialJoinInfo);
 	memcpy(sjinfo, parent_sjinfo, sizeof(SpecialJoinInfo));
 	left_appinfos = find_appinfos_by_relids(root, left_relids,
 											&left_nappinfos);
@@ -1718,6 +1745,8 @@ build_child_join_sjinfo(PlannerInfo *root, SpecialJoinInfo *parent_sjinfo,
 	pfree(left_appinfos);
 	pfree(right_appinfos);
 
+	MemoryContextFuncStatsEnd(CurrentMemoryContext, &mem_start, __FUNCTION__);
+
 	return sjinfo;
 }
 
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 1e4dd27dba..0d742c2587 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -67,6 +67,8 @@
 #include "utils/rel.h"
 #include "utils/selfuncs.h"
 #include "utils/syscache.h"
+#include "utils/memutils.h"
+#include "nodes/memnodes.h"
 
 /* GUC parameters */
 double		cursor_tuple_fraction = DEFAULT_CURSOR_TUPLE_FRACTION;
@@ -295,6 +297,9 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	Plan	   *top_plan;
 	ListCell   *lp,
 			   *lr;
+	MemoryContextCounters mem_start;
+
+	MemoryContextFuncStatsStart(CurrentMemoryContext, &mem_start, __FUNCTION__);
 
 	/*
 	 * Set up global state for this planner invocation.  This data is needed
@@ -565,6 +570,10 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	if (glob->partition_directory != NULL)
 		DestroyPartitionDirectory(glob->partition_directory);
 
+	MemoryContextFuncStatsEnd(CurrentMemoryContext, &mem_start, __FUNCTION__);
+
+	MemoryContextFuncStatsReport();
+
 	return result;
 }
 
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index 5f5596841c..af5eea2e31 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -4063,6 +4063,9 @@ do { \
 	ParamPathInfo *new_ppi;
 	ParamPathInfo *old_ppi;
 	Relids		required_outer;
+	MemoryContextCounters	mem_start;
+
+	MemoryContextFuncStatsStart(CurrentMemoryContext, &mem_start, __FUNCTION__);
 
 	/*
 	 * If the path is not parameterized by parent of the given relation, it
@@ -4311,6 +4314,8 @@ do { \
 		ADJUST_CHILD_ATTRS(new_path->pathtarget->exprs);
 	}
 
+	MemoryContextFuncStatsEnd(CurrentMemoryContext, &mem_start, __FUNCTION__);
+
 	return new_path;
 }
 
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index 15e3910b79..e305bc894a 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -32,6 +32,8 @@
 #include "parser/parse_relation.h"
 #include "utils/hsearch.h"
 #include "utils/lsyscache.h"
+#include "nodes/memnodes.h"
+#include "utils/memutils.h"
 
 
 typedef struct JoinHashEntry
@@ -861,16 +863,22 @@ build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
 					 RelOptInfo *inner_rel, RelOptInfo *parent_joinrel,
 					 List *restrictlist, SpecialJoinInfo *sjinfo)
 {
-	RelOptInfo *joinrel = makeNode(RelOptInfo);
+	RelOptInfo *joinrel;
 	AppendRelInfo **appinfos;
 	int			nappinfos;
 
+	MemoryContextCounters mem_start;
+	MemoryContextCounters tlist_mem;
+	MemoryContextCounters jlist_mem;
+
+	MemoryContextFuncStatsStart(CurrentMemoryContext, &mem_start, __FUNCTION__);
+
 	/* Only joins between "other" relations land here. */
 	Assert(IS_OTHER_REL(outer_rel) && IS_OTHER_REL(inner_rel));
 
 	/* The parent joinrel should have consider_partitionwise_join set. */
 	Assert(parent_joinrel->consider_partitionwise_join);
-
+	joinrel = makeNode(RelOptInfo);
 	joinrel->reloptkind = RELOPT_OTHER_JOINREL;
 	joinrel->relids = bms_union(outer_rel->relids, inner_rel->relids);
 	joinrel->relids = add_outer_joins_to_relids(root, joinrel->relids, sjinfo,
@@ -939,14 +947,18 @@ build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
 	appinfos = find_appinfos_by_relids(root, joinrel->relids, &nappinfos);
 
 	/* Set up reltarget struct */
+	MemoryContextFuncStatsStart(CurrentMemoryContext, &tlist_mem, "targetlist");
 	build_child_join_reltarget(root, parent_joinrel, joinrel,
 							   nappinfos, appinfos);
+	MemoryContextFuncStatsEnd(CurrentMemoryContext, &tlist_mem, "targetlist");
 
 	/* Construct joininfo list. */
+	MemoryContextFuncStatsStart(CurrentMemoryContext, &jlist_mem, "joininfo");
 	joinrel->joininfo = (List *) adjust_appendrel_attrs(root,
 														(Node *) parent_joinrel->joininfo,
 														nappinfos,
 														appinfos);
+	MemoryContextFuncStatsEnd(CurrentMemoryContext, &jlist_mem, "joininfo");
 
 	/*
 	 * Lateral relids referred in child join will be same as that referred in
@@ -991,6 +1003,8 @@ build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
 
 	pfree(appinfos);
 
+	MemoryContextFuncStatsEnd(CurrentMemoryContext, &mem_start, __FUNCTION__);
+
 	return joinrel;
 }
 
diff --git a/src/backend/utils/mmgr/mcxt.c b/src/backend/utils/mmgr/mcxt.c
index 9fc83f11f6..54320536e1 100644
--- a/src/backend/utils/mmgr/mcxt.c
+++ b/src/backend/utils/mmgr/mcxt.c
@@ -150,7 +150,7 @@ MemoryContext CurTransactionContext = NULL;
 MemoryContext PortalContext = NULL;
 
 static void MemoryContextCallResetCallbacks(MemoryContext context);
-static void MemoryContextStatsInternal(MemoryContext context, int level,
+void MemoryContextStatsInternal(MemoryContext context, int level,
 									   bool print, int max_children,
 									   MemoryContextCounters *totals,
 									   bool print_to_stderr);
@@ -754,7 +754,7 @@ MemoryContextStatsDetail(MemoryContext context, int max_children,
  * Print this context if print is true, but in any case accumulate counts into
  * *totals (if given).
  */
-static void
+void
 MemoryContextStatsInternal(MemoryContext context, int level,
 						   bool print, int max_children,
 						   MemoryContextCounters *totals,
@@ -838,6 +838,75 @@ MemoryContextStatsInternal(MemoryContext context, int level,
 	}
 }
 
+HTAB *mem_cons_ht = NULL;
+typedef struct
+{
+	char	label[NAMEDATALEN];
+	Size	consumed_mem;
+} MemConsEntry;
+
+void
+MemoryContextFuncStatsStart(MemoryContext context,
+							MemoryContextCounters *start_counts,
+							const char *label)
+{
+	if (!mem_cons_ht)
+	{
+		HASHCTL ctl;
+
+		ctl.keysize = NAMEDATALEN;
+		ctl.entrysize = sizeof(MemConsEntry);
+
+		mem_cons_ht = hash_create("planner memory consumption", 10, &ctl, HASH_ELEM | HASH_STRINGS);
+	}
+
+	memset(start_counts, 0, sizeof(*start_counts));
+	MemoryContextStatsInternal(context, 0, false, 100, start_counts, false);
+}
+
+void
+MemoryContextFuncStatsEnd(MemoryContext context,
+						  MemoryContextCounters *start_counts,
+						  const char *label)
+{
+	MemoryContextCounters end_counts;
+	Size	start_used_space = start_counts->totalspace - start_counts->freespace;
+	Size	end_used_space;
+	bool	found;
+	MemConsEntry   *entry;
+
+	memset(&end_counts, 0, sizeof(end_counts));
+	MemoryContextStatsInternal(context, 0, false, 100, &end_counts, false);
+	end_used_space = end_counts.totalspace - end_counts.freespace;
+
+/*
+	elog(NOTICE, "%s,%s,%zu,%zu,%ld", label, context->name,
+		 start_used_space, end_used_space, end_used_space - start_used_space);
+*/
+	entry = hash_search(mem_cons_ht, label, HASH_ENTER, &found);
+	if (!found)
+	{
+		strncpy(entry->label, label, NAMEDATALEN);
+		entry->consumed_mem = 0;
+	}
+	Assert(strncmp(entry->label, label, NAMEDATALEN) == 0);
+	entry->consumed_mem = entry->consumed_mem + (end_used_space - start_used_space);
+}
+
+void
+MemoryContextFuncStatsReport(void)
+{
+	HASH_SEQ_STATUS status;
+	MemConsEntry   *entry;
+
+	/* Walk through the hash table printing memory consumed by the label. */
+	hash_seq_init(&status, mem_cons_ht);
+	while ((entry = (MemConsEntry *) hash_seq_search(&status)) != NULL)
+		elog(NOTICE, "%s, %ld", entry->label, entry->consumed_mem);
+	hash_destroy(mem_cons_ht);
+	mem_cons_ht = NULL;
+}
+
 /*
  * MemoryContextStatsPrint
  *		Print callback used by MemoryContextStatsInternal
diff --git a/src/include/utils/memutils.h b/src/include/utils/memutils.h
index 21640d62a6..0027a27981 100644
--- a/src/include/utils/memutils.h
+++ b/src/include/utils/memutils.h
@@ -92,6 +92,13 @@ extern void MemoryContextStatsDetail(MemoryContext context, int max_children,
 									 bool print_to_stderr);
 extern void MemoryContextAllowInCriticalSection(MemoryContext context,
 												bool allow);
+extern void MemoryContextFuncStatsStart(MemoryContext context,
+										MemoryContextCounters *start_counts,
+										const char *func_label);
+extern void MemoryContextFuncStatsEnd(MemoryContext context,
+									  MemoryContextCounters *start_counts,
+									  const char *func_label);
+extern void MemoryContextFuncStatsReport(void);
 
 #ifdef MEMORY_CONTEXT_CHECKING
 extern void MemoryContextCheck(MemoryContext context);
-- 
2.25.1

From 0f8b6153c95b5dd42c6549af47ea27ef97fd7c04 Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>
Date: Wed, 12 Jul 2023 14:34:14 +0530
Subject: [PATCH 1/7] Report memory used for planning a query in EXPLAIN
 ANALYZE

The memory used in the CurrentMemoryContext and its children is sampled
before and after calling pg_plan_query() from ExplainOneQuery(). The
difference in the two samples is reported as the memory consumed while
planning the query. This may not account for the memory allocated in
memory contexts which are not children of CurrentMemoryContext. These
contexts are usually other long lived contexts, e.g.
CacheMemoryContext, which are shared by all the queries run in a
session. The consumption in those can not be attributed only to a given
query and hence should not be reported any way.

The memory consumption is reported as "Planning Memory" property in
EXPLAIN ANALYZE output.

Ashutosh Bapat
---
 src/backend/commands/explain.c | 12 ++++++++++--
 src/backend/commands/prepare.c |  7 ++++++-
 src/backend/utils/mmgr/mcxt.c  | 19 +++++++++++++++++++
 src/include/commands/explain.h |  3 ++-
 src/include/utils/memutils.h   |  1 +
 5 files changed, 38 insertions(+), 4 deletions(-)

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 8570b14f62..9f859949f0 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -397,16 +397,20 @@ ExplainOneQuery(Query *query, int cursorOptions,
 					planduration;
 		BufferUsage bufusage_start,
 					bufusage;
+		Size		mem_consumed;
 
 		if (es->buffers)
 			bufusage_start = pgBufferUsage;
 		INSTR_TIME_SET_CURRENT(planstart);
+		mem_consumed = MemoryContextMemUsed(CurrentMemoryContext);
 
 		/* plan the query */
 		plan = pg_plan_query(query, queryString, cursorOptions, params);
 
 		INSTR_TIME_SET_CURRENT(planduration);
 		INSTR_TIME_SUBTRACT(planduration, planstart);
+		mem_consumed = MemoryContextMemUsed(CurrentMemoryContext)
+						- mem_consumed;
 
 		/* calc differences of buffer counters. */
 		if (es->buffers)
@@ -417,7 +421,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
 
 		/* run it (if needed) and produce output */
 		ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
-					   &planduration, (es->buffers ? &bufusage : NULL));
+					   &planduration, (es->buffers ? &bufusage : NULL), &mem_consumed);
 	}
 }
 
@@ -527,7 +531,7 @@ void
 ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
-			   const BufferUsage *bufusage)
+			   const BufferUsage *bufusage, const Size *mem_consumed)
 {
 	DestReceiver *dest;
 	QueryDesc  *queryDesc;
@@ -628,6 +632,10 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 		double		plantime = INSTR_TIME_GET_DOUBLE(*planduration);
 
 		ExplainPropertyFloat("Planning Time", "ms", 1000.0 * plantime, 3, es);
+
+		if (mem_consumed)
+			ExplainPropertyUInteger("Planning Memory", "bytes",
+										(uint64) *mem_consumed, es);
 	}
 
 	/* Print info about runtime of triggers */
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 18f70319fc..84544ce481 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -583,10 +583,12 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	instr_time	planduration;
 	BufferUsage bufusage_start,
 				bufusage;
+	Size		mem_consumed;
 
 	if (es->buffers)
 		bufusage_start = pgBufferUsage;
 	INSTR_TIME_SET_CURRENT(planstart);
+	mem_consumed = MemoryContextMemUsed(CurrentMemoryContext);
 
 	/* Look it up in the hash table */
 	entry = FetchPreparedStatement(execstmt->name, true);
@@ -623,6 +625,8 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 
 	INSTR_TIME_SET_CURRENT(planduration);
 	INSTR_TIME_SUBTRACT(planduration, planstart);
+	mem_consumed = MemoryContextMemUsed(CurrentMemoryContext)
+					- mem_consumed;
 
 	/* calc differences of buffer counters. */
 	if (es->buffers)
@@ -640,7 +644,8 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 
 		if (pstmt->commandType != CMD_UTILITY)
 			ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
-						   &planduration, (es->buffers ? &bufusage : NULL));
+						   &planduration, (es->buffers ? &bufusage : NULL),
+						   &mem_consumed);
 		else
 			ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
 							  paramLI, queryEnv);
diff --git a/src/backend/utils/mmgr/mcxt.c b/src/backend/utils/mmgr/mcxt.c
index 9fc83f11f6..43af271f33 100644
--- a/src/backend/utils/mmgr/mcxt.c
+++ b/src/backend/utils/mmgr/mcxt.c
@@ -747,6 +747,25 @@ MemoryContextStatsDetail(MemoryContext context, int max_children,
 								 grand_totals.totalspace - grand_totals.freespace)));
 }
 
+/*
+ * Compute the memory used in the given context and its children.
+ *
+ * XXX: Instead of this separate function we could modify
+ * MemoryContextStatsDetail() to report used memory and disable printing the
+ * detailed stats.
+ */
+extern Size
+MemoryContextMemUsed(MemoryContext context)
+{
+	MemoryContextCounters grand_totals;
+
+	memset(&grand_totals, 0, sizeof(grand_totals));
+
+	MemoryContextStatsInternal(context, 0, false, 100, &grand_totals, false);
+
+	return grand_totals.totalspace - grand_totals.freespace;
+}
+
 /*
  * MemoryContextStatsInternal
  *		One recursion level for MemoryContextStats
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 3d3e632a0c..21e3d2f309 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -92,7 +92,8 @@ extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
 						   ExplainState *es, const char *queryString,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
-						   const BufferUsage *bufusage);
+						   const BufferUsage *bufusage,
+						   const Size *mem_consumed);
 
 extern void ExplainPrintPlan(ExplainState *es, QueryDesc *queryDesc);
 extern void ExplainPrintTriggers(ExplainState *es, QueryDesc *queryDesc);
diff --git a/src/include/utils/memutils.h b/src/include/utils/memutils.h
index 21640d62a6..d7c477f229 100644
--- a/src/include/utils/memutils.h
+++ b/src/include/utils/memutils.h
@@ -92,6 +92,7 @@ extern void MemoryContextStatsDetail(MemoryContext context, int max_children,
 									 bool print_to_stderr);
 extern void MemoryContextAllowInCriticalSection(MemoryContext context,
 												bool allow);
+extern Size MemoryContextMemUsed(MemoryContext context);
 
 #ifdef MEMORY_CONTEXT_CHECKING
 extern void MemoryContextCheck(MemoryContext context);
-- 
2.25.1

Reply via email to