On Tue, Jul 23, 2024 at 2:05 PM Richard Guo <guofengli...@gmail.com> wrote:
>
> On Wed, Jun 5, 2024 at 3:48 PM Ashutosh Bapat
> <ashutosh.bapat....@gmail.com> wrote:
> > This is different. But it needs a rebase. PFA rebased patch. The revised 
> > commit message explains the change better.
>
> I looked through this patch and I think it is in a good shape.
>
> Some minor comments:
>
> * I think it's better to declare 'child_relids' as type Relids
> rather than Bitmapset *.  While they are the same thing, Bitmapset *
> isn't typically used in joinrels.c.

Agreed. Sorry. Fixed now.

> And I doubt that it is necessary
> to explicitly initialize it to NULL.

Done.

>
> * This patch changes the signature of function build_child_join_rel().
> I recommend arranging the two newly added parameters as 'int
> nappinfos, AppendRelInfo **appinfos' to maintain consistency with
> other functions that include these parameters.

Yes. Done.

>
> * At the end of the for loop in try_partitionwise_join(), we attempt
> to free the variables used within the loop.  I suggest freeing these
> variables in the order or reverse order of their allocation, rather
> than arbitrarily.

Done. Are you suggesting it for aesthetic purposes or there's
something else (read, less defragmentation, performance gain etc.)? I
am curious.

> Also, in non-partitionwise-join planning, we do not
> free local variables in this manner.  I understand that we do this for
> partitionwise-join because accumulating local variables can lead to
> significant memory usage, particularly with many partitions.  I think
> it would be beneficial to add a comment in try_partitionwise_join()
> explaining this behavior.

Sounds useful. Done.

PFA patches:
0001 - same as previous one
0002 - addresses your review comments

--
Best Wishes,
Ashutosh Bapat
From 53be9fd80aa89bf7612105ddfbbe48200b1f0ce0 Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat <ashutosh.ba...@enterprisedb.com>
Date: Wed, 26 Jul 2023 12:08:55 +0530
Subject: [PATCH 1/2] Avoid repeated computation in try_partitionwise_join()
 and build_child_join_rel()

try_partitionwise_join() computes bms_union() of Bitmapsets representing
joining child relations and fetches AppendRelInfos corresponding child
base relations participating in the join. The same computation is
repeated in build_child_join_rel(). Avoid this repeated computation by
performing it only once in try_partitionwise_join() and passing the
AppendRelInfos gathered there to build_child_join_rel().

Bitmapsets representing child relids consume large memory when thousands
of partitions are involved. By performing the bms_union() only once and
freeing the result when it's no longer required, we save memory. The
memory savings are considerable when many partitioned tables with
thousands of partitions are joined using partitionwise join.

Author: Ashutosh Bapat
Reviewed-by: David Christensen
Discussion: https://www.postgresql.org/message-id/flat/CAExHW5snUW7pD2RdtaBa1T_TqJYaY6W_YPVjWDrgSf33i-0uqA%40mail.gmail.com#1f9e0950518310dd300e21574979646f
---
 src/backend/optimizer/path/joinrels.c | 10 ++++++----
 src/backend/optimizer/util/relnode.c  | 18 +++---------------
 src/include/optimizer/pathnode.h      |  3 ++-
 3 files changed, 11 insertions(+), 20 deletions(-)

diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index a3677f824f..f9ab82a23f 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -1547,6 +1547,7 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
 		RelOptInfo *child_joinrel;
 		AppendRelInfo **appinfos;
 		int			nappinfos;
+		Bitmapset  *child_relids = NULL;
 
 		if (joinrel->partbounds_merged)
 		{
@@ -1642,9 +1643,8 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
 											   child_rel2->relids);
 
 		/* Find the AppendRelInfo structures */
-		appinfos = find_appinfos_by_relids(root,
-										   bms_union(child_rel1->relids,
-													 child_rel2->relids),
+		child_relids = bms_union(child_rel1->relids, child_rel2->relids);
+		appinfos = find_appinfos_by_relids(root, child_relids,
 										   &nappinfos);
 
 		/*
@@ -1662,7 +1662,8 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
 		{
 			child_joinrel = build_child_join_rel(root, child_rel1, child_rel2,
 												 joinrel, child_restrictlist,
-												 child_sjinfo);
+												 child_sjinfo, appinfos,
+												 nappinfos);
 			joinrel->part_rels[cnt_parts] = child_joinrel;
 			joinrel->live_parts = bms_add_member(joinrel->live_parts, cnt_parts);
 			joinrel->all_partrels = bms_add_members(joinrel->all_partrels,
@@ -1681,6 +1682,7 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
 
 		pfree(appinfos);
 		free_child_join_sjinfo(child_sjinfo);
+		bms_free(child_relids);
 	}
 }
 
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index e05b21c884..989bee0fa8 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -876,15 +876,15 @@ build_join_rel(PlannerInfo *root,
  * 'restrictlist': list of RestrictInfo nodes that apply to this particular
  *		pair of joinable relations
  * 'sjinfo': child join's join-type details
+ * 'appinfos' and 'nappinfos': AppendRelInfo array for child relids
  */
 RelOptInfo *
 build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
 					 RelOptInfo *inner_rel, RelOptInfo *parent_joinrel,
-					 List *restrictlist, SpecialJoinInfo *sjinfo)
+					 List *restrictlist, SpecialJoinInfo *sjinfo,
+					 AppendRelInfo **appinfos, int nappinfos)
 {
 	RelOptInfo *joinrel = makeNode(RelOptInfo);
-	AppendRelInfo **appinfos;
-	int			nappinfos;
 
 	/* Only joins between "other" relations land here. */
 	Assert(IS_OTHER_REL(outer_rel) && IS_OTHER_REL(inner_rel));
@@ -892,16 +892,6 @@ build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
 	/* The parent joinrel should have consider_partitionwise_join set. */
 	Assert(parent_joinrel->consider_partitionwise_join);
 
-	/*
-	 * Find the AppendRelInfo structures for the child baserels.  We'll need
-	 * these for computing the child join's relid set, and later for mapping
-	 * Vars to the child rel.
-	 */
-	appinfos = find_appinfos_by_relids(root,
-									   bms_union(outer_rel->relids,
-												 inner_rel->relids),
-									   &nappinfos);
-
 	joinrel->reloptkind = RELOPT_OTHER_JOINREL;
 	joinrel->relids = adjust_child_relids(parent_joinrel->relids,
 										  nappinfos, appinfos);
@@ -1017,8 +1007,6 @@ build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
 										nappinfos, appinfos,
 										parent_joinrel, joinrel);
 
-	pfree(appinfos);
-
 	return joinrel;
 }
 
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index 112e7c23d4..71fda9f2a4 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -346,6 +346,7 @@ extern Bitmapset *get_param_path_clause_serials(Path *path);
 extern RelOptInfo *build_child_join_rel(PlannerInfo *root,
 										RelOptInfo *outer_rel, RelOptInfo *inner_rel,
 										RelOptInfo *parent_joinrel, List *restrictlist,
-										SpecialJoinInfo *sjinfo);
+										SpecialJoinInfo *sjinfo,
+										AppendRelInfo **appinfos, int nappinfos);
 
 #endif							/* PATHNODE_H */
-- 
2.34.1

From ba0eed8f13f65d28abfc934ec08b47c9e9ad9dd1 Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat <ashutosh.ba...@enterprisedb.com>
Date: Wed, 24 Jul 2024 11:55:17 +0530
Subject: [PATCH 2/2] Address Richard's comments

---
 src/backend/optimizer/path/joinrels.c | 12 ++++++++----
 src/backend/optimizer/util/relnode.c  |  4 ++--
 src/include/optimizer/pathnode.h      |  2 +-
 3 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index f9ab82a23f..bf4605795a 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -1547,7 +1547,7 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
 		RelOptInfo *child_joinrel;
 		AppendRelInfo **appinfos;
 		int			nappinfos;
-		Bitmapset  *child_relids = NULL;
+		Relids		child_relids;
 
 		if (joinrel->partbounds_merged)
 		{
@@ -1662,8 +1662,7 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
 		{
 			child_joinrel = build_child_join_rel(root, child_rel1, child_rel2,
 												 joinrel, child_restrictlist,
-												 child_sjinfo, appinfos,
-												 nappinfos);
+												 child_sjinfo, nappinfos, appinfos);
 			joinrel->part_rels[cnt_parts] = child_joinrel;
 			joinrel->live_parts = bms_add_member(joinrel->live_parts, cnt_parts);
 			joinrel->all_partrels = bms_add_members(joinrel->all_partrels,
@@ -1680,9 +1679,14 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
 									child_joinrel, child_sjinfo,
 									child_restrictlist);
 
+		/*
+		 * When there are thousands of partitions involved, this loop will
+		 * accumulate a lot of memory in objects useful only in this loop.
+		 * Save it.
+		 */
 		pfree(appinfos);
-		free_child_join_sjinfo(child_sjinfo);
 		bms_free(child_relids);
+		free_child_join_sjinfo(child_sjinfo);
 	}
 }
 
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index 989bee0fa8..971d1c7aae 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -876,13 +876,13 @@ build_join_rel(PlannerInfo *root,
  * 'restrictlist': list of RestrictInfo nodes that apply to this particular
  *		pair of joinable relations
  * 'sjinfo': child join's join-type details
- * 'appinfos' and 'nappinfos': AppendRelInfo array for child relids
+ * 'nappinfos' and 'appinfos': AppendRelInfo array for child relids
  */
 RelOptInfo *
 build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
 					 RelOptInfo *inner_rel, RelOptInfo *parent_joinrel,
 					 List *restrictlist, SpecialJoinInfo *sjinfo,
-					 AppendRelInfo **appinfos, int nappinfos)
+					 int nappinfos, AppendRelInfo **appinfos)
 {
 	RelOptInfo *joinrel = makeNode(RelOptInfo);
 
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index 71fda9f2a4..f00bd55f39 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -347,6 +347,6 @@ extern RelOptInfo *build_child_join_rel(PlannerInfo *root,
 										RelOptInfo *outer_rel, RelOptInfo *inner_rel,
 										RelOptInfo *parent_joinrel, List *restrictlist,
 										SpecialJoinInfo *sjinfo,
-										AppendRelInfo **appinfos, int nappinfos);
+										int nappinfos, AppendRelInfo **appinfos);
 
 #endif							/* PATHNODE_H */
-- 
2.34.1

Reply via email to