Re: missing indexes in indexlist with partitioned tables

Arne Roland Tue, 01 Nov 2022 18:51:04 -0700

Hi!

I mainly changed the comments, the Assert and some casing.


> From: David Rowley <dgrowle...@gmail.com>
> Sent: Monday, October 3, 2022 00:51
>
> * In the header comment in get_relation_info(), I don't think we need
> to mention join removals explicitly. At a stretch, maybe mentioning
> "unique proofs" might be ok, but "various optimizations" might be
> better. If you mention "join removal", I fear that will just become
> outdated too quickly as further optimisations are added.  Likewise for
> the comment about "join pruning" you've added in the function body.
> FWIW, we call these "join removals" anyway.

I made them a little bit more vague. I also updated the description of 
indexlist in general.

> * I think we should put RelationGetNumberOfBlocks() back to what it
> was and just ensure we don't call that for partitioned indexes in
> get_relation_info(). (Yes, I know that was my change)

I don't think it's relevant who did it. I don't see much importance either way. 
I reverted it to the old state.

> * I can't quite figure out why you're doing "DROP TABLE a CASCADE;" in
> inherits.sql.  You've not changed anything else in that file. Did you
> mean to do this in join.sql?

The problem I encountered, was that simple copy of the test wasn't possible, 
because the tables were named the same way. It seemed intuitive to me to make 
the tests , such that there are no side-effects. I added a comment to the 
creation of those tables to make clear, that there are intended side effects by 
not dropping those tables.

> * The new test in join.sql. I understand that you've mostly copied the
> test from another place in the file and adjusted it to use a
> partitioned table. However, I don't really think you need to INSERT
> any data into those tables. I also think that using the table name of
> "a" is dangerous as it could conflict with another table by the same
> name in a parallel run of the tests.  The non-partitioned version is a
> TEMP table.  Also, it's slightly painful to look at the inconsistent
> capitalisation of SQL keywords in those queries you've added, again, I
> understand those are copied from above, but I see no need to duplicate
> the inconsistencies.  Perhaps it's fine to copy the upper case
> keywords in the DDL and keep all lowercase in the queries.  Also, I
> think you probably should just add a single simple join removal test
> for partitioned tables.  You're not exercising any code that the
> non-partitioned case isn't by adding any additional tests.  All that I
> think is worth testing here is ensuring nobody thinks that partitioned
> tables can get away with an empty indexlist again.

I am not sure, how thorough the tests on partitioned tables need to be. I 
guess, I will turn up more issues in production, than any test will be able to 
cover.

As a general sentiment, I wouldn't agree. The empty indexlist isn't the only 
interesting thing to test. The more we add optimizations, the more non trivial 
intersections of those start to break things again, we have fixed. A notable 
part of the complexity of the optimizer stems from the fact, that we apply most 
transformations in a fixed order. We obviously have to do that for performance 
reasons. But as long as we have that, we are prone to have cases where the 
ordering breaks part. Partitioned tables are a prominent cases here, because we 
always have the appendrel.

I removed some test cases here to half the amount of partitioned tables needed 
here. I don't see the value in having one simple explain less. But I do not 
have strong feelings about this. Are there any further opinions?

> * I had a bit of a 2nd thought on the test change in
> partition_join.sql.  I know I added the "c" column so that join
> removals didn't apply. I'm now thinking it's probably better to just
> change the queries to use JOIN ON rather than JOIN USING. Also, apply
> the correct alias to the ORDER BY.   This method should save from
> slowing the test down due to the additional column. We have some
> pretty slow buildfarm machines that this might actually make a
> meaningful difference to.

There is no real point in changing this, because we can just access the column 
that is hidden anyways to make the join removal impossible.
That is sort of the tick my version v6 was going for. Tbh I don't care much 
either way as long as the test still tests for the fractional merge append. I 
just switched back.

Afaiac creating and dropping a table is sort of the worst thing we can do, when 
thinking about tests. There is just so much work involved there. If I am 
concerned about test runtimes, I'd try to minimize that. This is even more true 
regarding tests for partitioned tables. To get a parted table with two 
partitions, we always have to create three tables. I do think there is a lot of 
potential to speed up the test times with that, but I'd suggest to handle that 
in a different thread.

> Thanks again for working on this.

Thank you for your input!

Arne

From 0f8f13e1fcc3c6c8f0cb812801d21547666a2229 Mon Sep 17 00:00:00 2001
From: Arne <a...@index.de>
Date: Mon, 31 Oct 2022 20:24:00 +0100
Subject: [PATCH v9_indexlist_contains_partitioned_indexes] indexlist

---
 src/backend/optimizer/util/plancat.c         | 226 ++++++++++---------
 src/backend/utils/adt/selfuncs.c             |   4 +
 src/include/nodes/pathnodes.h                |  11 +-
 src/test/regress/expected/inherit.out        |   2 +
 src/test/regress/expected/join.out           |  52 ++++-
 src/test/regress/expected/partition_join.out |   6 +-
 src/test/regress/sql/inherit.sql             |   4 +
 src/test/regress/sql/join.sql                |  34 ++-
 src/test/regress/sql/partition_join.sql      |   6 +-
 9 files changed, 231 insertions(+), 114 deletions(-)

diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 9defe37836..f231c5abbb 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -106,10 +106,12 @@ static void set_baserel_partition_constraint(Relation relation,
  * cases these are left as zeroes, but sometimes we need to compute attr
  * widths here, and we may as well cache the results for costsize.c.
  *
- * If inhparent is true, all we need to do is set up the attr arrays:
- * the RelOptInfo actually represents the appendrel formed by an inheritance
- * tree, and so the parent rel's physical size and index information isn't
- * important for it.
+ * If inhparent is true, we don't care about physical size, index am
+ * and estimates.
+ * We still need index uniqueness for join removal.
+ * When it comes to the path, the RelOptInfo actually represents
+ * the appendrel formed by an inheritance tree, and so the parent
+ * rel's physical size isn't important for it.
  */
 void
 get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
@@ -175,10 +177,12 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 
 	/*
 	 * Make list of indexes.  Ignore indexes on system catalogs if told to.
-	 * Don't bother with indexes for an inheritance parent, either.
+	 * Don't bother with indexes from traditional inheritance parents.
+	 * We care about partitioned indexes for further optimisations, like join
+	 * removal, even if we don't have any am for them.
 	 */
-	if (inhparent ||
-		(IgnoreSystemIndexes && IsSystemRelation(relation)))
+	if ((inhparent && relation->rd_rel->relkind != RELKIND_PARTITIONED_TABLE)
+		|| (IgnoreSystemIndexes && IsSystemRelation(relation)))
 		hasindex = false;
 	else
 		hasindex = relation->rd_rel->relhasindex;
@@ -209,8 +213,8 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 			IndexAmRoutine *amroutine;
 			IndexOptInfo *info;
 			int			ncolumns,
-						nkeycolumns;
-			int			i;
+						nkeycolumns,
+						i;
 
 			/*
 			 * Extract info from the relation descriptor for the index.
@@ -231,16 +235,6 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 				continue;
 			}
 
-			/*
-			 * Ignore partitioned indexes, since they are not usable for
-			 * queries.
-			 */
-			if (indexRelation->rd_rel->relkind == RELKIND_PARTITIONED_INDEX)
-			{
-				index_close(indexRelation, NoLock);
-				continue;
-			}
-
 			/*
 			 * If the index is valid, but cannot yet be used, ignore it; but
 			 * mark the plan we are generating as transient. See
@@ -285,108 +279,133 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 
 			info->relam = indexRelation->rd_rel->relam;
 
-			/* We copy just the fields we need, not all of rd_indam */
-			amroutine = indexRelation->rd_indam;
-			info->amcanorderbyop = amroutine->amcanorderbyop;
-			info->amoptionalkey = amroutine->amoptionalkey;
-			info->amsearcharray = amroutine->amsearcharray;
-			info->amsearchnulls = amroutine->amsearchnulls;
-			info->amcanparallel = amroutine->amcanparallel;
-			info->amhasgettuple = (amroutine->amgettuple != NULL);
-			info->amhasgetbitmap = amroutine->amgetbitmap != NULL &&
-				relation->rd_tableam->scan_bitmap_next_block != NULL;
-			info->amcanmarkpos = (amroutine->ammarkpos != NULL &&
-								  amroutine->amrestrpos != NULL);
-			info->amcostestimate = amroutine->amcostestimate;
-			Assert(info->amcostestimate != NULL);
-
-			/* Fetch index opclass options */
-			info->opclassoptions = RelationGetIndexAttOptions(indexRelation, true);
-
 			/*
-			 * Fetch the ordering information for the index, if any.
+			 * We don't have am for partitioned indexes, therefore we skip
+			 * gathering the info. This is ok, because we don't use
+			 * partitioned indexes in paths. We resolve inheritance before
+			 * generating paths.
 			 */
-			if (info->relam == BTREE_AM_OID)
+			if (indexRelation->rd_rel->relkind != RELKIND_PARTITIONED_INDEX)
 			{
+				/* We copy just the fields we need, not all of rd_indam */
+				amroutine = indexRelation->rd_indam;
+				info->amcanorderbyop = amroutine->amcanorderbyop;
+				info->amoptionalkey = amroutine->amoptionalkey;
+				info->amsearcharray = amroutine->amsearcharray;
+				info->amsearchnulls = amroutine->amsearchnulls;
+				info->amcanparallel = amroutine->amcanparallel;
+				info->amhasgettuple = (amroutine->amgettuple != NULL);
+				info->amhasgetbitmap = amroutine->amgetbitmap != NULL &&
+					relation->rd_tableam->scan_bitmap_next_block != NULL;
+				info->amcanmarkpos = (amroutine->ammarkpos != NULL &&
+					amroutine->amrestrpos != NULL);
+				info->amcostestimate = amroutine->amcostestimate;
+				Assert(info->amcostestimate != NULL);
+
+				/* Fetch index opclass options */
+				info->opclassoptions = RelationGetIndexAttOptions(indexRelation, true);
+
 				/*
-				 * If it's a btree index, we can use its opfamily OIDs
-				 * directly as the sort ordering opfamily OIDs.
+				 * Fetch the ordering information for the index, if any.
 				 */
-				Assert(amroutine->amcanorder);
-
-				info->sortopfamily = info->opfamily;
-				info->reverse_sort = (bool *) palloc(sizeof(bool) * nkeycolumns);
-				info->nulls_first = (bool *) palloc(sizeof(bool) * nkeycolumns);
-
-				for (i = 0; i < nkeycolumns; i++)
+				if (info->relam == BTREE_AM_OID)
 				{
-					int16		opt = indexRelation->rd_indoption[i];
+					/*
+					 * If it's a btree index, we can use its opfamily OIDs
+					 * directly as the sort ordering opfamily OIDs.
+					 */
+					Assert(amroutine->amcanorder);
 
-					info->reverse_sort[i] = (opt & INDOPTION_DESC) != 0;
-					info->nulls_first[i] = (opt & INDOPTION_NULLS_FIRST) != 0;
-				}
-			}
-			else if (amroutine->amcanorder)
-			{
-				/*
-				 * Otherwise, identify the corresponding btree opfamilies by
-				 * trying to map this index's "<" operators into btree.  Since
-				 * "<" uniquely defines the behavior of a sort order, this is
-				 * a sufficient test.
-				 *
-				 * XXX This method is rather slow and also requires the
-				 * undesirable assumption that the other index AM numbers its
-				 * strategies the same as btree.  It'd be better to have a way
-				 * to explicitly declare the corresponding btree opfamily for
-				 * each opfamily of the other index type.  But given the lack
-				 * of current or foreseeable amcanorder index types, it's not
-				 * worth expending more effort on now.
-				 */
-				info->sortopfamily = (Oid *) palloc(sizeof(Oid) * nkeycolumns);
-				info->reverse_sort = (bool *) palloc(sizeof(bool) * nkeycolumns);
-				info->nulls_first = (bool *) palloc(sizeof(bool) * nkeycolumns);
+					info->sortopfamily = info->opfamily;
+					info->reverse_sort = (bool *)palloc(sizeof(bool) * nkeycolumns);
+					info->nulls_first = (bool *)palloc(sizeof(bool) * nkeycolumns);
 
-				for (i = 0; i < nkeycolumns; i++)
-				{
-					int16		opt = indexRelation->rd_indoption[i];
-					Oid			ltopr;
-					Oid			btopfamily;
-					Oid			btopcintype;
-					int16		btstrategy;
-
-					info->reverse_sort[i] = (opt & INDOPTION_DESC) != 0;
-					info->nulls_first[i] = (opt & INDOPTION_NULLS_FIRST) != 0;
-
-					ltopr = get_opfamily_member(info->opfamily[i],
-												info->opcintype[i],
-												info->opcintype[i],
-												BTLessStrategyNumber);
-					if (OidIsValid(ltopr) &&
-						get_ordering_op_properties(ltopr,
-												   &btopfamily,
-												   &btopcintype,
-												   &btstrategy) &&
-						btopcintype == info->opcintype[i] &&
-						btstrategy == BTLessStrategyNumber)
+					for (i = 0; i < nkeycolumns; i++)
 					{
-						/* Successful mapping */
-						info->sortopfamily[i] = btopfamily;
+						int16		opt = indexRelation->rd_indoption[i];
+
+						info->reverse_sort[i] = (opt & INDOPTION_DESC) != 0;
+						info->nulls_first[i] = (opt & INDOPTION_NULLS_FIRST) != 0;
 					}
-					else
+				}
+				else if (amroutine->amcanorder)
+				{
+					/*
+					 * Otherwise, identify the corresponding btree opfamilies by
+					 * trying to map this index's "<" operators into btree.  Since
+					 * "<" uniquely defines the behavior of a sort order, this is
+					 * a sufficient test.
+					 *
+					 * XXX This method is rather slow and also requires the
+					 * undesirable assumption that the other index AM numbers its
+					 * strategies the same as btree.  It'd be better to have a way
+					 * to explicitly declare the corresponding btree opfamily for
+					 * each opfamily of the other index type.  But given the lack
+					 * of current or foreseeable amcanorder index types, it's not
+					 * worth expending more effort on now.
+					 */
+					info->sortopfamily = (Oid *)palloc(sizeof(Oid) * nkeycolumns);
+					info->reverse_sort = (bool *)palloc(sizeof(bool) * nkeycolumns);
+					info->nulls_first = (bool *)palloc(sizeof(bool) * nkeycolumns);
+
+					for (i = 0; i < nkeycolumns; i++)
 					{
-						/* Fail ... quietly treat index as unordered */
-						info->sortopfamily = NULL;
-						info->reverse_sort = NULL;
-						info->nulls_first = NULL;
-						break;
+						int16		opt = indexRelation->rd_indoption[i];
+						Oid			ltopr;
+						Oid			btopfamily;
+						Oid			btopcintype;
+						int16		btstrategy;
+
+						info->reverse_sort[i] = (opt & INDOPTION_DESC) != 0;
+						info->nulls_first[i] = (opt & INDOPTION_NULLS_FIRST) != 0;
+
+						ltopr = get_opfamily_member(info->opfamily[i],
+							info->opcintype[i],
+							info->opcintype[i],
+							BTLessStrategyNumber);
+						if (OidIsValid(ltopr) &&
+							get_ordering_op_properties(ltopr,
+								&btopfamily,
+								&btopcintype,
+								&btstrategy) &&
+							btopcintype == info->opcintype[i] &&
+							btstrategy == BTLessStrategyNumber)
+						{
+							/* Successful mapping */
+							info->sortopfamily[i] = btopfamily;
+						}
+						else
+						{
+							/* Fail ... quietly treat index as unordered */
+							info->sortopfamily = NULL;
+							info->reverse_sort = NULL;
+							info->nulls_first = NULL;
+							break;
+						}
 					}
 				}
+				else
+				{
+					info->sortopfamily = NULL;
+					info->reverse_sort = NULL;
+					info->nulls_first = NULL;
+				}
 			}
 			else
 			{
 				info->sortopfamily = NULL;
 				info->reverse_sort = NULL;
 				info->nulls_first = NULL;
+
+				info->amcanorderbyop = false;
+				info->amoptionalkey = false;
+				info->amsearcharray = false;
+				info->amsearchnulls = false;
+				info->amcanparallel = false;
+				info->amhasgettuple = false;
+				info->amhasgetbitmap = false;
+				info->amcanmarkpos = false;
+				info->amcostestimate = NULL;
 			}
 
 			/*
@@ -433,7 +452,8 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 					info->tuples = rel->tuples;
 			}
 
-			if (info->relam == BTREE_AM_OID)
+			if (info->relam == BTREE_AM_OID &&
+				indexRelation->rd_rel->relkind != RELKIND_PARTITIONED_INDEX)
 			{
 				/* For btrees, get tree height while we have the index open */
 				info->tree_height = _bt_getrootheight(indexRelation);
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 69e0fb98f5..2b543b7199 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -5963,6 +5963,10 @@ get_actual_variable_range(PlannerInfo *root, VariableStatData *vardata,
 	rte = root->simple_rte_array[rel->relid];
 	Assert(rte->rtekind == RTE_RELATION);
 
+	/* ignore partitioned tables.  Any indexes here are not real indexes */
+	if (rte->relkind == RELKIND_PARTITIONED_TABLE)
+		return false;
+
 	/* Search through the indexes to see if any match our problem */
 	foreach(lc, rel->indexlist)
 	{
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 09342d128d..a6ec525da7 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -628,7 +628,7 @@ typedef struct PartitionSchemeData *PartitionScheme;
  *		lateral_relids - required outer rels for LATERAL, as a Relids set
  *			(includes both direct and indirect lateral references)
  *
- * If the relation is a base relation it will have these fields set:
+ * If the relation is a base relation, it will have these fields set:
  *
  *		relid - RTE index (this is redundant with the relids field, but
  *				is provided for convenience of access)
@@ -643,8 +643,6 @@ typedef struct PartitionSchemeData *PartitionScheme;
  *					   Vars and PlaceHolderVars)
  *		lateral_referencers - relids of rels that reference this one laterally
  *				(includes both direct and indirect lateral references)
- *		indexlist - list of IndexOptInfo nodes for relation's indexes
- *					(always NIL if it's not a table)
  *		pages - number of disk pages in relation (zero if not a table)
  *		tuples - number of tuples in relation (not considering restrictions)
  *		allvisfrac - fraction of disk pages that are marked all-visible
@@ -660,6 +658,13 @@ typedef struct PartitionSchemeData *PartitionScheme;
  *		For otherrels that are appendrel members, these fields are filled
  *		in just as for a baserel, except we don't bother with lateral_vars.
  *
+ * If the relation is either a base relation or a partitioned table, it will have a set:
+ *		indexlist - list of IndexOptInfo nodes for relation's indexes
+ *		                        (for patitioned indexes not all of the
+ *		                            IndexOptInfo fields are set, since they
+ *		                            can't be accessed)
+ *					(always NIL if it's not a table)
+ *
  * If the relation is either a foreign table or a join of foreign tables that
  * all belong to the same foreign server and are assigned to the same user to
  * check access permissions as (cf checkAsUser), these fields will be set:
diff --git a/src/test/regress/expected/inherit.out b/src/test/regress/expected/inherit.out
index 2d49e765de..80fc76ed32 100644
--- a/src/test/regress/expected/inherit.out
+++ b/src/test/regress/expected/inherit.out
@@ -1,6 +1,8 @@
 --
 -- Test inheritance features
 --
+-- XXX These tables are not dropped. Why?
+-- XXX Are they needed for later tests? For dump-restore? Dropping them doesn't fail any test from check-world.
 CREATE TABLE a (aa TEXT);
 CREATE TABLE b (bb TEXT) INHERITS (a);
 CREATE TABLE c (cc TEXT) INHERITS (a);
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 9b69a8c122..ff23808013 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -4588,7 +4588,8 @@ select a.q2, b.q1
 reset enable_hashjoin;
 reset enable_nestloop;
 --
--- test join removal
+-- test join removal for plain tables
+-- we have almost the smare tests for partitioned tables below
 --
 begin;
 CREATE TEMP TABLE a (id int PRIMARY KEY, b_id int);
@@ -4720,6 +4721,55 @@ select 1 from (select a.id FROM a left join b on a.b_id = b.id) q,
          Filter: (a.id = i)
 (4 rows)
 
+rollback;
+--
+-- test join removal for partitioned tables
+-- this is analog to the non partitioned case above
+--
+begin;
+CREATE TABLE b_p (id int PRIMARY KEY, c_id int) partition by range(id);
+CREATE TABLE b_m partition of b_p for values from (0) to (10) partition by range(id);
+CREATE TABLE b_c partition of b_m for values from (0) to (10);
+CREATE TABLE c_p (id int PRIMARY KEY) partition by range(id);
+CREATE TABLE c_m partition of c_p for values from (0) to (10) partition by range(id);
+CREATE TABLE c_c partition of c_m for values from (0) to (10);
+INSERT INTO b_p VALUES (0, 0), (1, NULL);
+INSERT INTO c_p VALUES (0), (1);
+-- all three cases should be optimizable into a simple seqscan
+EXPLAIN (COSTS OFF) SELECT b0.* FROM b_p b0 LEFT JOIN b_p b ON b0.c_id = b.id;
+     QUERY PLAN     
+--------------------
+ Seq Scan on b_c b0
+(1 row)
+
+EXPLAIN (COSTS OFF) SELECT b.* FROM b_p b LEFT JOIN c_p c ON b.c_id = c.id;
+    QUERY PLAN     
+-------------------
+ Seq Scan on b_c b
+(1 row)
+
+EXPLAIN (COSTS OFF)
+  SELECT b0.* FROM b_p b0 LEFT JOIN (b_p b LEFT JOIN c_p c on b.c_id = c.id)
+  ON (b0.c_id = b.id);
+     QUERY PLAN     
+--------------------
+ Seq Scan on b_c b0
+(1 row)
+
+-- check optimization of outer join within another special join
+EXPLAIN (COSTS OFF)
+select id from b_p b0 WHERE id in (
+	SELECT b.id FROM b_p b LEFT JOIN c_p c on b.id = c.id
+);
+          QUERY PLAN           
+-------------------------------
+ Hash Join
+   Hash Cond: (b0.id = b.id)
+   ->  Seq Scan on b_c b0
+   ->  Hash
+         ->  Seq Scan on b_c b
+(5 rows)
+
 rollback;
 create temp table parent (k int primary key, pd int);
 create temp table child (k int unique, cd int);
diff --git a/src/test/regress/expected/partition_join.out b/src/test/regress/expected/partition_join.out
index b20facc19f..dd8ad1c71a 100644
--- a/src/test/regress/expected/partition_join.out
+++ b/src/test/regress/expected/partition_join.out
@@ -4867,13 +4867,13 @@ CREATE TABLE fract_t (id BIGINT, PRIMARY KEY (id)) PARTITION BY RANGE (id);
 CREATE TABLE fract_t0 PARTITION OF fract_t FOR VALUES FROM ('0') TO ('1000');
 CREATE TABLE fract_t1 PARTITION OF fract_t FOR VALUES FROM ('1000') TO ('2000');
 -- insert data
-INSERT INTO fract_t (id) (SELECT generate_series(0, 1999));
+INSERT INTO fract_t (id) SELECT i FROM generate_series(0, 1999) i;
 ANALYZE fract_t;
 -- verify plan; nested index only scans
 SET max_parallel_workers_per_gather = 0;
 SET enable_partitionwise_join = on;
 EXPLAIN (COSTS OFF)
-SELECT * FROM fract_t x LEFT JOIN fract_t y USING (id) ORDER BY id ASC LIMIT 10;
+SELECT x.id, y.id FROM fract_t x LEFT JOIN fract_t y USING (id) ORDER BY x.id ASC LIMIT 10;
                               QUERY PLAN                               
 -----------------------------------------------------------------------
  Limit
@@ -4890,7 +4890,7 @@ SELECT * FROM fract_t x LEFT JOIN fract_t y USING (id) ORDER BY id ASC LIMIT 10;
 (11 rows)
 
 EXPLAIN (COSTS OFF)
-SELECT * FROM fract_t x LEFT JOIN fract_t y USING (id) ORDER BY id DESC LIMIT 10;
+SELECT x.id, y.id FROM fract_t x LEFT JOIN fract_t y USING (id) ORDER BY x.id DESC LIMIT 10;
                                    QUERY PLAN                                   
 --------------------------------------------------------------------------------
  Limit
diff --git a/src/test/regress/sql/inherit.sql b/src/test/regress/sql/inherit.sql
index 195aedb5ff..69600bb068 100644
--- a/src/test/regress/sql/inherit.sql
+++ b/src/test/regress/sql/inherit.sql
@@ -1,6 +1,10 @@
 --
 -- Test inheritance features
 --
+
+-- XXX These tables are not dropped. Why?
+-- XXX Are they needed for later tests? For dump-restore? Dropping them doesn't fail any test from check-world.
+
 CREATE TABLE a (aa TEXT);
 CREATE TABLE b (bb TEXT) INHERITS (a);
 CREATE TABLE c (cc TEXT) INHERITS (a);
diff --git a/src/test/regress/sql/join.sql b/src/test/regress/sql/join.sql
index 27e7e741a1..24a18d83b8 100644
--- a/src/test/regress/sql/join.sql
+++ b/src/test/regress/sql/join.sql
@@ -1579,7 +1579,8 @@ reset enable_hashjoin;
 reset enable_nestloop;
 
 --
--- test join removal
+-- test join removal for plain tables
+-- we have almost the smare tests for partitioned tables below
 --
 
 begin;
@@ -1648,6 +1649,37 @@ select 1 from (select a.id FROM a left join b on a.b_id = b.id) q,
 
 rollback;
 
+--
+-- test join removal for partitioned tables
+-- this is analog to the non partitioned case above
+--
+
+begin;
+
+CREATE TABLE b_p (id int PRIMARY KEY, c_id int) partition by range(id);
+CREATE TABLE b_m partition of b_p for values from (0) to (10) partition by range(id);
+CREATE TABLE b_c partition of b_m for values from (0) to (10);
+CREATE TABLE c_p (id int PRIMARY KEY) partition by range(id);
+CREATE TABLE c_m partition of c_p for values from (0) to (10) partition by range(id);
+CREATE TABLE c_c partition of c_m for values from (0) to (10);
+INSERT INTO b_p VALUES (0, 0), (1, NULL);
+INSERT INTO c_p VALUES (0), (1);
+
+-- all three cases should be optimizable into a simple seqscan
+EXPLAIN (COSTS OFF) SELECT b0.* FROM b_p b0 LEFT JOIN b_p b ON b0.c_id = b.id;
+EXPLAIN (COSTS OFF) SELECT b.* FROM b_p b LEFT JOIN c_p c ON b.c_id = c.id;
+EXPLAIN (COSTS OFF)
+  SELECT b0.* FROM b_p b0 LEFT JOIN (b_p b LEFT JOIN c_p c on b.c_id = c.id)
+  ON (b0.c_id = b.id);
+
+-- check optimization of outer join within another special join
+EXPLAIN (COSTS OFF)
+select id from b_p b0 WHERE id in (
+	SELECT b.id FROM b_p b LEFT JOIN c_p c on b.id = c.id
+);
+
+rollback;
+
 create temp table parent (k int primary key, pd int);
 create temp table child (k int unique, cd int);
 insert into parent values (1, 10), (2, 20), (3, 30);
diff --git a/src/test/regress/sql/partition_join.sql b/src/test/regress/sql/partition_join.sql
index 67f506361f..94f48a7c51 100644
--- a/src/test/regress/sql/partition_join.sql
+++ b/src/test/regress/sql/partition_join.sql
@@ -1149,7 +1149,7 @@ CREATE TABLE fract_t0 PARTITION OF fract_t FOR VALUES FROM ('0') TO ('1000');
 CREATE TABLE fract_t1 PARTITION OF fract_t FOR VALUES FROM ('1000') TO ('2000');
 
 -- insert data
-INSERT INTO fract_t (id) (SELECT generate_series(0, 1999));
+INSERT INTO fract_t (id) SELECT i FROM generate_series(0, 1999) i;
 ANALYZE fract_t;
 
 -- verify plan; nested index only scans
@@ -1157,10 +1157,10 @@ SET max_parallel_workers_per_gather = 0;
 SET enable_partitionwise_join = on;
 
 EXPLAIN (COSTS OFF)
-SELECT * FROM fract_t x LEFT JOIN fract_t y USING (id) ORDER BY id ASC LIMIT 10;
+SELECT x.id, y.id FROM fract_t x LEFT JOIN fract_t y USING (id) ORDER BY x.id ASC LIMIT 10;
 
 EXPLAIN (COSTS OFF)
-SELECT * FROM fract_t x LEFT JOIN fract_t y USING (id) ORDER BY id DESC LIMIT 10;
+SELECT x.id, y.id FROM fract_t x LEFT JOIN fract_t y USING (id) ORDER BY x.id DESC LIMIT 10;
 
 -- cleanup
 DROP TABLE fract_t;
-- 
2.35.3

Re: missing indexes in indexlist with partitioned tables

Reply via email to