Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path

Tomas Vondra Fri, 10 Dec 2021 17:34:51 -0800


On 12/10/21 00:51, Arne Roland wrote:

Hi,

thanks for the reply!

From: Tomas Vondra <tomas.von...@enterprisedb.com>
Sent: Thursday, December 2, 2021 20:58

Subject: Re: PATCH: generate fractional cheapest paths ingenerate_orderedappend_path

 > [...]
 > Well, I mentioned three open questions in my first message, and I don't
 > think we've really addressed them yet. We've agreed we don't need to add
 > the incremental sort here, but that leaves
 >
 >
 > 1) If get_cheapest_fractional_path_for_pathkeys returns NULL, should we
 > default to cheapest_startup or cheapest_total?

I think it's reasonable to use cheapest_total like we are doing now. Ihardly see any reason to change it.


True, it's a reasonable first step.

Either we generate the same plan as today (with cheapest_total), ormaybe a better one (if we find a cheaper fractional path with matchingpathkeys). It's true we could do better, but that's life - it's not likewe consider every possible path everywhere.

The incremental sort case you mentioned, seems like the only case thatplan might be interesting. If we really want that to happen, we probablyshould check for that separately, i.e. having startup_fractional. Eventhough this is a fairly special case as it's mostly relevant forpartitionwise joins, I'm still not convinced it's worth the cpu cycles.The point is that in most cases factional and startup_fractional will bethe same anyways.

I don't think we can simply check for startup_fractional (although, I'mnot sure I fully understand what would that be). I think what we shouldreally do in this case is walking all the paths, ensuring it's properlysorted (with either full or incremental sort), and then pick thecheapest fractional path from these sorted paths. But I agree that seemspretty expensive.

And I suspect, even if they aren't, solving that from an applicationdevelopers point of view, is in most cases not that difficult. One canusually match the pathkey. I personally had a lot of real world issueswith missing fractional paths using primary keys. I think it's worthnoting that everything will likely match the partition keys anyways,because otherwise there is no chance of doing a merge append.


Yeah, I think you're right.

If I am not mistaken, in case we end up doing a full sort, thecheapest_total path should be completely sufficient.


Definitely true.

 > 2) Should get_cheapest_fractional_path_for_pathkeys worry about
 > require_parallel_safe? I think yes, but we need to update the patch.
I admit, that such a version ofget_cheapest_fractional_path_for_pathkeys could be consistent with otherfunctions. And I was confused about that before. But I am not sure whatto use require_parallel_safe for. build_minmax_path doesn't care, theyare never parallel safe. And afaict generate_orderedappend_paths caresneither, it considers all plans. For instance when it callsget_cheapest_path_for_pathkeys, it sets require_parallel_safe just tofalse as well.

True as well. It's looks a bit strange, but you're right neither placecares about parallel safety.

 > I'll take a closer look next week, once I get home from NYC, and I'll
 > submit an improved version for the January CF.
Thank you for your work! The current patch, apart from thecomments/regression tests, seems pretty reasonable to me.

Attached is a cleaned-up patch, with a simple regression test. I'll markthis as RFC and get it committed in a couple days.



regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 296dd75c1b..4a5f52287e 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -1716,6 +1716,7 @@ generate_orderedappend_paths(PlannerInfo *root, RelOptInfo *rel,
 		List	   *pathkeys = (List *) lfirst(lcp);
 		List	   *startup_subpaths = NIL;
 		List	   *total_subpaths = NIL;
+		List	   *fractional_subpaths = NIL;
 		bool		startup_neq_total = false;
 		ListCell   *lcr;
 		bool		match_partition_order;
@@ -1745,7 +1746,8 @@ generate_orderedappend_paths(PlannerInfo *root, RelOptInfo *rel,
 		{
 			RelOptInfo *childrel = (RelOptInfo *) lfirst(lcr);
 			Path	   *cheapest_startup,
-					   *cheapest_total;
+					   *cheapest_total,
+					   *cheapest_fractional = NULL;
 
 			/* Locate the right paths, if they are available. */
 			cheapest_startup =
@@ -1773,6 +1775,33 @@ generate_orderedappend_paths(PlannerInfo *root, RelOptInfo *rel,
 				Assert(cheapest_total->param_info == NULL);
 			}
 
+			/*
+			 * When needed (building fractional path), determine the cheapest
+			 * fractional path too.
+			 */
+			if (root->tuple_fraction > 0)
+			{
+				double	path_fraction = (1.0 / root->tuple_fraction);
+
+				cheapest_fractional =
+					get_cheapest_fractional_path_for_pathkeys(childrel->pathlist,
+															  pathkeys,
+															  NULL,
+															  path_fraction);
+
+				/*
+				 * If we found no path with matching pathkeys, use the cheapest
+				 * total path instead.
+				 *
+				 * XXX We might be smarter and consider partially sorted paths,
+				 * with just an incremental sort on top. That might be cheaper
+				 * than using cheapest_total with regular sort, but we'd have
+				 * to build all the incremental paths.
+				 */
+				if (!cheapest_fractional)
+					cheapest_fractional = cheapest_total;
+			}
+
 			/*
 			 * Notice whether we actually have different paths for the
 			 * "cheapest" and "total" cases; frequently there will be no point
@@ -1799,6 +1828,12 @@ generate_orderedappend_paths(PlannerInfo *root, RelOptInfo *rel,
 
 				startup_subpaths = lappend(startup_subpaths, cheapest_startup);
 				total_subpaths = lappend(total_subpaths, cheapest_total);
+
+				if (cheapest_fractional)
+				{
+					cheapest_fractional = get_singleton_append_subpath(cheapest_fractional);
+					fractional_subpaths = lappend(fractional_subpaths, cheapest_fractional);
+				}
 			}
 			else if (match_partition_order_desc)
 			{
@@ -1812,6 +1847,12 @@ generate_orderedappend_paths(PlannerInfo *root, RelOptInfo *rel,
 
 				startup_subpaths = lcons(cheapest_startup, startup_subpaths);
 				total_subpaths = lcons(cheapest_total, total_subpaths);
+
+				if (cheapest_fractional)
+				{
+					cheapest_fractional = get_singleton_append_subpath(cheapest_fractional);
+					fractional_subpaths = lcons(cheapest_fractional, fractional_subpaths);
+				}
 			}
 			else
 			{
@@ -1823,6 +1864,10 @@ generate_orderedappend_paths(PlannerInfo *root, RelOptInfo *rel,
 										  &startup_subpaths, NULL);
 				accumulate_append_subpath(cheapest_total,
 										  &total_subpaths, NULL);
+
+				if (cheapest_fractional)
+					accumulate_append_subpath(cheapest_fractional,
+											  &fractional_subpaths, NULL);
 			}
 		}
 
@@ -1849,6 +1894,17 @@ generate_orderedappend_paths(PlannerInfo *root, RelOptInfo *rel,
 														  0,
 														  false,
 														  -1));
+
+			if (fractional_subpaths)
+				add_path(rel, (Path *) create_append_path(root,
+														  rel,
+														  fractional_subpaths,
+														  NIL,
+														  pathkeys,
+														  NULL,
+														  0,
+														  false,
+														  -1));
 		}
 		else
 		{
@@ -1864,6 +1920,13 @@ generate_orderedappend_paths(PlannerInfo *root, RelOptInfo *rel,
 																total_subpaths,
 																pathkeys,
 																NULL));
+
+			if (fractional_subpaths)
+				add_path(rel, (Path *) create_merge_append_path(root,
+																rel,
+																fractional_subpaths,
+																pathkeys,
+																NULL));
 		}
 	}
 }
diff --git a/src/test/regress/expected/partition_join.out b/src/test/regress/expected/partition_join.out
index 27f7525b3e..bb5b7c47a4 100644
--- a/src/test/regress/expected/partition_join.out
+++ b/src/test/regress/expected/partition_join.out
@@ -4862,3 +4862,51 @@ SELECT t1.*, t2.* FROM alpha t1 INNER JOIN beta t2 ON (t1.a = t2.a AND t1.b = t2
   1 | 209 | 0009 |  1 | 209 | 0009
 (8 rows)
 
+-- partitionwise join with fractional paths
+CREATE TABLE fract_t (id BIGINT, PRIMARY KEY (id)) PARTITION BY RANGE (id);
+CREATE TABLE fract_t0 PARTITION OF fract_t FOR VALUES FROM ('0') TO ('1000');
+CREATE TABLE fract_t1 PARTITION OF fract_t FOR VALUES FROM ('1000') TO ('2000');
+-- insert data
+INSERT INTO fract_t (id) (SELECT generate_series(0, 1999));
+ANALYZE fract_t;
+-- verify plan; nested index only scans
+SET max_parallel_workers_per_gather = 0;
+SET enable_partitionwise_join = on;
+EXPLAIN (COSTS OFF)
+SELECT * FROM fract_t x LEFT JOIN fract_t y USING (id) ORDER BY id ASC LIMIT 10;
+                              QUERY PLAN                               
+-----------------------------------------------------------------------
+ Limit
+   ->  Merge Append
+         Sort Key: x.id
+         ->  Merge Left Join
+               Merge Cond: (x_1.id = y_1.id)
+               ->  Index Only Scan using fract_t0_pkey on fract_t0 x_1
+               ->  Index Only Scan using fract_t0_pkey on fract_t0 y_1
+         ->  Merge Left Join
+               Merge Cond: (x_2.id = y_2.id)
+               ->  Index Only Scan using fract_t1_pkey on fract_t1 x_2
+               ->  Index Only Scan using fract_t1_pkey on fract_t1 y_2
+(11 rows)
+
+EXPLAIN (COSTS OFF)
+SELECT * FROM fract_t x LEFT JOIN fract_t y USING (id) ORDER BY id DESC LIMIT 10;
+                                   QUERY PLAN                                   
+--------------------------------------------------------------------------------
+ Limit
+   ->  Merge Append
+         Sort Key: x.id DESC
+         ->  Nested Loop Left Join
+               ->  Index Only Scan Backward using fract_t0_pkey on fract_t0 x_1
+               ->  Index Only Scan using fract_t0_pkey on fract_t0 y_1
+                     Index Cond: (id = x_1.id)
+         ->  Nested Loop Left Join
+               ->  Index Only Scan Backward using fract_t1_pkey on fract_t1 x_2
+               ->  Index Only Scan using fract_t1_pkey on fract_t1 y_2
+                     Index Cond: (id = x_2.id)
+(11 rows)
+
+-- cleanup
+DROP TABLE fract_t;
+RESET max_parallel_workers_per_gather;
+RESET enable_partitionwise_join;
diff --git a/src/test/regress/sql/partition_join.sql b/src/test/regress/sql/partition_join.sql
index d97b5b69ff..67f506361f 100644
--- a/src/test/regress/sql/partition_join.sql
+++ b/src/test/regress/sql/partition_join.sql
@@ -1142,3 +1142,28 @@ SELECT t1.*, t2.* FROM alpha t1 INNER JOIN beta t2 ON (t1.a = t2.a AND t1.c = t2
 EXPLAIN (COSTS OFF)
 SELECT t1.*, t2.* FROM alpha t1 INNER JOIN beta t2 ON (t1.a = t2.a AND t1.b = t2.b AND t1.c = t2.c) WHERE ((t1.b >= 100 AND t1.b < 110) OR (t1.b >= 200 AND t1.b < 210)) AND ((t2.b >= 100 AND t2.b < 110) OR (t2.b >= 200 AND t2.b < 210)) AND t1.c IN ('0004', '0009') ORDER BY t1.a, t1.b;
 SELECT t1.*, t2.* FROM alpha t1 INNER JOIN beta t2 ON (t1.a = t2.a AND t1.b = t2.b AND t1.c = t2.c) WHERE ((t1.b >= 100 AND t1.b < 110) OR (t1.b >= 200 AND t1.b < 210)) AND ((t2.b >= 100 AND t2.b < 110) OR (t2.b >= 200 AND t2.b < 210)) AND t1.c IN ('0004', '0009') ORDER BY t1.a, t1.b;
+
+-- partitionwise join with fractional paths
+CREATE TABLE fract_t (id BIGINT, PRIMARY KEY (id)) PARTITION BY RANGE (id);
+CREATE TABLE fract_t0 PARTITION OF fract_t FOR VALUES FROM ('0') TO ('1000');
+CREATE TABLE fract_t1 PARTITION OF fract_t FOR VALUES FROM ('1000') TO ('2000');
+
+-- insert data
+INSERT INTO fract_t (id) (SELECT generate_series(0, 1999));
+ANALYZE fract_t;
+
+-- verify plan; nested index only scans
+SET max_parallel_workers_per_gather = 0;
+SET enable_partitionwise_join = on;
+
+EXPLAIN (COSTS OFF)
+SELECT * FROM fract_t x LEFT JOIN fract_t y USING (id) ORDER BY id ASC LIMIT 10;
+
+EXPLAIN (COSTS OFF)
+SELECT * FROM fract_t x LEFT JOIN fract_t y USING (id) ORDER BY id DESC LIMIT 10;
+
+-- cleanup
+DROP TABLE fract_t;
+
+RESET max_parallel_workers_per_gather;
+RESET enable_partitionwise_join;

Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path

Reply via email to