On 18/9/2024 17:48, Robert Haas wrote:
Comments?
Let me share my personal experience on path management in the planner.
The main thing important for extensions is flexibility - I would discuss
a decision that is not limited by join ordering but could be applied to
implement an index picking strategy, Memoize/Material choice versus a
non-cached one, choice of custom paths, etc.
The most flexible way I have found to this moment is a collaboration
between the get_relation_info_hook and add_path hook. In
get_relation_info, we have enough context and can add some information
to RelOptInfo - I added an extra list to this structure where extensions
can add helpful info. Using extensible nodes, we can tackle interference
between extensions.
The add_path hook can analyse new and old paths and also look into the
extensible data inside RelOptInfo. The issue with lots of calls eases by
quick return on the out-of-scope paths: usually extensions manage some
specific path type or relation and quick test of RelOptInfo::extra_list
allow to sort out unnecessary cases.
Being flexible, this approach is less invasive. Now, I use it to
implement heuristics demanded by clients for cases when the estimator
predicts only one row - usually, it means that the optimiser
underestimates cardinality. For example, in-place switch-off of NestLoop
if it uses too many clauses, or rules to pick up index scan if we have
alternative scans, each of them predicts only one tuple.
Positive outcomes includes: we don't alter path costs; extension may be
sure that core doesn't remove path from the list if the extension
forbids it.
In attachment - hooks for add_path and add_partial_path. As you can see,
because of differences in these routines hooks also implemented
differently. Also the compare_path_costs_fuzzily is exported, but it is
really good stuff for an extension.
--
regards, Andrei Lepikhov
From af8f5bd65e33c819723231ce433dcd51438b7ef0 Mon Sep 17 00:00:00 2001
From: "Andrei V. Lepikhov" <lepi...@gmail.com>
Date: Thu, 26 Sep 2024 10:34:08 +0200
Subject: [PATCH 1/2] Introduce compare_path_hook.
The add_path function is the only interface to safely add and remove paths in
the pathlist. Postgres core provides a few optimisation hooks that an extension
can use to offer additional paths. However, it is still uncertain whether such
a path will be replaced until the end of the path population process.
This hook allows the extension to control the comparison of a new path with
each pathlist's path and denote the optimiser which path is better.
It doesn't point to the optimiser directly about what exactly to do with the
incoming path and paths, already existed in pathlist.
Tag: optimizer.
caused by: PGPRO-10445, PGPRO-11017.
---
src/backend/optimizer/util/pathnode.c | 21 +++++++++++----------
src/include/optimizer/pathnode.h | 17 +++++++++++++++++
2 files changed, 28 insertions(+), 10 deletions(-)
diff --git a/src/backend/optimizer/util/pathnode.c
b/src/backend/optimizer/util/pathnode.c
index fc97bf6ee2..ae57932862 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -31,13 +31,6 @@
#include "utils/memutils.h"
#include "utils/selfuncs.h"
-typedef enum
-{
- COSTS_EQUAL, /* path costs are fuzzily equal
*/
- COSTS_BETTER1, /* first path is cheaper than
second */
- COSTS_BETTER2, /* second path is cheaper than
first */
- COSTS_DIFFERENT, /* neither path dominates the
other on cost */
-} PathCostComparison;
/*
* STD_FUZZ_FACTOR is the normal fuzz factor for compare_path_costs_fuzzily.
@@ -46,6 +39,9 @@ typedef enum
*/
#define STD_FUZZ_FACTOR 1.01
+/* Hook for plugins to get control over the add_path decision */
+compare_path_hook_type compare_path_hook = NULL;
+
static List *translate_sub_tlist(List *tlist, int relid);
static int append_total_cost_compare(const ListCell *a, const ListCell *b);
static int append_startup_cost_compare(const ListCell *a, const ListCell
*b);
@@ -178,7 +174,7 @@ compare_fractional_path_costs(Path *path1, Path *path2,
* (But if total costs are fuzzily equal, we compare startup costs anyway,
* in hopes of eliminating one path or the other.)
*/
-static PathCostComparison
+PathCostComparison
compare_path_costs_fuzzily(Path *path1, Path *path2, double fuzz_factor)
{
#define CONSIDER_PATH_STARTUP_COST(p) \
@@ -490,8 +486,13 @@ add_path(RelOptInfo *parent_rel, Path *new_path)
/*
* Do a fuzzy cost comparison with standard fuzziness limit.
*/
- costcmp = compare_path_costs_fuzzily(new_path, old_path,
-
STD_FUZZ_FACTOR);
+
+ if (compare_path_hook)
+ costcmp = (*compare_path_hook) (parent_rel, new_path,
old_path,
+
STD_FUZZ_FACTOR);
+ else
+ costcmp = compare_path_costs_fuzzily(new_path, old_path,
+
STD_FUZZ_FACTOR);
/*
* If the two paths compare differently for startup and total
cost,
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index 1035e6560c..4b95d85e1a 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -18,11 +18,28 @@
#include "nodes/pathnodes.h"
+typedef enum
+{
+ COSTS_EQUAL, /* path costs are fuzzily equal
*/
+ COSTS_BETTER1, /* first path is cheaper than
second */
+ COSTS_BETTER2, /* second path is cheaper than
first */
+ COSTS_DIFFERENT, /* neither path dominates the
other on cost */
+} PathCostComparison;
+
+/* Hook for plugins to get control when grouping_planner() plans upper rels */
+typedef PathCostComparison (*compare_path_hook_type) (RelOptInfo *parent_rel,
+
Path *new_path,
+
Path *old_path,
+
double fuzz_factor);
+extern PGDLLIMPORT compare_path_hook_type compare_path_hook;
+
/*
* prototypes for pathnode.c
*/
extern int compare_path_costs(Path *path1, Path *path2,
CostSelector
criterion);
+extern PathCostComparison
+compare_path_costs_fuzzily(Path *path1, Path *path2, double fuzz_factor);
extern int compare_fractional_path_costs(Path *path1, Path *path2,
double fraction);
extern void set_cheapest(RelOptInfo *parent_rel);
--
2.46.2
From f63c93a404554f7e48a01ed14661a119ae10b604 Mon Sep 17 00:00:00 2001
From: "Andrei V. Lepikhov" <lepi...@gmail.com>
Date: Mon, 30 Sep 2024 13:32:43 +0700
Subject: [PATCH 2/2] Introduce compare_partial_path_hook.
By analogy of compare_path_hook, let an extension to alter decisions on
accepting new and removing old path. The add_partial_path() routine has
different logic. Hence, the hook also differs from non-partial case.
Also, move hooks and newly exported functions to paths.h. Do we need to
revert it and allow to stay in the pathnode.h?
---
src/backend/optimizer/util/pathnode.c | 5 +++++
src/include/optimizer/pathnode.h | 17 -----------------
src/include/optimizer/paths.h | 24 ++++++++++++++++++++++++
3 files changed, 29 insertions(+), 17 deletions(-)
diff --git a/src/backend/optimizer/util/pathnode.c
b/src/backend/optimizer/util/pathnode.c
index ae57932862..5f91ead226 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -41,6 +41,7 @@
/* Hook for plugins to get control over the add_path decision */
compare_path_hook_type compare_path_hook = NULL;
+compare_partial_path_hook_type compare_partial_path_hook = NULL;
static List *translate_sub_tlist(List *tlist, int relid);
static int append_total_cost_compare(const ListCell *a, const ListCell *b);
@@ -870,6 +871,10 @@ add_partial_path(RelOptInfo *parent_rel, Path *new_path)
}
}
+ if (compare_partial_path_hook)
+ (*compare_partial_path_hook)(parent_rel, new_path,
old_path,
+
&accept_new, &remove_old);
+
/*
* Remove current element from partial_pathlist if dominated by
new.
*/
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index 4b95d85e1a..1035e6560c 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -18,28 +18,11 @@
#include "nodes/pathnodes.h"
-typedef enum
-{
- COSTS_EQUAL, /* path costs are fuzzily equal
*/
- COSTS_BETTER1, /* first path is cheaper than
second */
- COSTS_BETTER2, /* second path is cheaper than
first */
- COSTS_DIFFERENT, /* neither path dominates the
other on cost */
-} PathCostComparison;
-
-/* Hook for plugins to get control when grouping_planner() plans upper rels */
-typedef PathCostComparison (*compare_path_hook_type) (RelOptInfo *parent_rel,
-
Path *new_path,
-
Path *old_path,
-
double fuzz_factor);
-extern PGDLLIMPORT compare_path_hook_type compare_path_hook;
-
/*
* prototypes for pathnode.c
*/
extern int compare_path_costs(Path *path1, Path *path2,
CostSelector
criterion);
-extern PathCostComparison
-compare_path_costs_fuzzily(Path *path1, Path *path2, double fuzz_factor);
extern int compare_fractional_path_costs(Path *path1, Path *path2,
double fraction);
extern void set_cheapest(RelOptInfo *parent_rel);
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 54869d4401..47ae26033c 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -271,4 +271,28 @@ extern PathKey *make_canonical_pathkey(PlannerInfo *root,
extern void add_paths_to_append_rel(PlannerInfo *root, RelOptInfo *rel,
List
*live_childrels);
+
+typedef enum
+{
+ COSTS_EQUAL, /* path costs are fuzzily equal
*/
+ COSTS_BETTER1, /* first path is cheaper than
second */
+ COSTS_BETTER2, /* second path is cheaper than
first */
+ COSTS_DIFFERENT, /* neither path dominates the
other on cost */
+} PathCostComparison;
+
+extern PathCostComparison compare_path_costs_fuzzily(Path *path1, Path *path2,
+
double fuzz_factor);
+/* Hook for plugins to get control when grouping_planner() plans upper rels */
+typedef PathCostComparison (*compare_path_hook_type) (RelOptInfo *rel,
+
Path *new_path,
+
Path *old_path,
+
double fuzz_factor);
+typedef void (*compare_partial_path_hook_type) (RelOptInfo *rel,
+
Path *new_path,
+
Path *old_path,
+
bool *accept_new,
+
bool *remove_old);
+extern PGDLLIMPORT compare_path_hook_type compare_path_hook;
+extern PGDLLIMPORT compare_partial_path_hook_type compare_partial_path_hook;
+
#endif /* PATHS_H */
--
2.46.2