On Sat, Sep 25, 2021 at 05:31:52PM -0500, Justin Pryzby wrote:
> It seems like your patch should also check "inh" in examine_variable and
> statext_expressions_load.

I tried adding that - I mostly kept my patches separate. Hopefully this is
more helpful than a complication. I added it at:
https://commitfest.postgresql.org/35/3332/

+	/* create only the "stxdinherit=false", because that always exists */
+	datavalues[Anum_pg_statistic_ext_data_stxdinherit - 1] = ObjectIdGetDatum(false);

That'd be confusing for partitioned tables, no? They'd always have a row
with no data. I guess it could be
stxdinherit = BoolGetDatum(rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
(not ObjectIdGetDatum). Then, that affects the loops which delete the
tuples - neither inh nor !inh is guaranteed, unless you check relkind there,
too.

BTW, you'd need to add an "inherited" column to \dX if you added the "built"
data back.

Also, I think in back branches we should document what's being stored in
pg_statistic_ext, since it's pretty unintuitive:
 - non-inherited stats (FROM ONLY) for inheritance parents;
 - inherited stats (FROM *) for partitioned tables;

I think the !inh decision in 859b3003de was basically backwards. I think
it'd be rare for someone to put extended stats on a parent for improving
plans involving FROM ONLY. But it's not worth trying to fix now, since it
would change plans in irreversible ways.

Also, if the stx data were already populated, users would have to run a
manual analyze after upgrading to populate the catalog with the data the
planner would expect in the new version, or else it would end up being the
opposite of the issue I mentioned: non-inherited stats (from before the
upgrade) would be applied by the planner (after the upgrade) to inherited
queries.

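To illustrate the point about the delete loops, here is a minimal,
self-contained sketch. It is not PostgreSQL code: SketchStatsData and the
sketch_* helpers are hypothetical names made up for this example. It models
the stxdinherit=false/true rows as two flags, picks the initial row by
relkind as suggested above, and shows a removal loop which tolerates either
row being absent.

```c
#include <assert.h>
#include <stdbool.h>

/* pg_class.relkind letter for a partitioned table */
#define RELKIND_PARTITIONED_TABLE 'p'

/* Hypothetical model: slot 0 is the stxdinherit=false row, slot 1 the true row. */
typedef struct SketchStatsData
{
	bool		has_row[2];
} SketchStatsData;

/* Which row to create up front, following the suggestion above. */
bool
sketch_initial_stxdinherit(char relkind)
{
	return relkind == RELKIND_PARTITIONED_TABLE;
}

/* Model of CREATE STATISTICS: exactly one (empty) row exists afterwards. */
void
sketch_create(SketchStatsData *d, char relkind)
{
	d->has_row[0] = false;
	d->has_row[1] = false;
	d->has_row[sketch_initial_stxdinherit(relkind) ? 1 : 0] = true;
}

/* Model of the delete loop: a missing row is skipped, not treated as an error. */
int
sketch_remove_all(SketchStatsData *d)
{
	int			deleted = 0;

	for (int inh = 0; inh <= 1; inh++)
	{
		if (!d->has_row[inh])
			continue;
		d->has_row[inh] = false;
		deleted++;
	}
	return deleted;
}
```

With this choice a plain table starts with only the inh=false row and a
partitioned table with only the inh=true row, so a loop that insists on
either one existing would fail for one of the two relkinds.
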
From b8f9e453b6cd05718a7a149f7e472d6c2c28c8a6 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryz...@telsasoft.com>
Date: Sat, 25 Sep 2021 19:42:41 -0500
Subject: [PATCH 1/5] Do not use extended statistics on inheritance trees.

Since 859b3003de, inherited ext stats are not built. However, the
non-inherited stats were incorrectly used during planning of queries with
inheritance hierarchies. Since the ext stats do not include child tables,
they can lead to worse estimates.

choose_best_statistics is handled a bit differently (in the calling
function), because it isn't passed rel nor rel->inh, and it's an exported
function, so avoid changing its signature in back branches.

https://www.postgresql.org/message-id/flat/20210925223152.ga7...@telsasoft.com

Backpatch to v10
---
 src/backend/statistics/dependencies.c   |  5 +++++
 src/backend/statistics/extended_stats.c |  5 +++++
 src/backend/utils/adt/selfuncs.c        |  9 +++++++++
 src/test/regress/expected/stats_ext.out | 23 +++++++++++++++++++++++
 src/test/regress/sql/stats_ext.sql      | 14 ++++++++++++++
 5 files changed, 56 insertions(+)

diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
index 8bf80db8e4..b2e33329c7 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -1593,6 +1593,11 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 		int			nexprs;
 		int			k;
 		MVDependencies *deps;
+		RangeTblEntry *rte = root->simple_rte_array[rel->relid];
+
+		/* If it's an inheritence tree, skip statistics (which do not include child stats) */
+		if (rte->inh)
+			break;
 
 		/* skip statistics that are not of the correct type */
 		if (stat->kind != STATS_EXT_DEPENDENCIES)
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index 0a7d12d467..eea38a5bc7 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -1744,6 +1744,11 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 		StatisticExtInfo *stat;
 		List	   *stat_clauses;
 		Bitmapset  *simple_clauses;
+		RangeTblEntry *rte = root->simple_rte_array[rel->relid];
+
+		/* If it's an inheritence tree, skip statistics (which do not include child stats) */
+		if (rte->inh)
+			break;
 
 		/* find the best suited statistics object for these attnums */
 		stat = choose_best_statistics(rel->statlist, STATS_EXT_MCV,
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index abcb628a39..7533574fdc 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3913,6 +3913,11 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
 	Oid			statOid = InvalidOid;
 	MVNDistinct *stats;
 	StatisticExtInfo *matched_info = NULL;
+	RangeTblEntry *rte = root->simple_rte_array[rel->relid];
+
+	/* If it's an inheritence tree, skip statistics (which do not include child stats) */
+	if (rte->inh)
+		return false;
 
 	/* bail out immediately if the table has no extended statistics */
 	if (!rel->statlist)
@@ -5232,6 +5237,10 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 			if (vardata->statsTuple)
 				break;
 
+			/* If it's an inheritence tree, skip statistics (which do not include child stats) */
+			if (planner_rt_fetch(onerel->relid, root)->inh)
+				break;
+
 			/* skip stats without per-expression stats */
 			if (info->kind != STATS_EXT_EXPRESSIONS)
 				continue;
diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out
index c60ba45aba..5c15e44bd6 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -176,6 +176,29 @@ CREATE STATISTICS ab1_a_b_stats ON a, b FROM ab1;
 ANALYZE ab1;
 DROP TABLE ab1 CASCADE;
 NOTICE:  drop cascades to table ab1c
+-- Ensure non-inherited stats are not applied to inherited query
+CREATE TABLE stxdinh(i int, j int);
+CREATE TABLE stxdinh1() INHERITS(stxdinh);
+INSERT INTO stxdinh SELECT a, a/10 FROM generate_series(1,9)a;
+INSERT INTO stxdinh1 SELECT a, a FROM generate_series(1,999)a;
+VACUUM ANALYZE stxdinh, stxdinh1;
+-- Without stats object, it looks like this
+SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+ estimated | actual
+-----------+--------
+      1000 |   1008
+(1 row)
+
+CREATE STATISTICS stxdinh ON i,j FROM stxdinh;
+VACUUM ANALYZE stxdinh, stxdinh1;
+-- Since the stats object does not include inherited stats, it should not affect the estimates
+SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+ estimated | actual
+-----------+--------
+      1000 |   1008
+(1 row)
+
+DROP TABLE stxdinh, stxdinh1;
 -- basic test for statistics on expressions
 CREATE TABLE ab1 (a INTEGER, b INTEGER, c TIMESTAMP, d TIMESTAMPTZ);
 -- expression stats may be built on a single expression column
diff --git a/src/test/regress/sql/stats_ext.sql b/src/test/regress/sql/stats_ext.sql
index 6fb37962a7..610f7ed17f 100644
--- a/src/test/regress/sql/stats_ext.sql
+++ b/src/test/regress/sql/stats_ext.sql
@@ -112,6 +112,20 @@ CREATE STATISTICS ab1_a_b_stats ON a, b FROM ab1;
 ANALYZE ab1;
 DROP TABLE ab1 CASCADE;
 
+-- Ensure non-inherited stats are not applied to inherited query
+CREATE TABLE stxdinh(i int, j int);
+CREATE TABLE stxdinh1() INHERITS(stxdinh);
+INSERT INTO stxdinh SELECT a, a/10 FROM generate_series(1,9)a;
+INSERT INTO stxdinh1 SELECT a, a FROM generate_series(1,999)a;
+VACUUM ANALYZE stxdinh, stxdinh1;
+-- Without stats object, it looks like this
+SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+CREATE STATISTICS stxdinh ON i,j FROM stxdinh;
+VACUUM ANALYZE stxdinh, stxdinh1;
+-- Since the stats object does not include inherited stats, it should not affect the estimates
+SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+DROP TABLE stxdinh, stxdinh1;
+
 -- basic test for statistics on expressions
 CREATE TABLE ab1 (a INTEGER, b INTEGER, c TIMESTAMP, d TIMESTAMPTZ);
 
-- 
2.17.0

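As a sanity check on the "actual" column in the stxdinh test above, the
group counts can be reproduced outside the server. The sketch below is plain
C with hypothetical helper names (sketch_*), and uses no PostgreSQL APIs. It
builds the same (i, j) rows as the two INSERTs and counts distinct pairs,
which is what GROUP BY 1,2 returns.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct SketchRow
{
	int			i;
	int			j;
} SketchRow;

/* Order rows by (i, j) so that equal pairs become adjacent. */
static int
sketch_cmp(const void *a, const void *b)
{
	const SketchRow *x = a;
	const SketchRow *y = b;

	if (x->i != y->i)
		return (x->i > y->i) - (x->i < y->i);
	return (x->j > y->j) - (x->j < y->j);
}

/* Count distinct (i, j) pairs; sorts the array in place. */
static int
sketch_count_groups(SketchRow *rows, int n)
{
	int			groups = 0;

	qsort(rows, (size_t) n, sizeof(SketchRow), sketch_cmp);
	for (int k = 0; k < n; k++)
		if (k == 0 || sketch_cmp(&rows[k - 1], &rows[k]) != 0)
			groups++;
	return groups;
}

/* Parent only: SELECT a, a/10 FROM generate_series(1,9) */
int
sketch_groups_parent_only(void)
{
	SketchRow	rows[9];
	int			n = 0;

	for (int a = 1; a <= 9; a++)
		rows[n++] = (SketchRow) {a, a / 10};
	return sketch_count_groups(rows, n);
}

/* Parent plus child: adds SELECT a, a FROM generate_series(1,999) */
int
sketch_groups_whole_tree(void)
{
	SketchRow	rows[9 + 999];
	int			n = 0;

	for (int a = 1; a <= 9; a++)
		rows[n++] = (SketchRow) {a, a / 10};
	for (int a = 1; a <= 999; a++)
		rows[n++] = (SketchRow) {a, a};
	return sketch_count_groups(rows, n);
}
```

The parent alone has 9 groups, while the whole tree has 1008, so a
statistics object built only from the parent's rows describes a very
different data set from the one the inherited query reads.
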
From 1c35e88903222e8f1624babd3900a499fdfee2f2 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas.von...@enterprisedb.com>
Date: Sat, 25 Sep 2021 23:01:21 +0200
Subject: [PATCH 2/5] Build inherited extended stats on partitioned tables

Since 859b3003de, ext stats on partitioned tables are not built, which is a
regression.

For back branches, pg_statistic_ext cannot support both inherited (FROM) and
non-inherited (FROM ONLY) stats on inheritance hierarchies. But there's no
issue building inherited stats for partitioned tables, which are empty, so
cannot have non-inherited stats.

See also: 8c5cdb7f4f6e1d6a6104cb58ce4f23453891651b
https://www.postgresql.org/message-id/20210923212624.GI831%40telsasoft.com

Backpatch to v10
---
 src/backend/commands/analyze.c          |  5 ++++-
 src/backend/statistics/dependencies.c   |  2 +-
 src/backend/statistics/extended_stats.c |  2 +-
 src/backend/utils/adt/selfuncs.c        |  9 ++++++---
 src/test/regress/expected/stats_ext.out | 19 +++++++++++++++++++
 src/test/regress/sql/stats_ext.sql      | 10 ++++++++++
 6 files changed, 41 insertions(+), 6 deletions(-)

diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 8bfb2ad958..299f4893b8 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -548,6 +548,7 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 	{
 		MemoryContext col_context,
 					old_context;
+		bool		build_ext_stats;
 
 		pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
 									 PROGRESS_ANALYZE_PHASE_COMPUTE_STATS);
@@ -611,13 +612,15 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 							thisdata->attr_cnt, thisdata->vacattrstats);
 		}
 
+		build_ext_stats = (onerel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) ? inh : (!inh);
+
 		/*
 		 * Build extended statistics (if there are any).
 		 *
 		 * For now we only build extended statistics on individual relations,
 		 * not for relations representing inheritance trees.
 		 */
-		if (!inh)
+		if (build_ext_stats)
 			BuildRelationExtStatistics(onerel, totalrows, numrows, rows,
 									   attr_cnt, vacattrstats);
 	}
diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
index b2e33329c7..0659307b02 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -1596,7 +1596,7 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 		RangeTblEntry *rte = root->simple_rte_array[rel->relid];
 
 		/* If it's an inheritence tree, skip statistics (which do not include child stats) */
-		if (rte->inh)
+		if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
 			break;
 
 		/* skip statistics that are not of the correct type */
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index eea38a5bc7..b40ad9da2b 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -1747,7 +1747,7 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 		RangeTblEntry *rte = root->simple_rte_array[rel->relid];
 
 		/* If it's an inheritence tree, skip statistics (which do not include child stats) */
-		if (rte->inh)
+		if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
 			break;
 
 		/* find the best suited statistics object for these attnums */
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 7533574fdc..d782605953 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3916,7 +3916,7 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
 	RangeTblEntry *rte = root->simple_rte_array[rel->relid];
 
 	/* If it's an inheritence tree, skip statistics (which do not include child stats) */
-	if (rte->inh)
+	if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
 		return false;
 
 	/* bail out immediately if the table has no extended statistics */
@@ -5238,8 +5238,11 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 				break;
 
 			/* If it's an inheritence tree, skip statistics (which do not include child stats) */
-			if (planner_rt_fetch(onerel->relid, root)->inh)
-				break;
+			{
+				RangeTblEntry *rte = planner_rt_fetch(onerel->relid, root);
+				if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
+					break;
+			}
 
 			/* skip stats without per-expression stats */
 			if (info->kind != STATS_EXT_EXPRESSIONS)
diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out
index 5c15e44bd6..67234b9fc2 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -199,6 +199,25 @@ SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
 (1 row)
 
 DROP TABLE stxdinh, stxdinh1;
+-- Ensure inherited stats ARE applied to inherited query in partitioned table
+CREATE TABLE stxdinp(i int, a int, b int) PARTITION BY RANGE (i);
+CREATE TABLE stxdinp1 PARTITION OF stxdinp FOR VALUES FROM (1)TO(100);
+INSERT INTO stxdinp SELECT 1, a/100, a/100 FROM generate_series(1,999)a;
+CREATE STATISTICS stxdinp ON (a),(b) FROM stxdinp;
+VACUUM ANALYZE stxdinp; -- partitions are processed recursively
+SELECT 1 FROM pg_statistic_ext WHERE stxrelid='stxdinp'::regclass;
+ ?column?
+----------
+        1
+(1 row)
+
+SELECT * FROM check_estimated_rows('SELECT a, b FROM stxdinp GROUP BY 1,2');
+ estimated | actual
+-----------+--------
+        10 |     10
+(1 row)
+
+DROP TABLE stxdinp;
 -- basic test for statistics on expressions
 CREATE TABLE ab1 (a INTEGER, b INTEGER, c TIMESTAMP, d TIMESTAMPTZ);
 -- expression stats may be built on a single expression column
diff --git a/src/test/regress/sql/stats_ext.sql b/src/test/regress/sql/stats_ext.sql
index 610f7ed17f..2371043ca1 100644
--- a/src/test/regress/sql/stats_ext.sql
+++ b/src/test/regress/sql/stats_ext.sql
@@ -126,6 +126,16 @@ VACUUM ANALYZE stxdinh, stxdinh1;
 SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
 DROP TABLE stxdinh, stxdinh1;
 
+-- Ensure inherited stats ARE applied to inherited query in partitioned table
+CREATE TABLE stxdinp(i int, a int, b int) PARTITION BY RANGE (i);
+CREATE TABLE stxdinp1 PARTITION OF stxdinp FOR VALUES FROM (1)TO(100);
+INSERT INTO stxdinp SELECT 1, a/100, a/100 FROM generate_series(1,999)a;
+CREATE STATISTICS stxdinp ON (a),(b) FROM stxdinp;
+VACUUM ANALYZE stxdinp; -- partitions are processed recursively
+SELECT 1 FROM pg_statistic_ext WHERE stxrelid='stxdinp'::regclass;
+SELECT * FROM check_estimated_rows('SELECT a, b FROM stxdinp GROUP BY 1,2');
+DROP TABLE stxdinp;
+
 -- basic test for statistics on expressions
 CREATE TABLE ab1 (a INTEGER, b INTEGER, c TIMESTAMP, d TIMESTAMPTZ);
 
-- 
2.17.0

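The two rules in the partitioned-table patch above (when ANALYZE builds
extended stats, and when the planner skips them) can be summarized as a
small truth table. The sketch below is standalone C, not backend code;
sketch_build_ext_stats and sketch_skip_ext_stats are hypothetical names, and
only the relkind letters ('r', 'p') are taken from pg_class.

```c
#include <assert.h>
#include <stdbool.h>

/* pg_class.relkind letters */
#define RELKIND_RELATION			'r'
#define RELKIND_PARTITIONED_TABLE	'p'

/*
 * ANALYZE side: mirrors the build_ext_stats expression in do_analyze_rel.
 * Partitioned tables get inherited stats; everything else gets
 * non-inherited stats.
 */
bool
sketch_build_ext_stats(char relkind, bool inh)
{
	return (relkind == RELKIND_PARTITIONED_TABLE) ? inh : !inh;
}

/*
 * Planner side: mirrors the "rte->inh && rte->relkind != ..." checks.
 * Stored stats are skipped for an inherited scan unless the parent is a
 * partitioned table, whose stored stats are the inherited ones.
 */
bool
sketch_skip_ext_stats(char relkind, bool rte_inh)
{
	return rte_inh && relkind != RELKIND_PARTITIONED_TABLE;
}
```

In each case the planner uses the stored stats only for the kind of scan
they were built from: FROM ONLY for a plain inheritance parent, and the
whole tree for a partitioned table.
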
From fbb3a640c6416a19d0851c7a18ec0e9190e1c1ed Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas.von...@enterprisedb.com>
Date: Sat, 25 Sep 2021 21:27:10 +0200
Subject: [PATCH 3/5] Add stxdinherit; build inherited extended stats on inheritance parents

pg_statistic has an inherited flag which is part of the unique index, but
pg_statistic_ext has never had that.

In back branches, pg_statistic_ext cannot store both inherited and
non-inherited stats. So it stores non-inherited stats (FROM ONLY) for
inheritance parents and inherited stats for partitioned tables.

This patch allows storing both inherited and non-inherited stats for
non-empty inheritance parents, and avoids the above, confusing definition.
---
 doc/src/sgml/catalogs.sgml                  |  23 +++
 src/backend/catalog/system_views.sql        |   1 +
 src/backend/commands/analyze.c              |  15 +-
 src/backend/commands/statscmds.c            |  20 ++-
 src/backend/optimizer/util/plancat.c        | 186 +++++++++++---------
 src/backend/statistics/dependencies.c       |  13 +-
 src/backend/statistics/extended_stats.c     |  67 ++++---
 src/backend/statistics/mcv.c                |   9 +-
 src/backend/statistics/mvdistinct.c         |   5 +-
 src/backend/utils/adt/selfuncs.c            |   2 +-
 src/backend/utils/cache/syscache.c          |   6 +-
 src/include/catalog/pg_statistic_ext_data.h |   4 +-
 src/include/nodes/pathnodes.h               |   1 +
 src/include/statistics/statistics.h         |   9 +-
 src/test/regress/expected/rules.out         |   1 +
 15 files changed, 217 insertions(+), 145 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 2f0def9b19..e256a5533c 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7441,6 +7441,19 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
    created with <link linkend="sql-createstatistics"><command>CREATE STATISTICS</command></link>.
   </para>
 
+  <para>
+   Normally there is one entry, with <structfield>stxdinherit</structfield> =
+   <literal>false</literal>, for each statistics object that has been analyzed.
+   If the table has inheritance children, a second entry with
+   <structfield>stxdinherit</structfield> = <literal>true</literal> is also created.
+   This row represents the statistics object over the inheritance tree, i.e.,
+   statistics for the data you'd see with
+   <literal>SELECT * FROM <replaceable>table</replaceable>*</literal>,
+   whereas the <structfield>stxdinherit</structfield> = <literal>false</literal> row
+   represents the results of
+   <literal>SELECT * FROM ONLY <replaceable>table</replaceable></literal>.
+  </para>
+
   <para>
    Like <link linkend="catalog-pg-statistic"><structname>pg_statistic</structname></link>,
    <structname>pg_statistic_ext_data</structname> should not be
@@ -7480,6 +7493,16 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>stxdinherit</structfield> <type>bool</type>
+      </para>
+      <para>
+       If true, the stats include inheritance child columns, not just the
+       values in the specified relation
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>stxdndistinct</structfield> <type>pg_ndistinct</type>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 55f6e3711d..07ab18dc52 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -266,6 +266,7 @@ CREATE VIEW pg_stats_ext WITH (security_barrier) AS
            ) AS attnames,
            pg_get_statisticsobjdef_expressions(s.oid) as exprs,
            s.stxkind AS kinds,
+           sd.stxdinherit AS inherited,
            sd.stxdndistinct AS n_distinct,
            sd.stxddependencies AS dependencies,
            m.most_common_vals,
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 299f4893b8..7f4b0f5320 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -548,7 +548,6 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 	{
 		MemoryContext col_context,
 					old_context;
-		bool		build_ext_stats;
 
 		pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
 									 PROGRESS_ANALYZE_PHASE_COMPUTE_STATS);
@@ -612,17 +611,9 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 							thisdata->attr_cnt, thisdata->vacattrstats);
 		}
 
-		build_ext_stats = (onerel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) ? inh : (!inh);
-
-		/*
-		 * Build extended statistics (if there are any).
-		 *
-		 * For now we only build extended statistics on individual relations,
-		 * not for relations representing inheritance trees.
-		 */
-		if (build_ext_stats)
-			BuildRelationExtStatistics(onerel, totalrows, numrows, rows,
-									   attr_cnt, vacattrstats);
+		/* Build extended statistics (if there are any). */
+		BuildRelationExtStatistics(onerel, inh, totalrows, numrows, rows,
+								   attr_cnt, vacattrstats);
 	}
 
 	pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index e9e2382ceb..3a9f416f39 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -524,6 +524,9 @@ CreateStatistics(CreateStatsStmt *stmt)
 
 	datavalues[Anum_pg_statistic_ext_data_stxoid - 1] = ObjectIdGetDatum(statoid);
 
+	/* create only the "stxdinherit=false", because that always exists */
+	datavalues[Anum_pg_statistic_ext_data_stxdinherit - 1] = ObjectIdGetDatum(false);
+
 	/* no statistics built yet */
 	datanulls[Anum_pg_statistic_ext_data_stxdndistinct - 1] = true;
 	datanulls[Anum_pg_statistic_ext_data_stxddependencies - 1] = true;
@@ -726,6 +729,7 @@ RemoveStatisticsById(Oid statsOid)
 	HeapTuple	tup;
 	Form_pg_statistic_ext statext;
 	Oid			relid;
+	int			inh;
 
 	/*
 	 * First delete the pg_statistic_ext_data tuple holding the actual
@@ -733,14 +737,20 @@ RemoveStatisticsById(Oid statsOid)
 	 */
 	relation = table_open(StatisticExtDataRelationId, RowExclusiveLock);
 
-	tup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(statsOid));
+	/* hack to delete both stxdinherit = true/false */
+	for (inh = 0; inh <= 1; inh++)
+	{
+		tup = SearchSysCache2(STATEXTDATASTXOID, ObjectIdGetDatum(statsOid),
+							  BoolGetDatum(inh));
 
-	if (!HeapTupleIsValid(tup)) /* should not happen */
-		elog(ERROR, "cache lookup failed for statistics data %u", statsOid);
+		if (!HeapTupleIsValid(tup)) /* should not happen */
+			// elog(ERROR, "cache lookup failed for statistics data %u", statsOid);
+			continue;
 
-	CatalogTupleDelete(relation, &tup->t_self);
+		CatalogTupleDelete(relation, &tup->t_self);
 
-	ReleaseSysCache(tup);
+		ReleaseSysCache(tup);
+	}
 
 	table_close(relation, RowExclusiveLock);
 
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index c5194fdbbf..154d48a330 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -30,6 +30,7 @@
 #include "catalog/pg_am.h"
 #include "catalog/pg_proc.h"
 #include "catalog/pg_statistic_ext.h"
+#include "catalog/pg_statistic_ext_data.h"
 #include "foreign/fdwapi.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
@@ -1311,127 +1312,144 @@ get_relation_statistics(RelOptInfo *rel, Relation relation)
 	{
 		Oid			statOid = lfirst_oid(l);
 		Form_pg_statistic_ext staForm;
+		Form_pg_statistic_ext_data dataForm;
 		HeapTuple	htup;
 		HeapTuple	dtup;
 		Bitmapset  *keys = NULL;
 		List	   *exprs = NIL;
 		int			i;
+		int			inh;
 
 		htup = SearchSysCache1(STATEXTOID, ObjectIdGetDatum(statOid));
 		if (!HeapTupleIsValid(htup))
 			elog(ERROR, "cache lookup failed for statistics object %u", statOid);
 		staForm = (Form_pg_statistic_ext) GETSTRUCT(htup);
 
-		dtup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(statOid));
-		if (!HeapTupleIsValid(dtup))
-			elog(ERROR, "cache lookup failed for statistics object %u", statOid);
-
-		/*
-		 * First, build the array of columns covered.  This is ultimately
-		 * wasted if no stats within the object have actually been built, but
-		 * it doesn't seem worth troubling over that case.
-		 */
-		for (i = 0; i < staForm->stxkeys.dim1; i++)
-			keys = bms_add_member(keys, staForm->stxkeys.values[i]);
-
 		/*
-		 * Preprocess expressions (if any). We read the expressions, run them
-		 * through eval_const_expressions, and fix the varnos.
+		 * Hack to load stats with stxdinherit true/false - there should be
+		 * a better way to do this, I guess.
 		 */
+		for (inh = 0; inh <= 1; inh++)
 		{
-			bool		isnull;
-			Datum		datum;
+			dtup = SearchSysCache2(STATEXTDATASTXOID,
+								   ObjectIdGetDatum(statOid), BoolGetDatum((bool) inh));
+			if (!HeapTupleIsValid(dtup))
+				continue;
 
-			/* decode expression (if any) */
-			datum = SysCacheGetAttr(STATEXTOID, htup,
-									Anum_pg_statistic_ext_stxexprs, &isnull);
+			dataForm = (Form_pg_statistic_ext_data) GETSTRUCT(dtup);
 
-			if (!isnull)
+			/*
+			 * First, build the array of columns covered.  This is ultimately
+			 * wasted if no stats within the object have actually been built, but
+			 * it doesn't seem worth troubling over that case.
+			 */
+			for (i = 0; i < staForm->stxkeys.dim1; i++)
+				keys = bms_add_member(keys, staForm->stxkeys.values[i]);
+
+			/*
+			 * Preprocess expressions (if any). We read the expressions, run them
+			 * through eval_const_expressions, and fix the varnos.
+			 */
 			{
-				char	   *exprsString;
+				bool		isnull;
+				Datum		datum;
 
-				exprsString = TextDatumGetCString(datum);
-				exprs = (List *) stringToNode(exprsString);
-				pfree(exprsString);
+				/* decode expression (if any) */
+				datum = SysCacheGetAttr(STATEXTOID, htup,
+										Anum_pg_statistic_ext_stxexprs, &isnull);
 
-				/*
-				 * Run the expressions through eval_const_expressions. This is
-				 * not just an optimization, but is necessary, because the
-				 * planner will be comparing them to similarly-processed qual
-				 * clauses, and may fail to detect valid matches without this.
-				 * We must not use canonicalize_qual, however, since these
-				 * aren't qual expressions.
-				 */
-				exprs = (List *) eval_const_expressions(NULL, (Node *) exprs);
+				if (!isnull)
+				{
+					char	   *exprsString;
 
-				/* May as well fix opfuncids too */
-				fix_opfuncids((Node *) exprs);
+					exprsString = TextDatumGetCString(datum);
+					exprs = (List *) stringToNode(exprsString);
+					pfree(exprsString);
 
-				/*
-				 * Modify the copies we obtain from the relcache to have the
-				 * correct varno for the parent relation, so that they match
-				 * up correctly against qual clauses.
-				 */
-				if (varno != 1)
-					ChangeVarNodes((Node *) exprs, 1, varno, 0);
+					/*
+					 * Run the expressions through eval_const_expressions. This is
+					 * not just an optimization, but is necessary, because the
+					 * planner will be comparing them to similarly-processed qual
+					 * clauses, and may fail to detect valid matches without this.
+					 * We must not use canonicalize_qual, however, since these
+					 * aren't qual expressions.
+					 */
+					exprs = (List *) eval_const_expressions(NULL, (Node *) exprs);
+
+					/* May as well fix opfuncids too */
+					fix_opfuncids((Node *) exprs);
+
+					/*
+					 * Modify the copies we obtain from the relcache to have the
+					 * correct varno for the parent relation, so that they match
+					 * up correctly against qual clauses.
+					 */
+					if (varno != 1)
+						ChangeVarNodes((Node *) exprs, 1, varno, 0);
+				}
 			}
-		}
 
-		/* add one StatisticExtInfo for each kind built */
-		if (statext_is_kind_built(dtup, STATS_EXT_NDISTINCT))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
+			/* add one StatisticExtInfo for each kind built */
+			if (statext_is_kind_built(dtup, STATS_EXT_NDISTINCT))
+			{
+				StatisticExtInfo *info = makeNode(StatisticExtInfo);
 
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_NDISTINCT;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
+				info->statOid = statOid;
+				info->inherit = dataForm->stxdinherit;
+				info->rel = rel;
+				info->kind = STATS_EXT_NDISTINCT;
+				info->keys = bms_copy(keys);
+				info->exprs = exprs;
 
-			stainfos = lappend(stainfos, info);
-		}
+				stainfos = lappend(stainfos, info);
+			}
 
-		if (statext_is_kind_built(dtup, STATS_EXT_DEPENDENCIES))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
+			if (statext_is_kind_built(dtup, STATS_EXT_DEPENDENCIES))
+			{
+				StatisticExtInfo *info = makeNode(StatisticExtInfo);
 
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_DEPENDENCIES;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
+				info->statOid = statOid;
+				info->inherit = dataForm->stxdinherit;
+				info->rel = rel;
+				info->kind = STATS_EXT_DEPENDENCIES;
+				info->keys = bms_copy(keys);
+				info->exprs = exprs;
 
-			stainfos = lappend(stainfos, info);
-		}
+				stainfos = lappend(stainfos, info);
+			}
 
-		if (statext_is_kind_built(dtup, STATS_EXT_MCV))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
+			if (statext_is_kind_built(dtup, STATS_EXT_MCV))
+			{
+				StatisticExtInfo *info = makeNode(StatisticExtInfo);
 
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_MCV;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
+				info->statOid = statOid;
+				info->inherit = dataForm->stxdinherit;
+				info->rel = rel;
+				info->kind = STATS_EXT_MCV;
+				info->keys = bms_copy(keys);
+				info->exprs = exprs;
 
-			stainfos = lappend(stainfos, info);
-		}
+				stainfos = lappend(stainfos, info);
+			}
 
-		if (statext_is_kind_built(dtup, STATS_EXT_EXPRESSIONS))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
+			if (statext_is_kind_built(dtup, STATS_EXT_EXPRESSIONS))
+			{
+				StatisticExtInfo *info = makeNode(StatisticExtInfo);
 
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_EXPRESSIONS;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
+				info->statOid = statOid;
+				info->inherit = dataForm->stxdinherit;
+				info->rel = rel;
+				info->kind = STATS_EXT_EXPRESSIONS;
+				info->keys = bms_copy(keys);
+				info->exprs = exprs;
+
+				stainfos = lappend(stainfos, info);
+			}
 
-			stainfos = lappend(stainfos, info);
+			ReleaseSysCache(dtup);
 		}
 
 		ReleaseSysCache(htup);
-		ReleaseSysCache(dtup);
 		bms_free(keys);
 	}
 
diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
index 0659307b02..835f4bdf7a 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -618,14 +618,16 @@ dependency_is_fully_matched(MVDependency *dependency, Bitmapset *attnums)
  *		Load the functional dependencies for the indicated pg_statistic_ext tuple
  */
 MVDependencies *
-statext_dependencies_load(Oid mvoid)
+statext_dependencies_load(Oid mvoid, bool inh)
 {
 	MVDependencies *result;
 	bool		isnull;
 	Datum		deps;
 	HeapTuple	htup;
 
-	htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(mvoid));
+	htup = SearchSysCache2(STATEXTDATASTXOID,
+						   ObjectIdGetDatum(mvoid),
+						   BoolGetDatum(inh));
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", mvoid);
 
@@ -1410,6 +1412,7 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 	int			ndependencies;
 	int			i;
 	AttrNumber	attnum_offset;
+	RangeTblEntry *rte = root->simple_rte_array[rel->relid];
 
 	/* unique expressions */
 	Node	  **unique_exprs;
@@ -1603,6 +1606,10 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 		if (stat->kind != STATS_EXT_DEPENDENCIES)
 			continue;
 
+		/* skip statistics with mismatching stxdinherit value */
+		if (stat->inherit != rte->inh)
+			continue;
+
 		/*
 		 * Count matching attributes - we have to undo the attnum offsets. The
 		 * input attribute numbers are not offset (expressions are not
@@ -1649,7 +1656,7 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 		if (nmatched + nexprs < 2)
 			continue;
 
-		deps = statext_dependencies_load(stat->statOid);
+		deps = statext_dependencies_load(stat->statOid, rte->inh);
 
 		/*
 		 * The expressions may be represented by different attnums in the
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index b40ad9da2b..5a43d7e39a 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -77,7 +77,7 @@ typedef struct StatExtEntry
 static List *fetch_statentries_for_relation(Relation pg_statext, Oid relid);
 static VacAttrStats **lookup_var_attr_stats(Relation rel, Bitmapset *attrs, List *exprs,
 											int nvacatts, VacAttrStats **vacatts);
-static void statext_store(Oid statOid,
+static void statext_store(Oid statOid, bool inh,
 						  MVNDistinct *ndistinct, MVDependencies *dependencies,
 						  MCVList *mcv, Datum exprs, VacAttrStats **stats);
 static int	statext_compute_stattarget(int stattarget,
@@ -110,7 +110,7 @@ static StatsBuildData *make_build_data(Relation onerel, StatExtEntry *stat,
  * requested stats, and serializes them back into the catalog.
  */
 void
-BuildRelationExtStatistics(Relation onerel, double totalrows,
+BuildRelationExtStatistics(Relation onerel, bool inh, double totalrows,
 						   int numrows, HeapTuple *rows,
 						   int natts, VacAttrStats **vacattrstats)
 {
@@ -230,7 +230,8 @@ BuildRelationExtStatistics(Relation onerel, double totalrows,
 		}
 
 		/* store the statistics in the catalog */
-		statext_store(stat->statOid, ndistinct, dependencies, mcv, exprstats, stats);
+		statext_store(stat->statOid, inh,
+					  ndistinct, dependencies, mcv, exprstats, stats);
 
 		/* for reporting progress */
 		pgstat_progress_update_param(PROGRESS_ANALYZE_EXT_STATS_COMPUTED,
@@ -781,7 +782,7 @@ lookup_var_attr_stats(Relation rel, Bitmapset *attrs, List *exprs,
  * tuple.
  */
 static void
-statext_store(Oid statOid,
+statext_store(Oid statOid, bool inh,
 			  MVNDistinct *ndistinct, MVDependencies *dependencies,
 			  MCVList *mcv, Datum exprs, VacAttrStats **stats)
 {
@@ -790,14 +791,19 @@ statext_store(Oid statOid,
 				oldtup;
 	Datum		values[Natts_pg_statistic_ext_data];
 	bool		nulls[Natts_pg_statistic_ext_data];
-	bool		replaces[Natts_pg_statistic_ext_data];
 
 	pg_stextdata = table_open(StatisticExtDataRelationId, RowExclusiveLock);
 
 	memset(nulls, true, sizeof(nulls));
-	memset(replaces, false, sizeof(replaces));
 	memset(values, 0, sizeof(values));
 
+	/* basic info */
+	values[Anum_pg_statistic_ext_data_stxoid - 1] = ObjectIdGetDatum(statOid);
+	nulls[Anum_pg_statistic_ext_data_stxoid - 1] = false;
+
+	values[Anum_pg_statistic_ext_data_stxdinherit - 1] = BoolGetDatum(inh);
+	nulls[Anum_pg_statistic_ext_data_stxdinherit - 1] = false;
+
 	/*
 	 * Construct a new pg_statistic_ext_data tuple, replacing the calculated
 	 * stats.
@@ -830,25 +836,27 @@ statext_store(Oid statOid, values[Anum_pg_statistic_ext_data_stxdexpr - 1] = exprs; } - /* always replace the value (either by bytea or NULL) */ - replaces[Anum_pg_statistic_ext_data_stxdndistinct - 1] = true; - replaces[Anum_pg_statistic_ext_data_stxddependencies - 1] = true; - replaces[Anum_pg_statistic_ext_data_stxdmcv - 1] = true; - replaces[Anum_pg_statistic_ext_data_stxdexpr - 1] = true; - - /* there should already be a pg_statistic_ext_data tuple */ - oldtup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(statOid)); - if (!HeapTupleIsValid(oldtup)) + /* + * Delete the old tuple if it exists, and insert a new one. It's easier + * than trying to update or insert, based on various conditions. + * + * There should always be a pg_statistic_ext_data tuple for inh=false, + * but there may be none for inh=true yet. + */ + oldtup = SearchSysCache2(STATEXTDATASTXOID, + ObjectIdGetDatum(statOid), + BoolGetDatum(inh)); + if (HeapTupleIsValid(oldtup)) + { + CatalogTupleDelete(pg_stextdata, &(oldtup->t_self)); + ReleaseSysCache(oldtup); + } + else if (!inh) elog(ERROR, "cache lookup failed for statistics object %u", statOid); - /* replace it */ - stup = heap_modify_tuple(oldtup, - RelationGetDescr(pg_stextdata), - values, - nulls, - replaces); - ReleaseSysCache(oldtup); - CatalogTupleUpdate(pg_stextdata, &stup->t_self, stup); + /* form a new tuple */ + stup = heap_form_tuple(RelationGetDescr(pg_stextdata), values, nulls); + CatalogTupleInsert(pg_stextdata, stup); heap_freetuple(stup); @@ -1234,7 +1242,7 @@ stat_covers_expressions(StatisticExtInfo *stat, List *exprs, * further tiebreakers are needed. 
*/ StatisticExtInfo * -choose_best_statistics(List *stats, char requiredkind, +choose_best_statistics(List *stats, char requiredkind, bool inh, Bitmapset **clause_attnums, List **clause_exprs, int nclauses) { @@ -1256,6 +1264,10 @@ choose_best_statistics(List *stats, char requiredkind, if (info->kind != requiredkind) continue; + /* skip statistics with mismatching inheritance flag */ + if (info->inherit != inh) + continue; + /* * Collect attributes and expressions in remaining (unestimated) * clauses fully covered by this statistic object. @@ -1694,6 +1706,7 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli List **list_exprs; /* expressions matched to any statistic */ int listidx; Selectivity sel = (is_or) ? 0.0 : 1.0; + RangeTblEntry *rte = root->simple_rte_array[rel->relid]; /* check if there's any stats that might be useful for us. */ if (!has_stats_of_kind(rel->statlist, STATS_EXT_MCV)) @@ -1751,7 +1764,7 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli break; /* find the best suited statistics object for these attnums */ - stat = choose_best_statistics(rel->statlist, STATS_EXT_MCV, + stat = choose_best_statistics(rel->statlist, STATS_EXT_MCV, rte->inh, list_attnums, list_exprs, list_length(clauses)); @@ -1840,7 +1853,7 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli MCVList *mcv_list; /* Load the MCV list stored in the statistics object */ - mcv_list = statext_mcv_load(stat->statOid); + mcv_list = statext_mcv_load(stat->statOid, rte->inh); /* * Compute the selectivity of the ORed list of clauses covered by @@ -2411,7 +2424,7 @@ statext_expressions_load(Oid stxoid, int idx) HeapTupleData tmptup; HeapTuple tup; - htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(stxoid)); + htup = SearchSysCache2(STATEXTDATASTXOID, ObjectIdGetDatum(stxoid), BoolGetDatum(false)); if (!HeapTupleIsValid(htup)) elog(ERROR, "cache lookup failed for statistics object %u", 
stxoid); diff --git a/src/backend/statistics/mcv.c b/src/backend/statistics/mcv.c index 35b39ece07..173f746e41 100644 --- a/src/backend/statistics/mcv.c +++ b/src/backend/statistics/mcv.c @@ -559,12 +559,13 @@ build_column_frequencies(SortItem *groups, int ngroups, * Load the MCV list for the indicated pg_statistic_ext tuple. */ MCVList * -statext_mcv_load(Oid mvoid) +statext_mcv_load(Oid mvoid, bool inh) { MCVList *result; bool isnull; Datum mcvlist; - HeapTuple htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(mvoid)); + HeapTuple htup = SearchSysCache2(STATEXTDATASTXOID, + ObjectIdGetDatum(mvoid), BoolGetDatum(inh)); if (!HeapTupleIsValid(htup)) elog(ERROR, "cache lookup failed for statistics object %u", mvoid); @@ -2040,11 +2041,13 @@ mcv_clauselist_selectivity(PlannerInfo *root, StatisticExtInfo *stat, MCVList *mcv; Selectivity s = 0.0; + RangeTblEntry *rte = root->simple_rte_array[rel->relid]; + /* match/mismatch bitmap for each MCV item */ bool *matches = NULL; /* load the MCV list stored in the statistics object */ - mcv = statext_mcv_load(stat->statOid); + mcv = statext_mcv_load(stat->statOid, rte->inh); /* build a match bitmap for the clauses */ matches = mcv_get_match_bitmap(root, clauses, stat->keys, stat->exprs, diff --git a/src/backend/statistics/mvdistinct.c b/src/backend/statistics/mvdistinct.c index 4481312d61..ab1f10d6c0 100644 --- a/src/backend/statistics/mvdistinct.c +++ b/src/backend/statistics/mvdistinct.c @@ -146,14 +146,15 @@ statext_ndistinct_build(double totalrows, StatsBuildData *data) * Load the ndistinct value for the indicated pg_statistic_ext tuple */ MVNDistinct * -statext_ndistinct_load(Oid mvoid) +statext_ndistinct_load(Oid mvoid, bool inh) { MVNDistinct *result; bool isnull; Datum ndist; HeapTuple htup; - htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(mvoid)); + htup = SearchSysCache2(STATEXTDATASTXOID, + ObjectIdGetDatum(mvoid), BoolGetDatum(inh)); if (!HeapTupleIsValid(htup)) elog(ERROR, "cache lookup failed 
for statistics object %u", mvoid); diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c index d782605953..ede393115d 100644 --- a/src/backend/utils/adt/selfuncs.c +++ b/src/backend/utils/adt/selfuncs.c @@ -4008,7 +4008,7 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel, Assert(nmatches_vars + nmatches_exprs > 1); - stats = statext_ndistinct_load(statOid); + stats = statext_ndistinct_load(statOid, rte->inh); /* * If we have a match, search it for the specific item that matches (there diff --git a/src/backend/utils/cache/syscache.c b/src/backend/utils/cache/syscache.c index d6cb78dea8..eabd74952f 100644 --- a/src/backend/utils/cache/syscache.c +++ b/src/backend/utils/cache/syscache.c @@ -740,11 +740,11 @@ static const struct cachedesc cacheinfo[] = { 32 }, {StatisticExtDataRelationId, /* STATEXTDATASTXOID */ - StatisticExtDataStxoidIndexId, - 1, + StatisticExtDataStxoidInhIndexId, + 2, { Anum_pg_statistic_ext_data_stxoid, - 0, + Anum_pg_statistic_ext_data_stxdinherit, 0, 0 }, diff --git a/src/include/catalog/pg_statistic_ext_data.h b/src/include/catalog/pg_statistic_ext_data.h index 7b73b790d2..8ffd8b68cd 100644 --- a/src/include/catalog/pg_statistic_ext_data.h +++ b/src/include/catalog/pg_statistic_ext_data.h @@ -32,6 +32,7 @@ CATALOG(pg_statistic_ext_data,3429,StatisticExtDataRelationId) { Oid stxoid BKI_LOOKUP(pg_statistic_ext); /* statistics object * this data is for */ + bool stxdinherit; /* true if inheritance children are included */ #ifdef CATALOG_VARLEN /* variable-length fields start here */ @@ -53,6 +54,7 @@ typedef FormData_pg_statistic_ext_data * Form_pg_statistic_ext_data; DECLARE_TOAST(pg_statistic_ext_data, 3430, 3431); -DECLARE_UNIQUE_INDEX_PKEY(pg_statistic_ext_data_stxoid_index, 3433, StatisticExtDataStxoidIndexId, on pg_statistic_ext_data using btree(stxoid oid_ops)); +DECLARE_UNIQUE_INDEX_PKEY(pg_statistic_ext_data_stxoid_inh_index, 3433, StatisticExtDataStxoidInhIndexId, on pg_statistic_ext_data 
using btree(stxoid oid_ops, stxdinherit bool_ops)); + #endif /* PG_STATISTIC_EXT_DATA_H */ diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h index 2a53a6e344..884bda7232 100644 --- a/src/include/nodes/pathnodes.h +++ b/src/include/nodes/pathnodes.h @@ -934,6 +934,7 @@ typedef struct StatisticExtInfo NodeTag type; Oid statOid; /* OID of the statistics row */ + bool inherit; /* includes child relations */ RelOptInfo *rel; /* back-link to statistic's table */ char kind; /* statistics kind of this entry */ Bitmapset *keys; /* attnums of the columns covered */ diff --git a/src/include/statistics/statistics.h b/src/include/statistics/statistics.h index 326cf26fea..02ee41b9f3 100644 --- a/src/include/statistics/statistics.h +++ b/src/include/statistics/statistics.h @@ -94,11 +94,11 @@ typedef struct MCVList MCVItem items[FLEXIBLE_ARRAY_MEMBER]; /* array of MCV items */ } MCVList; -extern MVNDistinct *statext_ndistinct_load(Oid mvoid); -extern MVDependencies *statext_dependencies_load(Oid mvoid); -extern MCVList *statext_mcv_load(Oid mvoid); +extern MVNDistinct *statext_ndistinct_load(Oid mvoid, bool inh); +extern MVDependencies *statext_dependencies_load(Oid mvoid, bool inh); +extern MCVList *statext_mcv_load(Oid mvoid, bool inh); -extern void BuildRelationExtStatistics(Relation onerel, double totalrows, +extern void BuildRelationExtStatistics(Relation onerel, bool inh, double totalrows, int numrows, HeapTuple *rows, int natts, VacAttrStats **vacattrstats); extern int ComputeExtStatisticsRows(Relation onerel, @@ -121,6 +121,7 @@ extern Selectivity statext_clauselist_selectivity(PlannerInfo *root, bool is_or); extern bool has_stats_of_kind(List *stats, char requiredkind); extern StatisticExtInfo *choose_best_statistics(List *stats, char requiredkind, + bool inh, Bitmapset **clause_attnums, List **clause_exprs, int nclauses); diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out index 2fa00a3c29..8ab5187ccb 100644 
--- a/src/test/regress/expected/rules.out +++ b/src/test/regress/expected/rules.out @@ -2425,6 +2425,7 @@ pg_stats_ext| SELECT cn.nspname AS schemaname, JOIN pg_attribute a ON (((a.attrelid = s.stxrelid) AND (a.attnum = k.k))))) AS attnames, pg_get_statisticsobjdef_expressions(s.oid) AS exprs, s.stxkind AS kinds, + sd.stxdinherit AS inherited, sd.stxdndistinct AS n_distinct, sd.stxddependencies AS dependencies, m.most_common_vals, -- 2.17.0
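For readers following the catalog change in this patch, here is a minimal, self-contained sketch (plain C, no PostgreSQL headers) of a store keyed on (stxoid, stxdinherit) that replaces a row by delete-then-insert, roughly as the patched statext_store() does. ExtDataRow, ext_data_find() and ext_data_store() are hypothetical names used only for illustration; the real code goes through SearchSysCache2(), CatalogTupleDelete() and CatalogTupleInsert().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one pg_statistic_ext_data row. */
typedef struct ExtDataRow
{
	unsigned int stxoid;		/* statistics object this row belongs to */
	bool		stxdinherit;	/* true if child tables were included */
	double		ndistinct;		/* placeholder for the serialized stats */
	bool		used;			/* slot is occupied */
} ExtDataRow;

#define MAX_ROWS 8
static ExtDataRow rows[MAX_ROWS];

/* Look up a row by the two-column key; returns NULL if there is none. */
ExtDataRow *
ext_data_find(unsigned int stxoid, bool inh)
{
	for (size_t i = 0; i < MAX_ROWS; i++)
	{
		if (rows[i].used &&
			rows[i].stxoid == stxoid &&
			rows[i].stxdinherit == inh)
			return &rows[i];
	}
	return NULL;
}

/*
 * Delete the old row for this key (if any), then insert a fresh one.
 * Returns false only when the toy table has no free slot.
 */
bool
ext_data_store(unsigned int stxoid, bool inh, double ndistinct)
{
	ExtDataRow *old = ext_data_find(stxoid, inh);

	if (old != NULL)
		old->used = false;		/* the "delete" step */

	for (size_t i = 0; i < MAX_ROWS; i++)
	{
		if (!rows[i].used)
		{
			rows[i].stxoid = stxoid;
			rows[i].stxdinherit = inh;
			rows[i].ndistinct = ndistinct;
			rows[i].used = true;
			return true;
		}
	}
	return false;
}
```

Storing twice under the same (stxoid, inh) key leaves a single row holding the newer value, while the inh=false and inh=true rows for one statistics object stay independent of each other.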
>From 6b3c80fc37d0c034cc2072921da756728e3e3c3e Mon Sep 17 00:00:00 2001 From: Justin Pryzby <pryz...@telsasoft.com> Date: Sat, 25 Sep 2021 18:20:03 -0500 Subject: [PATCH 4/5] f! check inh statext_expressions_load examine_variable estimate_multivariate_ndistinct TODO: pg_stats_ext_exprs needs to expose inh flag --- src/backend/statistics/dependencies.c | 5 ----- src/backend/statistics/extended_stats.c | 9 ++------- src/backend/utils/adt/selfuncs.c | 27 ++++++++----------------- src/include/statistics/statistics.h | 2 +- src/test/regress/expected/stats_ext.out | 11 ++++++++-- src/test/regress/sql/stats_ext.sql | 4 +++- 6 files changed, 23 insertions(+), 35 deletions(-) diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c index 835f4bdf7a..02cf0efc66 100644 --- a/src/backend/statistics/dependencies.c +++ b/src/backend/statistics/dependencies.c @@ -1596,11 +1596,6 @@ dependencies_clauselist_selectivity(PlannerInfo *root, int nexprs; int k; MVDependencies *deps; - RangeTblEntry *rte = root->simple_rte_array[rel->relid]; - - /* If it's an inheritence tree, skip statistics (which do not include child stats) */ - if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE) - break; /* skip statistics that are not of the correct type */ if (stat->kind != STATS_EXT_DEPENDENCIES) diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c index 5a43d7e39a..231b153c15 100644 --- a/src/backend/statistics/extended_stats.c +++ b/src/backend/statistics/extended_stats.c @@ -1706,7 +1706,6 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli List **list_exprs; /* expressions matched to any statistic */ int listidx; Selectivity sel = (is_or) ? 0.0 : 1.0; - RangeTblEntry *rte = root->simple_rte_array[rel->relid]; /* check if there's any stats that might be useful for us. 
*/ if (!has_stats_of_kind(rel->statlist, STATS_EXT_MCV)) @@ -1759,10 +1758,6 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli Bitmapset *simple_clauses; RangeTblEntry *rte = root->simple_rte_array[rel->relid]; - /* If it's an inheritence tree, skip statistics (which do not include child stats) */ - if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE) - break; - /* find the best suited statistics object for these attnums */ stat = choose_best_statistics(rel->statlist, STATS_EXT_MCV, rte->inh, list_attnums, list_exprs, @@ -2414,7 +2409,7 @@ serialize_expr_stats(AnlExprData *exprdata, int nexprs) * identified by the supplied index. */ HeapTuple -statext_expressions_load(Oid stxoid, int idx) +statext_expressions_load(Oid stxoid, bool inh, int idx) { bool isnull; Datum value; @@ -2424,7 +2419,7 @@ statext_expressions_load(Oid stxoid, int idx) HeapTupleData tmptup; HeapTuple tup; - htup = SearchSysCache2(STATEXTDATASTXOID, ObjectIdGetDatum(stxoid), BoolGetDatum(false)); + htup = SearchSysCache2(STATEXTDATASTXOID, ObjectIdGetDatum(stxoid), BoolGetDatum(inh)); if (!HeapTupleIsValid(htup)) elog(ERROR, "cache lookup failed for statistics object %u", stxoid); diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c index ede393115d..8a6bc29636 100644 --- a/src/backend/utils/adt/selfuncs.c +++ b/src/backend/utils/adt/selfuncs.c @@ -3915,10 +3915,6 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel, StatisticExtInfo *matched_info = NULL; RangeTblEntry *rte = root->simple_rte_array[rel->relid]; - /* If it's an inheritence tree, skip statistics (which do not include child stats) */ - if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE) - return false; - /* bail out immediately if the table has no extended statistics */ if (!rel->statlist) return false; @@ -5237,13 +5233,6 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid, if (vardata->statsTuple) break; - /* If it's an inheritence tree, skip
statistics (which do not include child stats) */ - { - RangeTblEntry *rte = planner_rt_fetch(onerel->relid, root); - if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE) - break; - } - /* skip stats without per-expression stats */ if (info->kind != STATS_EXT_EXPRESSIONS) continue; @@ -5262,22 +5251,22 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid, /* found a match, see if we can extract pg_statistic row */ if (equal(node, expr)) { - HeapTuple t = statext_expressions_load(info->statOid, pos); - - /* Get statistic object's table for permission check */ - RangeTblEntry *rte; + RangeTblEntry *rte = planner_rt_fetch(onerel->relid, root); Oid userid; + bool inh; - vardata->statsTuple = t; + Assert(rte->rtekind == RTE_RELATION); /* * XXX Not sure if we should cache the tuple somewhere. * Now we just create a new copy every time. */ - vardata->freefunc = ReleaseDummy; + inh = root->append_rel_array == NULL ? false : + root->append_rel_array[onerel->relid]->parent_relid != 0; + vardata->statsTuple = + statext_expressions_load(info->statOid, inh, pos); - rte = planner_rt_fetch(onerel->relid, root); - Assert(rte->rtekind == RTE_RELATION); + vardata->freefunc = ReleaseDummy; /* * Use checkAsUser if it's set, in case we're accessing diff --git a/src/include/statistics/statistics.h b/src/include/statistics/statistics.h index 02ee41b9f3..3868e43f8a 100644 --- a/src/include/statistics/statistics.h +++ b/src/include/statistics/statistics.h @@ -125,6 +125,6 @@ extern StatisticExtInfo *choose_best_statistics(List *stats, char requiredkind, Bitmapset **clause_attnums, List **clause_exprs, int nclauses); -extern HeapTuple statext_expressions_load(Oid stxoid, int idx); +extern HeapTuple statext_expressions_load(Oid stxoid, bool inh, int idx); #endif /* STATISTICS_H */ diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out index 67234b9fc2..35edc6a361 100644 --- a/src/test/regress/expected/stats_ext.out +++ 
b/src/test/regress/expected/stats_ext.out @@ -176,7 +176,6 @@ CREATE STATISTICS ab1_a_b_stats ON a, b FROM ab1; ANALYZE ab1; DROP TABLE ab1 CASCADE; NOTICE: drop cascades to table ab1c --- Ensure non-inherited stats are not applied to inherited query CREATE TABLE stxdinh(i int, j int); CREATE TABLE stxdinh1() INHERITS(stxdinh); INSERT INTO stxdinh SELECT a, a/10 FROM generate_series(1,9)a; @@ -191,11 +190,19 @@ SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2'); CREATE STATISTICS stxdinh ON i,j FROM stxdinh; VACUUM ANALYZE stxdinh, stxdinh1; +-- Ensure non-inherited stats are not applied to inherited query -- Since the stats object does not include inherited stats, it should not affect the estimates SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2'); estimated | actual -----------+-------- - 1000 | 1008 + 1008 | 1008 +(1 row) + +-- Ensure correct (non-inherited) stats are applied to non-inherited query +SELECT * FROM check_estimated_rows('SELECT * FROM ONLY stxdinh GROUP BY 1,2'); + estimated | actual +-----------+-------- + 9 | 9 (1 row) DROP TABLE stxdinh, stxdinh1; diff --git a/src/test/regress/sql/stats_ext.sql b/src/test/regress/sql/stats_ext.sql index 2371043ca1..8490da9558 100644 --- a/src/test/regress/sql/stats_ext.sql +++ b/src/test/regress/sql/stats_ext.sql @@ -112,7 +112,6 @@ CREATE STATISTICS ab1_a_b_stats ON a, b FROM ab1; ANALYZE ab1; DROP TABLE ab1 CASCADE; --- Ensure non-inherited stats are not applied to inherited query CREATE TABLE stxdinh(i int, j int); CREATE TABLE stxdinh1() INHERITS(stxdinh); INSERT INTO stxdinh SELECT a, a/10 FROM generate_series(1,9)a; @@ -122,8 +121,11 @@ VACUUM ANALYZE stxdinh, stxdinh1; SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2'); CREATE STATISTICS stxdinh ON i,j FROM stxdinh; VACUUM ANALYZE stxdinh, stxdinh1; +-- Ensure non-inherited stats are not applied to inherited query -- Since the stats object does not include inherited stats, it should not
affect the estimates SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2'); +-- Ensure correct (non-inherited) stats are applied to non-inherited query +SELECT * FROM check_estimated_rows('SELECT * FROM ONLY stxdinh GROUP BY 1,2'); DROP TABLE stxdinh, stxdinh1; -- Ensure inherited stats ARE applied to inherited query in partitioned table -- 2.17.0
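As a rough illustration of what the planner-side checks in these patches amount to, the sketch below filters candidate statistics objects on both the kind and the inherit flag, in the spirit of the "info->inherit != inh" test added to choose_best_statistics(). ToyStatInfo and toy_choose_stat() are hypothetical, and the "most keys wins" ranking is only a placeholder for the real scoring, which looks at how many clauses each object covers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down analogue of StatisticExtInfo. */
typedef struct ToyStatInfo
{
	unsigned int statOid;	/* OID of the statistics object */
	char		kind;		/* e.g. 'm' for MCV, 'f' for dependencies */
	bool		inherit;	/* built with child relations included? */
	int			nkeys;		/* number of columns covered */
} ToyStatInfo;

/*
 * Return the candidate with the most keys among those whose kind and
 * inherit flag both match the request, or NULL if nothing matches.
 */
const ToyStatInfo *
toy_choose_stat(const ToyStatInfo *stats, size_t nstats,
				char requiredkind, bool inh)
{
	const ToyStatInfo *best = NULL;

	for (size_t i = 0; i < nstats; i++)
	{
		/* skip statistics that are not of the correct type */
		if (stats[i].kind != requiredkind)
			continue;

		/* skip statistics with mismatching inheritance flag */
		if (stats[i].inherit != inh)
			continue;

		if (best == NULL || stats[i].nkeys > best->nkeys)
			best = &stats[i];
	}
	return best;
}
```

With this shape, a query on "stxdinh*" (inh = true) can never pick up an object built FROM ONLY, and a FROM ONLY query can never pick up the inherited one; when no object matches, the caller falls back to per-column estimates.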
>From 33916b89dcb92acdc66b8c4febfe87d43247e5dd Mon Sep 17 00:00:00 2001 From: Justin Pryzby <pryz...@telsasoft.com> Date: Sat, 25 Sep 2021 18:58:33 -0500 Subject: [PATCH 5/5] Refactor parent ACL check selfuncs.c is 8k lines long, and this makes it 30 LOC shorter. --- src/backend/utils/adt/selfuncs.c | 140 ++++++++++++------------------- 1 file changed, 52 insertions(+), 88 deletions(-) diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c index 8a6bc29636..949726a861 100644 --- a/src/backend/utils/adt/selfuncs.c +++ b/src/backend/utils/adt/selfuncs.c @@ -187,6 +187,8 @@ static char *convert_string_datum(Datum value, Oid typid, Oid collid, bool *failure); static double convert_timevalue_to_scalar(Datum value, Oid typid, bool *failure); +static void recheck_parent_acl(PlannerInfo *root, VariableStatData *vardata, + Oid relid); static void examine_simple_variable(PlannerInfo *root, Var *var, VariableStatData *vardata); static bool get_variable_range(PlannerInfo *root, VariableStatData *vardata, @@ -5152,51 +5154,7 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid, (pg_class_aclcheck(rte->relid, userid, ACL_SELECT) == ACLCHECK_OK); - /* - * If the user doesn't have permissions to - * access an inheritance child relation, check - * the permissions of the table actually - * mentioned in the query, since most likely - * the user does have that permission. Note - * that whole-table select privilege on the - * parent doesn't quite guarantee that the - * user could read all columns of the child. - * But in practice it's unlikely that any - * interesting security violation could result - * from allowing access to the expression - * index's stats, so we allow it anyway. See - * similar code in examine_simple_variable() - * for additional comments. 
- */ - if (!vardata->acl_ok && - root->append_rel_array != NULL) - { - AppendRelInfo *appinfo; - Index varno = index->rel->relid; - - appinfo = root->append_rel_array[varno]; - while (appinfo && - planner_rt_fetch(appinfo->parent_relid, - root)->rtekind == RTE_RELATION) - { - varno = appinfo->parent_relid; - appinfo = root->append_rel_array[varno]; - } - if (varno != index->rel->relid) - { - /* Repeat access check on this rel */ - rte = planner_rt_fetch(varno, root); - Assert(rte->rtekind == RTE_RELATION); - - userid = rte->checkAsUser ? rte->checkAsUser : GetUserId(); - - vardata->acl_ok = - rte->securityQuals == NIL && - (pg_class_aclcheck(rte->relid, - userid, - ACL_SELECT) == ACLCHECK_OK); - } - } + recheck_parent_acl(root, vardata, index->rel->relid); } else { @@ -5287,49 +5245,7 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid, (pg_class_aclcheck(rte->relid, userid, ACL_SELECT) == ACLCHECK_OK); - /* - * If the user doesn't have permissions to access an - * inheritance child relation, check the permissions of - * the table actually mentioned in the query, since most - * likely the user does have that permission. Note that - * whole-table select privilege on the parent doesn't - * quite guarantee that the user could read all columns of - * the child. But in practice it's unlikely that any - * interesting security violation could result from - * allowing access to the expression stats, so we allow it - * anyway. See similar code in examine_simple_variable() - * for additional comments. 
- */ - if (!vardata->acl_ok && - root->append_rel_array != NULL) - { - AppendRelInfo *appinfo; - Index varno = onerel->relid; - - appinfo = root->append_rel_array[varno]; - while (appinfo && - planner_rt_fetch(appinfo->parent_relid, - root)->rtekind == RTE_RELATION) - { - varno = appinfo->parent_relid; - appinfo = root->append_rel_array[varno]; - } - if (varno != onerel->relid) - { - /* Repeat access check on this rel */ - rte = planner_rt_fetch(varno, root); - Assert(rte->rtekind == RTE_RELATION); - - userid = rte->checkAsUser ? rte->checkAsUser : GetUserId(); - - vardata->acl_ok = - rte->securityQuals == NIL && - (pg_class_aclcheck(rte->relid, - userid, - ACL_SELECT) == ACLCHECK_OK); - } - } - + recheck_parent_acl(root, vardata, onerel->relid); break; } @@ -5339,6 +5255,54 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid, } } +/* + * If the user doesn't have permissions to access an inheritance child + * relation, check the permissions of the table actually mentioned in the + * query, since most likely the user does have that permission. Note that + * whole-table select privilege on the parent doesn't quite guarantee that the + * user could read all columns of the child. But in practice it's unlikely + * that any interesting security violation could result from allowing access to + * the expression stats, so we allow it anyway. See similar code in + * examine_simple_variable() for additional comments. 
+ */ +static void +recheck_parent_acl(PlannerInfo *root, VariableStatData *vardata, Oid relid) +{ + RangeTblEntry *rte; + Oid userid; + + if (!vardata->acl_ok && + root->append_rel_array != NULL) + { + AppendRelInfo *appinfo; + Index varno = relid; + + appinfo = root->append_rel_array[varno]; + while (appinfo && + planner_rt_fetch(appinfo->parent_relid, + root)->rtekind == RTE_RELATION) + { + varno = appinfo->parent_relid; + appinfo = root->append_rel_array[varno]; + } + + if (varno != relid) + { + /* Repeat access check on this rel */ + rte = planner_rt_fetch(varno, root); + Assert(rte->rtekind == RTE_RELATION); + + userid = rte->checkAsUser ? rte->checkAsUser : GetUserId(); + + vardata->acl_ok = + rte->securityQuals == NIL && + (pg_class_aclcheck(rte->relid, + userid, + ACL_SELECT) == ACLCHECK_OK); + } + } +} + /* * examine_simple_variable * Handle a simple Var for examine_variable -- 2.17.0
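The control flow of recheck_parent_acl() may be easier to review in isolation. The following is a self-contained sketch under simplifying assumptions: ToyRoot and its arrays are hypothetical stand-ins for root->append_rel_array and the range table, parent index 0 means "no parent", and the securityQuals and checkAsUser handling of the real function is left out.

```c
#include <stdbool.h>

#define NRELS 6

/* Hypothetical stand-in for the planner state the real function reads. */
typedef struct ToyRoot
{
	unsigned int parent_of[NRELS];	/* parent range-table index, 0 = none */
	bool		is_relation[NRELS];	/* analogue of rtekind == RTE_RELATION */
	bool		can_select[NRELS];	/* analogue of the pg_class_aclcheck result */
} ToyRoot;

/* Climb from relid to the topmost parent that is a plain relation. */
unsigned int
toy_topmost_parent(const ToyRoot *root, unsigned int relid)
{
	unsigned int varno = relid;

	while (root->parent_of[varno] != 0 &&
		   root->is_relation[root->parent_of[varno]])
		varno = root->parent_of[varno];

	return varno;
}

/*
 * If access to the child was denied, repeat the check against the table
 * actually mentioned in the query; otherwise keep the earlier result.
 */
bool
toy_recheck_parent_acl(const ToyRoot *root, bool acl_ok, unsigned int relid)
{
	if (!acl_ok)
	{
		unsigned int top = toy_topmost_parent(root, relid);

		if (top != relid)
			acl_ok = root->can_select[top];
	}
	return acl_ok;
}
```

The sketch returns the new flag instead of writing to vardata->acl_ok, which keeps it testable on its own; the patch itself updates vardata in place.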