On 9/25/21 9:53 PM, Justin Pryzby wrote:
On Sat, Sep 25, 2021 at 09:27:10PM +0200, Tomas Vondra wrote:
On 9/23/21 11:26 PM, Justin Pryzby wrote:
extended stats objects are allowed on partitioned tables since v10.
https://www.postgresql.org/message-id/flat/CAKJS1f-BmGo410bh5RSPZUvOO0LhmHL2NYmdrC_Jm8pk_FfyCA%40mail.gmail.com
8c5cdb7f4f6e1d6a6104cb58ce4f23453891651b

But since 859b3003de they're not populated - pg_statistic_ext(_data) is empty.
This was the consequence of a commit to avoid an error I reported with stats on
inheritence parents (not partitioned tables).

preceding 859b3003de, stats on the parent table *did* improve the estimate,
so this part of the commit message seems to have been wrong?
|commit 859b3003de87645b62ee07ef245d6c1f1cd0cedb
|    Don't build extended statistics on inheritance trees
...
|    Moreover, the current selectivity estimation code only works with 
individual
|    relations, so building statistics on inheritance trees would be pointless
|    anyway.

|CREATE TABLE p (i int, a int, b int) PARTITION BY RANGE (i);
|CREATE TABLE pd PARTITION OF p FOR VALUES FROM (1)TO(100);
|TRUNCATE p; INSERT INTO p SELECT 1, a/100, a/100 FROM generate_series(1,999)a;
|CREATE STATISTICS pp ON (a),(b) FROM p;
|VACUUM ANALYZE p;
|SELECT * FROM pg_statistic_ext WHERE stxrelid ='p'::regclass;

|postgres=# begin; DROP STATISTICS pp; explain analyze SELECT a,b FROM p GROUP 
BY 1,2; abort;
| HashAggregate  (cost=20.98..21.98 rows=100 width=8) (actual time=1.088..1.093 
rows=10 loops=1)

|postgres=# explain analyze SELECT a,b FROM p GROUP BY 1,2;
| HashAggregate  (cost=20.98..21.09 rows=10 width=8) (actual time=1.082..1.086 
rows=10 loops=1)

So I think this is a regression, and extended stats should be populated for
partitioned tables - I had actually done that for some parent tables and hadn't
noticed that the stats objects no longer do anything.
...
Agreed, that seems like a regression, but I don't see how to fix that
without having the extra flag in the catalog. Otherwise we can store just
one version for each statistics object :-(

Do you think it's possible to backpatch a fix to handle partitioned tables
specifically ?

The "tuple already updated" error which I reported and which was fixed by
859b3003 involved inheritence children.  Since partitioned tables have no data
themselves, the !inh check could be relaxed.  It's not totally clear to me if
the correct statistics would be used in that case.  I suppose the wrong
(inherited) stats would be wrongly applied affect queries FROM ONLY a
partitioned table, which seems pointless to write and also hard for the
estimates to be far off :)


Hmmm, maybe. To prevent the "tuple concurrently updated" we must ensure we never build stats with and without inheritance at the same time (for the same rel). The 859b3003de ensures that by only building extended stats in the (!inh) case, but we might tweak that based on relkind. See the attached patch. But I wonder if there are cases that might be hurt by this - that'd be a regression too, of course.

Attached is a PoC that I quickly bashed together today. It's pretty raw, but
it passed "make check" and I think it does most of the things right. Can you
try if this fixes the estimates with partitioned tables?

I think pg_stats_ext_exprs also needs to expose the inherited flag.


Yeah, I only did the bare minimum to get the PoC working. I'm sure there are various other loose ends.


regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 8bfb2ad958..299f4893b8 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -548,6 +548,7 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 	{
 		MemoryContext col_context,
 					old_context;
+		bool		build_ext_stats;
 
 		pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
 									 PROGRESS_ANALYZE_PHASE_COMPUTE_STATS);
@@ -611,13 +612,15 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 							thisdata->attr_cnt, thisdata->vacattrstats);
 		}
 
+		build_ext_stats = (onerel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) ? inh : (!inh);
+
 		/*
 		 * Build extended statistics (if there are any).
 		 *
 		 * For now we only build extended statistics on individual relations,
 		 * not for relations representing inheritance trees.
 		 */
-		if (!inh)
+		if (build_ext_stats)
 			BuildRelationExtStatistics(onerel, totalrows, numrows, rows,
 									   attr_cnt, vacattrstats);
 	}

Reply via email to