On Nov 7, 2024, at 9:27 PM, Andrei Lepikhov <lepi...@gmail.com> wrote:
> Postgres didn't want Materialize in this example because of the low 
> estimation on its outer subquery. AFAIC, by increasing the *_page_cost's 
> value, you added extra weight to the inner subquery and shifted the decision 
> to use materialisation.

Interesting, except I decreased the random_page_cost. Just clarifying.

> I see huge underestimation in the simple scan:
> 
> Bitmap Heap Scan on metainfo b_1
>  (cost=23.96..35.77 rows=3 width=38)
>  (actual time=1.225..4.206 rows=1025 loops=1)
> 
> It may be caused by some functional dependency in its filter:
> 
> ((relation = 'located'::text) AND (type = 'document'::text))
> 
> You can create extended statistics on the columns 'relation' and 'type'. 
> These statistics can reduce estimation errors and enable the optimiser to 
> find a better plan without changing the cost balance.

OK, this is the first I'm learning about extended statistics...

I'm looking at https://www.postgresql.org/docs/15/sql-createstatistics.html
and https://www.postgresql.org/docs/15/planner-stats.html#PLANNER-STATS-EXTENDED

What kind of extended statistics do you suggest for this? ndistinct, 
dependencies, or mcv?

CREATE STATISTICS tablename_stats (<statistics type?>) ON relation, type FROM 
tablename;
ANALYZE tablename;

Thanks,
Ed



Reply via email to