2018-08-01 16:59 GMT+02:00 David Rowley <david.row...@2ndquadrant.com>:
> On 2 August 2018 at 02:48, Guillaume Lelarge <guilla...@lelarge.info> wrote:
> > EXPLAIN (ANALYZE) SELECT DISTINCT * FROM gleu2;
> >
> >                           QUERY PLAN
> > ---------------------------------------------------------------------------------------------------------------------
> >  Unique  (cost=12005.97..12049.20 rows=1 width=1430) (actual time=20055.294..20323.348 rows=1 loops=1)
> >    ->  Sort  (cost=12005.97..12006.30 rows=132 width=1430) (actual time=20055.290..20105.738 rows=60000 loops=1)
> >          Sort Key: (... 130 columns ...)
> >          Sort Method: external sort  Disk: 84464kB
> >          ->  Seq Scan on gleu2  (cost=0.00..12001.32 rows=132 width=1430) (actual time=0.109..114.142 rows=60000 loops=1)
> >  Planning time: 10.012 ms
> >  Execution time: 20337.854 ms
> > (7 rows)
> >
> > That looks quite good. The total cost is 12049, so I expect this plan to have the smaller cost, as it's the chosen plan. Now I'm disabling Sort, and here is what I get:
> >
> > SET enable_sort TO off;
> > EXPLAIN (ANALYZE) SELECT DISTINCT * FROM gleu2;
> >
> >                           QUERY PLAN
> > ---------------------------------------------------------------------------------------------------------------------
> >  HashAggregate  (cost=12044.22..12044.23 rows=1 width=1430) (actual time=508.342..508.343 rows=1 loops=1)
> >    Group Key: (... 130 columns ...)
> >    ->  Seq Scan on gleu2  (cost=0.00..12001.32 rows=132 width=1430) (actual time=0.036..57.088 rows=60000 loops=1)
> >  Planning time: 1.335 ms
> >  Execution time: 508.529 ms
> > (5 rows)
>
> When the planner adds a new path, it compares the cost not exactly, but
> with a 'fuzz factor'. It's very likely that the hashagg path did not
> make it as it was not fuzzily any cheaper than the unique path. By
> default, this fuzz factor is 1%.

That may definitely be it. There's not much of a difference in the total costs.

> It seems in your case the costs don't quite match reality, which is
> quite likely due to the poor row estimates on "gleu2". Has that table
> been analyzed recently? Or is there some reason that auto-vacuum is
> not getting to it?

This is a small test case of a much bigger query joining a large number of tables, materialized views, and views (calling functions). The actual plan contains 84 nodes (32 scans, lots of joins, and a few other nodes). The mis-estimate here reproduces what the big query gives me.

> There's a bit more reading on what I'm talking about in
> https://github.com/postgres/postgres/blob/master/src/backend/optimizer/util/pathnode.c#L141

I'm gonna read that. Thank you.

-- 
Guillaume.
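P.S. For anyone following the thread: the 1% fuzz factor David describes can be checked numerically against the two plans above. Below is a rough Python sketch of the idea, not the actual PostgreSQL C code from pathnode.c; the function name and structure are made up for illustration, only the constant (1% by default) and the plan costs come from the discussion:

```python
# Illustrative sketch (NOT PostgreSQL source) of the "fuzz factor"
# comparison described for add_path()/compare_path_costs_fuzzily():
# a new path only displaces an existing one if it is cheaper by
# more than the fuzz factor (1.01, i.e. 1%, by default).

FUZZ_FACTOR = 1.01  # 1% fuzziness, the default mentioned in the thread

def fuzzily_cheaper(new_cost, old_cost, fuzz_factor=FUZZ_FACTOR):
    """Return True only if new_cost beats old_cost by more than the fuzz."""
    return new_cost * fuzz_factor < old_cost

# Costs from the two plans above:
# HashAggregate total cost = 12044.23, Unique total cost = 12049.20.
# 12044.23 * 1.01 = 12164.67, which is NOT below 12049.20, so the
# hashagg path is not fuzzily cheaper and the sort/unique path can win.
print(fuzzily_cheaper(12044.23, 12049.20))  # prints False
```

This matches the behaviour seen in the plans: a 0.04% cost difference is well inside the 1% fuzz, so the planner treats the two paths as equally cheap on cost alone.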