Robert Haas <robertmh...@gmail.com> writes: > The alternative worth considering is presumably something like:
> GroupAggregate > -> Index Only Scan on grue_size > Scanning an entire index in order is pretty expensive, but it seems > possible that this could be faster than the Seq Scan, especially on a > table with other wide columns, because then the index might be a lot > smaller than the table. Even if the index traversal generates some > random I/O, if it's sufficiently smaller than the table you will still > come out ahead. I'm not positive that the planner will actually > consider this plan, Of course it does. Simple example in the regression database: regression=# explain select distinct unique1 from tenk1; QUERY PLAN -------------------------------------------------------------------------------- ------------ Unique (cost=0.29..295.29 rows=10000 width=4) -> Index Only Scan using tenk1_unique1 on tenk1 (cost=0.29..270.29 rows=100 00 width=4) (2 rows) I think though that this depends on being an IOS, with a fairly wide and all-all-visible table, in order for the cost estimate to come out cheaper than a seqscan. If you disable IOS then the planner's second choice is a seqscan: regression=# set enable_indexonlyscan to 0; SET regression=# explain select distinct unique1 from tenk1; QUERY PLAN ----------------------------------------------------------------- HashAggregate (cost=483.00..583.00 rows=10000 width=4) Group Key: unique1 -> Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=4) (3 rows) A whole-table plain indexscan, or IOS with any significant number of heap probes needed, is not going to be preferred over a seqscan because of the amount of random I/O it implies. > We're probably missing a few tricks on queries of this type. If the > index-traversal machinery had a mechanism to skip quickly to the next > distinct value, that could be used here: Yeah, I suspect Bill was imagining that that sort of plan could be used; but it requires execution machinery we have not got. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers