Hi,

Right now we don't indicate that a top-N sort is going to be used in plain EXPLAIN, only in EXPLAIN ANALYZE. That's imo suboptimal, because one might quite legitimately want to know that before actually executing the query: it makes a huge difference to the query's actual resource intensity (the top-N heapsort below needs all of 25kB, while a full sort of the ~550M input rows would be enormously more expensive).
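For reference, the shape of the table involved is essentially the following sketch (the column types are guesses from the plan's width=45, and the ~550M row count is taken from the actual row counts in the ANALYZE output below):

-- hypothetical recreation of the table behind the plans below;
-- the real types may differ
CREATE TABLE hashes (hash text, count bigint);
-- filled with roughly 550 million rows, with no index on count,
-- so the ORDER BY count DESC LIMIT 10 below has to sort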
postgres[28165][1]=# EXPLAIN (VERBOSE) SELECT * FROM hashes ORDER BY count DESC LIMIT 10;
┌───────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                              QUERY PLAN                                                │
├───────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Limit  (cost=12419057.53..12419058.70 rows=10 width=45)                                                │
│   Output: hash, count                                                                                  │
│   ->  Gather Merge  (cost=12419057.53..66041805.65 rows=459591466 width=45)                            │
│         Output: hash, count                                                                            │
│         Workers Planned: 2                                                                             │
│         ->  Sort  (cost=12418057.51..12992546.84 rows=229795733 width=45)                              │
│               Output: hash, count                                                                      │
│               Sort Key: hashes.count DESC                                                              │
│               ->  Parallel Seq Scan on public.hashes  (cost=0.00..7452254.33 rows=229795733 width=45) │
│                     Output: hash, count                                                                │
└───────────────────────────────────────────────────────────────────────────────────────────────────────┘
(10 rows)

postgres[28165][1]=# EXPLAIN (VERBOSE, ANALYZE) SELECT * FROM hashes ORDER BY count DESC LIMIT 10;
┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                                         QUERY PLAN                                                                          │
├─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Limit  (cost=12419057.53..12419058.70 rows=10 width=45) (actual time=115204.278..115205.024 rows=10 loops=1)                                                │
│   Output: hash, count                                                                                                                                       │
│   ->  Gather Merge  (cost=12419057.53..66041805.65 rows=459591466 width=45) (actual time=115204.276..115205.020 rows=10 loops=1)                            │
│         Output: hash, count                                                                                                                                 │
│         Workers Planned: 2                                                                                                                                  │
│         Workers Launched: 2                                                                                                                                 │
│         ->  Sort  (cost=12418057.51..12992546.84 rows=229795733 width=45) (actual time=115192.189..115192.189 rows=7 loops=3)                               │
│               Output: hash, count                                                                                                                           │
│               Sort Key: hashes.count DESC                                                                                                                   │
│               Sort Method: top-N heapsort  Memory: 25kB                                                                                                     │
│               Worker 0: Sort Method: top-N heapsort  Memory: 25kB                                                                                           │
│               Worker 1: Sort Method: top-N heapsort  Memory: 25kB                                                                                           │
│               Worker 0: actual time=115186.558..115186.559 rows=10 loops=1                                                                                  │
│               Worker 1: actual time=115186.540..115186.540 rows=10 loops=1                                                                                  │
│               ->  Parallel Seq Scan on public.hashes  (cost=0.00..7452254.33 rows=229795733 width=45) (actual time=0.080..90442.215 rows=183836589 loops=3) │
│                     Output: hash, count                                                                                                                     │
│                     Worker 0: actual time=0.111..90366.999 rows=183976442 loops=1                                                                           │
│                     Worker 1: actual time=0.107..90461.921 rows=184707680 loops=1                                                                           │
│ Planning Time: 0.121 ms                                                                                                                                     │
│ Execution Time: 115205.053 ms                                                                                                                               │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(20 rows)

It's also noticeable that we preposterously assume that the sort will actually return exactly the number of rows in the table, despite it being a top-N style sort. That seems bad for the costing of the parallel query, because I think we'll then assume that costs tups * parallel_tuple_cost (rough arithmetic in the PS)? I'm also unclear as to why the Gather Merge ends up with twice as many estimated rows as there are in the table.

Greetings,

Andres Freund
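PS: To put rough numbers on the costing concern above, assuming the default parallel_tuple_cost of 0.1 (back-of-the-envelope arithmetic on the plan's numbers, not something read off the costing code):

-- tuple-transfer component if the Gather Merge is charged for all
-- 459591466 estimated input rows:
SELECT 459591466 * 0.1;  -- => 45959146.6
-- i.e. the bulk of the ~53.6M difference between the Gather Merge's
-- startup cost (12419057.53) and its total cost (66041805.65), even
-- though the Limit will consume at most 10 of those rows.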