For external sorts, cost_sort() assumes the following: * The disk traffic is assumed to be 3/4ths sequential and 1/4th random * accesses (XXX can't we refine that guess?)
... /* Assume 3/4ths of accesses are sequential, 1/4th are not */ startup_cost += npageaccesses * (seq_page_cost * 0.75 + random_page_cost * 0.25); I think that we *can* refine this guess, and should, because random I/O is really quite unlikely to be a large cost these days (I/O in general often isn't a large cost, actually). More fundamentally, I think it's a problem that cost_sort() thinks that external sorts are far more expensive than internal sorts in general. There is good reason to think that that does not reflect the reality. I think we can expect external sorts to be *faster* than internal sorts with increasing regularity in Postgres 10. In one sense, things got worse here in 9.6 when replacement selection was all but killed, since runs became on average half their size, which cost_sort() was taught about at the time. But, cost_sort() didn't care about cases where only one pass is predicted (the great majority of external sorts), so that didn't really matter. cost_sort() doesn't seem to care about the size of runs to be merged at all, which seems limiting. It also seems limiting that cost_sort() doesn't differentiate between internal and external sort *startup* costs at all. External sorts can often start returning tuples sooner, due to the final on-the-fly merge mechanism. It's pretty common to see cases where merging 4 runs takes only slightly less time than an internal sort, and yet are thought to cost more than twice as much. I realize that costs are always a bit fudged, but I am fairly suspicious of how big the differences between external and internal sorts regularly are. I suggest that this be revisited soon. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers