Tom Lane <t...@sss.pgh.pa.us> wrote:

> Antonin Houska <a...@cybertec.at> writes:
> > Robert Haas <robertmh...@gmail.com> wrote:
> >> These two phases overlap, though. I believe progress reporting for
> >> sorts is really hard.
>
> > Whatever complexity is hidden in the sort, cost_sort() should have taken it
> > into consideration when called via plan_cluster_use_sort(). Thus I think that
> > once we have both startup and total cost, the current progress of the sort
> > stage can be estimated from the current number of input and output
> > rows. Please remind me if my proposal appears to be too simplistic.
>
> Well, even if you assume that the planner's cost model omits nothing
> (which I wouldn't bet on), its result is only going to be as good as the
> planner's estimate of the number of rows to be sorted. And, in cases
> where people actually care about progress monitoring, it's likely that
> the planner got that wrong, maybe horribly so. I think it's a bad idea
> for progress monitoring to depend on the planner's estimates in any way
> whatsoever.
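
To make the quoted proposal concrete, here is a rough sketch of the arithmetic I had in mind. It is not PostgreSQL code: the function and parameter names are made up for illustration, and it assumes that the startup cost roughly covers consuming the input while the rest of the total cost covers returning the sorted rows. It does not model the overlap of the two phases that Robert mentioned, and it is only as good as the row estimate it is given.

```python
def sort_progress(startup_cost, total_cost, rows_in, rows_out, est_rows):
    """Estimate the fraction (0.0 to 1.0) of the sort stage already done.

    All names are hypothetical. startup_cost and total_cost stand for the
    two cost figures of the sort; est_rows is the estimated number of rows
    to be sorted; rows_in and rows_out are the rows consumed and returned
    so far.
    """
    if total_cost <= 0 or est_rows <= 0:
        return 0.0

    # The row estimate may be too low, so never let a phase exceed 100%.
    in_frac = min(rows_in / est_rows, 1.0)
    out_frac = min(rows_out / est_rows, 1.0)

    # Guard against a startup cost outside the range [0, total_cost].
    startup_cost = min(max(startup_cost, 0.0), total_cost)
    run_cost = total_cost - startup_cost

    # Weight each phase by its share of the total cost.
    done = startup_cost * in_frac + run_cost * out_frac
    return done / total_cost
```

With a startup cost of 80 out of a total of 100, half of the estimated input consumed and no output yet, this reports 40% done. If the row estimate is badly off, the reported fraction is off in the same proportion, which is exactly the weakness Tom points out.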

The general idea was that some sort of prediction of the total cost is needed
anyway if we are to tell during execution what fraction of the work has
already been done. Also, the cost computation that we perform during
execution shouldn't (ideally) differ from cost_sort(). So I thought that it
would be easier to refine cost_sort() than to implement the same computation
from scratch elsewhere.

Besides that, I see 2 circumstances that make the estimate of the number of
input tuples simpler in the CLUSTER case:

* There's only 1 input relation, without any kind of clause.

* CLUSTER uses SnapshotAny, so pg_class(reltuples) is closer to the actual
  number of input rows than it would be in the general case.

(Of course, pg_class would only be useful for the initial estimate.) Unlike
the planner, the executor could recalculate the cost estimate at some point(s)
as it recognizes that the actual number of tuples per page appears to differ
from the density derived from pg_class initially.

Still wrong?

-- 
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at