> Right. But that seems fraught with difficulty. I suspect that the
> costs that the planner attributes to each plan often aren't very
> reliable in any absolute sense, even when everything is working very
> well by every available metric. Even a very noisy cost model with
> somewhat inaccurate selectivity estimates will often pick the cheapest
> plan, or close enough.

Sure, the absolute cost of a complex plan will always be inaccurate at best.
My point is that we can be very confident in the cardinalities of base tables. 
As the paper states in "3.1. Estimates for Base Tables":

"The median q-error is close to the optimal value of 1 for all systems,
indicating that the majority of all selections are estimated correctly."

(The q-error is the factor by which an estimate deviates from the true
cardinality, i.e. max(estimate/actual, actual/estimate), so a median close to 1
means those estimates are nearly exact.)

Thanks to the statistics, we will practically never be off by an order of
magnitude when estimating base table cardinalities.

The paper also clearly shows (and that certainly matches my experience) that 
cardinality underestimation grows exponentially as it propagates up the join 
tree.
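
As a toy illustration (the per-join factor is assumed here, not taken from the 
paper): if every join level underestimates by a factor of 3, a plan four joins 
deep ends up underestimated by 3^4 = 81, i.e. close to two orders of magnitude.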

Given the research, I'd argue that at any given level of the join tree, the 
join depth itself is a reasonable indicator of how badly the cardinality is 
underestimated. Taking that into account (even if only to mitigate nested 
loops at higher levels) is IMV a principled approach, and not necessarily a hack.
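
To make the idea concrete, here is a minimal sketch. This is not existing
planner code, and the function name, the factor of 2, and the exponential form
are all mine, purely for illustration:

    #include <math.h>
    #include <stdio.h>

    /*
     * Hypothetical sketch only: inflate a join rel's row estimate by a
     * factor that grows with its depth in the join tree, so that plans
     * that look cheap only because of deep underestimation (typically
     * nested loops high up) get penalized.
     */
    static double
    inflate_rows_for_depth(double estimated_rows, int join_depth)
    {
        const double per_level_factor = 2.0;    /* assumed value, would need research/tuning */

        return estimated_rows * pow(per_level_factor, (double) join_depth);
    }

    int
    main(void)
    {
        /* a 100-row estimate three levels up the join tree -> treated as 800 rows */
        printf("%.1f\n", inflate_rows_for_depth(100.0, 3));
        return 0;
    }

Whether such an inflation should be exponential, linear, or clamped is exactly 
the kind of question the paper's data could help answer.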

Obviously having something like error bars as proposed by Tom would be even 
better and perhaps more general, but that is on a whole different level in 
terms of complexity and I certainly have no idea how we would easily get there.

