On Fri, Aug 21, 2009 at 6:54 PM, decibel <deci...@decibel.org> wrote:
> Would it? Risk seems like it would just be something along the lines of
> the high-end of our estimate. I don't think confidence should be that hard
> either. IE: hard-coded guesses have a low confidence. Something pulled
> right out of most_common_vals has a high confidence. Something estimated
> via a bucket is in-between, and perhaps adjusted by the number of tuples.

I used to advocate a similar idea. But when questioned on list I tried to
work out the details and ran into some problems coming up with a concrete
plan.

How do you compare a plan that you think has a 99% chance of running in 1ms
but a 1% chance of taking 1s against a plan that has a 90% chance of 1ms and
a 10% chance of taking 100ms? Which one is actually riskier? They might even
both have the same 95th-percentile run-time.

And there are different types of unknowns as well. Do you want to treat
plans where we have a statistical sample that gives us a probabilistic
answer the same as plans where we think our model just has a 10% chance of
being wrong? The model is going to be either consistently right or
consistently wrong for a given query, but the sample will vary from run to
run. (Or vice versa, depending on the situation.)
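For what it's worth, the arithmetic is part of what makes that comparison
awkward: those two made-up distributions have nearly identical expected
run-times, and which one looks "riskier" depends entirely on which statistic
you ask about. A quick sketch (plain Python; the distributions are just the
invented numbers from the paragraph above, and expected_ms/percentile_ms are
throwaway helpers, nothing here comes out of the planner itself):

    # Each hypothetical plan is a list of (runtime_ms, probability) pairs,
    # matching the made-up numbers in the example above.
    plan_a = [(1, 0.99), (1000, 0.01)]  # 99% chance of 1ms, 1% chance of 1s
    plan_b = [(1, 0.90), (100, 0.10)]   # 90% chance of 1ms, 10% of 100ms

    def expected_ms(plan):
        # Probability-weighted mean run-time.
        return sum(t * p for t, p in plan)

    def percentile_ms(plan, q):
        # Smallest run-time whose cumulative probability reaches q.
        cumulative = 0.0
        for t, p in sorted(plan):
            cumulative += p
            if cumulative >= q:
                return t
        return max(t for t, p in plan)

    for name, plan in (("A", plan_a), ("B", plan_b)):
        print(name, expected_ms(plan),
              percentile_ms(plan, 0.95), percentile_ms(plan, 0.99))
    # A: mean 10.99ms, p95 = p99 = 1ms, worst case 1000ms
    # B: mean 10.90ms, p95 = p99 = 100ms, worst case 100ms

Judged by the mean or the 99th percentile, plan A looks better; judged by
the worst case, plan B does. Any single "risk" number has to take a position
on that, and that is the detail I could never pin down.

--
greg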