On Fri, Aug 21, 2009 at 6:54 PM, decibel <deci...@decibel.org> wrote:

> Would it? Risk seems like it would just be something along the lines of
> the high-end of our estimate. I don't think confidence should be that hard
> either. IE: hard-coded guesses have a low confidence. Something pulled
> right out of most_common_vals has a high confidence. Something estimated
> via a bucket is in-between, and perhaps adjusted by the number of tuples.
>

I used to advocate a similar idea, but when questioned on-list I tried to
work out the details and ran into problems coming up with a concrete plan.

How do you compare a plan that you think has a 99% chance of running in 1ms
but a 1% chance of taking 1s against a plan that has a 90% chance of taking
1ms and a 10% chance of taking 100ms? Which one is actually riskier? They
might even both have the same 95th-percentile run-time.
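
To make that concrete, here is a toy calculation (purely illustrative
numbers from the paragraph above, treating each plan as a two-point
run-time distribution; this is not anything the planner actually computes):

# Purely illustrative: the two hypothetical plans above, modelled as
# two-point run-time distributions of (probability, run time in ms).
plan_a = [(0.99, 1.0), (0.01, 1000.0)]   # 99% chance of 1ms, 1% chance of 1s
plan_b = [(0.90, 1.0), (0.10, 100.0)]    # 90% chance of 1ms, 10% chance of 100ms

def expected_ms(plan):
    # Expected run time: sum over outcomes of probability * run time.
    return sum(p * t for p, t in plan)

def percentile_ms(plan, q):
    # Smallest run time t with cumulative probability >= q.
    cumulative = 0.0
    for p, t in sorted(plan, key=lambda pt: pt[1]):
        cumulative += p
        if cumulative >= q:
            return t
    return max(t for _, t in plan)  # fallback for float round-off

for name, plan in (("plan A", plan_a), ("plan B", plan_b)):
    print(name, expected_ms(plan), percentile_ms(plan, 0.95))

With those particular numbers the expected run times come out nearly
identical (10.99ms vs 10.9ms) while the tails look completely different,
which is exactly the problem: no single summary statistic obviously
captures "risk".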

There are also different types of unknowns. Do you want to treat plans
where we have a statistical sample that gives us a probabilistic answer the
same way as plans where we think our model itself has a 10% chance of being
wrong? The model is going to be either consistently right or consistently
wrong for a given query, but the sample-based estimate will vary from run
to run (or vice versa, depending on the situation).
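
A toy illustration of that difference (made-up numbers, nothing to do with
the real estimator code): re-drawing a sample gives an answer that scatters
around the truth, while a hard-coded modelling assumption gives the same
answer every time, right or wrong:

import random

TRUE_SELECTIVITY = 0.20   # fraction of rows that actually match (made up)
SAMPLE_SIZE = 300

def sampled_estimate(rng):
    # Estimate selectivity from a random sample: unbiased, but it varies
    # from one sample to the next.
    hits = sum(rng.random() < TRUE_SELECTIVITY for _ in range(SAMPLE_SIZE))
    return hits / SAMPLE_SIZE

def model_estimate():
    # A hard-coded guess: the same answer every time, consistently right
    # or consistently wrong for a given query.
    return 0.33

rng = random.Random(42)
print("sampled:", [round(sampled_estimate(rng), 3) for _ in range(5)])
print("model:  ", [model_estimate() for _ in range(5)])

The sampled estimates scatter around 0.20; the hard-coded guess is always
0.33. Folding those two kinds of error into one "confidence" number is the
part I never worked out.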


-- 
greg
