Re: strange slow query - lost lot of time somewhere

David Rowley Mon, 02 May 2022 21:05:13 -0700

On Tue, 3 May 2022 at 15:22, David G. Johnston
<[email protected]> wrote:
> Plugging in:
> n = 2,000
> e = 500
> c = 10,000
>
> proper = 5%
> incorrect = 25%
>
> But of the 10,000 calls we will receive, the first 2,000 will be misses while 
> 2,000 of the remaining 8,000 will be hits, due to sharing 2,000 distinct 
> groups among the available inventory of 500 (25% of 8,000 is 2,000).  2,000 
> hits in 10,000 calls yields 20%.
>
> I believe the correct formula to be:
>
> ((calls - ndistinct) / calls) * (est_entries / ndistinct) = hit_ratio
> .80 * .25 = .20


I think you're correct here.  The formula should be that.  However,
two things; 1) this being incorrect is not the cause of the original
problem reported on this thread, and 2) There's just no way we could
consider fixing this in v15, let alone back patch it to v14.

Maybe we should open a new thread about this and put an entry in the
first CF for v16 under bugs and come back to it after we branch.
Thinking the cache hit ratio is lower than it actually is going to be
will reduce the chances of the planner switching to a Nested Loop /
Memoize plan vs a Hash or Merge Join plan.

I was already fairly concerned that Memoize could cause performance
regressions when the ndistinct value or expected cache entry size is
underestimated or the outer side rows are overestimated.  What I've
got to calculate the cache hit ratio does seem incorrect given what
you're showing, however it does add an element of pessimism and
reduces the chances of a bad plan being picked when work_mem is too
low to cache all entries.  Swapping it out for your formula seems like
it would increase the chances of a Memoize plan being picked when the
row, ndistinct and cache entry size estimates are correct, however, it
could also increase the chance of a bad plan when being picked in
cases where the estimates are incorrect.

My problem with changing this now would be that we already often
perform Nested Loop joins when a Hash or Merge join would be a better
option. I'd hate to take us in a direction where we make that problem
even worse.

David

Re: strange slow query - lost lot of time somewhere

Reply via email to