On 01/11/2018 03:43 PM, David Fetter wrote:
Is the assumption of a normal distribution reasonable for outlier plans as you've seen them?
This is a difficult but fair question.First of all, I'd like to clarify that the normal distribution is assumed for the set of all execution times matching a queryid; No assumptions are made about the distribution of the outliers themselves. The primary goal of this approach was the limitation of plan updates, to avoid unnecessary IO operations. When query performance does not vary much, no updates of the plans should be necessary, but as soon as query performance varies too much, the new plan should be stored. For the purpose of distinguishing reasonable variance between execution times and great variance due to changing conditions which ultimately might result in a different plan, the assumption of a normal distribution for all execution times suits well.
Based on some early testing, this results in only a few percent of updates (1-3%) in relation to the total number of calls, when running some short pgbench tests. As the sample size grows, the assumption of a normal distribution becomes increasingly accurate and the (unnecessary) sampling of plans decreases. In a different test, I ran several queries with identical table sizes, the queries were fairly simple, and the statistical evaluation led to few updates during these tests. When I increased the table size significantly, the database switched to a different plan. Because the execution time differed significantly, this new bad plan was stored. Similarly, after running a certain query a couple of times, I created an index on my test data, which resulted in a speedup which was significant enough to result in an update of the good plan.
Now, if a change to the data or the index situation only resulted in an insignificant performance increase or decrease (one that falls into the interval defined as [mean - 1.5*IQD, mean + 1-5*IQD] ), I think it might be possible to assume that we are not interested in an updated plan for this scenario.
smime.p7s
Description: S/MIME Cryptographic Signature