Andrew Dunstan wrote
> On 01/20/2015 01:26 PM, Arne Scheffer wrote:
>>
>> And a very minor aspect:
>> The term "standard deviation" in your code stands for
>> (corrected) sample standard deviation, I think,
>> because you divide by n-1 instead of n to keep the
>> estimator unbiased.
>> How about mentioning the prefix "sample"
>> to indicate this being the estimator?
>
>
> I don't understand. I'm following pretty exactly the calculations stated
> at <http://www.johndcook.com/blog/standard_deviation/>
>
>
> I'm not a statistician. Perhaps others who are more literate in
> statistics can comment on this paragraph.

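For reference, here is a minimal Python sketch of a running-variance calculation in the style of the linked page (Welford's method). It is not the patch's code; the function and variable names are hypothetical and chosen only for illustration. It shows that the recurrence accumulates a mean and a sum of squared deviations, and that the divisor (n or n-1) is applied only when the result is read out.

```python
import math


def running_stats(values):
    """Accumulate count, mean, and sum of squared deviations in one pass."""
    n = 0
    mean = 0.0
    sum_sq = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        n += 1
        delta = x - mean            # deviation from the old mean
        mean += delta / n           # update the mean
        sum_sq += delta * (x - mean)  # old deviation times new deviation
    return n, mean, sum_sq


def population_stddev(values):
    """Divide by n: treats the values as the entire population."""
    n, _, sum_sq = running_stats(values)
    return math.sqrt(sum_sq / n) if n > 0 else 0.0


def sample_stddev(values):
    """Divide by n-1 (Bessel's correction): treats the values as a sample."""
    n, _, sum_sq = running_stats(values)
    return math.sqrt(sum_sq / (n - 1)) if n > 1 else 0.0
```

For the values 2, 4, 4, 4, 5, 5, 7, 9 the population figure is 2 and the sample figure is about 2.14, so the choice of divisor is visible in the result for small n and fades as n grows.
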
I'm largely in the same boat as Andrew, but I take it that Arne is referring to Bessel's correction:

http://en.wikipedia.org/wiki/Bessel's_correction

However, the mere presence of an (n-1) divisor does not mean that is what is happening. In this particular situation I believe the (n-1) is simply a necessary part of the recurrence formula and not an attempt to correct for sampling bias when estimating a population's variance. In fact, as far as the database knows, the values provided to this function represent an entire population, so such a correction would be unnecessary.

I guess it boils down to whether "future" queries are considered part of the population, or whether the population changes each time a query is run and we are therefore calculating an ever-changing population variance. Note point 3 in the linked Wikipedia article.

David J.

--
View this message in context: http://postgresql.nabble.com/Add-min-and-max-execute-statement-time-in-pg-stat-statement-tp5774989p5834805.html
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)