Steven D'Aprano <steve+pyt...@pearwood.info> added the comment:

Thanks Victor for the explanation about pyperf's additional features. They do sound very useful. Perhaps we should consider adding some of them to timeit?
However, in my opinion using the average is statistically wrong.

Using the mean is good when errors are two-sided, that is, when the measured value can be either too low or too high compared to the true value:

    measurement = true value ± random error

If the random errors are symmetrically distributed, then taking the average tends to cancel them out and gives you a better estimate of the true value. Even if the errors aren't symmetrical, the mean is still a better estimate of the true value than any single measurement. (Or perhaps a trimmed mean, or the median, if there are a lot of outliers.)

But timing results are not like that: the measurement errors are one-sided, not two-sided:

    measurement = true value + random error

So by taking the average, all you are doing is averaging the errors, not cancelling them. The result you get is a *worse* estimate of the true value than the minimum.

All those other factors (ignore the warmup, check for a small stdev, etc.) seem good to me. But the minimum, not the mean, is still going to be closer to the true cost of running the code.
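To make that concrete, here is a rough sketch using timeit.repeat; the workload and the repeat/number values are arbitrary, chosen only for illustration:

    # Compare the mean and the minimum of repeated timings of the same
    # statement.  Because the noise from the OS, caches, other processes
    # etc. only ever adds time, the mean sits above the minimum, and the
    # minimum is the closer estimate of the true cost.
    import statistics
    import timeit

    # Arbitrary toy workload and counts, just for the demonstration.
    timings = timeit.repeat("sum(range(1000))", repeat=20, number=10_000)

    print("min:  ", min(timings))
    print("mean: ", statistics.mean(timings))
    print("stdev:", statistics.stdev(timings))

On any machine with background activity, the mean will be inflated by however much noise happened to occur during the runs, while the minimum stays close to the unavoidable cost of executing the statement.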