New submission from Raymond Hettinger <raymond.hettin...@gmail.com>:

The current mean() function makes heroic efforts to achieve last bit accuracy 
and when possible to retain the data type of the input.

What is needed is an alternative that has a simpler signature, is much 
faster, is highly accurate without demanding perfection, and does what 
people usually expect mean() to do, the same as their calculators or 
numpy.mean():

   import math
   from typing import Sequence
   def fmean(seq: Sequence[float]) -> float:
       return math.fsum(seq) / len(seq)
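
To make the trade-off concrete, here is a minimal sketch comparing the 
existing statistics.mean() with the two-line fmean() proposed above. The 
sample values and variable names are illustrative assumptions, not part of 
the proposal.

```python
import math
import statistics
from decimal import Decimal
from fractions import Fraction
from typing import Sequence


def fmean(seq: Sequence[float]) -> float:
    # Same definition as proposed above: accurate summation, then one division.
    return math.fsum(seq) / len(seq)


# Illustrative inputs (assumed for this sketch).
fracs = [Fraction(1, 4), Fraction(3, 4)]
decimals = [Decimal("1.5"), Decimal("2.5")]

# statistics.mean() retains the data type of the input ...
typed_fraction_mean = statistics.mean(fracs)
typed_decimal_mean = statistics.mean(decimals)

# ... while the fsum-based version always converts to float.
float_fraction_mean = fmean(fracs)
float_decimal_mean = fmean(decimals)

# math.fsum() avoids the rounding error that a plain sum() accumulates,
# which is where the "highly accurate" part comes from.
tenths = [0.1] * 10
plain_total = sum(tenths)
fsum_total = math.fsum(tenths)
```

The point of the sketch is only that fmean() gives up type preservation in 
exchange for a single float code path.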

On my current 3.8 build, this code gives an approximately 500x speed-up 
(almost three orders of magnitude). Note that having a fast fmean() function 
is important in resampling statistics, where mean() is typically called many 
times:
http://statistics.about.com/od/Applications/a/Example-Of-Bootstrapping.htm 
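
As a rough illustration of why the call count matters, the following sketch 
bootstraps the mean of a small sample. The helper name bootstrap_means(), the 
seed, the sample size, and the number of resamples are all assumptions chosen 
for the example; the mean is computed once per resample, so its cost dominates.

```python
import math
import random
from typing import List, Sequence


def fmean(seq: Sequence[float]) -> float:
    # fsum-based mean, as proposed in this issue.
    return math.fsum(seq) / len(seq)


def bootstrap_means(data: Sequence[float], n_resamples: int,
                    rng: random.Random) -> List[float]:
    # Hypothetical helper: draw n_resamples samples with replacement,
    # each the same size as data, and record the mean of each one.
    n = len(data)
    return [fmean(rng.choices(data, k=n)) for _ in range(n_resamples)]


rng = random.Random(12345)            # fixed seed so the run is repeatable
data = [rng.gauss(10.0, 2.0) for _ in range(50)]

# 1000 resamples means 1000 calls to the mean function.
means = sorted(bootstrap_means(data, 1000, rng))

# Rough 95% percentile interval taken from the sorted resampled means.
low, high = means[25], means[974]
```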


$ ./python.exe -m timeit -r 11 -s 'from random import random' \
    -s 'from statistics import mean' \
    -s 'seq = [random() for i in range(10_000)]' 'mean(seq)'
50 loops, best of 11: 6.8 msec per loop

$ ./python.exe -m timeit -r 11 -s 'from random import random' \
    -s 'from math import fsum' -s 'mean=lambda seq: fsum(seq)/len(seq)' \
    -s 'seq = [random() for i in range(10_000)]' 'mean(seq)'
2000 loops, best of 11: 155 usec per loop

----------
assignee: steven.daprano
components: Library (Lib)
messages: 334894
nosy: rhettinger, steven.daprano, tim.peters
priority: normal
severity: normal
status: open
title: Add statistics.fmean(seq)
type: behavior
versions: Python 3.8

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue35904>
_______________________________________