I did the following calculation: I generated a list of a million random numbers between 0 and 1, constructed a new list by subtracting the mean from each number, and then calculated the mean of that new list.

The result should be 0, but of course it will differ from 0 slightly because of rounding errors.

However, I noticed that the simple Python program below gives a result of ~ 10^-14, while an equivalent Mathematica program (also using double precision) gives a result of ~ 10^-17, i.e. about three orders of magnitude closer to zero.

Here's the program (pardon my style, I'm a newbie/occasional user):

from random import random

# one million uniform random numbers in [0, 1)
data = [random() for x in xrange(1000000)]

mean = sum(data)/len(data)
# mean of the mean-subtracted values; mathematically this is exactly 0
print sum(x - mean for x in data)/len(data)

A little research shows that Mathematica uses a "compensated summation" algorithm. Indeed, using the algorithm described at
http://en.wikipedia.org/wiki/Kahan_summation_algorithm
gives a result of ~ 10^-17:

def compSum(arr):
    # Kahan (compensated) summation: c carries the rounding error
    # lost when the previous value was added to the running sum s.
    s = 0.0
    c = 0.0
    for x in arr:
        y = x - c        # subtract the error carried over from the last addition
        t = s + y        # big + small: low-order bits of y may be lost here
        c = (t - s) - y  # algebraically zero; in floating point it captures that loss
        s = t
    return s

mean = compSum(data)/len(data)
print compSum(x - mean for x in data)/len(data)


I thought that it would be very nice if the built-in sum() function used this algorithm by default. Has this been brought up before? Would this have any disadvantages (apart from a slight performance impact, but Python is a high-level language anyway ...)?
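For what it's worth, if your Python is recent enough (2.6 or later, if I'm not mistaken), the standard library already provides math.fsum(), which does accurate floating-point summation. A sketch of the same calculation using it, which should give accuracy comparable to the compensated sum above:

from math import fsum  # accurate summation of floats (Python 2.6+)

mean = fsum(data)/len(data)
print fsum(x - mean for x in data)/len(data)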

Szabolcs Horvát
