Oscar Benjamin added the comment:

I should be clearer about my intentions. I'm hoping to create an efficient and accurate sum() function for the new stdlib statistics module:

http://bugs.python.org/issue18606
http://www.python.org/dev/peps/pep-0450/
The sum() function currently proposed can be seen in this patch:

http://bugs.python.org/file31680/statistics_combined.patch

It uses Fractions to compute an exact sum (the core idea is sketched at the end of this message) and separately keeps track of type coercion so that it can convert the end result. It is accurate and returns a result of the appropriate type for any mix of ints, Fractions, Decimals and floats, with the caveat that mixing Fractions and Decimals is disallowed. I believe the most common use cases will involve ints and floats, and for these it is 100x slower than sum/fsum.

The discussion here:

http://bugs.python.org/issue18606#msg195630

resulted in the sum function that you can see in the above patch. Following that I've had off-list discussions with the author of the module about improving the speed of the sum function (which is a bottleneck for half of the functions exposed by the module). He is very much interested in speed optimisations but doesn't want to compromise on accuracy.

My own interest is that I would like an efficient and accurate (for all types) sum function *somewhere* in the stdlib. The addition of the statistics module with its new sum() function is a good opportunity to achieve this. If this were purely for myself I wouldn't bother this much with speed optimisation, since I've probably already spent more of my own time thinking about this function than I ever would running it! (If I did want to specifically speed this up in my own work I would, as you say, use Cython.)

If fsum() were modified in the way I describe, it would be usable as a primitive for the statistics.sum() function and also for parallel and similar computation (sketched below). I agree that the carry argument is unnecessary and that the main issue is just the exactness of the return value: it seems a shame that fsum could so easily give me an exact result but doesn't.

As for exposing the internals of fsum: is it not the case that a *non-overlapping* set of floats having a given exact sum is a uniquely defined set? For msum I believe it is only necessary to strip out zeros to normalise the output, so that it is uniquely defined by the true value of the sum in question (the list is already in ascending order once the zeros are removed).

Uniqueness: suppose a many-digit value is given by something like (+-)abcdefghijkl * 2**x and we want to break it into non-overlapping 2-digit floats of the form ab * 2**y. Since the first digit of each 2-digit float must be non-zero (treat denormals as a separate case), the largest-magnitude float is uniquely determined. Subtract that from the many-digit total and the next-largest float is uniquely determined, and so on. There can be at most one denormal number in the result, and it too is uniquely determined once the larger numbers are extracted. So the only way that the partials list can be non-unique is by the inclusion of zeros (unless I've missed something :).
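To check that argument numerically, here is a small sketch (my own illustration, not code from any patch) that greedily peels off the nearest double at each step. Each extracted float is forced, so the non-overlapping decomposition (zeros excluded) is unique:

    from fractions import Fraction

    def peel_floats(exact):
        """Decompose an exact dyadic rational into non-overlapping
        floats by repeatedly extracting the nearest double.
        Sketch: assumes the value is within double range; terminates
        because each remainder is at most half an ulp of the last
        extracted float."""
        parts = []
        exact = Fraction(exact)
        while exact:
            f = float(exact)        # nearest double to the remainder
            parts.append(f)
            exact -= Fraction(f)    # exact subtraction, no rounding
        return parts

    # 10**16 + 1 is not representable as a single double:
    print(peel_floats(Fraction(10**16) + 1))   # [1e+16, 1.0]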
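Going back to the Fraction-based approach in the patch: for the float-only case the idea reduces to this minimal sketch (none of the patch's type-coercion machinery is shown). Accumulate exactly as rationals, then round once at the end:

    from fractions import Fraction

    def fraction_sum(xs):
        # Every float converts to a Fraction with no rounding, so
        # the running total is exact; the only rounding happens in
        # the final float() conversion.
        return float(sum(map(Fraction, xs), Fraction(0)))

    print(sum([1e16, 1.0, -1e16]))           # 0.0 -- built-in sum loses the 1.0
    print(fraction_sum([1e16, 1.0, -1e16]))  # 1.0 -- exact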
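For the fsum-returning-partials idea, here is the well-known pure-Python msum recipe (the one cited in the math.fsum documentation), adapted as a sketch to return the partials list with zeros stripped, which by the argument above makes it canonical. Note that the real fsum also handles intermediate overflow, infinities and NaNs, which this sketch ignores:

    def msum_partials(iterable):
        """Return a sorted list of non-overlapping floats whose exact
        sum equals the exact sum of the inputs (Shewchuk's algorithm;
        sketch, no special handling of inf/nan/overflow)."""
        partials = []               # sorted, non-overlapping partial sums
        for x in iterable:
            i = 0
            for y in partials:
                if abs(x) < abs(y):
                    x, y = y, x
                hi = x + y
                lo = y - (hi - x)   # exact roundoff error of x + y
                if lo:
                    partials[i] = lo
                    i += 1
                x = hi
            partials[i:] = [x]
        # Stripping zeros normalises the result: the remaining
        # ascending, non-overlapping list is uniquely determined by
        # the exact value of the sum.
        return [p for p in partials if p]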
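This is what makes such a primitive composable: the exact partials of independent chunks can be concatenated and reduced again with no information lost until a single final rounding, which is why an exact-returning fsum would serve for parallel or chunked summation. A hypothetical usage of the msum_partials sketch above:

    from itertools import chain
    from math import fsum

    data = [1e16, 1.0, -1e16] * 1000
    chunks = [data[i:i + 64] for i in range(0, len(data), 64)]

    # Reduce each chunk exactly, then reduce the concatenation of
    # the per-chunk partials; the final fsum over non-overlapping
    # partials performs the only rounding step.
    per_chunk = [msum_partials(c) for c in chunks]
    combined = msum_partials(chain.from_iterable(per_chunk))
    print(fsum(combined))   # 1000.0, same as fsum(data) in one pass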