Wolfgang Maier added the comment:

> -----Original Message-----
> From: Steven D'Aprano [mailto:rep...@bugs.python.org]
> Sent: Sunday, 2 February 2014 12:55
> To: wolfgang.ma...@biologie.uni-freiburg.de
> Subject: [issue20479] Efficiently support weight/frequency mappings in the
> statistics module
>
>
> Steven D'Aprano added the comment:
>
> Off the top of my head, I can think of three APIs:
>
> (1) separate functions, as Nick suggests:
> mean vs weighted_mean, stdev vs weighted_stdev
>
> (2) treat mappings as an implied (value, frequency) pairs
>
(2) is clearly my favourite. (1) may work well if only a small fraction of a module's functions need an alternate API. In the statistics module, however, almost all of the current functions could profit from a way to treat mappings specially. In that case, (1) is prone to creating a lot of redundancy.

I do not share Oscar's opinion that

> apart from mode() the implementation of each function on
> map-format data will be completely different from the iterable version
> so you'd want to have it as a separate function at least internally
> anyway.

Consider _sum's current code (docstring omitted for brevity):

    def _sum(data, start=0):
        n, d = _exact_ratio(start)
        T = type(start)
        partials = {d: n}  # map {denominator: sum of numerators}
        # Micro-optimizations.
        coerce_types = _coerce_types
        exact_ratio = _exact_ratio
        partials_get = partials.get
        # Add numerators for each denominator, and track the "current" type.
        for x in data:
            T = _coerce_types(T, type(x))
            n, d = exact_ratio(x)
            partials[d] = partials_get(d, 0) + n
        if None in partials:
            assert issubclass(T, (float, Decimal))
            assert not math.isfinite(partials[None])
            return T(partials[None])
        total = Fraction()
        for d, n in sorted(partials.items()):
            total += Fraction(n, d)
        if issubclass(T, int):
            assert total.denominator == 1
            return T(total.numerator)
        if issubclass(T, Decimal):
            return T(total.numerator)/total.denominator
        return T(total)

All you'd have to do to treat mappings as proposed here is add a check for whether data is a mapping and, if it is, replace the for loop:

    for x in data:
        T = _coerce_types(T, type(x))
        n, d = exact_ratio(x)
        partials[d] = partials_get(d, 0) + n

with this one:

    for x,m in data.items():
        T = _coerce_types(T, type(x))
        n, d = exact_ratio(x)
        partials[d] = partials_get(d, 0) + n*m

No other changes are needed (though I haven't tested this carefully).
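To make the idea concrete, here is a minimal, self-contained sketch of the same change. It is not the statistics module's code. The names sum_with_frequencies and exact_ratio are hypothetical; exact_ratio is a simplified stand-in for the private _exact_ratio helper. The sketch assumes finite values and integer frequencies, and it always returns a Fraction instead of coercing back to the input type as _sum does.

```python
from collections.abc import Mapping
from fractions import Fraction


def exact_ratio(x):
    """Return x as an exact (numerator, denominator) pair.

    Simplified stand-in for the module's private helper; it only
    handles finite int, float, Decimal and Fraction values.
    """
    f = Fraction(x)
    return f.numerator, f.denominator


def sum_with_frequencies(data):
    """Exact sum of data, which is an iterable or a {value: frequency} mapping.

    Hypothetical name; frequencies are assumed to be integers.
    """
    partials = {}  # map {denominator: sum of numerators}
    if isinstance(data, Mapping):
        pairs = data.items()
    else:
        # A plain iterable is the special case where every frequency is 1.
        pairs = ((x, 1) for x in data)
    for x, m in pairs:
        n, d = exact_ratio(x)
        # The only difference from the plain loop is the factor m.
        partials[d] = partials.get(d, 0) + n * m
    total = Fraction()
    for d, n in sorted(partials.items()):
        total += Fraction(n, d)
    return total
```

With this sketch, the mapping {1: 2, 2.5: 4} and the expanded list [1, 1, 2.5, 2.5, 2.5, 2.5] give the same total, which is the point of the argument: the mapping and iterable cases share everything except the accumulation step.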
Wolfgang

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue20479>
_______________________________________