Josh Rosenberg <shadowranger+pyt...@gmail.com> added the comment:

vstinner: The problem isn't the averaging, it's the type inconsistency. In both 
examples (median([1]), median([1, 1])), the median is unambiguously 1 (no 
actual average is needed; the values are identical), yet it gets converted to 
1.0 only in the latter case.
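
For reference, the inconsistency can be reproduced with a short check against 
the stdlib statistics module. This is only an illustration; the variable names 
are mine, and the commented results are what I'd expect from CPython's current 
implementation:

```python
from statistics import median

odd = median([1])      # odd length: the middle element itself is returned
even = median([1, 1])  # even length: (1 + 1) / 2, using true division

print(type(odd).__name__, odd)    # int 1
print(type(even).__name__, even)  # float 1.0
```

The two results compare equal, but their types differ.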

I'm not sure it's possible to fix this though; right now, there is consistency 
among two cases:

1. When the length is odd, you get the median by identity (and therefore type 
and value are unchanged)
2. When the length is even, you get the median by adding and dividing by 2 (so 
for ints, the result is always float).

A fix that changed that would add yet another layer of complexity:

1. When the length is odd, you get the median by identity (and therefore type 
and value are unchanged)
2. When the length is even, 
  a. If the two middle values are equal (possibly only if they have equal types 
as well, to resolve the issue with [1, 1.0] or [1, True]), return the first of 
the two middle values (median by identity as in #1)
  b. Otherwise, you get the median by adding and dividing by 2

And note the type checking required in 2a to even make it that 
consistent. Even if we accepted that, we'd pretty quickly get into a debate 
over whether median([3, 5]) should try to return 4 instead of 4.0, given that 
the median is representable in the source type (which would further damage 
consistency).
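
To make the extra complexity concrete, here is a rough sketch of what rules 1, 
2a and 2b might look like. median_keep_type is a hypothetical name, not 
anything in the statistics module, and the exact-type check is just one 
possible reading of "equal types":

```python
def median_keep_type(data):
    """Hypothetical median following rules 1, 2a and 2b (not the stdlib one)."""
    data = sorted(data)
    n = len(data)
    if n == 0:
        raise ValueError("no median for empty data")
    mid = n // 2
    if n % 2 == 1:
        # Rule 1: odd length, median by identity.
        return data[mid]
    lo, hi = data[mid - 1], data[mid]
    if type(lo) is type(hi) and lo == hi:
        # Rule 2a: equal values of the same exact type, median by identity.
        return lo
    # Rule 2b: otherwise average the two middle values.
    return (lo + hi) / 2


print(median_keep_type([1, 1]))     # 1   (int preserved by rule 2a)
print(median_keep_type([1, True]))  # 1.0 (types differ, so rule 2b applies)
print(median_keep_type([3, 5]))     # 4.0 (still float under rule 2b)
```

Under this sketch, median_keep_type([3, 5]) still returns a float, which is 
exactly where the next round of debate would start.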

If anything, I think the best design would have been to *always* include a 
division step (so odd length cases performed middle_elem / 1, while even did 
(middle_elem1 + middle_elem2) / 2) so the behavior was consistent regardless 
of odd vs. even input length, but that ship has probably sailed, given the 
documented behavior specifically notes that the precise middle data point is 
itself returned for the odd case.
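
As a sketch of that alternative design (purely hypothetical; 
median_always_divide is an illustrative name and this is not how the stdlib 
behaves):

```python
from fractions import Fraction


def median_always_divide(data):
    """Hypothetical median that always divides, for odd and even lengths."""
    data = sorted(data)
    n = len(data)
    if n == 0:
        raise ValueError("no median for empty data")
    mid = n // 2
    if n % 2 == 1:
        # Dividing by 1 applies the same type conversion as the even case.
        return data[mid] / 1
    return (data[mid - 1] + data[mid]) / 2


print(median_always_divide([1]))     # 1.0
print(median_always_divide([1, 1]))  # 1.0
print(median_always_divide([Fraction(1)]))  # 1 (still a Fraction)
```

With ints, both lengths now produce float; types whose division preserves the 
type (Fraction, Decimal) are unaffected.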

I think the solution for people concerned is to explicitly convert int values 
to be median-ed to fractions.Fraction (or decimal.Decimal) ahead of time, so 
floating point math never gets involved, and the return type is consistent 
regardless of length.
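
A minimal example of that workaround, using only the stdlib (the input lists 
are made up for illustration):

```python
from decimal import Decimal
from fractions import Fraction
from statistics import median

odd_ints = [1, 3, 5]
even_ints = [1, 2]

# Convert up front so the even-length division stays exact.
frac_odd = median([Fraction(v) for v in odd_ints])
frac_even = median([Fraction(v) for v in even_ints])
dec_even = median([Decimal(v) for v in even_ints])

print(repr(frac_odd))   # Fraction(3, 1)
print(repr(frac_even))  # Fraction(3, 2)
print(repr(dec_even))   # Decimal('1.5')
```

Either way the result has the same type as the converted inputs, whether the 
length is odd or even.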

----------
nosy: +josh.r

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue35698>
_______________________________________