Well, a few days ago I didn't know about the `statistics` module, so I
wrote my own median implementation, which I improved with the help of a
private discussion:
```
import math


def median(it, member=False, sort_fn=sorted, **kwargs):
    if sort_fn is None:
        # Don't sort. The caller must be careful to pass an already sorted iterable
        sorted_it = it
    else:
        sorted_it = sort_fn(it, **kwargs)

    try:
        len_it = len(sorted_it)
    except TypeError:
        # Generator, iterator et similia
        sorted_it = tuple(sorted_it)
        len_it = len(sorted_it)

    if len_it == 0:
        raise ValueError("Iterable is empty")

    index = len_it // 2

    if isEven(len_it):
        res1 = sorted_it[index]
        res2 = sorted_it[index-1]

        if isNaN(res1):
            return res2
        if isNaN(res2):
            return res1

        if member:
            # To remove bias
            if isEven(index):
                return min(res1, res2)
            else:
                return max(res1, res2)
        else:
            res = (res1 + res2) / 2
    else:
        res = sorted_it[index]

    return res


def isEven(num):
    return num % 2 == 0


def isNaN(value):
    # math.isnan() raises TypeError for non-numeric values such as strings
    try:
        return math.isnan(value)
    except TypeError:
        return False
```
As you can see, `sort_fn` lets you pass a different sorting function,
for example the pandas one (though I do not recommend it, pandas is slow).
Or you can pass None to skip sorting, if you have already sorted the
iterable and there's no reason to sort it again.
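To make the `sort_fn` options concrete, here is a minimal sketch. `middle` is a hypothetical, cut-down stand-in for the function above: it only covers the sorting step and returns the upper-middle element, so it is an illustration of the parameter, not the full implementation.

```python
def middle(it, sort_fn=sorted, **kwargs):
    """Hypothetical cut-down helper: return the upper-middle element."""
    # sort_fn=None means "trust the caller, the data is already sorted"
    seq = tuple(it) if sort_fn is None else tuple(sort_fn(it, **kwargs))
    if not seq:
        raise ValueError("Iterable is empty")
    return seq[len(seq) // 2]


words = ["pear", "fig", "banana"]

# Default: sorted() is applied; extra keyword arguments are forwarded to it
print(middle(words))            # alphabetical order
print(middle(words, key=len))   # ordered by length

# Already sorted input: skip the sort entirely
print(middle([1, 2, 3], sort_fn=None))
```

Forwarding `**kwargs` means anything `sorted()` accepts, such as `key` or `reverse`, can be passed through without extra parameters.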
Furthermore, if the iterable has even length and the elements are not
numbers, you can calculate the median in a predictable way by passing
member=True. It will return one of the two central elements, in an
unbiased way. So you don't need median_high() or median_low() in
these cases.
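A small sketch of the selection rule may help here. `pick_member` is a hypothetical helper that isolates only the member=True branch and assumes its input is already sorted and has even length; it is compared with `statistics.median_low()` and `statistics.median_high()`, which always pick the same side.

```python
import statistics


def pick_member(sorted_seq):
    """Hypothetical helper mirroring the member=True rule for even lengths."""
    index = len(sorted_seq) // 2
    low, high = sorted_seq[index - 1], sorted_seq[index]
    # Alternate between the low and the high candidate on the parity of index,
    # so that neither side is favoured across different input lengths
    return low if index % 2 == 0 else high


for data in (["a", "b"], ["a", "b", "c", "d"], ["a", "b", "c", "d", "e", "f"]):
    print(data,
          statistics.median_low(data),
          statistics.median_high(data),
          pick_member(data))
```

On sorted input, `low` and `high` are the same values that `min(res1, res2)` and `max(res1, res2)` would give in the function above.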
Finally, if the iterable has even length and one of the two central
values is NaN, the other value is returned. The function returns NaN
only if both are NaN.
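The NaN rule can be shown in isolation. `central_result` below is a hypothetical helper that takes the two central values directly and assumes they are floats; it only sketches this one step of the function.

```python
import math


def central_result(res1, res2):
    """Hypothetical helper: combine the two central values of an even-length input."""
    if math.isnan(res1):
        return res2
    if math.isnan(res2):
        return res1
    return (res1 + res2) / 2


nan = float("nan")
print(central_result(2.0, 4.0))   # plain average of the two values
print(central_result(nan, 4.0))   # the NaN is skipped, the other value is returned
print(central_result(nan, nan))   # both are NaN, so the result is NaN
```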
_______________________________________________
Python-ideas mailing list -- [email protected]
Message archived at
https://mail.python.org/archives/list/[email protected]/message/KN6BSMJRVCPSQW32DTWQHTGZ5E3E5KK2/