[issue33084] Computing median, median_high and median_low in statistics library

Steven D'Aprano Sun, 07 Oct 2018 07:36:02 -0700


Steven D'Aprano <steve+pyt...@pearwood.info> added the comment:


I want to revisit this for 3.8.

I agree that the current implementation-dependent behaviour when there are NANs 
in the data is troublesome. But I don't think that there is a single right 
answer.
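
As a rough illustration of that order dependence, here is a small sketch 
(not part of the original report) that takes the median of every ordering of 
the same three values, one of which is a NAN. The sample data is made up, and 
what gets printed depends on how the interpreter's sort treats NANs, so no 
particular output is claimed here.

```python
import itertools
import math
import statistics

nan = float("nan")
data = [1.0, 2.0, nan]  # made-up sample data containing one NAN

# Compute the median of every ordering of the same three values.
# A NAN compares false against everything, so sorted() may leave it
# wherever it started, and the "middle" element can change with order.
results = []
for ordering in itertools.permutations(data):
    results.append((ordering, statistics.median(ordering)))

for ordering, value in results:
    print(ordering, "->", value)
```

If the printed medians are not all the same, the input order alone has 
changed the answer.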

I also agree with Mark that if we change median, we ought to change the other 
functions so that people can get consistent behaviour. It wouldn't be good for 
median to ignore NANs and mean to process them.

I'm inclined to add a parameter to the statistics functions to deal with NANs, 
that allows the caller to select from:

- implementation-dependent, i.e. what happens now;
  (for speed, and backwards compatibility, this would be the default)

- raise an exception;

- return a NAN;

- skip any NANs (treat them as missing values to be ignored).

I think that raise/return/ignore will cover most use-cases for NANs, and the 
default will be suitable for the "easy cases" where there are no NANs, without 
paying any performance penalty if you already know your data has no NANs.
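
To make the four options concrete, here is a minimal sketch written as a 
wrapper around statistics.median. Everything named here is an assumption made 
for illustration: the function name median_with_nan_policy, the parameter name 
nan_policy, and the option strings "default", "raise", "return" and "ignore" 
are hypothetical, not an existing or agreed API.

```python
import statistics


def _is_nan(x):
    # A NAN is the only value that is not equal to itself.
    return x != x


def median_with_nan_policy(data, nan_policy="default"):
    """Hypothetical wrapper; the names and option strings are assumptions."""
    if nan_policy not in ("default", "raise", "return", "ignore"):
        raise ValueError(f"unknown nan_policy: {nan_policy!r}")

    values = list(data)

    if nan_policy == "default":
        # What happens now: no NAN check, so no extra pass over the data.
        return statistics.median(values)

    if nan_policy == "ignore":
        # Treat NANs as missing values and drop them.
        return statistics.median([x for x in values if not _is_nan(x)])

    if any(_is_nan(x) for x in values):
        if nan_policy == "raise":
            raise ValueError("data contains a NAN")
        return float("nan")  # nan_policy == "return"

    return statistics.median(values)
```

For example, median_with_nan_policy([1, float("nan"), 3], "ignore") takes the 
median of [1, 3], while the same call with "raise" raises ValueError. Only the 
three non-default options pay for a scan of the data.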

Thoughts?

I'm especially looking for ideas on what to call the first option.

----------
assignee:  -> steven.daprano
versions: +Python 3.8 -Python 3.6

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue33084>
_______________________________________