On 17/03/2018 00:16, Steven D'Aprano wrote:
> The bug tracker currently has a discussion of a bug in the median(),
> median_low() and median_high() functions that they wrongly compute the
> medians in the face of NANs in the data:
>
> https://bugs.python.org/issue33084
>
> I would like to ask people how they would prefer to handle this issue:

TL;DR: I choose (5).

I agree with Terry Reedy's proposal for (5). However, I want to define precisely what we mean by "ignore". In my opinion, "ignoring" should be more like "stripping".

If NANs are simply left in the data and the number of data points is odd, we can return a NAN without any concerns. But if the number of data points is even and at least one of the two middle values is a NAN, we're probably going to have an exception raised. So, to not over-complicate things, I think we should go with this meaning for "ignore": removing NANs before the actual processing of the data points.

With that, we would have two possible values for the keyword argument "nan": 'strip' (which does what I just described) and 'raise' (which raises an exception if there is a NAN in the data points).

We should still consider adding an 'ignore' option at a later time. This option would blindly ignore NAN values. If an exception is encountered during the actual processing (let's say we have an even number of data points and a NAN as one of the two middle values), it is propagated to the caller.

From my point of view, I prefer (5), with a default of 'strip'. Your argument that (1) is the fastest (in terms of running time, I believe; tell me if I'm wrong) can be addressed with the 'ignore' option. Going with (1) would force Python developers to write implementation-specific code (or rather, "implementation-defined-prone" code), whereas (5) goes easy on Python-side code. Options (2) to (4) force Python developers to adopt one behavior. That's not necessarily a bad thing, but since (5) allows flexibility at no cost, I don't see why we shouldn't go with it.

-- 
Léo El Amri
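
P.S. To make the 'strip' / 'raise' distinction concrete, here is a minimal sketch of how I picture it. The wrapper name median_with_nan_policy is made up for illustration, the "nan" keyword is only the proposal under discussion (it is not an existing parameter of statistics.median), and I'm assuming plain float NANs only (no Decimal handling).

```python
import math
import statistics


def median_with_nan_policy(data, nan="strip"):
    """Hypothetical wrapper around statistics.median.

    The `nan` keyword is the proposed option, not an existing stdlib API.
    Only float NaNs are detected in this sketch.
    """
    if nan not in ("strip", "raise"):
        raise ValueError(f"unknown nan policy: {nan!r}")

    values = list(data)
    # Flag each float NaN; non-float values are never treated as NaN here.
    is_nan = [isinstance(v, float) and math.isnan(v) for v in values]

    if nan == "raise" and any(is_nan):
        raise ValueError("NaN found in data")

    # 'strip': remove NaNs before the actual processing of the data points.
    cleaned = [v for v, bad in zip(values, is_nan) if not bad]
    return statistics.median(cleaned)
```

With 'strip', a call such as median_with_nan_policy([1.0, float("nan"), 3.0, 2.0]) computes the median of the three remaining values; with nan="raise" the same call raises ValueError. A later 'ignore' value would presumably just hand the data to statistics.median() untouched.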