Let's assume the following space A = {1,2,3} [I do not know whether it is
called a "space" in English, probably NOT].
Let's take the array 1,1,2,3,3,3.
So is the median (2+3)/2 = 2.5? BUT the space A does NOT contain the value
2.5. The median is either 2 or 3. In this case, it is more appropriate to
talk about 2 medians, 2 and 3! This list does NOT have a middle value
of 2.5, because that value does NOT exist in A.
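To make the lower/upper median distinction concrete, here is a minimal
sketch. It assumes Python and its standard statistics module, neither of
which is part of the OpenFormula discussion; it only illustrates the
three possible answers for the array above.

```python
from statistics import median, median_low, median_high

# The example array; every element belongs to A = {1, 2, 3}.
data = [1, 1, 2, 3, 3, 3]

# Conventional median: averages the two middle elements (2 and 3).
print(median(data))       # 2.5, which is not an element of A

# Lower and upper medians: always actual elements of the list.
print(median_low(data))   # 2
print(median_high(data))  # 3
```

Only the last two results are values that actually occur in the list.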
Weighted medians: actually, most real-world applications that make use of
the median in signal processing use the weighted median. (Just google for
"weighted median".) So again, just because Excel took the easy route, I do
NOT believe that a standard should follow the same path.
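As a rough illustration of what a weighted median can look like, here is
a sketch under one common convention: the smallest value whose cumulative
weight reaches at least half of the total weight. The function name
weighted_median is hypothetical, the convention is an assumption (other
definitions exist), and this is not claimed to be what any spreadsheet
implements. It sorts only for brevity.

```python
def weighted_median(values, weights):
    """Lower weighted median (assumed convention): the smallest value
    whose cumulative weight is at least half of the total weight."""
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and equally long")
    half = sum(weights) / 2
    running = 0
    # Walk the values in increasing order, accumulating their weights.
    for value, weight in sorted(zip(values, weights)):
        running += weight
        if running >= half:
            return value

# Weights 2, 1, 3 encode the array 1,1,2,3,3,3 from above.
print(weighted_median([1, 2, 3], [2, 1, 3]))
```

With these weights the result is always one of the supplied values, so
it never falls outside A.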
Sincerely Yours,
Leonard Mada
Andreas J. Guelzow wrote:
> On Sat, 2007-10-02 at 02:41 +0200, Leonard Mada wrote:
>
>> John Machin wrote:
>>
>>> ...
>>> So who cares? The median value is 1. Is your alternative going to
>>> return some value other than 1 ????
>>>
>> Please define mathematically the middle value! It is NOT trivial as my
>> definitions showed. Anything else would be ambiguous. This should be a
>> standard, so make a better definition.
>>
>
> Contrary to your claims, there is nothing ambiguous. Any non-decreasing
> list of the same values has the same middle value(s).
>
>
>> Well, I could have used a much shorter definition: the median is the
>> value that halves the list so that there are two sets of equal size with
>> numbers in the first set being higher than the median and numbers in the
>> second set being lower. As noted, this definition avoids the sorting,
>> too. (One could extend this definition for even and odd number of
>> elements. Or even a much shorter definition: the 50th percentile. BUT
>> all these definitions are ambiguous, see later.)
>>
>> The one thing that I do NOT agree with at all in the OASIS definition is
>> that it includes the word "sorting". Sorting is definitely NOT
>> necessary to calculate the median. You can take any array, even one that
>> is NOT sorted, and determine the median without first sorting it. This
>> is much too often stated wrongly in so many textbooks, BUT sorting is
>> really not necessary.
>>
>
> The OpenFormula standard does not prescribe any method used to find the
> value. It only prescribes what the value is.
>
>
>> So, this is NOT a prerequisite that should enter a standard definition.
>>
>> May I even point out, that for even number of elements, one may
>> define/have an upper median and a lower median. Alternatively, in
>> serious mathematical uses, the median is usually calculated using a
>> weighted approach. Therefore, the median of 1,2,2,3,4,5 is NOT (2+3)/2 =
>> 2.5, BUT rather (2+2+3)/3 = 2.33. So, it does make sense to have a very
>> strong and unambiguous definition in a standard.
>>
>
>
>
>> The *weighted median* may be introduced later into the standard and then
>> the ambiguity would be complete.
>>
>
> MEDIAN is not intended to implement a weighted median. None of the
> current spreadsheet implementations uses that name for a weighted
> median.
>
> Gnumeric, for example, also provides a function for a weighted median,
> namely SSMEDIAN. That function may at some time also be introduced in
> the Standard but would in no way make other definitions ambiguous.
>
> Andreas
>
_______________________________________________
gnumeric-list mailing list
[email protected]
http://mail.gnome.org/mailman/listinfo/gnumeric-list