Let's assume the following space A = {1,2,3} [I do not know if it is called a
space in English, probably NOT.]

Let's take the array 1,1,2,3,3,3

So the median is (2+3)/2 = 2.5? BUT the space A does NOT contain the value
2.5. The median is either 2 or 3. In this case, it is more appropriate to
talk about 2 medians, 2 and 3! And this list DOES NOT have a middle value
of 2.5, because this value DOES NOT exist in A.
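
To make the distinction concrete, here is a small sketch in Python (my choice of language for illustration only; nothing in the standard or in any spreadsheet prescribes it). It uses the standard `statistics` module, which offers the averaged median as well as the lower and the upper median:

```python
import statistics

data = [1, 1, 2, 3, 3, 3]

# Averaged median: mean of the two middle values; it need not be an element of A.
averaged = statistics.median(data)

# Lower and upper medians: always values that actually occur in the data.
lower = statistics.median_low(data)
upper = statistics.median_high(data)

print(averaged, lower, upper)  # 2.5 2 3
```

Only the lower and upper medians are guaranteed to be members of A.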

Weighted medians: actually, most real-world applications of the median,
for example in signal processing, use the weighted median. (Just google for
"weighted median".) So again, just because Excel took the easy way out, I do
NOT believe that a standard should follow the same path.
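
For reference, one common definition of the weighted median (this is an assumption on my part, there are several variants) is the smallest value at which the cumulative weight reaches half of the total weight. A minimal Python sketch, where the function name `weighted_median` and the sample weights are purely illustrative:

```python
def weighted_median(values, weights):
    """Lower weighted median: the smallest value whose cumulative weight
    reaches at least half of the total weight."""
    if len(values) != len(weights) or not values:
        raise ValueError("values and weights must be non-empty and of equal length")
    pairs = sorted(zip(values, weights))  # sorted here only for simplicity
    half = sum(weights) / 2
    cumulative = 0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= half:
            return value


# With equal weights this is the lower median of the list.
print(weighted_median([1, 1, 2, 3, 3, 3], [1, 1, 1, 1, 1, 1]))  # 2
# The distinct values 1, 2, 3 with weights 2, 1, 3 describe the same list.
print(weighted_median([1, 2, 3], [2, 1, 3]))  # 2
```

Other variants return the upper weighted median or interpolate between the two, so a standard would have to say which one it means.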

Sincerely Yours,

Leonard Mada

Andreas J. Guelzow wrote:
> On Sat, 2007-10-02 at 02:41 +0200, Leonard Mada wrote:
>   
>> John Machin wrote:
>>     
>>> ...
>>> So who cares? The median value is 1. Is your alternative going to 
>>> return some value other than 1 ????
>>>       
>> Please define mathematically the middle value! It is NOT trivial as my 
>> definitions showed. Anything else would be ambiguous. This should  be a 
>> standard, so make a better definition.
>>     
>
> Contrary to your claims, there is nothing ambiguous. Any non-decreasing
> list of the same values has the same middle value(s). 
>
>   
>> Well, I could have used a much shorter definition: the median is the 
>> value that halves the list so that there are two sets of equal size with 
>> numbers in the first set being higher than the median and numbers in the 
>> second set being lower. As noted, this definition avoids the sorting, 
>> too. (One could extend this definition for even and odd number of 
>> elements. Or even a much shorter definition: the 50th percentile. BUT 
>> all these definitions are ambiguous, see later.)
>>
>> The one thing that I do NOT agree at all with the OASIS definition is, 
>> that it includes the wording "sorting". Sorting is definitely NOT 
>> necessary to calculate the median. You can take any array, even one that 
>> is NOT sorted, and determine the median without first sorting it. This 
>> is much too often stated wrongly in so many textbooks, BUT sorting is 
>> really not necessary.
>>     
>
> The OpenFormula standard does not prescribe any method used to find the
> value. It only prescribes what the value is.
>
>   
>> So, this is NOT a prerequisite that should enter a standard definition.
>>
>> May I even point out, that for even number of elements, one may 
>> define/have an upper median and a lower median. Alternatively, in 
>> serious mathematical uses, the median is usually calculated using a 
>> weighted approach. Therefore, the median of 1,2,2,3,4,5 is NOT (2+3)/2 = 
>> 2.5, BUT rather (2+2+3)/3 = 2.33. So, it does make sense to have a very 
>> strong and unambiguous definition in a standard.
>>     
>
>
>   
>> The *weighted median* may be introduced later into the standard and then 
>> the ambiguity would be complete. 
>>     
>
> MEDIAN is not intended to implement a weighted median. None of the
> current spreadsheet implementations uses that name for a weighted
> median. 
>
> Gnumeric, for example, also provides a function for a weighted median,
> namely SSMEDIAN. That function may at some time also be introduced in
> the Standard but would in no way make the other definition ambiguous.
>
> Andreas
>   
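
As an aside on the quoted point that sorting is not required to find the median: a selection algorithm such as quickselect finds the k-th smallest element without sorting the whole list. The following Python sketch is only an illustration of that idea; the names `select` and `median_pair` are made up for this example and do not come from any spreadsheet or from the OpenFormula text.

```python
import random


def select(items, k):
    """Return the k-th smallest element (0-based) without fully sorting."""
    items = list(items)
    while True:
        pivot = random.choice(items)
        lower = [x for x in items if x < pivot]
        equal = [x for x in items if x == pivot]
        if k < len(lower):
            items = lower                      # answer lies below the pivot
        elif k < len(lower) + len(equal):
            return pivot                       # the pivot is the answer
        else:
            k -= len(lower) + len(equal)       # answer lies above the pivot
            items = [x for x in items if x > pivot]


def median_pair(items):
    """Return (lower median, upper median); they coincide for odd length."""
    n = len(items)
    if n == 0:
        raise ValueError("no median for empty data")
    return select(items, (n - 1) // 2), select(items, n // 2)


print(median_pair([3, 1, 3, 2, 3, 1]))  # (2, 3)
```

The result is the same pair of middle values that sorting would give, which fits the reply that the standard prescribes the value and not the method.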
