On Sat, 2011-04-09 at 11:34 +1000, Steven D'Aprano wrote:
> Hello,
>
> I'm trying to find out information about the SSMEDIAN statistical
> function in Gnumeric, and how it differs from MEDIAN.
>
> I have read the description of the function:
>
> "median for grouped data as commonly determined in the social sciences"
>
> http://projects.gnome.org/gnumeric/doc/gnumeric-SSMEDIAN.shtml
>
> and spent much time googling for more information, but for something
> described as "commonly", I haven't been able to find any information
> about it. Neither Wikipedia nor Mathworld has any reference that I can
> find to the social sciences using a different algorithm for finding the
> median of grouped data.
>
> The example given in gnumeric's function wizard shows ssmedium(7, 8, 8)
> with the default interval of 1 as returning 7.75. How should I interpret
> this? This is my guess:
>
> If the interval is 1, then the data point 7 actually means some value
> between 6.5 and 7.5;
> Likewise the data point 8 means 7.5 through 8.5.
>
> So we have grouped data:
>
> Group        Frequency
> 6.5 - 7.5    1
> 7.5 - 8.5    2
>
>
> If I apply the formula for grouped median found here:
>
> http://www.duncanwil.co.uk/average4.html
>
> median = L + i*(n/2 - CF)/f
>
> where:
>
> L = the lower limit of the class containing the median
> i = the width of the class containing the median
> n = the total number of frequencies
> CF = the cumulative number of frequencies in the classes preceding the
> class containing the median
> f = the frequency of the median class
>
> I get:
>
> median = 7.5 + 1*(3/2 - 1)/2 = 7.75
>
> which matches. But of course, this could just be a coincidence. Can
> somebody please:
>
> * confirm that the formula above is that used by SSMEDIAN?
>
> * if not, what does SSMEDIAN actually do?
>
> * point me at a more authoritative source for the formula given?

Since Gnumeric is open source, the easiest way to check this is to read the source. The source contains an obvious typo, but it only affects the efficiency of the calculation for an even number of observations whose two central values are not equal; it does not affect the result.

SSMEDIAN is supposed to calculate the median as described in "Using Basic Statistics in the Social Sciences" by Annabel Ness Evans. This is in fact the formula you gave above:

median = L + i*(n/2 - CF)/f

with the meaning of the variables as described above.

Andreas
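
For anyone who wants to check the arithmetic by hand, here is a minimal Python sketch of the grouped-median formula quoted above. It is not Gnumeric's source code: the function name grouped_median and its interval parameter are made up for illustration, and it assumes that every data value is the midpoint of a class of width interval, as in the 6.5 - 7.5 / 7.5 - 8.5 reading of the example.

```python
from collections import Counter


def grouped_median(values, interval=1.0):
    """Grouped median: L + i*(n/2 - CF)/f.

    Hypothetical helper, not Gnumeric's implementation. Each value is
    assumed to be the midpoint of a class of width `interval`.
    """
    if not values:
        raise ValueError("grouped_median requires at least one value")

    counts = Counter(values)   # frequency of each class midpoint
    n = len(values)            # total number of observations
    half = n / 2
    cf = 0                     # cumulative frequency before the median class

    for midpoint in sorted(counts):
        f = counts[midpoint]   # frequency of the current class
        if cf + f >= half:
            # This class contains the median; L is its lower limit.
            lower = midpoint - interval / 2
            return lower + interval * (half - cf) / f
        cf += f


print(grouped_median([7, 8, 8]))  # 7.75, matching the function wizard example
```

With an even number of observations and two different central values, for example grouped_median([1, 2, 3, 4]), the sketch returns 2.5, the same as the ordinary median.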
