On Aug 24, 2010, at 1:06 PM, Mike Williamson wrote:

Hello All,

   Using the standard "summary" function in 'R', I ran across some odd
behavior that I cannot understand.  Easy to reproduce:

Typing:

  summary(c(6,207936))

Yields:

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      6   51990  104000  104000  156000  207900


None of these values are correct except for the minimum. If I perform
"quantile(c(6, 207936))", it gives the correct values.  I originally
presumed that summary was merely calling "quantile" if it saw a numeric, but
this doesn't seem to be the case.

I would have assumed as you did, and I still think so, with "merely" suitably qualified, after reading the code in summary.default:

else if (is.numeric(object)) {
        nas <- is.na(object)
        object <- object[!nas]
        qq <- stats::quantile(object)
        qq <- signif(c(qq[1L:3L], mean(object), qq[4L:5L]), digits)
        names(qq) <- c("Min.", "1st Qu.", "Median", "Mean", "3rd Qu.",
            "Max.")
        if (any(nas))
            c(qq, `NA's` = sum(nas))
        else qq


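Notice the digits argument: the quantiles and the mean are passed through signif() before they are returned, so with four significant digits (what the default works out to under default options) 207936 is reported as 207900. The values are computed correctly and then rounded: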

> summary(c(6,207936))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      6   51990  104000  104000  156000  207900
> quantile(c(6,207936))
      0%      25%      50%      75%     100%
     6.0  51988.5 103971.0 155953.5 207936.0

> summary(c(6,207936), digits=6)
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
     6.0  51988.5 103971.0 103971.0 155954.0 207936.0
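
To see the rounding in isolation, here is a minimal sketch that applies signif() by hand to the output of quantile(), mirroring the excerpt above. The value 4 is my assumption about the effective default for digits, and the variable names are only for illustration. The exact place where summary applies its rounding may differ between R versions, so the sketch calls signif() directly rather than relying on summary().

```r
x <- c(6, 207936)

# Unrounded quartiles, as returned by quantile()
qq <- quantile(x)

# Same combination the excerpt builds: quantiles with the mean inserted
vals <- c(qq[1:3], mean(x), qq[4:5])
names(vals) <- c("Min.", "1st Qu.", "Median", "Mean", "3rd Qu.", "Max.")

# Round to 4 significant digits (assumed default) and to 6
rounded4 <- signif(vals, 4)
rounded6 <- signif(vals, 6)

print(rounded4)
print(rounded6)
```

With four significant digits the maximum loses its last two digits; with six it is left intact.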



 Anyone know what's going on here?  On a related note, what is the
statistically correct answer for calculating the 1st quartile & 3rd quartile
when only 2 values are present? I presume one takes the mid-point between
the median (also calculated) and the min or max. So in this case, 51988.5
for 1st & 155953.5 for 3rd (which is what quantile calculates). But taking
25% & 75% of the sum of the 2 also seems "reasonable".  Either way,
"summary" is calculating the wrong number, and most disturbing is that it
mis-calculates the max.
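
On the two-value question, a minimal sketch, assuming quantile()'s default definition (type 7, linear interpolation between order statistics), reproduces 51988.5 and 155953.5 by hand and sets them beside 25% and 75% of the sum. The variable names are mine, for illustration only.

```r
x <- sort(c(6, 207936))
n <- length(x)
p <- c(0.25, 0.75)

# Type 7 position: h = (n - 1) * p + 1, then interpolate between neighbours
h  <- (n - 1) * p + 1
lo <- floor(h)
hi <- pmin(lo + 1, n)
by_hand <- x[lo] + (h - lo) * (x[hi] - x[lo])

# What quantile() returns by default
by_quantile <- unname(quantile(x, probs = p))

# The alternative raised in the question: 25% and 75% of the sum
of_sum <- p * sum(x)

print(by_hand)
print(by_quantile)
print(of_sum)

# quantile() supports nine definitions; they need not agree on tiny samples
print(sapply(1:9, function(t) unname(quantile(x, 0.25, type = t))))
```

One difference between the two candidates: adding a constant c to both values shifts the interpolated quartiles by c, but shifts the sum-based values by c/2 and 3c/2, so the latter depend on where zero happens to sit.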

                                           Regards,


David Winsemius, MD
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.