On 2010-08-24 11:06, Mike Williamson wrote:
Hello All,
     Using the standard "summary" function in 'R', I ran across some odd
behavior that I cannot understand.  Easy to reproduce:

Typing:

    summary(c(6,207936))

Yields:

    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
       6   51990  104000  104000  156000  207900


     None of these values are correct except for the minimum.  If I perform
"quantile(c(6, 207936))", it gives the correct values.  I originally
presumed that summary was merely calling "quantile" if it saw a numeric, but
this doesn't seem to be the case.
     Anyone know what's going on here?  On a related note, what is the
statistically correct answer for calculating the 1st quartile & 3rd quartile
when only 2 values are present?  I presume one takes the mid-point between
the median (also calculated) and the min or max.  So in this case, 51988.5
for 1st & 155953.5 for 3rd (which is what quantile calculates).  But taking
25% & 75% of the sum of the 2 also seems "reasonable".  Either way,
"summary" is calculating the wrong number, and most disturbing is that it
mis-calculates the max.

                                             Regards,
                                                     Mike
This is one of those (many) situations where reading the help pages
really helps:

help(summary) points you to the 'digits' argument (as David has said)
and that probably defaults to 'digits=4' for you, so the values are
only rounded to 4 significant digits for display. So, no, R is not
miscalculating anything.
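
A minimal sketch of that rounding, assuming summary() is showing four
significant digits as described above. The variable names are only
illustrative, and the exact printing of summary() may differ between R
versions, so the check below uses signif() directly:

```r
x <- c(6, 207936)

# Unrounded values, as computed by quantile() (default type = 7)
q <- quantile(x)
print(q)

# Rounding to 4 significant digits reproduces the numbers quoted in
# the question (for two values the mean equals the median)
rounded <- signif(c(q, Mean = mean(x)), 4)
print(rounded)

# Requesting more digits from summary() should reveal more of the
# unrounded values (exact printing varies by R version)
print(summary(x, digits = 10))
```

So 207936 becomes 207900 and 51988.5 becomes 51990 purely through
rounding; the underlying values are the same ones quantile() reports.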

help(quantile) shows that there are quite a few ways to define
quantiles and that R defaults to 'type=7'.
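
To see how much the choice of definition matters for a two-point
sample, one could loop over the nine 'type' values documented in
help(quantile). This is only an illustrative sketch; the variable
names are not from the original thread:

```r
x <- c(6, 207936)

# Quartiles under each of the nine quantile definitions
for (type in 1:9) {
  q <- quantile(x, probs = c(0.25, 0.75), type = type)
  cat(sprintf("type %d: Q1 = %.1f, Q3 = %.1f\n", type, q[1], q[2]))
}

# Omitting 'type' gives the same result as type = 7
default_q <- quantile(x, probs = c(0.25, 0.75))
type7_q <- quantile(x, probs = c(0.25, 0.75), type = 7)
print(identical(default_q, type7_q))
```

With type 7 the quartiles are linear interpolations between the two
data points, which is why they land midway between the median and the
min or max, as the question describes.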

  -Peter Ehlers

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
