David,
try this:
fivenum(1:101)
quantile(1:101, c(1,3)/4, type=5)
-Peter
On 2010-05-13 8:55, David Winsemius wrote:
On May 13, 2010, at 10:25 AM, Robert Baer wrote:
Hi Peter,
You're absolutely correct! The description of 'range' in the 'boxplot'
help file is a little confusing in its use of the words "interquartile
range". I think it should be changed to the "length of the box" to be
exact and consistent with the wording in the help file for "boxplot.stats".
The issue is probably that there are multiple ways (9 to be exact) of
defining quantiles in R. See the 'type=' argument in ?quantile. The
quantile function uses type=7 by default which matches the quantile
definition used by S-Plus(?), but differs from that used by SPSS.
Doesn't fivenum essentially use the equivalent of a different 'type='
argument (maybe 2 or 5) in constructing the hinges?
It seems perfectly reasonable to talk about 'length of box' (or 'box
height' depending how you display the boxplot), but aren't the hinges
simply Q1 and Q3 defined by one of the possible quartile definitions
(as Peter points out, the one used by fivenum)? The box height does not
necessarily match the distance produced by IQR(), which also seems to
use the equivalent of quantile(..., type=7), but it is still an IQR,
is it not?
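To make the comparison concrete, here is a minimal sketch that puts the box height next to IQR(). The simulated y is only an illustrative assumption, and the variable names are made up for this example. It relies on boxplot.stats() taking its hinges from fivenum(), and on IQR() using the default quantile type.

```r
# Illustrative data only (an assumption, not anything special about the problem)
set.seed(123)
y <- rexp(300, 0.02)

# Box height: distance between the lower and upper hinges from fivenum()
box_height <- diff(fivenum(y)[c(2, 4)])

# The same distance taken from what boxplot() actually draws
box_stats <- diff(boxplot.stats(y)$stats[c(2, 4)])

# IQR() with its defaults, and the explicit type=7 quartile distance
iqr_default <- IQR(y)
iqr_type7 <- diff(quantile(y, c(1, 3) / 4, type = 7, names = FALSE))

print(c(box_height = box_height, box_stats = box_stats,
        IQR = iqr_default, type7 = iqr_type7))
```

The first two numbers should agree with each other, as should the last two; the two pairs need not agree with one another.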
Quantiles apparently can be defined in more than one "acceptable" way
(sort of like dealing with ties in rank statistics). The OP seemed to
want an "exact" explanation of the whiskers, and I think Peter has
pointed us at the definition of quartiles used by fivenum, as opposed
to the default used with quantile(..., type=7).
Yes, and experimentation leads me to the conclusion that the only
possible candidate for matching up the results of fivenum(y)[c(2,4)] with
quantile(y, c(1,3)/4, type=i) is type=5. I'm not able to prove that
to myself from mathematical arguments, since I do not quite understand
the formalism in the quantile help page. If the match is not exact, this
would be a tenth definition of IQR.
> set.seed(123)
> y <- rexp(300, .02)
> fivenum(y)
[1] 0.2183685 15.8740466 42.1147820 74.0362517 360.5503788
> for (i in 4:9) {print(quantile(y, c(1,3)/4, type=i) ) }
25% 75%
15.82506 73.93080
25% 75%
15.87405 74.03625
25% 75%
15.84955 74.08898
25% 75%
15.89854 73.98352
25% 75%
15.86588 74.05383
25% 75%
15.86792 74.04943
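The experiment above uses a single sample size (n = 300), and Peter's 1:101 example suggests the agreement with type=5 may depend on n. A rough sketch for checking this on the simple sequences 1:n follows. The helper names hinges() and quartiles() and the four sample sizes are assumptions chosen for illustration, not part of any package.

```r
# Hinges as computed by fivenum(): elements 2 and 4 of the five-number summary
hinges <- function(x) fivenum(x)[c(2, 4)]

# Quartiles from quantile() for a given type, names dropped for comparison
quartiles <- function(x, type) {
  quantile(x, c(1, 3) / 4, type = type, names = FALSE)
}

# One sample size from each residue class modulo 4 (illustrative choice)
sizes <- c(100, 101, 102, 103)

# For each n, record which quantile types reproduce the hinges of 1:n
matches <- sapply(sizes, function(n) {
  x <- as.numeric(seq_len(n))
  sapply(1:9, function(i) isTRUE(all.equal(hinges(x), quartiles(x, i))))
})
dimnames(matches) <- list(type = 1:9, n = sizes)

print(matches)
```

Each column of the printed matrix shows, for one n, which types agree with fivenum on that sequence. A type that is TRUE in every column would be a candidate for an exact match; this only tests 1:n, so it can rule a type out but not prove equivalence.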
--
Peter Ehlers
University of Calgary