Hello:
I found the "hubers" function in MASS library is NOT working on the following
data:
> a <-
>c(7.19,7.19,7.19,9.41,6.79,9.41,7.19,9.41,1.64,7.19,7.19,7.19,7.19,1.422,7.19,6.79,7.19,6.79,7.19,7.19,4.44,6.55,6.79,7.19,9.41,9.41,7.19,7.19,7.19,7.19,1.64,1.597,1.64,7.19,1.422,7.19,6.79,9.38,7.19,1.64,7.19,7.19,7.19,7.19,7.19,1.64,7.19,6.79,6.79,1.649,1.64,7.19,1.597,1.64,6.55,7.19,7.19,1.64,7.19,7.19,1.407,1.672,1.672,7.19,6.55,7.19,7.19,9.41,1.407,7.19,7.19,9.41,7.19,9.41,7.19,7.19,7.19,7.19,7.19,7.19,7.19,7.19,7.19,9.41,7.19,6.79,7.19,6.79,1.64,1.422,7.19,7.19,1.67,1.64,1.64,1.64,1.64,1.787,7.19,7.19,7.19,6.79,7.19,7.19,7.19,7.19,7.19,7.19,7.19,7.19,7.19,1.64,1.64,1.64,1.422,9.41,1.64,7.19,7.19,7.19,7.19,7.19,7.19,7.19,6.79,6.79,7.19,6.79,7.19,7.19,1.407,7.19,4.42,9,1.64,1.64,6.79,1.664,1.664)
>
> library(MASS)
> hubers(a)
## NO response!
I think it is due to the infinite loop caused by the following line in the code
of "hubers" (around Line 30):
if ((abs(mu0 - mu1) < tol * s0) &&
abs(s0 - s1) < tol * s0) break
where "s0" evaluates to ZERO initially (due to more than 50% of the number
7.19).
I propose to change the "<" sign to "<=":
if ((abs(mu0 - mu1) <= tol * s0) &&
abs(s0 - s1) <= tol * s0) break
which will break the infinite loop. However, the new result is:
> hubers(a)
$mu
[1] 7.19
$s
[1] 0
which gives 0 standard deviation. Actually the data does vary and it is not
true all values other than 7.19 are outliers. Take a look at:
> plot(a)
I think this is because we allow initial SD to equal to zero instead of a
POSITIVE value. See Line 15 of the "hubers" function:
if (missing(s))
s0 <- mad(y)
I propose setting "s0" to "mad(y)" or a small positive number, whichever is
larger. For example:
if (missing(s))
s0 <- max(mad(y), tol)
where tol=1e-6. With this change, the result is more sensible:
> hubers(a)
$mu
[1] 5.88
$s
[1] 2.937
Could anyone take a look at this and decide if the above modifications to the
"hubers" function are warranted?Thanks a lot!
Sincerely,
Feiming Chen
Read more >> Options >>
[[alternative HTML version deleted]]
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.