On Mon, 9 Aug 2010, Alexander Eggel wrote:
Hello everybody,
I need to know which samples (S1-S6) contain a value that is bigger than the
median + five standard deviations of the column it is in. This is just an
Why not the 70th percentile plus 6 times the difference between the 85th
and 75th percentiles? :-)
Frank
P.S. See
@Article{fin06cal,
  author  = {Finney, David J.},
  title   = {Calibration guidelines challenge outlier practices},
  journal = {The American Statistician},
  year    = 2006,
  volume  = 60,
  pages   = {309--313},
  annote  = {anticoagulant therapy; bias; causation; ethics; objectivity;
             outliers; guidelines for treatment of outliers; overview of
             types of outliers; letter to the editor and reply 61:187 May 2007}
}
example. The command should be applied to a data frame which is a lot bigger
(over 100 columns). Any solutions? Thank you very much for your help!
s
  Samples A  B  C  E
1      S1 1  2  3  7
2      S2 4 NA  6  6
3      S3 7  8  9 NA
4      S4 4  5 NA  6
5      S5 2  5  6  7
6      S6 2  3  4  5
This loop works fine for a column without NA values. However, it doesn't work
for the other columns. Ideally, I would have a loop that I could apply to all
columns in "one command".
o <- data.frame()
for (i in 1:nrow(s)) {
  dd <- s[i, ]
  if (dd$A >= median(s$A, na.rm = TRUE) + 5 * sd(s$A, na.rm = TRUE))
    o <- rbind(o, dd)
}
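The loop stops on columns with NA because if() cannot evaluate an NA condition. One possible way to apply the same rule to every numeric column at once is sketched below. It is only an illustration of the rule as stated in the question, not an endorsement of that rule as an outlier criterion. The helper name flag_high and its argument k are made up for this example, and the data frame is rebuilt from the printed table on the assumption that the values are plain numerics.

```r
# Example data from the post (column names as posted; there is no column D).
s <- data.frame(
  Samples = c("S1", "S2", "S3", "S4", "S5", "S6"),
  A = c(1, 4, 7, 4, 2, 2),
  B = c(2, NA, 8, 5, 5, 3),
  C = c(3, 6, 9, NA, 6, 4),
  E = c(7, 6, NA, 6, 7, 5)
)

# Hypothetical helper: TRUE where a value is at or above
# median + k * sd of its own column; NA entries count as FALSE.
flag_high <- function(x, k = 5) {
  cutoff <- median(x, na.rm = TRUE) + k * sd(x, na.rm = TRUE)
  !is.na(x) & x >= cutoff
}

num_cols <- sapply(s, is.numeric)        # skip the Samples column
flags <- sapply(s[num_cols], flag_high)  # logical matrix, one row per sample
o <- s[rowSums(flags) > 0, ]             # samples with at least one flagged value
```

With the six example rows no value reaches its column's cutoff, so o comes back empty. Because the helper works column by column, the same three lines at the end should carry over to a data frame with over 100 columns without change.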
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.