Re: [R] Identification of Outliners and Extraction of Samples

Frank Harrell Mon, 09 Aug 2010 15:48:37 -0700



On Mon, 9 Aug 2010, Alexander Eggel wrote:

Hello everybody,

I need to know which samples (S1-S6) contain a value that is bigger than the
median + five standard deviations of the column it is in. This is just an

Why not the 70th percentile plus 6 times the difference in the 85th and 75th percentiles :-)


Frank

P.S.  See

@Article{fin06cal,
  author =  {Finney, David J.},
  title =   {Calibration guidelines challenge outlier practices},
  journal = {The American Statistician},
  year =    2006,
  volume =  60,
  pages =   {309-313},
  annote =  {anticoagulant therapy; bias; causation; ethics; objectivity; outliers; guidelines for treatment of outliers; overview of types of outliers; letter to the editor and reply 61:187 May 2007}
}

example. The command should be applied to a data frame which is a lot bigger
(over 100 columns). Any solutions? Thank you very much for your help!!!

  Samples  A   B   C   E
1      S1  1   2   3   7
2      S2  4  NA   6   6
3      S3  7   8   9  NA
4      S4  4   5  NA   6
5      S5  2   5   6   7
6      S6  2   3   4   5

This loop works fine for a column without NA values. However, it doesn't work
for the other columns. Ideally, I would have a loop that I could apply to all
columns in "one command".

o <- data.frame()
for (i in 1:nrow(s))
{
      dd <- s[i, ]
      if (dd$A >= median(s$A, na.rm=TRUE) + 5 * sd(s$A, na.rm=TRUE))
            o <- rbind(o, dd)
}
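
One way the rule described above (a value bigger than the column median plus five standard deviations) might be applied to every numeric column at once is sketched here. It is only an illustration, not a method proposed in the thread: the helper name flag_outliers and its argument k are hypothetical, the data frame s is rebuilt from the six example rows, and NA values are assumed to count as "not an outlier". It uses a strict "bigger than" comparison, following the wording of the question rather than the >= in the loop.

```r
# Example data rebuilt from the post (columns A, B, C, E as given there)
s <- data.frame(
  Samples = c("S1", "S2", "S3", "S4", "S5", "S6"),
  A = c(1, 4, 7, 4, 2, 2),
  B = c(2, NA, 8, 5, 5, 3),
  C = c(3, 6, 9, NA, 6, 4),
  E = c(7, 6, NA, 6, 7, 5)
)

# Hypothetical helper: TRUE where a value exceeds median + k * sd of its
# own column; NA entries stay NA in the result
flag_outliers <- function(x, k = 5) {
  x > median(x, na.rm = TRUE) + k * sd(x, na.rm = TRUE)
}

# Apply the rule to every numeric column; the result is a logical matrix
num   <- s[sapply(s, is.numeric)]
flags <- sapply(num, flag_outliers)

# Keep the rows (samples) with at least one flagged value; na.rm = TRUE
# makes missing values count as "not flagged" (an assumption)
o <- s[rowSums(flags, na.rm = TRUE) > 0, ]
```

With k = 5 none of the six example rows exceeds the threshold in any column, so o comes back empty here. Because NA comparisons are dropped in rowSums() rather than tested inside if(), the columns with missing values no longer stop the computation.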


______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


