Hi group,

I searched the R groups before posting this question, but I could not find an appropriate answer, and I do not have a clear understanding of how to do this in R.
I have a data frame with duplicated row identifiers but with different values across columns. For each duplicated identifier, I want to keep the row with the higher inter-quartile range or mean.

id <- c("A", "A", "C", "D", "E", "F")
year <- c(2000, 2001, 2001, 2002, 2003, 2004)
samp1 <- c(100, 120, 101, 110, 132, 123)
samp2 <- c(110, 130, 131, 150, 122, 143)
samp2a <- c(110, 150, 151, 130, 122, 143)
mdf <- data.frame(id, samp1, samp2, samp2a)

> mdf
  id samp1 samp2 samp2a
1  A   100   110    110
2  A   120   130    150
3  C   101   131    151
4  D   110   150    130
5  E   132   122    122
6  F   123   143    143

There are two rows with id A in this data frame. I want to select the one with the higher mean. How can I do this?

Thanks,
Adrian

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
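
One possible approach, as a minimal base-R sketch rather than a definitive answer: compute a score for each row, sort by that score in descending order, and keep the first row seen for each id. The data are re-entered from the printed data frame above. The names value_cols, score, ord and best are only illustrative, and the sketch assumes the score should be taken across the three samp columns.

```r
# Example data, re-entered from the printed data frame in the question
mdf <- data.frame(
  id     = c("A", "A", "C", "D", "E", "F"),
  samp1  = c(100, 120, 101, 110, 132, 123),
  samp2  = c(110, 130, 131, 150, 122, 143),
  samp2a = c(110, 150, 151, 130, 122, 143),
  stringsAsFactors = FALSE
)

# Columns that go into the per-row score (an assumption about the intent)
value_cols <- c("samp1", "samp2", "samp2a")

# Per-row mean; to rank by inter-quartile range instead, use
# score <- apply(mdf[value_cols], 1, IQR)
score <- rowMeans(mdf[value_cols])

# Sort rows by descending score, then keep the first row seen for each id
ord  <- order(score, decreasing = TRUE)
best <- mdf[ord, ][!duplicated(mdf$id[ord]), ]

# Put the surviving rows back into their original order
best <- best[order(as.integer(rownames(best))), ]
print(best)
```

With the data above, the second A row is kept, because its row mean (about 133.3) is higher than that of the first A row (about 106.7). Rows with unique ids pass through unchanged.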