Is this what you want?

> mdf <- read.table(text = " id samp1 samp2 samp2a
+ 1 A 100 110 110
+ 2 A 120 130 150
+ 3 C 101 131 151
+ 4 D 110 150 130
+ 5 E 132 122 122
+ 6 F 123 143 143", header = TRUE)
> result <- do.call(rbind, lapply(split(mdf, mdf$id), function(.id){
+     maxIndx <- which.max(rowMeans(.id[, -1L]))
+     .id[maxIndx, ]
+ }))
>
> result
  id samp1 samp2 samp2a
A  A   120   130    150
C  C   101   131    151
D  D   110   150    130
E  E   132   122    122
F  F   123   143    143
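The question also mentions the inter-quartile range as a possible criterion. As a rough sketch only, the same split/lapply/rbind approach can take the scoring function as an argument. The helper name `pick_row` and its `score` parameter are made up for illustration and are not part of the thread; the data frame is rebuilt from the values printed above so the snippet runs on its own.

```r
# Same data as printed in the reply above.
mdf <- data.frame(
  id     = c("A", "A", "C", "D", "E", "F"),
  samp1  = c(100, 120, 101, 110, 132, 123),
  samp2  = c(110, 130, 131, 150, 122, 143),
  samp2a = c(110, 150, 151, 130, 122, 143)
)

# Hypothetical helper (an assumption, not from the thread): for each id,
# keep the row whose sample columns give the largest value of `score`.
pick_row <- function(df, score = mean) {
  picked <- lapply(split(df, df$id), function(grp) {
    # Score every row of the group using all columns except the id column.
    scores <- apply(grp[, -1L, drop = FALSE], 1L, score)
    # which.max() returns the first maximum, so ties keep the earlier row.
    grp[which.max(scores), , drop = FALSE]
  })
  do.call(rbind, picked)
}

by_mean <- pick_row(mdf)        # row mean, the criterion used above
by_iqr  <- pick_row(mdf, IQR)   # inter-quartile range instead
```

With only three sample columns the IQR is a fairly coarse measure, and it uses R's default quantile type, so treat the IQR variant as a starting point rather than a recommendation.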
On Sun, Nov 4, 2012 at 2:25 PM, Adrian Johnson <oriolebaltim...@gmail.com> wrote:
> Hi Group:
> I searched the R groups before posting this question. I could not find an
> appropriate answer, and I do not have a clear understanding of how to do
> this in R.
>
> I have a data frame with duplicated row identifiers but with different
> values across columns. For each identifier, I want to select the row with
> the higher inter-quartile range or mean.
>
>
> id <- c("A", "A", "C", "D", "E", "F")
> year <- c(2000, 2001, 2001, 2002, 2003, 2004)
> samp1 <- c(100, 120, 101, 110, 132, 123)
> samp2 <- c(110, 130, 131, 150, 122, 143)
> samp2a <- c(110, 150, 151, 130, 122, 143)
> mdf <- data.frame(id, samp1, samp2, samp2a)
>
>
>> mdf
>   id samp1 samp2 samp2a
> 1  A   100   110    110
> 2  A   120   130    150
> 3  C   101   131    151
> 4  D   110   150    130
> 5  E   132   122    122
> 6  F   123   143    143
>
>
> There are two A ids in this df. I want to select the row with the higher mean.
>
> How can I do this?
> Thanks
> Adrian
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.