I hit the send button before adding the timings for the case with many columns and few rows. In that case f3 goes from the fastest to the slowest. There is rarely a single most efficient function for all datasets.

> x <- t(x)
> benchmark(r1 <- f1(x), r2 <- f2(x), r3 <- f3(x), r4 <- f4(x),
+     replications=5, columns=c("test","replications","elapsed"), order="elapsed")
         test replications elapsed
4 r4 <- f4(x)            5    0.19
2 r2 <- f2(x)            5    0.24
1 r1 <- f1(x)            5    0.79
3 r3 <- f3(x)            5    3.75
> identical(r1,r2) && identical(r1, r3) && identical(r1, r4)
[1] TRUE
> dim(x)
[1]      6 100000

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-boun...@r-project.org
> [mailto:r-help-boun...@r-project.org] On Behalf Of William Dunlap
> Sent: Tuesday, April 26, 2011 9:11 AM
> To: peter dalgaard; David Winsemius
> Cc: r-help@r-project.org
> Subject: Re: [R] Second largest element from each matrix row
>
> A different approach is to use order() to sort
> first by row number and then break the ties by
> value. It is quick when there are lots of short
> rows.
>
> > f1 <- function (x)
> +     apply(x, 1, function(row) sort(row, decreasing = TRUE)[2])
> > f2 <- function (x)
> +     -apply(-x, 1, function(row) sort.int(row, partial = 2)[2])
> > f3 <- function (x)
> + {
> +     # order by row number then by value
> +     y <- t(x)
> +     array(y[order(col(y), y)], dim(y))[nrow(y) - 1, ]
> + }
> > f4 <- function (x)
> +     apply(x, 1, function(row) max(row[-which.max(row)]))
> > x <- matrix(runif(1e5*6), nrow=1e5)
> > library(rbenchmark)
> > benchmark(r1 <- f1(x), r2 <- f2(x), r3 <- f3(x), r4 <- f4(x),
> +     replications=5, columns=c("test","replications","elapsed"), order="elapsed")
>          test replications elapsed
> 3 r3 <- f3(x)            5    1.08
> 4 r4 <- f4(x)            5   12.59
> 2 r2 <- f2(x)            5   23.19
> 1 r1 <- f1(x)            5   59.54
> > identical(r1,r2) && identical(r1, r3) && identical(r1, r4)
> [1] TRUE
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
> > -----Original Message-----
> > From: r-help-boun...@r-project.org
> > [mailto:r-help-boun...@r-project.org] On Behalf Of peter dalgaard
> > Sent: Tuesday, April 26, 2011 8:13 AM
> > To: David Winsemius
> > Cc: r-help@r-project.org
> > Subject: Re: [R] Second largest element from each matrix row
> >
> >
> > On Apr 26, 2011, at 14:36 , David Winsemius wrote:
> >
> > >
> > > On Apr 26, 2011, at 8:01 AM, Lars Bishop wrote:
> > >
> > >> Hi,
> > >>
> > >> I need to extract the second largest element from each row of a
> > >> matrix. Below is my solution, but I think there should be a more
> > >> efficient way to accomplish the same, or not?
> > >>
> > >>
> > >> set.seed(1)
> > >> a <- matrix(rnorm(9), 3 ,3)
> > >> sec.large <- as.vector(apply(a, 1, order, decreasing=T)[2,])
> > >> ans <- sapply(1:length(sec.large), function(i) a[i, sec.large[i]])
> > >> ans
> > >
> > > There are probably many but this one is reasonably compact,
> > > one-step, and readable:
> > >
> > > > ans2 <- apply(a, 1, function(i) sort(i)[ dim(a)[2]-1])
> > > > ans2
> > >
> > > Refreshing my mail client proves I was right about many
> > > solutions, but this is the first (so far) to use the dim attribute.
> >
> > Anything with sort() or order() will have complexity
> > O(n*log(n)) or worse (n is the number of columns), whereas
> > finding the k-th largest element has complexity O(k*n).
> >
> > For moderate n, this may be unimportant, but you could
> > potentially find a speedup using
> >
> > sort.int(i, decreasing=TRUE, partial=2)[2]
> >
> > or
> >
> > max(i[-which.max(i)])
> >
> > --
> > Peter Dalgaard
> > Center for Statistics, Copenhagen Business School
> > Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> > Phone: (+45)38153501
> > Email: pd....@cbs.dk Priv: pda...@gmail.com
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
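
Peter's remark that the k-th largest element can be found in O(k*n) can be sketched for k = 2 without sorting at all. The function below is not from the thread: the name second_largest_by_row is hypothetical, and it assumes a numeric matrix with at least two columns and no NA values. It walks over the columns once and keeps, for every row at the same time, the largest and second largest values seen so far. I have not timed it against f1 to f4.

```r
# Hypothetical helper (not from the thread): second largest value in each row.
# Assumes x is a numeric matrix with at least 2 columns and no NAs.
second_largest_by_row <- function(x) {
  stopifnot(is.matrix(x), ncol(x) >= 2)
  # Start from the first two columns.
  first  <- pmax(x[, 1], x[, 2])  # running largest value per row
  second <- pmin(x[, 1], x[, 2])  # running second largest value per row
  # Visit the remaining columns once; the loop is skipped when ncol(x) == 2.
  for (j in seq_len(ncol(x))[-(1:2)]) {
    col <- x[, j]
    # The new runner-up is either the old one or the smaller of
    # (old largest, new value); update it before overwriting `first`.
    second <- pmax(second, pmin(first, col))
    first  <- pmax(first, col)
  }
  second
}

# Check against the sort-based approach on the example from the original post.
set.seed(1)
a <- matrix(rnorm(9), 3, 3)
by_sort <- apply(a, 1, function(row) sort(row, decreasing = TRUE)[2])
by_scan <- second_largest_by_row(a)
```

When the maximum occurs more than once in a row, this returns the maximum itself as the second largest, which is the same convention that sort(row, decreasing = TRUE)[2] and max(row[-which.max(row)]) follow.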