Re: [R] Second largest element from each matrix row

William Dunlap Tue, 26 Apr 2011 09:28:32 -0700

And I hit the send button before adding the timings for
when there were lots of columns and few rows.  f3 changes
from the best to the worst in this case.  There is rarely
one most efficient function for all datasets.


> x <- t(x)
> benchmark(r1 <- f1(x), r2 <- f2(x), r3 <- f3(x), r4 <- f4(x),
+     replications=5, columns=c("test","replications","elapsed"),
+     order="elapsed")
         test replications elapsed
4 r4 <- f4(x)            5    0.19
2 r2 <- f2(x)            5    0.24
1 r1 <- f1(x)            5    0.79
3 r3 <- f3(x)            5    3.75
> identical(r1,r2) && identical(r1, r3) && identical(r1, r4)
[1] TRUE
> dim(x)
[1]      6 100000
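
Since the ranking flips with the shape of the matrix, one option is to pick the function from dim(x). The following is only a rough sketch: the wrapper name second_largest and the nrow > ncol rule are my own assumptions, not a tuned threshold, and f3 and f4 are copied from the quoted message below.

```r
# f3 and f4 as defined in the quoted message
f3 <- function(x) {
    # order by row number then by value
    y <- t(x)
    array(y[order(col(y), y)], dim(y))[nrow(y) - 1, ]
}
f4 <- function(x)
    apply(x, 1, function(row) max(row[-which.max(row)]))

# Hypothetical wrapper: use the order()-based f3 for many short rows,
# and the apply()-based f4 for few long rows.
second_largest <- function(x) {
    if (nrow(x) > ncol(x)) f3(x) else f4(x)
}

set.seed(1)
tall <- matrix(runif(1000 * 6), nrow = 1000)  # many short rows
wide <- t(tall)                               # few long rows

# Both branches should return the same values as either function alone
stopifnot(identical(second_largest(tall), f4(tall)),
          identical(second_largest(wide), f3(wide)))
```

Where the crossover actually lies will depend on the data and the machine, so it is worth timing both shapes on your own data before fixing a rule.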

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  

> -----Original Message-----
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On Behalf Of William Dunlap
> Sent: Tuesday, April 26, 2011 9:11 AM
> To: peter dalgaard; David Winsemius
> Cc: r-help@r-project.org
> Subject: Re: [R] Second largest element from each matrix row
> 
> A different approach is to use order() to sort
> first by row number and then break the ties by
> value.  It is quick when there are lots of short
> rows.
> 
> > f1 <- function (x) 
> +    apply(x, 1, function(row) sort(row, decreasing = TRUE)[2])
> > f2 <- function (x) 
> +     -apply(-x, 1, function(row) sort.int(row, partial = 2)[2])
> > f3 <- function (x) 
> + {   
> +     # order by row number then by value
> +     y <- t(x)
> +     array(y[order(col(y), y)], dim(y))[nrow(y) - 1, ]
> + }
> > f4 <- function (x) 
> +     apply(x, 1, function(row) max(row[-which.max(row)]))
> > x <- matrix(runif(1e5*6), nrow=1e5)
> > library(rbenchmark)
> > benchmark(r1 <- f1(x), r2 <- f2(x), r3 <- f3(x), r4 <- f4(x),
> +     replications=5, columns=c("test","replications","elapsed"),
> +     order="elapsed")
>          test replications elapsed
> 3 r3 <- f3(x)            5    1.08
> 4 r4 <- f4(x)            5   12.59
> 2 r2 <- f2(x)            5   23.19
> 1 r1 <- f1(x)            5   59.54
> > identical(r1,r2) && identical(r1, r3) && identical(r1, r4)
> [1] TRUE
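
To make the order() trick in f3 easier to follow, here is a small walk-through on a 2 x 3 matrix. The matrix values and the intermediate names idx, sorted and second are made up for illustration; the steps themselves are the ones in f3.

```r
x <- matrix(c(5, 1, 9,
              2, 8, 4), nrow = 2, byrow = TRUE)

y <- t(x)                        # 3 x 2: each column of y is one row of x
idx <- order(col(y), y)          # sort by column of y, ties broken by value
sorted <- array(y[idx], dim(y))  # every column is now in ascending order
second <- sorted[nrow(y) - 1, ]  # next-to-last row: second largest per row of x

idx     # 2 1 3 4 6 5
second  # 5 4
```

A single call to order() handles all rows at once, so the cost of calling a function once per row disappears, which fits the timings for many short rows.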
> 
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com  
> 
> > -----Original Message-----
> > From: r-help-boun...@r-project.org 
> > [mailto:r-help-boun...@r-project.org] On Behalf Of peter dalgaard
> > Sent: Tuesday, April 26, 2011 8:13 AM
> > To: David Winsemius
> > Cc: r-help@r-project.org
> > Subject: Re: [R] Second largest element from each matrix row
> > 
> > 
> > On Apr 26, 2011, at 14:36 , David Winsemius wrote:
> > 
> > > 
> > > On Apr 26, 2011, at 8:01 AM, Lars Bishop wrote:
> > > 
> > >> Hi,
> > >> 
> > >> I need to extract the second largest element from each row of a
> > >> matrix. Below is my solution, but I think there should be a more
> > >> efficient way to accomplish the same, or not?
> > >> 
> > >> 
> > >> set.seed(1)
> > >> a <- matrix(rnorm(9), 3 ,3)
> > >> sec.large <- as.vector(apply(a, 1, order, decreasing=T)[2,])
> > >> ans <- sapply(1:length(sec.large), function(i) a[i, sec.large[i]])
> > >> ans
> > > 
> > > There are probably many but this one is reasonably compact,
> > > one-step, and readable:
> > > 
> > > > ans2 <- apply(a, 1, function(i) sort(i)[ dim(a)[2]-1])
> > > > ans2
> > > 
> > > Refreshing my mail client proves I was right about many
> > > solutions, but this is the first (so far) to use the dim attribute.
> > 
> > Anything with sort() or order() will have complexity 
> > O(n*log(n)) or worse (n is the number of columns), whereas 
> > finding the k-th largest element has complexity O(k*n). 
> > 
> > For moderate n, this may be unimportant, but you could 
> > potentially find a speedup using
> > 
> > sort.int(i, decreasing=TRUE, partial=2)[2]
> > 
> > or
> > 
> > max(i[-which.max(i)])
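
A note on the first suggestion: as far as I know, sort.int() does not accept decreasing=TRUE together with partial= and stops with an error about unsupported options for partial sorting. That is presumably why f2 in the later message negates the data instead. A minimal sketch on a made-up vector i:

```r
i <- c(0.3, 2.5, -1.2, 1.7, 0.9)

# Full sort, used here as the reference answer
ref <- sort(i, decreasing = TRUE)[2]

# Partial sort of the negated vector: position 2 holds the second
# smallest of -i, which is minus the second largest of i
via_partial <- -sort.int(-i, partial = 2)[2]

# Drop one occurrence of the maximum, then take the maximum of the rest
via_max <- max(i[-which.max(i)])

# Expected to fail: partial sorting combined with decreasing = TRUE
res <- tryCatch(sort.int(i, decreasing = TRUE, partial = 2)[2],
                error = function(e) conditionMessage(e))

stopifnot(identical(via_partial, ref), identical(via_max, ref))
```

If the maximum occurs more than once, all three working variants return the maximum itself, because only one copy is skipped.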
> > 
> > -- 
> > Peter Dalgaard
> > Center for Statistics, Copenhagen Business School
> > Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> > Phone: (+45)38153501
> > Email: pd....@cbs.dk  Priv: pda...@gmail.com
> > 
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide 
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> > 