Re: [R] any other fast method for median calculation

2009-04-14 Thread Thomas Lumley
On Tue, 14 Apr 2009, S Ellison wrote:
> Sorting with an appropriate algorithm is n log(n), so it's very hard to get the 'exact' median any faster.
There actually are linear-time algorithms for the median, but n has to be very large before they are worth using, and by then you have to start consi
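To illustrate the kind of linear-time selection mentioned above, here is a minimal sketch of quickselect, which runs in expected (not worst-case) linear time. It is written in Python rather than R purely as a language-neutral illustration. The function names `quickselect` and `median_by_selection` are hypothetical, and this is not the algorithm used by R's median() or by any package named in this thread.

```python
import random


def quickselect(values, k):
    """Return the k-th smallest element (0-based) in expected O(n) time."""
    items = list(values)
    if not 0 <= k < len(items):
        raise IndexError("k out of range")
    while True:
        pivot = random.choice(items)
        lower = [v for v in items if v < pivot]
        equal = [v for v in items if v == pivot]
        if k < len(lower):
            # The target lies strictly below the pivot.
            items = lower
        elif k < len(lower) + len(equal):
            # The target is the pivot itself (or one of its duplicates).
            return pivot
        else:
            # The target lies strictly above the pivot; shift the rank.
            k -= len(lower) + len(equal)
            items = [v for v in items if v > pivot]


def median_by_selection(values):
    """Exact median without a full sort: one or two selections."""
    n = len(values)
    if n == 0:
        raise ValueError("median of an empty sequence")
    upper = quickselect(values, n // 2)
    if n % 2 == 1:
        return upper
    lower = quickselect(values, n // 2 - 1)
    return (lower + upper) / 2
```

Each pass discards part of the data, so the expected total work is proportional to n, but the constant factor is large enough that a well-tuned sort usually wins unless n is very large.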

Re: [R] any other fast method for median calculation

2009-04-14 Thread roger koenker
There is a slightly faster algorithm in my quantreg package, see kuantile(), but the gain is only significant when sample sizes are very large. In your case you really need a wrapper that keeps the loop over columns within some lower-level language. url: www.econ.uiuc.edu/~roger Rog

Re: [R] any other fast method for median calculation

2009-04-14 Thread Matthias Kohl
There is a function rowMedians in the Bioconductor package Biobase, which works for numeric matrices and might help. Matthias
Dimitris Rizopoulos wrote:
> S Ellison wrote:
>> Sorting with an appropriate algorithm is n log(n), so it's very hard to get the 'exact' median any faster. However, if you can cope

Re: [R] any other fast method for median calculation

2009-04-14 Thread Dimitris Rizopoulos
S Ellison wrote:
> Sorting with an appropriate algorithm is n log(n), so it's very hard to get the 'exact' median any faster. However, if you can cope with a less precise median, you could use a binary search between max(x) and min(x) with low tolerance or comparatively few iterations. In native R,

Re: [R] any other fast method for median calculation

2009-04-14 Thread S Ellison
Sorting with an appropriate algorithm is n log(n), so it's very hard to get the 'exact' median any faster. However, if you can cope with a less precise median, you could use a binary search between max(x) and min(x) with low tolerance or comparatively few iterations. In native R, though, that isn't
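The binary search between min(x) and max(x) described above might look roughly like the following sketch. It is written in Python rather than R as a language-neutral illustration. The name `approx_median` and the parameters `tol` and `max_iter` are assumptions, not taken from the post. For an even number of values it approximates the lower of the two middle values, not their average.

```python
def approx_median(values, tol=1e-6, max_iter=100):
    """Approximate the (lower) median by bisecting the value range.

    Each iteration is one O(n) counting pass; the search interval
    halves every time, so few iterations give a rough answer.
    """
    n = len(values)
    if n == 0:
        raise ValueError("median of an empty sequence")
    target = (n + 1) // 2          # rank of the lower median (1-based)
    lo, hi = min(values), max(values)
    for _ in range(max_iter):
        if hi - lo <= tol:
            break
        mid = (lo + hi) / 2
        at_or_below = sum(1 for v in values if v <= mid)
        if at_or_below >= target:
            hi = mid               # the median is at or below mid
        else:
            lo = mid               # the median is above mid
    return (lo + hi) / 2
```

The number of passes is about log2((max - min) / tol), so loosening `tol` or lowering `max_iter` trades precision for speed.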

[R] any other fast method for median calculation

2009-04-13 Thread Zheng, Xin (NIH) [C]
Hi there, I have a data frame with more than 200k columns. How can I get the median of each column quickly? mapply is the fastest function I know of for that, but it is still not fast enough. It seems the function "median" in R calculates the median using "sort" and "mean". I am wondering if there is another funct
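For reference, the sort-then-mean approach the poster describes can be sketched as follows. This is an illustration in Python, not R's actual source. The names `median_by_sort` and `column_medians` are hypothetical, and a dictionary of lists stands in for the data frame.

```python
def median_by_sort(values):
    """Sort, then average the one or two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty sequence")
    # One middle element when n is odd, two when n is even.
    middle = ordered[(n - 1) // 2 : n // 2 + 1]
    return sum(middle) / len(middle)


def column_medians(columns):
    """Apply median_by_sort to each named column in turn."""
    return {name: median_by_sort(col) for name, col in columns.items()}
```

With 200k columns, the cost is dominated by repeating the sort once per column, which is why the replies focus either on a cheaper selection step or on moving the per-column loop into compiled code.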