Elai:

Thank you.You make an excellent point. cor() is implemented at the C
level (via a .internal call) whereas sapply implements an interpreted
loop that has to issue the call each time through the loop (with some
shortcuts/tricks to reduce overhead). So the operations count of the
original poster is completely bogus.

As you say, "it depends..." . For this reason, it is generally a bad
idea to waste much time on code efficiency unless you really need to,
which these days is not often (and there are certainly arenas where
this statement is false). More important is to focus on code clarity,
flexibility, debuggabuility, etc.

Best,
Bert
On Thu, Feb 23, 2012 at 2:52 PM, ilai <ke...@math.montana.edu> wrote:
> On Thu, Feb 23, 2012 at 3:24 PM, Bert Gunter <gunter.ber...@gene.com> wrote:
>> Use 1:n as an index.
>>
>> e.g.
>> sapply(1:n, function(i) cor(x[,i],y[,i]))
>
> ## sapply is a good solution (the only one I could think of too), but
> not always worth it:
>
> # for 100 x 1000
>  x <- data.frame(matrix(rnorm(100000),nc=1000))
>  y <- data.frame(matrix(rnorm(100000),nc=1000))
>  system.time(diag(cor(x,y)))
> #   user  system elapsed
> #  0.592   0.008   0.623
> system.time(sapply(1:1000,function(i) cor(x[,i],y[,i])))
> #   user  system elapsed
> #  0.384   0.000   0.412
>
> # Great. but for 10 x 1000
> x <- data.frame(matrix(rnorm(10000),nc=1000))
> y <- data.frame(matrix(rnorm(10000),nc=1000))
> system.time(diag(cor(x,y)))
> #   user  system elapsed
> #  0.256   0.008   0.279
> system.time(sapply(1:1000,function(i) cor(x[,i],y[,i])))
> #   user  system elapsed
> #  0.376   0.000   0.388
>
> # or 100 x 100
>  system.time(diag(cor(x,y)))
> #   user  system elapsed
> #  0.016   0.000   0.014
>  system.time(sapply(1:100,function(i) cor(x[,i],y[,i])))
> #   user  system elapsed
> #  0.036   0.000   0.036
>
> # Not so great.
>
> Bottom line, as always, it depends.
>
> Cheers
> Elai
>
>
>
>
>>
>> -- Bert
>>
>>
>>
>> On Thu, Feb 23, 2012 at 2:10 PM, Sam Steingold <s...@gnu.org> wrote:
>>> suppose I have two sets of vectors: x1,x2,...,xN and y1,y2,...,yN.
>>> I want N correlations: cor(x1,y1), cor(x2,y2), ..., cor(xN,yN).
>>> my sets of vectors are arranged as data frames x & y (vector=column):
>>>
>>>  x <- data.frame(a=rnorm(10),b=rnorm(10),c=rnorm(10))
>>>  y <- data.frame(d=rnorm(10),e=rnorm(10),f=rnorm(10))
>>>
>>> cor(x,y) returns a _matrix_ of all pairwise correlations:
>>>
>>>  cor(x,y)
>>>          d          e            f
>>> a 0.2763696 -0.3523757 -0.373518870
>>> b 0.5892742 -0.1969161 -0.007159589
>>> c 0.3094301  0.1111997 -0.094970748
>>>
>>> which is _not_ what I want.
>>>
>>> I want diag(cor(x,y)) but without the N^2 calculations.
>>>
>>> thanks.
>>>
>>> --
>>> Sam Steingold (http://sds.podval.org/) on Ubuntu 11.10 (oneiric) X 
>>> 11.0.11004000
>>> http://www.childpsy.net/ http://iris.org.il http://americancensorship.org
>>> http://dhimmi.com http://www.PetitionOnline.com/tap12009/ 
>>> http://jihadwatch.org
>>> Never argue with an idiot: he has more experience with idiotic arguments.
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>>
>> --
>>
>> Bert Gunter
>> Genentech Nonclinical Biostatistics
>>
>> Internal Contact Info:
>> Phone: 467-7374
>> Website:
>> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to