On May 29, 2012, at 6:32 PM, jacaranda tree wrote:

Hi all,
I have a data set (df, n=10 for the sake of simplicity here) with two continuous variables (age and weight) and a grouping variable (group, with two levels). I want to run correlations for each group separately (similar to "split file" in SPSS). I've been experimenting with different functions. I was able to do this correctly using the ddply function, but the output is a little difficult to read when I use cor.test to get the p values, df, and Pearson r (see below). I also tried the by function. Although by shows the results for the two groups separately, it seems to calculate the same r for both groups. Here is my code for both ddply and by, along with the output. I was wondering if there is a way to display the output better with ddply, or to run the correlations correctly for each group using by.
Thanks in advance,


I would have imagined something along the lines of

lapply( split(df, df$group), function(x) cor.test(x[["age"]], x[["weight"]]) )

... but without an example it's only a guess.

--
David
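
For anyone following along, here is a minimal sketch of the split/lapply idea that also collects the results into one row per group. The data are made up for illustration, since the original post did not include any; only the column names (age, weight, group) come from the question. The result names (r, t, df, p.value, ci.low, ci.high) are my own choice, not anything cor.test prescribes.

```r
# Made-up example data: two groups of five, as in the question's n = 10.
set.seed(42)
df <- data.frame(
  group = rep(c(1, 2), each = 5),
  age   = c(21, 25, 30, 34, 40, 22, 27, 31, 36, 41)
)
df$weight <- 50 + 0.8 * df$age + rnorm(nrow(df), sd = 3)

# Split the data frame by group and run cor.test() on each piece.
tests <- lapply(split(df, df$group), function(x) {
  cor.test(x[["age"]], x[["weight"]], method = "pearson")
})

# Pull the interesting components out of each "htest" object
# and bind them into a data frame with one row per group.
res <- do.call(rbind, lapply(names(tests), function(g) {
  ct <- tests[[g]]
  data.frame(
    group   = g,
    r       = unname(ct$estimate),
    t       = unname(ct$statistic),
    df      = unname(ct$parameter),
    p.value = ct$p.value,
    ci.low  = ct$conf.int[1],
    ci.high = ct$conf.int[2]
  )
}))
print(res)
```

The list `tests` still holds the full cor.test objects, so `tests[["1"]]` prints the usual report for the first group.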

1. with "ddply"
r<-ddply(df, .(group), summarise, "corr" = cor.test(age, weight, method = "pearson"))

Output:
   Group                                 corr
1      1                                  Inf
2      1                                    3
3      1                                    0
4      1                                    1
5      1                                    0
6      1                            two.sided
7      1 Pearson's product-moment correlation
8      1                       age and weight
9      1                                 1, 1
10     2                             9.722211
11     2                                    3
12     2                          0.002311412
13     2                            0.9844986
14     2                                    0
15     2                            two.sided
16     2 Pearson's product-moment correlation
17     2                       age and weight
18     2                 0.7779640, 0.9990233
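
The nine rows per group in the output above are the nine components of the object that cor.test returns (statistic, df, p-value, estimate, null value, alternative, method, data name, confidence interval), stacked into one column. One possible way to get a readable table, sketched here under assumptions, is to hand ddply a function that returns a one-row data frame with only the wanted components. The data are made up for illustration, and cor_row is a hypothetical helper name, not part of plyr.

```r
# Made-up example data; the original post did not include any.
set.seed(1)
df <- data.frame(
  group = rep(c(1, 2), each = 5),
  age   = c(21, 25, 30, 34, 40, 22, 27, 31, 36, 41)
)
df$weight <- 50 + 0.8 * df$age + rnorm(nrow(df), sd = 3)

# Hypothetical helper: run cor.test() on one group's rows and
# return the key numbers as a one-row data frame.
cor_row <- function(x) {
  ct <- cor.test(x$age, x$weight, method = "pearson")
  data.frame(
    r       = unname(ct$estimate),
    df      = unname(ct$parameter),
    p.value = ct$p.value
  )
}

# ddply() applies cor_row to each group and row-binds the results.
# The call is skipped if the plyr package is not installed.
if (requireNamespace("plyr", quietly = TRUE)) {
  res <- plyr::ddply(df, "group", cor_row)
  print(res)
}
```

The result has one row per group with columns group, r, df and p.value; more components (for example the confidence limits) can be added to cor_row in the same way.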

2. with "by"
r <- by(df, group, FUN = function(x) cor.test(age, weight, method = "pearson"))

Output:
Group: 1

        Pearson's product-moment correlation

data:  age and weight
t = 6.4475, df = 8, p-value = 0.0001988
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.6757758 0.9802100
sample estimates:
      cor
0.9157592

------------------------------------------------------------
Group: 2

        Pearson's product-moment correlation

data:  age and weight
t = 6.4475, df = 8, p-value = 0.0001988
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.6757758 0.9802100
sample estimates:
      cor
0.9157592
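
A likely reason both groups show the same result: the function passed to by never uses its argument x, so age and weight are looked up outside the subset (for example in an attached data frame or the workspace) and every call sees all the rows. The df = 8 in both reports is consistent with that, since it corresponds to all 10 observations rather than one group. A minimal sketch of a corrected call follows, using made-up data because the original post did not include any.

```r
# Made-up example data: two groups of five.
set.seed(7)
df <- data.frame(
  group = rep(c(1, 2), each = 5),
  age   = c(21, 25, 30, 34, 40, 22, 27, 31, 36, 41)
)
df$weight <- 50 + 0.8 * df$age + rnorm(nrow(df), sd = 3)

# Refer to the columns through x, the per-group subset that by() passes in.
r <- by(df, df$group, function(x) {
  cor.test(x$age, x$weight, method = "pearson")
})
print(r)

# Extract one correlation estimate per group from the results.
est <- sapply(r, function(ct) unname(ct$estimate))
print(est)
```

With this change each report should show df = 3 (five rows per group) and its own estimate.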

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
