On May 29, 2012, at 6:32 PM, jacaranda tree wrote:
Hi all,
I have a data set (df, n=10 for the sake of simplicity here) where I
have two continuous variables (age and weight) and I also have a
grouping variable (group, with two levels). I want to run
correlations for each group separately (kind of similar to "split
file" in SPSS). I've been experimenting with different functions,
and I was able to do this correctly using the ddply function, but the
output is a little difficult to read when I use cor.test to get the
p values, df, and Pearson r (see below). I also tried the by
function. Although by shows the output for the two groups separately,
it seems to calculate the same r for both groups. Here is my code
for both ddply and by, along with the output. I was wondering if
there is a way to display the output better with ddply, or to run
the correlations correctly for each group using by.
Thanks in advance,
I would have imagined something along the lines of
lapply( split(df, df$group), function(x) cor.test(x[["age"]],
x[["weight"]]) )
... but without an example it's only a guess.
--
David
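A runnable sketch of the split/lapply idea follows. The original post included no data, so the df built here is invented purely for illustration, and the column names in the summary table (r, t, df, p) are my own choice. Each cor.test() result is an "htest" list, so the pieces of interest can be pulled out by name and bound into one row per group.

```r
# Hypothetical data: the values below are made up, not from the original post.
df <- data.frame(
  group  = factor(rep(1:2, each = 5)),
  age    = c(21, 25, 30, 34, 40, 22, 27, 31, 36, 41),
  weight = c(60, 64, 71, 70, 80, 55, 63, 62, 72, 75)
)

# Split the data frame by group and run cor.test() on each piece.
tests <- lapply(split(df, df$group),
                function(x) cor.test(x[["age"]], x[["weight"]]))

# Collect the key numbers from each "htest" object into one row per group.
res <- do.call(rbind, lapply(names(tests), function(g) {
  ct <- tests[[g]]
  data.frame(group = g,
             r  = unname(ct$estimate),   # Pearson correlation
             t  = unname(ct$statistic),  # t statistic
             df = unname(ct$parameter),  # degrees of freedom (n - 2)
             p  = ct$p.value)
}))
print(res)
```

Printing tests instead of res gives the full cor.test() printout for each group, if that is preferred.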
1. with "ddply"
r<-ddply(df, .(group), summarise, "corr" = cor.test(age, weight,
method = "pearson"))
Output:
Group corr
1 1 Inf
2 1 3
3 1 0
4 1 1
5 1 0
6 1 two.sided
7 1 Pearson's product-moment correlation
8 1 age and weight
9 1 1, 1
10 2 9.722211
11 2 3
12 2 0.002311412
13 2 0.9844986
14 2 0
15 2 two.sided
16 2 Pearson's product-moment correlation
17 2 age and weight
18 2 0.7779640, 0.9990233
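The long output above appears to come from summarise stacking every component of the cor.test() result into a single column. One possible way to get a single readable row per group with ddply is to pass a function that returns a one-row data frame. This is only a sketch: the data are invented (the post did not include df), and the output column names are my own.

```r
library(plyr)  # provides ddply() and .()

# Hypothetical data: the values below are made up, not from the original post.
df <- data.frame(
  group  = factor(rep(1:2, each = 5)),
  age    = c(21, 25, 30, 34, 40, 22, 27, 31, 36, 41),
  weight = c(60, 64, 71, 70, 80, 55, 63, 62, 72, 75)
)

# For each group, run cor.test() once and keep only the named pieces we want.
res <- ddply(df, .(group), function(x) {
  ct <- cor.test(x$age, x$weight, method = "pearson")
  data.frame(r       = unname(ct$estimate),
             t       = unname(ct$statistic),
             df      = unname(ct$parameter),
             p       = ct$p.value,
             ci.low  = ct$conf.int[1],
             ci.high = ct$conf.int[2])
})
print(res)
```

The result has one row per level of group, with the grouping column first and the extracted statistics after it.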
2. with "by"
r <- by(df, group, FUN = function(x) cor.test(age, weight, method =
"pearson"))
Output:
Group: 1
Pearson's product-moment correlation
data: age and weight
t = 6.4475, df = 8, p-value = 0.0001988
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.6757758 0.9802100
sample estimates:
cor
0.9157592
------------------------------------------------------------
Group: 2
Pearson's product-moment correlation
data: age and weight
t = 6.4475, df = 8, p-value = 0.0001988
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.6757758 0.9802100
sample estimates:
cor
0.9157592
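A likely reason both groups show identical results: the anonymous function passed to by never uses its argument x, so age and weight are looked up outside the function (for example from an attached copy of the data) and each group gets the test on the full sample. The df = 8 in both blocks is consistent with all 10 rows being used rather than one group. A minimal sketch of a corrected call, using invented data because the post included none:

```r
# Hypothetical data: the values below are made up, not from the original post.
df <- data.frame(
  group  = factor(rep(1:2, each = 5)),
  age    = c(21, 25, 30, 34, 40, 22, 27, 31, 36, 41),
  weight = c(60, 64, 71, 70, 80, 55, 63, 62, 72, 75)
)

# by() hands each group's rows to the function as x, so refer to x$age and
# x$weight rather than to free-standing age and weight.
r <- by(df, df$group,
        function(x) cor.test(x$age, x$weight, method = "pearson"))
print(r)

# Optional: extract just the correlation for each group.
vals <- sapply(r, function(ct) unname(ct$estimate))
print(vals)
```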
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.