[R] covariance matrix: an error and a simple mixed model question, but I do not know the answer, sorry

2011-04-17 Thread Maya Joshi
Dear list

I need your help. Excuse me for my limited R knowledge.

# example data set
set.seed(134)
lm <- 1:4

block <- rep(lm, 6)

gen <- c(rep(1, 4), rep(2, 4), rep(3, 4), rep(4, 4), rep(5, 4), rep(6, 4))

X1 <- c(rnorm(4, 10, 4), rnorm(4, 12, 6), rnorm(4, 10, 7), rnorm(4, 5, 2),
        rnorm(4, 8, 4), rnorm(4, 7, 2))

X2 <- X1 + rnorm(length(X1), 0, 3)

yvar <- c(X1, X2)

X <- c(rep(1, length(X1)), rep(2, length(X2))) # dummy x variable

dataf <- data.frame(as.factor(block), as.factor(gen), as.factor(X), yvar)



My objective is to estimate the variance-covariance matrix of the two
variables X1 and X2. That means I need to fit something like an
unstructured (UN) covariance structure.



Question 1: I got the following error

require("lme4")

# Question 1: should I consider X fixed or random?
fm1Gen <- lmer(yvar ~ X + gen + (1|block), data = dataf)



Error in model.frame.default(data = dataf, formula = yvar ~ X + gen +  :
  variable lengths differ (found for 'gen')
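
A likely cause of this error, offered as a sketch rather than a confirmed diagnosis: the columns passed to data.frame() are unnamed expressions, so dataf has no column called gen or block. The formula then falls back to the length-24 vectors in the workspace, while yvar has 48 values. The base R example below rebuilds the data with explicit column names and checks it with model.frame(), which is the step that fails inside lmer(). The object names bad, good, res and mf are made up for illustration.

```r
set.seed(134)
block <- rep(1:4, times = 6)   # 4 blocks within each of 6 genotypes
gen   <- rep(1:6, each = 4)    # 6 genotypes, 4 plots each
X1 <- c(rnorm(4, 10, 4), rnorm(4, 12, 6), rnorm(4, 10, 7),
        rnorm(4, 5, 2),  rnorm(4, 8, 4),  rnorm(4, 7, 2))
X2 <- X1 + rnorm(length(X1), 0, 3)

# Unnamed arguments: column names are derived from the expressions,
# so there is no column literally named 'gen' or 'block'.
bad <- data.frame(as.factor(block), as.factor(gen), yvar = c(X1, X2))
names(bad)

# 'gen' is then taken from the workspace (length 24) and clashes with yvar (length 48).
res <- tryCatch(model.frame(yvar ~ gen, data = bad),
                error = function(e) conditionMessage(e))

# Named columns, each stacked to the full length of yvar.
good <- data.frame(
  block = factor(rep(block, 2)),
  gen   = factor(rep(gen, 2)),
  X     = factor(rep(1:2, each = length(X1))),
  yvar  = c(X1, X2)
)
mf <- model.frame(yvar ~ X + gen, data = good)   # succeeds
nrow(mf)
```

With a data frame built like good, the same formula can be passed to lmer() or lme() without the length mismatch.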



I tried nlme too.

require(nlme)

fm2Gen <- lme(yvar ~ X + gen, random = ~ 1|block, data = dataf)

Error in model.frame.default(formula = ~yvar + X + gen + block, data = list( :
  variable lengths differ (found for 'gen') # similar error



Question 2: How can I get the covariance matrix between X1 and X2 using
either lme4 or nlme?

      X1            X2
X1    Var(X1)       Cov(X1, X2)
X2    Cov(X1, X2)   Var(X2)

Should I put gen in the model to do this? Should I specify something in the
"correlation = " argument?
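
As a baseline before any mixed model, the raw 2 x 2 sample covariance matrix of X1 and X2 can be computed in base R with cov(). This sketch ignores gen and block entirely, so it is only a sanity check on what a model-based unstructured estimate might roughly resemble, not a substitute for one. The name V is illustrative.

```r
set.seed(134)
X1 <- c(rnorm(4, 10, 4), rnorm(4, 12, 6), rnorm(4, 10, 7),
        rnorm(4, 5, 2),  rnorm(4, 8, 4),  rnorm(4, 7, 2))
X2 <- X1 + rnorm(length(X1), 0, 3)

# Diagonal entries are Var(X1) and Var(X2); off-diagonal entries are Cov(X1, X2).
V <- cov(cbind(X1, X2))
V
```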

Thank you for your time

Maya

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] covariance matrix: an error and a simple mixed model question, but I do not know the answer, sorry

2011-04-18 Thread Maya Joshi
Let me clarify the output I want to create:

Source       X1 (var)   X2 (var)   X1 & X2 (cov)
gen          var(X1)    var(X2)    cov(X1, X2)
block        var(X1)    var(X2)    cov(X1, X2)
error/res    var(X1)    var(X2)    cov(X1, X2)

I need to do further analysis based on this table.
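
One possible way to fill a table like this, sketched under the assumption of a balanced randomized block layout with X1 and X2 kept side by side in wide format: fit a multivariate linear model, take the sums of squares and cross-products (SSCP) for each source from summary.manova(), and convert mean squares into variance and covariance components by the method of moments. This is a swapped-in moment estimator, not the REML fit that lmer() or lme() would give, and moment estimates can come out negative or not positive definite. The names wide, sscp, ms and comp are illustrative.

```r
set.seed(134)
wide <- data.frame(
  block = factor(rep(1:4, times = 6)),
  gen   = factor(rep(1:6, each = 4))
)
wide$X1 <- c(rnorm(4, 10, 4), rnorm(4, 12, 6), rnorm(4, 10, 7),
             rnorm(4, 5, 2),  rnorm(4, 8, 4),  rnorm(4, 7, 2))
wide$X2 <- wide$X1 + rnorm(nrow(wide), 0, 3)

# Multivariate fit: both responses modelled jointly on block and gen.
fit  <- manova(cbind(X1, X2) ~ block + gen, data = wide)
sscp <- summary(fit)$SS   # one 2 x 2 SSCP matrix per source

n_block <- nlevels(wide$block)
n_gen   <- nlevels(wide$gen)
df <- c(block = n_block - 1, gen = n_gen - 1,
        Residuals = (n_block - 1) * (n_gen - 1))

# Mean squares and cross-products: SSCP divided by degrees of freedom.
ms <- Map(function(s, d) s / d, sscp[names(df)], df)

# Moment estimates for a balanced design:
# E[MS_gen] = Sigma_error + n_block * Sigma_gen, and similarly for block.
comp <- list(
  gen   = (ms$gen   - ms$Residuals) / n_block,
  block = (ms$block - ms$Residuals) / n_gen,
  error = ms$Residuals
)
comp
```

Each element of comp is a 2 x 2 matrix whose diagonal holds the X1 and X2 variance estimates for that source and whose off-diagonal holds the covariance estimate.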

Thanks in advance

Maya




[R] problem in applying function in data subset (with a level) - using plyr or other alternative are also welcome

2011-09-02 Thread Maya Joshi
Dear R experts.

I might be missing something obvious. I have been trying to fix this problem
for some weeks. Please help.

# data
ped <- c(rep(1, 4), rep(2, 3), rep(3, 3))
y <- rnorm(10, 8, 2)

# variable set 1
M1a <- sample(c(1, 2, 3), 10, replace = TRUE)
M1b <- sample(c(1, 2, 3), 10, replace = TRUE)
M1aP1 <- sample(c(1, 2, 3), 10, replace = TRUE)
M1bP2 <- sample(c(1, 2, 3), 10, replace = TRUE)

# variable set 2
M2a <- sample(c(1, 2, 3), 10, replace = TRUE)
M2b <- sample(c(1, 2, 3), 10, replace = TRUE)
M2aP1 <- sample(c(1, 2, 3), 10, replace = TRUE)
M2bP2 <- sample(c(1, 2, 3), 10, replace = TRUE)

# variable set 3
M3a <- sample(c(1, 2, 3), 10, replace = TRUE)
M3b <- sample(c(1, 2, 3), 10, replace = TRUE)
M3aP1 <- sample(c(1, 2, 3), 10, replace = TRUE)
M3bP2 <- sample(c(1, 2, 3), 10, replace = TRUE)

mydf <- data.frame(ped, M1a, M1b, M1aP1, M1bP2, M2a, M2b, M2aP1, M2bP2,
                   M3a, M3b, M3aP1, M3bP2, y)

# functions and further calculations

# each row of mmat names the four columns that make up one variable set
mmat <- matrix(c("M1a", "M2a", "M3a", "M1b", "M2b", "M3b",
                 "M1aP1", "M2aP1", "M3aP1", "M1bP2", "M2bP2", "M3bP2"),
               ncol = 4)

# first function
myfun <- function(x) {
  x <- as.vector(x)
  ot1 <- ifelse(mydf[x[1]] == mydf[x[3]], 1, -1)
  ot2 <- ifelse(mydf[x[2]] == mydf[x[4]], 1, -1)
  qt <- ot1 + ot2
  return(qt)
}
qt <- apply(mmat, 1, myfun)
ydv <- c((y - mean(y))^2)
qtd <- data.frame(ped, ydv, qt)

# second function
myfun2 <- function(dataframe) {
  vydv <- sum(ydv) * 0.25
  sumD <- sum(ydv * qt)
  Rt <- vydv / sumD
  return(Rt)
}

# using plyr
require(plyr)
dfsumd1 <- ddply(mydf, .(mydf$ped), myfun2)

Here are 2 issues:
(1) The output is just one value; I need output for all three sets of
variables (as listed above).

(2) All three values of dfsumd1 are the same for every level of ped (1, 2,
3). That means the function is applied to the whole dataset and the result
is merely replicated in the output!

I tried with plyr, so I am not being lazy, but my R knowledge is limited. If
you have a different suggestion, you are welcome too.
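
A plausible reading of issue (2), sketched with a toy data set: myfun2 never uses its argument, so every group evaluates the same workspace objects ydv and qt and returns the same number. The example below reproduces that pattern in base R with split() and sapply() instead of plyr, purely to keep it self-contained. The names toy, v_global, uses_global and uses_piece are hypothetical.

```r
toy <- data.frame(g = c(1, 1, 2, 2), v = c(1, 2, 3, 4))
v_global <- toy$v

# Ignores the piece it is given and reads the full-length workspace vector.
uses_global <- function(piece) sum(v_global)

# Reads the values from the piece, so each group gets its own result.
uses_piece <- function(piece) sum(piece$v)

pieces <- split(toy, toy$g)
same_everywhere <- sapply(pieces, uses_global)   # 10 for both groups
per_group       <- sapply(pieces, uses_piece)    # 3 for group 1, 7 for group 2
```

The same principle applies to a function handed to ddply(): anything that should vary by group has to be read from the function's argument.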

Thank you in advance...

Maya



Re: [R] problem in applying function in data subset (with a level) - using plyr or other alternative are also welcome

2011-09-03 Thread Maya Joshi
Dear R experts:

Thank you Dennis and David ...

As David indicated, my language was unclear, sorry. I have tried to explain
what I intend to do below, using Dennis's solution code:


ped <- rep(1:3, c(4, 3, 3))
y <- rnorm(10, 8, 2)
# This replaces all of your sample() statements, and is equivalent:
smat <- matrix(sample(1:3, 120, replace = TRUE), ncol = 12)
colnames(smat) <- c('M1a', 'M1b', 'M1aP1', 'M1bP2',
                    'M2a', 'M2b', 'M2aP1', 'M2bP2',
                    'M3a', 'M3b', 'M3aP1', 'M3bP2')
mydf <- as.data.frame(cbind(ped, y, smat))

mmat <- matrix(c("M1a", "M2a", "M3a", "M1b", "M2b", "M3b",
                 "M1aP1", "M2aP1", "M3aP1", "M1bP2", "M2bP2", "M3bP2"),
               ncol = 4)
     [,1]  [,2]  [,3]    [,4]
[1,] "M1a" "M1b" "M1aP1" "M1bP2"
[2,] "M2a" "M2b" "M2aP1" "M2bP2"
[3,] "M3a" "M3b" "M3aP1" "M3bP2"

I want to compare the mydf columns named in [,1] and [,3]
(mydf[x[1]] == mydf[x[3]]), and likewise [,2] and [,4], for all three rows
of mmat. mmat tells me which variables to pick while working on mydf. In my
real dataset I have 1000 such sets of variables.
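
With 1000 variable sets, typing the name matrix by hand is impractical. A minimal sketch of building it programmatically, assuming the real columns follow the same M<set><suffix> naming pattern as this example (that pattern, and the names n_sets, suffixes and mmat_auto, are assumptions for illustration):

```r
n_sets   <- 3                          # would be 1000 in the real data
suffixes <- c("a", "b", "aP1", "bP2")  # the four columns in each set

# One row per set, one column per suffix; paste0() is vectorized, so outer() can use it.
mmat_auto <- outer(seq_len(n_sets), suffixes,
                   function(i, s) paste0("M", i, s))
mmat_auto
```

For n_sets = 3 this gives the same 3 x 4 layout as the hand-typed mmat, so it can be passed to apply(mmat_auto, 1, myfun) in the same way.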

# first function
myfun <- function(x) {
  x <- as.vector(x)
  ot1 <- ifelse(mydf[x[1]] == mydf[x[3]], 1, -1)
  ot2 <- ifelse(mydf[x[2]] == mydf[x[4]], 1, -1)
  qt <- ot1 + ot2
  return(qt)
}
qt <- apply(mmat, 1, myfun)

The result is a matrix with one row per row of mydf and one column per set
of variables:
      [,1] [,2] [,3]
 [1,]    0   -2    0
 [2,]   -2    0   -2
 [3,]    0   -2    0
 [4,]    0    0    2
 [5,]    0   -2   -2
 [6,]   -2    0   -2
 [7,]   -2   -2    0
 [8,]   -2    0    0
 [9,]   -2    0    2
[10,]    0    0    0

ydv <- c((y - mean(y))^2)  # squared deviation of each y value from the mean of y
 [1]  9.5012525  0.2578341  1.6676271  6.3102202 12.8701830  9.5509480
 [7]  0.8661107  3.1828185  0.9215140  1.0909813

# new data.frame combining ped with the output of the function above
qtd <- data.frame(ped, ydv, qt)
   ped        ydv X1 X2 X3
1    1  9.5012525  0 -2  0
2    1  0.2578341 -2  0 -2
3    1  1.6676271  0 -2  0
4    1  6.3102202  0  0  2
5    2 12.8701830  0 -2 -2
6    2  9.5509480 -2  0 -2
7    2  0.8661107 -2 -2  0
8    3  3.1828185 -2  0  0
9    3  0.9215140 -2  0  2
10   3  1.0909813  0  0  0

Now I want to calculate Rt for each of X1, X2, X3 within each level of ped.
For this example the expected result should look like a 3 x 3 matrix. In my
real data I have around 200 levels of ped and around 1000 X variables.
# Rt values
Ped   X1   X2   X3
1
2
3

# second function
myfun2 <- function(dataframe) {
  vydv <- sum(ydv) * 0.25
  sumD <- sum(ydv * qt)
  Rt <- vydv / sumD
  return(Rt)
}

# using plyr
require(plyr)
dfsumd1 <- ddply(mydf, .(mydf$ped), myfun2)

dfsumd1
  mydf$ped         V1
1        1 -0.1047935
2        2 -0.1047935
3        3 -0.1047935

This is not what I want. I want ped-wise Rt values for each of the X
variables in the qtd data frame above.
# Rt values
Ped   X1   X2   X3
1
2
3

Then I can sum Ped$X1, Ped$X2, Ped$X3. The idea is to calculate a separate
Rt value for each variable within each ped group, and then add the values.

Thank you so much for your time. I hope I have made it clear now.

Maya



Re: [R] problem in applying function in data subset (with a level) - using plyr or other alternative are also welcome

2011-09-03 Thread Maya Joshi
Let me clarify my last part:

qt2 <- qtd[-1:-2]
# second function
myfun2 <- function(x) {
  vydv <- sum(ydv) * 0.25
  sumD <- sum(ydv * x)
  Rt <- vydv / sumD
  return(Rt)
}

# ignoring grouping with ped, the following is output
qt2 <- apply(qtd, 2, myfun2)
qtd1 <- qt2[-1:-2]

# The following result seems to pool over X1, X2, and X3
require(plyr)
dfsumd1 <- ddply(qtd, .(qtd$ped), myfun2)

  qtd$ped         V1
1       1  0.2296159
2       2  0.1045569
3       3 -5.8861942

But I need a two-way table:

Ped   X1   X2   X3
1
2
3

Still unsuccessful, sorry!
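
A minimal base R sketch of one way to get such a two-way table, assuming that both sums in the Rt formula (sum(ydv) * 0.25 and sum(ydv * X)) are meant to be taken within each ped group; if the numerator should instead use all rows, only the vydv line changes. The data are the qtd values printed earlier in this thread, and the function name rt_by_ped and its arguments are hypothetical.

```r
qtd <- data.frame(
  ped = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3),
  ydv = c(9.5012525, 0.2578341, 1.6676271, 6.3102202, 12.8701830,
          9.5509480, 0.8661107, 3.1828185, 0.9215140, 1.0909813),
  X1  = c(0, -2, 0, 0, 0, -2, -2, -2, -2, 0),
  X2  = c(-2, 0, -2, 0, -2, 0, -2, 0, 0, 0),
  X3  = c(0, -2, 0, 2, -2, -2, 0, 0, 2, 0)
)

rt_by_ped <- function(d, group = "ped", dev = "ydv") {
  marker_cols <- setdiff(names(d), c(group, dev))  # every remaining column is an X variable
  pieces <- split(d, d[[group]])                   # one data frame per ped level
  out <- sapply(pieces, function(piece) {
    vydv <- sum(piece[[dev]]) * 0.25               # numerator, within this group (assumption)
    sapply(marker_cols, function(col) vydv / sum(piece[[dev]] * piece[[col]]))
  })
  # sapply() returns X variables in rows and groups in columns; transpose for the wanted layout.
  data.frame(ped = names(pieces), t(out), row.names = NULL)
}

rt_table <- rt_by_ped(qtd)
rt_table
```

A group in which an X variable is zero on every row has a zero denominator, so R returns Inf there (X2 for ped 3 in this example); such cells would need handling before the columns are summed. The sketch also assumes at least two X columns, since with a single column sapply() returns a vector rather than a matrix.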
