[R] covariance matrix: an error and a simple mixed model question, but I do not know the answer, sorry
Dear list,

I need your help. Excuse me for my limited R knowledge.

# example data set
set.seed(134)
lm = c(1:4)
block = c(rep(lm, 6))
gen <- c(rep(1, 4), rep(2, 4), rep(3, 4), rep(4, 4), rep(5, 4), rep(6, 4))
X1 = c(rnorm(4, 10, 4), rnorm(4, 12, 6), rnorm(4, 10, 7),
       rnorm(4, 5, 2), rnorm(4, 8, 4), rnorm(4, 7, 2))
X2 = X1 + rnorm(length(X1), 0, 3)
yvar <- c(X1, X2)
X <- c(rep(1, length(X1)), rep(2, length(X2)))  # dummy x variable
dataf <- data.frame(as.factor(block), as.factor(gen), as.factor(X), yvar)

My objective is to estimate the variance-covariance matrix between two variables, X1 and X2. That means I need to fit something like an unstructured (UN) covariance structure.

Question 1: I got the following error.

require("lme4")
fm1Gen <- lmer(yvar ~ X + gen + (1|block), data = dataf)  # Question 1: should I consider X fixed or random?

Error in model.frame.default(data = dataf, formula = yvar ~ X + gen + :
  variable lengths differ (found for 'gen')

I tried nlme too.

require(nlme)
fm2Gen <- lme(yvar ~ X + gen, random = ~ 1|block, data = dataf)

Error in model.frame.default(formula = ~yvar + X + gen + block, data = list( :
  variable lengths differ (found for 'gen')  # similar error

Question 2: How can I get the covariance matrix between X1 and X2, using either lme (nlme) or lmer (lme4)?

      X1            X2
X1    Var(X1)       Cov(X1, X2)
X2    Cov(X1, X2)   Var(X2)

Should I put gen in the model to do this? Should I specify something in "correlation = "?

Thank you for your time

Maya

[[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] covariance matrix: an error and a simple mixed model question, but I do not know the answer, sorry
Let me clarify the output I want to create:

Source       X1 (var)   X2 (var)   X1 & X2 (cov)
gen          var(X1)    var(X2)    cov(X1, X2)
block        var(X1)    var(X2)    cov(X1, X2)
error/res    var(X1)    var(X2)    cov(X1, X2)

I need to do a posterior analysis based on this table.

Thanks in advance

Maya

On Sun, Apr 17, 2011 at 9:59 PM, Maya Joshi wrote:
> [...]
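A likely cause of the "variable lengths differ (found for 'gen')" error in the first message: data.frame(as.factor(block), ...) creates columns with names such as as.factor.block., so the formula cannot find X, gen, or block inside dataf and falls back to the workspace vectors, where gen has 24 elements and yvar has 48. The following is a minimal sketch of a long-format data frame with explicit column names, using base R only. The layout (one row per block x gen x trait) is an assumption about what was intended, and the last line gives only the raw covariance of X1 and X2, ignoring gen and block; it is not the mixed-model estimate asked for in Question 2.

```r
set.seed(134)
block <- rep(1:4, times = 6)   # 4 blocks, repeated for each of 6 gen levels
gen   <- rep(1:6, each = 4)    # 6 gen levels, 4 observations each
X1 <- c(rnorm(4, 10, 4), rnorm(4, 12, 6), rnorm(4, 10, 7),
        rnorm(4, 5, 2), rnorm(4, 8, 4), rnorm(4, 7, 2))
X2 <- X1 + rnorm(length(X1), 0, 3)

# Long format: block and gen are repeated once per trait, so every
# column has 48 rows and carries the name used in the model formula.
dataf <- data.frame(
  block = factor(rep(block, times = 2)),
  gen   = factor(rep(gen, times = 2)),
  X     = factor(rep(1:2, each = length(X1))),  # 1 = X1, 2 = X2
  yvar  = c(X1, X2)
)
str(dataf)

# Raw 2 x 2 variance-covariance matrix of the two traits
# (assumption: no adjustment for gen or block).
raw_cov <- cov(cbind(X1 = X1, X2 = X2))
print(raw_cov)
```

With this data frame, a call such as lme(yvar ~ X + gen, random = ~ 1 | block, data = dataf) should at least find variables of equal length; whether that model yields the per-source covariance table requested above is a separate question.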
[R] problem applying a function to data subsets (with a level) - plyr or other alternatives are also welcome
Dear R experts,

I might be missing something obvious. I have been trying to fix this problem for some weeks. Please help.

# data
ped <- c(rep(1, 4), rep(2, 3), rep(3, 3))
y <- rnorm(10, 8, 2)

# variable set 1
M1a <- sample(c(1, 2, 3), 10, replace = TRUE)
M1b <- sample(c(1, 2, 3), 10, replace = TRUE)
M1aP1 <- sample(c(1, 2, 3), 10, replace = TRUE)
M1bP2 <- sample(c(1, 2, 3), 10, replace = TRUE)

# variable set 2
M2a <- sample(c(1, 2, 3), 10, replace = TRUE)
M2b <- sample(c(1, 2, 3), 10, replace = TRUE)
M2aP1 <- sample(c(1, 2, 3), 10, replace = TRUE)
M2bP2 <- sample(c(1, 2, 3), 10, replace = TRUE)

# variable set 3
M3a <- sample(c(1, 2, 3), 10, replace = TRUE)
M3b <- sample(c(1, 2, 3), 10, replace = TRUE)
M3aP1 <- sample(c(1, 2, 3), 10, replace = TRUE)
M3bP2 <- sample(c(1, 2, 3), 10, replace = TRUE)

mydf <- data.frame(ped, M1a, M1b, M1aP1, M1bP2, M2a, M2b, M2aP1, M2bP2,
                   M3a, M3b, M3aP1, M3bP2, y)

# functions and further calculations
mmat <- matrix(c("M1a", "M2a", "M3a", "M1b", "M2b", "M3b",
                 "M1aP1", "M2aP1", "M3aP1", "M1bP2", "M2bP2", "M3bP2"), ncol = 4)

# first function
myfun <- function(x) {
  x <- as.vector(x)
  ot1 <- ifelse(mydf[x[1]] == mydf[x[3]], 1, -1)
  ot2 <- ifelse(mydf[x[2]] == mydf[x[4]], 1, -1)
  qt <- ot1 + ot2
  return(qt)
}
qt <- apply(mmat, 1, myfun)
ydv <- c((y - mean(y))^2)
qtd <- data.frame(ped, ydv, qt)

# second function
myfun2 <- function(dataframe) {
  vydv <- sum(ydv) * 0.25
  sumD <- sum(ydv * qt)
  Rt <- vydv / sumD
  return(Rt)
}

# using plyr
require(plyr)
dfsumd1 <- ddply(mydf, .(mydf$ped), myfun2)

Here are two issues:

(1) I get only one output; I need the output for all three sets of variables (as listed above).
(2) All three values of dfsumd1 are the same for every level of ped (1, 2, 3). This means that the function is applied to the whole dataset and the result is merely replicated in the output!

I tried plyr, so I am not being lazy, but my R knowledge is limited. If you have a different suggestion, you are welcome too.

Thank you in advance...

Maya
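One likely reason for issue (2) is that myfun2 never uses its dataframe argument: ydv and qt are looked up in the workspace, so every ped group receives the value computed from all ten rows. Below is a minimal base R sketch of a function that works only on the piece it is given. The toy data and the helper name rt_for_piece are hypothetical, and computing vydv within each group (rather than over the whole data set) is an assumption about the intended definition.

```r
# Toy data with the same layout as qtd: group, squared deviation, one score column.
toy <- data.frame(
  ped = rep(1:3, c(4, 3, 3)),
  ydv = c(9.5, 0.3, 1.7, 6.3, 12.9, 9.6, 0.9, 3.2, 0.9, 1.1),
  X1  = c(0, -2, 0, 0, 0, -2, -2, -2, -2, 0)
)

# Hypothetical helper: reads only the columns of the piece passed to it,
# never the workspace copies of ydv or qt.
rt_for_piece <- function(piece) {
  vydv <- sum(piece$ydv) * 0.25
  sumD <- sum(piece$ydv * piece$X1)
  vydv / sumD
}

# split() cuts the data frame by ped; sapply() applies the helper to each piece.
rt_by_ped <- sapply(split(toy, toy$ped), rt_for_piece)
print(rt_by_ped)
```

With plyr loaded, ddply(toy, .(ped), rt_for_piece) should give the same three values, because ddply passes each subset of the data frame to the function as its first argument.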
Re: [R] problem applying a function to data subsets (with a level) - plyr or other alternatives are also welcome
Dear R experts:

Thank you Dennis and David.

As David indicated, sorry for my language. I have tried to explain what I intend to do, working with Dennis's solution code:

> ped <- rep(1:3, c(4, 3, 3))
> y <- rnorm(10, 8, 2)
> # This replaces all of your sample() statements, and is equivalent:
> smat <- matrix(sample(1:3, 120, replace = TRUE), ncol = 12)
> colnames(smat) <- c('M1a', 'M1b', 'M1aP1', 'M1bP2',
>                     'M2a', 'M2b', 'M2aP1', 'M2bP2',
>                     'M3a', 'M3b', 'M3aP1', 'M3bP2')
> mydf <- as.data.frame(cbind(ped, y, smat))

mmat <- matrix(c("M1a", "M2a", "M3a", "M1b", "M2b", "M3b",
                 "M1aP1", "M2aP1", "M3aP1", "M1bP2", "M2bP2", "M3bP2"), ncol = 4)

     [,1]  [,2]  [,3]    [,4]
[1,] "M1a" "M1b" "M1aP1" "M1bP2"
[2,] "M2a" "M2b" "M2aP1" "M2bP2"
[3,] "M3a" "M3b" "M3aP1" "M3bP2"

I want to compare the mydf columns named in [,1] and [,3] (mydf[x[1]] == mydf[x[3]]), and likewise those named in [,2] and [,4], for all three rows of mmat. mmat tells me which variables to pick while working on mydf. In my real dataset I have 1000 such sets of variables.

# first function
myfun <- function(x) {
  x <- as.vector(x)
  ot1 <- ifelse(mydf[x[1]] == mydf[x[3]], 1, -1)
  ot2 <- ifelse(mydf[x[2]] == mydf[x[4]], 1, -1)
  qt <- ot1 + ot2
  return(qt)
}
qt <- apply(mmat, 1, myfun)

This creates a matrix with one row per row of mydf and one column per set of variables:

      [,1] [,2] [,3]
 [1,]    0   -2    0
 [2,]   -2    0   -2
 [3,]    0   -2    0
 [4,]    0    0    2
 [5,]    0   -2   -2
 [6,]   -2    0   -2
 [7,]   -2   -2    0
 [8,]   -2    0    0
 [9,]   -2    0    2
[10,]    0    0    0

ydv <- c((y - mean(y))^2)  # squared deviation of each y value from the mean of y

 [1]  9.5012525  0.2578341  1.6676271  6.3102202 12.8701830  9.5509480
 [7]  0.8661107  3.1828185  0.9215140  1.0909813

qtd <- data.frame(ped, ydv, qt)  # new data.frame: ped, ydv, and the output of the first function

   ped        ydv X1 X2 X3
1    1  9.5012525  0 -2  0
2    1  0.2578341 -2  0 -2
3    1  1.6676271  0 -2  0
4    1  6.3102202  0  0  2
5    2 12.8701830  0 -2 -2
6    2  9.5509480 -2  0 -2
7    2  0.8661107 -2 -2  0
8    3  3.1828185 -2  0  0
9    3  0.9215140 -2  0  2
10   3  1.0909813  0  0  0

Now I want to calculate Rt for each of X1, X2, X3 (in the real data I will have 1000 of them). The expected result of the following function should look like a 3 x 3 matrix. This is just an example; in the real data I have around 200 levels of ped and around 1000 X variables.

# Rt values
Ped   X1   X2   X3
1
2
3

# second function
myfun2 <- function(dataframe) {
  vydv <- sum(ydv) * 0.25
  sumD <- sum(ydv * qt)
  Rt <- vydv / sumD
  return(Rt)
}

# using plyr
require(plyr)
dfsumd1 <- ddply(mydf, .(mydf$ped), myfun2)
dfsumd1

  mydf$ped         V1
1        1 -0.1047935
2        2 -0.1047935
3        3 -0.1047935

This is not what I want. I want ped-wise Rt values for each of the X variables in the qtd data frame above.

# Rt values
Ped   X1   X2   X3
1
2
3

Then I can sum Ped$X1, Ped$X2, Ped$X3. The idea is to calculate separate Rt values for each variable, grouped by the ped variable, and then add the values.

Thank you so much for your time. I hope I have made it clear now.

Maya
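The first function above can also be written so that it takes the data frame as an argument and compares plain vectors (via [[ ]]) rather than one-column data frames, which may be easier to reuse with 1000 variable sets. This is only a sketch: the helper name score_set and the X1, X2, X3 column labels are hypothetical, and the sampled values differ from those printed in this message because the seed is different.

```r
set.seed(42)
ped  <- rep(1:3, c(4, 3, 3))
smat <- matrix(sample(1:3, 120, replace = TRUE), ncol = 12)
colnames(smat) <- c("M1a", "M1b", "M1aP1", "M1bP2",
                    "M2a", "M2b", "M2aP1", "M2bP2",
                    "M3a", "M3b", "M3aP1", "M3bP2")
mydf <- data.frame(ped, smat)

# One row per variable set, in the order a, b, aP1, bP2 (same layout as mmat).
mmat <- matrix(colnames(smat), ncol = 4, byrow = TRUE)

# Hypothetical helper: +1 for a match and -1 for a mismatch in each of the
# two comparisons, summed, so every entry is -2, 0, or 2.
score_set <- function(vars, data) {
  ot1 <- ifelse(data[[vars[1]]] == data[[vars[3]]], 1, -1)
  ot2 <- ifelse(data[[vars[2]]] == data[[vars[4]]], 1, -1)
  ot1 + ot2
}

# apply() walks over the rows of mmat; each call returns one column of qt.
qt <- apply(mmat, 1, score_set, data = mydf)
colnames(qt) <- paste0("X", seq_len(ncol(qt)))
print(qt)
```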
Re: [R] problem applying a function to data subsets (with a level) - plyr or other alternatives are also welcome
Let me clarify the last part:

qt2 <- qtd[-1:-2]

# second function
myfun2 <- function(x) {
  vydv <- sum(ydv) * 0.25
  sumD <- sum(ydv * x)
  Rt <- vydv / sumD
  return(Rt)
}

# ignoring grouping with ped, the following is output
qt2 <- apply(qtd, 2, myfun2)
qtd1 <- qt2[-1:-2]

# The following result seems to pool over X1, X2, and X3
require(plyr)
dfsumd1 <- ddply(qtd, .(qtd$ped), myfun2)

  qtd$ped         V1
1       1  0.2296159
2       2  0.1045569
3       3 -5.8861942

But I need a two-way table:

Ped   X1   X2   X3
1
2
3

Still unsuccessful, sorry!

On Sat, Sep 3, 2011 at 7:57 AM, Maya Joshi wrote:
> [...]
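One way to obtain the requested two-way table in base R is to split qtd by ped and, within each piece, compute Rt for every X column. The sketch below uses the qtd values printed earlier in this thread. The helper name rt_piece is hypothetical, and taking vydv from the rows of each ped group (instead of from all rows) is an assumption about the intended formula.

```r
# qtd as printed in the thread
qtd <- data.frame(
  ped = rep(1:3, c(4, 3, 3)),
  ydv = c(9.5012525, 0.2578341, 1.6676271, 6.3102202, 12.8701830,
          9.5509480, 0.8661107, 3.1828185, 0.9215140, 1.0909813),
  X1 = c(0, -2, 0, 0, 0, -2, -2, -2, -2, 0),
  X2 = c(-2, 0, -2, 0, -2, 0, -2, 0, 0, 0),
  X3 = c(0, -2, 0, 2, -2, -2, 0, 0, 2, 0)
)

# Every column except ped and ydv is treated as a score column.
xcols <- setdiff(names(qtd), c("ped", "ydv"))

# Hypothetical helper: Rt for each score column, using only the rows of one group.
rt_piece <- function(piece) {
  vydv <- sum(piece$ydv) * 0.25
  sapply(piece[xcols], function(x) vydv / sum(piece$ydv * x))
}

# sapply() returns one column per ped level; t() puts ped in the rows.
rt_table <- t(sapply(split(qtd, qtd$ped), rt_piece))
print(rt_table)
```

Note that a group whose score column is all zeros has a zero denominator, so its Rt is Inf (here ped 3, X2); such cells would need to be handled before summing across columns.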