I have no idea what your code is doing, nor why you want correlated binary
variables. Correlation makes little or no sense in the context of binary random
variables --- or more generally in the context of discrete random variables.

Be that as it may, it is an easy calculation to show that if X and Y are binary
random variables both with success probability of 0.5 then cor(X,Y) = 0.2 if
and only if Pr(X=1 | Y = 1) = 0.6. So just generate X and Y using that fact:

set.seed(42)
X <- numeric(1000)
Y <- numeric(1000)
for(i in 1:1000) {
   Y[i] <- rbinom(1,1,0.5)
   X[i] <- if(Y[i]==1) rbinom(1,1,0.6) else rbinom(1,1,0.4)
}

# Check:
cor(X,Y) # Get 0.2012336

Looks about right.  Note that the sample proportions are 0.484 and
0.485 for X and Y respectively.  These values do not differ significantly
from 0.5.

    cheers,

            Rolf Turner

On 28/09/12 08:26, Simon Zehnder wrote:
Hi R-fellows,

I am trying to simulate a multivariate correlated sample via the Gaussian 
copula method. One variable is a binary variable, that should be 
autocorrelated. The autocorrelation should be rho = 0.2. Furthermore, the 
overall probability to get either outcome of the binary variable should be 0.5.
Below you can see the R code (I use for simplicity a diagonal matrix in rmvnorm 
even if it produces no correlated sample):

"sampleCop" <- function(n = 1000, rho = 0.2) {
        
        require(splus2R)
        mvrs <- rmvnorm(n + 1, mean = rep(0, 3), cov = diag(3))
        pmvrs <- pnorm(mvrs, 0, 1)
        var1 <- matrix(0, nrow = n + 1, ncol = 1)
        var1[1] <- qbinom(pmvrs[1, 1], 1, 0.5)
        if(var1[1] == 0) var1[nrow(mvrs)] <- -1
        for(i in  1:(nrow(pmvrs) - 1)) {
                if(pmvrs[i + 1, 1] <= rho) var1[i + 1] <- var1[i]
                else var1[i + 1] <- var1[i] * (-1)
        }
        sample <- matrix(0, nrow = n, ncol = 4)
        sample[, 1] <- var1[1:nrow(var1) - 1]
        sample[, 2] <- var1[2:nrow(var1)]
        sample[, 3] <- qnorm(pmvrs[1:nrow(var1) - 1, 2], 0, 1, 1, 0)
        sample[, 4] <- qnorm(pmvrs[1:nrow(var1) - 1, 3], 0, 1, 1, 0)
        
        sample
        
}

Now, the code is fine, everything compiles. But when I compute the 
autocorrelation of the binary variable, it is not 0.2, but 0.6. Does anyone 
know why this happens?

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to