Hi Shane, it sounds to me as though you have a fairly well-defined problem. You want to generate random numbers with a specific mean, variance, and correlation with another random varaible. I would reverse-enginerr the fuinctions for simple linear regression to get a result like
y = beta_0 + beta_1 * x + rnorm(n, 0, sigma^2) and use that as the basis of generating random numbers. Not sure how to interpret the second question ... Cheers Andrew On Sun, May 01, 2011 at 12:33:41AM -0400, Shane Phillips wrote: > I have the following script for generating a dataset. It works like a champ > except for a couple of things. > > 1. I need the variables "itbs" and "map" to be negatively correlated with > the binomial variable "lunch" (around -0.21 and -0.24, respectively). The > binomial variable "lunch" needs to remain unchanged. > 2. While my generated variables do come out with the desired means and > correlations, the distribution is very narrow and only represents a small > portion of the possible scores. Can I force it to encompass a wider range of > scores, while maintaining my desired parameters and correlations? > > Please help... > > Shane > > Script follows... > > > > #Number the subjects > subject=1:1000 > #Assign a treatment condition from a binomial distribution with a probability > of 0.13 > treat=rbinom(1*1000,1,.13) > #Assign a lunch status condition froma binomial distribution with a > probability of 0.35 > lunch=rbinom(1*1000,1,.35) > #Generate age in months from a random normal distribution with mean of 87 and > sd of 2 > age=rnorm(1000,87,2) > #invoke the MASS package > require(MASS) > #Establish the covariance matrix for MAP, ITBS and CogAT scores > sigma <- matrix(c(1, 0.84, 0.59, 0.84, 1, 0.56, 0.59, 0.56, 1), ncol = 3) > #Establish MAP as a random normal variable with mean of 200 and sd of 9 > map <- rnorm(1000, 200, 9) > #Establish ITBS as a random normal variable with mean of 175 and sd of 15 > itbs <- rnorm(1000, 175, 15) > #Establish CogAT as a random normal variable with mean of 100 and sd of 16 > cogat<-rnorm(1000,100,16) > #Create a dataframe of MAP, ITBS, and CogAT > data <- data.frame(map, itbs, cogat) > #Draw from the multivariate distribution defined by MAP, ITBS, and CogAT > means and the covariance matrix > sim <- mvrnorm(1000, mu=mean(data), sigma, empirical=FALSE) > #Set growth at 0 > growth=0 > #Combine elements into a single dataset > simtest=data.frame (subject=subject, treat=treat,lunch, > age=round(age,0),round(sim,0),growth) > #Set mean growth by treatment condition with treatd subjects having a mean > growth of 1.5 and non-treated having a mean growth of 0.1 > simtest<-transform(simtest, growth=rnorm(1000,m=ifelse(treat==0,0.1,1.5),s=1)) > simtest > cor (simtest) > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. -- Andrew Robinson Program Manager, ACERA Department of Mathematics and Statistics Tel: +61-3-8344-6410 University of Melbourne, VIC 3010 Australia (prefer email) http://www.ms.unimelb.edu.au/~andrewpr Fax: +61-3-8344-4599 http://www.acera.unimelb.edu.au/ Forest Analytics with R (Springer, 2011) http://www.ms.unimelb.edu.au/FAwR/ Introduction to Scientific Programming and Simulation using R (CRC, 2009): http://www.ms.unimelb.edu.au/spuRs/ ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.