This makes sense. Guess I should have put a pencil to it. Further investigation revealed that it is indeed a possibility that the relation between x and y is nonlinear:
ax+bx^2+c where a, b and c are to be determined. My question is how to code this in my simulated data. I could do something like this after appropriately defining beta. meanpred and varpred: x[,2]<-rnorm(length,meanpred[2],sqrt(varpred[2])) x[,3]<-rnorm(length,meanpred[3],sqrt(varpred[3])) fixpart<-x%*%beta binomprob<-exp(fixpart)/(1+exp(fixpart)) data$y<-rbinom(n1,1,binomprob) but I'd need to square my x[,3] values before multiplying them by beta. Can I say: x[,3]<-(rnorm(length,meanpred[3],sqrt(varpred[3])))^2 in lieu of x[,3]<-rnorm(length,meanpred[3],sqrt(varpred[3]))? ----- Original Message ----- From: peter dalgaard <pda...@gmail.com> To: Scott Raynaud <scott.rayn...@yahoo.com> Cc: "r-help@r-project.org" <r-help@r-project.org> Sent: Tuesday, December 27, 2011 9:15 AM Subject: Re: [R] rbinom On Dec 27, 2011, at 15:47 , Scott Raynaud wrote: > I have the following code (which I did not write) that generates > data based on a logistic model. I'm only getting a single record > with y=1. It seems implausible that in 50k cases that have a > single y=1. Does that ring alarm bells for anyone else? > Not really. As far as I can tell, "fixpart" is roughly -10.5 (= -1.5 - .25*36), so binomprob is around 2.75e-5, which - nonlinearity notwithstanding - suggests that the expected number of positives out of 50K is something like 1.4. To do this more precisely, just compute and print sum(binomprob) in the code you gave. > beta<-c(-1.585600,-0.246900) > betasize<-length(beta) > meanpred<-c(0,35.900000) > varpred<-c(0,1.000000) > #loop code > x<-matrix(1,length,betasize) #length set to 50k > #loop code > x[,2]<-rnorm(length,meanpred[2],sqrt(varpred[2])) #length set to 50k > fixpart<-x%*%beta > binomprob<-exp(fixpart)/(1+exp(fixpart)) > data$y<-rbinom(n1,1,binomprob) > #more loop code > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. -- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd....@cbs.dk Priv: pda...@gmail.com ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.