@Duncan, you make a very good point. Somehow I overlooked that 0 is not positive. I guess that rules out the lognormal model.
My challenge here is finding the right model for this data. Originally it was a simple count of students, which was relatively easy to model with a zero-inflated Poisson model, and the resulting residuals seemed reasonable. However, I was then instructed to change the count of students to a "rate", calculated as students / population (each school has its own population). This is no longer a count variable, but a proportion between 0 and 1.

This "rate" (students/population) is no longer Poisson, but it is certainly not normal either, so I'm a bit lost as to the appropriate distribution to represent it.

Any thoughts?

--
Noah Silverman, M.S.
UCLA Department of Statistics
8117 Math Sciences Building
Los Angeles, CA 90095

On Apr 16, 2013, at 12:44 PM, Thomas Lumley <tlum...@uw.edu> wrote:

> On Wed, Apr 17, 2013 at 5:19 AM, Noah Silverman <noahsilver...@ucla.edu>
> wrote:
>
> Hi,
>
> I have some data, that when plotted looks very close to a log-normal
> distribution. My goal is to build a regression model to test how this
> variable responds to several independent variables.
>
> [snip]
>
> When I try to build a simple model, I also get an error:
>
> l <- glm(y~ x, family=gaussian(link="log"))
>
> Error in eval(expr, envir, enclos) : cannot find valid starting values:
> please specify some
>
>
> Duncan has described the problems with the lognormal. I will just point out
> that this 'simple model' is not lognormal. It is a model with normal errors
> and log link, ie.
>
> y ~ N(mu, sigma^2)
> log(mu) = x \beta
>
>
> -thomas
>
> --
> Thomas Lumley
> Professor of Biostatistics
> University of Auckland
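
For readers following the thread, here is a minimal sketch of the distinction Thomas draws between a lognormal regression and a Gaussian model with a log link. It also shows how a zero in the response can trigger the "cannot find valid starting values" error quoted above. The data are simulated and the names (`x`, `y`, `y0`, `fit_lognormal`, `fit_loglink`) are illustrative assumptions, not the original poster's data or code.

```r
# Simulated data; all values and names here are illustrative assumptions.
set.seed(1)
x <- runif(200)
y <- exp(0.5 + 1.2 * x + rnorm(200, sd = 0.3))  # strictly positive response

# Lognormal regression: log(y) ~ N(x beta, sigma^2)
fit_lognormal <- lm(log(y) ~ x)

# Normal errors with a log link: y ~ N(mu, sigma^2), log(mu) = x beta
fit_loglink <- glm(y ~ x, family = gaussian(link = "log"))

# With a zero in the response, the log-link Gaussian fit cannot choose
# default starting values and stops with an error; capture its message.
y0 <- c(0, y[-1])
res <- tryCatch(
  glm(y0 ~ x, family = gaussian(link = "log")),
  error = function(e) conditionMessage(e)
)
print(res)
```

The two fits answer different questions: the first models the mean of log(y), while the second models the log of the mean of y, so their coefficients need not agree even on the same data.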