G'day all,

On Tue, 30 Mar 2010 16:19:46 +0100 Corrado <ct...@york.ac.uk> wrote:
> David Winsemius wrote:
> > A) It is not an error, only a warning. Wouldn't it seem reasonable
> > to issue such a warning if you have data that violates the
> > distributional assumptions?
> I am not questioning the approach. I am only trying to understand why
> a (rather expensive) source of documentation and the behaviour of a
> function are not aligned.

1) Expensive books have typos in them too.

2) glm() is from a package that is part of R, and the author of this
   book is AFAIK not a member of R core, hence has no control over
   whether his documentation and the behaviour of a function are
   aligned.

   a) If he were documenting a function that was part of a package he
      wrote as support for his book, as some authors do, there might
      be a reason to complain. But then 1) would still apply.

   b) Even books written by members of R core occasionally have
      misalignments between the behaviour of a function and the
      documentation contained in such books. This can be due to them
      documenting a function over whose implementation they do not
      have control (e.g. a function in a contributed package), or to
      the fact that R is improving/changing from version to version
      while books are rather static.

For these reasons it is always worthwhile to check the errata page for
a book, if one exists.

The warning arises because you do not provide all the necessary
information about your response. If your response is binomial (with a
mean depending on some explanatory variables), then each response
consists of two numbers, the number of trials and the number of
successes. If you calculate the observed proportion of successes from
these two numbers and feed only this into glm() as the response, you
are omitting necessary information. In this case, you should provide
the number of trials on which each proportion is based as prior
weights.

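The same information can also be passed to glm() as a two-column
response of successes and failures instead of proportions plus
weights. A minimal sketch, assuming simulated data (the names n, s,
fit_w and fit_m and the seed are made up for illustration), that
checks the two forms give the same fit:

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

x <- seq(from = -1, to = 1, length.out = 41)
n <- sample(8:12, 41, replace = TRUE)        # number of trials per observation
s <- rbinom(41, size = n, prob = plogis(x))  # number of successes per observation

# Form 1: observed proportions as response, number of trials as prior weights
fit_w <- glm(s / n ~ x, family = binomial, weights = n)

# Form 2: two-column response, successes first and failures second
fit_m <- glm(cbind(s, n - s) ~ x, family = binomial)

# Both calls carry the same information, so the fits should agree
all.equal(coef(fit_w), coef(fit_m))
all.equal(deviance(fit_w), deviance(fit_m))
```

Either form tells glm() how many trials lie behind each proportion,
which is exactly what is missing when the bare proportions are used.
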
For example:

R> x <- seq(from=-1,to=1,length=41)
R> px <- exp(x)/(1+exp(x))
R> nn <- sample(8:12, 41, replace=TRUE)
R> yy <- rbinom(41, size=nn, prob=px)
R> y <- yy/nn
R> glm(y~x, family=binomial, weights=nn)

Call:  glm(formula = y ~ x, family = binomial, weights = nn)

Coefficients:
(Intercept)            x
      0.246        1.124

Degrees of Freedom: 40 Total (i.e. Null);  39 Residual
Null Deviance:      91.49
Residual Deviance: 50.83        AIC: 157.6

R> glm(y~x, family=binomial)

Call:  glm(formula = y ~ x, family = binomial)

Coefficients:
(Intercept)            x
     0.2143       1.1152

Degrees of Freedom: 40 Total (i.e. Null);  39 Residual
Null Deviance:      9.256
Residual Deviance: 5.229        AIC: 49.87
Warning message:
In eval(expr, envir, enclos) :
  non-integer #successes in a binomial glm!

HTH,

Cheers,

        Berwin

========================== Full address ============================
Berwin A Turlach                      Tel.: +61 (8) 6488 3338 (secr)
School of Maths and Stats (M019)            +61 (8) 6488 3383 (self)
The University of Western Australia   FAX : +61 (8) 6488 1028
35 Stirling Highway
Crawley WA 6009                e-mail: ber...@maths.uwa.edu.au
Australia                        http://www.maths.uwa.edu.au/~berwin