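You are not creating your data set properly. In your matrix() call only the first unnamed vector is matched to 'data'; the second is matched to 'byrow', so the x values never enter the matrix and the y values are recycled row-wise. Your 'mat' is: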

> mat
   column1 column2
1        1       0
2        1       0
3        0       1
4        0       0
5        1       1
6        1       0
7        1       0
8        0       1
9        0       0
10       1       1


What you really want is:

DF <- data.frame(y = c(1,0,1,0,0,1,0,0,1,1),
                 x = c(5,4,1,6,3,6,5,3,7,9))

> DF
   y x
1  1 5
2  0 4
3  1 1
4  0 6
5  0 3
6  1 6
7  0 5
8  0 3
9  1 7
10 1 9


MOD <- glm(y ~ x, data = DF, family = binomial)

> summary(MOD)

Call:
glm(formula = y ~ x, family = binomial, data = DF)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.3353  -1.0229  -0.1239   0.9956   1.7477

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -1.6118     1.7833  -0.904    0.366
x             0.3293     0.3383   0.973    0.330

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 13.863  on 9  degrees of freedom
Residual deviance: 12.767  on 8  degrees of freedom
AIC: 16.767

Number of Fisher Scoring iterations: 4


HTH,

Marc Schwartz


On Nov 12, 2010, at 12:56 PM, Benjamin Godlove wrote:

> I think it is likely I am missing something.  Here is a very simple example:
>
> R code:
>
> mat <- matrix(nrow = 10, ncol = 2, c(1,0,1,0,0,1,0,0,1,1),
> c(5,4,1,6,3,6,5,3,7,9), dimnames = list(c(1,2,3,4,5,6,7,8,9,10),
> c("column1","column2")))
>
> g <- glm(mat[1:10] ~ mat[11:20], family = binomial (link = logit))
>
> g$converged
>
>
> SAS code:
>
> data mat;
> input col1 col2;
> datalines;
> 1 5
> 0 4
> 1 1
> 0 6
> 0 3
> 1 6
> 0 5
> 0 3
> 1 7
> 1 9
> ;
>
> proc logistic data=mat descending;
> model col1 = col2 / link=logit;
> run;
>
> SAS output (in case you don't have access to SAS):
> Convergence criterion satisfied
>
>             Estimate      SE
> Intercept    -1.6118  1.7833
> col2          0.3293  0.3383
>
>
> Of course, with an example this small, it is not so surprising that the two
> methods differ; and they hardly differ by a single S.  But as the datasets
> get larger, the difference is more pronounced.  Let me know if you would
> like me to send you a large dataset.  I get the feeling I am doing something
> wrong in R, so please let me know what you think.
>
> Thank you!
>
> Ben Godlove
>
> On Thu, Nov 11, 2010 at 1:59 PM, Albyn Jones <jo...@reed.edu> wrote:
>
>> do you have factors (categorical variables) in the model?  it could be
>> just a parameterization difference.
>>
>> albyn
>>
>> On Thu, Nov 11, 2010 at 12:41:03PM -0500, Benjamin Godlove wrote:
>>> Dear R developers,
>>>
>>> I have noticed a discrepancy between the coefficients returned by R's glm()
>>> for logistic regression and SAS's PROC LOGISTIC.  I am using dist = binomial
>>> and link = logit for both R and SAS.  I believe R uses IRLS whereas SAS uses
>>> Fisher's scoring, but the difference is something like 100 SE on the
>>> intercept.  What accounts for such a huge difference?
>>>
>>> Thank you for your time.
>>>
>>> Ben Godlove
>>>
>>>       [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>> --
>> Albyn Jones
>> Reed College
>> jo...@reed.edu
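
A note on the matrix route, for anyone who would rather keep the data in a matrix than in a data frame. The following is only a minimal sketch and not code from the thread: it assumes the two vectors are bound with cbind(), which takes every unnamed argument as a column and so cannot hand the second vector to some other argument of matrix(). The names y, x and fit are illustrative. If the data are entered this way, the fit should agree with the estimates in the summary(MOD) output and the SAS output quoted above (intercept -1.6118, slope 0.3293).

```r
# Response and predictor, as listed in the SAS datalines
y <- c(1, 0, 1, 0, 0, 1, 0, 0, 1, 1)
x <- c(5, 4, 1, 6, 3, 6, 5, 3, 7, 9)

# cbind() treats each unnamed argument as a column, so both vectors
# end up in the matrix, one per column
mat <- cbind(column1 = y, column2 = x)

# Refer to columns by name rather than by linear index (mat[1:10], mat[11:20]),
# so the model does not depend on how the matrix happens to be laid out
fit <- glm(mat[, "column1"] ~ mat[, "column2"],
           family = binomial(link = "logit"))

print(fit$converged)
print(round(unname(coef(fit)), 4))
```

The data frame version shown in the reply is still the more conventional choice, since it gives the coefficients readable names and works with predict() on new data.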