
I'm trying to fit a binary logistic regression on 10 covariates, comparing glm with lrm from Harrell's Design package. The two don't seem to agree on whether the data are collinear:

> library(Design)
> load(url("http://www.csse.unimelb.edu.au/~gabraham/data.Rdata"))
> lrm(y ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X10, data=x)
singular information matrix in lrm.fit (rank= 10 ).  Offending variable(s):
Error in j:(j + params[i] - 1) : NA/NaN argument

If I understand correctly, lrm is complaining about collinearity in the data. However, the rank of the matrix is 10:
> qr(x)$rank
[1] 10
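
A related check, sketched here on simulated data rather than the data above: qr(x) looks at the data frame as stored, which is not necessarily the matrix the model is fitted on, since the design matrix for this formula also carries an intercept column. The data frame `d`, its size, and the deliberately collinear copy `d2` are assumptions made for illustration only.

```r
# Simulated stand-in for the real data frame (names mirror the post,
# values are random and unrelated to the original data).
set.seed(1)
n <- 28
d <- as.data.frame(matrix(rnorm(n * 10), n, 10))
names(d) <- paste0("X", 1:10)
d$y <- rbinom(n, 1, 0.5)

# Design matrix as the fitting functions see it: intercept plus 10 covariates.
M <- model.matrix(y ~ ., data = d)
ncol(M)          # 11 columns
qr(M)$rank       # full rank means this equals ncol(M)

# Introduce an exact linear dependence to see what rank deficiency looks like.
d2 <- d
d2$X10 <- d2$X1 + d2$X2
M2 <- model.matrix(y ~ ., data = d2)
qr(M2)$rank      # one less than ncol(M2)
```

Comparing qr(M)$rank with ncol(M) is what indicates exact collinearity; a rank on its own says little without the number of columns it is measured against.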

glm doesn't seem to care about the supposed collinearity, but its warning suggests that the data are perfectly separable:

> glm(y ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X10, data=x,
+    family=binomial(), control=glm.control(maxit=50))

Call: glm(formula = y ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X10, family = binomial(), data = x, control = glm.control(maxit = 50))

(Intercept)           X1           X2           X3           X4           X5
 -6.921e+03    7.185e-02    4.344e-02   -3.980e-02   -5.362e-02   -6.387e-03
         X6           X7           X8           X9          X10
  2.455e-01    2.753e-02   -1.848e-01    1.903e-01   -3.187e-02

Degrees of Freedom: 27 Total (i.e. Null);  17 Residual
Null Deviance:      38.82
Residual Deviance: 4.266e-10    AIC: 22
Warning message:
In glm.fit(x = X, y = Y, weights = weights, start = start, etastart = etastart, :
  fitted probabilities numerically 0 or 1 occurred
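
For reference, the same warning and a near-zero residual deviance can be reproduced on a tiny example. This is only a sketch on toy data (the data frame `toy` and its values are made up, not taken from the data above), assuming that perfect separation is what triggers the warning here.

```r
# Toy data, not the data from the post: y is 1 exactly when x > 0,
# so a single threshold on x separates the two classes perfectly.
toy <- data.frame(x = c(-3, -2, -1, 1, 2, 3))
toy$y <- as.integer(toy$x > 0)

# Warnings are suppressed here only to keep the output short; without
# suppressWarnings() glm reports fitted probabilities numerically 0 or 1.
fit <- suppressWarnings(
  glm(y ~ x, data = toy, family = binomial(),
      control = glm.control(maxit = 50))
)

# Under perfect separation the fitted probabilities collapse towards 0 or 1
# and the residual deviance becomes essentially zero.
p <- fitted(fit)
range(p[toy$y == 0])
range(p[toy$y == 1])
deviance(fit)

# Separation check on the linear predictor: every case with y == 1
# scores higher than every case with y == 0.
eta <- predict(fit, type = "link")
separated <- min(eta[toy$y == 1]) > max(eta[toy$y == 0])
separated
```

When `separated` is TRUE the maximum likelihood estimates do not exist as finite values, so the reported coefficients depend on where the iterations happen to stop.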

What's the reason for this discrepancy?


Gad Abraham
Dept. CSSE and NICTA
The University of Melbourne
Parkville 3010, Victoria, Australia
web: http://www.csse.unimelb.edu.au/~gabraham

R-help@r-project.org mailing list
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
