Hi,
I'm trying to fit a binary logistic regression with 10 covariates, comparing
glm to lrm from Harrell's Design package. They don't seem to agree on
whether the data are collinear:
> library(Design)
> load(url("http://www.csse.unimelb.edu.au/~gabraham/data.Rdata"))
> lrm(y ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X10, data=x)
singular information matrix in lrm.fit (rank= 10 ). Offending variable(s):
X10
Error in j:(j + params[i] - 1) : NA/NaN argument
If I understand correctly, lrm is complaining about collinearity in the
data. However, the rank of the matrix is 10:
> qr(x)$rank
[1] 10
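For comparison, here is a minimal sketch of the same kind of rank check on the full design matrix. It uses made-up data (the data frame d and its columns a, b, c are hypothetical, not the data above), and it assumes that the matrix that matters to the fitting routine is the model matrix including the intercept column:

```r
# Made-up data, not the data from this post: column c is an exact
# linear combination of a and b, so the design is collinear by construction.
set.seed(1)
d <- data.frame(a = rnorm(20), b = rnorm(20))
d$c <- d$a + d$b

# Design matrix as a model-fitting function would build it, including the
# "(Intercept)" column that qr() on the raw covariates leaves out.
X_design <- model.matrix(~ a + b + c, data = d)

rank_design <- qr(X_design)$rank
n_params <- ncol(X_design)

# A rank below the number of columns means the design is rank deficient.
cat("rank:", rank_design, " columns:", n_params, "\n")
```

If the same comparison on the real data gave a rank equal to the number of columns, exact collinearity would seem an unlikely explanation for the lrm message.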
glm doesn't seem to care about the supposed collinearity, but its warning
(fitted probabilities numerically 0 or 1) suggests that the data are
perfectly separable:
> glm(y ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X10, data=x,
+ family=binomial(), control=glm.control(maxit=50))
Call: glm(formula = y ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 +
X10, family = binomial(), data = x, control = glm.control(maxit = 50))
Coefficients:
(Intercept)           X1           X2           X3           X4           X5
 -6.921e+03    7.185e-02    4.344e-02   -3.980e-02   -5.362e-02   -6.387e-03
         X6           X7           X8           X9          X10
  2.455e-01    2.753e-02   -1.848e-01    1.903e-01   -3.187e-02
Degrees of Freedom: 27 Total (i.e. Null); 17 Residual
Null Deviance: 38.82
Residual Deviance: 4.266e-10 AIC: 22
Warning message:
In glm.fit(x = X, y = Y, weights = weights, start = start, etastart = etastart, :
  fitted probabilities numerically 0 or 1 occurred
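To show what I mean by perfectly separable, here is a small sketch on made-up data (z, y_sim and fit_sim are hypothetical names, not the data above). The outcome is fully determined by the sign of a single covariate, and glm behaves much as it does in the output above: a warning about fitted probabilities, and a residual deviance that is essentially zero.

```r
# Made-up, perfectly separable data: y_sim is 1 exactly when z > 0.
z <- seq(-3, 3, length.out = 28)
y_sim <- as.integer(z > 0)

# Collect any warnings raised during fitting instead of printing them.
seen <- character(0)
fit_sim <- withCallingHandlers(
  glm(y_sim ~ z, family = binomial(), control = glm.control(maxit = 50)),
  warning = function(w) {
    seen <<- c(seen, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

print(seen)               # warnings emitted by glm.fit
print(deviance(fit_sim))  # residual deviance of the separable fit
print(coef(fit_sim))      # slope estimate grows very large
```

This is only an illustration of the symptom; it does not by itself show that separation is what lrm is reacting to.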
What's the reason for this discrepancy?
Thanks,
Gad
--
Gad Abraham
Dept. CSSE and NICTA
The University of Melbourne
Parkville 3010, Victoria, Australia
email: [EMAIL PROTECTED]
web: http://www.csse.unimelb.edu.au/~gabraham