Larry, You found a data set that kills coxph. I'll have to think about what to do since on the one hand it's your own fault for trying to fit a very bad model, and on the other I'd like the routine to give a nice error message before it dies.
In the data set you sent me the predictor variable is very skewed: > quantile(anomaly1$CREAT, c(0, .5, .9, .999, 1)) 0% 50% 90% 99.9% 100% 0.5000 1.4000 2.2000 4.6027 7.3000 There are 1000 observations, the largest is 7.3 and the second largest is 4.6. The subject with a 7.3 was a failure. When you fit a 4 degree of freedom polynomial to this there is enough "space" between the two largest obvservations for the fit to turn nearly straight up and give a probability of failure of approx 1.0 to this point. This is the actual, true maximum likelihood solution, so the routine is doing the right thing to try and attain it. At iteration 1 the linear predictor from the Cox model has the following distribution. >quantile(lp, c(0, .5, .9, .999, 1)) 0% 50% 90% 99.9% 100% -1.65321482 -1.19164206 -0.08964025 31.05389866 900.67627778 Unfortunately exp(900) = Inf, and that pretty much does in my C routine right there; within another iteration there are NA values littered about and the return value for the coefficients is all NA. If instead you fit a smoothing spline to the data coxph(Surv(TTFAIL, FFAIL) ~ pspline(CREAT6MO, df=4), data=anomaly1) the fitted response is essentially linear from 0 to 5 before turning upward, but it never gets outrageous due to the shrinkage towards linearity. Catching this type of issue with an extremely skewed variable (creatinine^4) would be very hard to do reliably. The primary message is that polynomials and biology are a bad mix. Terry Therneau ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.