Terry Therneau-2 wrote: > > This query of "why do SAS and S give different answers for Cox models" > comes > up every so often. The two most common reasons are that > a. they are using different options for the ties > b. the SAS and S data sets are slightly different. > You have both errors. > > First, make sure I have the same data set by reading a common file, and > then > compare the results. > > tmt54% more sdata.txt > 1 0.0 0.5 0 0 > 1 0.5 3.0 1 1 > 2 0.0 1.0 0 0 > 2 1.0 1.5 1 1 > 3 0.0 6.0 0 0 > 4 0.0 8.0 0 1 > 5 0.0 1.0 0 0 > 5 1.0 8.0 1 0 > 6 0.0 21.0 0 1 > 7 0.0 3.0 0 0 > 7 3.0 11.0 1 1 > > tmt55% more test.sas > options linesize=80; > > data trythis; > infile 'sdata.txt'; > input id start end delir outcome; > > proc phreg data=trythis; > model (start, end)*outcome(0)=delir/ ties=discrete; > > proc phreg data=trythis; > model (start, end)*outcome(0)=delir/ ties=efron; > > > tmt56% more test.r > trythis <- read.table('sdata.txt', > col.names=c("id", "start", "end", "delir", > "outcome")) > > coxph(Surv(start, end, outcome) ~ delir, data=trythis, ties='exact') > coxph(Surv(start, end, outcome) ~ delir, data=trythis, ties='efron') > > ----------------- > I now get comparable answers. Note that Cox's "exact partial likelihood" > is > the correct form to use for discrete time data. I labeled this as the > 'exact' > method and SAS as the 'discrete' method. The "exact marginal likelihood" > of > Prentice et al, which SAS calls the 'exact' method is not implemented in > S. > > As to which package is more reliable, I can only point to a set of > formal test > cases that are found in Appendix E of the book by Therneau and Grambsch. > > [...] > >
I am processing estimations of regression parameters in the Cox model for recurrent event data with time-dependent covariates. As my data sets contain a lot of ties, I use the "discrete" method of SAS ("PHREG" procedure) and "exact" option in R ("coxph" function of "survival" package). Despite the high computation time (up to 45s), I always get estimations without error or warning message with the "PHREG" procedure. On the other hand, when I use R software (latest version 2.13.11 on 32 or 64 bits), I sometimes get different estimates from those obtained with SAS and I get various warnings. And some other time I don't get any result, R freezes and does not respond. In order to understand, I have tried some tests from your examples. It turns out that dysfunctions appear when the proportion of ties become important : -------------------- Test1 -------------------- With R : > (trythis <- read.table('***\\sdata4.txt', + col.names=c("id", "start", "end", "delir", "outcome"))) id start end delir outcome 1 1 0.5 3.0 1 1 2 1 0.5 3.0 1 1 3 1 0.5 3.0 1 1 4 1 0.5 3.0 1 1 5 1 0.5 3.0 1 1 6 2 1.0 1.5 1 1 7 2 1.0 1.5 1 1 8 2 1.0 1.5 1 1 9 2 1.0 1.5 1 1 10 2 1.0 1.5 1 1 11 2 1.0 1.5 1 1 12 2 1.0 1.5 1 1 13 2 1.0 1.5 1 1 14 4 0.0 8.0 0 1 15 4 0.0 8.0 0 1 16 4 0.0 8.0 0 1 17 4 0.0 8.0 0 1 18 5 0.0 1.0 0 0 19 5 0.0 1.0 0 0 20 5 0.0 1.0 0 0 21 5 0.0 1.0 0 0 > coxph(Surv(start, end, outcome) ~ delir, data=trythis, method='exact') Call: coxph(formula = Surv(start, end, outcome) ~ delir, data = trythis, method = "exact") coef exp(coef) se(coef) z p delir 22.5 6.06e+09 15460 0.00146 1 Likelihood ratio test=15.6 on 1 df, p=8.04e-05 n= 21, number of events= 17 Message d'avis : In fitter(X, Y, strats, offset, init, control, weights = weights, : Ran out of iterations and did not converge With SAS : data trythis ; input id start end delir outcome; datalines; 1 0.5 3.0 1 1 1 0.5 3.0 1 1 1 0.5 3.0 1 1 1 0.5 3.0 1 1 1 0.5 3.0 1 1 2 1.0 1.5 1 1 2 1.0 1.5 1 1 2 1.0 1.5 1 1 2 1.0 1.5 1 1 2 1.0 1.5 1 1 2 1.0 1.5 1 1 2 1.0 1.5 1 1 2 1.0 1.5 1 1 4 0.0 8.0 0 1 4 0.0 8.0 0 1 4 0.0 8.0 0 1 4 0.0 8.0 0 1 5 0.0 1.0 0 0 5 0.0 1.0 0 0 5 0.0 1.0 0 0 5 0.0 1.0 0 0 run; proc phreg data=trythis; model (start, end)*outcome(0)=delir/ ties=discrete; RUN; No error message, results : estimate delir : 20.52466 se : 5689 Pr > Khi 2 : 0.9971 convergence status : "Convergence criterion (GCONV=1E-8) satisfied." -------------------- Test2 -------------------- With R : > (trythis <- read.table('***\\montest.txt', + col.names=c("id", "start", "end", "delir", "outcome"))) id start end delir outcome 1 1 1 2 1 1 2 1 1 2 0 1 3 1 1 2 0 1 4 1 1 2 1 1 5 1 1 2 0 1 6 1 1 2 1 1 7 2 1 2 1 1 8 2 1 2 1 0 9 3 8 10 0 0 10 4 8 10 0 0 > coxph(Surv(start, end, outcome) ~ delir, data=trythis, method='exact') Call: coxph(formula = Surv(start, end, outcome) ~ delir, data = trythis, method = "exact") coef exp(coef) se(coef) z p delir -20.8 9.42e-10 42054 -0.000494 1 Likelihood ratio test=0.94 on 1 df, p=0.332 n= 10, number of events= 7 Message d'avis : In fitter(X, Y, strats, offset, init, control, weights = weights, : Loglik converged before variable 1 ; beta may be infinite. With SAS : data trythis ; input id start end delir outcome; datalines; 1 1.0 2.0 1 1 1 1.0 2.0 0 1 1 1.0 2.0 0 1 1 1.0 2.0 1 1 1 1.0 2.0 0 1 1 1.0 2.0 1 1 2 1.0 2.0 1 1 2 1.0 2.0 1 0 3 8.0 10.0 0 0 4 8.0 10.0 0 0 run; proc phreg data=trythis; model (start, end)*outcome(0)=delir/ ties=discrete; RUN; results : estimate delir : -17.78257 se : 9383 Pr > Khi 2 : 0.9985 convergence status : "Convergence criterion (GCONV=1E-8) satisfied." So I get 2 different warning messages and with my data sets it can be worse. How would you explain these dysfunctions or warnings with "coxph" function and how should I deal with it ? Do you think that the results obtained by SAS are reliable in such cases ? Thank you for your answer. -- View this message in context: http://r.789695.n4.nabble.com/comparing-SAS-and-R-survival-analysis-with-time-dependent-covariates-tp874438p3678763.html Sent from the R help mailing list archive at Nabble.com. ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.