Dear Dr. Therneau,

Thank you very much for your exhaustive answer!

I now see the issue. Perhaps even more important was your confirmation that my 
approach with karno:ns(time, df=4) is theoretically correct. (I knew that 
plot.cox.zph is sound, so I was afraid that the difference can be attributed to 
some fundamental mistake in my approach.)

Thank you again,
Tamas


2019. augusztus 8., 18:17:38, írtad:


This is an excellent question.  
The answer, in this particular case, mostly has to do with the outlier time 
values.  (I've never been convinced that the death at time 999 isn't really a 
misplaced code for "missing", actually).    If you change the knots used by the 
spline you can get quite different values.
For instance, using a smaller data set:

fit1 <-  coxph(Surv(tstart, time, status) ~ trt + prior + karno, veteran)
zph1 <- cox.zph(fit1, transform='identity')
plot(zph1[3])

dtime <- unique(veteran$time[veteran$status ==1])    # all of the death times
veteran2 <- survSplit( Surv(time, status) ~ ., data=veteran, cut=dtime)
fit2 <- coxph(Surv(tstart, time, status) ~ trt + prior + karno +
        karno:ns(time, df=4),  data=veteran2)
tx <- 0:100 * 10    # x positions for plot
ncall <- attr(terms(fit2), "predvars")[[6]]
ty <- eval(ncall, data.frame(time = tx)) %*% coef(fit2)[4:7] + coef(fit2)[3]
lines(tx, ty, col=2)

-------------

Now it looks even worse!  The only difference is that the ns() function has 
chosen a different set of knots.   

The test used by the cox.zph function is based on a score test and is solid.  
The plot that it produces uses a smoothed approximation to the variance matrix 
and is approximate.  So the diagnostic plot will never exactly match an actual 
fit.   In this data set the outliers exacerbate the issue.  To see this try a 
different time scale.

------------
zph2 <- cox.zph(fit1, transform= sqrt)
plot(zph2[3])
veteran2$stime <- sqrt(veteran2$time)
fit3 <- coxph(Surv(tstart, time, status) ~ trt + prior + karno +
           karno:ns(stime, df=4),  data=veteran2)

ncall3 <- attr(terms(fit3), "predvars")[[6]] 
ty3 <- eval(ncall3, data.frame(stime= sqrt(tx))) %*% coef(fit3)[4:7] + 
coef(fit3)[3]
lines(sqrt(tx), ty3, col=2)

----------------

The right tail is now better behaved.   Eliminating the points >900 makes 
things even better behaved.

Terry T.




On 8/8/19 9:07 AM, Ferenci Tamas wrote:

I was thinking of two possible ways to
plot a time-varying coefficient in a Cox model.

One is simply to use survival::plot.cox.zph which directly produces a
beta(t) vs t diagram.

The other is to transform the dataset to counting process format and
manually include an interaction with time, expanded with spline (to be
similar to plot.cox.zph). Plotting the coefficient produces the needed
beta(t) vs t diagram.

I understand that they're slightly different approaches, so I don't
expect totally identical results, but nevertheless, they approximate
the very same thing, so I do expect that the results are more or less
similar.

However:

library( survival )
library( splines )

data( veteran )

zp <- cox.zph( coxph(Surv(time, status) ~ trt + prior + karno,
                     data = veteran ), transform = "identity" )[ 3 ]

veteran3 <- survSplit( Surv(time, status) ~ trt + prior + karno,
                       data = veteran, cut = 1:max(veteran$time) )

fit <- coxph(Surv(tstart,time, status) ~ trt + prior + karno +
               karno:ns( time, df = 4 ), data = veteran3 )
cf <- coef( fit )
nsvet <- ns( veteran3$time, df = 4 )

plot( zp )
lines( 0:1000, ns( 0:1000, df = 4, knots = attr( nsvet, "knots" ),
                   Boundary.knots = attr( nsvet, "Boundary.knots" ) )%*%cf[
                     grep( "karno:ns", names( cf ) ) ] + cf["karno"],
       type = "l", col = "red" )

Where is the mistake? Something must be going on here, because the
plots are vastly different...

Thank you very much in advance,
Tamas

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to