Thanks to David for pointing this out. The "time dependent covariates" vignette in the
survival package has a section on time dependent coefficients that talks directly about
this issue. In short, the following model is simply wrong:
coxph(Surv(time, status) ~ trt + prior + karno + I(karno * log(time)),
data=veteran)
People try this often as a way to create the time dependent covariate Karnofsky * log(t),
which is often put forwards as a way to deal with non-proportional hazards. To do this
correctly you have to use the tt() functionality in coxph to move the computation out of
the model statement:
coxph(Surv(time, status) ~ trt + prior + karno + tt(karno), data=veteran,
tt = function(x, time, ...) x*log(time))
BTW the following SAS code is also wrong:
proc phreg data=veteran;
model time * status(0) = trt + prior + karno* time;
SAS does the right thing, however, if you move the computation off the model
line.
model time * status(0) = trt + karno + zzz;
zzz = karno * time;
The quote "SAS does it but R fails" comes at me moderately often in this context. The
reason is that SAS won't LET you put a phrase like "log(time)" into the model statement,
so people end up doing the right thing, but by accident.
Terry T.
On 03/30/2016 05:28 PM, Göran Broström wrote:
On 2016-03-30 23:06, David Winsemius wrote:
On Mar 29, 2016, at 1:47 PM, Jennifer Wu, Miss
<jennifer....@mail.mcgill.ca> wrote:
Hi,
I am currently using R v3.2.3 and on Windows 10 OS 64Bit.
I am having convergence issues when I use coxph with a interaction
term (glarg*bca_py) and interaction term with the restricted cubic
spline (glarg*bca_time_ns). I use survival and spline package to
create the Cox model and cubic splines respectively. Without the
interaction term and/or spline, I have no convergence problem. I
read some forums about changing the iterations and I have but it
did not work. I was just wondering if I am using the inter.max and
outer.max appropriately. I read the survival manual, other R-help
and stackoverflow pages and it suggested changing the iterations
but it doesn't specify what is the max I can go. I ran something
similar in SAS and did not run into a convergence problem.
This is my code:
bca_time_ns <- ns(ins_ca$bca_py, knots=3,
Boundary.knots=range(2,5,10)) test <- ins_ca$glarg*ins_ca$bca_py
test1 <- ins_ca$glarg*bca_time_ns
In your `coxph` call the variable 'bca_py' is the survival time and
Right David: I didn't notice that the 'missing main effect' in fact was part of
the
survival object! And as you say: Time to rethink the whole model.
Göran
yet here you are constructing not just one but two interactions (one
of which is a vector but the other one a matrix) between 'glarg' and
your survival times. Is this some sort of effort to identify a
violation of proportionality over the course of a study?
Broström sagely points out that these interactions are not in the
data-object and subsequent efforts to refer to them may be confounded
by the multiple environments from which data would be coming into the
model. Better to have everything come in from the data-object.
The fact that SAS did not have a problem with this rather
self-referential or circular model may be a poor reflection on SAS
rather than on the survival package. Unlike Therneau or Broström who
asked for data, I suggest the problem lies with the model
construction and you should be reading what Therneau has written
about identification of non-proportionality and identification of
time dependence of effects. See Chapter 6 of his "Modeling Survival
Data".
______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.