Re: [R] Competing risks Kalbfleisch & Prentice method

Ravi Varadhan Fri, 12 Jun 2009 14:11:44 -0700

Dear Terry, Eleni & Heinz,

Please forgive me for imposing on your time, but since we have had similar
discussions before on this topic, I thought that it would be nice to get
your feedback on this problem.


I have written a function to compute the cumulative incidence, for a given
covariate vector Z, based on Cox PH models for each competing event type.
It is based on Cheng, Fine, and Wei (Biometrics 1998), but is fairly
straightforward.  Fine calls it "indirect regression" approach, since it
models each cause-specific hazard and then puts it all together using the
basic definition of CIF.  I applied it to the Green & Byar data on DES trial
for prostate cancer (this data set was also discussed in Kay 1986 and Cheng
et al. 1998).  The data set is attached as STATA file, which can be read
into R using "foreign" package.

My function for CIF seems to do ok. However, I am not sure about this. I
computed the CIFs of prostate cancer for two sets of covariates (see the
code).  I compared the indirect CIFs to the "direct regression " CIFs
predicted by Fine & Gray's method.  There appears to be some significant
differences, especially for CVD deaths.  The CIF plots for the 3 competing
events can be easily generated by runnng the code.

Why would there be a big difference between FG direct approach and the
indirect CIF computed by my code? Is this because of the modeling
assumptions in the direct regression approach of FG? Or is there some
problem with my approach?

I would appreciate any comments.  

Thank you,
Ravi.


----------------------------------------------------------------------------
-------

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: rvarad...@jhmi.edu

Webpage:
http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h
tml

 

----------------------------------------------------------------------------
--------


-----Original Message-----
From: Eleni Rapsomaniki [mailto:er...@medschl.cam.ac.uk] 
Sent: Friday, March 27, 2009 10:44 AM
To: Terry Therneau; tuech...@gmx.at; Ravi Varadhan
Cc: r-help@r-project.org
Subject: RE: RE: Competing risks Kalbfleisch & Prentice method


Dear Prof. Therneau, 

Thank you for your views on this subject. I think all R users who play with
survival analysis are most grateful for the functions you have already
supplied us with.

I'm guessing Ravi is wondering why you have not implemented the smoothing of
the baseline hazard from the Cox model. 

I actually tried to do this originally, inspired from this thread (i.e use
sm.spline to smooth the hazard):
https://stat.ethz.ch/pipermail/r-help/2004-July/053843.html

but it overestimated the CI (perhaps I implemented it wrong). I was then
advised to treat CI as a step function, rather than continuous, which means
that F(t+1, cause k)-F(t, cause k) will be 0 unless an event of cause k has
occurred in that interval (see also "Competing Risks, by Melanie Pintilie,
page 62). This is obviously problematic if one wants to estimate the CI at
times that are not close to observed events for either cause (perhaps a
parametric model could be used in this case).
But then again, this was not an issue wtih my data. 

Eleni Rapsomaniki
 Research Associate
Strangeways Research Laboratory
Department of Public Health and Primary Care University of Cambridge
 

-----Original Message-----
From: Terry Therneau [mailto:thern...@mayo.edu]
Sent: 27 March 2009 13:53
To: Eleni Rapsomaniki; tuech...@gmx.at; Ravi Varadhan
Cc: r-help@r-project.org
Subject: RE: Competing risks Kalbfleisch & Prentice method

Ravi's last note finished with
>  I am wondering why Terry Therneau's "survival" package doesn't  have 
> this option.

  The short answer is that there are only so many hours in a day.  

  I've recently moved the code base from an internal Mayo repository to
R-forge, one long term goal with this is to broaden the developer base to
n>2 (me and Thomas Lumley).  
  
  A longer statistical answer:
  
  I'm not sure if the "this" of Ravi's question is a. smoothed hazards, b.
the K&P cumulative incidence or c. the Fine & Gray model.
  
  b. I like the CI model and am using it more.  We also have local code.
The
latest version of survival (on rforge, likely in the next default R
release) has
added simple CI curves to the survfit function.  Adding code for survfit on
Cox models is on the todo list.  But -- this release also fixes up
survfit.coxph to handle weighted Cox models and that was on my list for
approx 10 years, i.e., don't hold your breath.  I don't release something
until it also has a set of worked out test cases to add to the 'tests'
directory.
  
  a. smoothed hazards.  For the case at hand I don't see any particular
advantage of this.  On the other hand, I often would like to display hazard
functions instead of CI functions for Cox models; with time dependent
covariates I don't think a survival curve makes sense.  But I haven't had
the time to think through exactly which methods should be added.
  
  c. Fine & Gray model, i.e., where covariates have a direct influence on
the competing risk.  I find the model completely untenable from a biologic
point of view, so have no interest in adding it.  (Due to finite time,
everything in the survival package is code that I needed for an analysis;
medical research is what pays my salary.)  Assume that I have competing
processes/risks, say progression of a tumor and heart disease;  I expect
that the tumor process pays no attention whatsoever to what is going on in
the heart.  But this is necessary if "type=squamous" is modeled as an
absolute beta=__ increase in the CI for cancer. 
 The squamous cells need to "step up the pace" of invasion if heart failure
threatens, like jockeys in a horse race. 
  
   Terry T.

require(foreign)
require(survival)
require(cmprsk)

################################################################
# Green & Byar data set
gb <- read.dta("h:/competing risks/gb.dta")  # N = 483
attach(gb)

#################################
# Cox model fits for cause-specific hazards
fit1 <- coxph( Surv(time, status=="Cancer") ~ treatment + ag + hx + sz + sg, 
method="breslow", data=gb) 
fit2 <- coxph( Surv(time, status=="CVD") ~ treatment + ag + hx + sz + sg, 
method="breslow", data=gb) 
fit3 <- coxph( Surv(time, status=="Other") ~ treatment + ag + hx + sz + sg, 
method="breslow", data=gb) 
fit <- coxph( Surv(time, status=="Cancer" | status=="CVD" | status=="Other") ~ 
treatment + ag + hx + sz + sg, method="breslow", data=gb) 

fits <- list(fit1, fit2, fit3, fit)

##########################################################
cif <- function(fits, etype=1, z) {
# Prediction of cumulative incidence curves (Cheng, Fine, Wei; Biometrics 1998)
# Indirect regression method
nfits <- length(fits)
cif <- NULL
# if z is missing mean covariate value is used
if (missing(z)) {
        zmiss <- TRUE 
        z <- matrix(NA, 1, 1)
        } else {
        zmiss <- FALSE
        if (is.null(dim(z))) z <- matrix(z, nrow=1, ncol=length(z))     
        }
for (i in 1:nrow(z)) {
sfit.e <- if(!zmiss) survfit(fits[[etype]], type="aalen", newdata=z[i, ])
        else survfit(fits[[etype]], type="aalen")
surv <- sfit.e$surv
time <- sfit.e$time
dH <- c(-log(surv)[1], diff(-log(surv) ))
sfit.o <- if(!zmiss) survfit(fits[[nfits]], type="aalen", newdata=z[i, ])
        else survfit(fits[[nfits]], type="aalen")
Osurv <- stepfun(x=sfit.o$time, y=c(1, sfit.o$surv))

cif <- cbind(cif, cumsum(Osurv(time) * dH))
}
if(!zmiss) colnames(cif) <- paste("cif.z", 1:nrow(z), sep="")
else    colnames(cif) <- "cif.zmean"
data.frame(time = time, cif)
}
######################

# two sets of covariates for which CIFs are to be predicted
z1 <- data.frame(treatment=0, ag=0, hx=1, sz=0, sg=0)
z2 <- data.frame(treatment=1, ag=0, hx=1, sz=0, sg=0)

model.mat <- model.matrix( ~ time + as.numeric(status) + treatment + 
as.numeric(ag) + hx + sz + sg, data=gb)  

# Comparison with Fine & Gray's direct regression

# prostate cancer
cr1 <- cif(fits, z=rbind(z1, z2), etype=1)  # Indirect regression
plot(cr1[,1], cr1[,2], type="o", xlab="Time", ylab="Cumulative incidence of 
Prostate cancer deaths", ylim=c(0, 0.6), main="Comparison of indirect and 
direct CIF prediction")
lines(cr1[,1], cr1[,3],col=2, type="o") 

cif.cancer <- crr(ftime=model.mat[, 2], fstatus=model.mat[, 3], 
cov1=model.mat[, 4:8], failcode=2, maxiter=50)  # Fine-Gray
z.cancer <- predict(cif.cancer, cov1=rbind(unlist(z1), unlist(z2))) 
lines(z.cancer[,1], z.cancer[,2], col=3)
lines(z.cancer[,1], z.cancer[,3], col=4)
legend(locator(1), legend=c("z1 (indirect)", "z2 (indirect)", "z1 (FG)", "z2 
(FG)"), lty=rep(1, 4), col=c(1,2, 3,4))

# CVD
# The two methods give quite different estimates of CIF here
cr2 <- cif(fits, z=rbind(z1, z2), etype=2)  
plot(cr2[,1], cr2[,2], type="o", xlab="Time", ylab="Cumulative incidence of CVD 
deaths", ylim=c(0, 0.6), main="Comparison of indirect and direct CIF 
prediction")
lines(cr2[,1], cr2[,3],col=2, type="o") 

cif.cvd <- crr(ftime=model.mat[, 2], fstatus=model.mat[, 3], cov1=model.mat[, 
4:8], failcode=3, maxiter=50)
z.cvd <- predict(cif.cvd, cov1=rbind(unlist(z1), unlist(z2))) 
lines(z.cvd[,1], z.cvd[,2], col=3)
lines(z.cvd[,1], z.cvd[,3], col=4)
legend(locator(1), legend=c("z1 (indirect)", "z2 (indirect)", "z1 (FG)", "z2 
(FG)"), lty=rep(1, 4), col=c(1,2, 3,4))


# Other
cr3 <- cif(fits, z=rbind(z1, z2), etype=3)  
plot(cr3[,1], cr3[,2], type="o", xlab="Time", ylab="Cumulative incidence of 
Other deaths", ylim=c(0, 0.6), main="Comparison of indirect and direct CIF 
prediction")
lines(cr3[,1], cr3[,3],col=2, type="o") 

cif.oth <- crr(ftime=model.mat[, 2], fstatus=model.mat[, 3], cov1=model.mat[, 
4:8], failcode=4, maxiter=50)
z.oth <- predict(cif.oth, cov1=rbind(unlist(z1), unlist(z2))) 
lines(z.oth[,1], z.oth[,2], col=3)
lines(z.oth[,1], z.oth[,3], col=4)
legend(locator(1), legend=c("z1 (indirect)", "z2 (indirect)", "z1 (FG)", "z2 
(FG)"), lty=rep(1, 4), col=c(1,2, 3,4))

CIF_cvd.pdf
Description: Adobe PDF document

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Competing risks Kalbfleisch & Prentice method

Reply via email to