Re: [R] Kaplan-Meier plotting quirks

Terry Therneau Thu, 18 Oct 2012 05:56:55 -0700

Better would be to use interval-censored data.  Create your data set so that
you have (time1, time2) pairs, each of which describes the interval of time
over which the tag was lost.  So an animal first captured at time 10 sans tag
would be (0,10); with tag at 5 and without at 20 would be (5,20); and last
seen with tag at 30 would be (30, NA).
Then survfit(Surv(time1, time2, type='interval2') ~ 1, data=yourdata) will
give a curve that accounts for interval censoring.
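
A minimal sketch of the interval-censored fit described above. The data frame
`tags` and its values are made up purely for illustration; only the
(time1, time2) coding and the Surv/survfit call follow the text.

```r
library(survival)

# Hypothetical toy data, one row per tagged animal (values are invented).
# time1 = last day the tag was seen in place (0 if never seen with the tag)
# time2 = first day the animal was seen without the tag (NA if never lost)
tags <- data.frame(
  time1 = c(0,  5, 30,  2, 12, 0, 45, 20),
  time2 = c(10, 20, NA, 30, 40, 7, NA, 60)
)

# type = "interval2" reads each (time1, time2) pair as an interval;
# an NA in time2 marks a right-censored observation (tag never seen lost).
fit <- survfit(Surv(time1, time2, type = "interval2") ~ 1, data = tags)

# Step-function estimate of tag retention that respects the intervals.
plot(fit, xlab = "Days since tagging", ylab = "Proportion of tags retained")
```

The same call applied to your own data frame should give one curve per tag
type if the right-hand side is changed from ~ 1 to a grouping variable.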
   As a prior poster suggested, if the times are very sparse then you may be
better off assuming a smooth curve.  Use the survreg function with the same
equation as above; see help("predict.survreg") for an example of how to draw
the resulting survival curve.
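
A rough sketch of the smooth-curve alternative, again on invented data. Two
assumptions differ from the text. First, animals first seen without a tag are
coded here as (NA, time2), i.e. left-censored, because I believe survreg
rejects a zero lower bound under log-based distributions such as the Weibull.
Second, the curve is drawn from the fitted parameters using the Weibull
mapping given in help("survreg") rather than through predict().

```r
library(survival)

# Hypothetical toy data (values are invented for illustration).
# NA in time1 = left-censored (first seen already without the tag);
# NA in time2 = right-censored (tag never seen lost).
tags <- data.frame(
  time1 = c(NA,  5, 30,  2, 12, NA, 45, 20,  8, 15),
  time2 = c(10, 20, NA, 30, 40,  7, NA, 60, 25, NA)
)

# Parametric (Weibull) fit using the same Surv() expression as for survfit.
wfit <- survreg(Surv(time1, time2, type = "interval2") ~ 1,
                data = tags, dist = "weibull")

# Mapping from help("survreg"):
#   pweibull shape = 1 / (survreg scale), pweibull scale = exp(intercept)
days <- seq(1, 100, by = 1)
surv <- 1 - pweibull(days, shape = 1 / wfit$scale, scale = exp(coef(wfit)[1]))

# Smooth estimate of the proportion of tags still in place over time.
plot(days, surv, type = "l", ylim = c(0, 1),
     xlab = "Days since tagging", ylab = "Proportion of tags retained")
```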


Terry Therneau

On 10/18/2012 05:00 AM, r-help-requ...@r-project.org wrote:
> -----Original Message-----
> From: Michael Rentz [mailto:rent0...@umn.edu]
> Sent: Tuesday, October 16, 2012 12:36 PM
> To: r-help@r-project.org
> Subject: [R] R Kaplan-Meier plotting quirks?
>
> Hello. I apologize in advance for the VERY lengthy e-mail. I endeavor to 
> include enough detail.
>
> I have a question about survival curves I have been battling off and on for a 
> few months. No one local seems to be able to help, so I turn here. The issue 
> seems to either be how R calculates Kaplan-Meier Plots, or something with the 
> underlying statistic itself that I am misunderstanding. Basically, longer 
> survival times are yielding steeper drops in survival than a set of shorter 
> survival times but with the same number of loss and retention events.
>
> As a minor part of my research I have been comparing tag survival in marked 
> wild rodents. I am comparing a standard ear tag with a relatively new 
> technique. The newer tag clearly "wins" using survival tests, but the 
> resultant Kaplan-Meier plot does not seem to make sense. Since I am dealing 
> with a wild animal and only trapped a few days out of a month the data is 
> fairly messy, with gaps in capture history that require assumptions of tag 
> survival. An animal that is tagged and recaptured 2 days later with a tag and 
> 30 days later without one could have an assumed tag retention of 2 days 
> (minimum confirmed) or 30 days (maximum possible).
>
> Both are significant with a survtest, but the K-M plots differ. A plot of 
> minimum confirmed (overall harsher data, lots of 0 days and 1 or 2 days) 
> yields a curve with a steep initial drop in "survival", but then a leveling 
> off and straight line thereafter at about 80% survival. Plotting the maximum 
> possible dates (same number of losses/retention, but retention times are 
> longer, the length to the next capture without a tag, typically
> 25-30 days or more) does not show as steep of a drop in the first few days, 
> but at about the point the minimum estimate levels off this one begins 
> dropping steeply. 400 days out the plot with minimum possible estimates has 
> tag survival of about 80%, whereas the plot with the same loss rate but 
> longer assumed survival times shows only a 20% assumed survival at 400 days. 
> Complicating this of course is the fact that the great majority of the 
> animals die before the tag is lost, survival of the rodents is on the order 
> of months.
>
> I really am not sure what is going on, unless somehow the high number of 
> events in the first few days followed by few events thereafter leads to the 
> assumption that after the initial few days survival of the tag is high. The 
> plotting of maximum lengths has a more even distribution of events, rather 
> than a clumping in the first few days, so I guess the model assumes 
> relatively constant hazards? As an aside, a plot of the mean between the 
> minimum and maximum almost mirrors the maximum plot. Adding five days to the 
> minimum when the minimum plus 5 is less than the maximum returns a plot with 
> a steeper initial drop, but then constant thereafter, mimicking the minimum 
> plot, but at a lower final survival rate.
>
> Basically, I am at a loss why surviving longer would *decrease* the survival 
> rate???
>
> My co-author wants to drop the K-M graph given the confusion, but I think it 
> would be odd to publish a survival paper without one. I am not sure which 
> graph to use? They say very different things, while the actual statistics do 
> not differ that greatly.
>
> I am more than happy to provide the data and code for anyone who would like 
> to help if the above is not explanation enough. Thank you in advance.
>
> Mike.
>
>
> --
> Michael S. Rentz
> PhD Candidate, Conservation Biology
> University of Minnesota
> 5122 Idlewild Street
> Duluth, MN 55804
> (218) 525-3299
> rent0...@umn.edu

