On Sep 28, 2011, at 10:56 AM, Ruth Arias wrote:
hallo terry:
I attached araceae data set,
The usual survival analysis via the Kaplan-Meier method only make
estimates at the time of events. When you tabulate your data, you see
that there were no events for the missing (starting) "time" rows in
those categories during the intervals that you are questioning as
missing:
xtabs( ~ time+time2+categoria+event, data=araceae)
, , categoria = C, event = 0
time2
time 2005 2006 2007 2008 2009 2010
2004 0 23 1 3 1 22
2005 0 0 0 0 0 0
2007 0 0 0 0 4 19
2008 0 0 0 0 0 0
2009 0 0 0 0 0 0
, , categoria = E, event = 0
time2
time 2005 2006 2007 2008 2009 2010
2004 0 22 0 7 3 21
2005 0 0 1 1 0 0
2007 0 0 0 0 0 29
2008 0 0 0 0 0 0
2009 0 0 0 0 0 1
, , categoria = C, event = 1
time2
time 2005 2006 2007 2008 2009 2010
2004 0 5 2 3 0 3
2005 0 0 0 0 0 0
2007 0 0 0 2 3 2
2008 0 0 0 0 1 0
2009 0 0 0 0 0 0
, , categoria = E, event = 1
time2
time 2005 2006 2007 2008 2009 2010
2004 7 2 1 1 3 4
2005 0 0 0 1 0 0
2007 0 0 0 3 1 3
2008 0 0 0 0 0 0
2009 0 0 0 0 0 0
when I use this:
surara<-survfit(Surv(time,time2,event)~categoria)
Call: survfit(formula = Surv(time, time2, event) ~ categoria)
records n.max n.start events median 0.95LCL 0.95UCL
categoria=C 94 63 0 21 NA NA NA
categoria=E 111 77 0 26 NA NA NA
summary(surara)
Call: survfit(formula = Surv(time, time2, event) ~ categoria)
categoria=C
time n.risk n.event entered censored survival std.err lower 95% CI
upper 95% CI
2006 63 5 0 23 0.921 0.0341
0.856 0.990
2007 35 2 30 1 0.868 0.0483
0.778 0.968
2008 62 5 1 3 0.798 0.0536
0.700 0.910
2009 55 4 0 5 0.740 0.0570
0.636 0.861
2010 46 5 0 41 0.660 0.0611
0.550 0.791
categoria=E
time n.risk n.event entered censored survival std.err lower 95% CI
upper 95% CI
2005 71 7 3 0 0.901 0.0354
0.835 0.973
2006 67 2 0 22 0.875 0.0391
0.801 0.955
2007 43 1 36 1 0.854 0.0432
0.774 0.943
2008 77 5 0 8 0.799 0.0469
0.712 0.896
2009 64 4 1 3 0.749 0.0502
0.657 0.854
2010 58 7 0 51 0.658 0.0545
0.560 0.774
You see that your first survfit object is offering a simple sum of
'time2' columns of that tabulation as its 'n.event' values. It's
'n.risk' tabulation is not taking note of whether a case started in
any particular prior interval. The n.risk sum appears to be the sum of
persons surviving from the prior year less any decedents plus any
entrants as reflected in "future" events on that row You notice
that there are missing years even in that report: 2004,2005 for
category C and 2004 for category E since there are no events in
columns for those 'time2' values.
but whe I included type=interval,
suraraint<-
survfit(Surv(time,time2,event,type='interval')~categoria) # falta
arreglar lo del intervalo!!!
summary(suraraint)
Call: survfit(formula = Surv(time, time2, event, type = "interval") ~
categoria)
categoria=C
time n.risk n.event survival std.err lower 95% CI upper 95% CI
2004 95.00 13.14 0.862 0.0354 0.795 0.934
2007 31.86 7.19 0.667 0.0695 0.544 0.818
2008 1.67 1.67 0.000 NaN NA NA
categoria=E
time n.risk n.event survival std.err lower 95% CI upper 95% CI
2004 112.0 18.47 0.835 0.0351 0.769 0.907
2005 40.5 1.06 0.813 0.0401 0.738 0.896
2007 37.5 7.46 0.651 0.0620 0.540 0.785
The second object's n.event, when Surv() was constructed with
type="interval", has values based on the starting 'time' rows, but I
am unable to deduce the estimating algorithm. I remember Therneau
saying it wasn't a simple algorithm. The 2008 row in category C has
one entry of 1 in the next year and there were no censoring for C-
entrants in that year. Why the n.event is 1.67 I cannot say, but at
least the n.event does not exceed the n.risk. The code or a copy of
Therneau and Grambsch would be sensible places to look for answer by
my initial efforts in those direction have not illuminated me.
--
David.
it does not survival calculed for very year
I have a one-year interval between each census
________________________________
De: Terry Therneau <thern...@mayo.edu>
Para: Ruth Arias <rueu...@yahoo.es>
CC: r-help@r-project.org
Enviado: miƩrcoles 28 de septiembre de 2011 16:00
Asunto: Re: survival analysis: interval censored data
You have still not given me enough information to reproduce your
problem. "Why doesn't it include all years?" I have no way of
knowing,
since we have no data.
--- begin included message --
halo david
when I use type= 'interval'
Call: survfit(formula = Surv(ingreso, fecha, estado, type =
"interval")
~
categoria)
and when I use just
Call: survfit(formula = Surv(ingreso, fecha, estado) ~ categoria)
I don t know why when I use type = "interval" it does not survival
calculed for very year
regards<araceae.txt>______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD
West Hartford, CT
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.