Dear R users,

I fear this is terribly trivial but I'm struggling to get my head around it.

First of all, I'm using the "survival" package in R 2.12.2 on Windows Vista 
with the RExcel plugin. You probably only need to know that I'm using 
"survival" for this.

I have data collected from 180 or so individuals that were checked 7 times 
throughout a trial with set start and end times. Once the event happens (death 
by predator) there are no more checks for that individual. This means that I 
check on each individual up to 7 times with either an event recorded or the 
final time being censored.

At the moment, I have a data sheet with one observation per individual; that is 
either the event time (the observation time when the individual had had an 
event) or the censored time. However, I'd like to add a time dependent factor 
and I also wonder if this data should be treated as interval censored.

The time dependent factor is like this. The individuals are grouped in "houses" 
and once one individual in a group has an event, it makes biological sense that 
the rest of them should be at greater risk, as the predator is likely to have 
discovered the others in the "house" as well (the predator is able to consume 
many individuals). At the moment I'm coding this as a normal two level factor 
(discovered) where all individuals alive after the first event in that house 
are "TRUE" and the first individuals in a house to be eaten are "FALSE". All 
individuals in houses that were not discovered at all are also "FALSE". 
Obviously, all individuals that were eaten, were first discovered, then eaten. 
However, the first individuals in a house to be eaten, had not been previously 
discovered by the predator (not observably so, anyway).

Should I write up this data set with a start and stop time for every check I 
made so each individual has up to 7 records, one for each time I checked?

Is there a quick and easy way to do this in R or would I have to go through the 
data set manually?

Does coding the "discovered" factor the way I have make statistical sense? 

Should I worry about proportional hazards of the "discovered" factor? It seems 
to me that it would often turn out not proportional because of its nature.

Sorry, lots of stats questions. I don't mind if you don't answer all of these. 
Just knowing how to best feed this data into R would help me no end. The rest I 
can probably glean from the millions of survival analysis books I have lying 
about.

Cheers,

Freya

PS: Example data as it is: Treatment has 3 levels and House 6, though I don't 
normally include House in the analysis as it's not so much the house as whether 
the individuals were previously discovered that is interesting. I may include 
it as a random factor or stratify by it, but I want to get the basics sorted 
before I tackle that.


ID  Time  Event  Discovered  Treatment  House
1   10    1      FALSE       1          1
2   20    1      TRUE        1          1
3   90    0      TRUE        1          1
4   10    1      FALSE       2          5
5   10    1      FALSE       2          5
6   40    1      TRUE        2          5

Should ID 2 have two rows, one with no event at time 10? Should it be coded 
with start and end times as (first row) 0, 10, 0 (second row) 10, 20, 1?
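
To make the question concrete, here is a rough sketch in base R of the start/stop layout I have in mind, built from the six example rows above. It rests on my own assumptions and is not a checked method: I take "discovered" to switch to TRUE at the time of the first recorded event in the house, and I split each individual's record only at that time rather than at all 7 checks (as far as I understand, a row break is only needed where a covariate changes). The names wide, long, first_event and needs_split are made up for illustration.

```r
# Example data as posted: one row per individual
wide <- data.frame(
  ID        = 1:6,
  Time      = c(10, 20, 90, 10, 10, 40),
  Event     = c(1, 1, 0, 1, 1, 1),
  Treatment = c(1, 1, 1, 2, 2, 2),
  House     = c(1, 1, 1, 5, 5, 5)
)

# Assumption: a house counts as discovered from its first recorded event time.
# Censored individuals contribute Inf so they never define that time.
event_times      <- ifelse(wide$Event == 1, wide$Time, Inf)
wide$first_event <- ave(event_times, wide$House, FUN = min)

# Individuals still under observation after the first event need two rows
needs_split <- wide$Time > wide$first_event

# Interval before discovery (or the whole record if never split)
before <- data.frame(
  ID         = wide$ID,
  start      = 0,
  stop       = ifelse(needs_split, wide$first_event, wide$Time),
  event      = ifelse(needs_split, 0, wide$Event),
  discovered = FALSE,
  Treatment  = wide$Treatment,
  House      = wide$House
)

# Interval after discovery, only for the individuals that were split
after <- data.frame(
  ID         = wide$ID[needs_split],
  start      = wide$first_event[needs_split],
  stop       = wide$Time[needs_split],
  event      = wide$Event[needs_split],
  discovered = rep(TRUE, sum(needs_split)),
  Treatment  = wide$Treatment[needs_split],
  House      = wide$House[needs_split]
)

long <- rbind(before, after)
long <- long[order(long$ID, long$start), ]
rownames(long) <- NULL
print(long)

# What I assume the model call would then look like (not run here;
# six individuals are far too few to fit anything):
# library(survival)
# coxph(Surv(start, stop, event) ~ discovered + factor(Treatment), data = long)
```

With this layout ID 2 gets the two rows I described (0, 10, 0 and then 10, 20, 1), while the first individuals eaten in a house keep a single row with discovered = FALSE.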
