Re: [R] Coding columns for survival analysis

jim holtman Fri, 13 Apr 2012 17:03:58 -0700

Try this:

> x <- read.table(text = "   tree live1 live2 live3 live4 live5
+    1 tree1     0     0     0     1     1
+    2 tree2     0     0     1     1     0
+    3 tree3     0     1     1     0     0
+    4 tree4     1     1     0     0     0
+    6 tree4     1     1     1     1     0  # another test condition
+    5 tree5     1     0     0     0     0", header = TRUE)
>
> # get matrix of data columns
> z <- as.matrix(x[, -1])
> # process each row
> a <- apply(z, 1, function(.row){
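+     # 1 + number of live observations so far, capped at 3;
+     # used as an index: 1 = not yet seen (NA), 2 = "found", 3 = "alive"
+     found <- pmin(cumsum(.row) + 1, 3)
+     # count state changes: 1 once the tree is seen alive, 2 once it has died
+     die <- cumsum(diff(c(0, .row)) != 0)
+     # from the death onward, use index 4 ("mort")
+     found[die == 2] <- 4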
+     c(NA, "found", "alive", "mort")[found]
+ })
> t(a)  # result
  [,1]    [,2]    [,3]    [,4]    [,5]
1 NA      NA      NA      "found" "alive"
2 NA      NA      "found" "alive" "mort"
3 NA      "found" "alive" "mort"  "mort"
4 "found" "alive" "mort"  "mort"  "mort"
6 "found" "alive" "alive" "alive" "mort"
5 "found" "mort"  "mort"  "mort"  "mort"
>
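
The same steps can also be wrapped in a small function and the result joined
back to the tree IDs. This is only a sketch: the names recode_row, live_cols
and recoded are illustrative and not from the original post, and it assumes
each row is a 0/1 vector in which a tree never reappears alive after it has
been recorded dead.

```r
# Hypothetical helper (illustrative name): recode one 0/1 row of observations.
# Assumes a tree never returns to 1 after it has gone from 1 to 0.
recode_row <- function(.row) {
    labels <- c(NA, "found", "alive", "mort")
    # 1 = not yet seen, 2 = first live observation, 3 = later live observations
    idx <- pmin(cumsum(.row) + 1, 3)
    # number of state changes so far: 1 once seen alive, 2 once dead
    changes <- cumsum(diff(c(0, .row)) != 0)
    idx[changes == 2] <- 4
    labels[idx]
}

# data from the original question
tree_live <- data.frame(
    tree  = c("tree1", "tree2", "tree3", "tree4", "tree5"),
    live1 = c(0, 0, 0, 1, 1),
    live2 = c(0, 0, 1, 1, 0),
    live3 = c(0, 1, 1, 0, 0),
    live4 = c(1, 1, 0, 0, 0),
    live5 = c(1, 0, 0, 0, 0)
)

# recode only the live* columns, one row at a time
live_cols <- grep("^live", names(tree_live), value = TRUE)
recoded <- t(apply(as.matrix(tree_live[, live_cols]), 1, recode_row))
colnames(recoded) <- live_cols

# attach the tree IDs again
tree_live_recode <- data.frame(tree = tree_live$tree, recoded,
                               stringsAsFactors = FALSE)
print(tree_live_recode)
```

Selecting the columns by name rather than with x[, -1] is a design choice that
keeps the recode working if other columns are added to the data frame later.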



On Fri, Apr 13, 2012 at 4:53 PM, Alexander Shenkin <ashen...@ufl.edu> wrote:
> Hello Folks,
>
> I have 5 columns, for thousands of tree records, that record whether each
> tree was alive or dead.  I want to recode the columns so that a cell
> reads "found" when a live tree is first observed, "alive" when the tree
> is observed alive again after that first observation, and "mort" when it
> was previously alive but is now dead.
>
> Given the following:
>
>    > tree_live = data.frame(
>          tree  = c("tree1","tree2","tree3","tree4","tree5"),
>          live1 = c(0,0,0,1,1),
>          live2 = c(0,0,1,1,0),
>          live3 = c(0,1,1,0,0),
>          live4 = c(1,1,0,0,0),
>          live5 = c(1,0,0,0,0))
>
>       tree live1 live2 live3 live4 live5
>    1 tree1     0     0     0     1     1
>    2 tree2     0     0     1     1     0
>    3 tree3     0     1     1     0     0
>    4 tree4     1     1     0     0     0
>    5 tree5     1     0     0     0     0
>
> I would like to end up with the following:
>
>    > tree_live_recode
>
>      live1 live2 live3 live4 live5
>    1    NA    NA    NA found alive
>    2    NA    NA found alive  mort
>    3    NA found alive  mort     0
>    4 found alive  mort     0     0
>    5 found  mort     0     0     0
>
> I've accomplished the recode in the past, but only by going over the
> dataset multiple times in messy and inefficient fashion.  I'm wondering
> if there are concise and efficient ways of going about it?
>
> (I haven't been using the survival package for my analyses, but I'm
> starting to look into it.)
>
> Thanks,
> Allie
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
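
The desired output quoted above shows 0 for every observation after the first
"mort", whereas the code at the top of this reply repeats "mort". If the 0s
are wanted, one possible variant is sketched below. The function name
recode_row_once is made up for illustration, and the sketch assumes a tree
never reappears alive after it has been recorded dead.

```r
# Hypothetical variant (illustrative name): label only the first dead
# observation "mort" and every later one "0", as in the requested table.
recode_row_once <- function(.row) {
    labels <- c(NA, "found", "alive", "mort", "0")
    # 1 = not yet seen, 2 = first live observation, 3 = later live observations
    idx <- pmin(cumsum(.row) + 1, 3)
    # TRUE from the first dead observation onward
    dead <- cumsum(diff(c(0, .row)) != 0) == 2
    idx[dead] <- 5                       # default for dead positions: "0"
    idx[dead & cumsum(dead) == 1] <- 4   # first dead position only: "mort"
    labels[idx]
}

# tree3 from the question: dead at the fourth and fifth observations
print(recode_row_once(c(0, 1, 1, 0, 0)))
```

Note that mixing the string "0" with the labels gives a character result; a
numeric 0 cannot share a vector with "found" and "mort" without being coerced.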



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

