On 3/17/2012 6:24 AM, Paul Miller wrote:
Hello All,
Need to coalesce some columns using R. Looked online to see how this
is done. One approach appears to be to use ifelse. Also uncovered a
coalesce function in the BBmisc, emoa, and microbenchmark packages.
Trouble is I can't seem to get it to work in any of these packages.
Or perhaps I misunderstand what it's intended to do. The
documentation is generally pretty scant.
Working with two columns: Date of Death (DOD) and Last Known Date
Alive (LKDA). One or the other column is populated for each of the
patients in my dataframe and the other column is blank.
When I run code like "with(Demographics, coalesce(DOD, LKDA))", the
function generates a value whenever DOD is not missing and generates
NA otherwise (even though the value for LKDA is not missing). So, for
example, I get an NA for the 8th element below, even though I have a
value of "2008-03-25" for LKDA.
with(Demographics, coalesce(DOD, LKDA))
[1] "2006-07-23" "2008-07-09" "2007-12-16" "2008-01-19" "2009-05-05"
"2006-04-29" "2006-06-18" NA
At least that's what happens in the BBmisc and emoa packages. The
microbenchmark package appears not to have a coalesce function though
the documentation says it does. I think I've seen instances where a
function gets removed from a package. So maybe that's what happened
here.
Thought maybe there is a difference between blank and NA as far as R
or the function is concerned. The is.na() function seems to indicate
that a blank is an NA. I also tried making the blanks into NA but
that didn't help.
Does anyone have experience with the coalesce function in any of the
three packages? If so, can they help me understand what I might be
doing wrong?
I didn't know about these other coalesce functions, but I had written my
own. Looking at them, they don't seem to be vectorized, that is, they
don't appear to work element by element across their arguments; mine
does. That's not to say that mine is free of other problems.
##' Return first non-NA, vectorized
##'
##'
##' @param ... Vectors, all of the same length.
##' @return Vector of the same length as the input vectors, each
##' element of which is the first corresponding non-NA element in the
##' given vectors in the order they are specified
##' @author Brian Diggs
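coalesce <- function(...) {
    dots <- list(...)
    ## Fold over the arguments pairwise: keep x where it is not NA,
    ## otherwise take the corresponding element of y.
    ret <- Reduce(function(x, y) ifelse(!is.na(x), x, y), dots)
    ## ifelse() drops attributes such as the Date class, so restore
    ## the class of the first argument.
    class(ret) <- class(dots[[1]])
    ret
}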
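The class(ret) <- class(dots[[1]]) line is there because ifelse() returns a bare vector and drops the Date class. A small sketch of that effect, using two made-up Date vectors (d and e are illustrative names, not columns from your data):

```r
# Two short Date vectors, each with one missing value (illustrative data).
d <- as.Date(c("2006-07-23", NA))
e <- as.Date(c(NA, "2008-03-25"))

# ifelse() keeps the underlying day counts but loses the Date class.
stripped <- ifelse(!is.na(d), d, e)
print(class(stripped))

# Restoring the class of the first argument brings the dates back.
restored <- stripped
class(restored) <- class(d)
print(restored)
```

Without the class reset, the result prints as plain numbers (days since 1970-01-01) rather than as dates.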
And using your example data:
Demographics <- data.frame(DOD = as.Date(c("2006-07-23", "2008-07-09",
                                           "2007-12-16", "2008-01-19",
                                           "2009-05-05", "2006-04-29",
                                           "2006-06-18", NA)),
                           LKDA = as.Date(c(NA, NA, NA, NA, NA, NA,
                                            NA, "2008-03-25")))
> with(Demographics, coalesce(DOD, LKDA))
[1] "2006-07-23" "2008-07-09" "2007-12-16" "2008-01-19" "2009-05-05"
[6] "2006-04-29" "2006-06-18" "2008-03-25"
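An alternative to the ifelse-based version is to fill the missing positions by indexed assignment, which keeps the class of the first vector without resetting it by hand. This is only a sketch: coalesce_assign is a hypothetical name, and it assumes all vectors have the same length and type.

```r
# Hypothetical variant: fill NA positions of the first vector from the
# later vectors, in order, using indexed assignment.
coalesce_assign <- function(...) {
    dots <- list(...)
    ret <- dots[[1]]
    for (y in dots[-1]) {
        miss <- is.na(ret)      # positions that are still missing
        ret[miss] <- y[miss]    # take them from the next vector
    }
    ret
}

# Same values as the example data above.
dod <- as.Date(c("2006-07-23", "2008-07-09", "2007-12-16", "2008-01-19",
                 "2009-05-05", "2006-04-29", "2006-06-18", NA))
lkda <- as.Date(c(NA, NA, NA, NA, NA, NA, NA, "2008-03-25"))

res <- coalesce_assign(dod, lkda)
print(res)
```

With these Date vectors the result is still of class Date, and the eighth element comes from lkda.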
Thanks,
Paul
--
Brian S. Diggs, PhD
Senior Research Associate, Department of Surgery
Oregon Health & Science University