Hi Gabor, Richard, and Thierry, 

Thanks very much for your replies. Turns out I had already hit on Gabor's idea 
of "factor out" in writing an initial draft of the code converting from SAS to 
R. Below is the link Gabor sent describing this and other approaches. 

https://stackoverflow.com/questions/34096162/dplyr-mutate-replace-on-a-subset-of-rows/34096575#34096575

At the end of this email are some new test data plus a snippet of my initial R 
code. The R code I have replicates the result from SAS but is quite verbose. 
That should be obvious from the snippet. I know I can make the code less 
verbose with a subsequent draft but wonder if I can simplify to the point where 
the factor out approach gets a fair test. I'd appreciate it if people could 
share some ways to make the factor out approach less verbose. I'd also like to 
see how well some of the other approaches might work with these data. I spent 
considerable time looking at the link Gabor sent as well as the other responses 
I received. The mutate_cond function in the link seems promising but it wasn't 
clear to me how I could avoid having to repeat the various conditions using 
that approach. 

Thanks again.

Paul

library(magrittr)
library(dplyr)
 
test_data <-
  structure(
    list(
      intPatientId = c("3", "37", "48", "6", "6", "5"),
      intSurveySessionId = c(1L, 10996L, 19264L, 2841L, 28L, 34897L),
      a_CCMA02 = c(NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_),
      a_CCMA69 = c(7, NA, 0, 2, NA, 0),
      a_CCMA70 = c(7, 0, NA, 10, NA, NA),
      a_CCMA72 = c(7, 2, 3, NA, NA, NA),
      CCMA2 = c(NA_integer_, NA_integer_, NA_integer_, NA_integer_,NA_integer_, 
NA_integer_),
      a_CCMA05 = c(NA, NA, NA, NA, NA, 0),
      a_CCMA43 = c(5, 0, 6, 5, NA, NA),
      a_CCMA44 = c(5, 0, 0, 5, 0, NA),
      CCMA5 = c(NA, NA, NA, NA, NA, 0)
    ),
    class = "data.frame",
    row.names = c(NA,-6L)
  )

factor_out <- test_data %>%
  mutate(
    CCMA2_cond = case_when(
      (is.na(a_CCMA02) | a_CCMA02 < 0 | a_CCMA02 > 10) &
        (!is.na(a_CCMA69) & between(a_CCMA69, 0, 10) &
           !is.na(a_CCMA70) & between(a_CCMA70, 0, 10) &
           !is.na(a_CCMA72) & between(a_CCMA72, 0, 10)) ~ "A",
      (is.na(a_CCMA02) | a_CCMA02 < 0 | a_CCMA02 > 10) &
        (is.na(a_CCMA69) | a_CCMA69 < 0 | a_CCMA69 >= 10) &
        !is.na(a_CCMA70) & between(a_CCMA70, 0, 10) &
        !is.na(a_CCMA72) & between(a_CCMA72, 0, 10) ~ "B",
      (is.na(a_CCMA02) | a_CCMA02 < 0 | a_CCMA02 > 10) &
        (is.na(a_CCMA70) | a_CCMA70 < 0 | a_CCMA70 >= 10) &
        between(a_CCMA69, 0, 10) & between(a_CCMA72, 0, 10) ~ "C",
      (is.na(a_CCMA02) | a_CCMA02 < 0 | a_CCMA02 > 10) &
        (is.na(a_CCMA72) | a_CCMA72 < 0 | a_CCMA72 >= 10) &
        between(a_CCMA69, 0, 10) & between(a_CCMA70, 0, 10) ~ "D")
  ) %>%
  mutate(
    CCMA2 = case_when(
      CCMA2_cond == "A" & 0.614 + (0.065 * a_CCMA69) + (-0.012 * a_CCMA70) + 
(0.504 * a_CCMA72) < 0  ~ 0,
      CCMA2_cond == "A" & 0.614 + (0.065 * a_CCMA69) + (-0.012 * a_CCMA70) + 
(0.504 * a_CCMA72) > 10 ~ 10,
      CCMA2_cond == "A" ~ 0.614 + (0.065 * a_CCMA69) + (-0.012 * a_CCMA70) + 
(0.504 * a_CCMA72),
      TRUE ~ as.double(CCMA2)
    ),
    CCMA2 = case_when(
      CCMA2_cond == "B" & 0.614 + (0.065 * (a_CCMA70 + a_CCMA72) / 2) + (-0.012 
* a_CCMA70) + (0.504 * a_CCMA72) < 0  ~ 0,
      CCMA2_cond == "B" & 0.614 + (0.065 * (a_CCMA70 + a_CCMA72) / 2) + (-0.012 
* a_CCMA70) + (0.504 * a_CCMA72) > 10 ~ 10,
      CCMA2_cond == "B" ~ 0.614 + (0.065 * (a_CCMA70 + a_CCMA72) / 2) + (-0.012 
* a_CCMA70) + (0.504 * a_CCMA72),
      TRUE ~ as.double(CCMA2)
    ),
    CCMA2 = case_when(
      CCMA2_cond == "C" & 0.614 + (0.065 * a_CCMA69) + (-0.012 *(a_CCMA72 + 
a_CCMA69) / 2 ) + (0.504 * a_CCMA72) < 0  ~ 0,
      CCMA2_cond == "C" & 0.614 + (0.065 * a_CCMA69) + (-0.012 *(a_CCMA72 + 
a_CCMA69) / 2 ) + (0.504 * a_CCMA72) > 10 ~ 10,
      CCMA2_cond == "C" ~ 0.614 + (0.065 * a_CCMA69) + (-0.012 *(a_CCMA72 + 
a_CCMA69) / 2 ) + (0.504 * a_CCMA72),
      TRUE ~ as.double(CCMA2)
    ),
    CCMA2 = case_when(
      CCMA2_cond == "D" & 0.614 + (0.065 * a_CCMA69) + (-0.012 * a_CCMA70 ) + 
(0.504 *(a_CCMA70 + a_CCMA69) / 2) < 0  ~ 0,
      CCMA2_cond == "D" & 0.614 + (0.065 * a_CCMA69) + (-0.012 * a_CCMA70 ) + 
(0.504 *(a_CCMA70 + a_CCMA69) / 2) > 10 ~ 10,
      CCMA2_cond == "D" ~ 0.614 + (0.065 * a_CCMA69) + (-0.012 * a_CCMA70 ) + 
(0.504 *(a_CCMA70 + a_CCMA69) / 2),
      TRUE ~ as.double(CCMA2)
    )
  ) %>%
  select(-CCMA2_cond) %>%
  mutate(
    CCMA5_condA = if_else(
      (is.na(a_CCMA05) | a_CCMA05 < 0 | a_CCMA05 > 10),
      1, 0
    ),
    CCMA5 = ifelse(CCMA5_condA == 1 & between(a_CCMA43, 0, 10) & 
between(a_CCMA44, 0, 10),
                   0.216 + (0.257 * a_CCMA43) + (0.828 * a_CCMA44),
                   CCMA5),
    CCMA5 = ifelse(CCMA5_condA == 1 & between(a_CCMA43, 0, 10) & 
(is.na(a_CCMA44) | a_CCMA44 < 0 | a_CCMA44 > 10),
                   0.216 + (0.257 * a_CCMA43) + (0.828 * a_CCMA43),
                   CCMA5),
    CCMA5 = ifelse(CCMA5_condA == 1 & between(a_CCMA44, 0, 10) & 
(is.na(a_CCMA43) | a_CCMA43 < 0 | a_CCMA43 > 10),
                   0.216 + (0.257 * a_CCMA44) + (0.828 * a_CCMA44),
                   CCMA5),
    CCMA5 = ifelse(CCMA5_condA == 1 & !is.na(CCMA5) & CCMA5 < 0,
                   0,
                   CCMA5),
    CCMA5 = ifelse(CCMA5_condA == 1 & CCMA5 > 10,
                   10,
                   CCMA5)
  ) %>%
  select(-CCMA5_condA)

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to