On 11-02-09 3:47 PM, eck 1 wrote:
Hello R-Help users,
I have a data frame named fd, a sample of which looks like this:
   cbi_A    cbi_B   cbi_B1   cbi_B2    cbi_C    cbi_D    cbi_E    cbi_F
2.183451 1.047546       NA       NA       NA       NA 0.428528       NA
0.795837 0.510152 0.510152       NA       NA       NA       NA       NA
0.795837 1.149577 0.843485 1.122334       NA       NA       NA       NA
1.885522 2.203959       NA       NA 3.020202 3.040506 0.428528 0.224467
2.877257 1.353637       NA       NA 3.020202       NA 0.836649       NA
1.479441 2.816141       NA       NA 3.020202       NA 3.040506 3.040506
1.836547 1.659729 1.203959 2.401184 3.020202       NA       NA       NA
2.069177 1.870625 1.768595 2.989593       NA       NA       NA       NA
2.046985 1.203959 1.203959       NA       NA       NA       NA       NA
1.469238       NA 1.278849       NA       NA       NA       NA       NA
I want to create a new column (fd$cbi_tot) that is an average of some of the
other columns (chosen based on the conditions as indicated in the code below).
ifelse(is.na(fd$cbi_C) & is.na(fd$cbi_D) & is.na(fd$cbi_E) & is.na(fd$cbi_F) &
       is.na(fd$cbi_B1) & !is.na(fd$cbi_B2),
       cbi_totlist <- cbind(fd$cbi_A, fd$cbi_B2),
You seem to be mixing up ifelse() with if () else. You don't normally
do an assignment within one of the ifelse() values; you do things like this:
x <- 1:10
labels <- ifelse( x < 5, "low", "high")
Both "low" and "high" are evaluated and converted to vectors of the same
length as the result of "x < 5" (i.e. 10 in my example). Elements where
x < 5 is true then take the value from "low", and elements where it is
not true take the value from "high".
Duncan Murdoch
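
To make the distinction above concrete, here is a minimal sketch (not part of the original reply). It contrasts the vectorized ifelse() with the scalar if () else, and shows what happens to an assignment placed inside ifelse(). The names size and y are made up for illustration.

```r
x <- 1:10

# ifelse() is vectorized: it returns one value per element of the test.
labels <- ifelse(x < 5, "low", "high")

# if () else looks at a single TRUE/FALSE and runs exactly one branch.
if (length(x) > 5) {
  size <- "long"
} else {
  size <- "short"
}

# An assignment inside ifelse() is not applied element by element.
# Each branch that gets evaluated overwrites the whole object y, so y
# keeps whatever the last evaluated branch assigned (here the "no" branch,
# because the test contains both TRUE and FALSE values).
invisible(ifelse(x < 5, y <- "from yes", y <- "from no"))

print(labels)
print(size)
print(y)
```

If the nested ifelse() calls in the question behave the same way, cbi_totlist would end up as whatever the last evaluated branch assigned, which would be consistent with rows falling through to the catch-all.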
ifelse(is.na(fd$cbi_C) & is.na(fd$cbi_D) & is.na(fd$cbi_E) & is.na(fd$cbi_F) &
       is.na(fd$cbi_B2) & !is.na(fd$cbi_B1),
       cbi_totlist <- cbind(fd$cbi_A, fd$cbi_B1),
ifelse(is.na(fd$cbi_C) & is.na(fd$cbi_D) & is.na(fd$cbi_E) & is.na(fd$cbi_F) &
       !is.na(fd$cbi_B1) & !is.na(fd$cbi_B2),
       cbi_totlist <- cbind(fd$cbi_A, fd$cbi_B1, fd$cbi_B2),
       cbi_totlist <- cbind(fd$cbi_A, fd$cbi_B, fd$cbi_C, fd$cbi_D, fd$cbi_E,
                            fd$cbi_F))))
fd$cbi_tot <- apply(cbi_totlist, 1, mean, na.rm = TRUE)
I do not get an error and the new column (fd$cbi_tot) is created, but when I
check the results, the first condition works properly, as does the last (the
catch-all), but the middle two conditions seem to default to the catch-all. For
example, in my data, look at row 3, where fd$cbi_C, fd$cbi_D, fd$cbi_E and
fd$cbi_F are all NA, and fd$cbi_B1 and fd$cbi_B2 have values. In this case, R
uses the cbi_totlist that includes fd$cbi_B, whereas I would like it to use the
list that includes fd$cbi_B1 and fd$cbi_B2 (and I thought I indicated that in
the conditional statements). Anybody have ideas about what I am doing wrong
here?
Thanks much for any suggestions!
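
One possible way to get the row-by-row behaviour described above is to compute both candidate averages for every row and let ifelse() choose between the finished vectors. This is a sketch under assumptions, not code from the thread. It uses the first four rows of the sample data, and the helper names others_na, has_b_part, use_parts, mean_parts and mean_all are hypothetical. Because the means are taken with na.rm = TRUE, the three B1/B2 cases from the question collapse into one average over cbi_A, cbi_B1 and cbi_B2.

```r
# First four rows of the sample data from the question.
fd <- data.frame(
  cbi_A  = c(2.183451, 0.795837, 0.795837, 1.885522),
  cbi_B  = c(1.047546, 0.510152, 1.149577, 2.203959),
  cbi_B1 = c(NA, 0.510152, 0.843485, NA),
  cbi_B2 = c(NA, NA, 1.122334, NA),
  cbi_C  = c(NA, NA, NA, 3.020202),
  cbi_D  = c(NA, NA, NA, 3.040506),
  cbi_E  = c(0.428528, NA, NA, 0.428528),
  cbi_F  = c(NA, NA, NA, 0.224467)
)

# Rows where C, D, E and F are all missing.
others_na <- is.na(fd$cbi_C) & is.na(fd$cbi_D) &
  is.na(fd$cbi_E) & is.na(fd$cbi_F)

# Rows where at least one of B1 or B2 is present.
has_b_part <- !is.na(fd$cbi_B1) | !is.na(fd$cbi_B2)

# Assumption: these are the rows that should use A plus B1/B2.
use_parts <- others_na & has_b_part

# Both candidate averages, computed for every row.
mean_parts <- rowMeans(fd[, c("cbi_A", "cbi_B1", "cbi_B2")], na.rm = TRUE)
mean_all <- rowMeans(
  fd[, c("cbi_A", "cbi_B", "cbi_C", "cbi_D", "cbi_E", "cbi_F")],
  na.rm = TRUE
)

# ifelse() now only selects values; no assignment happens inside it.
fd$cbi_tot <- ifelse(use_parts, mean_parts, mean_all)
print(fd$cbi_tot)
```

With this arrangement row 3 is averaged over cbi_A, cbi_B1 and cbi_B2, while rows 1 and 4 fall through to the catch-all average.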
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.