Finally getting back to this:

>>>>> Hadley Wickham <h.wick...@gmail.com>
>>>>>     on Mon, 15 Aug 2016 07:51:35 -0500 writes:

> On Fri, Aug 12, 2016 at 11:31 AM, Hadley Wickham
> <h.wick...@gmail.com> wrote:
>>> >> One possibility would also be to consider a
>>> >> "numbers-only" or rather "same type"-only {e.g.,
>>> >> would also work for characters} version.
>>>
>>> > I don't know what you mean by these.
>>>
>>> In the mean time, Bob Rudis mentioned dplyr::if_else(),
>>> which is very relevant, thank you Bob!
>>>
>>> As I have found, that actually works in such a "same
>>> type"-only way: It does not try to coerce, but gives an
>>> error when the classes differ, even in this somewhat
>>> debatable case :
>>>
>>> > dplyr::if_else(c(TRUE, FALSE), 2:3, 0+10:11)
>>> Error: `false` has type 'double' not 'integer'
>>> >
>>>
>>> As documented, if_else() is clearly stricter than
>>> ifelse() and e.g., also does no recycling (but of
>>> length() 1).
>>
>> I agree that if_else() is currently too strict - it's
>> particularly annoying if you want to replace some values
>> with a missing:
>>
>>   x <- sample(10)
>>   if_else(x > 5, NA, x)
>>   # Error: `false` has type 'integer' not 'logical'
>>
>> But I would like to make sure that this remains an error:
>>
>>   if_else(x > 5, x, "BLAH")
>>
>> Because that seems more likely to be a user error (but
>> reasonable people might certainly believe that it should
>> just work)
>>
>> dplyr is more accommodating in other places (i.e. in
>> bind_rows(), collapse() and the joins) but it's
>> surprisingly hard to get all the details right. For
>> example, what should the result of this call be?
>>
>>   if_else(c(TRUE, FALSE), factor(c("a", "b")),
>>           factor(c("c", "b")))
>>
>> Strictly speaking I think you could argue it's an error,
>> but that's not very user-friendly. Should it be a factor
>> with the union of the levels? Should it be a character
>> vector + warning? Should the behaviour change if one set
>> of levels is a subset of the other set?
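
For comparison, base ifelse() accepts that factor call without
complaint, but the class is silently dropped. A minimal sketch using
base R only (no dplyr needed); the variable names f and d are just
for illustration:

```r
## Base ifelse() starts from the logical 'test' and fills in values,
## so the class attributes of 'yes' and 'no' do not survive.
f <- ifelse(c(TRUE, FALSE), factor(c("a", "b")), factor(c("c", "b")))
is.factor(f)          # FALSE: only the underlying codes are left

d <- ifelse(c(TRUE, FALSE), as.Date("2016-08-15"), as.Date("2016-11-14"))
inherits(d, "Date")   # FALSE: a plain numeric vector
```

So the "error or coerce?" question in the quoted text is a choice
between two ways of being safer than what base R does here.
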
>>
>> There are similar issues for POSIXct (if the time zones
>> are different, which should win?), and difftimes
>> (similarly for units). Ideally you'd like the behaviour
>> to be extensible for new S3 classes, which suggests it
>> should be a generic (and for the most general case, it
>> would need to dispatch on both arguments).

> One possible principle would be to use c() -
> i.e. construct out as

>   out <- c(yes[0], no[0])
>   length(out) <- max(length(yes), length(no))

yes; this would require that a `length<-` method works for the
class of the result.

Duncan Murdoch mentioned a version of this, in his very
first reply:

    ans <- c(yes, no)[seq_along(test)]
    ans <- ans[seq_along(test)]

which is less efficient for atomic vectors, but requires
less from the class: it "only" needs `c` and `[` to work

and a mixture of your two proposals would be possible too:

    ans <- c(yes[0], no[0])
    ans <- ans[seq_along(test)]

which does *not* work for my "mpfr" numbers (CRAN package 'Rmpfr'),
but that's a buglet in the c.mpfr() implementation of my Rmpfr
package... (which has already been fixed in the development version
on R-forge,
https://r-forge.r-project.org/R/?group_id=386)

> But of course that wouldn't help with factor responses.

Yes.  However, a version of Duncan's suggestion -- of treating 'yes'
first -- does help in that case.

For once, mainly as "feasibility experiment",
I have created a github gist to make my current ifelse2() proposal
available for commenting, cloning, pullrequesting, etc:

Consisting of 2 files
- ifelse-def.R :  Function definitions only, basically all the
          current proposals, called  ifelse*()
- ifelse-checks.R : A simplistic checking function
          and examples calling it, notably demonstrating that my
          ifelse2() does work with "Date", <dateTime> (i.e.
          "POSIXct" and "POSIXlt"), factors, and "mpfr"
          (the arbitrary-precision numbers in my package "Rmpfr")

Also if you are not on github, you can quickly get to the ifelse2()
definition :

https://gist.github.com/mmaechler/9cfc3219c4b89649313bfe6853d87894#file-ifelse-def-r-L168

> Also, if you're considering an improved ifelse(), I'd
> strongly urge you to consider adding an `na` argument,

I now did (called it 'NA.').

> so that you can use ifelse() to transform all three
> possible values in a logical vector.

> Hadley
> -- http://hadley.nz

For those who really hate GH (and don't want or cannot easily follow the
above URL), here's my current definition:

##' Martin Maechler, 14. Nov 2016 --- taking into account Duncan M. and Hadley's
##' ideas in the R-devel thread starting at (my mom's 86th birthday):
##' https://stat.ethz.ch/pipermail/r-devel/2016-August/072970.html
ifelse2 <- function (test, yes, no, NA. = NA) {
    if(!is.logical(test)) {
        if(is.atomic(test))
            storage.mode(test) <- "logical"
        else ## typically a "class"; storage.mode<-() typically fails
            test <- if(isS4(test)) methods::as(test, "logical") else as.logical(test)
    }

    ## No longer optimize the "if (a) x else y" cases:
    ## Only "non-good" R users use ifelse(.) instead of if(.) in these cases.

    ans <-
        tryCatch(rep(if(is.object(yes) && identical(class(yes), class(no)))
                         ## as c(o) or o[0] may not work for the class
                         yes else c(yes[0], no[0]), length.out = length(test)),
                 error = function(e) { ## try asymmetric, yes-leaning
                     r <- yes
                     r[!test] <- no[!test]
                     r
                 })
    ok <- !(nas <- is.na(test))
    if (any(test[ok]))
        ans[test & ok] <- rep(yes, length.out = length(ans))[test & ok]
    if (any(!test[ok]))
        ans[!test & ok] <- rep(no, length.out = length(ans))[!test & ok]
    ans[nas] <- NA. # possibly coerced to class(ans)
    ans
}

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel