On 09/16/2015 04:41 PM, Bert Gunter wrote:
Yes! Chuck's use of mapply is exactly the split/combine strategy I was
looking for. In retrospect, exactly how one should think about it.
Many thanks to all for a constructive discussion.

-- Bert


Bert Gunter


Use mapply like this on large problems:

unsplit(
   mapply(
       function(x,z) eval( x, list( y=z )),
       expression( A=y*2, B=y+3, C=sqrt(y) ),
       split( dat$Flow, dat$ASB ),
       SIMPLIFY=FALSE),
   dat$ASB)

Chuck
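
The split/apply/combine steps in the mapply call above can be traced on a small example. This is only a sketch: the six-row data frame and the intermediate names (pieces, exprs, res, out) are made up for illustration and are not from the thread. It relies on the expressions being listed in the same order as the levels of ASB, since mapply pairs them by position.

```r
# Toy data (hypothetical values, chosen so the results are easy to check)
dat <- data.frame(
    ASB  = factor(c("A", "B", "C", "A", "B", "C")),
    Flow = c(1, 4, 9, 16, 25, 36))

# split: one vector of Flow per level of ASB, in level order A, B, C
pieces <- split(dat$Flow, dat$ASB)

# apply: the i-th expression is evaluated with y bound to the i-th piece
exprs <- expression(A = y * 2, B = y + 3, C = sqrt(y))
res <- mapply(function(x, z) eval(x, list(y = z)),
              exprs, pieces, SIMPLIFY = FALSE)

# combine: unsplit() puts the results back in the original row order
out <- unsplit(res, dat$ASB)
out
# 2 7 3 32 28 6
```
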



Is there any reason not to use data.table for this purpose, especially if efficiency is a concern?

---

# load data.table and microbenchmark
library(data.table)
library(microbenchmark)
#
# prepare data
DF <- data.frame(
    ASB = rep_len(factor(LETTERS[1:3]), 3e5),
    Flow = rnorm(3e5)^2)
DT <- as.data.table(DF)
DT[, ASB := as.character(ASB)]
#
# define functions
#
# Chuck's version
fnSplit <- function(dat) {
    unsplit(
        mapply(
            function(x,z) eval( x, list( y=z )),
            expression( A=y*2, B=y+3, C=sqrt(y) ),
            split( dat$Flow, dat$ASB ),
            SIMPLIFY=FALSE),
        dat$ASB)
}
#
# data.table-way (IMHO, much easier to read)
fnDataTable <- function(dat) {
    dat[,
        result :=
            if (.BY == "A") {
                2 * Flow
            } else if (.BY == "B") {
                3 + Flow
            } else if (.BY == "C") {
                sqrt(Flow)
            },
        by = ASB]
}
#
# benchmark
#
microbenchmark(fnSplit(DF), fnDataTable(DT))
#
# check that both approaches return the same values
identical(fnSplit(DF), fnDataTable(DT)[, result])

---

Actually, in Chuck's version the unsplit() part is the slow step. If the original row order is not a concern (e.g., DF is sorted by ASB before calling fnSplit), fnSplit is comparable to the DT-version.


Denes
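
The remark about unsplit() can be illustrated with a rough sketch, assuming the rows may be sorted by ASB up front. The names DFo and fnSplitSorted are hypothetical, and the data are a small stand-in for the benchmark data. Once each group's rows are contiguous, the per-group results can simply be concatenated with unlist() instead of being reassembled with unsplit().

```r
# Small stand-in data (illustrative size and seed)
set.seed(1)
DF <- data.frame(
    ASB  = factor(rep_len(LETTERS[1:3], 30)),
    Flow = rnorm(30)^2)

# Sort once so that each group's rows are contiguous (A..., B..., C...)
DFo <- DF[order(DF$ASB), ]

# Same split/apply steps as fnSplit, but combine with unlist();
# this is only valid when dat is already sorted by ASB
fnSplitSorted <- function(dat) {
    unlist(
        mapply(
            function(x, z) eval(x, list(y = z)),
            expression(A = y * 2, B = y + 3, C = sqrt(y)),
            split(dat$Flow, dat$ASB),
            SIMPLIFY = FALSE),
        use.names = FALSE)
}

res_sorted <- fnSplitSorted(DFo)
```

The result lines up with the rows of DFo, not with the original DF, so it should only be attached to the sorted data.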

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
