I should summarize what I've learned after my more recent post on this: With 'long double', it is extremely expensive to keep "track of" non-finites (NA, NaN and Inf) when doing basic arithmetics. The most extreme cost is seen when the first value is non-finite, because there is penalty added in all of the following iterations.
EXAMPLE: ## All finite values > x <- rep(0.1, 1e8) > system.time(sum(x, narm=FALSE)) user system elapsed 0.28 0.00 0.28 ## First value is NA (maximum penalty) > y <- x; x[1] <- NA > system.time(sum(y, narm=FALSE)) user system elapsed 23.47 0.00 23.93 (sic!) ## Last value is NA (minimum penalty) > z <- x; x[length(z)] <- NA > system.time(sum(z, narm=FALSE)) user system elapsed 0.35 0.00 0.34 ## Silly, but proof of concept by summing rev(y) ## such that NA ends up last > system.time(sum(rev(y), narm=FALSE)) user system elapsed 2.40 0.85 3.31 This penalty of having non-finite values is only observed with 'long double', but not with 'double' (see previous message). WORKAROUND: To workaround this penalty, one can test the running result for being non-finite and return early, e.g. if na.rm=FALSE then as soon as the running sum is NA_REAL, return NA_REAL. Unfortunately, the cost for testing for NA_REAL is expensive, so the penality of doing every iteration would be too expensive. A better approach is to test for early stopping at some interval. For example: > sum4 <- inline::cfunction(sig=c(x="double", narm="logical"), body=' #define LDOUBLE long double double *x_ = REAL(x); int narm_ = asLogical(narm); int n = length(x); LDOUBLE sum = 0.0; for (R_xlen_t i = 0; i < n; i++) { /* If narm=FALSE, always sum */ if (!narm_ || !ISNAN(x_[i])) sum += x_[i]; /* Early stopping check when sum non-finite */ if (!R_FINITE(sum)) break; } return ScalarReal((double)sum); ') ## See previous email for non-early stopping version > system.time(sum3(x, narm=FALSE)) user system elapsed 1.05 0.00 1.05 ## Early stopping every iteration adds quite a bit ## of overhead if never used > system.time(sum4(x, narm=FALSE)) user system elapsed 1.79 0.00 1.86 ## Early stopping once in a while > sum5 <- inline::cfunction(sig=c(x="double", narm="logical"), body=' #define LDOUBLE long double double *x_ = REAL(x); int narm_ = asLogical(narm); int n = length(x); LDOUBLE sum = 0.0; for (R_xlen_t i = 0; i < n; i++) { /* If narm=FALSE, always sum */ if (!narm_ || !ISNAN(x_[i])) sum += x_[i]; /* Early stopping check when sum non-finite */ if (i % 1048576 == 0 && !R_FINITE(sum)) break; } return ScalarReal((double)sum); ') ## Low extra cost even without non-finites > system.time(sum5(x, narm=FALSE)) user system elapsed 1.05 0.00 1.06 ## ...and can still do early stopping > system.time(sum5(y, narm=FALSE)) user system elapsed 0 0 0 /Henrik On Sun, May 31, 2015 at 6:08 PM, Henrik Bengtsson <henrik.bengts...@ucsf.edu> wrote: > This is a great example how you cannot figure it out after spending > two hours troubleshooting, but a few minutes after you post to > R-devel, it's just jumps to you (is there a word for this other than > "impatient"?); > > Let me answer my own question. The discrepancy between my sum2() code > and the internal code for base::sum() is that the latter uses LDOUBLE > = long double (on some system it's only double, cf. > https://github.com/wch/r-source/blob/trunk/src/nmath/nmath.h#L28-L33), > whereas my sum2() code uses double. So using long double, I can > reproduce the penalty of having NA_real_ with na.rm=FALSE; > >> sum3 <- inline::cfunction(sig=c(x="double", narm="logical"), body=' > #define LDOUBLE long double > double *x_ = REAL(x); > int narm_ = asLogical(narm); > int n = length(x); > LDOUBLE sum = 0.0; > for (R_xlen_t i = 0; i < n; i++) { > if (!narm_ || !ISNAN(x_[i])) sum += x_[i]; > } > return ScalarReal((double)sum); > ') > >> x <- rep(0, 1e8) >> stopifnot(typeof(x) == "double") >> system.time(sum3(x, narm=FALSE)) > user system elapsed > 0.40 0.00 0.44 >> y <- rep(NA_real_, 1e8) >> stopifnot(typeof(y) == "double") >> system.time(sum3(y, narm=FALSE)) > user system elapsed > 9.80 0.00 9.84 >> z <- x; z[length(z)/2] <- NA_real_ >> stopifnot(typeof(z) == "double") >> system.time(sum3(z, narm=FALSE)) > user system elapsed > 4.49 0.00 4.50 > > This might even be what the following comment refers to: > > /* Required by C99 but might be slow */ > #ifdef HAVE_LONG_DOUBLE > # define LDOUBLE long double > #else > # define LDOUBLE double > #endif > > So now I should rephrase my question: Is there away to avoid this > penalty when using 'long double'? Is this something the compiler can > be clever about, or is the only solution to not use 'long double'? > > /Henrik > > On Sun, May 31, 2015 at 5:02 PM, Henrik Bengtsson > <henrik.bengts...@ucsf.edu> wrote: >> I'm observing that base::sum(x, na.rm=FALSE) for typeof(x) == "double" >> is much more time consuming when there are missing values versus when >> there are not. I'm observing this on both Window and Linux, but it's >> quite surprising to me. Currently, my main suspect is settings in on >> how R was built. The second suspect is my brain. I hope that someone >> can clarify the below results and confirm or not whether they see the >> same. Note, this is for "doubles", so I'm not expecting >> early-stopping as for "integers" (where testing for NA is cheap). >> >> On R 3.2.0, on Windows (using the official CRAN builds), on Linux >> (local built), and on OS X (official AT&T builds), I get: >> >>> x <- rep(0, 1e8) >>> stopifnot(typeof(x) == "double") >>> system.time(sum(x, na.rm=FALSE)) >> user system elapsed >> 0.19 0.01 0.20 >> >>> y <- rep(NA_real_, 1e8) >>> stopifnot(typeof(y) == "double") >>> system.time(sum(y, na.rm=FALSE)) >> user system elapsed >> 9.54 0.00 9.55 >> >>> z <- x; z[length(z)/2] <- NA_real_ >>> stopifnot(typeof(z) == "double") >>> system.time(sum(z, na.rm=FALSE)) >> user system elapsed >> 4.49 0.00 4.51 >> >> Following the source code, I'm pretty sure the code >> (https://github.com/wch/r-source/blob/trunk/src/main/summary.c#L112-L128) >> performing the calculation is: >> >> static Rboolean rsum(double *x, R_xlen_t n, double *value, Rboolean narm) >> { >> LDOUBLE s = 0.0; >> Rboolean updated = FALSE; >> for (R_xlen_t i = 0; i < n; i++) { >> if (!narm || !ISNAN(x[i])) { >> if(!updated) updated = TRUE; >> s += x[i]; >> } >> } >> if(s > DBL_MAX) *value = R_PosInf; >> else if (s < -DBL_MAX) *value = R_NegInf; >> else *value = (double) s; >> return updated; >> } >> >> In other words, when na.rm=FALSE, that inner for loop: >> >> for (R_xlen_t i = 0; i < n; i++) { >> if (!narm || !ISNAN(x[i])) { >> if(!updated) updated = TRUE; >> s += x[i]; >> } >> } >> >> should effectively become (because !ISNAN(x[i]) "does not make a >> difference"): >> >> for (R_xlen_t i = 0; i < n; i++) { >> if (!narm) { >> if(!updated) updated = TRUE; >> s += x[i]; >> } >> } >> >> That is, sum(x, na.rm=FALSE) basically spends time on `s += x[i]`. >> Now, I have always been under impression that summing with NA:s is >> *not* more expensive that summing over regular (double) values, which >> is confirmed by the below example, but the above benchmarking >> disagree. It looks like there is a big overhead keeping track of the >> sum `s` being NA, which is supported by the fact that summing over 'z' >> is costs half of 'y'. >> >> Now, I *cannot* reproduce the above using the following 'inline' example: >> >>> sum2 <- inline::cfunction(sig=c(x="double", narm="logical"), body=' >> double *x_ = REAL(x); >> int narm_ = asLogical(narm); >> int n = length(x); >> double sum = 0; >> for (R_xlen_t i = 0; i < n; i++) { >> if (!narm_ || !ISNAN(x_[i])) sum += x_[i]; >> } >> return ScalarReal(sum); >> ') >> >>> x <- rep(0, 1e8) >>> stopifnot(typeof(x) == "double") >>> system.time(sum2(x, narm=FALSE)) >> user system elapsed >> 0.16 0.00 0.16 >> >>> y <- rep(NA_real_, 1e8) >>> stopifnot(typeof(y) == "double") >>> system.time(sum2(y, narm=FALSE)) >> user system elapsed >> 0.16 0.00 0.15 >> >>> z <- x; z[length(z)/2] <- NA_real_ >>> stopifnot(typeof(z) == "double") >>> system.time(sum2(z, narm=FALSE)) >> user system elapsed >> 0.16 0.00 0.15 >> >> This is why I suspect it's related to how R was configured when it was >> built. What's going on? Can someone please bring some light on this? >> >> Thanks >> >> Henrik ______________________________________________ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel