Using log1p instead of log improves the accuracy of the 'subtract xmax' algorithm when the largest x is very close to zero. Perhaps that is what is missing in the Rf_logspace_add.
test <- function (x) { x <- as.numeric(x) i <- which.max(x) rbind(Rmpfr = as.numeric(log(sum(exp(Rmpfr::mpfr(x, precBits=5000))))), Rf_logspace_sum = logspace_sum(x), subtract_xmax_log1p = log1p(sum(exp(x[-i] - x[i]))) + x[i], subtract_xmax_naive = log(sum(exp(x - x[i]))) + x[i], naive = log(sum(exp(x)))) } > test(c(-1e-50, -46, -47)) [,1] Rmpfr 1.4404614986241e-20 Rf_logspace_sum -1.0000000000000e-50 subtract_xmax_log1p 1.4404614986241e-20 subtract_xmax_naive -1.0000000000000e-50 naive 0.0000000000000e+00 > test(c(0, -46, -47)) [,1] Rmpfr 1.4404614986241e-20 Rf_logspace_sum 0.0000000000000e+00 subtract_xmax_log1p 1.4404614986241e-20 subtract_xmax_naive 0.0000000000000e+00 naive 0.0000000000000e+00 Bill Dunlap TIBCO Software wdunlap tibco.com On Mon, Nov 7, 2016 at 11:09 AM, William Dunlap <wdun...@tibco.com> wrote: > It would be nice if the C functions Rf_logspace_sum, Rf_logspace_add, and > Rf_logspace_sub were available as R functions. (I wish the '_sub' were > '_subtract' because 'sub' means too many things in R.) > > I think Rf_logspace_sum in R could be a little better. E.g., using the C > code > > #include <R.h> > #include <Rinternals.h> > #include <Rmath.h> > > SEXP Call_logspace_sum(SEXP x) > { > if (TYPEOF(x) != REALSXP) > { > Rf_error("'x' must be a numeric vector"); > } > return ScalarReal(Rf_logspace_sum(REAL(x), length(x))); > } > > and the R functions > > logspace_sum <- function (x) .Call("Call_logspace_sum", as.numeric(x)) > > and > > test <- function (x) { > x <- as.numeric(x) > rbind(Rmpfr = as.numeric(log(sum(exp(Rmpfr::mpfr(x, > precBits=5000))))), > Rf_logspace_sum = logspace_sum(x), > subtract_xmax = log(sum(exp(x - max(x)))) + max(x), > naive = log(sum(exp(x)))) > } > > > R-3.3.2 on Linux gives, after options(digits=17) > > test(c(0, -50)) > [,1] > Rmpfr 1.9287498479639178e-22 > Rf_logspace_sum 1.9287498479639178e-22 > subtract_xmax 0.0000000000000000e+00 > naive 0.0000000000000000e+00 > > which is nice, but also the not so nice > > > test(c(0, -50, -50)) > [,1] > Rmpfr 3.8574996959278356e-22 > Rf_logspace_sum 0.0000000000000000e+00 > subtract_xmax 0.0000000000000000e+00 > naive 0.0000000000000000e+00 > > With TERR the second test has Rmpfr==Rf_logspace_sum for that example. > > > > Bill Dunlap > TIBCO Software > wdunlap tibco.com > > On Mon, Nov 7, 2016 at 3:08 AM, Martin Maechler < > maech...@stat.math.ethz.ch> wrote: > >> >>>>> William Dunlap via R-help <r-help@r-project.org> >> >>>>> on Sun, 6 Nov 2016 20:53:17 -0800 writes: >> >> > Perhaps the C function Rf_logspace_sum(double *x, int n) would help >> in >> > computing log(b). It computes log(sum(exp(x_i))) for i in 1..n, >> avoiding >> > unnecessary under- and overflow. >> >> Indeed! >> >> I had thought more than twice to also export it to the R level >> notably as we have been using two R level versions in a package >> I maintain ('copula'). They are vectorized there in a way that >> seemed particularly useful to our (Marius Hofert and my) use cases. >> >> More on this -- making these available in R, how exactly? -- >> probably should move to the R-devel list. >> >> Thank you Bill for bringing it up! >> Martin >> >> > Bill Dunlap >> > TIBCO Software >> > wdunlap tibco.com >> >> > On Sun, Nov 6, 2016 at 5:25 PM, Rolf Turner < >> r.tur...@auckland.ac.nz> wrote: >> >> >> On 07/11/16 13:07, William Dunlap wrote: >> >> >> >>> Have you tried reparameterizing, using logb (=log(b)) instead of >> b? >> >>> >> >> >> >> Uh, no. I don't think that that makes any sense in my context. >> >> >> >> The "b" values are probabilities and must satisfy a "sum-to-1" >> >> constraint. To accommodate this constraint I re-parametrise via a >> >> "logistic" style parametrisation --- basically >> >> >> >> b_i = exp(z_i)/[sum_j exp(z_j)], j = 1, ... n >> >> >> >> with the parameters that the optimiser works with being z_1, ..., >> z_{n-1} >> >> (and with z_n == 0 for identifiability). The objective function >> is of the >> >> form sum_i(a_i * log(b_i)), so I transform back >> >> from the z_i to the b_i in order calculate the value of the >> objective >> >> function. But when the z_i get moderately large-negative, the b_i >> become >> >> numerically 0 and then log(b_i) becomes -Inf. And the optimiser >> falls over. >> >> >> >> cheers, >> >> >> >> Rolf >> >> >> >> >> >>> Bill Dunlap >> >>> TIBCO Software >> >>> wdunlap tibco.com <http://tibco.com> >> >>> >> >>> On Sun, Nov 6, 2016 at 1:17 PM, Rolf Turner < >> r.tur...@auckland.ac.nz >> >>> <mailto:r.tur...@auckland.ac.nz>> wrote: >> >>> >> >>> >> >>> I am trying to deal with a maximisation problem in which it is >> >>> possible for the objective function to (quite legitimately) return >> >>> the value -Inf, which causes the numerical optimisers that I have >> >>> tried to fall over. >> >>> >> >>> The -Inf values arise from expressions of the form "a * log(b)", >> >>> with b = 0. Under the *starting* values of the parameters, a must >> >>> equal equal 0 whenever b = 0, so we can legitimately say that a * >> >>> log(b) = 0 in these circumstances. However as the maximisation >> >>> algorithm searches over parameters it is possible for b to take >> the >> >>> value 0 for values of >> >>> a that are strictly positive. (The values of "a" do not change >> during >> >>> this search, although they *do* change between "successive >> searches".) >> >>> >> >>> Clearly if one is *maximising* the objective then -Inf is not a >> value >> >>> of >> >>> particular interest, and we should be able to "move away". But >> the >> >>> optimising function just stops. >> >>> >> >>> It is also clear that "moving away" is not a simple task; you >> can't >> >>> estimate a gradient or Hessian at a point where the function value >> >>> is -Inf. >> >>> >> >>> Can anyone suggest a way out of this dilemma, perhaps an optimiser >> >>> that is equipped to cope with -Inf values in some sneaky way? >> >>> >> >>> Various ad hoc kludges spring to mind, but they all seem to be >> >>> fraught with peril. >> >>> >> >>> I have tried changing the value returned by the objective function >> >>> from >> >>> "v" to exp(v) --- which maps -Inf to 0, which is nice and finite. >> >>> However this seemed to flatten out the objective surface too much, >> >>> and the search stalled at the 0 value, which is the antithesis of >> >>> optimal. >> >>> >> >>> The problem arises in a context of applying the EM algorithm where >> >>> the M-step cannot be carried out explicitly, whence numerical >> >>> optimisation. >> >>> I can give more detail if anyone thinks that it could be relevant. >> >>> >> >>> I would appreciate advice from younger and wiser heads! :-) >> >>> >> >>> cheers, >> >>> >> >>> Rolf Turner >> >>> >> >>> -- >> >>> Technical Editor ANZJS >> >>> Department of Statistics >> >>> University of Auckland >> >>> Phone: +64-9-373-7599 ext. 88276 <tel:%2B64-9-373-7599%20ext.%2 >> 088276> >> >>> >> >>> ______________________________________________ >> >>> R-help@r-project.org <mailto:R-help@r-project.org> mailing list >> -- >> >>> To UNSUBSCRIBE and more, see >> >>> https://stat.ethz.ch/mailman/listinfo/r-help >> >>> <https://stat.ethz.ch/mailman/listinfo/r-help> >> >>> PLEASE do read the posting guide >> >>> http://www.R-project.org/posting-guide.html >> >>> <http://www.R-project.org/posting-guide.html> >> >>> and provide commented, minimal, self-contained, reproducible code. >> >>> >> >>> >> >>> >> >> >> >> -- >> >> Technical Editor ANZJS >> >> Department of Statistics >> >> University of Auckland >> >> Phone: +64-9-373-7599 ext. 88276 >> >> >> >> > [[alternative HTML version deleted]] >> >> > ______________________________________________ >> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >> > https://stat.ethz.ch/mailman/listinfo/r-help >> > PLEASE do read the posting guide http://www.R-project.org/posti >> ng-guide.html >> > and provide commented, minimal, self-contained, reproducible code. >> > > [[alternative HTML version deleted]] ______________________________________________ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.