On Sun, 13 Aug 2023, Hans W writes:

> While working on 'random walk' applications, I got interested in
> optimizing noisy objective functions. As an (artificial) example, the
> following is the Rosenbrock function, where Gaussian noise of standard
> deviation `sd = 0.01` is added to the function value.
>
>     fn <- function(x)
>         (1 + rnorm(1, sd = 0.01)) * adagio::fnRosenbrock(x)
>
> To smooth out the noise, define another function `fnk(x, k = 1)` that
> calls `fn` k times and returns the mean of those k function values.
>
>     fnk <- function(x, k = 1) {   # fnk(x) same as fn(x)
>         rv <- 0.0
>         for (i in 1:k) rv <- rv + fn(x)
>         return(rv/k)
>     }
>
> When we apply several optimization solvers to the noisy function and
> its smoothed versions, we get, for instance, the following results.
> (The starting point is always `rep(0.1, 5)`, the maximal number of
> iterations is 5000, the relative tolerance is 1e-12, and a run counts
> as successful if the function value at the returned minimum is below
> 1e-06.)
>
>     k     nmk    anms   neldermead   ucminf   optim_BFGS
>     ----------------------------------------------------
>     1     0.21   0.32      0.13       0.00       0.00
>     3     0.52   0.63      0.50       0.00       0.00
>     10    0.81   0.91      0.87       0.00       0.00
>
> Solvers: nmk = dfoptim::nmk, anms = pracma::anms [both Nelder-Mead
> codes], neldermead = nloptr::neldermead, ucminf = ucminf::ucminf,
> optim_BFGS = optim with method "BFGS".
>
> Read the table as follows: for k = 1, `nmk` is successful in 21% of
> the trials, while, for example, `optim` never comes close to the true
> minimum.
>
> I think it is reasonable to assume that gradient-based methods do
> poorly on noisy objectives, though I did not expect to see them fail
> so clearly. On the other hand, the Nelder-Mead implementations do
> quite well if there is not too much noise.
>
> In real-world applications it will often not be possible to repeat the
> same measurement several times; we then have to live with `k = 1`.
> In my applications with long 'random walks', doing the calculations
> several times in a row really takes some time.
>
> QUESTION: What could be other approaches to minimize noisy functions?
>
> I looked through some "Stochastic Programming" tutorials and did not
> find them very helpful in this situation. Of course, I might have
> looked at these works too superficially.
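(Just to have something concrete to experiment with: success rates like
those in the table can be estimated with a small harness along the
following lines. This is only a sketch, using `fnk` as defined above,
dfoptim::nmk, 100 trials per value of k and a budget of 5000 function
evaluations; these settings are assumptions, since the original
benchmark code was not posted.)

    library("dfoptim")

    ## estimate the success rate of nmk for a given k:
    ## repeat the optimization 'ntrials' times and count how often
    ## the returned function value is below 1e-6
    success.rate <- function(k, ntrials = 100) {
        ok <- logical(ntrials)
        for (i in seq_len(ntrials)) {
            sol <- nmk(rep(0.1, 5), fnk, k = k,
                       control = list(tol = 1e-12, maxfeval = 5000))
            ok[i] <- sol$value < 1e-6
        }
        mean(ok)
    }

    sapply(c(1, 3, 10), success.rate)   ## compare with the 'nmk' column
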
Since Nelder--Mead worked /relatively/ well: in my experience, its
results can sometimes be improved by restarting it, i.e. by
re-initializing the simplex (a minimal sketch of the idea is in the PS
below). See this note: http://enricoschumann.net/R/restartnm.htm ,
which incidentally also uses the Rosenbrock function.

Just for the record, I think Differential Evolution (DE) can handle
such problems well, though it would usually need more iterations. As
computational "proof" ;-) , I ran DE 50 times for the k == 1 case and
each time stored whether the resulting objective-function value was
below 1e-6. I let DE take many more function evaluations (a population
of 100 over 500 generations = 50000 function evaluations), and it
reaches a value below 1e-6 in all 50 cases.

    library("NMOF")

    ofv.below.threshold <- logical(50)
    for (i in seq_along(ofv.below.threshold)) {
        sol <- DEopt(fnk,
                     algo = list(nP  = 100,
                                 nG  = 500,
                                 min = rep( 5, 5),
                                 max = rep(10, 5)))
        ofv.below.threshold[i] <- sol$OFvalue < 1e-6
    }
    sum(ofv.below.threshold)/length(ofv.below.threshold)

(These 50 runs take less than half a minute on my machine.) Note that I
have on purpose initialized the population in the range 5 to 10, i.e.
way off the optimum.

--
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net
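PS: Here is a minimal sketch of the restart idea mentioned above (an
illustration only, not the code from the linked note, and again using
`fnk` from the quoted message): re-run dfoptim::nmk a few times, each
run starting from the best point of the previous one, so that the
simplex is rebuilt around it.

    library("dfoptim")

    ## restarted Nelder-Mead: re-run nmk from the current best point,
    ## which re-initializes the simplex around it; the number of
    ## restarts (5) is an arbitrary choice for illustration
    x <- rep(0.1, 5)          ## starting point from the original post
    for (r in 1:5) {
        sol <- nmk(x, fnk, k = 1,
                   control = list(tol = 1e-12, maxfeval = 5000))
        x <- sol$par          ## next run starts from the best point so far
    }
    sol$value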