John,
I purposefully narrowed my question to a single issue. But since you brought it up: the choice of optimizer is indeed one of the most important aspects of the problem, and your insight is right on target.
The underlying problem is multi-state hazard models for interval-censored data, driven by real data in dementia research. The msm package handles such problems, but it uses optim behind the scenes, and for our large and more complex problems optim either takes forever (never finds the solution) or takes nearly forever too much of the time. Behind the scenes, the likelihood for each subject involves a product of matrix exponentials, which are slow, so at the very bottom of the call chain are functions that evaluate the likelihood and first derivative for one subject (multiple rows of data over their observation time window), invoked via mclapply. The first derivative is feasible but not the second derivative. Parallel computation across subjects was a big win with respect to compute time.
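For concreteness, the bottom of the call chain looks roughly like the sketch below. This is only an illustration of the structure, not the actual code: makeQ() (which builds the intensity matrix from the parameters) and the layout of each subject's data are invented here.

    ## Sketch: one subject's log-likelihood as a product of matrix exponentials
    ## over that subject's observation gaps, with the total parallelized over
    ## subjects via mclapply.  makeQ() and the subj columns are hypothetical.
    library(expm)       # expm()
    library(parallel)   # mclapply()

    loglik1 <- function(theta, subj) {
        # subj$time: observation times; subj$state: state observed at each time
        lp <- 0
        for (k in seq_len(nrow(subj) - 1)) {
            Q  <- makeQ(theta, subj[k, ])        # intensity matrix for this gap
            dt <- subj$time[k + 1] - subj$time[k]
            P  <- expm(Q * dt)                   # transition probability matrix
            lp <- lp + log(P[subj$state[k], subj$state[k + 1]])
        }
        lp
    }

    loglik <- function(theta, subjects, cores = 4) {
        sum(unlist(mclapply(subjects, function(s) loglik1(theta, s),
                            mc.cores = cores)))
    }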
The two maximizers which seem to work well are Fisher scoring (using sum_i U_i U_i' to approximate the Hessian, where U_i is the first-derivative contribution of subject i) combined with Levenberg-Marquardt damping, a trust region, and constraints; or a full MCMC approach (doi.org/10.1080/01621459.2019.1594831). We may eventually update the MCMC to use Hamiltonian methods, but that is still far on the horizon.
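The scoring update itself is simple; a minimal sketch (ignoring the trust region and the constraints, and assuming a score() function that returns the n x p matrix whose i-th row is U_i) is:

    ## Sketch: one damped Fisher-scoring step.  score() is assumed to return
    ## the matrix of per-subject score contributions, one row per subject.
    scoring_step <- function(theta, subjects, lambda = 1e-3) {
        U <- score(theta, subjects)   # n x p, row i = U_i
        g <- colSums(U)               # total score vector
        H <- crossprod(U)             # sum_i U_i U_i', the information approximation
        step <- solve(H + lambda * diag(diag(H)), g)   # Levenberg-Marquardt damping
        theta + step                  # ascent step for the log-likelihood
    }

In practice lambda is grown or shrunk depending on whether the step actually improves the log-likelihood.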
In any case, this statistical approach seems to be a winner, and we intend to apply it multiple times over the next several years. Which means that I need to take code that works but has the appearance of a Rube Goldberg invention, and which only I can use, and make it available to others both inside and outside our group. The question concerned one practical issue I am facing as I tackle this.
Terry
On 12/2/25 08:26, J C Nash wrote:
> Duncan's suggestion to time things is important -- and would make a very useful short communication or blog! There are frequently differences of orders of magnitude in timing.
>
> I'll also suggest that it is worth doing some crude timings of different solvers. There is sufficient variation across problems that this won't decide definitively which solver is fastest, but you might eliminate one or two that are poor for your situation. Depending on the number of parameters, I'd guess ncg or its predecessor Rcgmin will be relatively good. L-BFGS variants can be good, but sometimes seem to toss up disasters. Most of these can be accessed through the optimx package to save coding. By removing some checks and safeguards in optimx you could likely speed things up a bit too.
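> Something along these lines is usually enough (a rough sketch; negll, its gradient negll_gr, and start stand in for your own negative log-likelihood, gradient, and starting values):
>
>     library(optimx)
>     ## Run several solvers on the same problem and compare results.
>     res <- optimx(start, fn = negll, gr = negll_gr,
>                   method = c("Rcgmin", "L-BFGS-B", "nlminb"),
>                   control = list(kkt = FALSE))  # skip the KKT checks for speed
>     summary(res, order = value)                 # compare objective values and reported times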
>
> If the full optimum is not needed, some attention to early stopping might be worthwhile, but I've seen lots of silly mistakes made playing with tolerances. If you go that route, choose a custom termination rule that fits your particular problem or you'll get rubbish.
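> For instance, a rule as simple as "stop when the log-likelihood gain over an iteration is negligible relative to its size" (sketch only):
>
>     ## Sketch of a problem-specific stopping rule
>     converged <- function(ll_new, ll_old, tol = 1e-6) {
>         (ll_new - ll_old) < tol * (abs(ll_old) + tol)
>     }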
>
> JN