John,
   I purposefully narrowed my question to a single issue. But since you brought it up, and FYI: the choice of optimizer is indeed one of the most important aspects of the problem. Your insight is right on target.

The underlying problem is multi-state hazard models for interval-censored data, driven by real data in dementia research. The msm package deals with such problems, but it uses optim behind the scenes, and for our large and more complex problems optim either takes forever (never finds the solution) or nearly forever, too much of the time. Behind the scenes, the likelihood for each subject involves a product of matrix exponentials, which are slow, so at the very bottom of the call chain are functions that evaluate the likelihood and first derivative for one subject (multiple rows of data over their observation time window), invoked via mclapply. The first derivative is feasible but not the second. Parallel computation across subjects was a big win with respect to compute time.
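
A rough sketch of that bottom layer, for concreteness (not our actual code; the rate-matrix builder Qfun, the subjects list, and its from/to/gap fields are hypothetical placeholders, not msm internals):

library(expm)      # expm() for the matrix exponential
library(parallel)  # mclapply() for parallel evaluation across subjects

## Toy 3-state illness-death intensity matrix; stands in for the real model
Qfun <- function(theta) {
  Q <- matrix(0, 3, 3)
  Q[1, 2] <- exp(theta[1]); Q[1, 3] <- exp(theta[2]); Q[2, 3] <- exp(theta[3])
  diag(Q) <- -rowSums(Q)
  Q
}

## Log-likelihood contribution of one subject: a product of matrix
## exponentials over the gaps between that subject's observation times
loglik_subject <- function(subj, theta) {
  Q <- Qfun(theta)
  ll <- 0
  for (j in seq_along(subj$gap)) {
    P <- expm(Q * subj$gap[j])                  # transition probabilities
    ll <- ll + log(P[subj$from[j], subj$to[j]])
  }
  ll
}

## Total log-likelihood, computed in parallel across subjects
loglik <- function(theta, subjects, cores = 4L) {
  sum(unlist(mclapply(subjects, loglik_subject, theta = theta,
                      mc.cores = cores)))
}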

   The two maximizers that seem to work well are (1) Fisher scoring (using sum_i U_i U_i' to approximate the Hessian, where U_i is the first-derivative contribution of subject i) combined with Levenberg-Marquardt damping, a trust region, and constraints, or (2) a full MCMC approach (doi.org/10.1080/01621459.2019.1594831). We may eventually update the MCMC to use Hamiltonian methods, but that is still far on the horizon.
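
For reference, the core of one scoring iteration looks roughly like this (score_subject is a placeholder for the per-subject first derivative mentioned above; the trust-region logic and constraint handling are omitted):

## One Fisher-scoring step with Levenberg-Marquardt damping, using the
## outer-product-of-scores approximation sum_i U_i U_i' to the information
scoring_step <- function(theta, subjects, score_subject, lambda = 1e-3) {
  U <- sapply(subjects, function(s) score_subject(theta, s))  # p x n matrix
  score  <- rowSums(U)                          # total score vector
  info   <- tcrossprod(U)                       # sum_i U_i U_i'
  damped <- info + lambda * diag(diag(info))    # Marquardt-style damping
  theta + solve(damped, score)                  # damped Newton-type update
}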

In any case, this statistical approach seems to be a winner, and we intend to apply it multiple times over the next several years. That means I need to take code that works, but that has the appearance of a Rube Goldberg invention and that only I can use, and make it available to others both inside and outside our group. My question involved one practical issue that comes up as I tackle this.

Terry


On 12/2/25 08:26, J C Nash wrote:
> Duncan's suggestion to time things is important -- and would make a very useful short communication or blog! There are frequently differences of orders of magnitude in timing.
>
> I'll also suggest that it is worth doing some crude timings of different solvers. There is sufficient variation over problems that this won't decide definitively which solver is fastest, but you might eliminate one or two that are poor for your situation. Depending on the number of parameters, I'd guess ncg or its predecessor Rcgmin will be relatively good. LBFGS variants can be good, but sometimes seem to toss up disasters. Most of these can be accessed with the optimx package to save coding. By removing some checks and safeguards in optimx you could likely speed things up a bit too.
>
> If the full optimum is not needed, some attention to early stopping might be worthwhile, but I've seen lots of silly mistakes made playing with tolerances; if you go that route, choose a custom termination rule that fits your particular problem or you'll get rubbish.
>
> JN
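
(For concreteness, a crude timing harness along the lines John suggests might look like the following; the quadratic nll/ngr pair is just a toy stand-in for the real negative log-likelihood and its gradient.)

library(optimx)

nll <- function(theta) sum((theta - 1:3)^2)   # toy objective
ngr <- function(theta) 2 * (theta - 1:3)      # its gradient

methods <- c("Rcgmin", "L-BFGS-B", "nlminb")
timings <- sapply(methods, function(m)
  system.time(optimx(par = rep(0, 3), fn = nll, gr = ngr,
                     method = m))["elapsed"])
print(timings)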
