I've looked into improving performance further, but it seems any further
improvements will need big API changes for memory management.

Currently, using Gauss-Newton with Cholesky (or LU) requires 4 matrix
allocations on _each_ evaluation. The objective function initially
allocates the Jacobian matrix. Then the weights are applied through
matrix multiplication, allocating a new matrix. Computing the normal
equations allocates a new matrix to hold the result, and finally the
decomposition allocates its own matrix as a copy. With QR there are 3
matrix allocations per model function evaluation, since there is no
need to compute the normal equations, but the third allocation and copy
is larger. Some empirical sampling data I've collected with the
jvisualvm tool indicates that matrix allocation and copying take 30% to
80% of the execution time, depending on the dimension of the Jacobian.
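
To make the allocation count concrete, here is a rough sketch of the
Gauss-Newton/Cholesky path using plain arrays. It is not the Commons
Math code: the class and method names are made up for illustration, the
Jacobian values are placeholders, and the weights are assumed to be
diagonal.

```java
/** Hypothetical sketch, not Commons Math code: counts matrix allocations per evaluation. */
final class AllocationSketch {
    /** Number of matrices allocated so far. */
    static int allocations = 0;

    /** Allocates a rows-by-cols matrix and records the allocation. */
    static double[][] newMatrix(int rows, int cols) {
        allocations++;
        return new double[rows][cols];
    }

    /** Allocation 1: the model function creates a fresh Jacobian (placeholder values). */
    static double[][] jacobian(double[] point, int observations) {
        double[][] j = newMatrix(observations, point.length);
        for (int i = 0; i < observations; i++) {
            for (int k = 0; k < point.length; k++) {
                j[i][k] = Math.pow(i + 1, k);
            }
        }
        return j;
    }

    /** Allocation 2: applying (assumed diagonal) square-root weights yields a new matrix. */
    static double[][] applyWeights(double[] sqrtWeights, double[][] j) {
        double[][] wj = newMatrix(j.length, j[0].length);
        for (int i = 0; i < j.length; i++) {
            for (int k = 0; k < j[0].length; k++) {
                wj[i][k] = sqrtWeights[i] * j[i][k];
            }
        }
        return wj;
    }

    /** Allocation 3: the normal equations matrix J^T J. */
    static double[][] normalMatrix(double[][] wj) {
        int n = wj[0].length;
        double[][] jtj = newMatrix(n, n);
        for (double[] row : wj) {
            for (int a = 0; a < n; a++) {
                for (int b = 0; b < n; b++) {
                    jtj[a][b] += row[a] * row[b];
                }
            }
        }
        return jtj;
    }

    /** Allocation 4: the decomposition works on its own copy of the matrix. */
    static double[][] copyForDecomposition(double[][] m) {
        double[][] copy = newMatrix(m.length, m[0].length);
        for (int i = 0; i < m.length; i++) {
            System.arraycopy(m[i], 0, copy[i], 0, m[i].length);
        }
        return copy;
    }

    /** Runs one evaluation and returns how many matrices it allocated. */
    static int allocationsPerEvaluation(double[] point, double[] sqrtWeights) {
        int before = allocations;
        double[][] j = jacobian(point, sqrtWeights.length);
        double[][] wj = applyWeights(sqrtWeights, j);
        double[][] jtj = normalMatrix(wj);
        copyForDecomposition(jtj);
        return allocations - before;
    }
}
```

Every call to allocationsPerEvaluation allocates all four matrices
again, which is the per-evaluation cost described above.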

One way to improve performance would be to provide pre-allocated space
for the Jacobian and reuse it for each evaluation. The
LeastSquaresProblem interface would then be:

void evaluate(RealVector point, RealVector resultResiduals, RealMatrix
resultJacobian);
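
As a rough illustration of how a caller might use such a method, here
is a sketch with plain arrays standing in for the Commons Math types.
The PreallocatedProblem interface, the LineFit model and the ReuseDemo
helper are all hypothetical names, and the Jacobian buffer is a
two-dimensional array here.

```java
/** Hypothetical interface, not the Commons Math API: the caller owns the buffers. */
interface PreallocatedProblem {
    void evaluate(double[] point, double[] resultResiduals, double[][] resultJacobian);
}

/** Example model y = a * x + b, with point = {a, b}. */
final class LineFit implements PreallocatedProblem {
    private final double[] x;
    private final double[] y;

    LineFit(double[] x, double[] y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public void evaluate(double[] point, double[] resultResiduals, double[][] resultJacobian) {
        for (int i = 0; i < x.length; i++) {
            // residual = observed value minus model value
            resultResiduals[i] = y[i] - (point[0] * x[i] + point[1]);
            resultJacobian[i][0] = x[i]; // d(model)/da
            resultJacobian[i][1] = 1.0;  // d(model)/db
        }
    }
}

final class ReuseDemo {
    /** Evaluates the problem at several points, allocating the buffers only once. */
    static double[] residualsAtLastPoint(PreallocatedProblem problem,
                                         double[][] points,
                                         int observations) {
        double[] residuals = new double[observations];
        double[][] jacobian = new double[observations][points[0].length];
        for (double[] p : points) {
            problem.evaluate(p, residuals, jacobian); // overwrites the same buffers
        }
        return residuals;
    }
}
```

The trade-off is that the results are only valid until the next call,
so any caller that wants to keep a Jacobian has to copy it explicitly.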

I'm interested in hearing your ideas on other approaches to solve this
issue, or on whether this is an issue worth solving at all.

Best Regards,
Evan

