On 08/26/2013 03:11 PM, Konstantin Berlin wrote:
> I looked only at the GaussNewton class as a general guide to how
> things work. I like it a lot! I only wish all of the optimizers
> were rewritten in this way.
>
> Several comments
>
> 1) I believe this code can now be simplified
>
> // build the linear problem
>             final double[] b = new double[nC];
>             final double[][] a = new double[nC][nC];
>             for (int i = 0; i < nR; ++i) {
>
>                 final double[] grad = weightedJacobian.getRow(i);
>                 final double residual = currentResiduals.getEntry(i);
>
>                 // compute the normal equation
>                 //residual is already weighted
>                 for (int j = 0; j < nC; ++j) {
>                     b[j] += residual * grad[j];
>                 }
>
>                 // build the contribution matrix for measurement i
>                 for (int k = 0; k < nC; ++k) {
>                     double[] ak = a[k];
>                     //Jacobian/gradient is already weighted
>                     for (int l = 0; l < nC; ++l) {
>                         ak[l] += grad[k] * grad[l];
>                     }
>                 }
>             }
>
> Should be just this (I haven't checked the math)
>
> jt = weightedJacobian.transpose();
> a = jt * weightedJacobian;
> b = jt * currentResiduals;

Yes. I think if we rewrite the implementation we should dig a little
deeper into the math.
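
For the record, with the RealMatrix/RealVector API that whole
accumulation loop collapses to three calls. A minimal sketch (untested;
it assumes weightedJacobian is a RealMatrix and currentResiduals a
RealVector, as in the current code, with the classes coming from
org.apache.commons.math3.linear):

    // Normal equations in matrix form: a = J^T * J, b = J^T * r.
    // Jacobian and residuals are already weighted, as in the loop above.
    final RealMatrix jt = weightedJacobian.transpose();
    final RealMatrix a = jt.multiply(weightedJacobian);
    final RealVector b = jt.operate(currentResiduals);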

Solving J^T J x = J^T b via a Cholesky decomposition takes about
mn^2/2 + n^3/6 operations, with error proportional to [cond(J)]^2.
Solving J x = b via a QR decomposition takes about mn^2 - n^3/3
operations, with error proportional to cond(J) + norm(r)*[cond(J)]^2.
Using an SVD to solve J x = b takes ~5mn^2 operations, but it can
handle nearly rank-deficient problems. [1]

To sum up: the normal equations with Cholesky are the fastest, but only
work on well-conditioned problems. SVD is the slowest but the most
robust. QR is a good compromise between speed and numerical stability.
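
Concretely, the QR route would skip forming the normal equations
altogether. A sketch with the existing decomposition classes (untested;
it keeps the sign convention of the current code, where the step solves
J dx ~= r for the weighted residuals):

    // Solve J dx = r in the least-squares sense via QR,
    // without ever forming J^T * J.
    final DecompositionSolver solver =
        new QRDecomposition(weightedJacobian).getSolver();
    final RealVector dx = solver.solve(currentResiduals);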

[1] Heath, Michael T. Scientific Computing: An Introductory Survey.
Boston: McGraw-Hill, 2002. Chapters 3 and 6.

>
>
> 2) For the reasons I previously described, would it be possible to
> separate the Newton step into its own function (which you would also
> add to the interface)?
> The idea here is that solving for the Newton step can be
> computationally intensive depending on the size of the problem, so
> there is no one universal approach. There are several somewhat
> different ways that this can be solved, with some methods requiring
> information from previous steps in order to work. This would also
> allow plugging in line search or trust region control of the step
> size. In order to allow all of these approaches, I would suggest this
> function:
>
> // this function would solve for dX
> public NewtonStepResult solveStep(LeastSquaresProblem lsp,
>     RealVector currentPoint, Object prevStepInfo)
> {
>   NewtonStepResult result = new NewtonStepResult();
>
>   // code to solve for the next step dx, followed by
>   result.newtonStep = dx;
>
>   // code to update the step information
>   // (leave blank for now)
>   result.stepInfo = null;
>
>   return result;
> }
>
> where
>
> public class NewtonStepResult
> {
>   public RealVector newtonStep;
>   public Object stepInfo;
> }

What do we gain by incorporating the Newton step into the problem? We
could do it, but I think computing the Newton step belongs in the
optimizer, because there are several ways to solve for the step (see
above), and different optimizers may solve different equations to
obtain the next step. LM, for instance, does not take the plain Newton
step: it adds a damping term to the system before solving it (sketched
below).
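
To illustrate, a Levenberg-style damped step looks roughly like this
(a sketch only, with identity damping and a free damping parameter
lambda; the real LM code is more involved):

    // Damped Newton step: solve (J^T J + lambda * I) dx = J^T r.
    // lambda is the damping parameter controlled by the optimizer.
    final RealMatrix jt = weightedJacobian.transpose();
    final RealMatrix lhs = jt.multiply(weightedJacobian).add(
        MatrixUtils.createRealIdentityMatrix(nC).scalarMultiply(lambda));
    final RealVector dx =
        new CholeskyDecomposition(lhs).getSolver().solve(
            jt.operate(currentResiduals));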

Regards,
Evan
