On Tue, Aug 20, 2013 at 8:54 AM, Evan Ward <evan.w...@nrl.navy.mil> wrote:
> Hi all,
>
> Sorry for the delay. I'll try switching the whole least squares package
> to the new/separable API and post the results to GitHub. From there we
> can review, discuss, and then decide whether it makes sense to switch
> the rest of the optimizers.
>
>
> On 08/15/2013 09:12 AM, Konstantin Berlin wrote:
>> Hi,
>>
>> As a quick first pass, I will comment on Evan's code only for now. Most
>> of my comments have to do with numerical aspects and associated
>> design, rather than threading.
>>
>> I like the idea of the optimize function taking in a Problem class. I
>> think it can add a lot of flexibility going forward.
>>
>> First, to make things clear, I think LeastSquares should be renamed to
>> IterativeLeastSquares, since for unconstrained linear least squares
>> this whole hierarchy is not correct. For one thing, linear problems
>> don't need an initial guess; for another, linear least squares can
>> factor the matrix before the experimental data is supplied, so it can
>> support rapid recomputation.
>
> From my background the problem is more commonly called General Least
> Squares, or Non-Linear Least Squares. I shortened it to avoid (yet
> another) long name. :) I don't plan to include the linear least squares
> problem in the new API. It seems hard to be simpler than new
> QRDecomposition(A).getSolver().solve(b). :)
>

I think you are right about the name. I don't have a strong opinion on
this either way.
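
And agreed, the linear case hardly needs a new API. For the archives,
here is roughly what the whole thing looks like (a sketch against the
commons-math3 linear package, with made-up data):

    import org.apache.commons.math3.linear.*;

    class LinearLsDemo {
        public static void main(String[] args) {
            // Overdetermined system: 3 observations, 2 parameters.
            RealMatrix a = MatrixUtils.createRealMatrix(new double[][] {
                {1.0, 1.0}, {1.0, 2.0}, {1.0, 3.0}
            });
            RealVector b = MatrixUtils.createRealVector(new double[] {6.0, 5.0, 7.0});
            // One line gives the least-squares solution minimizing ||A*x - b||:
            RealVector x = new QRDecomposition(a).getSolver().solve(b);
            System.out.println(x);
        }
    }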

>>
>> There are several issues that I have with the current example:
>>
>> 1) While the least-squares problem now encapsulates the weights, which
>> I like, for some reason the weights are explicitly used in the actual
>> GaussNewtonOptimizer, and probably in all the other least-squares
>> optimizers. I think that is bad for several reasons:
>> i) the actual optimizer algorithm has nothing to do with weights, so
>> it is logically inconsistent. It forces people who have no weights to
>> still go through that numerical computation.
>> ii) it makes the optimizer code harder to maintain and update. Just
>> think about how much cleaner it would be if the weights were actually
>> included in the Jacobian (see the sketch below).
>> iii) the inclusion of weights can be optimized inside the
>> LeastSquaresProblem, which, depending on the problem, might be able
>> to apply them more efficiently than the general approach.
>>
>> 2) All computations inside the optimizers should be done using the
>> linear algebra library (or is the library so bad?). Not only does it
>> make the code easier to follow, it can make it a lot faster. Imagine a
>> not-so-hypothetical example where the Jacobian is actually sparse
>> (I know you guys are removing sparse linear algebra, but let's say it
>> gets put back at some point later). Forming the Hessian from the
>> Jacobian is a very expensive operation (J^T*J), but if J is sparse the
>> speedup could be automatically propagated through the algorithm.
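
To make (1.ii) concrete: if the LeastSquaresProblem pre-applies the
square root of the weight matrix, the optimizer never sees weights at
all. A rough sketch for a diagonal W, with hypothetical helper names:

    import org.apache.commons.math3.linear.RealMatrix;

    class WeightFolding {
        // r' = sqrt(W) * r : fold the weights into the residuals once.
        static double[] weightedResiduals(double[] r, double[] w) {
            double[] out = new double[r.length];
            for (int i = 0; i < r.length; i++) {
                out[i] = Math.sqrt(w[i]) * r[i];
            }
            return out;
        }

        // J' = sqrt(W) * J : scale row i of the Jacobian by sqrt(w_i).
        static RealMatrix weightedJacobian(RealMatrix j, double[] w) {
            RealMatrix out = j.copy();
            for (int i = 0; i < w.length; i++) {
                out.setRowVector(i, j.getRowVector(i).mapMultiply(Math.sqrt(w[i])));
            }
            return out;
        }
    }

Minimizing ||r'||^2 is exactly r^T*W*r, so the optimizer needs no weight
logic, and a problem with uniform weights can skip the scaling entirely.
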
>
> I agree with 1 and 2. You seem to have more experience with optimizers
> than I do; would you like to update GaussNewton.doOptimize() to assume
> the weights are already included?
>

Not sure how to go about it. My main barrier to contributing so far
has been the lack of time to figure out how to set everything up so
that it passes the coding style checks, etc. I see several other issues
with the current implementation, among them that it is not robust to
badly conditioned Hessians and that it uses the more expensive QR
decomposition. However, those fixes would not propagate to the other
optimizers. This is a good example of how all the solvers could benefit
from sharing the Newton step code.
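
To show the kind of sharing I mean: the core of the Gauss-Newton step
could live in one helper along these lines (hypothetical name, and no
claim about where exactly it should sit in the hierarchy):

    import org.apache.commons.math3.linear.*;

    class NewtonSteps {
        // Solve (J^T * J) * p = -J^T * r for the Gauss-Newton step p.
        static RealVector newtonStep(RealMatrix j, RealVector r) {
            RealMatrix h = j.transpose().multiply(j);  // approximate Hessian J^T * J
            RealVector g = j.transpose().operate(r);   // gradient J^T * r
            // Cholesky on the normal equations is cheaper than QR on J, but
            // it fails when H is numerically not positive definite, so a
            // robust version needs a fallback or regularization here.
            return new CholeskyDecomposition(h).getSolver().solve(g.mapMultiply(-1.0));
        }
    }

Every Newton-type optimizer could call the same routine, so a
conditioning fix would land everywhere at once.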

>>
>> 3) Speaking of forming the Hessian, the actual computation of the
>> descent step should be logically and explicitly separated into two
>> separate functions. I think the first function could actually be moved
>> to the LeastSquaresProblem (this needs more thought).
>> i) The first function will compute the step p, where H*p = -g, H is
>> the Hessian and g is the gradient. The reason for this is that there
>> are several ways this system can be solved, and that choice should be
>> deferred to a specific implementation. For example, in a very large
>> least-squares problem (or any large problem) you cannot actually form
>> J^T*J (or H); instead you solve the system with the conjugate
>> gradient iterative method (which you already have in the library),
>> where you compute J^T*J*x as J^T*(J*x) (sketch below). Again, having
>> the Jacobian be a matrix would really help future-proof things. In
>> the case of general non-linear optimizers, you might want to support
>> in the future the very important quasi-Newton algorithms BFGS and
>> L-BFGS, where this step is done very differently.
>>
>> ii) Once the step is computed, a line search or trust region method
>> is used to potentially modify it. Bound constraints can also be
>> enforced here, so an optimizer that supports them can override this
>> function. This part should also be exposed, at least in the abstract
>> classes from which the final least-squares optimizers are built.
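
For (i), the matrix-free version is already expressible with the
library's ConjugateGradient. A sketch, again with a hypothetical helper
name:

    import org.apache.commons.math3.linear.*;

    class MatrixFreeStep {
        // Solve (J^T * J) * p = -J^T * r without ever forming J^T * J.
        static RealVector newtonStepCg(final RealMatrix j, RealVector r) {
            final int n = j.getColumnDimension();
            RealLinearOperator normal = new RealLinearOperator() {
                @Override public int getRowDimension()    { return n; }
                @Override public int getColumnDimension() { return n; }
                @Override public RealVector operate(RealVector x) {
                    // J^T * (J * x): two matrix-vector products. A sparse J
                    // would supply both products directly instead.
                    return j.transpose().operate(j.operate(x));
                }
            };
            RealVector g = j.transpose().operate(r);  // gradient J^T * r
            ConjugateGradient cg = new ConjugateGradient(100, 1e-10, false);
            return cg.solve(normal, g.mapMultiply(-1.0));
        }
    }
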
>
> I'm not sure I follow. Are you suggesting that the LSP class solve the
> normal equations? Solving the normal equations and computing the next
> step seem like the job of the "optimizer".
>

You are probably right; having the LSP solve the linear system might be
a bad idea. However, the Newton step solver and the line search/trust
region solver both need to be exposed in the superclass, since this is
what separates most optimization methods from each other. Perhaps
another class hierarchy should be created for the Newton step solver.
The fact that I currently have no access to these steps is a major
barrier for me, since it makes it difficult to adapt the CM optimizers
for larger problems or constraints.
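
Roughly the separation I have in mind, purely as an illustration (the
names are placeholders):

    import org.apache.commons.math3.linear.*;

    /** Solves H * p = -g: directly, via CG, via a BFGS update, ... */
    interface NewtonStepSolver {
        RealVector solveStep(RealMatrix jacobian, RealVector residuals);
    }

    /** Line search / trust region / bound projection on the raw step. */
    interface StepController {
        RealVector refineStep(RealVector point, RealVector step);
    }

An abstract optimizer would then just loop
p = stepSolver.solveStep(J, r); x = stepController.refineStep(x, p);
and a subclass, or a user with a large or constrained problem, could
swap either piece independently.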

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org
