On 08/14/2013 11:30 AM, Gilles wrote:
> On Tue, 13 Aug 2013 17:47:18 -0400, Evan Ward wrote:
>> On 08/13/2013 08:17 AM, Gilles wrote:
>>>> (I.e. GN would
>>>> not need 5 superclasses and fields for storing upper and lower
>>>> bounds.)
>>>
>>> Be careful: "GaussNewtonOptimizer" does _not_ support bounds!
>>
>> That is my point! Because of its complex inheritance, every instance
>> includes fields for storing bounds (declared in
>> BaseMultivariateOptimizer) even though the algorithm has no use for
>> them. I was looking at the released (3.2) version.
>
> That's because I was lazy; I should have added yet another layer:
>   BaseMultivariateOptimizerWithBounds
>
> :-D
>
>> Looks like it has since been fixed. :)
>
> Yes and no. In the latest version, I simply did not include what is
> not used (yet). At some point, bounds must be stored somewhere for
> those optimizers that do use them. We'll probably need to insert that
> additional layer...
> [Which you'll have to do too in your scheme when you apply it to the
> optimizers that use bounds (or not).]
>
>>>
>>>>
>>>> With the API re-design you have the opportunity to design for thread
>>>> safety, which can be hard to add on to an existing API.
>>>
>>> True in general, but in this case we didn't make a thorough analysis
>>> of advantages and drawbacks of creating a new instance for each
>>> application of a "withXxx" method.
>>> I previously hinted that such a behaviour might not be desirable,
>>> independently of the issue of thread-safety.
>>>
>>>> Thanks for the whole discussion; It has given me some new ideas. :)
>>>
>>> You are most welcome to experiment with the current revision of the
>>> code in "o.a.c.m.fitting.leastsquares", in order to pinpoint problems
>>> for your use-cases.
>
> This point is really interesting for comparing the two approaches
> (yours and the current code in the above package). [More below...]
>
>>> You could also provide the concrete changes (patch) needed to move the
>>> "data" to another object, as you suggested.
>>> If it's better design and leads to simpler code, there shouldn't be
>>> any problem in applying it for the next release.
>>
>> I separated the algorithm from the problem definition for the
>> GaussNewton optimizer. See
>> https://gist.github.com/wardev/dfd16ac0ad3ba62ede05 If we decide to go
>> this route LM would follow the same pattern. It still needs some
>> polishing and a few smaller API decisions.
>>
>> A summary of the changes:
>>   - The two superclasses of GN were turned into
>> AbstractOptimizationProblem and LeastSquaresProblemImpl. I tried to keep
>> most of the code the same so that a diff would be meaningful.
>>   - Interfaces were extracted from LeastSquaresProblemImpl and
>> AbstractOptimizationProblem. I kept the upper level of the hierarchy as
>> an implementation aid.
>>   - GaussNewton implements LeastSquaresOptimizer and does not have a
>> super class. Evaluation and iteration counts are now locals.
>>   - There is now one method to evaluate the model/problem, see
>> LeastSquaresProblem.evaluate(double[]). This returns a container with
>> the value and Jacobian.
>
> In effect, this is another way to see the problem.
>
> And if we were starting from scratch, I wouldn't mind following that
> design, but as it is now, I'm not quite convinced that it is actually
> simpler than what I arrived at in the above package. Personally, I find
> a few things fairly odd. E.g. the counters are in the problem spec
> whereas they are a property of the optimization algorithm; also, the
> evaluation is no longer encapsulated (an optimization algorithm
> must explicitly call "incrementCount" instead of it being done as
> part of the model evaluation). [I agree that these are moot points,
> but good (IMO) ideas in the current code should not be dropped
> without careful examination, lest we go in circles.]

Yes, I wasn't sure what to do with the counters. My thought was that the
evaluation counter is a divergence condition and so it belongs with the
convergence condition in the problem definition. I could also see it as
part of the optimizer with the notion that if the algorithm can't find
an optimum after X evaluations it is not worth it to keep trying.

Since the iteration counter was incremented explicitly, I thought it
made sense to increment the evaluation counter explicitly. With the
interface-based design it would be easy to create a wrapper that tracks
evaluations, keeping the original semantics:

public static LeastSquaresProblem countEvaluations(
        final LeastSquaresProblem lsp, final Incrementor counter) {
    return new LeastSquaresProblem() {
        /* delegate all other methods to lsp */

        public Evaluation evaluate(double[] point) {
            counter.increment();
            return lsp.evaluate(point);
        }
    };
}
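
(Hypothetical usage, assuming the optimizer takes the problem as an
argument: optimizer.optimize(countEvaluations(lsp, counter)); the
algorithm itself then never touches the counter.)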

>>
>> Advantages:
>>   - GaussNewton is now trivially easy to make immutable and fluent and
>> thread-safe. This will allow the same algorithm to be applied to
>> different problems.
>
> Unless I'm mistaken, this is only because you off-loaded all the data
> to another class. This other class is no less problematic from the
> viewpoint of thread safety: You can share the optimizer instance but
> not the "LeastSquaresProblem" instance. I'm not sure that's safer for
> an unaware user.

I saw thread safety as having two parts. First is using the same
algorithm on different problems for applications with data parallelism
(e.g. apply your favorite algorithm to all of your optimization
problems). This gives the rule that algorithms are thread safe and
problems don't have to be. For the user I think the rule is: "Each
thread optimizes a different problem." Second is using different
algorithms on the same problem for applications with task parallelism
(e.g. compare GN and LM effectiveness for a given problem). I think the
first case will be more common than the second (but I don't have data
to prove it). I didn't think the same algorithm, same problem case was
useful, and different algorithm, different problem is already thread
safe.
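
For concreteness, a minimal sketch of the first case, assuming the
interfaces from the gist (an immutable LeastSquaresOptimizer whose
optimize(LeastSquaresProblem) method does all the work; the class and
method names are illustrative, not settled API):

import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public final class ParallelFits {
    /** Apply one shared, immutable optimizer to many independent
     *  problems; each thread optimizes a different problem. */
    public static void fitAll(final LeastSquaresOptimizer optimizer,
                              final List<LeastSquaresProblem> problems) {
        final ExecutorService pool = Executors.newFixedThreadPool(4);
        for (final LeastSquaresProblem problem : problems) {
            pool.execute(new Runnable() {
                public void run() {
                    // the optimizer is shared read-only; the problem
                    // is confined to this task (a real caller would
                    // also collect the returned optimum)
                    optimizer.optimize(problem);
                }
            });
        }
        pool.shutdown();
    }
}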

>
>>   - If the model function and convergence checker are thread safe, then
>> so is the LeastSquaresProblem (if you don't mutate it).
>
> The same applies to the current design.

Partially: each call to evaluate the model used to increment a single
shared counter. I had to give each optimizer its own local counter to
avoid this. (More evidence that I should have put the counters in the
optimizer class. :)

>> This would allow
>> different algorithms to be applied to the same problem. These are the
>> two big bullet points that I think make it worthwhile. :)
>
> Maybe I'm missing something, but why can't this be done with the
> current design? I mean, is there more to your approach than grouping
> the "problem data" in another class? What if I added a new fluent
> method:
>   withLeastSquaresProblem(LeastSquaresProblem lsp)
> which would call all the other "withXxx" methods to update the
> optimizer's fields?
> [Yes, the optimizer cannot be shared between threads but it does
> not matter that much: the whole config + optimizer should be
> duplicated to avoid concurrency problems (and the config is the part
> that is likely to grab more memory).]

I think the withLeastSquaresProblem() method would add it to the
current design. I think the other point is that thread safety would
only depend on classes that the user must define; if the user's code is
thread safe, then the user's code + [math] is thread safe. From a
performance viewpoint the two APIs are approximately equivalent. I've
been trying to advocate from an API and usability perspective.
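
For reference, a minimal sketch of the suggested method, assuming the
optimizer keeps its current withXxx methods and that
LeastSquaresProblem exposes matching getters (the getter names here are
guesses, not the real API):

public GaussNewtonOptimizer withLeastSquaresProblem(
        final LeastSquaresProblem lsp) {
    // delegate to the existing fluent methods, one field at a time
    return this.withTarget(lsp.getTarget())
               .withWeight(lsp.getWeight())
               .withModelAndJacobian(lsp.getModel(), lsp.getJacobian())
               .withConvergenceChecker(lsp.getConvergenceChecker());
}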

>>   - Allows algorithms to be applied to problems they weren't designed
>> for by using "views". I.e. view a least squares problem as a direct
>> problem by creating an implementation of DirectProblem that delegates to
>> a LeastSquaresProblem.
>
> There is a class in CM that converts a LS problem to a "usual"
> optimization (i.e. scalar cost function).

I was trying to say that we didn't lose that functionality. :)
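
For example, a hypothetical adapter could expose a LeastSquaresProblem
as the existing MultivariateFunction interface by computing the sum of
squared residuals (getResiduals() on the Evaluation container is an
assumption from the gist):

import org.apache.commons.math3.analysis.MultivariateFunction;

public static MultivariateFunction asCostFunction(
        final LeastSquaresProblem lsp) {
    return new MultivariateFunction() {
        public double value(final double[] point) {
            // sum of squared residuals at the given point
            final double[] r = lsp.evaluate(point).getResiduals();
            double cost = 0;
            for (final double ri : r) {
                cost += ri * ri;
            }
            return cost;
        }
    };
}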

>> A similar delegation would work to view a
>> DirectOptimizer as a LeastSquaresOptimizer.
>>   - Same code for the most part, just reorganized.
>
> Views (aka "Adapter", IIUC) are certainly interesting, but they are
> the sort of wrappers which you seemed to consider ugly.
>
> In the code which I developed, I also created adapters around the CM
> classes. And I'll still need them even if we follow your design (in fact,
> the wrapper basically arranges for the appropriate data to be passed to
> the "plain" optimizer algorithm in much the same way as an instance of
> "LeastSquaresProblem" does).

This is an interesting point. I see the views/wrappers as a cool way to
add functionality to an arbitrary object by implementing a common
interface. For example, see the evaluation counter above or the neat
way Luc used wrappers to add tides and secular trends to gravity
fields. [1,2] I called my own code ugly because there was no common
interface for it to implement, so I ended up creating a custom solution
for something I thought could be handled at the [math] library layer.

[1]
https://www.orekit.org/forge/projects/orekit/repository/revisions/master/entry/src/main/java/org/orekit/forces/gravity/potential/PulsatingSphericalHarmonics.java
[2]
https://www.orekit.org/forge/projects/orekit/repository/revisions/master/entry/src/main/java/org/orekit/forces/gravity/potential/SecularTrendSphericalHarmonics.java

>
>>
>> Further decisions:
>>   - It would be nice to have a model interface that could be evaluated
>> with one method call. E.g. Pair<double[], double[][]> value(double[]).
>>   - Leave getWeights() out of the LeastSquaresProblem API? This would
>> help encapsulation, but would require further modification of the GN
>> implementation.
>
> So, you hesitate on whether the weights are part of the "optimizer" or
> of the "problem"?
> It was proposed some time ago that weights should be dropped because
> they could be managed by the user; then, since they were managed within
> CM, others wanted to keep them.

I think the weights are definitely part of the "problem". I hesitate on
whether there should be a getWeights() method in the API. It seems to
me that the weights could just be applied to the residuals and Jacobian
and the algorithm wouldn't know the difference. (Maybe this is naive.)
The weight functionality could be added to the "problem" using
composition (a wrapper class) and not through explicit API support. For
example, remove getWeights() from LeastSquaresProblem and use the
composition:

static LeastSquaresProblem weight(final LeastSquaresProblem lsp,
                                  final RealMatrix weights) {
    return new LeastSquaresProblem() {
        /* delegate to lsp, applying weights in evaluate() ... */
    };
}

The common case of a diagonal weight matrix could have its own,
possibly more efficient, composition.
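
A sketch of why the diagonal case is cheaper: since minimizing r'Wr is
the same as minimizing |W^(1/2) r|^2, applying a diagonal W reduces to
scaling residual i and Jacobian row i by sqrt(w_i), with no full matrix
multiply (the helper below assumes the wrapper has already copied the
arrays it mutates):

static void applyDiagonalWeights(final double[] sqrtW,
                                 final double[] residuals,
                                 final double[][] jacobian) {
    // scale each row by the square root of its weight
    for (int i = 0; i < residuals.length; i++) {
        residuals[i] *= sqrtW[i];
        for (int j = 0; j < jacobian[i].length; j++) {
            jacobian[i][j] *= sqrtW[i];
        }
    }
}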

>
>>   - Should the LeastSquaresProblem interface include mutators? I think
>> it makes sense to only include the methods that the optimization
>> algorithm needs to "read". That would keep the API smaller and not tie
>> the users to a particular implementation.
>
> Hmm, do you suggest here that we drop the fluent API (whose purpose is
> indeed to mutate fields)?

I suggest that the interface only include the "get" and "evaluate"
methods. This is all the information the optimization algorithm needs.
It also makes composition easier. The LeastSquaresProblemImpl class
could implement a fluent API. Then the user would write new
LeastSquaresProblemImpl().withXxx(), and the LeastSquaresProblem
interface wouldn't place more requirements than necessary on all
implementations.
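
In outline (a sketch only; method names are loose borrowings from the
gist, and the two types would live in separate files):

public interface LeastSquaresProblem {
    double[] getStart();
    Evaluation evaluate(double[] point);
    // ... other read-only accessors the algorithm needs
}

public class LeastSquaresProblemImpl implements LeastSquaresProblem {
    private double[] start;

    public double[] getStart() {
        return start.clone();
    }

    public Evaluation evaluate(final double[] point) {
        // compute the model value and Jacobian (omitted in this sketch)
        throw new UnsupportedOperationException("sketch only");
    }

    // the fluent mutator lives only on the implementation, so the
    // interface places no mutation requirement on other implementations
    public LeastSquaresProblemImpl withStart(final double[] s) {
        this.start = s.clone();
        return this;
    }
}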

>>   - method and class names.
>>
>> I'm sure there are a few more issues. :) Thanks for considering the
>> separate API.
>
> I like the idea of a class that contains everything that can be
> "evaluated"[1] but I think that it would be more flexible to _prepare_
> for evaluation: In the same way that you grouped all "computeXxx"
> methods in "EvaluationImpl", it could store the model and Jacobian
> vector function objects (and not just store the values resulting from
> evaluating the functions).

I like the lazy evaluation idea. Then the weighted Jacobian wouldn't be
computed until it is needed.
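
A minimal sketch of such a lazy, cached Jacobian, using the existing
MultivariateMatrixFunction and RealMatrix types from [math]; the class
itself and its fields are illustrative:

import org.apache.commons.math3.analysis.MultivariateMatrixFunction;
import org.apache.commons.math3.linear.Array2DRowRealMatrix;
import org.apache.commons.math3.linear.RealMatrix;

public class LazyEvaluation {
    private final MultivariateMatrixFunction jacobianFn;
    private final RealMatrix weightSqrt;
    private final double[] point;
    private RealMatrix weightedJacobian; // computed on first access

    public LazyEvaluation(final MultivariateMatrixFunction jacobianFn,
                          final RealMatrix weightSqrt,
                          final double[] point) {
        this.jacobianFn = jacobianFn;
        this.weightSqrt = weightSqrt;
        this.point = point.clone();
    }

    public RealMatrix getWeightedJacobian() {
        if (weightedJacobian == null) {
            // evaluate and weight the Jacobian only when asked for
            // (not thread-safe; intended as one instance per step)
            weightedJacobian = weightSqrt.multiply(
                    new Array2DRowRealMatrix(jacobianFn.value(point)));
        }
        return weightedJacobian;
    }
}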

> It would also seem that perhaps the evaluation counter should be there.

What do you think of the composition scheme for the evaluation counter
above?

>
> Of course, whatever is proposed, it must be put to the test, by
> modifying all the unit tests accordingly.
> Your approach may be the way to go but I don't have the time to make
> all the necessary changes yet again; while possible, this design does
> not (yet) look like an unavoidable shining path (which may be
> partly because I'm too used to what I've coded myself)...
>
> So, how to go on?
> A possibility would be that you create a copy of the "trunk" in the
> sandbox and make all the changes there. [If "it just works" :), then
> people may easily decide to commit it.]
> Do we want to make a radical[2] redesign before 3.3 (which was due on
> <some date in the past>)?
> Could we create an "o.a.c.m.experimental" sub-package, and put
> "competing" designs inside it: e.g.
>   o.a.c.m.experimental.evan
>   o.a.c.m.experimental.gilles
> for people to test with 3.3, and postpone a decision until we can see
> the differences better? Maybe this would allow us to pick all the
> better features from each and merge them in a final design...
>

I agree it definitely needs more testing and polishing. Let me think
about the two development lines idea. I would hate for us to duplicate
our work and then "throw away" one person's work at the end.

Regards,
Evan

