On 09/25/2015 06:55 AM, Gilles wrote:
On Thu, 24 Sep 2015 21:41:10 -0500, Ole Ersoy wrote:
On 09/24/2015 06:01 PM, Gilles wrote:
On Thu, 24 Sep 2015 17:02:15 -0500, Ole Ersoy wrote:
On 09/24/2015 03:23 PM, Luc Maisonobe wrote:
Le 24/09/2015 21:40, Ole Ersoy a écrit :
Hi Luc,
I gave this some more thought, and I think I may have tapped out too
soon, even though you are absolutely right about what an exception does
in terms of bubbling execution up to a point where it stops or we handle it.
Suppose we have an Optimizer and an Optimizer observer. The optimizer
will emit the following events in the process of stepping
through to the max number of iterations it is allotted:
- SOLUTION_FOUND
- COULD_NOT_CONVERGE_FOR_REASON_1
- COULD_NOT_CONVERGE_FOR_REASON_2
- END (Max iterations reached)
So we have the observer interface:
interface OptimizerObserver {
    void success(Solution solution);
    void update(Enum<?> event, Optimizer optimizer);
    void end(Optimizer optimizer);
}
So if the Optimizer notifies the observer of `success`, then the
observer does what it needs to with the results and moves on. If the
observer gets an `update` notification, that means that given the
current [constraints, number of iterations, data] the optimizer cannot
finish. But the update method receives the optimizer, so it can adapt
it, and tell it to continue or just trash it and try something
completely different. If the `END` event is reached then the Optimizer
could not finish given the number of allotted iterations. The Optimizer
is passed back via the callback interface so the observer could allow
more iterations if it wants to...perhaps based on some metric indicating
how close the optimizer is to finding a solution.
What this could do is allow the implementation of the observer to throw
an exception if 'All is lost!', in which case the Optimizer itself does not
need an exception. I totally understand that this may not work
everywhere, but it seems like it could work in this case.
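To make this concrete, here is a minimal sketch of an observer implementation,
assuming the OptimizerObserver, Solution and Optimizer types above; the
OptimizerEvent constants and the helper methods on Optimizer (relaxConstraints,
setMaxIterations, getMaxIterations) are made up purely for illustration:

// Hypothetical sketch only; none of these helper methods exist in CM.
enum OptimizerEvent { COULD_NOT_CONVERGE_FOR_REASON_1, COULD_NOT_CONVERGE_FOR_REASON_2 }

class RetryingObserver implements OptimizerObserver {
    private Solution result;

    public void success(Solution solution) {
        result = solution;   // use the result and move on
    }

    public void update(Enum<?> event, Optimizer optimizer) {
        if (event == OptimizerEvent.COULD_NOT_CONVERGE_FOR_REASON_1) {
            // Adapt the optimizer and let it continue.
            optimizer.relaxConstraints();
        } else {
            // "All is lost!": the observer, not the optimizer, decides to throw.
            throw new IllegalStateException("Optimization failed: " + event);
        }
    }

    public void end(Optimizer optimizer) {
        // Max iterations reached; grant more if we still want a solution.
        optimizer.setMaxIterations(optimizer.getMaxIterations() * 2);
    }
}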
WDYT?
With this version, you should also pass the optimizer in case of
success. In most cases, the observer will just ignore it, but in some
cases it may try to solve another problem, or to solve again with
stricter constraints, using the previous solution as the start point
for the more stringent problem. Another case would be to go from a
simple problem to a more difficult problem using some kind of
homotopy.
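For example, a success callback along those lines might look like this (a
sketch only; getPoint, withStartPoint, withConstraints and stricterConstraints
are made-up names, just to illustrate the warm start):

// Hypothetical sketch: re-solve a stricter problem starting from the
// previous solution; the method names are illustrative only.
public void success(Solution solution, Optimizer optimizer) {
    optimizer.withStartPoint(solution.getPoint())
             .withConstraints(stricterConstraints)
             .optimize();
}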
Great - whoooh - glad you like this version a little better - for a
sec I thought I had completely lost it :).
IIUC, I don't like it: it looks like "GOTO"...
Inside the optimizer it would work like this:
while (!done) {
    if (can't converge) {
        observer.update(Enum.CANT_CONVERGE, this);
    }
}
That's fine. What I don't like is to have provision for changing the
optimizer's settings and reusing the same instance.
If the design of the optimizer allows for this, then the interface for the
Observer would facilitate it. The person implementing the interface could
throw an exception when they get the Enum.CANT_CONVERGE message, in which case
the semantics are the same as they are now.
On the other hand, if the optimizer is not designed for reuse, perhaps because
it causes more complexity than it's worth, the Observer interface
could just exclude this aspect.
The optimizer should be instantiated at the lowest possible level; it
will report everything to the observer, but the "report" is not to be
confused with the "optimizer".
The design of the observer is flexible. It gives the person implementing the
interface the ability to change the state of what is being observed. It's a
bit like warming up leftovers. You are the observer. You grab yesterday's
pizza and throw it in the microwave. The microwave is the optimizer. You hit the
30-second button and check on the pizza. If you like it, you take it out;
otherwise you hit 30 seconds again, or you throw the whole thing out, because you
just realized that the Pizza Rat took a chunk out of it:
https://www.youtube.com/watch?v=UPXUG8q4jKU
Then in the update method either modify the optimizer's parameters or
throw an exception.
If I refer to Luc's example of high-level code "H" calling some
mid-level code "M", itself calling CM's optimizer "CM", then "M" may not
have enough info to know whether it's OK to retry "CM"; but on the other
hand, "H" might not even be aware that "M" is using "CM".
So in this case the person implementing the Observer interface would keep the
semantics that we have now. There is one important distinction though: the
person uses the Enum parameter, indicating the root cause of the message, to
throw their own (meaningful to them) exception.
As I have tried to explain several times over the years (but failed to
convince), the same problem exists with the exceptions: however
detailed the message, it might not make sense to the person who reads
the console, because he is at level "H" and may have no idea that "CM"
is used deep down.
Great point! This is why I like receiving a lightweight Enum indicating a
root cause, and then either adapting to it, or throwing my own exception that
will trigger a simple explanation for my client (a person using the app, or a
remote client (computer) receiving a message).
Having a specific exception which "M" can catch, extract info from, and
raise a more meaningful exception (and/or translate the message!) is a
much more flexible solution IMO.
Indeed. One option here, assuming a callback interface is not an option, is to
move to a one-to-one mapping between the class doing the math and the
corresponding exception. The exception would then encode root causes using an
Enum that the receiver could use to map the root cause to their own way
of handling it.
Because it is designed this way, the exception handler can get access to the
entire object that caused the exception, and use it for the exception handling.
This eliminates most of the thinking around how the exception should be
designed, what it needs to communicate, how it should be distinguished from all
of the other contexts that can throw the exception, where it belongs in the
hierarchy, etc.
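A sketch of what such a class-specific exception might look like (hypothetical
names, not existing CM classes):

// Hypothetical sketch of a "one math class, one exception" design: the
// exception encodes the root cause as an enum and carries a reference to
// the object that threw it, so the handler can inspect or reuse it.
class LevenbergMarquardtException extends RuntimeException {

    enum RootCause { CANT_CONVERGE_REASON_1, CANT_CONVERGE_REASON_2, MAX_ITERATIONS }

    private final RootCause rootCause;
    private final Object source;   // the optimizer instance that failed

    LevenbergMarquardtException(RootCause rootCause, Object source) {
        super("Optimization failed: " + rootCause);
        this.rootCause = rootCause;
        this.source = source;
    }

    RootCause getRootCause() { return rootCause; }
    Object getSource()       { return source; }
}

The mid-level code "M" could then catch it, switch on getRootCause(), and
rethrow an exception that is meaningful at its own level.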
[Well, if all iterative algorithms are rewritten within the "observer"
paradigm, then the logging can indeed be left at the caller's level (since
the optimizer will report "everything")... Going that route is an option
to be mentioned in the issue of allowing "slf4j" or not (see below).]
I already started a new mail thread, but I will bring it up if there are
objections. It may be a poor fit in terms of developer productivity, since now
everyone has to implement the logging statements again.
The Optimizer could publish information deemed
interesting on each ITERATION event.
If we'd go for an "OptimizerObserver" that gets called at every iteration,
there shouldn't be any overlap between it and "Optimizer":
So inside the Optimizer we could have:
while (!done) {
    ...
    if (observer.notifyOnIncrement()) {
        observer.increment(this);
    }
}
Which would give us an opportunity to cancel the run if, for example,
it's not converging fast enough.
Providing ways to assess "too slow convergence" would be a very
interesting feature, I think.
In that case we set done to true in
the observer, and then allow the Optimizer to get to the point where
it checks if it's done, calls the END notification on the observer,
and then the observer takes it from there.
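A sketch of what that could look like in the observer (getCost() and done() on
the Optimizer are assumptions here, not existing API):

// Hypothetical sketch: stop the run when the cost improvement per
// iteration becomes negligible; getCost() and done() are assumed methods
// on the Optimizer discussed in this thread.
class ConvergenceRateObserver {
    private double previousCost = Double.POSITIVE_INFINITY;

    public boolean notifyOnIncrement() {
        return true;   // we want a callback on every iteration
    }

    public void increment(Optimizer optimizer) {
        double cost = optimizer.getCost();
        if (previousCost - cost < 1.0e-12) {
            optimizer.done();   // converging too slowly: request a stop
        }
        previousCost = cost;
    }
}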
The iteration limit should be dealt with by the observer; the iterative
algorithm would just run "forever" until the observer is satisfied
with the current state (the solution is good enough or the allotted
resources - be they time, iterations, evaluations, ... - are
exhausted).
It's possible to do it that way, although I think it's better if that
code stays in the algorithm, so that the Observer interface (which the
client / person using CM implements) is as simple as
possible to implement.
By definition, the iteration concept is also present in the "Observer".
(via "notifyOnIncrement()", IIUC).
If the observer is notified, it should act according to the caller's
policy (e.g. call "optimizer.stop()").
[Since the optimizer was stopped before completing the assignment (vs
finding a solution within the tolerance settings), it should not be in
charge of further action (e.g. "return" something).]
Right - once the optimizer is stopped, it's stopped. However, the semantics for
doing that work like this (for the reason that if the observer calls
optimizer.done(), the optimizer still has some code left to run):
The observer tells the optimizer that it is done.
INSIDE OBSERVER:
optimizer.done();
Then the optimizer keeps going... until it exits the loop that it is in, because
done is now true. At that point it notifies the observer again.
INSIDE OPTIMIZER:
observer.end(Enum.CANT_CONVERGE);
So the above case is the model for when the optimizer does not find a solution.
If it finds a solution then it will naturally exit the loop that it is in and
make the final call:
observer.success(solution, optimizer) or just:
observer.success(optimizer) // In case the solution is bound to the optimizer
The argument list is flexible. Once observer.success is called, the
optimizer is done. It has no code left to run.
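Put together, the optimizer's main loop under this protocol could look roughly
like this (iterateOnce, solutionFound and the exact observer argument lists are
placeholders; the thread has not settled them):

// Hypothetical sketch of the control flow described above; field and
// method names are illustrative only.
private volatile boolean done = false;

public void done() {                       // called by the observer
    done = true;
}

public void optimize() {
    while (!done) {
        iterateOnce();                     // one optimization step
        if (observer.notifyOnIncrement()) {
            observer.increment(this);      // observer may call done() here
        }
        if (solutionFound()) {
            done = true;
        }
    }
    if (solutionFound()) {
        observer.success(solution, this);  // found a solution: final call
    } else {
        observer.end(this);                // stopped without a solution
    }
}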
The observer could then be wired
with SLF4J and perform the same type of logging that the Optimizer
would perform. So CM could declare SLF4J as a test dependency, and
unit tests could log iterations using it.
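For example (a sketch only; getIterations() and getCost() are assumed
accessors on the Optimizer discussed above):

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Hypothetical sketch: an observer that logs each iteration through SLF4J.
class LoggingObserver {
    private static final Logger LOG = LoggerFactory.getLogger(LoggingObserver.class);

    public void increment(Optimizer optimizer) {
        LOG.debug("iteration {}: cost = {}",
                  optimizer.getIterations(), optimizer.getCost());
    }
}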
As a "user", I'm interested in how the algorithms behave on my problem,
not in the CM unit tests.
You could still do that. I usually take my problem, simplify it down
to a data set that I think covers all corner cases, and then run it
through my unit tests while looking at the logging output to get an
idea of how my algorithm is behaving.
When you "simplify", you don't the see how the (production) code really
behaves.
Not even mentioning that it takes a lot of time to "simplify", and might
be impossible (e.g. if the production code runs in another environment).
Very true. As you point out, I could be logging in my tests using the
observer, but now I have to reimplement the same logging pattern in my
production code.
The question remains unanswered: why not use slf4j directly?
From what I understand, classpath dependency conflicts for SLF4J are easily
solved by excluding the logging dependencies that other libraries bring in and
then depending directly on the logging implementation that you want to use.
So people do run into issues, but I think they are solvable:
http://stackoverflow.com/questions/8921382/maven-slf4j-version-conflict-when-using-two-different-dependencies-that-requi
Then, could you please raise the question in a separate thread?
Done.
Lombok also has an @Slf4j annotation that's pretty sweet. It saves the
SLF4J boilerplate.
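For instance, a minimal sketch of what the annotation replaces (the class name
IterationDemo is made up here):

import lombok.extern.slf4j.Slf4j;

// Lombok generates the usual field at compile time:
//   private static final org.slf4j.Logger log =
//       org.slf4j.LoggerFactory.getLogger(IterationDemo.class);
@Slf4j
public class IterationDemo {
    void report(int iteration, double cost) {
        log.debug("iteration {}: cost = {}", iteration, cost);
    }
}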
I understand that using annotations can be a time-saver, but IMO not
so much for a library like CM; so in this case, the risk of depending
on another library must be weighed against the advantages.
Lombok is compile time only, so there should be few drawbacks:
http://stackoverflow.com/questions/6107197/how-does-lombok-work
Yes, I've just been wondering about that.
So, could you please raise the question in a separate thread?
Done.
I'll demo it on the LevenbergMarquardtOptimizer experiment, and we
can see the level of code reduction we are able to achieve. I think
it's going to be fairly significant.
Great!
Sweet! :)
Cheers,
Ole