On 08/30/2015 04:13 PM, Gilles wrote:
On Sun, 30 Aug 2015 14:45:19 -0500, Ole Ersoy wrote:
On 08/30/2015 08:18 AM, Gilles wrote:
Hi.

On Sat, 29 Aug 2015 22:21:55 -0500, Ole Ersoy wrote:
I'm deleting most of the discussion, because I think I may be
throwing too many ingredients on the table.  For further context
please see the previous thread.

I don't get that.  Here the main purpose is to set a hard limit
that will raise an exception (to avoid that some algo from running
forever).
Well there are two concerns here.  One is the precise number of steps
that should be executed and the other is whether we need to raise the
exception.

To stop the algorithm from running forever we let the `end` callback
notify the thing doing the incrementing that we are done. Does that
make sense?

Not sure. The current usage is
1. Set the hard limit (and action when that limit is reached)
2. Let the incrementor do its thing (count and call "trigger").
IIUC there are three ways of using IntegerSequence as is:
A)
1) Initialize the incrementor
2) check canIncrement()
3) get the increment
4) check canIncrement()
5) Increment if we can
6) Check again...
7 If we can't increment, were done, so we get the result of whatever
and move on.

B)
We register a calllback and just keep incrementing until the callback
is called.

C) We catch the MaxCountExceededException

We should only do B, because A is wasteful and C is not necessary.

A is not used in CM
Not yet :).  Seriously though developers look at Commons Math as a library made 
by brilliant people, such as yourself (Sincerely), and when they see something 
is part of the API, there's a good chance they will use it.  In this case I 
think that would be a bit tragic.


B and C are. The former in order to signal failure (unexpectedly large
number of iterations),
In which case I think the machinery for communicating and dealing with that 
belongs on the Algorithm class, and should be separate from the IntegerSequence.

the latter in order to change the exception into
a more meaningful one (catch and rethrow instead of registering another
callback).
Is there a case where that's necessary, or could it just as easily be dealt 
with in the CB?  Overall I think commons math, and consumers of it, would be 
better served if there were two types of CBs.  One that's internal and deals 
with internal stuff, and one that's for the public API.  So instead of 
receiving abstract exceptions that have to be mapped to a problem root cause 
before the application can decide on how to message it, a CB is registered that 
the application uses for messaging and as a bridge to the precise root cause.




IIUC, you propose that the increment tells its caller that the
limit has been reached.
The CB tells the caller that the limit has been reached.  If the
incrementer extends the callback interface, and registers with the
caller, then yes.  But the CB could be completely separate from the
Algorithm instance and the incrementer.

Then, I suppose, the caller will have to
act, rather than the incrementor.
Case A)
Continue until we converge on a solution that's good enough.  We
register the 'increment' CB which notifies the algorithm that we are
'good enough', and if we don't converge then the `end` CB is called,
and we take whatever action is necessary.

That's a completely different way of handling the evaluation of the
result.  Currently this is part of the algorithm logic (through
"tolerance" parameters).

So if we have one way only of handling various possibilities that's easy to use 
that's a good thing right?  What I like about it is that it affords the 
designer the ability to communicate exactly what has happened and notify the 
CB, and the CB API client can take it from there.  For example we could have an 
Enum that contains a series of values that map to various possibilities 
(Similar to HTTP status codes), and these could also be used to keys for I18N.



Case B)
We want to exhaust all the steps so we register an 'end' CB.  We want
to find the best possible solution in the allowed number of steps (We
already now that we will not find the global optimal).

Return whatever was found after some number of steps is dangerous.
By setting a max we are saying that we want to be notified at some point.  
Hopefully we handle the notification in a smart way.
There was a discussion about it in the past, and IIRC the conclusion
was that it would not be "standard" behaviour.
Maybe it should be?
In the optimizers, the "ConvergenceChecker" interface can be subverted
to count iterations/evaluations and accept a solution that does not
match the convergence criteria.
I would love to be able to register a CB that's specific to each optimizer 
where the Optimizer communicates the optimal result along with a 
LevenbergMarquardtResultsEnum.OPTIMAL, or no result with 
LevenbergMarquardtResultsEnum.COULD_NOT_CONVERGE, or perhaps something else 
that's well documented and unique.



I still don't see why some algo should pass through the incrementor
"service" (dealing with 3 callbacks) in order to evaluate the current
state of its work..

My bad on that one.  It should not, unless it should.  I just used an the 
ability to hook 3 different callbacks as an example of a 'Flexible / exensible' 
design.  The IntegerSequenceEventEmitter design should emit whatever events the 
algorithm needs.  The nice thing about the callbacks is that they can be used 
to break the algorithm up into discrete steps that listen for events that are 
specific to the stage that the algorithm is in.


Unless I'm missing something, I don't see the advantage (in the
examples from CM).

The advantages are that we eliminate:
- `canIncrement()`
-  `hasIncrement()`
- incrementNTimes()
- MaxCountExceededException
- MaxCountExceededException CB boilerplate

We gain a simpler API.

?
In CM, the throwing of an exception is mandated by the failure to
stop before counter exhaustion.
Lets unmandate.


The other methods are used within the Incrementor class, to provide
the "Iterator" and "range" functionality. [Those are not used within
CM, but I thought that they were interesting...]
If I had not been reading a ton of NodeJS code lately I would think so too, but 
now that I'm better at functional programming I think they are a bad idea :-(.


Secondly suppose we expect a sequence like 5, 10, 15, 20...but the
max is 17.  Do we loop past 17 and let that be the final loop, thus
passing in 20 to the increment listener / cb, or do we stop at 15?

The proposed IntegerSequence.Incrementor would trigger the callback
if "increment()" is called again after 15 is returned.
I think it's better if we just leave it up to the designer to figure
out the max number of steps.

Note that the increment does not stop by itself. If a range is created
its last value will be 15 (and the callback will never be called, by
construction).
That's another reason why I like just specifying only:
- start
- stepSize
- maxNumberOfSteps.

It's very easy to understand.  The callback is called when the
`maxNumberOfSteps` is reached  Not that the other way is that
complicated either :).

In the basic usage (where the step == 1), it's fairly identical.
Yup

Sometimes a "range" might be more natural.
Something like the IntStream API.



One thing I am saying though that is that if the `maxNumberOfSteps`
is reached and no `end` callback is registered, then we can still call
increment.  It just results in a noop().  An exception is not thrown.
If the caller wants to be notified when the counter is done, they have
to register a callback.

What would a use-case for CM?
It gets rid of the exception and `encourages` the client to register the CB if 
they want the notification.  I prefer the CB for notification, and I think we 
should have a CB, or an Exception, but not both.  I take it back.  I think we 
should just have a CB.



By
letting the developer calculate the number of steps, avoiding the use
of a max, we gain simplicity and flexibility.

In all CM usage, the number of steps is unknown; the Incrementor is
intended to avoid an infinite loop.

IIUC the max number of steps are known?  The number of steps until we
converge are unknown...and could reach the max number of steps.

As I said, it is mainly a fool-proof way to indicate failure rather than
run forever if e.g. the problem uncovered some implementation bug or
numerical problem (as it happened in a root solver).
A CB will do a better more lightweight (Less lines of code) job of delivering 
the message and mapping a root cause when necessary.


Lastly perhaps the `increment` callback wants to notify the
`incrementer / algorithm` that it's done.  In this case we only
specify the `start` and `stepSize` arguments and leave it up to the
`increment` callback to realize the termination point and stop the
algorithm.

Cf. previous paragraph.
I think we are saying the same thing there.

Termination of the counter is not the same as termination of the
algorithm.
I understand.  By using an incrementor and a max we are incrementing
either until:
- the algorithm concludes
- We reach a max and use the result

Dangerous (cf. above).  It's better to relax the convergence criterion
and be sure that the solution matches the request.
Sure - I won't talk about doing this via a CB as people following this are 
probably getting a headache, if they have not dropped out years ago :).


- We reach a max and indicate an error because we could not converge
on a result

All of these can be handled in a `increment` CB and a `end` CB.

Sure, but it requires a redesign.  IIRC there was a proposal quite some
time ago to rethink the class hierarchy in terms of an "IterativeAlgorithm".
Glad to hear that it has mindshare.


  In CM usage, the latter must occur first, and the second
signals a failure.

So when the `end` CB is called we either succeeded or had a failure.
The end CB can evaluate the results and determine either one. The
algorithm could also signal to the `end` CB whether the result is a
failure or success.

To further discuss, we'd have to see a bigger picture of how it would apply
to e.g. the "o.a.c.m.optim" package.

Cool.  One of these days I'm going to make time for a deep dive. The code is an 
invaluable learning resource.

Cheers,
- Ole


Regards,
Gilles

P.S. I'll commit the "IntegerSequence" class. You can try and patch it
     (or design an alternative) to show what you mean on some specific
     CM use of the old "Incrementor".

OK I think it would be fun to play more with this.  I've been meaning
to experiment with JDK 8 asynchronous callbacks as well. Hope to get
some free time soon!

Cheers,
- Ole


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org




---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org

Reply via email to