On 22/07/2012 22:48, Gilles Sadowski wrote:
> On Sun, Jul 22, 2012 at 12:44:07PM -0700, Phil Steitz wrote:
>> On 7/22/12 9:35 AM, Luc Maisonobe wrote:
>>> Hi all,
>>>
>>> A few months ago, Gilles proposed to add an implementation of
>>> Ridder's algorithm for differentiation (see
>>> <https://issues.apache.org/jira/browse/MATH-726>). After some
>>> discussion the issue was postponed, waiting for an API for this
>>> family of algorithms.
>>>
>>> Last week, at work, someone asked me again about differentiation and
>>> I directed him to this issue and also to the Nabla project. He told
>>> me he was interested in having some support in [math] for this (I'm
>>> not sure he reads this list, but if he does, I hope he will join the
>>> discussion).
>>>
>>> The basic needs are to get derivatives of either real-valued or
>>> vector-valued functions, either univariate or multivariate.
>>> Depending on the dimensions we have derivatives, gradients, or
>>> Jacobians. In some cases, the derivatives can be computed by exact
>>> methods (either because the user developed the code or by using an
>>> automated tool like Nabla). In other cases we have to rely on
>>> approximate algorithms (like Ridder's algorithm with iterations, or
>>> n-point finite differences).
>>>
>>> Most of the time, when a derivative is evaluated, either the user
>>> needs to evaluate the function too, or the function is evaluated as
>>> a side effect of the derivative computation. Numerous tools
>>> therefore pack the value and the derivative in one object (sometimes
>>> called a Rall number) and transform something like:
>>>
>>> double f(double x) { ... }
>>>
>>> into this:
>>>
>>> RallNumber f(RallNumber x) { ... }
>>>
>>> This is a very classical approach, very simple for languages that
>>> implement operator overloading (see the book "Evaluating
>>> Derivatives" by Andreas Griewank and Andrea Walther, SIAM 2008, or
>>> the paper "On the Implementation of Automatic Differentiation Tools"
>>> by Christian H.
>>> Bischof, Paul D. Hovland and Boyana Norris (unfortunately I was not
>>> able to find a public version of this paper), or simply look at the
>>> Wikipedia entry
>>> <http://en.wikipedia.org/wiki/Algorithmic_differentiation> and the
>>> presentation on automatic differentiation which seems to have been
>>> written by one of the people who contributed to the Wikipedia
>>> article <http://www.docstoc.com/docs/29133537/Automatic-differentiation>).
>>>
>>> This approach is simple to understand and compatible with all
>>> differentiation methods: direct user code for functions that are
>>> easily differentiated, exact generated code produced by tools,
>>> finite differences methods without any iteration, and iterative
>>> methods like Ridder's.
>>>
>>> So I suggest we use this as our basic building block. We already
>>> have an implementation of Rall numbers in Nabla, where it is called
>>> DifferentialPair
>>> (<http://commons.apache.org/sandbox/nabla/apidocs/org/apache/commons/nabla/core/DifferentialPair.html>).
>>> We can move this class into [math] and rename it RallNumber.
>
> Why is it called a Rall number?
I think it is a reference to L. B. Rall and his 1984 report "The
Arithmetic of Differentiation".

>
>> +1 - though I don't really see the need or big value in renaming
>> this. DifferentialPair is more descriptive.
>
> Maybe Rall number is more standard (I don't know)?

I am not sure it is very widely used. I got the name from the following
citation in the Bischof, Hovland and Norris paper: "In its simplest
form, operator overloading introduces a new class that carries function
values and derivatives (collectively called Rall numbers (Barton and
Nackman, 1996) or doublets (Bartholomew-Biggs et al., 1995))." I
finally found a public link to this paper:
<ftp://info.mcs.anl.gov/pub/tech_reports/reports/P1152.pdf>. There is
also a similar reference in the paper "Object oriented implementation
of a second-order optimization method" by L. F. D. Bras and
A. F. M. Azevedo
<http://www.alvaroazevedo.com/publications/meetings/2001/OPTI/OOISOOM.pdf>.
In both cases the referenced paper is: Barton, J. J. and L. R. Nackman:
1996, 'Automatic Differentiation'. C++ Report 8(2), 61-63.

> Maybe yet another name: I don't understand "DifferentialPair" either.

It simply depicts the content of this small container: two fields, f0
for the value and f1 for the first derivative.

>
>>> For real-valued univariate functions, we have the interface
>>> UnivariateFunction
>>> <http://commons.apache.org/math/apidocs/org/apache/commons/math3/analysis/UnivariateFunction.html>
>>> that defines the single method
>>>
>>> double value(double x)
>>>
>>> This interface is used in many places, it is simple, and it is well
>>> defined, so it should be kept as is.
>>>
>>> We also have a DifferentiableUnivariateFunction, which returns a
>>> separate UnivariateFunction for the derivative:
>>>
>>> UnivariateFunction derivative();
>>>
>>> This interface is used only in
>>> AbstractDifferentiableUnivariateSolver and in its single
>>> implementation NewtonSolver.
>>> This interface seems awkward to me, and costly, as it forces the
>>> function to be evaluated separately even when evaluating the
>>> derivative implies computing the function a few times. For very
>>> computation-intensive functions, this is a major problem.
>>>
>>> I suggest we deprecate this interface and add a new one defining the
>>> method:
>>>
>>> RallNumber value(RallNumber x)
>>>
>>> This interface could extend UnivariateFunction, in which case when
>>> we want to compute only the value we use the same object; or it
>>> could define a getPrimitive method that would return a separate
>>> primitive object (this is how it is done in Nabla, with the
>>> interface UnivariateDerivative
>>> <http://commons.apache.org/sandbox/nabla/apidocs/org/apache/commons/nabla/core/UnivariateDerivative.html>);
>>> or it could even be completely separate from UnivariateFunction.
>>
>> I would say leave UnivariateFunction as is (no need to force it into
>> a hierarchy) and just add DifferentialPair, including getPrimitive -
>> basically stick with the [nabla] design. I am +1 for deprecating
>> DifferentiableUnivariateFunction.
>
> I rather like the idea of extending the UnivariateFunction interface.
> This makes sense since the "UnivariateDerivative" really extends the
> functionality: you cannot have a derivative without a function.

Yes, this is one possibility.

>
> What is "getPrimitive"?

It is a method defined in the current Nabla interface
<http://commons.apache.org/sandbox/nabla/apidocs/org/apache/commons/nabla/core/UnivariateDerivative.html>.
When we compute derivatives, we almost always want the value too, hence
the RallNumber/DifferentialPair. However, the reverse is not true:
sometimes we want the value only, not the derivative. In that case,
computing everything using the differential pair and throwing away the
derivative to keep only the value would be a waste of resources. This
method was intended to get a hold on the original (non-differentiated)
object.
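To make the discussion concrete, here is a minimal sketch of the two pieces described above: a value/derivative container (with the f0/f1 fields mentioned earlier) and a derivative interface that both extends UnivariateFunction and exposes getPrimitive. Beyond the names quoted in the thread, everything here (the Square class, the variable factory method) is illustrative, not the actual Nabla or [math] API:

```java
// Minimal stand-in for the Rall number / DifferentialPair container:
// f0 holds the function value, f1 the first derivative.
final class DifferentialPair {
    final double f0;
    final double f1;
    DifferentialPair(double f0, double f1) { this.f0 = f0; this.f1 = f1; }
    /** An independent variable x, whose derivative dx/dx is 1. */
    static DifferentialPair variable(double x) {
        return new DifferentialPair(x, 1.0);
    }
}

interface UnivariateFunction {
    double value(double x);
}

// Inheritance variant: the differentiated function still "is" a
// UnivariateFunction, and also offers the aggregation-style getPrimitive.
interface UnivariateDerivative extends UnivariateFunction {
    DifferentialPair value(DifferentialPair t);
    UnivariateFunction getPrimitive();
}

// Hand-coded exact derivative of f(x) = x^2, the "differentiated directly
// at development time" case: (x^2)' = 2 x x' by the chain rule.
final class Square implements UnivariateDerivative {
    public double value(double x) { return x * x; }
    public DifferentialPair value(DifferentialPair t) {
        return new DifferentialPair(t.f0 * t.f0, 2.0 * t.f0 * t.f1);
    }
    public UnivariateFunction getPrimitive() { return this; }
}
```

With this shape, callers that only need the value use value(double) on the same object, and callers that need both call value(DifferentialPair); getPrimitive covers the aggregation style as well.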
So basically, we have the following scenario:

- the user implements a complicated function:

    class MyFunction implements UnivariateFunction {
        double value(double x) { ... }
    }

- the user creates an instance and differentiates it:

    UnivariateDerivative fPrime = new SomeSuperDifferentiator(new MyFunction());

- when both the value and the derivative are needed, he uses:

    DifferentialPair x = ...;
    DifferentialPair y = fPrime.value(x);

- when only the value is needed:

    double x = ...;
    double y = fPrime.getPrimitive().value(x);

Of course, in this simple case the user may store the original instance
himself, but he would need to pass both f and fPrime around in large
applications, which may be cumbersome. The getPrimitive method ensures
he can get back f from fPrime (it is simply a pointer to the original
function, no fancy integration).

If UnivariateDerivative extends UnivariateFunction, the user simply has
two methods called value, one with a double argument and one with a
DifferentialPair argument, so it is even simpler than getPrimitive.
getPrimitive is probably still useful in Nabla (we need to be able to
retrieve the instance and the class, since we do tricks with the
bytecode itself). So I wonder whether it is useful to have it defined
directly in the UnivariateDerivative interface. I think the user should
have a way to compute the function without the derivative, so both
approaches (extending the interface or having a getPrimitive) are
useful.

This is merely a question of inheritance versus aggregation. A
derivative is obviously linked with an original function, but should it
"be" the function too, or "have" the function as one of its fields? The
third approach (having the derivative completely independent of the
original function) is not good in my opinion, since it forces users to
pass both objects along. I slightly tend to prefer aggregation.
One highly subjective reason is that when we create a derivative
automatically (say using simple finite differences), we do not create a
new class that extends the original class; we create a wrapper that
contains a reference to the original class and calls it n times for an
n-point finite differences scheme.

>
>>> Multivariate differentiable functions or vector-valued functions
>>> would be handled exactly the same way: replacing the parameter or
>>> return value types from double arrays to RallNumber arrays. This
>>> would be much more consistent than what we currently have (we have
>>> methods called partialDerivatives, gradients, jacobians, and even
>>> several methods called derivative which do not have the same
>>> signatures).
>>>
>>> For functions that can be differentiated directly at development
>>> time, this layer would be sufficient. We can typically use it for
>>> all the function objects we provide (Sin, Exp, ...) as well as the
>>> operators (Multiply, Subtract, ...).
>>>
>>> For more complex functions, we need to add another layer that would
>>> take a simple function (say a UnivariateFunction for the real-valued
>>> univariate case) and would create the differentiated version. For
>>> this, I suggest creating a Differentiator interface (similar in
>>> spirit to Nabla's UnivariateDifferentiator
>>> <http://commons.apache.org/sandbox/nabla/apidocs/org/apache/commons/nabla/core/UnivariateDifferentiator.html>).
>>> We should probably build a set of interfaces in the same spirit as
>>> the solver interfaces. This interface would be implemented by
>>> engines that can compute derivatives. In most cases, these engines
>>> would simply return a wrapper around the function they get as a
>>> parameter. The wrapper would implement the appropriate derivative
>>> interface. When the wrapper's "value" method is called, the
>>> underlying raw univariate "value" method would be called a number of
>>> times to compute the derivative.
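The aggregation idea above can be sketched with a 2-point central difference wrapper: the "derivative" object holds a reference to the original function and calls it twice per differentiated evaluation. Class names, the fixed step size, and the lambda-based usage are illustrative assumptions, not the Nabla or [math] code (a production scheme would choose the step and formula more carefully):

```java
interface UnivariateFunction {
    double value(double x);
}

// Value/derivative container (f0 = value, f1 = first derivative).
final class DifferentialPair {
    final double f0;
    final double f1;
    DifferentialPair(double f0, double f1) { this.f0 = f0; this.f1 = f1; }
}

// Aggregation: the wrapper contains a reference to the original function
// and calls it n times (here n = 2) per derivative evaluation.
final class TwoPointDifferentiator implements UnivariateFunction {
    private final UnivariateFunction primitive; // wrapped, not extended
    private final double h;                     // finite difference step

    TwoPointDifferentiator(UnivariateFunction primitive, double h) {
        this.primitive = primitive;
        this.h = h;
    }

    /** Value-only calls delegate straight to the original function. */
    public double value(double x) { return primitive.value(x); }

    /** Value and derivative: (f(x+h) - f(x-h)) / (2h), scaled by t.f1. */
    public DifferentialPair value(DifferentialPair t) {
        double slope =
            (primitive.value(t.f0 + h) - primitive.value(t.f0 - h)) / (2.0 * h);
        return new DifferentialPair(primitive.value(t.f0), t.f1 * slope);
    }

    /** getPrimitive is just a pointer back to the original function. */
    public UnivariateFunction getPrimitive() { return primitive; }
}
```

Note that the wrapper never subclasses the user's function: swapping in a 4-, 6- or 8-point scheme, or an iterative one like Ridder's, only changes the wrapper's internals.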
>>>
>>> We could provide several implementations of these interfaces. For
>>> univariate real functions, we should at least provide the finite
>>> differences methods with 2, 4, 6 and 8 points (we already have them
>>> in Nabla), and we should also provide Ridder's algorithm. Users or
>>> upper-level libraries and applications could provide other
>>> implementations as well. In fact, Nabla would be such a case: it
>>> would provide an algorithmic implementation and would depend on
>>> [math].
>>>
>>> We could also provide implementations for the upper dimensions
>>> (multivariate or vector-valued), simply using the univariate
>>> implementations underneath (typically, computing a gradient involves
>>> computing n derivatives for the n parameters of the same function,
>>> considering each time a different parameter as the free variable).
>>> Here again, more efficient implementations could be added for
>>> specific cases by users.
>>>
>>> The separation between the low-level layer (derivatives definition)
>>> and the upper layer (transforming a function that does not compute
>>> derivatives into a function that does) is a major point. The guy who
>>> talked to me last week insisted on this, as he wanted to avoid
>>> approximate (and computation-intensive) algorithms when he can, and
>>> he wanted to have these algorithms (like Ridder's) available when he
>>> cannot do without them. So the focus is not to have Ridder's
>>> algorithm by itself, but rather to compute derivatives, using
>>> Ridder's algorithm or something else.
>
> I've withdrawn the Ridders' JIRA ticket. I've implemented it for my
> local needs.
> Given that I've had some problems with it, I'd suggest not focusing on
> this algorithm as a test case for the framework which you propose (if
> that's what you meant in the above paragraph); at first, it's probably
> better to stick with the finite differences formulae.
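The "gradient as n univariate derivatives" idea quoted above can be sketched as follows: for each of the n parameters, consider it as the free variable and apply a univariate 2-point central difference, leaving the other parameters fixed. The class and method names are hypothetical, chosen just for this illustration:

```java
import java.util.function.ToDoubleFunction;

final class FiniteDifferenceGradient {
    /**
     * Gradient of f at point p: one univariate central difference per
     * coordinate, so n parameters cost 2n evaluations of f.
     */
    static double[] gradient(ToDoubleFunction<double[]> f, double[] p, double h) {
        double[] grad = new double[p.length];
        for (int i = 0; i < p.length; i++) {
            // perturb only coordinate i, keeping the others fixed
            double[] plus  = p.clone();
            double[] minus = p.clone();
            plus[i]  += h;
            minus[i] -= h;
            grad[i] = (f.applyAsDouble(plus) - f.applyAsDouble(minus)) / (2.0 * h);
        }
        return grad;
    }
}
```

A Jacobian would follow the same pattern, one such gradient per component of a vector-valued function; more efficient specialized implementations can replace this generic fallback.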
OK, then I will start with a simple import of Nabla, perhaps folding
all the finite differences classes into one with a user argument for
the number of points (and caching of the coefficients, just as you did
for the Gauss-Legendre rules).

>
>>> Of course, I can provide all the raw interfaces and finite
>>> differences implementations. If someone provides me the paper for
>>> Ridder's method, I can also implement it (or if someone wants to do
>>> it once the interfaces have been set up, that is OK too).
>>>
>>> What do you think about this?
>
> That looks great.
>
>> All of this sounds reasonable. The machinery in [nabla] for
>> building DifferentialPairs recursively should probably also be
>> brought in.

OK, I will bring everything in.

best regards,
Luc

>
> Regards,
> Gilles
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org