Hello,

I have been playing with an idea for a new project for a few months. I asked for advice both at ApacheCon Europe and through direct contact; all the responses I received were quite positive and suggested that I set up a component in the sandbox. This message is the first public announcement and is intended to collect the opinion of the whole commons community about this project. In short: can I play in the sandbox with this, or should I find another place for it? Another possibility would be to put it inside [math], but that would be really strange.

The project already has a name: Nabla, which is an operator used in mathematics and physics for differentiation. Its symbol is a simple triangle pointing downwards (see http://mathworld.wolfram.com/Nabla.html). Let's call the component I want to develop [nabla] from now on, to match our local habits here. There is some code for it, but it was developed only by myself, in my spare time, on my personal computer, and it has never been distributed to anyone. So I can consider that I developed it under the Apache umbrella and put it in the sandbox with the Apache headers and license. I am already a commons committer and have filed an Individual Contributor License Agreement with Apache.

[nabla] will be a mathematics/physics library aimed at symbolically differentiating any function provided as compiled bytecode.

Here is a typical use case for such a library. For some simulation purposes, suppose I use a class with a method computing the consumption of performing an action as a function of its start time:

public class DifficultComputation {
  public double f(double t) {
    // some lengthy equations here
  }
}

Now, in addition to computing the consumption itself, I want to be able to compute the sensitivity of this consumption to changes in the start time. This would allow me to say: if the action is started at t = 10 seconds, then the consumption will be 1.2 kilograms, and this consumption will increase by 10 grams for each second I delay the start. The value of 10 grams per second of delay is computed by differentiating the original equation. There are several ways to do that.
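Written out, the statement above is just a first-order approximation. In the sketch below, C denotes the consumption computed by f and the delay of 5 seconds is an arbitrary value chosen only for illustration:

```latex
C(t_0 + \Delta t) \approx C(t_0) + C'(t_0)\,\Delta t
% with C(10\,\mathrm{s}) = 1.2\,\mathrm{kg} and C'(10\,\mathrm{s}) = 0.010\,\mathrm{kg/s}:
C(10\,\mathrm{s} + 5\,\mathrm{s}) \approx 1.2\,\mathrm{kg} + 0.010\,\mathrm{kg/s} \times 5\,\mathrm{s} = 1.25\,\mathrm{kg}
```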

The first way relies on mathematical transformations of the equations implemented in the function f. It implies mathematical analysis and new development, which is very error-prone (computing the differential of a function is much more complex than computing the function itself). It is only feasible if you know the equations or have the source code of the function. This approach may be used with symbolic computation packages like Mathematica or Axiom, where you develop your equations using these programs and have them generate the implementation for you. However, the produced code is only available for some languages (typically Fortran and C), it is awful and cannot be maintained (it is not intended to be), and it needs to be integrated with the rest of the application, which is already a difficult task.

The second way is to use numerical finite-difference schemes. These algorithms basically compute several values by changing the start time by a small known amount and looking at the various results. This implies choosing the step, which may be difficult if you don't already know the behavior of the function (should I use one microsecond or one century here? In fact, it depends on the problem). This is also either quite computation-intensive if you use high-order schemes with 4, 6 or 8 points, or inaccurate if you don't use them. It is also impossible to use too close to the boundaries of the function's domain, which are often the locations we really want to explore.
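To make the step-size and boundary problems concrete, here is a minimal sketch of a two-point centered scheme. The class name FiniteDifferenceDemo and the test functions t * sin(t) and x * sqrt(x) are assumptions chosen only for illustration; they stand in for the "lengthy equations" and are not part of [nabla].

```java
import java.util.function.DoubleUnaryOperator;

final class FiniteDifferenceDemo {

    /** Two-point centered difference: (f(t + h) - f(t - h)) / (2h). */
    static double centered(DoubleUnaryOperator f, double t, double h) {
        return (f.applyAsDouble(t + h) - f.applyAsDouble(t - h)) / (2.0 * h);
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the real function: f(t) = t * sin(t)
        DoubleUnaryOperator f = t -> t * Math.sin(t);
        double t = 10.0;
        // Reference derivative obtained by hand: f'(t) = sin(t) + t * cos(t)
        double exact = Math.sin(t) + t * Math.cos(t);

        // The error depends strongly on the step, which the user has to guess
        for (double h : new double[] {1.0, 1.0e-3, 1.0e-6, 1.0e-13}) {
            double approx = centered(f, t, h);
            System.out.printf("h = %.0e  error = %.3e%n", h, Math.abs(approx - exact));
        }

        // At a domain boundary the scheme needs g(-h), which is NaN here
        DoubleUnaryOperator g = x -> x * Math.sqrt(x);
        System.out.println("at boundary: " + centered(g, 0.0, 1.0e-3));
    }
}
```

A step that is too large suffers from truncation error and a step that is too small suffers from roundoff, so the user is left to guess a value in between for each new problem.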

[nabla] provides a third way to get this result. It analyses the bytecode of the function at run time, performs the exact symbolic mathematical transforms, and generates a new class implementing the differentiated function. There is still a computation cost, but it is the same as you would get from manually differentiated code, plus a one-time bytecode differentiation overhead (but we can also cache the results).
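To give an idea of what an exact derivative with no step size looks like in code, here is a small sketch. It deliberately uses a different and much simpler technique than bytecode transformation: the value and the derivative are carried together through each elementary operation, applying the usual differentiation rules. The classes Dual and DualDemo and their methods are hypothetical names invented for this illustration; they are not [nabla]'s API and not the code it generates.

```java
/** A value together with its derivative with respect to one variable. */
final class Dual {
    final double value;      // f(t)
    final double derivative; // df/dt

    Dual(double value, double derivative) {
        this.value = value;
        this.derivative = derivative;
    }

    /** The independent variable itself: dt/dt = 1. */
    static Dual variable(double t) {
        return new Dual(t, 1.0);
    }

    /** Product rule: (u v)' = u' v + u v'. */
    Dual times(Dual other) {
        return new Dual(value * other.value,
                        derivative * other.value + value * other.derivative);
    }

    /** Chain rule: (sin u)' = cos(u) u'. */
    Dual sin() {
        return new Dual(Math.sin(value), Math.cos(value) * derivative);
    }
}

final class DualDemo {
    /** Hypothetical stand-in for the real function: f(t) = t * sin(t). */
    static Dual f(Dual t) {
        return t.times(t.sin());
    }

    public static void main(String[] args) {
        Dual result = f(Dual.variable(10.0));
        System.out.println("f(10)  = " + result.value);
        System.out.println("f'(10) = " + result.derivative);
    }
}
```

The derivative costs a few extra arithmetic operations per elementary operation of the original function, which is the same order of cost as hand-written differentiated code, and no step has to be chosen.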

This approach has the following benefits:
 - differentiation is exact
 - there is no problem-dependent step size to handle
 - derivatives can be computed even at domain boundaries
 - there is no special handling of source
   (no symbolic package with its own language, no source code
    generation, no integration with the rest of the application)
 - one writes and maintains only the basic equation and gets the
   derivative for free
 - it is effective even when source code is not available (but there
   are licensing issues in this case of course, since what I do
   automatically is really ... derived work)

The only drawback I see is that functions calling native code cannot be handled. In this case, we have a fallback available with finite-difference schemes.

The existing implementation is not yet ready for production. A lot of work has been done, but many features are still missing. [nabla] can handle simple functions from end to end (i.e. up to creating an instance of the differentiated class that is fully functional). Making this code available in the sandbox would let people look at it, comment on it, participate if they are interested, and help make it go live.

What do you think about it?
Luc

