Hello,
I have been playing with an idea for a new project for a few months. I
asked for advice both at ApacheCon Europe and through direct contact;
all the responses I received were quite positive and suggested that I
set up a component in the sandbox. This message is the first public
announcement and is intended to collect the opinion of the whole commons
community about this project. In short: can I play in the sandbox with
this, or should I find another place for it? Another possibility would
be to put it inside [math], but that would be really strange.
The project already has a name: Nabla, which is an operator used in
mathematics and physics for differentiation. Its symbol is a simple
triangle pointing downwards (see
http://mathworld.wolfram.com/Nabla.html). Let's call the component I
want to develop [nabla] from now on, to match our local habits here.
There is some code for it, but it was developed only by myself, in my
spare time, on my personal computer, and was never distributed to
anyone. So I can consider that I developed it under the Apache umbrella
and put it in the sandbox with the Apache headers and license. I am
already a commons committer and have filed an Individual Contributor
License Agreement with Apache.
[nabla] will be a mathematics/physics library aimed at symbolically
differentiating any function provided as compiled bytecode.
Here is a typical use case for such a library. For some simulation
purposes, suppose I use a class with a method computing the consumption
of performing an action as a function of its start time:
  public class DifficultComputation {
    public double f(double t) {
      // some lengthy equations here
    }
  }
Now, in addition to computing the consumption itself, I want to be
able to compute the sensitivity of this consumption to changes in the
start time. This would allow me to say: if the action is started at
t = 10 seconds, then the consumption will be 1.2 kilograms, and this
consumption will increase by 10 grams for each second I delay the start.
The value of 10 grams per second of delay is computed by differentiating
the original equation. There are several ways to do that.
The first way relies on mathematical transformations of the equations
implemented in the function f. It implies mathematical analysis and
new development, which is very error-prone (computing the differential
of a function is much more complex than computing the function itself).
It is only feasible if you know the equations or have the source code of
the function. This approach may be used with symbolic computation
packages like Mathematica or Axiom, where you develop your equations
using these programs and have them generate the implementation for you.
However, the produced code is available only for some languages
(typically Fortran and C), it is awful and cannot be maintained (it is
not intended to be), and it needs to be integrated with the rest of the
application, which is already a difficult task.
The second way is to use numerical finite-differences schemes. These
algorithms basically compute several values by changing the start time
by a small known amount and looking at the various results. This implies
choosing the step, which may be difficult if you don't already know
the behavior of the function (should I use one microsecond or one
century here? In fact it depends on the problem). This is also either
quite computation-intensive if you use high-order schemes with 4, 6 or 8
points, or inaccurate if you don't use them. It is also impossible to
use too close to the function's domain boundaries, which are often the
locations we really want to explore.
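To make the step-size issue concrete, here is a minimal sketch of a
two-point centered difference. It is only an illustration: the class
name, the helper method and the stand-in function f(t) = t * sin(t) are
assumptions of mine, chosen because the exact derivative is known and
can be used to measure the error.

```java
import java.util.function.DoubleUnaryOperator;

/** Sketch of the finite-differences approach; all names here are illustrative. */
final class FiniteDifferenceSketch {

    /** Two-point centered difference: (f(t + h) - f(t - h)) / (2 h). */
    static double centered(DoubleUnaryOperator f, double t, double h) {
        return (f.applyAsDouble(t + h) - f.applyAsDouble(t - h)) / (2.0 * h);
    }

    public static void main(String[] args) {
        // Stand-in for the "lengthy equations": f(t) = t * sin(t).
        DoubleUnaryOperator f = t -> t * Math.sin(t);
        double t = 10.0;
        // Known derivative, used only to measure the error of each step size.
        double exact = Math.sin(t) + t * Math.cos(t);

        for (double h : new double[] {1.0, 1.0e-3, 1.0e-6, 1.0e-13}) {
            double approx = centered(f, t, h);
            System.out.printf("h = %.0e  estimate = %.12f  error = %.3e%n",
                              h, approx, Math.abs(approx - exact));
        }
    }
}
```

A step that is too large gives a poor estimate because of truncation
error, while a step that is too small typically suffers from
floating-point cancellation, so the right value has to be tuned for
each problem.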
[nabla] provides a third way to get this result. It analyses the
bytecode of the function at run time, performs the exact symbolic
mathematical transforms, and generates a new class implementing the
differentiated function. There is still a computation cost, but it is
the same as you would get from manually differentiated code, plus a
one-time bytecode differentiation overhead (but we can also cache
results).
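As a rough illustration of the intended result, here is a hand-written
equivalent of the kind of class such a differentiator could produce.
This is not [nabla]'s actual API or generated output: the class names,
the valueAndDerivative method and the short equation f(t) = t * sin(t)
are all hypothetical, chosen only to show what "manually differentiated
code" looks like.

```java
/** Hypothetical stand-in for DifficultComputation, with a short equation. */
class SimpleComputation {
    public double f(double t) {
        return t * Math.sin(t);
    }
}

/** Hand-written equivalent of the kind of class a differentiator could generate. */
class SimpleComputationDerivative {

    /** Returns {f(t), f'(t)}, applying the product rule to t * sin(t). */
    public double[] valueAndDerivative(double t) {
        double sin = Math.sin(t);
        double cos = Math.cos(t);
        double value = t * sin;
        // Product rule: d(t)/dt * sin(t) + t * d(sin(t))/dt
        double derivative = sin + t * cos;
        return new double[] { value, derivative };
    }
}

final class DerivativeSketch {
    public static void main(String[] args) {
        double t = 10.0;
        double[] r = new SimpleComputationDerivative().valueAndDerivative(t);
        System.out.println("f(t)  = " + r[0]);
        System.out.println("f'(t) = " + r[1]);
    }
}
```

The derivative here costs about as much as the function itself and
involves no step size at all; the idea is to obtain such a class
automatically instead of writing and maintaining it by hand.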
This approach has the following benefits:
- differentiation is exact
- there is no problem-dependent step size to handle
- derivatives can be computed even at domain boundaries
- there is no special handling of source
(no symbolic package with its own language, no source code
generation, no integration with the rest of the application)
- one writes and maintains only the basic equation and gets the
derivative for free
- it is effective even when source code is not available (but there
are licensing issues in this case of course, since what I do
automatically is really ... derived work)
The only drawback I see is that functions calling native code cannot be
handled. In this case, we have a fallback available with
finite-differences schemes.
The existing implementation is not yet ready for production. A lot of
work has been done, but there are many missing features. [nabla] can
handle simple functions from end to end (i.e. up to creating a fully
functional instance of the differentiated class). Making this code
available in the sandbox would let people look at it, comment on it,
participate if they are interested, and help make it go live.
What do you think about it?
Luc