Hello,
I have been playing with an idea for a new project for a few months.
I asked for advice both at ApacheCon Europe and through direct contact;
all the responses I received were quite positive and suggested that I
set up a component in the sandbox. This message is the first public
announcement and is intended to collect the opinion of the whole commons
community about this project. In short: can I play in the sandbox with
this, or should I find another place for it? Another possibility would
be to put it inside [math], but that would be really strange.
The project already has a name: Nabla, which is an operator used in
mathematics and physics for differentiation. Its symbol is a simple
triangle pointing downwards (see http://mathworld.wolfram.com/Nabla.html).
Let's call the component I want to develop [nabla] from now on, to match
our local habits here. There is some code for it, but it was developed
only by myself, in my spare time, on my personal computer, and it was
never distributed to anyone. So I can consider that I developed it under
the Apache umbrella and put it in the sandbox with the Apache headers
and license. I am already a commons committer and have filed an
Individual Contributor License Agreement with Apache.
[nabla] will be a mathematics/physics library aimed at symbolically
differentiating any function provided as a bytecode-compiled function.
Here is a typical use case for such a library. For some simulation
purposes, suppose I use a class with a method computing the consumption
incurred by performing an action, as a function of its start time:
    public class DifficultComputation {
        public double f(double t) {
            // some lengthy equations here
        }
    }
Now, in addition to computing the consumption itself, I want to be able
to compute the sensitivity of this consumption to changes in the start
time. This would allow me to say: if the action is started at
t = 10 seconds, then the consumption will be 1.2 kilograms, and this
consumption will increase by 10 grams for each second I delay the start.
The value of 10 grams per second of delay is computed by differentiating
the original equation. There are several ways to do that.
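To make the arithmetic of this example concrete, here is a minimal
sketch of the first-order estimate f(t0 + delay) ~ f(t0) + f'(t0) * delay.
The class and method names are made up for illustration only, and the
linear estimate is only meaningful for small delays.

```java
class SensitivitySketch {

    /** First-order estimate: f(t0 + delay) ~ f(t0) + f'(t0) * delay. */
    static double estimate(double valueAtT0, double derivativeAtT0, double delay) {
        return valueAtT0 + derivativeAtT0 * delay;
    }

    public static void main(String[] args) {
        double consumption = 1.2;    // kg, value of f at t0 = 10 s (figure from the text)
        double sensitivity = 0.010;  // kg/s, i.e. 10 grams per second of delay

        // Starting 3 seconds late should cost about 30 grams more
        System.out.println("estimated consumption (kg): "
                           + estimate(consumption, sensitivity, 3.0));
    }
}
```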
The first way relies on mathematical transformations of the equations
implemented in the function f. It implies mathematical analysis and new
development, which is very error-prone (computing the differential of a
function is much more complex than computing the function itself). It is
only feasible if you know the equations or have the source code of the
function. This approach may be used with symbolic computation packages
like Mathematica or Axiom, where you develop your equations using these
programs and have them generate the implementation for you. However, the
produced code is available only for some languages (typically Fortran
and C), it is awful and cannot be maintained (it is not intended to be),
and it needs to be integrated with the rest of the application, which is
already a difficult task.
The second way is to use numerical finite-difference schemes. These
algorithms basically compute several values by changing the start time
by a small known amount and looking at the various results. This implies
choosing the step, which may be difficult if you don't already know the
behavior of the function (should I use one microsecond or one century
here? In fact, it depends on the problem). It is also either quite
computation-intensive if you use high-order schemes with 4, 6 or 8
points, or inaccurate if you don't use them. It is also impossible to
use too close to the function's domain boundaries, which are often the
locations where we really want to explore.
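As an illustration of these drawbacks, here is a minimal sketch of the
simplest two-point central difference scheme. The function used is a
made-up stand-in for the lengthy equations of f, and the class and
method names are mine, chosen only for this example.

```java
import java.util.function.DoubleUnaryOperator;

class FiniteDifferenceSketch {

    /** Two-point central difference: (f(t + h) - f(t - h)) / (2 h). */
    static double centralDifference(DoubleUnaryOperator f, double t, double h) {
        return (f.applyAsDouble(t + h) - f.applyAsDouble(t - h)) / (2.0 * h);
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the real function
        DoubleUnaryOperator f = x -> Math.sin(x) * Math.exp(-0.1 * x);
        double start = 10.0;

        // Exact derivative, worked out by hand, used only for comparison
        double exact = (Math.cos(start) - 0.1 * Math.sin(start)) * Math.exp(-0.1 * start);

        // The error depends strongly on the step: a large step gives
        // truncation error, a tiny step gives round-off error
        for (double h : new double[] {1.0, 1.0e-3, 1.0e-6, 1.0e-12}) {
            double approx = centralDifference(f, start, h);
            System.out.printf("h = %.0e  error = %.3e%n", h, Math.abs(approx - exact));
        }

        // Near a domain boundary the scheme evaluates f outside its domain:
        // sqrt is only defined for x >= 0, so t - h < 0 yields NaN
        System.out.println(centralDifference(Math::sqrt, 1.0e-4, 1.0e-3));
    }
}
```

Higher-order schemes reduce the truncation error but need more function
evaluations per derivative, which is the computation cost mentioned
above.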
[nabla] provides a third way to get this result. It analyses the
bytecode of the function at run time, performs the exact symbolic
mathematical transformations, and generates a new class implementing the
differentiated function. There is still a computation cost, but it is
the same as you would get from manually differentiated code, plus a
one-time bytecode differentiation overhead (and we can also cache the
results).
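To give an idea of what "manually differentiated code" means here, the
following sketch shows a small made-up function next to its derivative
written by hand with the product rule. This is not [nabla]'s API or its
actual output, and all names are hypothetical; it only illustrates the
kind of exact code the bytecode transformation is meant to produce
without any human intervention.

```java
class ExactDerivativeSketch {

    /** Hypothetical original function: f(t) = t^2 sin(t). */
    static double f(double t) {
        return t * t * Math.sin(t);
    }

    /**
     * Derivative written by hand using the product rule:
     * f'(t) = 2 t sin(t) + t^2 cos(t).
     * It is exact (up to floating-point rounding) and needs no step size.
     */
    static double fPrime(double t) {
        return 2.0 * t * Math.sin(t) + t * t * Math.cos(t);
    }

    public static void main(String[] args) {
        double t = 10.0;
        System.out.println("f(t)  = " + f(t));
        System.out.println("f'(t) = " + fPrime(t));
    }
}
```

Writing fPrime by hand is easy for such a short expression, but it has
to be redone and re-verified each time f changes, which is exactly the
maintenance burden an automatic transformation would remove.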
This approach has the following benefits:
- differentiation is exact
- there is no problem-dependent step size to handle
- the derivative can be computed even at domain boundaries
- there is no special handling of the source
  (no symbolic package with its own language, no source code
  generation, no integration with the rest of the application)
- one writes and maintains only the basic equation and gets the
  derivative for free
- it is effective even when source code is not available (but there
  are licensing issues in this case of course, since what I do
  automatically is really ... derived work)
The only drawback I see is that functions calling native code cannot be
handled. In this case, we have a fallback available with
finite-difference schemes.
The existing implementation is not yet ready for production. A lot of
work has been done, but there are many missing features. [nabla] can
handle simple functions from end to end (i.e. up to creating a fully
functional instance of the differentiated class). Making this code
available in the sandbox would let people look at it, comment on it,
participate if they are interested, and help make it go live.
What do you think about it?
Luc