Re: [all][nabla] proposition for a new project in sandbox

Luc Maisonobe Mon, 14 Apr 2008 15:22:12 -0700

Rory Winston wrote:

Hi Luc
I'm a bit confused about the purpose of [nabla]. You say you outline three methods of computing the derivative of the function you outlined. However in reality I think you have only described two. Unless I am missing the point of your example, your idea is basically performing symbolic differentiation, albeit at the bytecode level. Presumably you mean to have some sort of interpreter or grammar that can evaluate the bytecode sequence being executed and contains mappings between functions you know about (e.g. functions in Math.*) and their various derivatives f^(n)?

Yes

If this is the case, I'm just curious as to where this may be useful...if I wanted to do something like this now, I would examine the function to be differentiated, and either evaluate the derivative manually or via a package like Mathematica (full symbolic differentiation) or even R (limited symbolic differentiation via a simple table-based approach). I would then code the resulting derivative into my application explicitly.


This works only in limited cases.

First, you need to have the equation (the source code is not enough; you have to translate it back into the language of these packages, if it was not generated from them from the start). Second, it has to fit in the data model of these packages, i.e. be an equation like:


 f(x) := a * sin (x) + ....

A real implementation of a physical model is not that simple. It involves tens of functions calling each other, with loops, conditionals, assignments, local variables ... At the very lowest level, each individual operation could be represented in Mathematica, Axiom, whatever. However, the programming language that glues all these pieces together and adds the loops, conditionals ... does not fit within the models of these packages.
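
To make this concrete, here is a small hand-written sketch. It is not Nabla code, and the class and method names are hypothetical. The function is built by a loop and a branch rather than by one closed-form equation, yet its derivative can still be written by following the same control flow and differentiating each statement. The sketch ignores the branch points t = 1, 2, 3, where this particular function jumps and has no derivative.

```java
/** Hypothetical model: the value comes from a loop and a branch, not from a single equation. */
final class LoopedModel {

    /** Original computation. */
    static double f(double t) {
        double acc = 0.0;
        for (int i = 1; i <= 3; i++) {
            if (t > i) {
                acc += Math.sin(t) / i;
            } else {
                acc += t * t;
            }
        }
        return acc;
    }

    /** Hand-written derivative: same loop, same branch, each statement differentiated. */
    static double fPrime(double t) {
        double dAcc = 0.0;
        for (int i = 1; i <= 3; i++) {
            if (t > i) {
                dAcc += Math.cos(t) / i;   // d/dt of sin(t) / i
            } else {
                dAcc += 2.0 * t;           // d/dt of t * t
            }
        }
        return dAcc;
    }
}
```

Writing fPrime by hand is exactly the kind of mechanical, statement-by-statement work that can be automated once the control flow is available.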

Another problem is that there is no such thing as a single equation to handle. There are really hundreds, developed by different teams, at different times, for different purposes, and which end up being used together in a complete application. Even if I know the equations I use at my level, I do not necessarily know the equations implemented in the underlying models that were provided to me by co-workers or a research team from another country.

In such cases, I would have to make a tremendous effort to extract the equations from the various models, port them to Mathematica (without error), have them differentiated (this is the easy part with such a tool), convert the generated code to my platform and integrate everything back into a huge application with lots of interdependencies. This effort has to be repeated each time one part of the application is changed and a new model is optimized, fixed or added.

If I do not have source code, then I can't do this...but if I don't have source code, how can I isolate the derivative calculation?

The only thing Nabla needs is the called function. From there, it follows the data and control flows of the program. It simply does the same thing the processor does to execute the code, just differently.

So an entry point is enough to automate the differentiation process, without any need to go out of the runtime platform, have some specific tools perform the differentiation, and come back into the platform.

Again, please excuse my ignorance of the crux of your approach if I am way off in my assumptions...if there is something more here than that, then this could be interesting...otherwise if it is just symbolic diff (even deduced from bytecodes), then it doesn't seem much different than the symbolic diff covered in SICP 20 years ago.

It is just symbolic diff! It has existed for years. Mathematica, Axiom and even Emacs do this. However, it is not completely integrated with a general-purpose runtime platform like the JVM. It has been done in Lisp before, but this platform does not have the same momentum as the JVM now.

Nabla is simply an implementation, and it only merges two techniques from different fields: symbolic differentiation and analysis/generation/loading of bytecode at runtime.

I didn't say it was something completely new and unseen. It is an improvement in the workflow of scientific development, hiding boring and cumbersome transforms away from the developer.

Luc

Cheers!
Rory

On Sun, Apr 13, 2008 at 12:22 PM, Luc Maisonobe <[EMAIL PROTECTED]>
wrote:
Hello,
I have played with an idea for a new project for a few months. I asked for
advice both at ApacheCon Europe and through direct contact; all the
responses I received were quite positive and suggested that I set up a
component in the sandbox. This message is the first public announcement and
is intended to collect the opinion of the whole commons community about this
project. In short: can I play in the sandbox with this, or should I find
another place for it? Another possibility would be to put it inside [math],
but that would be really strange.

The project already has a name: Nabla, which is an operator used in
mathematics and physics for differentiation. It is a simple triangle
pointing downwards (see http://mathworld.wolfram.com/Nabla.html). Let's
call the component I want to develop [nabla] from now on, to match our local
habits here. There is some code for it, but it was developed only by myself,
in my spare time, on my personal computer, and never distributed to anyone.
So I can consider that I developed it under the Apache umbrella and put it
in the sandbox with the Apache headers and license. I am already a commons
committer and have filed an Individual Contributor License Agreement with
Apache.

[nabla] will be a mathematics/physics library aimed at building the
symbolic differentiation of any function provided as a bytecode compiled
function.

Here is a typical use case for such a library. For some simulation
purposes, suppose I use a class with a method computing the consumption of
performing an action as a function of its start time:

public class DifficultComputation {
    public double f(double t) {
        // some lengthy equations here
    }
}

Now in addition to computing the consumption by itself, I want to be able
to compute the sensitivity of this consumption to start time changes. This
would allow me to say: if the action is started at t = 10 seconds, then
consumption will be 1.2 kilograms, and this consumption will increase by 10
grams for each second I delay the start. The value 10 grams per second of
delay is computed by differentiating the original equation. There are
several ways to do that.

The first way relies on mathematical transformations of the equations
implemented in the function f. It implies mathematical analysis and new
development, which is very error-prone (computing the differential of a
function is much more complex than computing the function itself). It is
only feasible if you know the equations or have the source code of the
function. This approach may be used with symbolic computation packages like
Mathematica or Axiom, where you develop your equations using these programs
and have them generate the implementation for you. However, the produced
code is only available for some languages (typically Fortran and C), it is
awful and cannot be maintained (it is not intended to be), and it needs to
be integrated with the rest of the application, which is already a
difficult task.

The second way is using numerical finite-difference schemes. These
algorithms basically compute several values by changing the start time by a
small known amount and looking at the various results. This implies setting
up the step, which may be difficult if you don't already know the behavior
of the function (should I use one microsecond or one century here? In fact
it depends on the problem). This is also either quite computation-intensive
if you use high-order schemes with 4, 6 or 8 points, or inaccurate if you
don't use them. It is also impossible to use too close to function
boundaries, which are often locations where we really want to explore.
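
As an illustration of the step-size problem, here is a minimal sketch of the simplest two-point central scheme. The class name is hypothetical and Math.sin only stands in for a real model. With a large step the truncation error dominates, and with a very small step round-off typically takes over, so neither extreme is safe without knowing the function.

```java
import java.util.function.DoubleUnaryOperator;

/** Hypothetical helper illustrating a two-point central finite difference. */
final class FiniteDifference {

    /** Approximates f'(t) as (f(t + h) - f(t - h)) / (2h). */
    static double central(DoubleUnaryOperator f, double t, double h) {
        return (f.applyAsDouble(t + h) - f.applyAsDouble(t - h)) / (2.0 * h);
    }

    public static void main(String[] args) {
        DoubleUnaryOperator f = Math::sin;   // stand-in for a real model
        double t = 1.0;
        double exact = Math.cos(t);          // known derivative of the stand-in
        for (double h : new double[] {1.0e-1, 1.0e-5, 1.0e-13}) {
            double approx = central(f, t, h);
            System.out.printf("h = %.0e  error = %.3e%n", h, Math.abs(approx - exact));
        }
    }
}
```

Note that evaluating f at t + h and t - h also requires both points to lie inside the domain of f, which is why the scheme breaks down close to a boundary.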

[nabla] provides a third way to get this result. It analyses the bytecode
of the function at run time, performs the exact symbolic mathematical
transforms, and generates a new class implementing the differentiated
function. There is still a computation cost, but it is the same as you would
get from manually differentiated code, plus a one-time bytecode
differentiation overhead (but we can also cache results).
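
To show the kind of result such a differentiated function has to deliver, here is a hand-written sketch that carries a value and its derivative together through each operation. This is plain forward propagation of derivatives written by hand, not bytecode transformation and not Nabla's code; the Dual and DualExample names and the sample function a * sin(t) + t * t are assumptions made for the illustration.

```java
/** Hypothetical value/derivative pair; not part of any existing library. */
final class Dual {
    final double value;
    final double derivative;

    Dual(double value, double derivative) {
        this.value = value;
        this.derivative = derivative;
    }

    /** The independent variable: d(t)/dt = 1. */
    static Dual variable(double t) { return new Dual(t, 1.0); }

    /** A constant: d(c)/dt = 0. */
    static Dual constant(double c) { return new Dual(c, 0.0); }

    /** Sum rule. */
    Dual add(Dual o) {
        return new Dual(value + o.value, derivative + o.derivative);
    }

    /** Product rule. */
    Dual multiply(Dual o) {
        return new Dual(value * o.value, derivative * o.value + value * o.derivative);
    }

    /** Chain rule for sin. */
    Dual sin() {
        return new Dual(Math.sin(value), Math.cos(value) * derivative);
    }
}

final class DualExample {

    /** f(t) = a * sin(t) + t * t, written once on value/derivative pairs. */
    static Dual f(double a, Dual t) {
        return Dual.constant(a).multiply(t.sin()).add(t.multiply(t));
    }

    public static void main(String[] args) {
        Dual r = f(2.0, Dual.variable(10.0));
        System.out.println("f(10)  = " + r.value);
        System.out.println("f'(10) = " + r.derivative);
    }
}
```

The derivative obtained this way is a * cos(t) + 2 * t up to floating-point rounding, with no step size to choose. The difference with [nabla] is that here the function had to be rewritten by hand on a special type, whereas the idea of the proposal is to leave the original double-based code untouched.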

This approach has the following benefits:
- derivation is exact
- there is no problem-dependent step size to handle
- derivation can be computed even at domain boundaries
- there is no special handling of source
(no symbolic package with its own language, no source code
 generation, no integration with the rest of application)
- one writes and maintains only the basic equation and gets the
derivative for free
- it is effective even when source code is not available (but there
are licensing issues in this case of course, since what I do
automatically is really ... derived work)

The only drawback I see is that functions calling native code cannot be
handled. In this case, we have a fallback available with finite-difference
schemes.

The existing implementation is not yet ready for production. A lot of work
has been done, but there are many missing features. [nabla] can handle
simple functions from end to end (i.e. up to creating an instance of the
differentiated class that is fully functional). Making this code available
in the sandbox would let people look at it, comment on it,
participate if they are interested and make it go live.

What do you think about it?
Luc


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]