Rory Winston wrote:
Hi Luc

I'm a bit confused about the purpose of [nabla]. You say you outline three methods of computing the derivative of the function you outlined. However in reality I think you have only described two. Unless I am missing the point of your example, your idea is basically performing symbolic differentiation, albeit at the bytecode level. Presumably you mean to have some sort of interpreter or grammar that can evaluate the bytecode sequence being executed and contains mappings between functions you know about (e.g. functions in Math.*) and their various derivatives f^(n)?

Yes
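To make the idea of a table of known functions and their derivatives concrete, here is a minimal sketch. The class name `DerivativeTable`, the map and the `chain` helper are hypothetical and chosen only for illustration; this message does not describe Nabla's actual internal tables.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.DoubleUnaryOperator;

// Hypothetical illustration, not Nabla code: a table mapping a few
// java.lang.Math functions to their first derivatives.
class DerivativeTable {
    static final Map<String, DoubleUnaryOperator> FIRST_DERIVATIVE = new HashMap<>();
    static {
        FIRST_DERIVATIVE.put("sin", Math::cos);
        FIRST_DERIVATIVE.put("cos", x -> -Math.sin(x));
        FIRST_DERIVATIVE.put("exp", Math::exp);
        FIRST_DERIVATIVE.put("log", x -> 1.0 / x);
        FIRST_DERIVATIVE.put("sqrt", x -> 0.5 / Math.sqrt(x));
    }

    // Chain rule for g(t) = name(u(t)): g'(t) = name'(u) * u'(t).
    static double chain(String name, double u, double du) {
        return FIRST_DERIVATIVE.get(name).applyAsDouble(u) * du;
    }
}
```

For example, `DerivativeTable.chain("sin", u, du)` evaluates cos(u) * du, which is the derivative of sin(u(t)) once u and its derivative du are known.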

If this is the case, I'm just curious as to where this may be useful...if I wanted to do something like this now, I would examine the function to be differentiated, and either evaluate the derivative manually or via a package like Mathematica (full symbolic differentiation) or even R (limited symbolic differentiation via a simple table-based approach). I would then code the resulting derivative into my application explicitly.

This works only in limited cases.
First, you need to have the equation (the source code is not enough: you have to translate it back into the language of these packages, if it was not generated from them in the first place). Second, it has to fit the data model of these packages, i.e. be an equation like:

 f(x) := a * sin (x) + ....

A real implementation of a physical model is not that simple. It involves tens of functions calling each other, with loops, conditionals, assignments, local variables ... At the very lowest level, each individual operation could be represented in Mathematica, Axiom, or whatever. However, the programming language that glues all these pieces together and adds the loops, conditionals ... does not fit within the models of these packages.
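As a small illustration of the kind of code meant here, consider the sketch below. The class `ConsumptionModel`, the helper `drag` and all numeric constants are invented for the example and do not come from any real model. The function is still differentiable with respect to t, yet it is not written as a single closed-form equation.

```java
// Hypothetical example: a function built from a helper, a loop,
// a conditional and local variables.
class ConsumptionModel {
    // Stands in for a model provided by another team.
    static double drag(double v) {
        return v < 10.0 ? 0.5 * v : 0.05 * v * v;
    }

    // Consumption as a function of the start time t.
    static double f(double t) {
        double v = Math.sin(t) + 2.0;   // local variable
        double total = 0.0;
        for (int i = 0; i < 5; i++) {   // loop
            total += drag(v) * 0.1;     // call into another function
            v += 0.5 * i;               // assignment changing later iterations
        }
        return total;
    }
}
```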

Another problem is that there is no such thing as a single equation to handle. There are really hundreds, developed by different teams, at different times, for different purposes and which end up being used together in a complete application. Even if I know the equations I use at my level, I do not necessarily know the equations implemented in the underlying models that were provided to me by co-workers or a research team from another country.

In such cases, I would have to make a tremendous effort to extract the equations from the various models, port them to Mathematica (without error), have them differentiated (this is the easy part with such a tool), convert the generated code to my platform and integrate everything back into a huge application with lots of interdependencies. This effort has to be repeated each time one part of the application is changed and a new model is optimized, fixed or added.

If I do not have source code, then I can't do this...but if I don't have source code, how can I isolate the derivative calculation?

The only thing Nabla needs is the called function. From there, it follows the data and control flows of the program. It simply does the same thing the processor does to execute the code, just differently.

So an entry point is enough to automate the differentiation process, without any need to leave the runtime platform, have some specific tool perform the differentiation, and come back into the platform.


Again, please excuse my ignorance of the crux of your approach if I am way off in my assumptions...if there is something more here than that, then this could be interesting...otherwise if it is just symbolic diff (even deduced from bytecodes), then it doesn't seem much different than the symbolic diff covered in SICP 20 years ago.

It is just symbolic diff! It has existed for years. Mathematica, Axiom and even Emacs do this. However, it is not completely integrated with a general-purpose runtime platform like the JVM. It has been done in Lisp before, but that platform does not have the same momentum as the JVM has now.

Nabla is simply an implementation, and it only merges two techniques from different fields: symbolic differentiation and analysis/generation/loading of bytecode at runtime.

I didn't say it was something completely new and unseen. It is an improvement in the workflow of scientific development, hiding boring and cumbersome transformations from the developer.

Luc


Cheers!
Rory

On Sun, Apr 13, 2008 at 12:22 PM, Luc Maisonobe <[EMAIL PROTECTED]> wrote:

Hello,

I have played with an idea for a new project for a few months. I asked for
advice both at ApacheCon Europe and through direct contacts; all the
responses I received were quite positive and suggested that I set up a
component in the sandbox. This message is the first public announcement and
is intended to collect the opinion of the whole commons community about this
project. In short: can I play in the sandbox with this, or should I find
another place for it? Another possibility would be to put it inside [math],
but that would be really strange.

The project already has a name: Nabla, which is an operator used in
mathematics and physics for differentiation. It is a simple triangle
pointing downwards (see http://mathworld.wolfram.com/Nabla.html). Let's
call the component I want to develop [nabla] from now on, to match our local
habits here. There is some code for it, but it was developed only by myself,
in my spare time, on my personal computer, and it was never distributed to
anyone. So I can consider that I developed it under the Apache umbrella and
put it in the sandbox with the Apache headers and license. I am already a
commons committer and have filed an Individual Contributor License Agreement
with Apache.

[nabla] will be a mathematics/physics library aimed at symbolically
differentiating any function provided as compiled bytecode.

Here is a typical use case for such a library. For some simulation
purposes, suppose I use a class with a method computing the consumption of
performing an action as a function of its start time:

public class DifficultComputation {
    public double f(double t) {
        // some lengthy equations here
    }
}

Now, in addition to computing the consumption itself, I want to be able to
compute the sensitivity of this consumption to changes in the start time.
This would allow me to say: if the action is started at t = 10 seconds, then
the consumption will be 1.2 kilograms, and this consumption will increase by
10 grams for each second I delay the start. The value of 10 grams per second
of delay is computed by differentiating the original equation. There are
several ways to do that.
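As a worked illustration of the figures above, the derivative gives a first-order estimate of the consumption for a delayed start. The 5-second delay is an arbitrary value picked for this example, and the estimate is only meaningful for small delays:

```latex
\begin{align*}
f(10 + \delta) &\approx f(10) + f'(10)\,\delta
                = 1.2\ \mathrm{kg} + 0.010\ \mathrm{kg/s} \times \delta \\
f(15) &\approx 1.2\ \mathrm{kg} + 0.010\ \mathrm{kg/s} \times 5\ \mathrm{s}
       = 1.25\ \mathrm{kg}
\end{align*}
```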

The first way relies on mathematical transformations of the equations
implemented in the function f. It implies mathematical analysis and new
development, which is very error-prone (computing the differential of a
function is much more complex than computing the function itself). It is
only feasible if you know the equations or have the source code of the
function. This approach may be used with symbolic computation packages like
Mathematica or Axiom, where you develop your equations using these programs
and have them generate the implementation for you. However, the produced
code is available only for some languages (typically Fortran and C), it is
awful and cannot be maintained (it is not intended to be), and it needs to
be integrated with the rest of the application, which is already a difficult
task.

The second way uses numerical finite-difference schemes. These
algorithms basically compute several values by changing the start time by a
small known amount and comparing the various results. This implies choosing
the step, which may be difficult if you don't already know the behavior of
the function (should I use one microsecond or one century here? It depends
on the problem). This approach is also either quite computation-intensive,
if you use high-order schemes with 4, 6 or 8 points, or inaccurate if you
don't. It also cannot be used too close to the boundaries of the function's
domain, which are often the locations we really want to explore.
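The step-size problem can be seen with a minimal sketch of a two-point centered difference. The class name `FiniteDifferenceDemo` and the choice of sin as a stand-in for the "lengthy equations" are assumptions made for the example, picked because the exact derivative (cos) is known.

```java
// Minimal sketch of a two-point centered finite-difference scheme.
class FiniteDifferenceDemo {
    // Stand-in for the real function; its exact derivative is cos(t).
    static double f(double t) {
        return Math.sin(t);
    }

    // Centered difference: (f(t + h) - f(t - h)) / (2 h).
    static double centered(double t, double h) {
        return (f(t + h) - f(t - h)) / (2.0 * h);
    }

    public static void main(String[] args) {
        double t = 1.0;
        double exact = Math.cos(t);
        // Print the error for steps ranging from far too large to far too small.
        for (double h : new double[] {1.0, 1.0e-3, 1.0e-6, 1.0e-13}) {
            double error = Math.abs(centered(t, h) - exact);
            System.out.printf("h = %8.1e  error = %.3e%n", h, error);
        }
    }
}
```

A step that is too large gives a large truncation error, while a step that is too small tends to lose accuracy to floating-point cancellation in the subtraction, so the best step depends on the function and on the evaluation point.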

[nabla] provides a third way to get this result. It analyses the bytecode
of the function at run time, performs the exact symbolic mathematical
transforms, and generates a new class implementing the differentiated
function. There is still a computation cost, but it is the same as you would
get from manually differentiated code, plus a one-time bytecode
differentiation overhead (but we can also cache results).
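To give an idea of what a differentiated class computes, here is a hand-written sketch for a very simple function. It is not output produced by [nabla], and it does not show [nabla]'s API, which this message does not describe. The class names `Original` and `Differentiated` and the function t * sin(t) are assumptions made for the example.

```java
// Original function: f(t) = t * sin(t).
class Original {
    static double f(double t) {
        double s = Math.sin(t);
        return t * s;
    }
}

// Hand-written counterpart: every operation of the original is mirrored
// by the exact derivative of that operation.
class Differentiated {
    // Returns { f(t), f'(t) }.
    static double[] f(double t) {
        double s = Math.sin(t);
        double ds = Math.cos(t);           // derivative of sin(t)
        double value = t * s;
        double derivative = s + t * ds;    // product rule, with dt/dt = 1
        return new double[] { value, derivative };
    }
}
```

The derivative costs a few extra operations per original operation, with no step size to choose.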

This approach has the following benefits:
- differentiation is exact
- there is no problem-dependent step size to handle
- derivatives can be computed even at domain boundaries
- there is no special handling of source
(no symbolic package with its own language, no source code
 generation, no integration with the rest of the application)
- one writes and maintains only the basic equation and gets the
derivative for free
- it is effective even when source code is not available (but there
are licensing issues in this case of course, since what I do
automatically is really ... derived work)

The only drawback I see is that functions calling native code cannot be
handled. In this case, we have a fallback available with finite-difference
schemes.

The existing implementation is not yet ready for production. A lot of work
has been done, but there are many missing features. [nabla] can handle
simple functions from end to end (i.e. up to creating an instance of the
differentiated class that is fully functional). Making this code available
in the sandbox would let people look at it, comment on it, participate if
they are interested, and help make it go live.

What do you think about it?
Luc


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



