El 25/02/2011 12:08, Marco van de Voort escribió:

Have a look at FPC package symbolic. It sounds like roughly the same kind of
soup.  (parsing, differentiating, fast eval)

Yes, roughly. Because if I have read it correctly, the package evaluates things ultimately by calls to the system functions like sin, arctan, *, etc. This is very different than building on the fly the minimal object code for evaluating the function, and much slower.

Also, for derivatives, Symbolic builds first the expression that is the derivative of the initial expression. When the derivative is called, it evaluates the new function.

Automatic differentiation does not compute the derivative function, but at each step of the evaluation of the RPN it evaluates the function and the first (or the function, the first and the second) derivative of that step. For instance, if the RPN step is "cosine", the Symbolic approach is to calculate first -sine (and for the second derivative, to calculate then _again_ the cosine, etc.). But in my approach, instead of computing the cosine, it is computed the sincos pair by using the x87 instruction of AMD or Intel. The two values are used directly for computing the first and second derivatives: thus at most _one_ call to an expensive FPU instruction.

In my approach, if a call of a RPN step costs n clock cycles, then the call of the first derivative of that function takes a cost c*n clock cycles, where c depends only on the function called (that is, whether it is * or arctan, etc). And the same (with another greater coefficient) with the second derivative. Suppose that c = 3 for the operator * and one has to compute the value of x*y*z*w, where these symbols denote variables or expressions on which the formula depends. Then Symbolic will compute

x'*y*z*w + x*y'*z*w + x*y*z'*w + x*y*z*w',

that is roughly four times the computation of the function. Automatic differentiation will take only three (= c) times more (this should be taken only as a sloppy explanation). Thus, the more complex the formula the better relative performance.

On the other hand, porting my library to other architectures will be possible if they have an FPU based on a stack of perhaps more than six registers, a collection of instructions that move numbers between them and between them and RAM, and a minimal set of instructions for each of the basic mathematical functions of Pascal: +,-,*,/,sqr, sqrt, sin, cos, tan, arctan, abs, exp, log. But I have no idea about those architectures.


--
montesin at uv dot es


_______________________________________________
fpc-pascal maillist  -  fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-pascal

Reply via email to