Re: [MATH] What is the derivative of 0^x

Luc Maisonobe Sun, 25 Aug 2013 13:21:42 -0700

Le 24/08/2013 11:24, Luc Maisonobe a écrit :
> Le 23/08/2013 19:20, Ajo Fod a écrit :
>> Hello,
> 
> Hi Ajo,
> 
>>
>> This shows one way of interpreting the derivative for strictly +ve numbers.
>>
>>     public static void main(final String[] args) {
>>         final double x = 1d;
>>         DerivativeStructure dsA = new DerivativeStructure(1, 1, 0, x);
>>         System.out.println("Derivative of |a|^x wrt x");
>>         for (int p = 10; p < 21; p++) {
>>             double a;
>>             if (p < 20) {
>>                 a = 1d / Math.pow(2d, p);
>>             } else {
>>                 a = 0d;
>>             }
>>             final DerivativeStructure a_ds = new DerivativeStructure(1, 1,
>> a);
>>             final DerivativeStructure out = a_ds.pow(dsA);
>>             final double calc = (Math.pow(a, x + EPS) - Math.pow(a, x)) /
>> EPS;
>>             System.out.format("Derivative@%f=%f  %f\n", a, calc,
>> out.getPartialDerivative(new int[]{1}));
>>         }
>>     }
>>
>> At this point I"m explicitly substituting the rule that derivative(|a|^x) =
>> 0 for |a|=0.
> 
> Yes, but this fails for x = 0, as the limit of the finite difference is
> -infinity and not 0.
> 
> You can build your own function which explicitly assumes a is constant
> and takes care of special values as follows:
> 
>  public static DerivativeStructure aToX(final double a,
>                                         final DerivativeStructure x) {
>      final double lnA = (a == 0 && x.getValue() == 0) ?
>                   Double.NEGATIVE_INFINITY :
>                   FastMath.log(a);
>      final double[] function = new double[1 + x.getOrder()];
>      function[0] = FastMath.pow(a, x.getValue());
>      for (int i = 1; i < function.length; ++i) {
>          function[i] = lnA * function[i - 1];
>      }
>      return x.compose(function);
>  }
> 
> This will work and provides derivatives to any order for almost any
> values of a and x, including a=0, x=1 as in your exemple, but also
> slightly better for a=0, x=0. However, it still has an important
> drawback: it won't compute the n-th order derivative correctly for a=0,
> x=0 and n > 1. It will provide NaN for these higher order derivatives
> instead of +/-infinity according to parity of n.


I have added a similar function to the DerivativeStructure class (with
some errors above corrected). The main interesting property of this
function is that it is more accurate that converting a to a
DerivativeStructure and using the general x^y function. It does its best
to handle the special case, but as written above, this does NOT work for
general combination (i.e. more than one variable or more than one
order). As soon as there is a combination, the derivative will involve
something like df/dx * dg/dy and as infinities and zeros are everywheren
NaN appears immediately for these partial derivatives. This cannot be
avoided.

If you stay away from the singularity, the function behaves correctly.

best regards,
Luc

> 
> This is a known problem that we already encountered when dealing with
> rootN. Here is an extract of a comment in the test case
> testRootNSingularity, where similar NaN appears instead of +/- infinity.
> The dsZero instance in the comment is simple the x parameter of the
> function, as a derivativeStructure with value 0.0 and depending on
> itself (dsZero = new DerivativeStructure(1, maxOrder, 0, 0.0)):
> 
> 
> // the following checks shows a LIMITATION of the current implementation
> // we have no way to tell dsZero is a pure linear variable x = 0
> // we only say: "dsZero is a structure with value = 0.0,
> // first derivative = 1.0, second and higher derivatives = 0.0".
> // Function composition rule for second derivatives is:
> // d2[f(g(x))]/dx2 = f''(g(x)) * [g'(x)]^2 + f'(g(x)) * g''(x)
> // when function f is the nth root and x = 0 we have:
> // f(0) = 0, f'(0) = +infinity, f''(0) = -infinity (and higher
> // derivatives keep switching between +infinity and -infinity)
> // so given that in our case dsZero represents g, we have g(x) = 0,
> // g'(x) = 1 and g''(x) = 0
> // applying the composition rules gives:
> // d2[f(g(x))]/dx2 = f''(g(x)) * [g'(x)]^2 + f'(g(x)) * g''(x)
> //                 = -infinity * 1^2 + +infinity * 0
> //                 = -infinity + NaN
> //                 = NaN
> // if we knew dsZero is really the x variable and not the identity
> // function applied to x, we would not have computed f'(g(x)) * g''(x)
> // and we would have found that the result was -infinity and not NaN
> 
> Hope this helps
> Luc
> 
>>
>> Thanks,
>> Ajo.
>>
>>
>>
>> On Fri, Aug 23, 2013 at 9:39 AM, Luc Maisonobe <luc.maison...@free.fr>wrote:
>>
>>> Hi Ajo,
>>>
>>> Le 23/08/2013 17:48, Ajo Fod a écrit :
>>>> Try this and I'm happy to explain if necessary:
>>>>
>>>> public class Derivative {
>>>>
>>>>     public static void main(final String[] args) {
>>>>         DerivativeStructure dsA = new DerivativeStructure(1, 1, 0, 1d);
>>>>         System.out.println("Derivative of constant^x wrt x");
>>>>         for (int a = -3; a < 3; a++) {
>>>
>>> We have chosen the classical definition which implies c^x is not defined
>>> for real r and negative c.
>>>
>>> Our implementation is based on the decomposition c^r = exp(r * ln(c)),
>>> so the NaN comes from the logarithm when c <= 0.
>>>
>>> Noe also that as explained in the documentation here:
>>> <
>>> http://commons.apache.org/proper/commons-math/userguide/analysis.html#a4.7_Differentiation
>>>> ,
>>> there are no concepts of "constants" and "variables" in this framework,
>>> so we cannot draw a line between c^r as seen as a univariate function of
>>> r, or as a univariate function of c, or as a bivariate function of c and
>>> r, or even as a pentavariate function of p1, p2, p3, p4, p5 with both c
>>> and r being computed elsewhere from p1...p5. So we don't make special
>>> cases for the case c = 0 for example.
>>>
>>> Does this explanation make sense to you?
>>>
>>> best regards,
>>> Luc
>>>
>>>
>>>>             final DerivativeStructure a_ds = new DerivativeStructure(1,
>>> 1,
>>>> a);
>>>>             final DerivativeStructure out = a_ds.pow(dsA);
>>>>             System.out.format("Derivative@%d=%f\n", a,
>>>> out.getPartialDerivative(new int[]{1}));
>>>>         }
>>>>     }
>>>> }
>>>>
>>>>
>>>>
>>>> On Fri, Aug 23, 2013 at 7:59 AM, Gilles <gil...@harfang.homelinux.org
>>>> wrote:
>>>>
>>>>> On Fri, 23 Aug 2013 07:17:35 -0700, Ajo Fod wrote:
>>>>>
>>>>>> Seems like the DerivativeCompiler returns NaN.
>>>>>>
>>>>>> IMHO it should return 0.
>>>>>>
>>>>>
>>>>> What should be 0?  And Why?
>>>>>
>>>>>
>>>>>
>>>>>> Is this worthy of an issue?
>>>>>>
>>>>>
>>>>> As is, no.
>>>>>
>>>>> Gilles
>>>>>
>>>>>
>>>>>> Thanks,
>>>>>> -Ajo
>>>>>>
>>>>>
>>>>>
>>>>>
>>> ------------------------------**------------------------------**---------
>>>>> To unsubscribe, e-mail: dev-unsubscribe@commons.**apache.org<
>>> dev-unsubscr...@commons.apache.org>
>>>>> For additional commands, e-mail: dev-h...@commons.apache.org
>>>>>
>>>>>
>>>>
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
>>> For additional commands, e-mail: dev-h...@commons.apache.org
>>>
>>>
>>
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
> 
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org

Re: [MATH] What is the derivative of 0^x

Reply via email to