Re: [MATH] What is the derivative of 0^x

Luc Maisonobe Tue, 27 Aug 2013 13:50:35 -0700

Hi Ajo,

On 27/08/2013 16:44, Ajo Fod wrote:
> Thanks for the constant structure.
> 
> No. The limit value when x->0+ is 1, not 0.
> 
> I agree with this. I was just going for the derivatives = 0.
> 
> 
>> The nth derivative of a^x can be computed analytically as ln(a)^n a^x,
>> so the initial slope at x=0 is simply ln(a), positive for a > 1, zero
>> for a = 1, negative for 0 < a < 1 with a limit at -infinity when a -> 0+.
>>
> 
> Let's think about this for a sec:
> Derivative of |a|^x wrt x at x=2.0 for various values of a
> Derivative@0.031250=-0.003384
> Derivative@0.015625=-0.001015
> Derivative@0.007813=-0.000296
> Derivative@0.003906=-0.000085
> Derivative@0.001953=-0.000024
> ... tends to 0


yes, because 2.0 > 0.

> 
> Derivative of |a|^x wrt x at x=0.5 for various values of a
> Derivative@0.031250=-0.612555
> Derivative@0.007813=-0.428759
> Derivative@0.001953=-0.275612
> Derivative@0.000488=-0.168418
> Derivative@0.000122=-0.099513
> Derivative@0.000031=-0.057407
> Derivative@0.000008=-0.032528
> Derivative@0.000002=-0.018176
> ... tends to 0 when a->0

yes because 0.5 > 0.
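
As a cross-check of the figures quoted above, here is a minimal sketch that
compares the forward finite difference from the quoted program with the
analytic expression ln(a) a^x at a fixed x > 0. The class and method names are
illustrative only and are not part of Commons Math; the step 1.0e-4 is the EPS
value from the quoted code.

```java
/** Illustrative sketch only: names are hypothetical, not part of Commons Math. */
public class FixedXDerivative {

    /** Analytic derivative of a^x with respect to x, valid for a > 0. */
    public static double analytic(final double a, final double x) {
        return Math.log(a) * Math.pow(a, x);
    }

    /** Forward finite difference with step eps, as in the quoted program. */
    public static double finiteDifference(final double a, final double x, final double eps) {
        return (Math.pow(a, x + eps) - Math.pow(a, x)) / eps;
    }

    public static void main(final String[] args) {
        final double x = 0.5; // any fixed x > 0 shows the same trend
        for (int p = 5; p < 20; p += 2) {
            final double a = Math.pow(2.0, -p);
            System.out.format("a=%e  finite difference=%f  ln(a)*a^x=%f%n",
                              a, finiteDifference(a, x, 1.0e-4), analytic(a, x));
        }
    }
}
```

Both columns shrink towards 0 as a decreases, because for x > 0 the factor a^x
goes to 0 faster than ln(a) diverges.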

> 
> The code I used for the print outs is:
>     static final double EPS = 0.0001d;
> 
>     public static void main(final String[] args) {
>         final double x = 0.5d;
>         int from = 5;
>         int to = 20;
>         System.out.println("Derivative of |a|^x wrt x at x=" + x);
>         for (int p = from; p < to; p+=2) {
>             double a = Math.pow(2d, -p);
>             final double calc = (Math.pow(a, x + EPS) - Math.pow(a, x)) / EPS;
>             System.out.format("Derivative@%f=%f \n", a, calc);
>         }
>     }
> 
> As for the x=0 case:
> 1^0 = 1
> 0.5^0 = 1
> 0.0001^0 = 1
> 0^0 is technically undefined, but  1 is a good definition:
> http://www.math.hmc.edu/funfacts/ffiles/10005.3-5.shtml

Yes.

> ... so, a good value for the derivative d(a^x)/dx in the limit x->0 and
> a->0 is 0

I don't agree. What you wrote in the lines above is another way to say
what I wrote in my previous message: the value at x=0 is always y=1, and
the value for x > 0 tends to 0 as a->0+.

So the function always starts at 1 and dives more and more steeply as a
becomes smaller, and the derivative at 0 becomes more and more negative,
up to -infinity, *not* 0.

The function is ill-behaved and the fact the derivative is infinite is
consistent with this ill-behaviour.

The definition of the (right-hand) derivative is:

 f'(x) = lim (f(x+h) - f(x))/h when h -> 0+

When f(x) = 0^x and assuming 0^0 = 1 as you have agreed above, this gives:

 f'(0) = lim (0^(0+h) - 0^0)/h = lim (0 - 1)/h = -infinity

which is exactly the same result as computing for a non-zero a and then
reducing it: d(a^x)/dx = ln(a) a^x = ln(a) when x=0, which diverges to
-infinity when a converges to 0.
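
The difference quotient above can be evaluated directly with the standard
Math.pow. The following is only a sketch (the class and method names are
hypothetical); it relies on Math.pow(0, 0) returning 1 and Math.pow(0, h)
returning 0 for h > 0.

```java
/** Illustrative sketch only: names are hypothetical, not part of Commons Math. */
public class ZeroPowSlope {

    /** One-sided difference quotient of x -> a^x at x = 0, with step h > 0. */
    public static double quotientAtZero(final double a, final double h) {
        return (Math.pow(a, h) - Math.pow(a, 0.0)) / h;
    }

    public static void main(final String[] args) {
        // With a = 0 the numerator is 0 - 1, so the quotient is -1/h
        // and it grows without bound (negatively) as h shrinks.
        for (double h = 1.0e-1; h > 1.0e-9; h /= 100.0) {
            System.out.format("h=%e  quotient=%e%n", h, quotientAtZero(0.0, h));
        }
    }
}
```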

> 
> 
> As mentioned earlier, I think the cause for this is that log|a| -> infinity
> slower than |a|^x -> 0 as |a|->0 .

But a^x does *not* converge to 0 for x = 0! a^0 is always 1 (rigorously)
regardless of the value of a as long as it is not 0, so when we reduce a
we can also take the limit to be 1 when a -> 0. This convention is well
accepted. It is implemented in the Java standard Math.pow function, and
we followed this trend. This is the reason why the function becomes
steeper and steeper as a becomes smaller. In the end, it is a
discontinuous function (and hence should not be differentiable, or it is
differentiable only if we use extended real numbers with infinity added).

This is the heart of the ill-behaviour of 0^0. We want to compute it as
a limit value for a^b when both parameters converge to 0, but the result
depends on the order of the limits. If we first keep a fixed and let b
converge to 0, and later reduce a down to zero, we get 1. If we do the
opposite, first keeping b fixed and reducing a down to zero, and later
letting b converge to 0 (your approach), we get 0.
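
A small sketch of the two orders of evaluation, using only the standard
Math.pow (the class and method names are hypothetical and chosen for this
illustration):

```java
/** Illustrative sketch only: names are hypothetical, not part of Commons Math. */
public class IteratedLimits {

    /** Inner limit b -> 0 taken first: a^0 for a fixed base a > 0. */
    public static double exponentFirst(final double a) {
        return Math.pow(a, 0.0);
    }

    /** Inner limit a -> 0+ taken first: 0^b for a fixed exponent b > 0. */
    public static double baseFirst(final double b) {
        return Math.pow(0.0, b);
    }

    public static void main(final String[] args) {
        for (int k = 1; k <= 300; k += 100) {
            final double tiny = Math.pow(10.0, -k);
            // Same tiny number used once as the base and once as the exponent.
            System.out.format("tiny=%e  tiny^0=%f  0^tiny=%f%n",
                              tiny, exponentFirst(tiny), baseFirst(tiny));
        }
    }
}
```

However small the remaining parameter is made, the first column stays at 1 and
the second stays at 0, so the two iterated limits cannot agree.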

Let's put it another way:
If we consider that the derivative f'(0) should be 0, then the value f(0)
should also be considered equal to zero. This would mean that as soon as
we get a tiny non-zero a (say the smallest number that can be represented
as a double), f(0) would jump from 0 to 1 instantly, and f'(0) would jump
from 0 to -infinity instantly. So we would have at a = 0 an initial zero
derivative, then a jump to a very negative derivative as a leaves 0, then
the derivative would become less and less negative as a increases up to
1, at a=1 the derivative would again be 0, then the derivative would
continue to increase and become positive as a grows larger than 1 (all
these derivatives are computed at x=0, and as written previously, they
are simply equal to log(a)).
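
To give a sense of scale for this jump, here is a minimal sketch that evaluates
the slope at x=0, i.e. log(a), for a few bases including the smallest positive
double. The class and method names are illustrative only.

```java
/** Illustrative sketch only: names are hypothetical, not part of Commons Math. */
public class SlopeAtOrigin {

    /** Derivative of x -> a^x at x = 0, which is ln(a) for a > 0. */
    public static double slope(final double a) {
        return Math.log(a);
    }

    public static void main(final String[] args) {
        // Double.MIN_VALUE is 2^-1074, so its logarithm is roughly -744.44:
        // very negative, but finite for every representable a > 0.
        final double[] bases = { Double.MIN_VALUE, 1.0e-100, 0.5, 1.0, 2.0 };
        for (final double a : bases) {
            System.out.format("a=%e  slope at x=0: %f%n", a, slope(a));
        }
    }
}
```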

To summarize, the two choices are:
 1) - first considering a fixed a, strictly positive,
    - then looking globally at the function a^x for all values x>=0,
    - then reducing a, noting that all functions start at the same
      point x=0, y=1 and the derivatives become more and more negative
      as the function becomes more and more ill-behaved
 2) - first considering a fixed x, strictly positive,
    - then reducing a and noting that the limit value is 0 for every such x,
    - then building a function by packing all the x>0, which is very
      smooth as it is identically 0 for all x>0
    - finally adding the limit value at x=0, which in this case would
      be 0 (and the derivative would also be 0).

It seems well accepted that the value of 0^0 should be set to 1,
and as a consequence the corresponding derivative with respect to x
should be set to -infinity.

I fully agree it is not a perfect solution, it is an arbitrary choice.
However, this choice is consistent with all the implementations of the
pow function I have seen (i.e. 0^0 set to 1 instead of 0).

Your approach is not wrong, it is as valid as the other one. It is
simply not the common choice.

I would say an even better choice would have been to say 0^0 *is not*
defined and even the value should be set to NaN (not even speaking of
the derivative).
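
For comparison, this is roughly what such a stricter convention could look
like. It is a purely hypothetical wrapper written for this discussion, not an
existing or proposed Commons Math method; it only overrides the 0^0 case and
delegates everything else to Math.pow.

```java
/** Hypothetical wrapper, for illustration only: treats 0^0 as undefined. */
public class StrictPow {

    /** Same as Math.pow, except that 0^0 yields NaN instead of 1. */
    public static double pow(final double a, final double x) {
        if (a == 0.0 && x == 0.0) {
            return Double.NaN; // assumption: "undefined" is signalled by NaN
        }
        return Math.pow(a, x);
    }

    public static void main(final String[] args) {
        System.out.println("Math.pow(0, 0)  = " + Math.pow(0.0, 0.0));
        System.out.println("StrictPow(0, 0) = " + pow(0.0, 0.0));
        System.out.println("StrictPow(0, 1) = " + pow(0.0, 1.0));
    }
}
```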

Does this seem acceptable to you?

best regards,
Luc

> 
> Cheers,
> Ajo.
> 
> 
>> The limit curve corresponding to a = 0 is therefore a singular function
>> with f(0) = 1 and f(x) = 0 for all x > 0. The fact f(0) = 1 and not 0 is
>> consistent with the derivative being negative infinity, as by definition
>> the derivative is the limit of [f(0+h) - f(0)] / h when h->0+, as the
>> finite difference is -1/h.
>>
>>>                 }
>>>             }else{
>>>                 for (int i = 0; i < function.length; ++i) {
>>>                     function[i] = Double.NaN;
>>>                 }
>>
>> This alternative case is a good improvement, thanks for it. I forgot to
>> handle negative cases properly. I have therefore changed the code
>> (committed as r1517788) with this improvement, together with several
>> test cases.
>>
>>>             }
>>>         } else {
>>>
>>>
>>> in place of :
>>>
>>>         if (a == 0) {
>>>             if (operand[operandOffset] == 0) {
>>>                 function[0] = 1;
>>>                 double infinity = Double.POSITIVE_INFINITY;
>>>                 for (int i = 1; i < function.length; ++i) {
>>>                     infinity = -infinity;
>>>                     function[i] = infinity;
>>>                 }
>>>             }
>>>         } else {
>>>
>>>
>>> PS: I think you made a change to DSCompiler.pow too. If so, what happens
>>> when a=0 & x!=0  in that function?
>>
>> No, I didn't change the other signatures of the pow function. So the
>> value should be OK (i.e. 1) but all derivatives, including the first
>> one, should be NaN. What the new function brings is a correct negative
>> infinity first derivative at singularity point, better accuracy for
>> non-singular points, and possibly faster computation.
>>
>> best regards,
>> Luc
>>
>>>
>>>
>>> On Mon, Aug 26, 2013 at 12:38 AM, Luc Maisonobe <l...@spaceroots.org>
>> wrote:
>>>
>>>>
>>>>
>>>>
>>>> Ajo Fod <ajo....@gmail.com> wrote:
>>>>> Are you saying you patched the code? Can you provide the link?
>>>>
>>>> I committed it in the development version. You just have to update your
>>>> checked out copy from either the official
>>>>  Apache subversion repository or the git mirror we talked about in a
>>>> previous thread.
>>>>
>>>> The new method is a static one called pow, taking a and x as arguments
>>>> and returning a^x. It is not to be confused with the non-static methods
>>>> that take only the power as argument (either int, double or
>>>> DerivativeStructure) and use the instance as the base to apply the
>>>> power on.
>>>>
>>>> Best regards,
>>>> Luc
>>>>
>>>>>
>>>>> -Ajo
>>>>>
>>>>>
>>>>> On Sun, Aug 25, 2013 at 1:20 PM, Luc Maisonobe <l...@spaceroots.org>
>>>>> wrote:
>>>>>
>>>>>> On 24/08/2013 11:24, Luc Maisonobe wrote:
>>>>>>> On 23/08/2013 19:20, Ajo Fod wrote:
>>>>>>>> Hello,
>>>>>>>
>>>>>>> Hi Ajo,
>>>>>>>
>>>>>>>>
>>>>>>>> This shows one way of interpreting the derivative for strictly
>>>>>>>> positive numbers.
>>>>>>>>
>>>>>>>>     public static void main(final String[] args) {
>>>>>>>>         final double x = 1d;
>>>>>>>>         DerivativeStructure dsA = new DerivativeStructure(1, 1, 0, x);
>>>>>>>>         System.out.println("Derivative of |a|^x wrt x");
>>>>>>>>         for (int p = 10; p < 21; p++) {
>>>>>>>>             double a;
>>>>>>>>             if (p < 20) {
>>>>>>>>                 a = 1d / Math.pow(2d, p);
>>>>>>>>             } else {
>>>>>>>>                 a = 0d;
>>>>>>>>             }
>>>>>>>>             final DerivativeStructure a_ds = new DerivativeStructure(1, 1, a);
>>>>>>>>             final DerivativeStructure out = a_ds.pow(dsA);
>>>>>>>>             final double calc = (Math.pow(a, x + EPS) - Math.pow(a, x)) / EPS;
>>>>>>>>             System.out.format("Derivative@%f=%f  %f\n", a, calc,
>>>>>>>>                               out.getPartialDerivative(new int[]{1}));
>>>>>>>>         }
>>>>>>>>     }
>>>>>>>>
>>>>>>>> At this point I'm explicitly substituting the rule that
>>>>>>>> derivative(|a|^x) = 0 for |a|=0.
>>>>>>>
>>>>>>> Yes, but this fails for x = 0, as the limit of the finite difference
>>>>>>> is -infinity and not 0.
>>>>>>>
>>>>>>> You can build your own function which explicitly assumes a is constant
>>>>>>> and takes care of special values as follows:
>>>>>>>
>>>>>>>  public static DerivativeStructure aToX(final double a,
>>>>>>>                                         final DerivativeStructure x) {
>>>>>>>      final double lnA = (a == 0 && x.getValue() == 0) ?
>>>>>>>                   Double.NEGATIVE_INFINITY :
>>>>>>>                   FastMath.log(a);
>>>>>>>      final double[] function = new double[1 + x.getOrder()];
>>>>>>>      function[0] = FastMath.pow(a, x.getValue());
>>>>>>>      for (int i = 1; i < function.length; ++i) {
>>>>>>>          function[i] = lnA * function[i - 1];
>>>>>>>      }
>>>>>>>      return x.compose(function);
>>>>>>>  }
>>>>>>>
>>>>>>> This will work and provide derivatives to any order for almost any
>>>>>>> values of a and x, including a=0, x=1 as in your example, and it is
>>>>>>> also slightly better for a=0, x=0. However, it still has an important
>>>>>>> drawback: it won't compute the n-th order derivative correctly for
>>>>>>> a=0, x=0 and n > 1. It will provide NaN for these higher order
>>>>>>> derivatives instead of +/-infinity according to the parity of n.
>>>>>>
>>>>>> I have added a similar function to the DerivativeStructure class (with
>>>>>> some errors above corrected). The main interesting property of this
>>>>>> function is that it is more accurate than converting a to a
>>>>>> DerivativeStructure and using the general x^y function. It does its
>>>>>> best to handle the special case, but as written above, this does NOT
>>>>>> work for general combinations (i.e. more than one variable or more than
>>>>>> one order). As soon as there is a combination, the derivative will
>>>>>> involve something like df/dx * dg/dy, and as infinities and zeros are
>>>>>> everywhere, NaN appears immediately for these partial derivatives.
>>>>>> This cannot be avoided.
>>>>>>
>>>>>> If you stay away from the singularity, the function behaves correctly.
>>>>>>
>>>>>> best regards,
>>>>>> Luc
>>>>>>
>>>>>>>
>>>>>>> This is a known problem that we already encountered when dealing with
>>>>>>> rootN. Here is an extract of a comment in the test case
>>>>>>> testRootNSingularity, where similar NaN appears instead of
>>>>>>> +/-infinity. The dsZero instance in the comment is simply the x
>>>>>>> parameter of the function, as a DerivativeStructure with value 0.0 and
>>>>>>> depending on itself (dsZero = new DerivativeStructure(1, maxOrder, 0, 0.0)):
>>>>>>>
>>>>>>>
>>>>>>> // the following checks show a LIMITATION of the current implementation
>>>>>>> // we have no way to tell dsZero is a pure linear variable x = 0
>>>>>>> // we only say: "dsZero is a structure with value = 0.0,
>>>>>>> // first derivative = 1.0, second and higher derivatives = 0.0".
>>>>>>> // Function composition rule for second derivatives is:
>>>>>>> // d2[f(g(x))]/dx2 = f''(g(x)) * [g'(x)]^2 + f'(g(x)) * g''(x)
>>>>>>> // when function f is the nth root and x = 0 we have:
>>>>>>> // f(0) = 0, f'(0) = +infinity, f''(0) = -infinity (and higher
>>>>>>> // derivatives keep switching between +infinity and -infinity)
>>>>>>> // so given that in our case dsZero represents g, we have g(x) = 0,
>>>>>>> // g'(x) = 1 and g''(x) = 0
>>>>>>> // applying the composition rules gives:
>>>>>>> // d2[f(g(x))]/dx2 = f''(g(x)) * [g'(x)]^2 + f'(g(x)) * g''(x)
>>>>>>> //                 = -infinity * 1^2 + +infinity * 0
>>>>>>> //                 = -infinity + NaN
>>>>>>> //                 = NaN
>>>>>>> // if we knew dsZero is really the x variable and not the identity
>>>>>>> // function applied to x, we would not have computed f'(g(x)) * g''(x)
>>>>>>> // and we would have found that the result was -infinity and not NaN
>>>>>>>
>>>>>>> Hope this helps
>>>>>>> Luc
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Ajo.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Aug 23, 2013 at 9:39 AM, Luc Maisonobe <luc.maison...@free.fr>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Ajo,
>>>>>>>>>
>>>>>>>>> On 23/08/2013 17:48, Ajo Fod wrote:
>>>>>>>>>> Try this and I'm happy to explain if necessary:
>>>>>>>>>>
>>>>>>>>>> public class Derivative {
>>>>>>>>>>
>>>>>>>>>>     public static void main(final String[] args) {
>>>>>>>>>>         DerivativeStructure dsA = new DerivativeStructure(1, 1, 0, 1d);
>>>>>>>>>>         System.out.println("Derivative of constant^x wrt x");
>>>>>>>>>>         for (int a = -3; a < 3; a++) {
>>>>>>>>>
>>>>>>>>> We have chosen the classical definition which implies c^r is not
>>>>>>>>> defined for real r and negative c.
>>>>>>>>>
>>>>>>>>> Our implementation is based on the decomposition c^r = exp(r * ln(c)),
>>>>>>>>> so the NaN comes from the logarithm when c <= 0.
>>>>>>>>>
>>>>>>>>> Note also that as explained in the documentation here:
>>>>>>>>> <http://commons.apache.org/proper/commons-math/userguide/analysis.html#a4.7_Differentiation>,
>>>>>>>>> there are no concepts of "constants" and "variables" in this
>>>>>>>>> framework, so we cannot draw a line between c^r as seen as a
>>>>>>>>> univariate function of r, or as a univariate function of c, or as a
>>>>>>>>> bivariate function of c and r, or even as a pentavariate function of
>>>>>>>>> p1, p2, p3, p4, p5 with both c and r being computed elsewhere from
>>>>>>>>> p1...p5. So we don't make special cases for the case c = 0 for
>>>>>>>>> example.
>>>>>>>>>
>>>>>>>>> Does this explanation make sense to you?
>>>>>>>>>
>>>>>>>>> best regards,
>>>>>>>>> Luc
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>             final DerivativeStructure a_ds = new DerivativeStructure(1, 1, a);
>>>>>>>>>>             final DerivativeStructure out = a_ds.pow(dsA);
>>>>>>>>>>             System.out.format("Derivative@%d=%f\n", a,
>>>>>>>>>>                               out.getPartialDerivative(new int[]{1}));
>>>>>>>>>>         }
>>>>>>>>>>     }
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Fri, Aug 23, 2013 at 7:59 AM, Gilles <gil...@harfang.homelinux.org>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> On Fri, 23 Aug 2013 07:17:35 -0700, Ajo Fod wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Seems like the DerivativeCompiler returns NaN.
>>>>>>>>>>>>
>>>>>>>>>>>> IMHO it should return 0.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> What should be 0?  And Why?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Is this worthy of an issue?
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> As is, no.
>>>>>>>>>>>
>>>>>>>>>>> Gilles
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> -Ajo
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>>
>>>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>>>>>>>>>>> For additional commands, e-mail: dev-h...@commons.apache.org