To define things precisely:

    y = f(a,x) = |a|^x

Can we agree that:

    df(a,x)/dx -> 0 when a -> 0 and x > 0    [NOTE: x > 0]
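
As a rough numerical illustration of the statement above (a sketch only: the class name AbsPowSlope and its two helper methods are made up for this message and are not part of Commons Math), the closed form ln|a| * |a|^x can be compared with a forward finite difference. For x > 0 both shrink towards 0 as a -> 0, while for x = 0 the closed form reduces to ln|a| and keeps growing in magnitude:

```java
public class AbsPowSlope {

    /** Closed-form d/dx |a|^x = ln|a| * |a|^x, meaningful for a != 0. */
    static double slope(final double a, final double x) {
        final double m = Math.abs(a);
        return Math.log(m) * Math.pow(m, x);
    }

    /** Forward finite difference in x with step eps. */
    static double finiteDiff(final double a, final double x, final double eps) {
        final double m = Math.abs(a);
        return (Math.pow(m, x + eps) - Math.pow(m, x)) / eps;
    }

    public static void main(final String[] args) {
        for (final double x : new double[] {2.0, 0.5, 0.0}) {
            System.out.println("Derivative of |a|^x wrt x at x=" + x);
            for (int p = 5; p <= 45; p += 10) {
                // a is negative on purpose: only |a| enters the function
                final double a = -Math.pow(2d, -p);
                System.out.printf("  a=%.3e  closed=%.6e  finite-diff=%.6e%n",
                                  a, slope(a, x), finiteDiff(a, x, 1e-4));
            }
        }
    }
}
```

The x = 0.0 rows are included deliberately, since that is the only case where the limit is not 0.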
If this is acceptable, we get the very useful property that df(a,x)/dx
is defined and continuous for all a, provided x > 0, because the function
definition uses the modulus of a. In optimization, with this patch at
|a| = 0, I can let an optimizer search the whole real line without
worrying about a = 0; otherwise I have to look out for a = 0 explicitly.
It seems unnecessary to add a constraint to make |a| > 0 when I already
have a constraint for x > 0.

Cheers,
Ajo.


On Tue, Aug 27, 2013 at 1:49 PM, Luc Maisonobe <[email protected]> wrote:

> Hi Ajo,
>
> On 27/08/2013 16:44, Ajo Fod wrote:
> > Thanks for the constant structure.
> >
> > No. The limit value when x->0+ is 1, not 0.
> >
> > I agree with this. I was just going for the derivatives = 0.
> >
> >
> >> The nth derivative of a^x can be computed analytically as ln(a)^n a^x,
> >> so the initial slope at x=0 is simply ln(a), positive for a > 1, zero
> >> for a = 1, negative for 0 < a < 1 with a limit at -infinity when
> >> a -> 0+.
> >>
> >
> > Let's think about this for a sec:
> > Derivative of |a|^x wrt x at x=2.0 for various values of a
> > Derivative@0.031250=-0.003384
> > Derivative@0.015625=-0.001015
> > Derivative@0.007813=-0.000296
> > Derivative@0.003906=-0.000085
> > Derivative@0.001953=-0.000024
> > ... tends to 0
>
> yes, because 2.0 > 0.
>
> >
> > Derivative of |a|^x wrt x at x=0.5 for various values of a
> > Derivative@0.031250=-0.612555
> > Derivative@0.007813=-0.428759
> > Derivative@0.001953=-0.275612
> > Derivative@0.000488=-0.168418
> > Derivative@0.000122=-0.099513
> > Derivative@0.000031=-0.057407
> > Derivative@0.000008=-0.032528
> > Derivative@0.000002=-0.018176
> > ... tends to 0 when a->0
>
> yes because 0.5 > 0.
>
> >
> > The code I used for the print outs is:
> >     static final double EPS = 0.0001d;
> >
> >     public static void main(final String[] args) {
> >         final double x = 0.5d;
> >         int from = 5;
> >         int to = 20;
> >         System.out.println("Derivative of |a|^x wrt x at x=" + x);
> >         for (int p = from; p < to; p+=2) {
> >             double a = Math.pow(2d, -p);
> >             final double calc = (Math.pow(a, x + EPS) - Math.pow(a, x)) / EPS;
> >             System.out.format("Derivative@%f=%f \n", a, calc);
> >         }
> >     }
> >
> > As for the x=0 case:
> > 1^0 = 1
> > 0.5^0 = 1
> > 0.0001^0 = 1
> > 0^0 is technically undefined, but 1 is a good definition:
> > http://www.math.hmc.edu/funfacts/ffiles/10005.3-5.shtml
>
> Yes.
>
> > ... so, a good value for the differential of da^x/dx limit x->0 and
> > a->0 = 0
>
> I don't agree. What you wrote in the lines above is another way to say
> what I wrote in my previous message: the value at x=0 is always y=1, and
> the value for x > 0 tends to 0 as a->0+.
>
> So the function always starts at 1 and dives more and more steeply as a
> becomes smaller, and the derivative at 0 becomes more and more negative,
> up to -infinity, *not* 0.
>
> The function is ill-behaved and the fact the derivative is infinite is
> consistent with this ill-behaviour.
>
> The definition of the derivative is:
>
>    f'(x) = lim (f(x+h) - f(x))/h when h -> 0+
>
> when f(x) = 0^x and assuming 0^0 = 1 as you have agreed above, this gives:
>
>    f'(0) = lim (0^(0+h) - 0^0)/h = lim (0 - 1)/h = -infinity
>
> which is exactly the same result as computing for a non-null a and then
> reducing it: d(a^x)/dx = ln(a) a^x = ln(a) when x=0, diverges to
> -infinity when a converges to 0.
>
> >
> > As mentioned earlier, I think the cause for this is that
> > log|a| -> infinity slower than |a|^x -> 0 as |a|->0 .
>
> But a^x does *not* converge to 0 for x = 0!
> a^0 is always 1 (rigorously) regardless of the value of a as long as it
> is not 0, and then when we change a we can also consider the limit is 1
> when a -> 0. This convention is well accepted. This convention is
> implemented in the Java standard Math.pow function, and we followed this
> trend. This is the reason why the function becomes more and more steep
> as a becomes smaller. At the end, it is a discontinuous function (and
> hence should not be differentiable, or it is differentiable only if we
> use extended real numbers with infinity added).
>
> This is the heart of the ill-behaviour of 0^0. We want to compute it as
> a limit value for a^b when both parameters converge to 0, but we get a
> different result if we first set a fixed and converge b to 0, and later
> reduce a down to zero (your approach), and when we do the opposite. In
> one case we get 0, in the other case we get 1.
>
> Let's put it another way:
> If we consider the derivative f'(0) should be 0, then the value f(0)
> should also be considered equal to zero. This would mean as soon as we
> get a tiny non-zero a (say the smallest number that can be represented
> as a double), then f(0) would jump from 0 to 1 instantly, and f'(0)
> would jump from 0 to -infinity instantly. So we would have at a = 0 an
> initial null derivative, then a jump to a very negative derivative as a
> leaves 0, then the derivative would become less and less negative as a
> increases up to 1, at a=1 the derivative would again be 0, then the
> derivative would continue to increase and become positive as a grows
> larger than 1 (all these derivatives are computed at x=0, and as written
> previously, they are simply equal to log(a)).
>
> To summarize, the two choices are:
>  1) - first considering a fixed a, strictly positive,
>     - then looking globally at the function a^x for all values x>=0,
>     - then reducing a, noting that all functions start at the same
>       point x=0, y=1 and the derivatives become more and more negative
>       as the function becomes more and more ill-behaved
>  2) - first considering a fixed x, strictly positive,
>     - then reducing a and identifying that the limit value is 0 for
>       every such x,
>     - then building a function by packing all the x>0, which is very
>       smooth as it is identically 0 for all x>0
>     - finally adding the limit value at x=0, which in this case would
>       be 0 (and the derivative would also be 0).
>
> It seems well accepted to consider the value of 0^0 should be set to 1,
> and as a consequence the corresponding derivative with respect to x
> should be set to -infinity.
>
> I fully agree it is not a perfect solution, it is an arbitrary choice.
> However, this choice is consistent with what all implementations of the
> pow function I have seen do (i.e. 0^0 set to 1 instead of 0).
>
> Your approach is not wrong, it is as valid as the other one. It is
> simply not the common choice.
>
> I would say an even better choice would have been to say 0^0 *is not*
> defined and even the value should be set to NaN (not even speaking of
> the derivative).
>
> Does this seem acceptable to you?
>
> best regards,
> Luc
>
> >
> > Cheers,
> > Ajo.
> >
> >
> >> The limit curve corresponding to a = 0 is therefore a singular function
> >> with f(0) = 1 and f(x) = 0 for all x > 0. The fact f(0) = 1 and not 0 is
> >> consistent with the derivative being negative infinity, as by definition
> >> the derivative is the limit of [f(0+h) - f(0)] / h when h->0+, as the
> >> finite difference is -1/h.
> >>
> >>>                 }
> >>>             }else{
> >>>                 for (int i = 0; i < function.length; ++i) {
> >>>                     function[i] = Double.NaN;
> >>>                 }
> >>
> >> This alternative case is a good improvement, thanks for it.
> >> I forgot to handle negative cases properly. I have therefore changed
> >> the code (committed as r1517788) with this improvement, together with
> >> several test cases.
> >>
> >>>             }
> >>>         } else {
> >>>
> >>>
> >>> in place of :
> >>>
> >>>         if (a == 0) {
> >>>             if (operand[operandOffset] == 0) {
> >>>                 function[0] = 1;
> >>>                 double infinity = Double.POSITIVE_INFINITY;
> >>>                 for (int i = 1; i < function.length; ++i) {
> >>>                     infinity = -infinity;
> >>>                     function[i] = infinity;
> >>>                 }
> >>>             }
> >>>         } else {
> >>>
> >>>
> >>> PS: I think you made a change to DSCompiler.pow too. If so, what
> >>> happens when a=0 & x!=0 in that function?
> >>
> >> No, I didn't change the other signatures of the pow function. So the
> >> value should be OK (i.e. 1) but all derivatives, including the first
> >> one, should be NaN. What the new function brings is a correct negative
> >> infinity first derivative at singularity point, better accuracy for
> >> non-singular points, and possibly faster computation.
> >>
> >> best regards,
> >> Luc
> >>
> >>>
> >>>
> >>> On Mon, Aug 26, 2013 at 12:38 AM, Luc Maisonobe <[email protected]> wrote:
> >>>
> >>>>
> >>>> Ajo Fod <[email protected]> wrote:
> >>>>> Are you saying you patched the code? Can you provide the link?
> >>>>
> >>>> I committed it in the development version. You just have to update
> >>>> your checked out copy from either the official Apache subversion
> >>>> repository or the git mirror we talked about in a previous thread.
> >>>>
> >>>> The new method is a static one called pow and taking a and x as
> >>>> arguments and returning a^x. Not to be confused with the non-static
> >>>> methods that take only the power as argument (either int, double or
> >>>> DerivativeStructure) and use the instance as the base to apply
> >>>> power on.
> >>>>
> >>>> Best regards,
> >>>> Luc
> >>>>
> >>>>>
> >>>>> -Ajo
> >>>>>
> >>>>>
> >>>>> On Sun, Aug 25, 2013 at 1:20 PM, Luc Maisonobe <[email protected]>
> >>>>> wrote:
> >>>>>
> >>>>>> On 24/08/2013 11:24, Luc Maisonobe wrote:
> >>>>>>> On 23/08/2013 19:20, Ajo Fod wrote:
> >>>>>>>> Hello,
> >>>>>>>
> >>>>>>> Hi Ajo,
> >>>>>>>
> >>>>>>>>
> >>>>>>>> This shows one way of interpreting the derivative for strictly
> >>>>>>>> +ve numbers.
> >>>>>>>>
> >>>>>>>>     public static void main(final String[] args) {
> >>>>>>>>         final double x = 1d;
> >>>>>>>>         DerivativeStructure dsA = new DerivativeStructure(1, 1, 0, x);
> >>>>>>>>         System.out.println("Derivative of |a|^x wrt x");
> >>>>>>>>         for (int p = 10; p < 21; p++) {
> >>>>>>>>             double a;
> >>>>>>>>             if (p < 20) {
> >>>>>>>>                 a = 1d / Math.pow(2d, p);
> >>>>>>>>             } else {
> >>>>>>>>                 a = 0d;
> >>>>>>>>             }
> >>>>>>>>             final DerivativeStructure a_ds = new DerivativeStructure(1, 1, a);
> >>>>>>>>             final DerivativeStructure out = a_ds.pow(dsA);
> >>>>>>>>             final double calc = (Math.pow(a, x + EPS) - Math.pow(a, x)) / EPS;
> >>>>>>>>             System.out.format("Derivative@%f=%f %f\n", a, calc,
> >>>>>>>>                     out.getPartialDerivative(new int[]{1}));
> >>>>>>>>         }
> >>>>>>>>     }
> >>>>>>>>
> >>>>>>>> At this point I'm explicitly substituting the rule that
> >>>>>>>> derivative(|a|^x) = 0 for |a|=0.
> >>>>>>>
> >>>>>>> Yes, but this fails for x = 0, as the limit of the finite
> >>>>>>> difference is -infinity and not 0.
> >>>>>>>
> >>>>>>> You can build your own function which explicitly assumes a is
> >>>>>>> constant and takes care of special values as follows:
> >>>>>>>
> >>>>>>>   public static DerivativeStructure aToX(final double a,
> >>>>>>>                                          final DerivativeStructure x) {
> >>>>>>>       final double lnA = (a == 0 && x.getValue() == 0) ?
> >>>>>>>                          Double.NEGATIVE_INFINITY :
> >>>>>>>                          FastMath.log(a);
> >>>>>>>       final double[] function = new double[1 + x.getOrder()];
> >>>>>>>       function[0] = FastMath.pow(a, x.getValue());
> >>>>>>>       for (int i = 1; i < function.length; ++i) {
> >>>>>>>           function[i] = lnA * function[i - 1];
> >>>>>>>       }
> >>>>>>>       return x.compose(function);
> >>>>>>>   }
> >>>>>>>
> >>>>>>> This will work and provides derivatives to any order for almost any
> >>>>>>> values of a and x, including a=0, x=1 as in your example, but also
> >>>>>>> slightly better for a=0, x=0. However, it still has an important
> >>>>>>> drawback: it won't compute the n-th order derivative correctly for
> >>>>>>> a=0, x=0 and n > 1. It will provide NaN for these higher order
> >>>>>>> derivatives instead of +/-infinity according to parity of n.
> >>>>>>
> >>>>>> I have added a similar function to the DerivativeStructure class
> >>>>>> (with some errors above corrected). The main interesting property of
> >>>>>> this function is that it is more accurate than converting a to a
> >>>>>> DerivativeStructure and using the general x^y function. It does its
> >>>>>> best to handle the special case, but as written above, this does NOT
> >>>>>> work for general combination (i.e. more than one variable or more
> >>>>>> than one order). As soon as there is a combination, the derivative
> >>>>>> will involve something like df/dx * dg/dy and as infinities and
> >>>>>> zeros are everywhere, NaN appears immediately for these partial
> >>>>>> derivatives. This cannot be avoided.
> >>>>>>
> >>>>>> If you stay away from the singularity, the function behaves
> >>>>>> correctly.
> >>>>>>
> >>>>>> best regards,
> >>>>>> Luc
> >>>>>>
> >>>>>>>
> >>>>>>> This is a known problem that we already encountered when dealing
> >>>>>>> with rootN.
> >>>>>>> Here is an extract of a comment in the test case
> >>>>>>> testRootNSingularity, where similar NaN appears instead of
> >>>>>>> +/-infinity. The dsZero instance in the comment is simply the x
> >>>>>>> parameter of the function, as a DerivativeStructure with value 0.0
> >>>>>>> and depending on itself
> >>>>>>> (dsZero = new DerivativeStructure(1, maxOrder, 0, 0.0)):
> >>>>>>>
> >>>>>>>
> >>>>>>> // the following checks shows a LIMITATION of the current implementation
> >>>>>>> // we have no way to tell dsZero is a pure linear variable x = 0
> >>>>>>> // we only say: "dsZero is a structure with value = 0.0,
> >>>>>>> // first derivative = 1.0, second and higher derivatives = 0.0".
> >>>>>>> // Function composition rule for second derivatives is:
> >>>>>>> // d2[f(g(x))]/dx2 = f''(g(x)) * [g'(x)]^2 + f'(g(x)) * g''(x)
> >>>>>>> // when function f is the nth root and x = 0 we have:
> >>>>>>> // f(0) = 0, f'(0) = +infinity, f''(0) = -infinity (and higher
> >>>>>>> // derivatives keep switching between +infinity and -infinity)
> >>>>>>> // so given that in our case dsZero represents g, we have g(x) = 0,
> >>>>>>> // g'(x) = 1 and g''(x) = 0
> >>>>>>> // applying the composition rules gives:
> >>>>>>> // d2[f(g(x))]/dx2 = f''(g(x)) * [g'(x)]^2 + f'(g(x)) * g''(x)
> >>>>>>> //                 = -infinity * 1^2 + +infinity * 0
> >>>>>>> //                 = -infinity + NaN
> >>>>>>> //                 = NaN
> >>>>>>> // if we knew dsZero is really the x variable and not the identity
> >>>>>>> // function applied to x, we would not have computed f'(g(x)) * g''(x)
> >>>>>>> // and we would have found that the result was -infinity and not NaN
> >>>>>>>
> >>>>>>> Hope this helps
> >>>>>>> Luc
> >>>>>>>
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Ajo.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Fri, Aug 23, 2013 at 9:39 AM, Luc Maisonobe <[email protected]>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Hi Ajo,
> >>>>>>>>>
> >>>>>>>>> On 23/08/2013 17:48, Ajo Fod wrote:
> >>>>>>>>>> Try this and I'm happy to explain if necessary:
> >>>>>>>>>>
> >>>>>>>>>> public class Derivative {
> >>>>>>>>>>
> >>>>>>>>>>     public static void main(final String[] args) {
> >>>>>>>>>>         DerivativeStructure dsA = new DerivativeStructure(1, 1, 0, 1d);
> >>>>>>>>>>         System.out.println("Derivative of constant^x wrt x");
> >>>>>>>>>>         for (int a = -3; a < 3; a++) {
> >>>>>>>>>
> >>>>>>>>> We have chosen the classical definition which implies c^r is not
> >>>>>>>>> defined for real r and negative c.
> >>>>>>>>>
> >>>>>>>>> Our implementation is based on the decomposition
> >>>>>>>>> c^r = exp(r * ln(c)), so the NaN comes from the logarithm when
> >>>>>>>>> c <= 0.
> >>>>>>>>>
> >>>>>>>>> Note also that as explained in the documentation here:
> >>>>>>>>> <http://commons.apache.org/proper/commons-math/userguide/analysis.html#a4.7_Differentiation>,
> >>>>>>>>> there are no concepts of "constants" and "variables" in this
> >>>>>>>>> framework, so we cannot draw a line between c^r as seen as a
> >>>>>>>>> univariate function of r, or as a univariate function of c, or as
> >>>>>>>>> a bivariate function of c and r, or even as a pentavariate
> >>>>>>>>> function of p1, p2, p3, p4, p5 with both c and r being computed
> >>>>>>>>> elsewhere from p1...p5. So we don't make special cases for the
> >>>>>>>>> case c = 0 for example.
> >>>>>>>>>
> >>>>>>>>> Does this explanation make sense to you?
> >>>>>>>>>
> >>>>>>>>> best regards,
> >>>>>>>>> Luc
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>>             final DerivativeStructure a_ds = new DerivativeStructure(1, 1, a);
> >>>>>>>>>>             final DerivativeStructure out = a_ds.pow(dsA);
> >>>>>>>>>>             System.out.format("Derivative@%d=%f\n", a,
> >>>>>>>>>>                     out.getPartialDerivative(new int[]{1}));
> >>>>>>>>>>         }
> >>>>>>>>>>     }
> >>>>>>>>>> }
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Fri, Aug 23, 2013 at 7:59 AM, Gilles <[email protected]>
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> On Fri, 23 Aug 2013 07:17:35 -0700, Ajo Fod wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Seems like the DerivativeCompiler returns NaN.
> >>>>>>>>>>>>
> >>>>>>>>>>>> IMHO it should return 0.
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> What should be 0? And Why?
> >>>>>>>>>>>
> >>>>>>>>>>>> Is this worthy of an issue?
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> As is, no.
> >>>>>>>>>>>
> >>>>>>>>>>> Gilles
> >>>>>>>>>>>
> >>>>>>>>>>>> Thanks,
> >>>>>>>>>>>> -Ajo
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> ---------------------------------------------------------------------
> >>>>>>>>>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> >>>>>>>>>>> For additional commands, e-mail: [email protected]
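
To make the singular case from the discussion above concrete, here is a minimal plain-Java sketch. It is not the code committed to Commons Math: the class SingularPow and both method names are hypothetical, and treating a = 0 with x < 0 as NaN is only my reading of the "negative cases" remark. It tabulates the derivatives of a^x with respect to x under the 0^0 = 1 convention, then applies the second-order composition rule quoted in the rootN comment to show where the NaN comes from:

```java
import java.util.Arrays;

public class SingularPow {

    /** Derivatives of a^x wrt x, orders 0..order, for a constant a >= 0. */
    static double[] powDerivatives(final double a, final double x, final int order) {
        final double[] f = new double[order + 1];
        if (a == 0) {
            if (x == 0) {
                f[0] = 1;                               // 0^0 = 1 convention
                double infinity = Double.POSITIVE_INFINITY;
                for (int i = 1; i < f.length; ++i) {
                    infinity = -infinity;               // -inf, +inf, -inf, ...
                    f[i] = infinity;
                }
            } else if (x < 0) {
                Arrays.fill(f, Double.NaN);             // assumption, see lead-in
            }
            // a == 0 and x > 0: value and all derivatives stay at 0
        } else {
            final double lnA = Math.log(a);
            f[0] = Math.pow(a, x);
            for (int i = 1; i < f.length; ++i) {
                f[i] = lnA * f[i - 1];                  // ln(a)^i * a^x
            }
        }
        return f;
    }

    /** d2[f(g(x))]/dx2 = f''(g) * g'^2 + f'(g) * g''. */
    static double composeSecond(final double f1, final double f2,
                                final double g1, final double g2) {
        return f2 * g1 * g1 + f1 * g2;
    }

    public static void main(final String[] args) {
        final double[] d = powDerivatives(0d, 0d, 3);
        System.out.println(Arrays.toString(d));
        // g is the plain variable x: g' = 1, g'' = 0, so f' * g'' is inf * 0
        System.out.println(composeSecond(d[1], d[2], 1d, 0d));
    }
}
```

The table itself holds the alternating infinities; the NaN only appears once the generic composition rule multiplies an infinite f' by the zero g''.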
