Hi Ajo,

On 28/08/2013 16:56, Ajo Fod wrote:
> To define things precisely:
> y = f(a,x) = |a|^x
>
> Can we agree that:
> df(a,x)/dx -> 0 when a->0 and x > 0 :[ NOTE: x > 0]

Yes, of course, it is perfectly true.

>
> If this is acceptable, we get this very useful property that df(a,x)/dx is
> defined and continuous for all a provided x>0 because we use the modulus of
> a in the function definition.

Yes, as long as we don't have x = 0, we remain in a smooth, infinitely
differentiable domain.

> In optimization, with this patch at |a|=0, I
> can set an optimizer to search the whole real line without worrying about
> a=0 otherwise I've to look out for a=0 explicitly. It seems unnecessary to
> add a constraint to make |a|>0. I already have a constraint for x >0.

I don't understand what you mean here. If you already know that x > 0,
then you don't have to worry about a=0 or a>0 since in this case both
approaches lead to the same result.

If you look at the graph of df(a,x)/dx for a few values of a, you will
see that we have:

  lim a->0+ df(a,x)/dx = 0          for x > 0
  lim a->0+ df(a,x)/dx = -infinity  for x = 0

and this despite df(a,x)/dx = ln(a) a^x being a continuous, infinitely
differentiable function of x for every a > 0. The limit of a family of
continuous, infinitely differentiable functions may be a discontinuous
function. It is a counter-intuitive result, I agree, but there are many
other examples of such strange behaviour in mathematics (if I remember
well, Fourier transforms of step functions exhibit the same problem,
backward).

If you have x>0, you are already on the safe side of the singularity, so
this is where I lose track of your argument and don't understand how the
singular point x=0 bothers you.

best regards,
Luc

>
> Cheers,
> Ajo.
>
>
>
> On Tue, Aug 27, 2013 at 1:49 PM, Luc Maisonobe <luc.maison...@free.fr> wrote:
>
>> Hi Ajo,
>>
>> On 27/08/2013 16:44, Ajo Fod wrote:
>>> Thanks for the constant structure.
>>>
>>> No. The limit value when x->0+ is 1, not 0.
>>>
>>> I agree with this. I was just going for the derivatives = 0.
>>> >>> >>>> The nth derivative of a^x can be computed analytically as ln(a)^n a^x, >>>> so the initial slope at x=0 is simply ln(a), positive for a > 1, zero >>>> for a = 1, negative for 0 < a < 1 with a limit at -inifnity when a -> >> 0+. >>>> >>> >>> Lets think about this for a sec: >>> Derivative of |a|^x wrt x at x=2.0 for various values of a >>> Derivative@0.031250=-0.003384 >>> Derivative@0.015625=-0.001015 >>> Derivative@0.007813=-0.000296 >>> Derivative@0.003906=-0.000085 >>> Derivative@0.001953=-0.000024 >>> ... tends to 0 >> >> yes, because 2.0 > 0. >> >>> >>> Derivative of |a|^x wrt x at x=0.5 for various values of a >>> Derivative@0.031250=-0.612555 >>> Derivative@0.007813=-0.428759 >>> Derivative@0.001953=-0.275612 >>> Derivative@0.000488=-0.168418 >>> Derivative@0.000122=-0.099513 >>> Derivative@0.000031=-0.057407 >>> Derivative@0.000008=-0.032528 >>> Derivative@0.000002=-0.018176 >>> ... tends to 0 when a->0 >> >> yes because 0.5 > 0. >> >>> >>> The code I used for the print outs is: >>> static final double EPS = 0.0001d; >>> >>> public static void main(final String[] args) { >>> final double x = 0.5d; >>> int from = 5; >>> int to = 20; >>> System.out.println("Derivative of |a|^x wrt x at x=" + x); >>> for (int p = from; p < to; p+=2) { >>> double a = Math.pow(2d, -p); >>> final double calc = (Math.pow(a, x + EPS) - Math.pow(a, x)) / >>> EPS; >>> System.out.format("Derivative@%f=%f \n", a, calc); >>> } >>> } >>> >>> As for the x=0 case: >>> 1^0 = 1 >>> 0.5^0 = 1 >>> 0.0001^0 = 1 >>> 0^0 is technically undefined, but 1 is a good definition: >>> http://www.math.hmc.edu/funfacts/ffiles/10005.3-5.shtml >> >> Yes. >> >>> ... so, a good value for the differential of da^x/dx limit x->0 and >> a->0 = >>> 0 >> >> I don't agree. What you wrote in the lines above is another way to say >> what I wrote in my previous message: the value at x=0 is always y=1, and >> the value for x > 0 tends to 0 as a->0+. 
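The two limits discussed above can be checked with the analytic derivative ln(a) a^x instead of a finite difference. This is a minimal sketch using only java.lang.Math; the class name PowDerivativeLimit and the method dPowDx are hypothetical and not part of Commons Math.

```java
final class PowDerivativeLimit {

    /** Analytic derivative of a^x with respect to x, valid for a > 0: ln(a) * a^x. */
    static double dPowDx(final double a, final double x) {
        return Math.log(a) * Math.pow(a, x);
    }

    public static void main(final String[] args) {
        // x > 0: the derivative shrinks towards 0 as a -> 0+.
        // x = 0: the derivative is ln(a), which keeps growing in magnitude.
        for (final double x : new double[] {2.0, 0.5, 0.0}) {
            System.out.println("x = " + x);
            for (int p = 5; p <= 45; p += 10) {
                final double a = Math.pow(2.0, -p);
                System.out.println("  a = 2^-" + p + "  d(a^x)/dx = " + dPowDx(a, x));
            }
        }
    }
}
```

For x = 2.0 and x = 0.5 the printed values shrink in magnitude as a decreases, while for x = 0.0 they follow ln(a) and grow without bound in the negative direction.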
>> >> So the function always starts at 1 and dives more and more steeply as a >> becomes smaller, and the derivative at 0 becomes more and more negative, >> up to -infinity, *not* 0. >> >> The function is ill-behaved and the fact the derivative is infinite is >> consistent with this ill-behaviour. >> >> The definition of the derivative is : >> >> f'(x) = lim (f(x+h) - f(x))/h when h -> 0+ >> >> when f(x) = 0^x and assuming 0^0 = 1 as you have agreed above, this gives: >> >> f'(0) = lim (0^(0+h) - 0^0)/h = lim (0 - 1)/h = -infinity >> >> which is exactly the same result as computing for a non-null a and then >> reducing it: d(a^x)/dx = ln(a) a^x = ln(a) when x=0, diverges to >> -infinity when a converges to 0. >> >>> >>> >>> As mentioned earlier, I think the cause for this is that log|a| -> >> infinity >>> slower than |a|^x -> 0 as |a|->0 . >> >> But a^x does *not* converge to 0 for x = 0! a^0 is always 1 (rigorously) >> regardless of the value of a as long as it is not 0, and then when we >> change a we can also consider the limit is 1 when a-> 0. This convention >> is well accepted. This convention is implemented in the Java standard >> Math.pow function, and we followed this trend. This is the reason why >> the functions becomes more and more steep as a becomes smaller. At the >> end, it is a discontinuous function (and hence should not be >> differentiable, or it is differentiable only if we use extended real >> numbers with infinity added). >> >> This is the heart of the ill-behaviour of 0^0. We want to compute it as >> a limit value for a^b when both parameters converge to 0, but we get a >> different result if we first set a fixed and converge b to 0, and later >> reduce a down to zero (your approach), and when we do the opposite. In >> one case we get 0, in the other case we get 1. >> >> Lets put it another way: >> If we consider the derivative f'(0) should be 0, then the value f(0) >> should also be considered equal to zero. 
This would mean as soon as we
>> get a tiny non-zero a (say the smallest number that can be represented
>> as a double), then f(0) would jump from 0 to 1 instantly, and f'(0)
>> would jump from 0 to -infinity instantly. So we would have at a = 0 an
>> initial null derivative, then a jump to a very negative derivative as a
>> leaves 0, then the derivative would become less and less negative as a
>> increases up to 1, at a=1 the derivative would again be 0, then the
>> derivative would continue to increase and become positive as a grows
>> larger than 1 (all these derivatives are computed at x=0, and as written
>> previously, they are simply equal to log(a)).
>>
>> To summarize, the two choices are:
>> 1) - first considering a fixed a, strictly positive,
>>    - then looking globally at the function a^x for all values x>=0,
>>    - then reducing a, noting that all functions start at the same
>>      point x=0, y=1 and the derivatives become more and more negative
>>      as the function becomes more and more ill-behaved
>> 2) - first considering a fixed x, strictly positive,
>>    - then reducing a and identifying that the limit value is 0,
>>    - then building a function by packing all the x>0, which is very
>>      smooth as it is identically 0 for all x>0
>>    - finally adding the limit value at x=0, which in this case would
>>      be 0 (and the derivative would also be 0).
>>
>> It seems well accepted to consider the value of 0^0 should be set to 1,
>> and as a consequence the corresponding derivative with respect to x
>> should be set to -infinity.
>>
>> I fully agree it is not a perfect solution, it is an arbitrary choice.
>> However, this choice is consistent with all the implementations of the
>> pow function I have seen (i.e. 0^0 set to 1 instead of 0).
>>
>> Your approach is not wrong, it is as valid as the other one. It is
>> simply not the common choice.
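The 0^0 = 1 convention and the resulting one-sided finite difference -1/h can be observed directly with Math.pow. This is a minimal sketch; the class name ZeroPowZero and the method forwardDifferenceAtZero are hypothetical names chosen for illustration.

```java
final class ZeroPowZero {

    /** One-sided finite difference of x -> 0^x at x = 0, for a step h > 0. */
    static double forwardDifferenceAtZero(final double h) {
        // Math.pow(0.0, h) is 0.0 for h > 0 and Math.pow(0.0, 0.0) is 1.0,
        // so this evaluates to (0 - 1) / h = -1 / h.
        return (Math.pow(0.0, h) - Math.pow(0.0, 0.0)) / h;
    }

    public static void main(final String[] args) {
        System.out.println("0^0 = " + Math.pow(0.0, 0.0));
        for (double h = 1.0e-1; h > 1.0e-10; h /= 1000.0) {
            System.out.println("h = " + h + "  difference = " + forwardDifferenceAtZero(h));
        }
    }
}
```

As h shrinks, the quotient diverges towards -infinity, which is the behaviour choice 1) describes.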
>>
>> I would say an even better choice would have been to say 0^0 *is not*
>> defined and even the value should be set to NaN (not even speaking of
>> the derivative).
>>
>> Does this seem acceptable to you?
>>
>> best regards,
>> Luc
>>
>>>
>>> Cheers,
>>> Ajo.
>>>
>>>
>>>> The limit curve corresponding to a = 0 is therefore a singular function
>>>> with f(0) = 1 and f(x) = 0 for all x > 0. The fact f(0) = 1 and not 0 is
>>>> consistent with the derivative being negative infinity, as by definition
>>>> the derivative is the limit of [f(0+h) - f(0)] / h when h->0+, and the
>>>> finite difference is -1/h.
>>>>
>>>>>         }
>>>>>     }else{
>>>>>         for (int i = 0; i < function.length; ++i) {
>>>>>             function[i] = Double.NaN;
>>>>>         }
>>>>
>>>> This alternative case is a good improvement, thanks for it. I forgot to
>>>> handle negative cases properly. I have therefore changed the code
>>>> (committed as r1517788) with this improvement, together with several
>>>> test cases.
>>>>
>>>>>     }
>>>>> } else {
>>>>>
>>>>>
>>>>> in place of :
>>>>>
>>>>> if (a == 0) {
>>>>>     if (operand[operandOffset] == 0) {
>>>>>         function[0] = 1;
>>>>>         double infinity = Double.POSITIVE_INFINITY;
>>>>>         for (int i = 1; i < function.length; ++i) {
>>>>>             infinity = -infinity;
>>>>>             function[i] = infinity;
>>>>>         }
>>>>>     }
>>>>> } else {
>>>>>
>>>>>
>>>>> PS: I think you made a change to DSCompiler.pow too. If so, what happens
>>>>> when a=0 & x!=0 in that function?
>>>>
>>>> No, I didn't change the other signatures of the pow function. So the
>>>> value should be OK (i.e. 1) but all derivatives, including the first
>>>> one, should be NaN. What the new function brings is a correct negative
>>>> infinity first derivative at the singularity point, better accuracy for
>>>> non-singular points, and possibly faster computation.
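The special cases in the quoted fragments can be put together in plain doubles, without DerivativeStructure. This is only a sketch under stated assumptions and is not the code committed as r1517788: the class name ConstantBasePow and the method derivatives are hypothetical, and the treatment of a = 0 with x > 0 (all zeros) and with x < 0 (all NaN) is a reading of the fragments above rather than a confirmed behaviour.

```java
import java.util.Arrays;

final class ConstantBasePow {

    /**
     * Value and derivatives of orders 1..order of x -> a^x at the given x,
     * for a constant base a. Plain-double sketch, not the library code.
     */
    static double[] derivatives(final double a, final double x, final int order) {
        final double[] function = new double[order + 1];
        if (a == 0) {
            if (x == 0) {
                // Convention 0^0 = 1, derivatives alternate -inf, +inf, -inf, ...
                function[0] = 1;
                double infinity = Double.POSITIVE_INFINITY;
                for (int i = 1; i < function.length; ++i) {
                    infinity = -infinity;
                    function[i] = infinity;
                }
            } else if (x < 0) {
                // Assumption: negative powers of zero are reported as NaN.
                Arrays.fill(function, Double.NaN);
            }
            // Assumption: a == 0 and x > 0 leaves the value and all derivatives at 0.
        } else {
            // n-th derivative is ln(a)^n * a^x; ln(a) is NaN for a < 0.
            final double lnA = Math.log(a);
            function[0] = Math.pow(a, x);
            for (int i = 1; i < function.length; ++i) {
                function[i] = lnA * function[i - 1];
            }
        }
        return function;
    }

    public static void main(final String[] args) {
        System.out.println(Arrays.toString(derivatives(0.0, 0.0, 3)));
        System.out.println(Arrays.toString(derivatives(0.0, 1.0, 3)));
        System.out.println(Arrays.toString(derivatives(0.5, 0.0, 3)));
    }
}
```

Handling a = 0 before calling Math.log avoids the product -infinity * 0, which would otherwise turn the derivative at a = 0, x > 0 into NaN.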
>>>> >>>> best regards, >>>> Luc >>>> >>>>> >>>>> >>>>> On Mon, Aug 26, 2013 at 12:38 AM, Luc Maisonobe <l...@spaceroots.org> >>>> wrote: >>>>> >>>>>> >>>>>> >>>>>> >>>>>> Ajo Fod <ajo....@gmail.com> a écrit : >>>>>>> Are you saying patched the code? Can you provide the link? >>>>>> >>>>>> I committed it in the development version. You just have to update >> your >>>>>> checked out copy from either the official >>>>>> Apache subversion repository or the git mirror we talked about in a >>>>>> previous thread. >>>>>> >>>>>> The new method is a static one called pow and taking a and x as >>>> arguments >>>>>> and returning a^x. Not to >>>>>> Be confused with the non-static methods that take only the power as >>>>>> argument (either int, double or >>>>>> DerivativeStructure) and use the instance as the base to apply power >> on. >>>>>> >>>>>> Best regards, >>>>>> Luc >>>>>> >>>>>>> >>>>>>> -Ajo >>>>>>> >>>>>>> >>>>>>> On Sun, Aug 25, 2013 at 1:20 PM, Luc Maisonobe <l...@spaceroots.org> >>>>>>> wrote: >>>>>>> >>>>>>>> Le 24/08/2013 11:24, Luc Maisonobe a écrit : >>>>>>>>> Le 23/08/2013 19:20, Ajo Fod a écrit : >>>>>>>>>> Hello, >>>>>>>>> >>>>>>>>> Hi Ajo, >>>>>>>>> >>>>>>>>>> >>>>>>>>>> This shows one way of interpreting the derivative for strictly +ve >>>>>>>> numbers. 
>>>>>>>>>> >>>>>>>>>> public static void main(final String[] args) { >>>>>>>>>> final double x = 1d; >>>>>>>>>> DerivativeStructure dsA = new DerivativeStructure(1, 1, 0, >>>>>>> x); >>>>>>>>>> System.out.println("Derivative of |a|^x wrt x"); >>>>>>>>>> for (int p = 10; p < 21; p++) { >>>>>>>>>> double a; >>>>>>>>>> if (p < 20) { >>>>>>>>>> a = 1d / Math.pow(2d, p); >>>>>>>>>> } else { >>>>>>>>>> a = 0d; >>>>>>>>>> } >>>>>>>>>> final DerivativeStructure a_ds = new >>>>>>> DerivativeStructure(1, >>>>>>>> 1, >>>>>>>>>> a); >>>>>>>>>> final DerivativeStructure out = a_ds.pow(dsA); >>>>>>>>>> final double calc = (Math.pow(a, x + EPS) - >>>>>>> Math.pow(a, x)) >>>>>>>> / >>>>>>>>>> EPS; >>>>>>>>>> System.out.format("Derivative@%f=%f %f\n", a, calc, >>>>>>>>>> out.getPartialDerivative(new int[]{1})); >>>>>>>>>> } >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> At this point I"m explicitly substituting the rule that >>>>>>>> derivative(|a|^x) = >>>>>>>>>> 0 for |a|=0. >>>>>>>>> >>>>>>>>> Yes, but this fails for x = 0, as the limit of the finite >>>>>>> difference is >>>>>>>>> -infinity and not 0. >>>>>>>>> >>>>>>>>> You can build your own function which explicitly assumes a is >>>>>>> constant >>>>>>>>> and takes care of special values as follows: >>>>>>>>> >>>>>>>>> public static DerivativeStructure aToX(final double a, >>>>>>>>> final DerivativeStructure >>>>>>> x) { >>>>>>>>> final double lnA = (a == 0 && x.getValue() == 0) ? >>>>>>>>> Double.NEGATIVE_INFINITY : >>>>>>>>> FastMath.log(a); >>>>>>>>> final double[] function = new double[1 + x.getOrder()]; >>>>>>>>> function[0] = FastMath.pow(a, x.getValue()); >>>>>>>>> for (int i = 1; i < function.length; ++i) { >>>>>>>>> function[i] = lnA * function[i - 1]; >>>>>>>>> } >>>>>>>>> return x.compose(function); >>>>>>>>> } >>>>>>>>> >>>>>>>>> This will work and provides derivatives to any order for almost any >>>>>>>>> values of a and x, including a=0, x=1 as in your exemple, but also >>>>>>>>> slightly better for a=0, x=0. 
However, it still has an important
>>>>>>>>> drawback: it won't compute the n-th order derivative correctly for a=0,
>>>>>>>>> x=0 and n > 1. It will provide NaN for these higher order derivatives
>>>>>>>>> instead of +/-infinity according to the parity of n.
>>>>>>>>
>>>>>>>> I have added a similar function to the DerivativeStructure class (with
>>>>>>>> some errors above corrected). The main interesting property of this
>>>>>>>> function is that it is more accurate than converting a to a
>>>>>>>> DerivativeStructure and using the general x^y function. It does its best
>>>>>>>> to handle the special case, but as written above, this does NOT work for
>>>>>>>> general combinations (i.e. more than one variable or more than one
>>>>>>>> order). As soon as there is a combination, the derivative will involve
>>>>>>>> something like df/dx * dg/dy and as infinities and zeros are everywhere,
>>>>>>>> NaN appears immediately for these partial derivatives. This cannot be
>>>>>>>> avoided.
>>>>>>>>
>>>>>>>> If you stay away from the singularity, the function behaves correctly.
>>>>>>>>
>>>>>>>> best regards,
>>>>>>>> Luc
>>>>>>>>
>>>>>>>>>
>>>>>>>>> This is a known problem that we already encountered when dealing with
>>>>>>>>> rootN. Here is an extract of a comment in the test case
>>>>>>>>> testRootNSingularity, where similar NaN appears instead of +/-infinity.
>>>>>>>>> The dsZero instance in the comment is simple the x parameter of the >>>>>>>>> function, as a derivativeStructure with value 0.0 and depending on >>>>>>>>> itself (dsZero = new DerivativeStructure(1, maxOrder, 0, 0.0)): >>>>>>>>> >>>>>>>>> >>>>>>>>> // the following checks shows a LIMITATION of the current >>>>>>> implementation >>>>>>>>> // we have no way to tell dsZero is a pure linear variable x = 0 >>>>>>>>> // we only say: "dsZero is a structure with value = 0.0, >>>>>>>>> // first derivative = 1.0, second and higher derivatives = 0.0". >>>>>>>>> // Function composition rule for second derivatives is: >>>>>>>>> // d2[f(g(x))]/dx2 = f''(g(x)) * [g'(x)]^2 + f'(g(x)) * g''(x) >>>>>>>>> // when function f is the nth root and x = 0 we have: >>>>>>>>> // f(0) = 0, f'(0) = +infinity, f''(0) = -infinity (and higher >>>>>>>>> // derivatives keep switching between +infinity and -infinity) >>>>>>>>> // so given that in our case dsZero represents g, we have g(x) = 0, >>>>>>>>> // g'(x) = 1 and g''(x) = 0 >>>>>>>>> // applying the composition rules gives: >>>>>>>>> // d2[f(g(x))]/dx2 = f''(g(x)) * [g'(x)]^2 + f'(g(x)) * g''(x) >>>>>>>>> // = -infinity * 1^2 + +infinity * 0 >>>>>>>>> // = -infinity + NaN >>>>>>>>> // = NaN >>>>>>>>> // if we knew dsZero is really the x variable and not the identity >>>>>>>>> // function applied to x, we would not have computed f'(g(x)) * >>>>>>> g''(x) >>>>>>>>> // and we would have found that the result was -infinity and not >>>>>>> NaN >>>>>>>>> >>>>>>>>> Hope this helps >>>>>>>>> Luc >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Ajo. 
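The NaN described in the test comment above comes from IEEE 754 arithmetic, where infinity * 0 is NaN. This is a minimal sketch of the second-derivative composition rule in plain doubles; the class name CompositionNaN and the method secondDerivative are hypothetical.

```java
final class CompositionNaN {

    /** Chain rule for d2[f(g(x))]/dx2 = f''(g) * g'^2 + f'(g) * g''. */
    static double secondDerivative(final double f1, final double f2,
                                   final double g1, final double g2) {
        return f2 * g1 * g1 + f1 * g2;
    }

    public static void main(final String[] args) {
        // nth root at 0: f' = +infinity, f'' = -infinity; dsZero: g' = 1, g'' = 0
        final double f1 = Double.POSITIVE_INFINITY;
        final double f2 = Double.NEGATIVE_INFINITY;
        System.out.println("full rule:       " + secondDerivative(f1, f2, 1.0, 0.0));
        // Dropping the f'(g) * g'' term, as one could for a pure variable x
        System.out.println("first term only: " + f2 * 1.0 * 1.0);
    }
}
```

The full rule yields NaN because +infinity * 0 is NaN and NaN propagates through the sum, while the first term alone is -infinity.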
>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Fri, Aug 23, 2013 at 9:39 AM, Luc Maisonobe >>>>>>> <luc.maison...@free.fr >>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> Hi Ajo, >>>>>>>>>>> >>>>>>>>>>> Le 23/08/2013 17:48, Ajo Fod a écrit : >>>>>>>>>>>> Try this and I'm happy to explain if necessary: >>>>>>>>>>>> >>>>>>>>>>>> public class Derivative { >>>>>>>>>>>> >>>>>>>>>>>> public static void main(final String[] args) { >>>>>>>>>>>> DerivativeStructure dsA = new DerivativeStructure(1, 1, >>>>>>> 0, >>>>>>>> 1d); >>>>>>>>>>>> System.out.println("Derivative of constant^x wrt x"); >>>>>>>>>>>> for (int a = -3; a < 3; a++) { >>>>>>>>>>> >>>>>>>>>>> We have chosen the classical definition which implies c^x is not >>>>>>>> defined >>>>>>>>>>> for real r and negative c. >>>>>>>>>>> >>>>>>>>>>> Our implementation is based on the decomposition c^r = exp(r * >>>>>>> ln(c)), >>>>>>>>>>> so the NaN comes from the logarithm when c <= 0. >>>>>>>>>>> >>>>>>>>>>> Noe also that as explained in the documentation here: >>>>>>>>>>> < >>>>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>> >> http://commons.apache.org/proper/commons-math/userguide/analysis.html#a4.7_Differentiation >>>>>>>>>>>> , >>>>>>>>>>> there are no concepts of "constants" and "variables" in this >>>>>>> framework, >>>>>>>>>>> so we cannot draw a line between c^r as seen as a univariate >>>>>>> function >>>>>>>> of >>>>>>>>>>> r, or as a univariate function of c, or as a bivariate function >>>>>>> of c >>>>>>>> and >>>>>>>>>>> r, or even as a pentavariate function of p1, p2, p3, p4, p5 with >>>>>>> both c >>>>>>>>>>> and r being computed elsewhere from p1...p5. So we don't make >>>>>>> special >>>>>>>>>>> cases for the case c = 0 for example. >>>>>>>>>>> >>>>>>>>>>> Does this explanation make sense to you? 
>>>>>>>>>>> >>>>>>>>>>> best regards, >>>>>>>>>>> Luc >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> final DerivativeStructure a_ds = new >>>>>>>> DerivativeStructure(1, >>>>>>>>>>> 1, >>>>>>>>>>>> a); >>>>>>>>>>>> final DerivativeStructure out = a_ds.pow(dsA); >>>>>>>>>>>> System.out.format("Derivative@%d=%f\n", a, >>>>>>>>>>>> out.getPartialDerivative(new int[]{1})); >>>>>>>>>>>> } >>>>>>>>>>>> } >>>>>>>>>>>> } >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Fri, Aug 23, 2013 at 7:59 AM, Gilles >>>>>>> <gil...@harfang.homelinux.org >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> On Fri, 23 Aug 2013 07:17:35 -0700, Ajo Fod wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Seems like the DerivativeCompiler returns NaN. >>>>>>>>>>>>>> >>>>>>>>>>>>>> IMHO it should return 0. >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> What should be 0? And Why? >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> Is this worthy of an issue? >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> As is, no. >>>>>>>>>>>>> >>>>>>>>>>>>> Gilles >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> -Ajo >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>> >>>>>>> >>>> >> ------------------------------**------------------------------**--------- >>>>>>>>>>>>> To unsubscribe, e-mail: dev-unsubscribe@commons.**apache.org< >>>>>>>>>>> dev-unsubscr...@commons.apache.org> >>>>>>>>>>>>> For additional commands, e-mail: dev-h...@commons.apache.org >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>> --------------------------------------------------------------------- >>>>>>>>>>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org >>>>>>>>>>> For additional commands, e-mail: dev-h...@commons.apache.org >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>> --------------------------------------------------------------------- >>>>>>>>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org >>>>>>>>> For additional commands, 
e-mail: dev-h...@commons.apache.org