On 30/11/2012 19:38, Konstantin Berlin wrote:
> In my view the framework should be as simple as possible.
>
> class OptimizationFunction
> {
>     public DiffValue value(double[] x)
> }
>
> where
>
> class DiffValue
> {
>     double val;
>     double[] gradient;
> }

I understood your previous messages, but I am confused by this one, and
especially by the two classes above. Apart from naming, your DiffValue *is*
what DerivativeStructure provides, the difference being that val is stored as
data[0] and gradient as data[1]...data[p-1]. As per your previous messages, I
thought you wanted to have two functions, one returning val and the other
returning the gradient, and that you were troubled that we had only one
function returning both. So I guess I did not understand your previous
messages.
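To make that mapping concrete, here is roughly how one would get your pair
from a single evaluation (only a sketch; the two-variable function is an
arbitrary example):

    // import org.apache.commons.math3.analysis.differentiation.DerivativeStructure;

    int parameters = 2;
    int order      = 1;

    // x0 and x1 declared as free variables 0 and 1, at the evaluation point (2, 3)
    DerivativeStructure x0 = new DerivativeStructure(parameters, order, 0, 2.0);
    DerivativeStructure x1 = new DerivativeStructure(parameters, order, 1, 3.0);

    // the objective function written once, here f(x0, x1) = x0 * sin(x1)
    DerivativeStructure y = x0.multiply(x1.sin());

    double   val      = y.getValue();              // your "val"
    double[] gradient = new double[] {
        y.getPartialDerivative(1, 0),              // df/dx0
        y.getPartialDerivative(0, 1)               // df/dx1
    };                                             // your "gradient"

With order = 2, the same object also carries the Hessian entries
(getPartialDerivative(2, 0), (1, 1), (0, 2)), which is what the classes below
would need.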
Luc

> class DiffValueHessian
> {
>     double val;
>     double[] gradient;
>     double[][] Hessian;
> }
>
> or for least squares
>
> class DiffValueLeastSquares
> {
>     double[] values;
>     double[][] J;
> }
>
> This is all that any Newton-based optimization would need. It does not
> require a large number of new objects per evaluation, and it allows one to
> simply reuse "val" to compute "gradient", etc. If the user wants to do
> automatic differentiation we should provide them a wrapper function.
> Please let's keep the optimization function as clean and simple as possible.
>
> On Nov 30, 2012, at 1:12 PM, Gilles Sadowski <gil...@harfang.homelinux.org>
> wrote:
>
>> Hello.
>>
>>> As a user of the optimization algorithms I am completely confused by the
>>> change. It seems different from how optimization functions are typically
>>> used and seems to be creating a barrier for no reason.
>>
>> If you think that it's for no reason, then you probably missed some
>> important point: if you can express the objective function in terms of
>> "DerivativeStructure" parameters, then you get all derivatives for free!
>>
>> Of course, that's not always easy (e.g. if the objective function is the
>> result of a sizeable amount of code).
>>
>>> It is not clear to me why you can't just let the standard interface to an
>>> optimizer be a function that computes the value and the Jacobian (in the
>>> case of least squares), the gradient (for quasi-Newton methods) and, if
>>> you actually have a full Newton method, also the Hessian.
>>
>> Maybe we will; that's the discussion point I raised in this thread.
>> IIUC, there are cases where it is indeed a barrier to force users into
>> using "DerivativeStructure" although it does not bring any advantage
>> (like when the gradient and Jacobian are only accessible through finite
>> differences).
>>
>>> If the user wants to compute the Jacobian (gradient) using finite
>>> differences they can do it themselves, or wrap it into a class that you
>>> can provide them that will compute finite differences using the desired
>>> algorithm.
>>
>> That's one of the points below: we could assume that a finite differences
>> differentiator is outside the realm of sensible use of
>> "DerivativeStructure" (and we keep the converters), or we figure out what
>> necessary and sufficient features such a differentiator must have to
>> cover all usages of CM (e.g. enabling the derivative-based algorithms to
>> work).
>>
>>> Also, I can imagine a case where the computation of the Jacobian can be
>>> sped up if the function value is known, yet you have two separate
>>> functions handling the derivatives and the actual function value. For
>>> example f^2(x). You can probably derive some kind of caching scheme, but
>>> still.
>>
>> If using the "forward" formula for the first-order derivative, then
>> knowing the value of the function spares one function evaluation per
>> optimized parameter.
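Just to illustrate the saving Gilles mentions here: with the forward formula,
the value the optimizer needs anyway can be shared with the gradient
computation, so a full gradient costs n extra evaluations instead of the 2n
of the symmetric formula. A rough sketch only (f is a plain
MultivariateFunction, and the per-parameter stepSize array is hypothetical):

    double   fx = f.value(x);                    // already needed by the optimizer
    double[] g  = new double[x.length];
    for (int i = 0; i < x.length; i++) {
        final double xi = x[i];
        x[i] = xi + stepSize[i];
        g[i] = (f.value(x) - fx) / stepSize[i];  // one extra evaluation per parameter
        x[i] = xi;                               // restore the point
    }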
>>
>>> Maybe I am missing something, but I spent about an hour trying to figure
>>> out how to change my code to adapt to your new framework. Still haven't
>>> figured it out.
>>
>> You are not alone. I've spent much more than an hour, and only came up
>> with questions. ;-)
>>
>> Regards,
>> Gilles
>>
>>> On Nov 30, 2012, at 11:11 AM, Gilles Sadowski
>>> <gil...@harfang.homelinux.org> wrote:
>>>
>>>> Hello.
>>>>
>>>> Context:
>>>> 1. A user application computes the Jacobian of a multivariate vector
>>>>    function (the output of a simulation) using finite differences.
>>>> 2. The covariance matrix is obtained from
>>>>    "AbstractLeastSquaresOptimizer". In the new API, the Jacobian is
>>>>    supposed to be "automatically" computed from the
>>>>    "MultivariateDifferentiableVectorFunction" objective function.
>>>> 3. The converter from "DifferentiableMultivariateVectorFunction" to
>>>>    "MultivariateDifferentiableVectorFunction" (in "FunctionUtils") is
>>>>    deprecated.
>>>> 4. A "FiniteDifferencesDifferentiator" operator currently exists, but
>>>>    only for univariate functions.
>>>>    Unless I'm mistaken, a direct extension to multiple variables won't do:
>>>>    * because the implementation uses the symmetric formula, which will
>>>>      fail in some cases (bounded parameter range), and
>>>>    * because of the floating-point representation of real values, the
>>>>      delta for sampling the function should depend on the magnitude of
>>>>      the parameter value around which the sampling is done, whereas the
>>>>      "stepSize" is constant in the implementation.
>>>>
>>>> Questions:
>>>> 1. Shouldn't we keep the converters so that users can keep their
>>>>    "home-made" (first-order) derivative computations?
>>>>    [Converters exist for the gradient of
>>>>    "DifferentiableMultivariateFunction" and the Jacobian of
>>>>    "DifferentiableMultivariateVectorFunction".]
>>>> 2. Is it worth creating the multivariate equivalent of the univariate
>>>>    "FiniteDifferencesDifferentiator", assuming that higher orders will
>>>>    rarely be used because of
>>>>    * the loss of accuracy (as stated in the doc), and/or
>>>>    * the sometimes prohibitively expensive number of evaluations of the
>>>>      objective function? [1]
>>>> 3. As current CM optimization algorithms need only the gradient or the
>>>>    Jacobian, would it be sufficient to provide only a limited (two-point,
>>>>    first-order) finite differences operator (with the possibility to
>>>>    choose either the "symmetric", "forward" or "backward" formula for
>>>>    each parameter)?
>>>>
>>>> Best regards,
>>>> Gilles
>>>>
>>>> [1] And this cost is somewhat "hidden" (as the "DerivativeStructure" is
>>>>     supposed to provide the derivatives for free, which is not true in
>>>>     this case).
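Regarding question 3, and only to give the discussion something concrete, a
limited two-point, first-order operator could look more or less like this.
Everything here is invented for illustration (the Scheme enum, the method
name and the step-size heuristic are not existing Commons Math API); only
MultivariateFunction is real:

    // import org.apache.commons.math3.analysis.MultivariateFunction;

    enum Scheme { FORWARD, BACKWARD, SYMMETRIC }

    double[] gradient(MultivariateFunction f, double[] x, Scheme[] scheme) {
        final double   eps = Math.sqrt(Math.ulp(1.0));             // ~1.5e-8
        final double   fx  = f.value(x);                           // reused by the one-sided formulas
        final double[] g   = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            final double xi = x[i];
            final double h  = eps * Math.max(Math.abs(xi), 1.0);   // step scaled to the parameter magnitude
            switch (scheme[i]) {
            case FORWARD:                                          // usable when x[i] sits at a lower bound
                x[i] = xi + h;
                g[i] = (f.value(x) - fx) / h;
                break;
            case BACKWARD:                                         // usable when x[i] sits at an upper bound
                x[i] = xi - h;
                g[i] = (fx - f.value(x)) / h;
                break;
            default:                                               // SYMMETRIC, more accurate away from bounds
                x[i] = xi + h;
                final double fPlus = f.value(x);
                x[i] = xi - h;
                g[i] = (fPlus - f.value(x)) / (2 * h);
            }
            x[i] = xi;                                             // restore the point
        }
        return g;
    }

The forward/backward choice is what would make bounded parameters workable,
and scaling h with |x[i]| addresses the constant stepSize limitation
mentioned in the context above.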