On 30/11/2012 19:38, Konstantin Berlin wrote:
> In my view the framework should be as simple as possible.
>
> class OptimizationFunction
> {
>     public DiffValue value(double[] x)
> }
>
> where
>
> class DiffValue
> {
>     double val;
>     double[] gradient;
> }
I understood your previous messages, but am confused by this one and
especially by the two classes above. Apart from naming, your DiffValue
*is* what DerivativeStructure provides. The difference is that val is
stored as data[0] and gradient as data[1]...data[p-1].
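
For instance, here is a minimal illustration (the two-variable function below
is made up for the example, not something from this thread) of how the
val/gradient pair is recovered from a single DerivativeStructure:

    import org.apache.commons.math3.analysis.differentiation.DerivativeStructure;

    // Two free variables, differentiation order 1.
    DerivativeStructure x0 = new DerivativeStructure(2, 1, 0, 1.0);
    DerivativeStructure x1 = new DerivativeStructure(2, 1, 1, 2.0);

    // f(x0, x1) = x0^2 + sin(x1), evaluated together with its derivatives.
    DerivativeStructure y = x0.multiply(x0).add(x1.sin());

    double   val      = y.getValue();                // the "val" field
    double[] gradient = new double[] {
        y.getPartialDerivative(1, 0),                 // df/dx0
        y.getPartialDerivative(0, 1)                  // df/dx1
    };
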
As per your previous messages, I thought you wanted to have two
functions, one returning val and the other returning gradient, and that
you were troubled we had only one function returning both. So I guess I
did not understand your previous messages.
Luc
>
> class DiffValueHessian
> {
>     double val;
>     double[] gradient;
>     double[][] Hessian;
> }
>
> or for least squares
>
> class DiffValueLeastSquares
> {
>     double[] values;
>     double[][] J;
> }
>
> This is all that any Newton-based optimization would need. It does not require
> a large number of new objects per evaluation, and it allows one to simply reuse
> "val" to compute "gradient", etc. If the user wants to do automatic differentiation,
> we should provide them a wrap function. Please let's keep the optimization
> function as clean and simple as possible.
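>
> For illustration, a minimal sketch of such a wrap function (it assumes the
> hypothetical OptimizationFunction/DiffValue classes above, which are not
> existing Commons Math types) could adapt a DerivativeStructure-based function
> to this API:
>
> import org.apache.commons.math3.analysis.differentiation.DerivativeStructure;
> import org.apache.commons.math3.analysis.differentiation.MultivariateDifferentiableFunction;
>
> public class AutoDiffWrapper
> {
>     private final MultivariateDifferentiableFunction f;
>
>     public AutoDiffWrapper(MultivariateDifferentiableFunction f)
>     {
>         this.f = f;
>     }
>
>     // Fulfills the OptimizationFunction contract sketched above.
>     public DiffValue value(double[] x)
>     {
>         final int n = x.length;
>         // Promote the plain point to free variables of differentiation order 1.
>         final DerivativeStructure[] point = new DerivativeStructure[n];
>         for (int i = 0; i < n; i++) {
>             point[i] = new DerivativeStructure(n, 1, i, x[i]);
>         }
>         final DerivativeStructure y = f.value(point);
>
>         // Unpack value and gradient into the simple holder.
>         final DiffValue dv = new DiffValue();
>         dv.val = y.getValue();
>         dv.gradient = new double[n];
>         for (int i = 0; i < n; i++) {
>             final int[] orders = new int[n];
>             orders[i] = 1;
>             dv.gradient[i] = y.getPartialDerivative(orders);
>         }
>         return dv;
>     }
> }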
>
> On Nov 30, 2012, at 1:12 PM, Gilles Sadowski <[email protected]>
> wrote:
>
>> Hello.
>>
>>> As a user of the optimization algorithms I am completely confused by the
>>> change. It seems different from how optimization functions are typically
>>> used and seems to be creating a barrier for no reason.
>>
>> If you think that it's for no reason, then you probably missed some
>> important point: If you can express the objective function in terms of
>> "DerivativeStructure" parameters, then you get all derivatives for free!
>>
>> Of course, that's not always easy (e.g. if the objective function is the
>> result of a sizeable amount of code).
>>
>>>
>>> It is not clear to me why you can't just let the standard interface to an
>>> optimizer be a function that computes the value and the Jacobian (in the case
>>> of least squares), the gradient (for quasi-Newton methods), and, if you
>>> actually have a full Newton method, also the Hessian.
>>
>> Maybe we will; that's the discussion point I raised in this thread.
>> IIUC, there are cases where it is indeed a barrier to force users
>> into using "DerivativeStructure" even though it does not bring any advantage
>> (like when the gradient and Jacobian are only accessible through finite
>> differences).
>>
>>>
>>> If the user wants to compute the Jacobian (gradient) using finite
>>> differences, they can do it themselves, or wrap the function in a class,
>>> provided by you, that computes finite differences using the desired
>>> algorithm.
>>
>> That's one of the points below: We could assume that a finite difference
>> differentiator is outside the realm of sensible use of "DerivativeStructure"
>> (and we keep the converters) or we figure out what necessary and sufficient
>> features such a differentiator must have to cover all usages of CM (e.g.
>> enabling the derivative-based algorithms to work).
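>>
>> For concreteness, a minimal sketch of such a wrapper (names made up for this
>> example; not an existing Commons Math class) could compute a forward-difference
>> Jacobian while reusing the already-known value f(x):
>>
>> import org.apache.commons.math3.analysis.MultivariateVectorFunction;
>>
>> public class ForwardDifferenceJacobian
>> {
>>     private final MultivariateVectorFunction f;
>>     private final double[] h; // one step size per parameter
>>
>>     public ForwardDifferenceJacobian(MultivariateVectorFunction f, double[] h)
>>     {
>>         this.f = f;
>>         this.h = h.clone();
>>     }
>>
>>     // J[i][j] = (f_i(x + h_j e_j) - f_i(x)) / h_j, reusing fAtX = f(x):
>>     // one extra evaluation of f per optimized parameter.
>>     public double[][] jacobian(double[] x, double[] fAtX)
>>     {
>>         final double[][] jac = new double[fAtX.length][x.length];
>>         for (int j = 0; j < x.length; j++) {
>>             final double[] shifted = x.clone();
>>             shifted[j] += h[j];
>>             final double[] fShifted = f.value(shifted);
>>             for (int i = 0; i < fAtX.length; i++) {
>>                 jac[i][j] = (fShifted[i] - fAtX[i]) / h[j];
>>             }
>>         }
>>         return jac;
>>     }
>> }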
>>
>>>
>>> Also, I can imagine a case where computation of the Jacobian can be sped up
>>> if the function value is known, but not if you have two separate functions
>>> handling the derivatives and the actual function value. For example f^2(x).
>>> You can probably derive some kind of caching scheme, but still.
>>
>> If using the "forward" formula for first-order derivative, then knowing the
>> value of the function spares one function evaluation per optimized
>> parameter.
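>>
>> [With the forward formula, df/dx_i ~= (f(x + h_i e_i) - f(x)) / h_i, so when
>> f(x) is already available only the n shifted evaluations are needed, whereas
>> the symmetric formula costs 2n evaluations.]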
>>
>>>
>>> Maybe I am missing something, but I spent about an hour trying to figure
>>> out how to change my code to adapt to your new framework. Still haven't
>>> figured it out.
>>
>> You are not alone. I've spent much more than an hour, and only came up with
>> questions. ;-)
>>
>>
>> Regards,
>> Gilles
>>
>>>
>>> On Nov 30, 2012, at 11:11 AM, Gilles Sadowski
>>> <[email protected]> wrote:
>>>
>>>> Hello.
>>>>
>>>> Context:
>>>> 1. A user application computes the Jacobian of a multivariate vector
>>>> function (the output of a simulation) using finite differences.
>>>> 2. The covariance matrix is obtained from "AbstractLeastSquaresOptimizer".
>>>> In the new API, the Jacobian is supposed to be "automatically" computed
>>>> from the "MultivariateDifferentiableVectorFunction" objective function.
>>>> 3. The converter from "DifferentiableMultivariateVectorFunction" to
>>>> "MultivariateDifferentiableVectorFunction" (in "FunctionUtils") is
>>>> deprecated.
>>>> 4. A "FiniteDifferencesDifferentiator" operator currently exists but only
>>>> for univariate functions.
>>>> Unless I'm mistaken, a direct extension to multiple variables won't do:
>>>> * because the implementation uses the symmetric formula, which will fail
>>>> in some cases (bounded parameter range), and
>>>> * because of the floating point representation of real values, the
>>>> delta for sampling the function should depend on the magnitude of
>>>> the parameter value around which the sampling is done, whereas the
>>>> "stepSize" is constant in the implementation.
>>>>
>>>> Questions:
>>>> 1. Shouldn't we keep the converters so that users can keep their
>>>> "home-made"
>>>> (first-order) derivative computations?
>>>> [Converters exist for gradient of "DifferentiableMultivariateFunction"
>>>> and Jacobian of "DifferentiableMultivariateVectorFunction".]
>>>> 2. Is it worth creating the multivariate equivalent of the univariate
>>>> "FiniteDifferencesDifferentiator", assuming that higher orders will
>>>> rarely be used because of
>>>> * the loss of accuracy (as stated in the doc), and/or
>>>> * the sometimes prohibitively expensive number of evaluations of the
>>>> objective function? [1]
>>>> 3. As current CM optimization algorithms need only the gradient or
>>>> Jacobian, would it be sufficient to provide only a limited (two-point,
>>>> first-order) finite differences operator (with the possibility to choose
>>>> either the "symmetric", "forward" or "backward" formula for each parameter)?
>>>>
>>>>
>>>> Best regards,
>>>> Gilles
>>>>
>>>> [1] And this cost is somewhat "hidden" (as the "DerivativeStructure" is
>>>> supposed to provide the derivatives for free, which is not true in this
>>>> case).
>>
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]