Re: [math] major problem with new released version 3.1

Luc Maisonobe Sat, 29 Dec 2012 04:17:30 -0800

Le 29/12/2012 12:34, Gilles Sadowski a écrit :
> On Sat, Dec 29, 2012 at 10:43:11AM +0100, Luc Maisonobe wrote:
>> Le 29/12/2012 10:08, Luc Maisonobe a écrit :
>>> Le 29/12/2012 04:09, Gilles Sadowski a écrit :
>>>> Hi.
>>>
>>> Hi Gilles,
>>>
>>>>
>>>>> [...]
>>>>
>>>>>> third is just bug-fix.  Does the fix for the issue that started this
>>>>>> thread change the API?  If so, we would need to cut 3.2 in any case.
>>>>
>>>> The current fix does change the usage (one needs to call optimize with an
>>>> argument of a different type). IIUC, it also silently removes the handling
>>>> of uncorrelated observations.
>>>>
>>>>> Yes, this fixes the issue. I have created/resolved the issue (MATH-924)
>>>>> and committed the fix as of r1426616.
>>>>>
>>>>> Could someone please review what I have done?
>>>>
>>>> I don't like the fix...
>>>
>>> Thanks for reviewing.
>>>
>>>>
>>>> Handling weighted observations must take correlations into account, i.e. 
>>>> use
>>>> a _matrix_.
>>>> There is the _practical_ problem of memory. Solving it correctly is by 
>>>> using
>>>> a sparse implementation (and this is actually an implementation _detail_).
>>>
>>> Yes.
>>>
>>>> If we _need_ such an implementation to solve the practical problem, I
>>>> strongly suggest that we focus on creating it (or fixing what CM already
>>>> has, or accept that some inconsistency will be present), rather than
>>>> reducing the code applicability (i.e. allowing only uncorrelated data).
>>>> If the observations are not correlated, the matrix is a diagonal matrix, 
>>>> not
>>>> a vector.
>>>
>>> It's fine with me. I simply thought it wouldn't be that easy. You proved
>>> me wrong.
>>>
>>>> CM also lacks implementations for symmetric, triangular, and diagonal 
>>>> matrix,
>>>> which all would go some way to solving several practical problems of the 
>>>> same
>>>> kind as the current one without sacrificing generality.
>>>
>>> Yes, we have known that for years.
>>>
>>>>
>>>> Now, and several years ago, it was noticed that CM does not _have_ to
>>>> provide the "weights" feature because users can handle this before they
>>>> call CM code. [IIRC, no CM unit test actually use weights different from 
>>>> 1.]
>>>> IMO, the other valid option is thus to have a simpler version of the
>>>> algorithm, but still a correct one.
>>>> This would also have the advantage that we won't have the urgent need to
>>>> keep the sparse matrix implementation.
>>>> [Then, if at some point we include helper code to handle weights
>>>> (_including_ correlations), we should also do it in a "preprocessing" step,
>>>> without touching the optimization algorithms.]
>>>
>>> So what do you suggest: keep the current support (with proper handling
>>> as you did) or drop it?
>>>
>>> Since several people asked for removing it (Dimitri, Konstantin and now
>>> you), we can do that. Unitl now, this feature was a convenience that was
>>> really useful for some cases, and it was simple. There were some errors
>>> in it and Dimitri solved them in 2010, but no other problems appeared
>>> since them, so it made sense simply keeping it as is. Now we are hit by
>>> a second problem, and it seems it opens a can of worm. Dropping it
>>> completely as a not so useful convenience which is tricky to set up
>>> right would be fair.
>>>
>>>>
>>>>> I also think (but did not test it), that there may be a problem with
>>>>> missing OptimizationData. If someone call the optimizer but forget to
>>>>> set the weight (or the target, or some other mandatory parameters), then
>>>>> I'm not sure we fail with an appropriate error. Looking for example at
>>>>> the private checkParameters method in the MultivariateVectorOptimizer
>>>>> abstract class, I guess we may have a NullPointer Exception if either
>>>>> Target or Weight/NonCorrelatedWeight has not been parsed in the
>>>>> parseOptimizationData method. Could someone confirm this?
>>>>
>>>> Yes.
>>>> And this (not checking for missing data) _could_ be considered a feature, 
>>>> as
>>>> I stressed several times on this ML, and in the code documentation
>>>> (eliciting zero constructive comment).
>>>> We also _agreed_ that users not passing needed data will result in NPE.
>>>> [I imagined that applications would check that valid and complete input is
>>>> passed to the lower level "optimize(OptimizationData ...)" methods.]
>>>
>>> Yes, abd I agree with that. However, I found the javadoc to be
>>> ambiguous. It says "The following data will be looked for:" followed by
>>> a list. There is no distinction between optional and required
>>> parameters. As an example, simple bounds are optional whereas initial
>>> guess or weight are required, but there is nothing to tell it to user.
>>> So in this case, either we should provide proper exception or proper
>>> documentation. I am OK with both.
>>>
>>>>
>>>> It is however straightforward to add a "checkParameters()" method that 
>>>> would
>>>> raise more specific exceptions. [Although that would contradict the
>>>> conclusion of the previous discussion about NPE in CM. And restart it,
>>>> without even getting a chance to go forward with what had been decided!]
>>>
>>> As long as we identify the parameters that are optional (and hence user
>>> can deduce the other one are mandatory and will raise an NPE), this
>>> would be fine. I don't ask to restart this tiring discussion, just make
>>> sure users have the proper way to understand why they get an NPE when
>>> the forget weight and why they don't get one when the forget simple bounds.
>>>
>>> Also weight should really be optional and have a fallback to identity
>>> matrix, but this is another story.
>>>
>>>>
>>>>
>>>> Hence, to summarize:
>>>>  * The fix, in a 3.2 release, should be to replace the matrix 
>>>> implementation
>>>>    with one that does not exhaust the memory, e.g. "OpenMapRealMatrix"[1] 
>>>> or
>>>>    "DiagonalMatrix" (see my patch for MATH-924), but not change the API.
>>>
>>> +1
>>
>> I forgot to ask: do you want me to first revert my change before you
>> commit yours
> 
> Yes, please.


Done as of r1426751.

> 
>> or will you do both at the same time?
> 
> I wouldn't know how to do that.
> What is the svn command to revert a list a files to their state in a given
> revision?

I don't know. I now use git-svn to connect to our subversion repository
while using git commands locally.

Luc

> 
> Thanks,
> Gilles
> 
>>
>> Luc
>>
>>>
>>>>  * We must decide wether to deprecate the weights feature in that release
>>>>    (and thus remove it in 4.0) or to keep it in its general form (and thus
>>>>    un-deprecate "OpenMapRealMatrix"[2]).
>>>
>>> +1 to deprecate.
>>>
>>> best regards
>>> Luc
>>>
>>>>
>>>>
>>>> Best regards,
>>>> Gilles
>>>>
>>>> [1] The inconsistencies that led to the deprecation will have no bearing
>>>>     on usage within the optimizers.
>>>> [2] The latter option seems likely to entail a fair amount of work to
>>>>     improve the performance of "OpenMapRealMatrix" (which is quite poor
>>>>     for some operations, e.g. "multiply").
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
>>>> For additional commands, e-mail: dev-h...@commons.apache.org
>>>>
>>>>
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
>>> For additional commands, e-mail: dev-h...@commons.apache.org
>>>
>>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
>> For additional commands, e-mail: dev-h...@commons.apache.org
>>
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
> 
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org

Re: [math] major problem with new released version 3.1

Reply via email to