Re: [math] UnexpectedNegativeIntegerException

Luc Maisonobe Tue, 28 Aug 2012 05:36:56 -0700

Le 26/08/2012 19:42, Phil Steitz a écrit :
> On 8/26/12 12:20 AM, Luc Maisonobe wrote:
>> Le 25/08/2012 23:43, Gilles Sadowski a écrit :
>>> Hello Luc.
>> Hi Gilles,
>>
>>> On Fri, Aug 24, 2012 at 09:31:41AM +0200, Luc Maisonobe wrote:
>>>> Le 24/08/2012 01:35, Gilles Sadowski a écrit :
>>>>> On Thu, Aug 23, 2012 at 12:00:56PM -0700, Phil Steitz wrote:
>>>>>> On 8/23/12 5:37 AM, Luc Maisonobe wrote:
>>>>>>> Le 23/08/2012 13:37, Gilles Sadowski a écrit :
>>>>>>>> On Thu, Aug 23, 2012 at 12:35:18PM +0200, Sébastien Brisard wrote:
>>>>>>>>> Hi Gilles,
>>>>>>>>>
>>>>>>>>> 2012/8/23 Gilles Sadowski <gil...@harfang.homelinux.org>:
>>>>>>>>>> On Thu, Aug 23, 2012 at 10:05:10AM +0200, Sébastien Brisard wrote:
>>>>>>>>>>> Hi Luc,
>>>>>>>>>>>
>>>>>>>>>>> 2012/8/23 Luc Maisonobe <luc.maison...@free.fr>:
>>>>>>>>>>>> Le 23/08/2012 05:16, Sébastien Brisard a écrit :
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>> in MATH-849, I have proposed an implementation of Gamma(x)
>>>>>>>>>>>>> (previously, class Gamma had only logGamma(x)). Gamma(x) is not
>>>>>>>>>>>>> defined for x negative integer. In such instances, I would like to
>>>>>>>>>>>>> throw an exception instead of returning Double.NaN. However, creating
>>>>>>>>>>>>> a new exception in o.a.c.m.exception seems exaggerated, since it's very
>>>>>>>>>>>>> unlikely that this exception should be used elsewhere (or maybe).
>>>>>>>>>>>>> Should I define a nested exception instead [1]?
>>>>>>>>>>>>>
>>>>>>>>>>>>> What do you think of the name "UnexpectedNegativeIntegerException"? It
>>>>>>>>>>>>> does not really match the pattern of already defined exceptions, but I
>>>>>>>>>>>>> can't find a better name.
>>>>>>>>>>>> Don't we already have NotPositiveException?
>>>>>>>>>>>>
>>>>>>>>>>>> Luc
>>>>>>>>>>>>
>>>>>>>>>>> We do, but Gamma is defined for all negative values, except integer ones...
>>>>>>>>>> I think that in some circumstances, it might be useful to not throw
>>>>>>>>>> exceptions...
>>>>>>>>>> FastMath's "log" returns NaN for negative input.
>>>>>>>>>>
>>>>>>>>> then I guess that logGamma(x) should also return NaN if x <= 0?
>>>>>>>> Anyways, that's what it does currently.
>>>>>>>>
>>>>>>>>> I have to say I do not really like this option.
>>>>>>>> So did you intend to change that?
>>>>>>>>
>>>>>>>>> My life would
>>>>>>>>> sometimes be much easier if NaNs didn't exist... the good old days of
>>>>>>>>> the "floating-point error".
>>>>>> We have different memories - I am old enough to remember wasting
>>>>>> real money due to jobs failing on "floating point checks" when I
>>>>>> would have preferred to have computations complete and return NaN
>>>>>> (which had not been invented yet).
>>>>>>
>>>>>>>> IIRC NaN could be useful for example in an optimization algorithm; excerpt
>>>>>>>> from Kahan:
>>>>>>>> ---
>>>>>>>> [...] NaNs is an opportunity ( not obligation ) for software ( especially
>>>>>>>> when searching ) to follow an unexceptional path ( no need for exotic
>>>>>>>> control structures ) to a point where an exceptional event can be appraised
>>>>>>>> after the event, when additional evidence may have accrued. [...]
>>>>>>>> ---
>>>>>>>>
>>>>>>>> I do not say that Commons Math should prefer NaN over throwing exceptions.
>>>>>>> If the choice is allowed, I would really prefer NaN for such cases.
>>>>>> +1
>>>>>>>> Maybe it depends on how high-level an application is (i.e. if there are
>>>>>>>> already calls to methods that could throw exceptions, then an algorithm that
>>>>>>>> would not use try/catch to protect itself would fail anyway).
>>>>>>>> If we want Commons Math to allow taking advantage of NaNs, it would probably
>>>>>>>> need to be updated so that a lot of precondition checks are removed
>>>>>>>> (but this will likely lead to reduced robustness in some applications that
>>>>>>>> do not do their own checks...).
>>>>>>> This would clearly be cumbersome for users.
>>>>>>> Since we have changed our exception hierarchy, we don't have a single
>>>>>>> root anymore, so users simply cannot catch all the exceptions we throw at
>>>>>>> once; they have to check every single type, and find out by themselves
>>>>>>> which ones are thrown, without any help from the compiler.
>>>>>>>
>>>>>>> Just adding new exceptions is too much, we have already gone too far
>>>>>>> this way.
>>>>>> +1 and the fact that all exceptions are unchecked makes it even
>>>>>> harder for client apps as the compiler will not help / force them. 
>>>>> For those new to this list, they can search the archive for lengthy
>>>>> discussions on this subject.
>>>>> The summary is that CM has absolutely no use of checked exceptions.
>>>> Another really important part is that CM no longer has a
>>>> hierarchy of exceptions. It has a bunch of completely unrelated
>>>> exceptions, all extending different standard Java exceptions. This is
>>>> the part that really bothers me, but this is the current consensus.
>>> I would be OK to change it back to the previous state of affairs, that
>>> is, the one where we had agreed on a singly rooted hierarchy with base class
>>> "MathRuntimeException".
>>> The current consensus was reached because you didn't voice the concern you
>>> are now mentioning.
>> I thought I had. Perhaps this feature was set up after I gave up on this
>> discussion.
>>
>>> It would be quite easy to change, if it would make your life easier.
>>>
>>> The more so that I never saw what is gained from copying the Java hierarchy
>>> (in the particular case of the exceptions): Because some exception inherits
>>> from the Java standard one does not bring special benefits to the
>>> application that has to catch that exception. I mean: Is there any piece of
>>> code that would behave differently if it caught "IllegalArgumentException"
>>> vs "IllegalStateException"? If not, it could as well be prepared to catch
>>> a "MathRuntimeException" (and do the same thing).
>>> [The various exception types are primarily there to discriminate between
>>> various _problems_; but are not likely helpful to help the caller devise a
>>> way to react to the exception once it is raised (other than acknowledge the
>>> fact that CM could not perform the requested action).]
>>>
>>> In CM, the vastly overwhelming majority of exceptions are instances of
>>> "MathIllegalArgumentException" or one of its subclasses.
>>>
>>> We have a "NullArgumentException" but we also agreed that it did not have to
>>> be a subclass of the standard Java "NullPointerException". So in this case,
>>> we already depart from the "standard". [But we also speculated that the
>>> policy could be to never check for "null" and let the JVM do that. This
>>> behaviour is _not_ consistent throughout CM.]
>>>
>>> Number of occurrences of CM exceptions that are subclasses of those Java
>>> standard exceptions:
>>>  * IllegalStateException (43)
>>>  * UnsupportedOperationException (22)
>>>  * ArithmeticException (54)
>>>
>>> In summary, I have no problem with a "MathRuntimeException" base class which
>>> "MathIllegalArgumentException", "MathIllegalStateException",
>>> "MathUnsupportedOperationException", "MathArithmeticException" would inherit
>>> from.
>>>
>>> Applications that call CM would be safe (apart from bugs raising "NPE")
>>> with a unique catch clause intercepting "MathRuntimeException".
>> I am happy (and surprised) to read that.
>> I would very much like to go back to a single root exception
>> hierarchy. This helps top-level applications since, depending on context,
>> they can either pinpoint the exception they want to catch or they can
>> have a grab-all strategy. It is their choice.
> 
> I like throwing (and catching) standard exceptions instead of
> inventing variants of them, which is why I favored having MathIAE
> inherit from IAE, etc.  I would have preferred to just throw IAE
> directly, but we could not agree on how to do that and preserve
> localization, so we ended up with the current setup where we have
> custom variants, but they inherit from the standard exceptions.  I
> am curious, Luc, about exactly what kinds of use cases will really
> be easier / better for users if we go back to a single-rooted
> hierarchy.  I get that instead of "catch Exception" or "catch
> RuntimeException" you can at the top level "catch MathRTE" and that
> will catch only the exceptions that come (at least originally) from
> the [math] code.


Yes, but this is only one aspect.

> Can you help me understand via an example how that
> is a big benefit that is worth more than being able to "catch IAE"
> or "catch IOE" directly?

One of the problems I encounter occurs when building large applications
with several component layers. At an intermediate level, say just above
[math], developers know what they are calling and they may decide to
catch an exception they know about, if they are able to identify that it
is thrown (which is not always obvious). They may also decide the exception
cannot be handled at their level and simply let it propagate upward. As
you go upward in the software layers, with different development teams,
this knowledge is lost and people may not know anything about
mathematics. They can however still catch some large-scope exceptions,
one type per component (say a MathRuntimeException and a MylibException
if they know these two sub-components are used). They won't do anything
with the exceptions but nicely display them in the graphical user
interface and stop the application. This works well as long as there is
one single root per library, but it does not scale with 40 different
exceptions per library.
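
A minimal sketch of the layering described above. Everything in it is
hypothetical: MathRuntimeException stands for the proposed single root (it is
not the current [math] hierarchy), and MylibException, lowLevel, intermediate
and topLevel are illustration names only.

```java
final class TopLevelCatchSketch {
    // Hypothetical single root for everything the math layer throws.
    static class MathRuntimeException extends RuntimeException {
        MathRuntimeException(String message) { super(message); }
    }

    // Hypothetical specialized exception, still under the single root.
    static class NotStrictlyPositiveException extends MathRuntimeException {
        NotStrictlyPositiveException(double value) {
            super(value + " is not strictly positive");
        }
    }

    // Hypothetical root for an intermediate library sitting above the math layer.
    static class MylibException extends RuntimeException {
        MylibException(String message) { super(message); }
    }

    // Bottom layer: knows exactly which problem occurred.
    static double lowLevel(double x) {
        if (x <= 0) {
            throw new NotStrictlyPositiveException(x);
        }
        return Math.log(x);
    }

    // Intermediate layer: lets math exceptions propagate, may fail on its own.
    static double intermediate(double x, boolean failInMylib) {
        if (failInMylib) {
            throw new MylibException("mylib failure");
        }
        return 2 * lowLevel(x);
    }

    // Top layer: knows nothing about mathematics, one catch clause per component.
    static String topLevel(double x, boolean failInMylib) {
        try {
            return "result: " + intermediate(x, failInMylib);
        } catch (MathRuntimeException e) {
            return "math error: " + e.getMessage();
        } catch (MylibException e) {
            return "mylib error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(topLevel(1.0, false));
        System.out.println(topLevel(-1.0, false));
        System.out.println(topLevel(1.0, true));
    }
}
```

The top layer needs one catch clause per library, however many specialized
exceptions each library defines underneath its root.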

Another problem is maintenance. Suppose the intermediate
developer did his work really accurately and managed to identify all
exceptions thrown by the methods he calls in one version of Apache
Commons Math. When we change an error detection and decide that a method
that used to throw only MaxCountExceededException should throw
NumberIsTooLargeException instead (or in addition to the existing one),
the calling code would still compile, but the new exception would
now go all the way upward. The two exceptions have no common ancestor
that can be caught, except the generic RuntimeException or Exception. With
a single rooted hierarchy, users can use some defensive programming: they
can catch the common root and be safe when we change some internal details.
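
The maintenance scenario can be sketched as follows. This is an illustration
under assumptions, not [math] code: the two exception classes are local
stand-ins that borrow the names used above, MathRuntimeException is the
hypothetical common root, and the libraryVersion parameter simply simulates a
change of error detection between two releases.

```java
final class DefensiveCatchSketch {
    // Hypothetical common root.
    static class MathRuntimeException extends RuntimeException {
        MathRuntimeException(String message) { super(message); }
    }

    // Local stand-ins for the two exceptions named in the text.
    static class MaxCountExceededException extends MathRuntimeException {
        MaxCountExceededException() { super("max count exceeded"); }
    }

    static class NumberIsTooLargeException extends MathRuntimeException {
        NumberIsTooLargeException() { super("number is too large"); }
    }

    // Simulated library method: release 1 throws one exception,
    // release 2 throws another one for the same failure.
    static double solve(int libraryVersion) {
        if (libraryVersion == 1) {
            throw new MaxCountExceededException();
        }
        throw new NumberIsTooLargeException();
    }

    // Caller written carefully against release 1 only.
    static String pinpointedCaller(int libraryVersion) {
        try {
            return "ok: " + solve(libraryVersion);
        } catch (MaxCountExceededException e) {
            return "handled: " + e.getMessage();
        }
    }

    // Defensive caller: catches the common root, survives the change.
    static String defensiveCaller(int libraryVersion) {
        try {
            return "ok: " + solve(libraryVersion);
        } catch (MathRuntimeException e) {
            return "handled: " + e.getMessage();
        }
    }
}
```

Both callers compile against both releases. Only the defensive one still
handles the failure after the library changes which exception it throws.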

A single root would also bring two things I find useful.

The first useful thing is that the ExceptionContextProvider could be
implemented at the root level, so we could retrieve this context (in
fact, I sometimes even need to retrieve the pattern and the arguments
from the context, and we also miss getters for that, but they are easy
to add). It is not possible to catch ExceptionContextProvider because it
is not a throwable (Throwable is a class, not an interface, so we
inherit the Throwable nature from the top level class, not from
implementing the ExceptionContextProvider interface).
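
A small sketch of that point, under stated assumptions: ContextProvider is a
simplified stand-in for ExceptionContextProvider (here it just returns a
String), and MathRuntimeException is the hypothetical root that would
implement it. None of these signatures are taken from [math].

```java
final class ContextAtRootSketch {
    // Simplified stand-in for the context interface (assumption: String context).
    interface ContextProvider {
        String getContext();
    }

    // Hypothetical root: Throwable nature comes from the class hierarchy,
    // context access comes from the interface.
    static class MathRuntimeException extends RuntimeException implements ContextProvider {
        private final String context;

        MathRuntimeException(String message, String context) {
            super(message);
            this.context = context;
        }

        @Override
        public String getContext() {
            return context;
        }
    }

    static class NotPositiveException extends MathRuntimeException {
        NotPositiveException(double value) {
            super(value + " is not positive", "value=" + value);
        }
    }

    static double checkedSqrt(double x) {
        if (x < 0) {
            throw new NotPositiveException(x);
        }
        return Math.sqrt(x);
    }

    static String describe(double x) {
        try {
            return "sqrt=" + checkedSqrt(x);
        // "catch (ContextProvider e)" would not compile: an interface is not
        // a Throwable. Catching the root class gives access to the context.
        } catch (MathRuntimeException e) {
            return e.getMessage() + " [" + e.getContext() + "]";
        }
    }
}
```

With several unrelated roots, the same retrieval would need one catch clause
(or an instanceof test) per root.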

The second useful thing is for [math] development itself. With a single
root, we can temporarily change its parent class from RuntimeException
to Exception, then fix all missing throws declarations and javadoc, then
put the parent class back before committing. This would help us have more
up-to-date declarations. For now, I am sure we have missed a lot of our
own exceptions and let them propagate upward without anybody knowing it.
As a test, I have just changed the parent for
MathIllegalArgumentException to Exception. I got 1384 compilation
errors. Just going to the first one (a constructor of
BaseAbstractUnivariateIntegrator), I saw we did not advertise the fact
it may throw NumberIsTooSmallException and NotStrictlyPositiveException,
neither in a throws declaration nor in the javadoc. I did not look at
the 1383 other errors...
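
The audit trick can be illustrated on a toy case. This is only a sketch: the
root and the exception below are local, hypothetical classes, and
integratorOrder is an invented method, not the real integrator constructor.
The root temporarily extends Exception, so the compiler rejects any method
that lets the exception escape without declaring it.

```java
final class CheckedRootAuditSketch {
    // During the audit only: the root extends Exception instead of
    // RuntimeException, which makes every subclass a checked exception.
    static class MathRuntimeException extends Exception {
        MathRuntimeException(String message) { super(message); }
    }

    static class NumberIsTooSmallException extends MathRuntimeException {
        NumberIsTooSmallException(int value, int min) {
            super(value + " is smaller than the minimum (" + min + ")");
        }
    }

    // Low-level precondition check.
    static void checkOrder(int order) throws NumberIsTooSmallException {
        if (order < 2) {
            throw new NumberIsTooSmallException(order, 2);
        }
    }

    /**
     * Hypothetical method standing in for a constructor with preconditions.
     * While the root is checked, removing the throws clause below is a
     * compilation error, which is how the missing declarations show up.
     *
     * @throws NumberIsTooSmallException if order is smaller than 2
     */
    static int integratorOrder(int order) throws NumberIsTooSmallException {
        checkOrder(order);
        return order;
    }
}
```

Once every declaration and javadoc entry has been added, the root goes back
to extending RuntimeException and the throws clauses remain as documentation.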

> What I am missing is how knowing that an
> a specific RTE came from within [math] makes a difference.  I am
> skeptical about ever depending on that kind of conclusion because
> dependencies may bring [math] code in at multiple levels.  Also, is
> there an implied assumption in your ideal setup that *no* exceptions
> propagate to [math] clients other than MRTE (i.e. we catch and wrap
> everything)?

No, I don't make this assumption. I consider that at upper levels, code
can receive exceptions from all layers underneath ([math] at the very
bottom, but also other layers in between). With two or three layers, you
can still handle a few library-wide exceptions (see my example with
MathRuntimeException and MylibException above). However, if at one
level the development rules state that all exceptions must be caught and
wrapped (this happens in some critical systems contexts), then a single
root hierarchy helps a lot.

My point is that with a single root, we can get the best of both worlds:
large scope catches and pinpointed catches. The choice remains open for
users. With a multi-rooted hierarchy, we force users to duplicate the
same work for all exceptions we may throw, and we also force them to
recheck everything when we publish a new version, even though we
ourselves fail to document these exceptions accurately.

best regards,
Luc

> 
> Phil
>>
>> For sure, this is something that can be done only for a major release.
>>
>>>>> Client apps cannot do more with checked exceptions, and can be made as
>>>>> "safe" by wrapping calls in try-blocks.
>>>>> On the other hand, client source code is much cleaner without unnecessary
>>>>> "throws" clauses or wrapping of checked exceptions at all levels.
>>>>> Some Java experts go as far as saying that checked exceptions were a
>>>>> language design mistake (never repeated in languages invented more
>>>>> recently).
>>>>>
>>>>>> There is a reason that NaNs exist.  It is much cheaper to return a
>>>>>> NaN than to raise (and force the client to handle) an exception. 
>>>>>> This is not precise and probably can't be made so, but I have always
>>>>>> looked at things more or less like this:
>>>>>>
>>>>>> 0) IAE (which I see no need to specialize as elaborately as we have
>>>>>> done in [math]) is for clear violation of the documented API
>>>>>> contract.  The actual parameters "don't make sense" in the context
>>>>>> of the API.
>>>>> The "elaboration" is actually very basic (but that's a matter of taste), but
>>>>> it was primarily promoted (by me) in order to hide (as much as possible) the
>>>>> ugliness (another matter of taste) of the "LocalizedFormats" enum, and its
>>>>> inconsistent use (duplication). [Cf. discussions in the archive.]
>>>>>
>>>>>> 1) NaN can be returned as the result of a computation which, when
>>>>>> started with legitimate arguments, does not result in a
>>>>>> representable value.
>>>>> According to this description, Sébastien's case _must_ be handled by an
>>>>> exception: the argument is _not_ legitimate.
>>>>> The usage of NaN I was referring to is to let a computation proceed ("follow
>>>>> an unexceptional path") in the hope that the final result might still be
>>>>> meaningful.
>>>>> If the NaN persists, not checking for it and signalling the problem (i.e.
>>>>> raising an exception) is a bug. It is to avoid that (and be robust) that we
>>>>> do extensive precondition checks in CM. But this has the unavoidable
>>>>> drawback that the use of NaN as suggested is much less likely to be feasible
>>>>> when calling CM code. Once realizing that, it becomes much less obvious that
>>>>> there is _any_ advantage of letting NaNs propagate...
>>>>> [Anyone has an example of NaN usage? Please let me know.]
>>>> I use NaN a lot as an indicator that a variable has not been fully
>>>> initialized yet. This occurs for example in iterative algorithms, where
>>>> some result is computed deep inside some loop and we don't know when the
>>>> loop will end. Then I write something along these lines:
>>>>
>>>>   while (Double.isNaN(result)) {
>>>>      // do something that hopefully will change result to non-NaN
>>>>   }
>>>>
>>>>   // now I know result has been computed
>>>>
>>>> Another use is to initialize some fields in a class to values I know are
>>>> not meaningful. I can then use NaN as a marker to do lazy evaluation for
>>>> values that take time to compute and should be computed only when both
>>>> really needed and when everything required for their computation is
>>>> available.
>>> I should have said "[...] example of NaN usage, beyond singling out
>>> uninitialized data [...]". The above makes use of NaN as "invalid" because it
>>> is not initialized (yet).
>> Yes.
>>
>>> I'd assume that if "result" stays NaN after the allowed number of
>>> iterations, you raise an exception, i.e. you don't propagate NaN as the
>>> output of a computation that cannot provide a useful result. However, this
>>> (propagating NaN) is the behaviour of "sqrt(-1)", for example.
>>> Thus, if you raise an exception, your computation does not behave in the
>>> same way as the function "sqrt".
>>>
>>>> Another use is simply to detect some special cases in computations (like
>>>> sqrt(-1) or 0/0). I do the computation first and check the NaN
>>>> afterwards. See for example the detection of NaNs in the linear
>>>> combinations in MathArrays or in the nth order Brent solver.
>>> OK, this is a good example, in line with the intended usage of NaN (as it
>>> avoids inserting control structures in the computation).
>> Yes. One of the main use cases for this is when a computation involves a
>> loop and failure is very rare. So we avoid numerous costly if statements
>> within the loop and do a single check. In the few cases this single
>> check fails, we go to a different branch to handle the failure. This is
>> exactly what is done in linear combination.
>>
>>>> Another use of NaNs occurs when integrating various code components from
>>>> different origins in a single application. Data is forwarded between the
>>>> various components in all directions. Components never share the same
>>>> exceptions mechanisms. Components either process NaNs specially (which
>>>> is good) or they let the processor propagate them (it is what the IEEE
>>>> standard mandates) and at the end you can detect it reliably at
>>>> application level.
>>> I'm not sure I understand this. Is it good or bad that a component lets NaNs
>>> propagate? Are there situations when it's good and others where it's bad?
>> In the cases I encountered, it is always good to have NaNs propagated. A
>> component that is not an application by itself but only a part (low or
>> intermediate level) often cannot decide at its level how to handle NaNs
>> except in rare cases. So it propagates them upward. The previous example
>> (linear combination in [math]) is of course a counter-example: we are at
>> low level, we know how to handle the NaN for this operation, so we
>> detect it and fix it.
>>
>>> That's why I was asking (cf. quote from previous post below) what are the
>>> criteria, so that contributors know how to write code when the feature falls
>>> in one or the other category.
>>>
>>>>>> The problem is that contracts can often be written so that instances
>>>>>> of 1) are turned into instances of 0).  Gamma(-) is a great
>>>>>> example.  The singularities at negative integers could be viewed as
>>>>>> making negative integer arguments "illegal" or "nonsense" from the
>>>>>> API standpoint,
>>>>> They are just nonsense (not just from an API standpoint).
>>>>>
>>>>>> or legitimate arguments for which no well-defined,
>>>>>> representable value can be returned.  Personally, I would prefer to
>>>>>> get NaN back from this function and just point out the existence of
>>>>>> the singularities in the javadoc.
>>>>> This is consistent with how basic math functions behave, but not with the
>>>>> general rule/convention of most of CM code.
>>>>> It may be fine that we have several ways to deal with exceptional
>>>>> conditions, but it might be nice, as with formatting, to have rules so that
>>>>> we know how to write contributions.
>>>> Too many rules are not a solution, especially when there are no tools to
>>>> help enforce these rules. Relying only on human
>>>> proof-reading to enforce them is wishful thinking.
>>>>
>>> What is "too many"? ["How long should a person's legs be?" ;-)]
>>> I don't agree with the "wishful thinking" statement; a "diff" could probably
>>> show a lot of manual corrections to the code and comment formatting. [Mainly
>>> in the sources which I touched at some point...]
>> I'm not sure I understand your point. Mine is that rules that are not
>> backed by automated tools are a pain to enforce, and hence are not
>> fulfilled most of the time, except at a tremendous human resource cost.
>> In fact, even rules which can be associated with tools are broken during
>> development for some time. We do not use
>> checkstyle/CLIRR/findbugs/PMD/RAT for all commits for example, but do a
>> fix pass from time to time.
>>
>>> There are other areas where there is only human control, namely the "svn
>>> log" messages where (no less picky) rules are enforced just because it
>>> helps _humans_ in their change overview task.
>>>
>>> As pointed out by Jared, it's not a big problem to comply with rules once
>>> you know them.
>> I fully agree with that, but I also think Phil is right when he says too
>> many rules may discourage potential contributors. I remember a link he
>> sent to us months ago to a presentation by Michael Meeks about
>> interacting with new developers
>> <http://people.gnome.org/~michael/data/2011-10-13-new-developers.pdf>.
>> Slides number 3 and 4 are a fabulous example. I think we are lucky Jared
>> has this state of mind and accepts picky rules easily. I'm not sure such
>> an open mind is widespread among potential contributors.
>>
>>> Keeping source code tidy is quite helpful, and potential contributors will
>>> be happy that they can read any CM source files and immediately recognize
>>> that they are part of the same library...
>> Yes, of course. But the entry barrier should not be too high.
>>
>> best regards,
>> Luc
>>
>>>
>>> Best regards,
>>> Gilles
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
>>> For additional commands, e-mail: dev-h...@commons.apache.org