Le 25/08/2012 23:43, Gilles Sadowski a écrit : > Hello Luc. Hi Gilles,
> > On Fri, Aug 24, 2012 at 09:31:41AM +0200, Luc Maisonobe wrote: >> Le 24/08/2012 01:35, Gilles Sadowski a écrit : >>> On Thu, Aug 23, 2012 at 12:00:56PM -0700, Phil Steitz wrote: >>>> On 8/23/12 5:37 AM, Luc Maisonobe wrote: >>>>> Le 23/08/2012 13:37, Gilles Sadowski a écrit : >>>>>> On Thu, Aug 23, 2012 at 12:35:18PM +0200, Sébastien Brisard wrote: >>>>>>> Hi Gilles, >>>>>>> >>>>>>> 2012/8/23 Gilles Sadowski <gil...@harfang.homelinux.org>: >>>>>>>> On Thu, Aug 23, 2012 at 10:05:10AM +0200, Sébastien Brisard wrote: >>>>>>>>> Hi Luc, >>>>>>>>> >>>>>>>>> 2012/8/23 Luc Maisonobe <luc.maison...@free.fr>: >>>>>>>>>> Le 23/08/2012 05:16, Sébastien Brisard a écrit : >>>>>>>>>>> Hi, >>>>>>>>>>> in MATH-849, I have proposed an implementation of Gamma(x) >>>>>>>>>>> (previously, class Gamma had only logGamma(x)). Gamma(x) is not >>>>>>>>>>> defined for x negative integer. In such instances, I would like to >>>>>>>>>>> throw an exception instead of returning Double.NaN. However, >>>>>>>>>>> creating >>>>>>>>>>> a new exception in o.a.c.m.exception seems exagerated, since it's >>>>>>>>>>> very >>>>>>>>>>> unlikely that this exception should be used elsewhere (or maybe). >>>>>>>>>>> Should I define a nested exception instead [1]? >>>>>>>>>>> >>>>>>>>>>> What do you think of the name "UnexpectedNegativeIntegerException"? >>>>>>>>>>> It >>>>>>>>>>> does not really match the pattern of already defined exceptions, >>>>>>>>>>> but I >>>>>>>>>>> can't find a better name. >>>>>>>>>> Don't we already have NotPositiveException? >>>>>>>>>> >>>>>>>>>> Luc >>>>>>>>>> >>>>>>>>> We do, but Gamma is defined for all negative values, except integer >>>>>>>>> ones... >>>>>>>> I think that in some circumstances, it might be useful to not throw >>>>>>>> exceptions... >>>>>>>> FastMath's "log" returns NaN for negative input. >>>>>>>> >>>>>>> then I guess that logGamma(x) should also return NaN if x <= 0? >>>>>> Anyways, that's what it does currently. >>>>>> >>>>>>> I have to say I do not really like this option. >>>>>> So did you intend to change that? >>>>>> >>>>>>> My life would >>>>>>> sometimes be much easier if NaNs didn't exist... the good old days of >>>>>>> the "floating-point error". >>>> >>>> We have different memories - I am old enough to remember wasting >>>> real money due to jobs failing on "floating point checks" when I >>>> would have preferred to have computations complete and return NaN >>>> (which had not been invented yet). >>>> >>>>>> IIRC NaN could be useful for example in an optimization algorithm; >>>>>> excerpt >>>>>> from Kahan: >>>>>> --- >>>>>> [...] NaNs is an opportunity ( not obligation ) for software ( especially >>>>>> when searching ) to follow an unexceptional path ( no need for exotic >>>>>> control structures ) to a point where an exceptional event can be >>>>>> appraised >>>>>> after the event, when additional evidence may have accrued. [...] >>>>>> --- >>>>>> >>>>>> I do not say that Commons Math should prefer NaN over throwing >>>>>> exceptions. >>>>> If the choice is allowed would really prefer NaN for such cases. >>>> >>>> +1 >>>>> >>>>>> Maybe that it depends on how high-level an application is (i.e. if there >>>>>> are >>>>>> already calls to methods that could throw exceptions, then an algorithm >>>>>> that >>>>>> would not use try/catch to protect itself would fail anyway). >>>>>> If we want Commons Math to allow taking advantage of NaNs, it would >>>>>> probably >>>>>> need to be updated so that a lot of precondition checks ought to be >>>>>> removed >>>>>> (but this will likely lead to reduced robustness in some applications >>>>>> that >>>>>> do not do their own checks...). >>>>> This would clearly be cumbersome for users. >>>>> Since we have changed our exception hierarchy, we don't have a single >>>>> root anymore, so users simply cannot catch all exception we throw at >>>>> once, they have to check every single type, and make sure they are >>>>> thrown by themselves without any help from compiler. >>>>> >>>>> Just adding new exceptions is too much, we have already gone too far >>>>> this way. >>>> >>>> +1 and the fact that all exceptions are unchecked makes it even >>>> harder for client apps as the compiler will not help / force them. >>> >>> For those new to this list, they can search the archive for lengthy >>> discussions on this subject. >>> The summary is that CM has absolutely no use of checked exceptions. >> >> Another really important part is that CM doesn't have anymore a >> hierarchy of exception. It has a bunch of completely unrelated >> exceptions, all extending different standard Java exceptions. This is >> the part that really bothers me, but this is the current consensus. > > I would be OK to change it to back to the previous state of affairs, that > is, the one where we had agreed on a singly rooted hierarchy with base class > "MathRuntimeException". > The current consensus was reached because you didn't voice the concern you > are now mentioning. I thought I had. Perhaps this feature was set up after I gave up on this discussion. > > It would be quite easy to change, if it would make your life easier. > > The more so that I never saw what is gained from copying the Java hierachy > (in the particular case of the exceptions): Because some exception inherits > from the Java standard one does not bring special benefits to the > application that has to catch that exception. I mean: Is there any piece of > code that would behave differently if it caught "IllegalArgumentException" > vs "IllegalStateException"? If not, it could as well be prepared to catch > a "MathRuntimeException" (and do the same thing). > [The various exception types are primarily there to discriminate between > various _problems_; but are not likely helpful to help the caller devise a > way to react to the exception once it is raised (other than acknowledge the > fact than CM could not perform the requested action).] > > In CM, the vastly overwhelming majority of exceptions are instances of > "MathIllegalArgumentException" or one of its subclasses. > > We have a "NullArgumentException" but we also agreed that it did not have to > be a subclass of the standard Java "NullPointerException". So in this case, > we already depart from the "standard". [But we also speculated that the > policy could to never check for "null" and let the JVM do that, This behaviour > is _not_ consistent throughout CM.] > > Number of occurrences of CM exceptions that are subclasses of those Java > standard exceptions: > * IllegalStateException (43) > * UnsupportedOperationException (22) > * ArithmeticException (54) > > In summary, I have no problem with a "MathRuntimeException" base class which > "MathIllegalArgumentException", "MathIllegalStateException", > "MathUnsupportedOperationException", "MathArithmeticException" would inherit > from. > > Applications that call CM would be safe (apart from bugs raising "NPE") > with a unique catch clause intercepting "MathRuntimeException". I am happy (and surprised) to read that. I would really much like to go back to a single root exception hierarchy. This both helps top level application as depending on context they can either pinpoint the exception they want to catch or they can have a grab all strategy. It is their choice. For sure, this is something that can be done only for a major release. > >> >>> Client apps cannot do more with checked exceptions, and can be made as >>> "safe" by wrapping calls in try-blocks. >>> On the other hand, client source code is much cleaner without unnecessary >>> "throws" clauses or wrapping of checked expections at all levels. >>> Some Java experts go as far as saying that checked exceptions were a >>> language design mistake (never repeated in languages invented more >>> recently). >>> >>>> There is a reason that NaNs exist. It is much cheaper to return a >>>> NaN than to raise (and force the client to handle) an exception. >>>> This is not precise and probably can't be made so, but I have always >>>> looked at things more or less like this: >>>> >>>> 0) IAE (which I see no need to specialize as elaborately as we have >>>> done in [math]) is for clear violation of the documented API >>>> contract. The actual parameters "don't make sense" in the context >>>> of the API. >>> >>> The "elaboration" is actually very basic (but that's a matter of taste), but >>> it was primarily promoted (by me) in order to hide (as much as possible) the >>> ugliness (another matter of taste) of the "LocalizedFormats" enum, and its >>> inconsequent use (duplication). [Cf. discussions in the archive.] >>> >>>> 1) NaN can be returned as the result of a computation which, when >>>> started with legitimate arguments, does not result in a >>>> representable value. >>> >>> According to this description, Sébastien's case _must_ be handled by an >>> exception: the argument is _not_ legtimate. >>> The usage of NaN I was referring to is to let a computation proceed ("follow >>> an unexceptional path") in the hope that the final result might still be >>> meaningful. >>> If the NaN persists, not checking for it and signalling the problem (i.e. >>> raise an exception) is a bug. This is to avoid that (and be robust) that we >>> do extensive precondition checks in CM. But this has the unavoidable >>> drawback that the use of NaN as suggested is much less likely to be feasible >>> when calling CM code. Once realizing that, it becomes much less obvious that >>> there is _any_ advantage of letting NaNs propagate... >>> [Anyone has an example of NaN usage? Please let me know.] >> >> I use NaN a lot as an indicator that a variable has not been fully >> initialized yet. This occurs for example in iterative algorithms, where >> some result is computed deep inside some loop and we don't know when the >> loop will end. Then I write something along these lines: >> >> while (Double.isNaN(result)) { >> // do something that hopefully will change result to non-NaN >> } >> >> // now I know result has been computed >> >> Another use is to initialize some fields in class to values I know are >> not meaningful. I can then us NaN as a marker to do lazy evaluation for >> values that takes time to compute and should be computed only when both >> really needed and when everything required for their computation is >> available. > > I should have said "[...] example of NaN usage, beyond singling out > unitialized data [...]". The above makes use of NaN as "invalid" because it > is not initialized (yet). Yes. > I'd assume that if "result" stays NaN after the allowed number of > iterations, you raise an exception, i.e. you don't propagate NaN as the > output of a computation that cannot provide a useful result. However, this > (propagating NaN) is the behaviour of "srqt(-1)", for example. > Thus, if you raise an exception, your computation does not behave in the > same way as the function "sqrt". > >> Another use is simply to detect some special cases in computations (like >> sqrt(-1) or 0/0). I do the computation first and check the NaN >> afterwards. See for example the detection of NaNs in the linear >> combinations in MathArrays or in the nth order Brent solver. > > OK, this is a good example, in line with the intended usage of NaN (as it > avoids inserting control structures in the computation). Yes. One of the main use case for this is when a computation involves a loop and failure is very rare. So we avoid costly numerous if statements within the loop and do a single check. In the few cases this single check fails, we go to a diffrent branch to handle the failure. This is exactly what is done in linear combination. > >> >> Another use of NaNs occurs when integrating various code components from >> different origins in a single application. Data is forwarded between the >> various components in all directions. Components never share the same >> exceptions mechanisms. Components either process NaNs specially (which >> is good) or they let the processor propagate them (it is what the IEEE >> standard mandates) and at the end you can detect it reliably at >> application level. > > I'm not sure I understand this. Is it good or bad that a component lets NaNs > propagate? Are there situations when it's good and others where it's bad? In the cases I encountered, it is always good to have NaNs propagated. A component that is not an application by itself but only a part (low or intermediate level) often cannot decide at its level how to handle NaNs except in rare cases. So it propagates them upward. The previous example (linear combination in [math]) is of course a counter-example: we are at low level, we know how to handle the NaN for this operation, so we detect it and fix it. > That's why I was asking (cf. quote from previous post below) what are the > criteria, so that contributors know how to write code when the feature falls > in one or the other category. > >> >>> >>>> The problem is that contracts can often be written so that instances >>>> of 1) are turned into instances of 0). Gamma(-) is a great >>>> example. The singularities at negative integers could be viewed as >>>> making negative integer arguments "illegal" or "nonsense" from the >>>> API standpoint, >>> >>> They are just nonsense (not just from an API standpoint). >>> >>>> or legitimate arguments for which no well-defined, >>>> representable value can be returned. Personally, I would prefer to >>>> get NaN back from this function and just point out the existence of >>>> the singularities in the javadoc. >>> >>> This is consistent with how basic math functions behave, but not with the >>> general rule/convention of most of CM code. >>> It may be fine that we have several ways to deal with exceptional >>> conditions, but it might be nice, as with formatting, to have rules so that >>> we know how to write contributions. >> >> Too many rules are not a solution, especially when there are no tools to >> help enforce these rules are obeyed. Relying only on the fact human >> proof-reading will enforce them is wishful thinking. >> > > What is "too many"? ["How long should a person's legs be?" ;-)] > I don't agree with the "wishful thinking" statement; a "diff" could probably > show a lot a manual corrections to the code and comment formatting. [Mainly > in the sources which I touched at some point...] I'm not sure I understand your point. Mine is that rules that are not backed by automated tools are a pain to enforce, and hence are not fulfilled most of the time, except at a tremendous human resource cost. In fact, even rules which can be associated with tools are broken during development for some time. We do not use checkstyle/CLIRR/findbugs/PMD/RAT for all commits for example, but do a fix pass from time to time. > > There are other areas where there is only human control, namely the "svn > log" messages where (no less picky) rules are enforced just because it > helps _humans_ in their change overview task. > > As pointed out by Jared, it's not a big problem to comply with rules once > you know them. I fully agree with that, but I also think Phil is right when he says too many rules may discourage potential contributors. I remember a link he sent to us months ago about to a presentation by Michael Meeks about interacting with new developers <http://people.gnome.org/~michael/data/2011-10-13-new-developers.pdf>. Slides numers 3 an 4 are a fabulous example. I think we are lucky Jared has this state of mind and accepts picky rules easily. I'm not sure such an open mind is widespread among potential contributors. > Keeping source code tidy is quite helpful, and potential contributors will > be happy that they can read any CM source files and immediately recognize > that they are part of the same library... Yes, of course. But the entry barrier should not be too high. best regards, Luc > > > Best regards, > Gilles > > --------------------------------------------------------------------- > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > > --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org