Re: [MATH] Interest in large patches for small cleanup / performance changes?

Thomas Neidhart Sun, 10 Nov 2013 09:40:26 -0800

On 11/10/2013 05:32 PM, Phil Steitz wrote:
> On 11/10/13 1:15 AM, Thomas Neidhart wrote:
>> On 11/10/2013 07:03 AM, Phil Steitz wrote:
>>> On 11/9/13 3:27 PM, Thomas Neidhart wrote:
>>>> On 11/09/2013 11:21 PM, Gilles wrote:
>>>>> On Sat, 09 Nov 2013 13:13:05 -0800, Phil Steitz wrote:
>>>>>> On 11/5/13 5:21 AM, Gilles wrote:
>>>>>>>>>> [...]
>>>>>>>>>> I have scanned for exact duplicates quite a few times and never
>>>>>>>>>> found any.  There are quite a few that are similar, but differ in
>>>>>>>>>> material ways (strict versus non-strict inequalities, endpoints
>>>>>>>>>> included / not included, etc.).  Please do not "collapse" messages
>>>>>>>>>> at the expense of loss of specificity or correctness.
>>>>>>>>> FAILED_BRACKETING
>>>>>>>>> UNABLE_TO_BRACKET_OPTIMUM_IN_LINE_SEARCH
>>>>>>>>> INVALID_BRACKETING_PARAMETERS
>>>>>>>> Look at the messages.  These are different.  They convey different
>>>>>>>> information and are appropriate in different contexts.  See below.
>>>>>>> I've argued that context information should be constructed at the
>>>>>>> point where the exception is thrown (where the context is known).
>>>>>>> Not all combinations of exceptions and context need be present in
>>>>>>> the pattern list.
>>>>>>> This is the essence of my proposal below.
>>>>>>>
>>>>>>>>> My position: the error (failed bracketing) should have its own
>>>>>>>>> exception
>>>>>>>>> type. The varying contexts could (do not have to) be part of the
>>>>>>>>> message
>>>>>>>>> built at exception instantiation.
>>>>>>>>>
>>>>>>>>> If we want to include an indication of location (even though it
>>>>>>>>> is already part of the stack trace, so it is _redundant_), we
>>>>>>>>> could perhaps add methods to the "ExceptionContext", e.g.
>>>>>>>>> "where(LocalizeFormats pattern)" (?).
>>>>>>>>> Then, we would have those patterns in the list:
>>>>>>>>>
>>>>>>>>> BRACKETING
>>>>>>>>> LINE_SEARCH
>>>>>>>>>
>>>>>>>>> Note: INVALID and FAILED are redundant since the pattern is
>>>>>>>>> intended to be
>>>>>>>>> included in an exception.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> A second "interesting" case is
>>>>>>>>>
>>>>>>>>> INVALID_ROUNDING_METHOD
>>>>>>>>>
>>>>>>>>> which mixes documentation with error description. Does anyone
>>>>>>>>> really think that the enumeration of the rounding methods in the
>>>>>>>>> error message is necessary or even helpful?
>>>>>>>> When I throw an exception, I want to provide an error message that
>>>>>>>> is meaningful in the context of the caller, i.e., that someone
>>>>>>>> looking at a log or stack trace can make sense of.  That sometimes
>>>>>>>> means restating preconditions, sometimes pointing to boundary
>>>>>>>> conditions, sometimes giving hints describing common causes of the
>>>>>>>> exception - lots of different things that depend on the API, the
>>>>>>>> activation context and the nature of the exception.  The natural way
>>>>>>>> to do this is to use natural language sentences.  Please allow me to
>>>>>>>> retain a straightforward way to construct these messages and to
>>>>>>>> maintain the specificity and meaning of the messages.
>>>>>>> IMHO, the level of details in the message is not needed: if the
>>>>>>> exception
>>>>>>> was thrown, the user should probably look at the documentation,
>>>>>>> rather
>>>>>>> than try another value at random; I'd say that it is harmful to
>>>>>>> tempt the
>>>>>>> users with something like "Pick another number". ;-)
>>>>>>>
>>>>>>> [Shouldn't we rather provide a function where the rounding type is
>>>>>>> an enum?]
>>>>>>>
>>>>>>> The main problem in those discussions is that you consider only "toy"
>>>>>>> situations, where the message generated by Commons Math should
>>>>>>> make sense
>>>>>>> wherever the exception is caught, and even if it is not caught.
>>>>>> What you keep failing to acknowledge is that in many real world
>>>>>> applications, reading exception stack traces and application logs
>>>>>> that contain error messages is an important operational activity.
>>>>>> Having clear error messages that make sense in the context of the
>>>>>> stack trace or application activation context makes the job of those
>>>>>> maintaining and debugging those applications easier.  However hard
>>>>>> we decide to make it, I will continue to provide these.
>>>>> IMO, the real problem is old habits, period. Despite your repeating it
>>>>> over and over, I never expressed anything in the sense of having less
>>>>> information in the error messages. [I don't get what the stack trace has
>>>>> to do here. And I just gave you a real example where whatever details
>>>>> CM tries to provide, it will _never_ be sufficient because it cannot
>>>>> know why the call failed; I suggested that the _same_ amount of
>>>>> (necessary but not sufficient) information could perhaps be provided
>>>>> with "little block" patterns glued with "addMessage" (or an improvement
>>>>> thereof).]
>>>>> Specific exceptions always provide more information than less specific
>>>>> ones. Keeping low-level message (e.g. precondition failure) does not
>>>>> preclude adding more specific messages when the context is known (that
>>>>> happens in the code, and every little variant does not need to be
>>>>> hard-coded in the currently overly long list of patterns).
>>>>> My proposals were solely aimed at making the "preparation" of the
>>>>> messages more efficient from a developer's perspective (e.g. no scanning
>>>>> of 300+ patterns).
>>>>> Stalling the experiment in endless arguments makes it less and less
>>>>> worth trying.
>>>>>
>>>>> All in all, the main argument seems to always be that if the user
>>>>> cannot see the difference, it is not worth changing the design.
>>>> Which is also a pragmatic and valid approach here imho.
>>>>
>>>> If there are no real user complaints about this topic (and I am not
>>>> aware of any) and no other solution will greatly enhance the current
>>>> state, it is really not worth doing it.
>>>>
>>>> Part of my day job is to debug very complex systems and the most
>>>> important thing is that you get what you expect, i.e. according to the
>>>> contract of a method, which btw also includes the method name. Detailed
>>>> error messages are nice to have but not really required (as long as you
>>>> understand the purpose of the code which any user/developer of CM should
>>>> do).
>>>>
>>>> More meaningful error messages would make sense if our targeted audience
>>>> are really end-users but I think this is a bit far-fetched.
>>> Not really.  Does not have to be end users, just someone looking at
>>> an application log that reproduces the messages or the stack traces
>>> themselves.  Sometimes operations / production support teams do not
>>> have access or time to look at source code or javadoc.  Informative
>>> error messages that give more information about the failure can be
>>> useful to these people.  Sure, you can push all of that off to the
>>> client app developers; but it can make *their* lives easier too to
>>> get more information, especially information about parameter values,
>>> which invariants were violated, etc.  The simplest and easiest to
>>> digest way to do that is to provide good error messages.
>> If the contract is violated, then it is a bug and has to go to 3rd level
>> support (aka developer) anyway and this you can see immediately if there
>> is an IllegalArgumentException.
> 
> Not necessarily.  Could be a data or environment problem.
>>
>> For algorithm-related problems, like MaxIterationExceededExceptions or
>> similar ones, can you really express why this happened? I think it is
>> much more robust to take such exceptions into account in the first
>> place, i.e. algorithms may not converge for certain inputs and a client
>> app developer has to present a meaningful error (in the context of the
>> application!) to a user or the log file.
>>
>> I even think that more detailed error messages give people the illusion
>> that they can skip proper exception handling as CM already does it so well.
> 
> No, they just provide more info that can be useful in
> troubleshooting or debugging, the most important of which is usually
> info about application state when the error happened.
>>
>> Just an example: if you try to open a non-existent file with the Java
>> API, do you get the error code of the respective system call on your
>> operating system? No, you get a FileNotFoundException, but what do you
>> do with it? 
>> Would you log the error message contained in the exception
>> or a more specific one in the context of your application?
>>
>> Imho, the first option makes applications very hard to maintain.
> 
> It is not either/or.  Depends on the application use case.  If on
> the other hand you *don't* provide decent error messages, developers
> don't have this choice.
>>
>> Now the Java API is quite easy to understand and a lot of people know it
>> very well, but take the example further for any 3rd party library your
>> application may be using. How is your operation engineer supposed to
>> know all error states of all the used 3rd party libraries and put them
>> in the right context? If he/she is lucky enough there is an operations
>> manual for their own application ...
> 
> Yep, such manuals exist all over the place in the real world.  And
> the first failure data capture (FFDC) in the logs is used both by
> first-level support as well as when issues are escalated.   As Bloch
> points out, when failures are hard to reproduce, the FFDC data is
> all that those working a production issue have to go on.  For this
> reason, "it is critically important that an Exception's toString
> method return as much information as possible about the cause of the
> failure as possible..." [1]
>>
>> Interpreting and proper handling of 3rd party code is the job of the
>> client app developer, and he/she has to do it right. If you may get an
>> exception you have to take care about it, everything else will just
>> create headaches.
> 
> Well, in real-world applications, headaches happen.  Preparing those
> who have to deal with them with good FFDC can make it much easier to
> deal with them. You are right that top notch developers using [math]
> will manage all of this themselves, carefully preserving all context
> data around failures and doing what they need to do to provide
> themselves and whoever else helps support their apps with the info
> they need.  Unfortunately, not all developers consistently do this
> and when failures from lower-level libraries make it into stack
> traces, it really helps for them to provide informative messages. 
> Moving to unchecked exceptions uniformly makes this even more important.


I am somewhat lost, as from my understanding we have all these things in
CM already. There are mainly two classes of exceptions that may occur:

 * invalid input
 * algorithm did not converge

For both we provide in most cases meaningful error messages together
with the exception. In the case of invalid input, Bloch clearly states
that these are programming errors, and if something like this appears in
a production environment you should rather question the development
process than the FFDC policy of 3rd party libraries.
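
To illustrate the invalid-input class, here is a minimal sketch of a
precondition check whose message reports the offending value. The helper
roundToScale and its class are hypothetical, made up for this example;
they are not CM code.

```java
final class PreconditionSketch {

    /**
     * Hypothetical rounding helper (an assumption for illustration,
     * not a Commons Math API). Rounds value to the given number of
     * decimal places.
     */
    static double roundToScale(double value, int scale) {
        // Invalid input is a contract violation: fail fast and state
        // both the violated precondition and the offending value.
        if (scale < 0) {
            throw new IllegalArgumentException(
                "scale must be non-negative, got " + scale);
        }
        double factor = Math.pow(10, scale);
        return Math.round(value * factor) / factor;
    }
}
```

The exception type already says "programming error"; the message only
adds the parameter value that a log reader would otherwise have to guess.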

And for the case of convergence exceptions, I think it is very difficult
to put more meaningful information into the error message than we
already have. It is highly dependent on the context in which an
algorithm is used, thus again, the application developer is in the best
position to provide meaningful error information if something fails, as
he/she knows precisely the context / purpose.

I know the simplex algorithm quite well, and it may happen, depending on
the problem definition, that the SimplexSolver does not find a solution
after x iterations. So what do you put as an error message in such a case?

The constraints may be too tight, the convergence criteria too strict,
... but in the end the solver could not find a solution. The developer
must figure out why (or ask on the mailing list ;-) and adjust the
problem or parameters to the solver in order to make it work.
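
A rough sketch of what I mean by the application adding its own context.
All names here (NotConvergedException, PlanningException, solve,
planRoute) are hypothetical stand-ins invented for the example, not CM
classes; the "solver" is a toy loop that only models running out of
iterations.

```java
import java.util.function.IntPredicate;

final class ConvergenceContextSketch {

    /** Hypothetical stand-in for a library "did not converge" exception. */
    static final class NotConvergedException extends RuntimeException {
        final int maxIterations;

        NotConvergedException(int maxIterations) {
            super("no solution found after " + maxIterations + " iterations");
            this.maxIterations = maxIterations;
        }
    }

    /** Hypothetical application-level exception carrying business context. */
    static final class PlanningException extends RuntimeException {
        PlanningException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Toy solver: returns the iteration at which the test first passes. */
    static int solve(IntPredicate converged, int maxIterations) {
        for (int i = 1; i <= maxIterations; i++) {
            if (converged.test(i)) {
                return i;
            }
        }
        // The library can only report what it knows: the budget ran out.
        throw new NotConvergedException(maxIterations);
    }

    /** Application code: knows which route was planned and what to advise. */
    static int planRoute(String routeId, int maxIterations) {
        try {
            // This toy problem needs 50 iterations to converge.
            return solve(i -> i >= 50, maxIterations);
        } catch (NotConvergedException e) {
            // Keep the library exception as the cause so nothing is lost.
            throw new PlanningException(
                "Could not plan route '" + routeId + "': solver gave up after "
                    + e.maxIterations + " iterations; consider relaxing the"
                    + " constraints or raising the iteration limit", e);
        }
    }
}
```

The library message stays generic, while the wrapping exception names the
route and suggests a remedy; the original exception remains available as
the cause in the stack trace.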

So I agree with all the points stated in Effective Java and I think we
have applied them sufficiently in CM.

There are parts of CM that I have no clue how to use, and this is the
more important issue to tackle imho.

Thomas

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org
