This and other RFCs are available on the web at
http://dev.perl.org/rfc/
=head1 TITLE
object neutral error handling via exceptions
=head1 VERSION
Maintainer: Glenn Linderman <[EMAIL PROTECTED]>
Date: 16 Aug 2000
Version: 1
Mailing List: [EMAIL PROTECTED]
Number: 119
=head1 ABSTRACT
Revisit the goals of error handling and exceptions in order to
determine the set of desirable unit operations, rather than starting
with a bundle of features from another language and trying to make it
Perlish.
=head1 DESCRIPTION
There are numerous RFCs regarding a complete bundle of exception
handling mechanisms. Most of them are modeled after some other
language's exception handling mechanism, adapted somewhat to Perl, and
somewhat to the goals of the author. While this is not all bad, as the
problems being faced were faced in the other languages as well, it is
not necessarily all good, either. This RFC examines some of the
incentives behind C++ exceptions, both the structure of the code and
the structure of the exception object, then examines the goals of an
exception mechanism, then examines some techniques that could be used
to reach the goals. The result can be made to look a lot like the C++
exception mechanism if desired, but can be much more powerful when all
its features are used. So this leads to the following "head2"s:
- C++ exceptions
- Goals of exceptions
- Techniques for exceptions
- Results
- C++-like Usage
I focus on C++ rather than Java, because Java (pardon me, Java-heads)
is just an attempt to use the best parts of C++ without all the baggage
of C, so while most of this could have been changed in Java, it wasn't.
This made Java easy to learn for C users who'd read about C++, and for
C++ users. This didn't make Java a significantly better language than
C++, although they were able to remove some of the worst C++ language
traps. To excel, you need to not only remove the worst, but add some
best. I think that's the goal of Perl.
While Graham's Error.pm module is a valiant attempt to include C++-like
exception handling in Perl, it has various deficiencies (discussed by
others) that can be attributed to its being an add-on to Perl, just as
C++ exception handling has various deficiencies because of being an
add-on to C.
=head2 C++ exceptions
Remember that C++ exceptions were built first as a preprocessor for C.
Therefore, the mechanisms used had to exist in C. Stack unwinding
could therefore only be done by using the only non-local goto facility
supported by C: longjmp. This forced a number of decisions about the
design of exceptions, not all of which are good.
I note in passing that ANSI Forth defines catch and throw which use
single cells as parameters... so not all usage of catch and throw is
related to object techniques.
=head3 Keyword try
First, longjmp doesn't work without a setjmp, and setjmp must be called
prior to longjmp. This is the basic justification for try: it calls
setjmp at a point within the scope of the code for which the exception
handling mechanism is to be activated. Some attempts have been made to
justify the use of the try keyword as aiding programmer comprehension
of the scope of the try block, and perhaps it does this in languages
where some code may be in the scope of the exception handling
mechanism, and other code may not be.
It would seem, however, that the best implementation of an exception
handling mechanism would be that all code is in scope of the exception
handling mechanism, so that exceptions cannot be ignored, other than
explicitly. Perl's die is of that flavor: very hard to ignore, except
explicitly.
=head3 Scoping problems
Take the often-cited example of wishing to close a file handle during
the unwind. Here's some C++ for that:
    FILE *handle;
    try {
        handle = fopen ( ... );
        ...
    }
    catch ( ... ) {
        fclose ( handle );
        throw;
    }
Note that "handle" has to be declared outside the scope of the try,
because catch cannot see the scope defined by the try block, and so is
completely unable to clean up anything that is not explicitly hoisted
outside of the try block.
=head3 Control flow problem #1
Here's some icky C error handling code: 3 errors to handle so the
pattern becomes obvious, but I tried to keep them simple (all open
errors to build on the case above)--they can get much more complex in
practice, when the code to deal with an error gets more complex.
    int returned_error;
    FILE *handle1, *handle2, *handle3;
    if ( ! ( handle1 = fopen ( ... ))) {
        return errno;
    }
    if ( ! ( handle2 = fopen ( ... ))) {
        returned_error = errno;
        fclose ( handle1 );
        return returned_error;
    }
    if ( ! ( handle3 = fopen ( ... ))) {
        returned_error = errno;
        fclose ( handle1 );
        fclose ( handle2 );
        return returned_error;
    }
    ...
Here's an icky attempt to reduce the redundant error code:
    int returned_error;
    FILE *handle1, *handle2, *handle3;
    if ( ! ( handle1 = fopen ( ... ))) {
        returned_error = errno;
      handle1_return:
        return returned_error;
    }
    if ( ! ( handle2 = fopen ( ... ))) {
        returned_error = errno;
      handle2_return:
        fclose ( handle1 );
        goto handle1_return;
    }
    if ( ! ( handle3 = fopen ( ... ))) {
        returned_error = errno;
      handle3_return:
        fclose ( handle2 );
        goto handle2_return;
    }
    ...
While this solves the redundancies of the cleanup code, the cleanup
code for handle1 now sits next to the code that attempts to open
handle2, rather than being bundled with the code that opens handle1.
C never claimed to be OO, but even without OO, this is icky.
Translating to C++ doesn't help much. Assuming (not accurately) that
C++'s fopen throws an error if it fails, to simulate proposals that
Perl's open should do exactly that:
    FILE *handle1 = NULL, *handle2 = NULL, *handle3 = NULL;
    try {
        handle1 = fopen ( ... );
        handle2 = fopen ( ... );
        handle3 = fopen ( ... );
        ...
    }
    catch ( ... ) {
        if ( handle1 ) fclose ( handle1 );
        if ( handle2 ) fclose ( handle2 );
        if ( handle3 ) fclose ( handle3 );
        throw;
    }
Some would like this, because it removes all the error handling code
from the control flow, but to support that, the handles must be outside
the try block (as noted in the previous section) so they can be seen by
the catch block, they must be initialized even if never used (not a bad
programming practice, but certainly not needed in the C examples) so
that the catch block doesn't do stupid things, and the code to clean up
a handle is far removed from the code that sets up the handle.
Perl, fortunately, initializes all its variables to undef, so we are
saved from that aspect of C/C++.
=head3 Control flow problem #2
The above examples all dealt with cases where the error is simply
rethrown, using the "catch ( ... )" as a "finally" block per RFC 88, or
a "continue" block per RFC 63. When actually attempting to handle
errors, we discover that any commonality between handling different
errors results in duplicate code (or additional subroutines or gotos):
    FILE *handle1 = NULL, *handle2 = NULL, *handle3 = NULL;
    try {
        handle1 = fopen ( ... );
        handle2 = fopen ( ... );
        handle3 = fopen ( ... );
        ...
    }
    catch ( error_type_1 ) {
        if ( handle1 ) fclose ( handle1 );
        if ( handle2 ) fclose ( handle2 );
        if ( handle3 ) fclose ( handle3 );
        // ... report error type 1, handle it
    }
    catch ( error_type_2 ) {
        if ( handle1 ) fclose ( handle1 );
        if ( handle2 ) fclose ( handle2 );
        if ( handle3 ) fclose ( handle3 );
        // ... report error type 2, handle it
    }
    catch ( ... ) {
        if ( handle1 ) fclose ( handle1 );
        if ( handle2 ) fclose ( handle2 );
        if ( handle3 ) fclose ( handle3 );
        throw;
    }
Or you could:
    void help_clean ( FILE * handle1, FILE * handle2, FILE * handle3 ) {
        if ( handle1 ) fclose ( handle1 );
        if ( handle2 ) fclose ( handle2 );
        if ( handle3 ) fclose ( handle3 );
    }
    FILE *handle1 = NULL, *handle2 = NULL, *handle3 = NULL;
    try {
        handle1 = fopen ( ... );
        handle2 = fopen ( ... );
        handle3 = fopen ( ... );
        ...
    }
    catch ( error_type_1 ) {
        help_clean ( handle1, handle2, handle3 );
        // ... report error type 1, handle it
    }
    catch ( error_type_2 ) {
        help_clean ( handle1, handle2, handle3 );
        // ... report error type 2, handle it
    }
    catch ( ... ) {
        help_clean ( handle1, handle2, handle3 );
        throw;
    }
This removes the error handling code even further from the setup code,
still requires redundancy among the catch phrases, and introduces new
functions dealing only with cleanup. Adding an RFC 88 finally clause
to C++ would help if and only if (if I understand it correctly) the
handles can be closed _at the end_ of the cleanup process. That would
produce:
    FILE *handle1 = NULL, *handle2 = NULL, *handle3 = NULL;
    try {
        handle1 = fopen ( ... );
        handle2 = fopen ( ... );
        handle3 = fopen ( ... );
        ...
    }
    catch ( error_type_1 ) {
        // ... report error type 1, handle it
    }
    catch ( error_type_2 ) {
        // ... report error type 2, handle it
    }
    catch ( ... ) {
        throw;
    }
    finally {
        if ( handle1 ) fclose ( handle1 );
        if ( handle2 ) fclose ( handle2 );
        if ( handle3 ) fclose ( handle3 );
    }
I'm not sure how the "catch ( ... )"'s rethrow would interact with the
finally clause; that seems to be an area of discussion regarding the
differences between RFC 63 and RFC 88.
=head2 Goals of exceptions
This is my list so far; feel free to suggest more.
In the examples thus far, each "fopen" call could independently fail,
but the overall program appears to need to open all three, or none, in
a somewhat atomic manner. While the code to deal with a single fopen
call and the possibility that it fails is straightforward, the
complexity of the situation results from the polynomial explosion of
code and branches resulting from increasing numbers of operations.
This is my justification for the first 6 items on the list.
While I have nothing against OO techniques (I've found C++ OO features
useful for a compiled language), it is somewhat cumbersome to deal with
OO for small projects. Perhaps some of the "make everything an object"
RFCs for Perl6 will sidestep that cumbersomeness, and moot this point.
However, until or unless that is achieved, I'd rather not be forced to
use objects to achieve exception handling. On the other hand, when
building a large system, having an exception object might be helpful.
This is my justification for item 7.
1) Keep the cleanup code near the setup code, to keep it understandable
2) Keep the cleanup code in the same scope as the setup code, to avoid
hoisting variables into higher scopes.
3) Avoid redundancy and complex control flow in the visible cleanup
code paths.
4) Achieve a structured form of non-local goto to allow exiting
multiple levels of subroutine calls without coding tests of error
conditions at every level within the stack.
5) Achieve good default reporting of uncaught exceptions.
6) Make exception handling the default (or only) method of operation
for Perl code
7) Permit use of exception objects, but don't require them.
=head2 Techniques for exceptions
=head3 Technique for goals 1-3
Add a new except clause that can modify a statement or a block:
statement1 except statement2;
Either statement1 or statement2 can be made into a block, in which case
the scoping problems resurface, but often that isn't necessary.
statement1 is executed as normal; then the except clause is executed,
which pushes statement2 onto the stack of cleanup code.
For example (I'll use Perl language examples henceforth):
$handle1 = open ( "<file1" ) except close ( $handle1 );
throw "Error opening file1" if ! defined $handle1;
$handle2 = open ( "<file2" ) except close ( $handle2 );
throw "Error opening file2" if ! defined $handle2;
$handle3 = open ( "<file3" ) except close ( $handle3 );
throw "Error opening file3" if ! defined $handle3;
If you assume that Perl6 open gets enhanced to throw an exception when
it fails, you can simplify this to:
$handle1 = open ( "<file1" ) except close ( $handle1 );
$handle2 = open ( "<file2" ) except close ( $handle2 );
$handle3 = open ( "<file3" ) except close ( $handle3 );
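For comparison, the cleanup-stack behaviour can be roughly emulated in today's C++. This is only a sketch under stated assumptions, not part of the proposal: CleanupStack, open_three and trace_open_three are hypothetical names invented here, strings appended to a log stand in for real open and close calls, and the sketch assumes my reading that the pushed cleanup code runs only when a throw unwinds the scope.

```cpp
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of the RFC): a per-scope stack of cleanup
// actions, standing in for the proposed "except" clause.
class CleanupStack {
public:
    CleanupStack() = default;
    CleanupStack(const CleanupStack&) = delete;
    CleanupStack& operator=(const CleanupStack&) = delete;

    // Register a cleanup action right after its setup step succeeded.
    void push(std::function<void()> action) {
        actions_.push_back(std::move(action));
    }

    // Call on normal exit: nothing needs to be undone.
    void dismiss() { actions_.clear(); }

    // If the scope is left by an exception, dismiss() was never reached,
    // so the registered actions run in reverse order of registration.
    ~CleanupStack() {
        while (!actions_.empty()) {
            actions_.back()();
            actions_.pop_back();
        }
    }

private:
    std::vector<std::function<void()>> actions_;
};

// Pretend to open three files; the one numbered fail_at throws.
void open_three(int fail_at, std::string& log) {
    CleanupStack cleanup;
    for (int i = 1; i <= 3; ++i) {
        const std::string n = std::to_string(i);
        if (i == fail_at) throw std::runtime_error("Error opening file" + n);
        log += "open" + n + " ";
        // The cleanup sits next to the setup, in the same scope.
        cleanup.push([&log, n] { log += "close" + n + " "; });
    }
    cleanup.dismiss();
}

// Convenience wrapper: run open_three and return the recorded steps.
std::string trace_open_three(int fail_at) {
    std::string log;
    try {
        open_three(fail_at, log);
    } catch (const std::runtime_error&) {
        log += "caught";
    }
    return log;
}
```

Calling trace_open_three(3) records the two successful opens, then the two closes in reverse order, then the catch. The explicit dismiss() call is an artifact of the emulation; the except clause proposed here would need no such bookkeeping.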
=head3 Technique for goal 4
Add a new throw clause to achieve a structured non-local goto. The
throw statement takes a list as a parameter, and can be qualified with
the usual conditionals.
So you can
throw "Error opening file1";
or (printf-like throw)
throw "Error opening file %s", "file1";
or (OO throw)
throw new Exception::Error ("Error opening file", "file1");
throw new Exception::Success ("The answer is", 17 );
or (rethrow)
throw; # throws @_
Definition:
OO throw: a throw that throws a single object reference parameter.
Now a non-local goto has to have a target, so that is provided by the
catch statement, which is a sub-like block. There are rules for
finding the appropriate catch statement, listed later. A catch
statement gets a new @_ which is initialized to the list supplied to
the throw. These catch examples all use die to make the errors fatal,
but if die is not used, execution would continue with the next
non-catch statement after the catch statement.
catch { die join ( ", ", @_ ); }
or (printf-like catch)
catch { my ( $msg, @parm ) = @_; die sprintf "$msg\n", @parm; }
or (conditional catch)
catch ( $_[0] =~ /^Error/ ) { die join ( ", ", @_ ); }
or (simple OO catch)
catch { die $_[0]->message; }
or (complex OO catch)
catch Exception::Error { die $_[0]->message; }
catch Exception::Success { print $_[0]->message; exit ( 0 ); }
catch Exception { die "unexpected:" . sprintf $_[0]->message; }
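As a point of comparison, C++ can already throw a value that is not a dedicated exception class. The sketch below mimics the first catch example above by throwing a plain list of strings and joining it in the handler. ThrowList, join_list and throw_list_demo are hypothetical names used only for illustration; the std::vector merely stands in for the @_ that the proposed catch would receive.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the list that a throw hands to catch as @_.
using ThrowList = std::vector<std::string>;

// Join the thrown values with a separator, like Perl's join.
std::string join_list(const ThrowList& items, const std::string& sep) {
    std::string out;
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (i > 0) out += sep;
        out += items[i];
    }
    return out;
}

// Throw a plain list (no exception class hierarchy) and catch it by type.
std::string throw_list_demo() {
    try {
        throw ThrowList{"Error opening file", "file1"};
    } catch (const ThrowList& args) {
        return join_list(args, ", ");
    }
}
```

The handler here still selects on the static type of the thrown value, which is the main difference from the conditional catch proposed in this RFC, where an arbitrary expression over @_ decides.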
What about the rules for determining which catch statement is the
target of a particular throw? A combination of lexical and dynamic
scope rules, which aren't that different from those for C++.
Definition:
appropriate catch statement
case 1: for an OO throw, an appropriate catch statement is one that
lists the class of the reference thrown.
case 2: for other throws, an appropriate catch statement is one that
doesn't list a class name, and either has no expression, or has an
expression which is true when evaluated (with @_ referring to the
parameters thrown).
Catch selection rules:
- If the scope containing the throw contains catch statements, they are
examined in source code order to determine if any are appropriate.
The first appropriate catch statement is used.
- If the scope of the throw contains no appropriate catch statement,
all except clauses for that scope are popped off the cleanup stack
and executed (in reverse order, that's why it is a stack).
- Each lexically larger scope within the sub is examined in like
manner, using the above two rules. There is one exception to this
rule: if the throw is within the scope of a catch statement, the
scope containing that catch statement is not examined to find a
handler for such embedded throws. See example below.
- If the sub contains no appropriate catch statement, the above three
rules are used for each sub scope found on the call stack.
- There are two implicit catch phrases at the end of the outer scope:
catch UNIVERSAL { die "uncaught OO throw: $_[0]"; };
catch { die "uncaught throw: @_"; }
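The first rule and the second implicit catch phrase can be modelled in a few lines. The following C++ sketch covers only case 2 (non-OO throws) within a single scope, and every name in it (Thrown, CatchStmt, select_catch, example_scope) is hypothetical. It is meant to illustrate the "first appropriate catch in source order" rule, not to suggest an implementation.

```cpp
#include <functional>
#include <string>
#include <vector>

// The list handed to throw, which the catch would see as @_.
using Thrown = std::vector<std::string>;

// Hypothetical model of one non-OO catch statement.
struct CatchStmt {
    std::function<bool(const Thrown&)> condition;  // empty: no expression given
    std::string label;                             // stands in for the block body
};

// Examine the catch statements of one scope in source order and return the
// label of the first appropriate one. If none is appropriate, fall back to
// the implicit outermost handler.
std::string select_catch(const std::vector<CatchStmt>& scope, const Thrown& args) {
    for (const CatchStmt& c : scope) {
        if (!c.condition || c.condition(args)) return c.label;
    }
    return "uncaught throw";
}

// A scope with a conditional catch followed by an unconditional one.
std::vector<CatchStmt> example_scope() {
    std::vector<CatchStmt> scope;
    scope.push_back(CatchStmt{
        [](const Thrown& a) { return !a.empty() && a[0].rfind("Error", 0) == 0; },
        "error handler"});
    scope.push_back(CatchStmt{nullptr, "catch-all"});
    return scope;
}
```

With example_scope(), a throw whose first element starts with "Error" selects "error handler", and anything else selects "catch-all". The walk through enclosing scopes and up the call stack, with except clauses run along the way, is left out of this sketch.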
Example for rule three's exception:
    {
        # some code, it, or something it calls, does a "throw 37"
        catch ( $_[0] == 37 ) {
            # the throw is caught here
            {   # somewhere down inside some nested scope, someone does
                throw 38;
            }
            # if there were a "catch ( $_[0] == 38 ) { ... }" here, it would
            # be allowed to catch the throw 38.
        }
        catch ( $_[0] == 38 ) {
            # this should not catch throws from within catch clauses at the
            # same lexical scope.
        }
    }
    catch ( $_[0] == 38 ) {
        # but this one should catch it!
    }
The whole point being that only one catch in a scope should ever
execute until it is complete, and that errors within a catch statement
or block should be caught within that statement or block, or be passed
to outer scopes to be caught there.
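C++ already behaves this way for sibling handlers, which makes for a compact illustration. In the sketch below (nested_throw_demo is a hypothetical name), a throw raised inside one handler is not offered to the other handlers of the same try block; it propagates to the enclosing one.

```cpp
#include <string>

// Returns the name of the handler that finally deals with the value 38.
std::string nested_throw_demo() {
    try {
        try {
            throw 37;
        } catch (int code) {
            if (code == 37) {
                throw 38;  // raised from inside a handler
            }
            return "inner handler";
        } catch (...) {
            // Sibling of the handler above: not considered for the 38.
            return "sibling handler";
        }
    } catch (int code) {
        // The enclosing scope is where the 38 ends up.
        return code == 38 ? "outer handler" : "outer handler, unexpected value";
    }
    return "not reached";
}
```

The function returns "outer handler", matching the "but this one should catch it!" case in the example above.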
=head2 Results
New statements throw & catch, with semantics similar to their
counterparts in other languages.
New clause except which localizes cleanup code near, and potentially in
the same scope as, the corresponding setup code.
Two new implicit catch phrases.
Orthogonality with eval/die for compatibility.
Support for both OO and structured programming.
=head2 C++-like Usage
When putting all the above features together, it is possible to
construct syntax that looks very similar to the equivalent C++ syntax.
This might seem familiar to some, as a way to ease into the more
powerful Perlish syntax proposed here. An example along the lines of
the earlier examples would be:
    // repeat of earlier C++ example
    try {
        handle1 = fopen ( ... );
        handle2 = fopen ( ... );
        handle3 = fopen ( ... );
        ...
    }
    catch ( error_type_1 ) {
        // report error type 1, handle it
    }
    catch ( error_type_2 ) {
        // report error type 2, handle it
    }
    catch ( ... ) {
        throw;
    }
    finally {
        if ( handle1 ) fclose ( handle1 );
        if ( handle2 ) fclose ( handle2 );
        if ( handle3 ) fclose ( handle3 );
    }
Corresponding Perl example using the features of this RFC:
    # Note, try keyword is not needed, because exception handling is always
    # available. The "try" block is only needed to be C++-like in bundling
    # together all the "except" processing (which is better left
    # distributed IMHO).
    {
        $handle1 = open ( ... );
        $handle2 = open ( ... );
        $handle3 = open ( ... );
        ...
    }
    except {
        close ( $handle1 );
        close ( $handle2 );
        close ( $handle3 );
    }
    catch error_type_1 {
        # report error type 1, handle it
    }
    catch error_type_2 {
        # report error type 2, handle it
    }
    catch {
        throw;
    }
It should be noted that the last catch phrase in both examples could be
omitted without semantic change, both in C++ and when using the
facilities of this RFC. It is shown only to indicate how to write a
phrase which catches everything else, and how to code an explicit
rethrow of the same exception.
=head1 IMPLEMENTATION
no clue
=head1 REFERENCES
RFC 63: Exception handling syntax
RFC 88: Structured Exception Handling Mechanism (Try)
CPAN Error.pm