RFC 119 (v1) object neutral error handling via exceptions

Perl6 RFC Librarian Wed, 16 Aug 2000 13:19:54 -0700
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

object neutral error handling via exceptions

=head1 VERSION

  Maintainer: Glenn Linderman <[EMAIL PROTECTED]>
  Date: 16 Aug 2000
  Version: 1
  Mailing List: [EMAIL PROTECTED]
  Number: 119

=head1 ABSTRACT

Revisit what error handling and exceptions are for, to determine the
set of desirable unit operations, rather than starting with a bundle of
stuff from another language and trying to make it Perlish.

=head1 DESCRIPTION

There  are  numerous RFCs  regarding  a  complete  bundle of  exception
handling mechanisms.  Most of them are modeled after some other
language's exception handling mechanism, adapted somewhat to Perl, and
somewhat to the goals of the author.  While this is not all bad, as the
problems being faced  were faced in the other languages  as well, it is
not  necessarily all  good,  either.   This RFC  examines  some of  the
incentives behind  C++ exceptions, both  the structure of the  code and
the structure  of the exception object,  then examines the  goals of an
exception mechanism,  then examines some techniques that  could be used
to reach the goals.  The result can  be made to look a lot like the C++
exception mechanism if desired, but  can be much more powerful when all
its features are used.  So this leads to the following "head2"s:

- C++ exceptions

- Goals of exceptions

- Techniques for exceptions

- Results

- C++-like Usage


I focus on  C++ rather than Java, because  Java (pardon me, Java-heads)
is just an attempt to use the best parts of C++ without all the baggage
of C, so while most of this could have been changed in Java, it wasn't.
This made Java easy to learn for  C users who'd read about C++, and for
C++ users.  This didn't make  Java a significantly better language than
C++, although they  were able to remove some of  the worst C++ language
traps.  To excel,  you need to not only remove the  worst, but add some
best.  I think that's the goal of Perl.

While Graham's Error.pm module is a valiant attempt to include C++-like
exception handling in Perl, it has various deficiencies (discussed by
others) that can be attributed to its being an add-on to Perl, just as
C++ exception handling has various deficiencies because of being an
add-on to C.


=head2 C++ exceptions

Remember that C++ exceptions were  built first as a preprocessor for C.
Therefore,  the mechanisms  used had  to exist  in C.   Stack unwinding
could therefore only be done  by using the only non-local goto facility
supported by C:  longjmp.  This forced a number  of decisions about the
design of exceptions, not all of which are good.

I note  in passing that  ANSI Forth defines  catch and throw  which use
single cells  as parameters... so not  all usage of catch  and throw is
related to object techniques.

=head3 Keyword try

First, longjmp doesn't work without a setjmp, and setjmp must be called
prior to  longjmp.  This is the  basic justification for  try: it calls
setjmp at a point within the  scope of the code for which the exception
handling mechanism is to be activated.  Some attempts have been made to
justify the use  of the try keyword as  aiding programmer comprehension
of the  scope of the try block,  and perhaps it does  this in languages
where  some  code  may  be  in  the scope  of  the  exception  handling
mechanism, and other code may not be.

It would  seem, however, that  the best implementation of  an exception
handling mechanism would be that all  code is in scope of the exception
handling mechanism,  so that exceptions  cannot be ignored,  other than
explicitly.  Perl's die is of  that flavor: very hard to ignore, except
explicitly.

=head3 Scoping problems

Let's take the often-cited example of wishing to close a file handle
during the unwind.  Here's some C++ for that:

  FILE *handle = NULL;
  try {
     handle = fopen ( ... );
     ...
  }
  catch ( ... ) {
    if ( handle ) fclose ( handle );
    throw;
  }

Note that "handle" has to be defined outside the scope of the try,
because catch cannot see the scope defined by the try block.  The catch
block is therefore completely unable to clean up after anything that is
not explicitly hoisted outside of the try block.

=head3 Control flow problem #1

Here's  some icky  C error  handling code:  3 errors  to handle  so the
pattern  becomes obvious, but  I tried  to keep  them simple  (all open
errors to build  on the case above)--they can get  much more complex in
practice, when the code to deal with an error gets more complex.

  int returned_error;
  FILE *handle1, *handle2, *handle3;
  if ( ! ( handle1 = fopen ( ... ))) {
    return errno;
  }
  if ( ! ( handle2 = fopen ( ... ))) {
    returned_error = errno;
    fclose ( handle1 );
    return returned_error;
  }
  if ( ! ( handle3 = fopen ( ... ))) {
    returned_error = errno;
    fclose ( handle1 );
    fclose ( handle2 );
    return returned_error;
  }
  ...

Here's an icky attempt to reduce the redundant error code:

  int returned_error;
  FILE *handle1, *handle2, *handle3;
  if ( ! ( handle1 = fopen ( ... ))) {
    returned_error = errno;
handle1_return:
    return returned_error;
  }
  if ( ! ( handle2 = fopen ( ... ))) {
    returned_error = errno;
handle2_return:
    fclose ( handle1 );
    goto handle1_return;
  }
  if ( ! ( handle3 = fopen ( ... ))) {
    returned_error = errno;
handle3_return:
    fclose ( handle2 );
    goto handle2_return;
  }
  ...

While this solves the redundancies of the cleanup code, the cleanup
code for handle1 sits next to the code that attempts to open handle2,
rather than being bundled with the code that opens handle1.  C never
claimed to be OO, but even without OO, this is icky.

Translating to  C++ doesn't help much.  Assuming  (not accurately) that
C++'s fopen  throws an  error if it  fails, to simulate  proposals that
Perl's open should do exactly that:

  FILE *handle1 = NULL, *handle2 = NULL, *handle3 = NULL;
  try {
    handle1 = fopen ( ... );
    handle2 = fopen ( ... );
    handle3 = fopen ( ... );
    ...
  }
  catch ( ... ) {
    if ( handle1 ) fclose ( handle1 );
    if ( handle2 ) fclose ( handle2 );
    if ( handle3 ) fclose ( handle3 );
    throw;
  }

Some would like this, because it removes all the error handling code
from the control flow.  To support that, however, the handles must be
declared outside the try block (as noted in the previous section) so
they can be seen by the catch block; they must be initialized even if
never used (not a bad programming practice, but certainly not needed in
the C examples) so that the catch block doesn't do stupid things; and
the code to clean up a handle is far removed from the code that sets up
the handle.

Perl, fortunately,  initializes all its  variables to undef, so  we are
saved from that aspect of C/C++.

=head3 Control flow problem #2

The  above examples  all dealt  with cases  where the  error  is simply
rethrown, using the "catch ( ... )" as a "finally" block per RFC 88, or
a  "continue" block  per RFC  63.  When  actually attempting  to handle
errors,  we discover  that any  commonality between  handling different
errors results in duplicate code (or additional subroutines or gotos):

  FILE *handle1 = NULL, *handle2 = NULL, *handle3 = NULL;
  try {
    handle1 = fopen ( ... );
    handle2 = fopen ( ... );
    handle3 = fopen ( ... );
    ...
  }
  catch ( error_type_1 ) {
    if ( handle1 ) fclose ( handle1 );
    if ( handle2 ) fclose ( handle2 );
    if ( handle3 ) fclose ( handle3 );
    // ... report error type 1, handle it
  }
  catch ( error_type_2 ) {
    if ( handle1 ) fclose ( handle1 );
    if ( handle2 ) fclose ( handle2 );
    if ( handle3 ) fclose ( handle3 );
    // ... report error type 2, handle it
  }
  catch ( ... ) {
    if ( handle1 ) fclose ( handle1 );
    if ( handle2 ) fclose ( handle2 );
    if ( handle3 ) fclose ( handle3 );
    throw;
  }

Or you could:

  void help_clean ( FILE * handle1, FILE * handle2, FILE * handle3 ) {
    if ( handle1 ) fclose ( handle1 );
    if ( handle2 ) fclose ( handle2 );
    if ( handle3 ) fclose ( handle3 );
  }

  FILE *handle1 = NULL, *handle2 = NULL, *handle3 = NULL;
  try {
    handle1 = fopen ( ... );
    handle2 = fopen ( ... );
    handle3 = fopen ( ... );
    ...
  }
  catch ( error_type_1 ) {
    help_clean ( handle1, handle2, handle3 );
    // ... report error type 1, handle it
  }
  catch ( error_type_2 ) {
    help_clean ( handle1, handle2, handle3 );
    // ... report error type 2, handle it
  }
  catch ( ... ) {
    help_clean ( handle1, handle2, handle3 );
    throw;
  }

This removes the error handling  code even further from the setup code,
still requires  redundancy among the catch phrases,  and introduces new
functions dealing only with cleanup.  Adding an RFC 88 finally clause
to C++ would help if, and only if (if I understand it correctly), the
handles can be closed _at the end_ of the cleanup process.  That would
produce:

  FILE *handle1 = NULL, *handle2 = NULL, *handle3 = NULL;
  try {
    handle1 = fopen ( ... );
    handle2 = fopen ( ... );
    handle3 = fopen ( ... );
    ...
  }
  catch ( error_type_1 ) {
    // ... report error type 1, handle it
  }
  catch ( error_type_2 ) {
    // ... report error type 2, handle it
  }
  catch ( ... ) {
    throw;
  }
  finally {
    if ( handle1 ) fclose ( handle1 );
    if ( handle2 ) fclose ( handle2 );
    if ( handle3 ) fclose ( handle3 );
  }

I'm not sure how the "catch ( ... )"'s rethrow would interact with the
finally clause; that seems to be an area of discussion regarding the
differences between RFC 63 and RFC 88.

=head2 Goals of exceptions

This is my list so far, feel free to suggest more.

In the examples thus far, each "fopen" call could independently fail,
but the overall program appears to  need to open all three, or none, in
a somewhat atomic  manner.  While the code to deal  with a single fopen
call  and  the  possibility  that  it  fails  is  straightforward,  the
complexity of  the situation results  from the polynomial  explosion of
code  and branches  resulting  from increasing  numbers of  operations.
This is my justification for the first 6 items on the list.

While I have nothing against  OO techniques (I've found C++ OO features
useful for a compiled language), it is somewhat cumbersome to deal with
OO for small projects.  Perhaps some of the "make everything an object"
RFCs for Perl6 will sidestep  that cumbersomeness, and moot this point.
However, until or unless that is  achieved, I'd rather not be forced to
use objects to achieve exception handling.  On the other hand, when
building a large system, having an exception object might be helpful.
This is my justification for item 7.

1) Keep the cleanup code near the setup code, to keep it understandable

2) Keep the cleanup code in the  same scope as the setup code, to avoid
   hoisting variables into higher scopes.

3) Avoid  redundancy and complex  control flow  in the  visible cleanup
   code paths.

4) Achieve  a  structured  form  of  non-local goto  to  allow  exiting
   multiple levels  of subroutine calls  without coding tests  of error
   conditions at every level within the stack.

5) Achieve good default reporting of uncaught exceptions.

6) Make exception  handling the default  (or only) method  of operation
   for Perl code

7) Permit use of exception objects, but don't require them.


=head2 Techniques for exceptions

=head3 Technique for goals 1-3

Add a new except clause that can modify a statement or a block:

  statement1 except statement2;

Either statement1 or statement2 can be made into blocks, with the
result that scoping problems resurface, but often they wouldn't need
to be.

statement1 is executed as normal, and then the except clause is
executed.  Executing the except clause does not run statement2; it
pushes statement2 onto the stack of cleanup code.

For example (I'll use Perl language examples henceforth):

  $handle1 = open ( "<file1" ) except close ( $handle1 );
  throw "Error opening file1" if ! defined $handle1;
  $handle2 = open ( "<file2" ) except close ( $handle2 );
  throw "Error opening file2" if ! defined $handle2;
  $handle3 = open ( "<file3" ) except close ( $handle3 );
  throw "Error opening file3" if ! defined $handle3;

If you assume that Perl6 open  gets enhanced to throw an exception when
it fails, you can simplify this to:

  $handle1 = open ( "<file1" ) except close ( $handle1 );
  $handle2 = open ( "<file2" ) except close ( $handle2 );
  $handle3 = open ( "<file3" ) except close ( $handle3 );


=head3 Technique for goal 4

Add a  new throw  clause to achieve  a structured non-local  goto.  The
throw statement takes a list as  a parameter, and can be qualified with
the usual conditionals.

So you can

   throw "Error opening file1";

or (printf-like throw)

   throw "Error opening file %s", "file1";

or (OO throw)

   throw new Exception::Error ("Error opening file", "file1");
   throw new Exception::Success ("The answer is", 17 );

or (rethrow)

   throw; # throws @_

Definition:

  OO throw: a throw that throws a single object reference parameter.

Now a non-local goto  has to have a target, so that  is provided by the
catch  statement, which  is  a  sub-like block.   There  are rules  for
finding  the  appropriate  catch  statement,  listed  later.   A  catch
statement gets  a new @_ which  is initialized to the  list supplied to
the throw.  These catch examples all  use die to make the errors fatal,
but  if  die  is not  used,  execution  would  continue with  the  next
non-catch statement after the catch statement.

   catch { die join ( ", ", @_ ); }

or (printf-like catch)

   catch { my ( $msg, @parm ) = @_; die sprintf "$msg\n", @parm; }

or (conditional catch)

   catch ( $_[0] =~ /^Error/ ) { die join ( ", ", @_ ); }

or (simple OO catch)

   catch { die $_[0]->message; }

or (complex OO catch)

   catch Exception::Error { die $_[0]->message; }
   catch Exception::Success { print $_[0]->message; exit ( 0 ); }
   catch Exception { die "unexpected:" . sprintf $_[0]->message; }
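
For comparison, throwing a plain list rather than an exception object
can be imitated in C++ as well, since C++ can throw values of any
copyable type.  The following is a sketch only: Thrown, read_config and
run are hypothetical names, and a vector of strings is assumed as the
stand-in for the list that the catch block receives in @_.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Assumption for this sketch: the thrown "list" is a vector of strings.
using Thrown = std::vector<std::string>;

// Hypothetical operation that fails with a message and a parameter.
std::string read_config(bool fail) {
    if (fail) {
        throw Thrown{"Error opening file", "file1"};
    }
    return "ok";
}

// The handler receives the thrown list, much like @_ in the catch blocks
// above, and joins its elements with ", ".
std::string run(bool fail) {
    try {
        return read_config(fail);
    } catch (const Thrown& args) {
        std::string joined;
        for (std::size_t i = 0; i < args.size(); ++i) {
            if (i > 0) joined += ", ";
            joined += args[i];
        }
        return joined;
    }
}
```

Unlike the proposed Perl catch, the C++ handler is selected by the
static type of the thrown value, so a conditional catch would have to
be written as a test inside the handler followed by a rethrow.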

What  about the  rules for  determining  which catch  statement is  the
target of  a particular  throw?  A combination  of lexical  and dynamic
scope rules, which aren't that different from those for C++.

Definition:

  appropriate catch statement

  case 1: for  an OO throw, an appropriate catch  statement is one that
  lists the class of the reference thrown.

  case 2: for other throws,  an appropriate catch statement is one that
  doesn't list  a class name, and  either has no expression,  or has an
  expression which  is true  when evaluated (with  @_ referring  to the
  parameters thrown).

Catch selection rules:

- If the scope containing the throw contains catch statements, they are
  examined in  source code order  to determine if any  are appropriate.
  The first appropriate catch statement is used.

- If the scope of the throw contains no appropriate catch statement,
  all except  clauses for that scope  are popped off  the cleanup stack
  and executed (in reverse order, that's why it is a stack).

- Each  lexically larger  scope  within  the sub  is  examined in  like
  manner, using  the above two rules.   There is one  exception to this
  rule: if  the throw  is within  the scope of  a catch  statement, the
  scope  containing that  catch statement  is  not examined  to find  a
  handler for  such embedded throws.  See example below.

- If the sub  contains no appropriate catch statement,  the above three
  rules are used for each sub scope found on the call stack.

- There are two implicit catch phrases at the end of the outer scope:

  catch UNIVERSAL { die "uncaught OO throw: $_[0]"; };
  catch { die "uncaught throw: @_"; }

Example for rule three's exception:

  {
    # some code, it, or something it calls, does a "throw 37"
    catch ( $_[0] == 37 ) {
      # the throw is caught here
      { # somewhere down inside some nested scope, someone does
        throw 38;
      }
      # if there were a "catch ( $_[0] == 38 ) { ... }" here, it would
      # be allowed to catch the throw 38.
    }
    catch ( $_[0] == 38 ) {
      # this should not  catch throws from within catch  clauses at the
      # same lexical scope.
    }
  }
  catch ( $_[0] == 38 ) {
    # but this one should catch it!
  }

The  whole point  being that  only  one catch  in a  scope should  ever
execute until it is complete,  and that errors within a catch statement
or block should be caught within  that statement or block, or be passed
to  outer scopes  to be  caught there.
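
The search order described by the selection rules can be modeled in a
few lines of C++.  This is a simplified sketch, not a proposed
implementation: CatchClause, Scope and select_catch are hypothetical
names, only non-OO throws (case 2) are modeled, scopes are passed as a
flat list from innermost to outermost, and the exception to rule three
(throws from inside a catch) is left out.

```cpp
#include <functional>
#include <string>
#include <vector>

using Thrown = std::vector<std::string>;   // the list handed to throw

struct CatchClause {
    std::string label;                            // for reporting only
    std::function<bool(const Thrown&)> accepts;   // empty: no expression
};

struct Scope {
    std::vector<CatchClause> catches;   // in source code order
    std::vector<std::string> excepts;   // cleanup labels, in push order
};

// scopes[0] is the scope containing the throw; later entries enclose it.
// Returns the label of the first appropriate catch.  Cleanup labels of
// every scope that had no appropriate catch are appended to cleanup_log
// in the order they would run.
std::string select_catch(const std::vector<Scope>& scopes,
                         const Thrown& thrown,
                         std::vector<std::string>& cleanup_log) {
    for (const Scope& scope : scopes) {
        // Rule 1: examine this scope's catch statements in source order.
        for (const CatchClause& c : scope.catches) {
            if (!c.accepts || c.accepts(thrown)) {
                return c.label;
            }
        }
        // Rule 2: no appropriate catch here, so pop this scope's except
        // clauses off the cleanup stack (reverse order) before moving out.
        for (auto it = scope.excepts.rbegin(); it != scope.excepts.rend(); ++it) {
            cleanup_log.push_back(*it);
        }
    }
    // Last rule: fall back to the implicit catch at the outermost scope.
    return "implicit catch";
}
```

With an inner scope that only catches 37 and an outer scope with an
unconditional catch, a throw of 38 runs the inner scope's except
clauses in reverse order and is then handled by the outer catch.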

=head2 Results

New  statements  throw  &   catch,  with  semantics  similar  to  their
counterparts in other languages.

New clause except which localizes cleanup code near, and potentially in
the same scope as, the corresponding setup code.

Two new implicit catch phrases.

Orthogonality with eval/die for compatibility.

Support for both OO and structured programming.


=head2 C++-like Usage

When  putting  all the  above  features  together,  it is  possible  to
construct syntax that looks very  similar to the equivalent C++ syntax.
This  might seem  familiar to  some, as  a way  to ease  into  the more
powerful Perlish syntax  proposed here.  An example along  the lines of
the earlier examples would be:

Repeat of the earlier C++ example:

  try {
    handle1 = fopen ( ... );
    handle2 = fopen ( ... );
    handle3 = fopen ( ... );
    ...
  }
  catch ( error_type_1 ) {
    // report error type 1, handle it
  }
  catch ( error_type_2 ) {
    // report error type 2, handle it
  }
  catch ( ... ) {
    throw;
  }
  finally {
    if ( handle1 ) fclose ( handle1 );
    if ( handle2 ) fclose ( handle2 );
    if ( handle3 ) fclose ( handle3 );
  }

Corresponding Perl example using the features of this RFC:

# Note, try keyword is not needed, because exception handling is always
# available.  The "try" block is only needed to be C++-like in bundling
# together  all   the  "except"   processing  (which  is   better  left
# distributed IMHO).

  {
    $handle1 = open ( ... );
    $handle2 = open ( ... );
    $handle3 = open ( ... );
    ...
  }
  except {
    close ( $handle1 );
    close ( $handle2 );
    close ( $handle3 );
  }
  catch error_type_1 {
    # report error type 1, handle it
  }
  catch error_type_2 {
    # report error type 2, handle it
  }
  catch {
    throw;
  }


It should be noted that the last catch phrase in both examples could be
omitted without semantic change, both in C++ and when using the
facilities of this RFC.  It is shown only to indicate how to write a
phrase which catches everything else, and how to code an explicit
rethrow of the same exception.

=head1 IMPLEMENTATION

no clue

=head1 REFERENCES

RFC 63: Exception handling syntax

RFC 88: Structured Exception Handling Mechanism (Try)

CPAN Error.pm