From: Allison Randal <[EMAIL PROTECTED]>
Date: Wed, 5 Apr 2006 15:24:27 -0700

In: docs/pdds/clip/pddXX_exceptions.pod

As with the I/O PDD, this isn't a final form, it's just a draft to seed discussion. What's missing? What's inaccurate? What's accurate for the current state of Parrot, but is something you always intended to write out later? What thoughts have you had on how exceptions should work? All comments, suggestions, and contributions cheerfully welcomed.

Allison

Here's what I hope is a contribution.

-- Bob Rogers
   http://rgrjr.dyndns.org/

------------------------------------------------------------------------
# Copyright: 2001-2006 The Perl Foundation.
# $Id: pddXX_exceptions.pod 12153 2006-04-09 02:23:27Z rgrjr $

=head1 NAME

docs/pdds/clip/pddXX_exceptions.pod - Parrot Exceptions

. . .

=item *

C<push_eh> creates an exception handler and pushes it onto the control stack. It takes a label (the location of the exception handler) as its only argument. [Is this right? Treating exception handlers as label jumps rather than full subroutines is error-prone.]

They are not "jumps" but continuations, so in a sense they are more general than subs, which don't have prior state.

. . .

=item *

C<pushaction> pushes a subroutine object onto the control stack. If the control stack is unwound due to an exception (or C<popmark>, or subroutine return), the subroutine is invoked with an integer argument: C<0> means a normal return; C<1> means an exception has been raised. [Seems like there's lots of room for dangerous collisions here.]

I'm not sure what you mean by "collisions" here, nor why you think they would be dangerous. Arguably, C<pushaction> is too simplistic; it doesn't provide for such things as the repeated exit-and-reenter behavior of coroutines, and there is no mechanism to specify a thunk that gets called when *entering* a dynamic context . . .

=back

=head1 IMPLEMENTATION

[I'm not convinced the control stack is the right way to handle exceptions.
Most of Parrot is based on the continuation-passing style of control, shouldn't exceptions be based on it too? See bug #38850.]

Seems to me there isn't any real choice. Exception handlers are part of the dynamic context, and dynamic contexts nest in such a way as to behave like a stack. Even pure CPS implementations that want to maintain dynamic state have to create an explicit stack in a global variable somewhere.

. . .

Other opcodes respond to an C<errorson> setting to decide whether to throw an exception or return an error value. C<find_global> throws an exception (or returns a Null PMC) if the global name requested doesn't exist. C<find_name> throws an exception (or returns a Null PMC) if the name requested doesn't exist in a lexical, current, global, or built-in namespace.

It's a little odd that so few opcodes throw exceptions (these are the ones that are documented, but a few others throw exceptions internally even though they aren't documented as doing so). It's worth considering either expanding the use of exceptions consistently throughout the opcode set, or eliminating exceptions from the opcode set entirely. The strategy for error handling should be consistent, whatever it is.

[I like the way C<LexPad>s and the C<errorson> settings provide the option for exception-based or non-exception-based implementations, rather than forcing one or the other.]

This have-your-cake-and-eat-it-too (HYCAEIT?) strategy sounds good in theory, but may be dangerous in practice. Which style of error handling a given piece of code uses is a static property of the way the code is written. On the other hand, C<errorson> is dynamic and global. If one of the modules you use wants to do error handling by checking return values, but another module doesn't check returns because it expects errors to be signalled, then no C<errorson> setting will satisfy both, regardless of how you want to design *your* code.

I personally prefer exception-based error handling, since it scales better.
I have been acting on this when the opportunity arises, changing internal_exception calls to real_exception when it makes sense, and when I'm mucking around in that code anyway. (A good example of this is "No exception to pop", come to think of it.) It is also helpful to get a backtrace when something fails.

On the other hand, it would be a pain to have to write 10 ops for an error handler just to catch a slightly unusual situation that could be handled adequately by testing a special return value. I think each case needs to be examined individually, but it's a choice of return value OR throwing an error. IMHO.

=head2 Excerpt

[Excerpt from "Perl 6 and Parrot Essentials" to seed discussion. Out-of-date in some ways, and in others it was simply speculative.]

Exceptions provide a way of calling a piece of code outside the normal flow of control. They are mainly used for error reporting or cleanup tasks, but sometimes exceptions are just a funny way to branch from one code location to another one.

Exceptions are objects that hold all the information needed to handle the exception: the error message, the severity and type of the error, etc. The class of an exception object indicates the kind of exception it is.

Exception handlers are derived from continuations. They are ordinary subroutines that follow the Parrot calling conventions, but are never explicitly called from within user code.

Not quite true; a Continuation is not a Sub, though it can be invoked like one.

User code pushes an exception handler onto the control stack with the C<push_eh> opcode. The system calls the installed exception handler only when an exception is thrown.

    push_eh _handler                # push handler on control stack
    find_global P10, "none"         # may throw exception
    clear_eh                        # pop the handler off the stack
    ...

    _handler:                       # if not, execution continues here
    get_params '(0,0)', P0, S0      # handler is called with (exception, message)
    ...

If the global variable is found, the next statement (C<clear_eh>) pops the exception handler off the control stack and normal execution continues. If the C<find_global> call doesn't find C<none> it throws an exception by passing an exception object to the exception handler.

The first exception handler in the control stack sees every exception thrown.

This is really the last (topmost) exception handler.

The handler has to examine the exception object and decide whether it can handle it (or discard it) or whether it should C<rethrow> the exception to pass it along to an exception handler deeper in the stack. The C<rethrow> opcode is only valid in exception handlers. It pushes the exception object back onto the control stack so Parrot knows to search for the next exception handler in the stack.

This is not correct; exception objects are never pushed onto the control stack. And the exception handler itself is popped off the control stack before it is invoked.

The process continues until some exception handler deals with the exception and returns normally, or until there are no more exception handlers on the control stack. When the system finds no installed exception handlers it defaults to a final action, which normally means it prints an appropriate message and terminates the program.

Currently it also prints a backtrace, which is really nice. Alas, the backtrace is only from the point of the final rethrow by the oldest (bottommost) exception handler. This is the greatest weakness with the current Parrot exception-handling design: By the time you find out that a given exception is unhandled, the dynamic environment of the C<throw> has been destroyed by the very process of searching for a willing handler. This makes it extremely difficult to write a debugger that can do anything useful about uncaught exceptions.

When the system installs an exception handler, it creates a return continuation with a snapshot of the current interpreter context.

This is confusing; I assume you are talking about the Exception_Handler itself and not a RetContinuation.

If the exception handler just returns (that is, if the exception is cleanly caught) the return continuation restores the control stack back to its state when the exception handler was called, cleaning up the exception handler and any other changes that were made in the process of handling the exception.

Hmm. It seems that an exception is "cleanly caught" only if it is not rethrown. It is therefore not possible to tell by looking at the exception itself whether or not it is "cleanly caught" or if it is still in the process of being signalled.

Exceptions thrown by standard Parrot opcodes (like the one thrown by C<find_global> above or by the C<throw> opcode) are always resumable, so when the exception handler function returns normally it continues execution at the opcode immediately after the one that threw the exception. Other exceptions at the run-loop level are also generally resumable.

You seem to want to say that unhandled exceptions are ignored. Is that correct? If so, I see several problems:

1. What is "the exception handler function" and how is it distinguished from the function that established the exception handler? [It sounds like you are expecting the exception handler to behave more like a closure than a continuation . . . ]

2. The previous paragraph says that if "the exception handler just returns", that means that "the exception is cleanly caught". Unless you want to propose a new mechanism, the only way a handler can decline to handle an exception is by rethrowing it, which precludes the possibility of resuming.

3. Shouldn't unhandled exceptions either enter the debugger if interactive, else die? Ignoring the fact that an opcode failed, like ignoring the fact that anything else failed, seems dangerous . . .

    new P10, Exception              # create new Exception object
    set P10["_message"], "I die"    # set message attribute
    throw P10                       # throw it

Exceptions are designed to work with the Parrot calling conventions. Since the return addresses of C<bsr> subroutine calls and exception handlers are both pushed onto the control stack, it's generally a bad idea to combine the two.

How about replacing this with the following:

. . . exception handlers are both pushed onto the control stack, care must be taken to nest them properly, i.e. by removing error handlers established after C<bsr> before the corresponding C<ret>.

After all, it works as long as the user plays by the rules.
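
The nesting rule proposed above can be sketched with a toy model. This is Python, not Parrot code, and every name in it (ControlStack, ControlStackError, the method names borrowed from the opcodes) is hypothetical. It assumes only what the text says: C<bsr> return addresses and exception handlers share one stack. What the real interpreter does when C<ret> finds a handler on top of the stack is not claimed here; the model simply raises an error at that point.

```python
class ControlStackError(Exception):
    """Raised when the toy control stack is used out of order."""


class ControlStack:
    """Toy stand-in for a stack shared by bsr/ret and exception handlers."""

    def __init__(self):
        self.entries = []  # list of (kind, payload) tuples, top is last

    def bsr(self, return_address):
        # A bsr call records where the matching ret should resume.
        self.entries.append(("ret_addr", return_address))

    def push_eh(self, label):
        # Installing a handler pushes an entry onto the same stack.
        self.entries.append(("handler", label))

    def clear_eh(self):
        # Removing a handler only works if a handler is on top.
        if not self.entries or self.entries[-1][0] != "handler":
            raise ControlStackError("no exception handler on top of stack")
        self.entries.pop()

    def ret(self):
        # Assumption of this model: ret insists on a return address on top.
        if not self.entries or self.entries[-1][0] != "ret_addr":
            raise ControlStackError("no return address on top of stack")
        return self.entries.pop()[1]


def properly_nested():
    """Handler established after bsr is removed before the matching ret."""
    stack = ControlStack()
    stack.bsr(100)
    stack.push_eh("_handler")
    stack.clear_eh()
    return stack.ret()


def improperly_nested():
    """Handler is still on the stack when ret runs."""
    stack = ControlStack()
    stack.bsr(100)
    stack.push_eh("_handler")
    return stack.ret()
```

In this model properly_nested() hands back the recorded return address, while improperly_nested() raises ControlStackError because the handler entry is still sitting above the return address when ret runs.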