Hi folks. In several recent RFCs and related discussion, the question of error handling has come up. Specifically, the problem is:
* "return null" conflicts with "but sometimes null is a real value" (the validity of that position is debatable, but common), and provides no useful information as to what went wrong.
* Exceptions are very expensive, the hierarchy is confusing, and handling them properly is a major pain. Failing to handle them properly is very easy since you have no way of knowing what exceptions the code you're calling might throw, or its nested calls, etc. That makes them poorly suited for mundane, predictable error conditions.
* trigger_error() is, well, a mess and not suitable for signaling to a calling function that something recoverable-in-context went wrong.
* And... that's all we've got as options.

I've had an idea kicking around in my head for a while, which I know I've mentioned before. Given the timing, I want to put it out in its current unfinished form to see if there's interest in me bothering to finish it, or if it doesn't have a snowball's chance in hell of happening so it's not worth my time to further develop.

I know I've posted this before, but it's useful for background:

https://peakd.com/hive-168588/@crell/much-ado-about-null
https://joeduffyblog.com/2016/02/07/the-error-model/

From both prior discussions here as well as my understanding of language design trends, it seems the consensus view is that a Result type (aka, an Either monad) is the ideal mechanism for robust error handling. However, it requires generics to be really viable, which we don't have. It's also very clumsy to use in a classic-OOP language (like PHP) without special dedicated syntax. Various languages work around that in various ways. Rust built its whole error system on Result types, and later added the `?` operator to indicate "and if this returns an error result, just return it directly", making delegating error handling vastly easier. Kotlin (via its Arrow library) relies on heavy use of chained tail-closures.
Go has a convention of a "naked either" using two return values, but doesn't have any special syntax for it, leading to famously annoying boilerplate. Python has lightweight exceptions so that throwing them willy-nilly as a control flow tool is actually OK and Pythonic.

However, as noted in the "Error Model" article above, and this is key, a Result type is isomorphic to a *checked* exception. A checked exception is one where a function must explicitly declare what it can throw, and if it throws something else it's the function's error, and a compile time error. It also means any "bubbling" of exceptions has to be explicit at each function step. That's in contrast to unchecked exceptions, as PHP has now, which may be thrown from nearly anywhere and will silently bubble up and crash the program if not otherwise handled.

The key point here is that a happy-path return and an unhappy-but-not-world-ending-path need to be different. Using the return value for both (what returning null does) is simply insufficient.

The "Error Model" article goes into the pros and cons of checked vs unchecked exceptions so I won't belabor the point, except to say that most arguments against checked exceptions are based on Java's very-broken implementation of checked-except-when-it's-not exceptions. But as noted, what Rust and Go do is checked exceptions, aka a Result type, just spelled differently.

The advantage of checked exceptions is that we don't need generics at all, and still get all the benefits. We can also design syntax around them specifically to make them more ergonomic.

I am envisioning something like this:

```
function div(int $n, int $d): float raises ZeroDivisor
{
    if ($d === 0) {
        raise new ZeroDivisor(); // This terminates the function.
    }
    return $n/$d;
}
```

The "raises" declaration specifies a class or interface type that could be "raised". It can be any object; no required Exception hierarchy, no backtrace, just a boring old object value.
Enum if you feel like it, or any other object. We could probably allow union or full DNF types there if we wanted, though I worry that it may lead to too confusing of an API. (To bikeshed later.) Static analysis tools could very easily detect if the code doesn't match up with the declared raises.

This feature already exists in both Midori (the subject of the "Error Model" article) and Swift. So it's not a new invention; in fact it's quite old.

The handling side is where I am still undecided on syntax. Swift uses essentially try-catch blocks, though I fear that would be too verbose in practice and would be confused with existing "heavy" exceptions. Midori did the same.

Various ideas I've pondered in no particular order:

```
// Suck it up and reuse try-catch

function test()
{
    // No declared raise, so if it doesn't handle ZeroDivisor itself, fatal.
    try {
        $val = div(3, 0);
    } catch (ZeroDivisor $e) {
        print "Nope.";
    }
}
```

```
// try-catch spelled differently to avoid confusion with exceptions
try {
    $val = div(3, 0);
} handle (ZeroDivisor $e) {
    print "Nope.";
}
```

```
// Some kind of suffix block, maybe with a specially named variable?

$val = div(3, 0) else { print $err->message; return 0; }
```

```
// A "collapsed" try-catch block.
$val = try div(3, 0)
    catch (ZeroDivisor $e) { print "Nope"; }
    catch (SomethingElse $e) { print "Wat?"; }
```

```
// Similar to Rust's ? operator, to make propagating an error easier.
// The raise here could be the same or wider than what div() raises.
function test(): float raises ZeroDivisor
{
    $val = div(3, 0) reraise;
    // use $val safely knowing it was returned and nothing was raised.
}
```

Or other possibilities I've not considered.

The use cases for a dedicated error channel are many:

* Any variation of "out of bounds": Could be "record not found in database", or "no such array key" or "you tried to get the first item of an empty list", or many other things along those lines.
* Expected input validation errors.
This would cover the URL/URI RFC's complex error messages, without the C-style "inout" parameter.
* Chaining validation. A series of validators that can return true (or just the value being validated) OR raise an object with the failure reason. A wrapping function can collect them all into a single error object to return to indicate all the various validation failures.
* A transformer chain, which does the same as validation but passes on the transformed value and raises on the first erroring transformer.

Exceptions remain as is, for "stop the world" unexpected failures or developer errors (bugs). But mundane errors, where local resolution is both possible and appropriate, get a dedicated channel and syntax with no performance overhead. That also naturally becomes a Python-style "better to beg forgiveness than ask permission" approach to error handling if desired, without all the overhead of exceptions.

So that's what I've got so far. My questions for the audience are:

1. Assuming we could flesh out a comfortable and ergonomic syntax, would you support this feature, or would you reject it out of hand?

2. For engine-devs: Is this even feasible? :-) And if so, anyone want to join me in developing it?

3. And least relevant, I'm very open to suggestions for the syntax, though the main focus right now is question 1 to determine if discussing syntax is even worthwhile.

-- 
  Larry Garfield
  la...@garfieldtech.com