Hi folks.  In several recent RFCs and related discussion, the question of error 
handling has come up.  Specifically, the problem is:

* "return null" conflicts with "but sometimes null is a real value" (the 
validity of that position is debatable, but common), and provides no useful 
information as to what went wrong.
* Exceptions are very expensive, the hierarchy is confusing, and handling them 
properly is a major pain.  Failing to handle them properly is very easy since 
you have no way of knowing what exceptions the code you're calling might throw, 
or its nested calls, etc.  That makes them poorly suited for mundane, 
predictable error conditions.
* trigger_error() is, well, a mess and not suitable for signaling to a calling 
function that something recoverable-in-context went wrong.
* And... that's all we've got as options.

I've had an idea kicking around in my head for a while, which I know I've 
mentioned before.  Given the timing, I want to put it out in its current 
unfinished form to see if there's interest in me bothering to finish it, or if 
it doesn't have a snowball's chance in hell of happening so it's not worth my 
time to further develop.

I know I've posted these before, but they're useful for background:

https://peakd.com/hive-168588/@crell/much-ado-about-null
https://joeduffyblog.com/2016/02/07/the-error-model/

From both prior discussions here as well as my understanding of language 
design trends, it seems the consensus view is that a Result type (aka, an 
Either monad) is the ideal mechanism for robust error handling.  However, it 
requires generics to be really viable, which we don't have.  It's also very 
clumsy to use in a classic-OOP language (like PHP) without special dedicated 
syntax.
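
To make that clumsiness concrete, here is a minimal sketch of what a hand-rolled Result type looks like in today's PHP (8.1+ assumed).  The `Result` class, its `ok()`/`err()` constructors, and the `ZeroDivisor` and `div()` names are all hypothetical, chosen only for illustration.  Without generics, the success payload can only be typed as `mixed`, so the caller learns nothing from the signature:

```php
<?php
declare(strict_types=1);

// Hypothetical error value: a plain object, no Exception base class.
final class ZeroDivisor {}

// Hypothetical hand-rolled Result. Without generics, the payload is "mixed".
final class Result
{
    private function __construct(
        public readonly bool $ok,
        public readonly mixed $value,
        public readonly ?object $error,
    ) {}

    public static function ok(mixed $value): self
    {
        return new self(true, $value, null);
    }

    public static function err(object $error): self
    {
        return new self(false, null, $error);
    }
}

// The signature says "Result", but not "Result of what, failing with what".
function div(int $n, int $d): Result
{
    if ($d === 0) {
        return Result::err(new ZeroDivisor());
    }
    return Result::ok($n / $d);
}

// Every call site has to unwrap by hand.
$result = div(3, 0);
if (!$result->ok) {
    print "Nope.\n";
} else {
    print $result->value . "\n";
}
```

Nothing stops a caller from reading `$result->value` without checking `$result->ok` first, which is the kind of gap dedicated syntax would close.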

Various languages work around that in various ways.  Rust built its whole error 
system on Result types, and later added the `?` operator to indicate "and if 
this returns an error result, just return it directly", making delegating error 
handling vastly easier.  Kotlin (via its Arrow library) relies on heavy use of 
chained tail-closures.  Go has a convention of a "naked either" using two 
return values, but doesn't have any special syntax for it, leading to famously 
annoying boilerplate.  Python has lightweight exceptions so that throwing them 
willy-nilly as a control flow tool is actually OK and Pythonic.
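
For comparison, the Go-style "naked either" can be approximated in today's PHP with a two-element array and destructuring.  This is only a sketch under assumed names (`ZeroDivisor`, `div()`, `average()` are hypothetical), but it shows where the boilerplate comes from: every caller has to check and forward the error by hand.

```php
<?php
declare(strict_types=1);

// Hypothetical error value; any object would do.
final class ZeroDivisor {}

/** @return array{0: int|float|null, 1: ?ZeroDivisor} */
function div(int $n, int $d): array
{
    if ($d === 0) {
        return [null, new ZeroDivisor()];
    }
    return [$n / $d, null];
}

/** @return array{0: ?float, 1: ?ZeroDivisor} */
function average(int $total, int $count): array
{
    [$avg, $err] = div($total, $count);
    if ($err !== null) {
        return [null, $err];  // Manual propagation, repeated at every call site.
    }
    return [round($avg, 2), null];
}

[$value, $err] = average(10, 0);
print $err === null ? "$value\n" : "Nope.\n";
```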

However, as noted in the "Error Model" article above, and this is key, a Result 
type is isomorphic to a *checked* exception.  A checked exception is one where 
a function must explicitly declare what it can throw; if it throws anything 
else, that is the function's own bug and a compile-time error.  It also 
means any "bubbling" of exceptions has to be explicit at each function step.  
That's in contrast to unchecked exceptions, as PHP has now, which may be thrown 
from nearly anywhere and will silently bubble up and crash the program if not 
otherwise handled.

The key point here is that a happy-path return and an 
unhappy-but-not-world-ending-path need to be different.  Using the return value 
for both (what returning null does) is simply insufficient.  

The "Error Model" article goes into the pros and cons of checked vs unchecked 
exceptions so I won't belabor the point, except to say that most arguments 
against checked exceptions are based on Java's very-broken implementation of 
checked-except-when-it's-not exceptions.  But as noted, what Rust and Go do is 
checked exceptions, aka a Result type, just spelled differently.  The advantage 
of checked exceptions is that we don't need generics at all, and still get all 
the benefits.  We can also design syntax around them specifically to make them 
more ergonomic.

I am envisioning something like this:

```
function div(int $n, int $d): float raises ZeroDivisor
{
  if ($d === 0) {
    raise new ZeroDivisor();  // This terminates the function.
  }
  return $n/$d;
}
```

The "raises" declaration specifies a class or interface type that could be 
"raised".  It can be any object; no required Exception hierarchy, no backtrace, 
just a boring old object value.  Enum if you feel like it, or any other object.
We could probably allow union or full DNF types there if we wanted, though I 
worry that it may lead to too confusing an API. (To bikeshed later.)  Static 
analysis tools could very easily detect if the code doesn't match up with the 
declared raises.

This feature already exists in both Midori (the subject of the "Error Model" 
article) and Swift.  So it's not a new invention; in fact it's quite old.

The handling side is where I am still undecided on syntax.  Swift uses 
essentially try-catch blocks, though I fear that would be too verbose in 
practice and would be confused with existing "heavy" exceptions.  Midori did 
the same.  

Various ideas I've pondered in no particular order:

```
// Suck it up and reuse try-catch

// No declared raise, so if it doesn't handle ZeroDivisor itself, fatal.
function test() {
  try {
    $val = div(3, 0);
  } catch (ZeroDivisor $e) {
    print "Nope.";
  }
}
```

```
// try-catch spelled differently to avoid confusion with exceptions
try {
  $val = div(3, 0);
} handle (ZeroDivisor $e) {
  print "Nope.";
}
```

```
// Some kind of suffix block, maybe with a specially named variable?

$val = div(3, 0) else { print $err->message; return 0; }
```

```
// A "collapsed" try-catch block.
$val = try div(3, 0) 
  catch (ZeroDivisor $e) { print "Nope"; }
  catch (SomethingElse $e) { print "Wat?"; }
```

```
// Similar to Rust's ? operator, to make propagating an error easier.

// The raise here could be the same or wider than what div() raises.
function test(): float raises ZeroDivisor {
  $val = div(3, 0) reraise;
  // use $val safely knowing it was returned and nothing was raised.
}
```

Or other possibilities I've not considered.

The use cases for a dedicated error channel are many:

* Any variation of "out of bounds": Could be "record not found in database", or 
"no such array key" or "you tried to get the first item of an empty list", or 
many other things along those lines.
* Expected input validation errors.  This would cover the URL/URI RFC's complex 
error messages, without the C-style "inout" parameter.
* Chaining validation.  A series of validators that can return true (or just 
the value being validated) OR raise an object with the failure reason.  A 
wrapping function can collect them all into a single error object to return to 
indicate all the various validation failures.
* A transformer chain, which does the same as validation but passes on the 
transformed value and raises on the first erroring transformer.
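
As a rough illustration of the validation-chain case, here is how it has to be spelled in today's PHP (8.1+ assumed), using a union return type.  All names here (`ValidationFailure`, `ValidationErrors`, `notEmpty()`, `maxLength()`, `noSpaces()`, `validate()`) are hypothetical.  Note that this is exactly the "return value carries both paths" approach described above; under the proposal, each validator would presumably raise its failure object instead of returning it.

```php
<?php
declare(strict_types=1);

// Hypothetical error value carrying the reason a single validator failed.
final class ValidationFailure
{
    public function __construct(public readonly string $reason) {}
}

// Hypothetical aggregate error returned by the wrapping function.
final class ValidationErrors
{
    /** @param list<ValidationFailure> $failures */
    public function __construct(public readonly array $failures) {}
}

function notEmpty(string $s): string|ValidationFailure
{
    return $s !== '' ? $s : new ValidationFailure('must not be empty');
}

function maxLength(string $s): string|ValidationFailure
{
    return strlen($s) <= 8 ? $s : new ValidationFailure('must be at most 8 bytes');
}

function noSpaces(string $s): string|ValidationFailure
{
    return !str_contains($s, ' ') ? $s : new ValidationFailure('must not contain spaces');
}

// Runs every validator and collects all failures into one error object.
function validate(string $input, callable ...$validators): string|ValidationErrors
{
    $failures = [];
    foreach ($validators as $validator) {
        $result = $validator($input);
        if ($result instanceof ValidationFailure) {
            $failures[] = $result;
        }
    }
    return $failures === [] ? $input : new ValidationErrors($failures);
}

$checked = validate('far too long', notEmpty(...), maxLength(...), noSpaces(...));
if ($checked instanceof ValidationErrors) {
    foreach ($checked->failures as $failure) {
        print $failure->reason . "\n";
    }
}
```

The `instanceof` checks on every return value are the part a dedicated error channel would make unnecessary.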

Exceptions remain as is, for "stop the world" unexpected failures or developer 
errors (bugs).  But mundane errors, where local resolution is both possible and 
appropriate, get a dedicated channel and syntax with no performance overhead.  
That also naturally becomes a Python-style "better to beg forgiveness than ask 
permission" approach to error handling if desired, without all the overhead of 
exceptions.


So that's what I've got so far.  My questions for the audience are:

1. Assuming we could flesh out a comfortable and ergonomic syntax, would you 
support this feature, or would you reject it out of hand?

2. For engine-devs: Is this even feasible? :-)  And if so, anyone want to join 
me in developing it?

3. And least relevant, I'm very open to suggestions for the syntax, though the 
main focus right now is question 1 to determine if discussing syntax is even 
worthwhile.


-- 
  Larry Garfield
  la...@garfieldtech.com
