Hey everyone, I've been thinking for a while now about the advantages of 
nillable types, especially for basic types, and I'm interested in making an 
official language proposal. I'd love to know what the rest of the community 
thinks about it, and if this discussion has been had before (I found some 
vaguely similar past proposals but they were all specifically about making 
pointers easier to use, which is not the same thing).

Thanks,
Dan



# Proposal: Nillable types

## Summary

I propose that for any type T, the type `T | nil` should also be a valid 
type with the following properties:
1. The zero value of a `T | nil` is `nil`.
2. Any `T` or `nil` may be used as a value of `T | nil`.
3. It is a compile-time error to use a `T | nil` as a `T` without first 
checking that it is non-nil.

So basically, you should be able to write code like

```go
func Foo(x int32 | nil) {
  if x != nil {
    fmt.Println("X squared is", x*x)
  } else {
    fmt.Println("X is undefined")
  }
}
```

## Motivation

There are loads of examples out there (I can dig up a bunch if it would be 
helpful) where Go programs need values that can take some "unset" state 
that is distinct from the zero value of that type. Examples include 
configuration objects, command line flags, and quite a lot of large 
structured data types (protobuf used to have every field be a pointer for 
this reason, though in v3 they got rid of this in their Go implementation 
because it was so frustrating to do, and now Go programs just can't 
distinguish unset values from zero values in protobufs).

### Alternatives and why they're bad

There are three common ways I've seen of dealing with this in Go, and they 
all have serious limitations:

#### Pointers

The variable `var x *int32` allows you to distinguish between `x == nil` 
and `*x == 0`. However, using a pointer has three major drawbacks:

1. No compile-time protection

Go does not force developers to put nil guards on pointers, so nothing will 
stop you from dereferencing `x` without checking if it's nil. I've lost 
count of the number of times I've seen code that panics in corner cases 
because someone assumed that some pointer would always be set.

2. Passed values are mutable

If I want a function that can take an integer or an unset marker, and I 
define it as `func foo(x *int32)`, then users of my library just have to 
trust me that calling `foo(&some_var)` won't change the value of 
`some_var`; this breaks an important encapsulation boundary.

3. No literals

You can't take the address of a literal, so instead of simply writing e.g. 
`x = 3`, you have to either use a temporary variable:
```
tmp := int32(3)
x = &tmp
```
or write a helper function that does this for you. Numerous libraries have 
implemented these helper functions - for example, both 
https://pkg.go.dev/github.com/openconfig/ygot/ygot and 
https://pkg.go.dev/github.com/golang/protobuf/proto define helpers named 
`Bool`, `Float32`, `Float64`, `Int`, `Int32`, etc. just so that you can 
write
```
x = proto.Int32(3)
```
But this makes code a lot more cumbersome to read, and also requires the 
developer to restate the type of `x` every time (as compared to `x = 3`, 
where Go will infer that `3` means `int32(3)` based on the type of `x`).

[Sourcegraph finds more than 10k results for functions that just take an 
int and return a pointer to 
it](https://sourcegraph.com/search?q=context:global+/func+Int%5C%28%5Cw%2B+int%5C%29+%5C*int/+lang:go&patternType=keyword&sm=0), 
so this is coming up A LOT.
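
The first two drawbacks can be reproduced in today's Go. This is only a minimal sketch: `increment`, `square`, and `safeSquare` are hypothetical names made up for illustration, not functions from any real library.

```go
package main

import "fmt"

// increment is a hypothetical library function taking an optional int32.
// Nothing in its signature tells the caller whether it writes through x.
func increment(x *int32) int32 {
	if x == nil {
		return 0
	}
	*x++ // silently mutates the caller's variable
	return *x
}

// square compiles without complaint even though it never checks x for nil.
func square(x *int32) int32 {
	return *x * *x
}

// safeSquare calls square and reports whether the call panicked.
func safeSquare(x *int32) (result int32, panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	return square(x), false
}

func main() {
	v := int32(3)
	increment(&v)
	fmt.Println("v after call:", v) // the caller's variable has changed

	_, panicked := safeSquare(nil)
	fmt.Println("nil dereference panicked:", panicked)
}
```

The compiler accepts both `square` and the call to `increment(&v)` as written; the problems only show up at run time or on reading the callee's source.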

#### Sentinel values

Some code uses explicit "impossible" values to indicate unset. For example, 
a nonnegative value might have type `int` and be set to -1 to indicate that 
it is unset. However, this fails the criterion that the zero value for the 
type should be unset. It also requires that every function using this value 
check for -1 before using the value, and the compiler cannot enforce that 
this check has been made.

Furthermore, this requires you to use a broader type than the type you 
actually care about, which may be impossible (e.g. if the value can be any 
float) or extremely unwieldy (e.g. if you have to use an integer to 
represent a bool).
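
A small sketch of the zero-value problem, using a made-up `Config` struct and `unsetTimeout` constant (both hypothetical, chosen only for illustration):

```go
package main

import "fmt"

// unsetTimeout is a hypothetical sentinel meaning "no timeout configured".
const unsetTimeout = -1

// Config is a hypothetical configuration struct using the sentinel convention.
type Config struct {
	TimeoutSeconds int // unsetTimeout means unset; 0 is a real value
}

// describe reports how the sentinel convention interprets a Config.
func describe(c Config) string {
	if c.TimeoutSeconds == unsetTimeout {
		return "unset"
	}
	return fmt.Sprintf("%ds", c.TimeoutSeconds)
}

func main() {
	var zero Config // never initialized, yet it reads as an explicit 0
	fmt.Println(describe(zero))
	fmt.Println(describe(Config{TimeoutSeconds: unsetTimeout}))
}
```

An uninitialized `Config` is indistinguishable from one where the timeout was deliberately set to 0, so every constructor has to remember to write the sentinel.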

#### An additional bool

You can also approximate this by using a struct like

```go
struct {
  X int32
  IsZero bool
}
```

The zero value for this struct has `IsZero=false`, so you can use that to 
determine that `X` is not explicitly 0, but is in fact unset. However, this 
is confusing (what does it mean if IsZero is true but X is not 0?) and 
awkward (you have to remember to set IsZero any time you set X to 0, and to 
check IsZero any time you want to read X), and again, the compiler will not 
complain if you fail to do these.

An example of using BOTH a sentinel value AND an additional bool in Go 
itself is: 
https://github.com/golang/go/blob/68d3a9e417344c11426f158c7a6f3197a0890ff1/src/crypto/x509/x509.go#L724. 
The `MaxPathLen` value is considered "unset" if it's set to -1 OR if it's 
set to 0 and the bool `MaxPathLenZero` is false.

This is necessary if you want to be able to mark it unset in a single line 
(`cert.MaxPathLen=-1`) but also have it be unset on any zero-valued (i.e. 
uninitialized) certificate. But as a consequence, every single use of 
MaxPathLen has to be guarded by multiple checks and any attempt to set it 
has to be careful about setting the 0 indicator as well; forgetting to take 
both possibilities into account would break your handling of X.509 
certificates (which could even be a security issue, if you're rolling your 
own certificate handler instead of using an existing one).

## Nillable Types

If Go supported nillable types, an example like `MaxPathLen` would be 
written simply as
`MaxPathLen uint32 | nil`. The zero value would be nil (unset), setting it 
to a specific value would be easy (`c.MaxPathLen=5`), and setting it to nil 
would also be easy (`c.MaxPathLen=nil`).

Moreover, the compiler could enforce at compile time that a developer can't 
forget the possibility of nil.

### Syntax

The syntax would simply be that any type can have `| nil` appended to it to 
make a new, nillable type. For the sake of sanity, it would be reasonable 
to generate syntax errors on redundant constructs like `int | nil | nil`.

This syntax is (in my opinion) more readable than some other languages' 
syntaxes for the same (e.g. `int? x` in C#) and more concise than most 
other languages (e.g. `x: typing.Optional[int]` in Python or `x :: Maybe 
Int` in Haskell).

Types would be checked by type assertions:

```go
func foo2(x int | nil) {
  if i, ok := x.(int); ok {
    fmt.Println("x squared is", i*i)
  } else {
    fmt.Println("No value for x")
  }
}
```

As with other type assertions, you may omit `ok` if you're sure that the 
value will match, but it will panic if you're wrong:

```go
func foo3(x int | nil) {
  if x != nil {
    fmt.Println("x squared is", x.(int)*x.(int))
  } else {
    fmt.Println("No value for x")
  }
}
```
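
The run-time behaviour of these assertions can be tried out in today's Go by using `any` as a loose stand-in for `int | nil`. This is only an approximation: `any` also accepts strings, structs, and everything else, so none of the proposed compile-time checking is present. The function names `describe` and `mustSquare` are made up for this sketch.

```go
package main

import "fmt"

// describe mirrors foo2, with `any` standing in for the proposed `int | nil`.
func describe(x any) string {
	if i, ok := x.(int); ok {
		return fmt.Sprintf("x squared is %d", i*i)
	}
	return "No value for x"
}

// mustSquare uses the single-value form of the assertion, which panics
// when x is nil (or holds anything other than an int).
func mustSquare(x any) int {
	i := x.(int)
	return i * i
}

func main() {
	fmt.Println(describe(3))
	fmt.Println(describe(nil))

	// The unchecked assertion on nil panics; recover so the program exits cleanly.
	defer func() { fmt.Println("recovered from panic:", recover() != nil) }()
	mustSquare(nil)
}
```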

### Nice-to-have: Implicit type guards

Ideally, the compiler would also infer simple type guards so that we 
wouldn't need intermediate variables or unchecked type assertions:

```go
func foo(x int | nil) {
  if x != nil {
    fmt.Println("x squared is", x*x)
  } else {
    fmt.Println("No value for x")
  }
}
```
i.e. the compiler would infer that `x` cannot be nil inside the `if` block 
and therefore must be an int. Obviously this is impossible to do for 
arbitrary expressions, but typecheckers in numerous other languages (e.g. 
`mypy` and TypeScript) do recognize simple `if x != None` checks as type 
guards; I have no idea whether it would be difficult to add this to Go.


### Expressed in terms of pointers

You could think of `T | nil` as being like a `*T` except with easier syntax 
for using it and not passed by reference; you could implement it solely in 
terms of AST transformations if you wanted to by having the following 
statements correspond to each other:
 
`var x int32 | nil` -> `var _x *int32`

`x = nil` -> `_x = nil`

`x = 3` -> `var _tmp int32 = 3; _x = &_tmp`

`y, ok := x.(int32)` -> `var y int32; var ok bool; if _x != nil { y, ok = 
*_x, true }`

`y := x.(int32)` -> `y := *_x`

`foo(x)` -> `if _x == nil { foo(nil) } else { _tmp := *_x; foo(&_tmp) }`

I doubt this would be the most efficient way to actually implement this 
feature (I am not an expert on the internal workings of the Go compiler), 
but the fact that it *could* be written this way makes me think it would 
not be difficult to add to the language.
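
As a rough illustration, the right-hand sides of the correspondences above can be assembled into a program that compiles today. The names `foo` and `callFoo` are hypothetical, and the comments show which proposed-syntax statement each lowered line stands for.

```go
package main

import "fmt"

// foo stands in for a function declared as `func foo(x int32 | nil)`.
func foo(_x *int32) string {
	// y, ok := x.(int32)
	var y int32
	var ok bool
	if _x != nil {
		y, ok = *_x, true
	}
	if !ok {
		return "nil"
	}
	return fmt.Sprint(y * y)
}

// callFoo is the lowered form of the call `foo(x)`: the value is copied
// first, so the callee can never write through to the caller's variable.
func callFoo(_x *int32) string {
	if _x == nil {
		return foo(nil)
	}
	_tmp := *_x
	return foo(&_tmp)
}

func main() {
	var _x *int32 // var x int32 | nil
	fmt.Println(callFoo(_x))

	var _tmp int32 = 3 // x = 3
	_x = &_tmp
	fmt.Println(callFoo(_x))

	_x = nil // x = nil
	fmt.Println(callFoo(_x))
}
```

The copy in `callFoo` is what gives the nillable type value semantics despite the pointer representation.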


### Other implications

This change would be entirely backward-compatible - no existing code would 
contain the `T | nil` syntax, so nothing would change in the compilation of 
any existing code.

I don't think this would make the language any harder to learn - the syntax 
for using it is the same as the syntax for other type assertions, and the 
use of | for union types is A) pretty common in other languages, and B) 
under discussion as a more general Go feature (#57644). Moreover, it would 
make a lot of code more readable: the `x509.go` example from earlier has 16 
lines of comments around `MaxPathLen` and `MaxPathLenZero` just to explain 
how they interact, and additional comments when they're used explaining 
again how they work; none of that would be necessary if it were a single 
value-or-nil.

Also, this syntax fits nicely with the proposal for more general sum types 
(https://github.com/golang/go/issues/57644).
