Hey everyone, I've been thinking for a while now about the advantages of nillable types, especially for basic types, and I'm interested in making an official language proposal. I'd love to know what the rest of the community thinks about it, and if this discussion has been had before (I found some vaguely similar past proposals but they were all specifically about making pointers easier to use, which is not the same thing).
Thanks,
Dan

# Proposal: Nillable types

## Summary

I propose that for any type `T`, the type `T | nil` should also be a valid type with the following properties:

1. The zero value of a `T | nil` is `nil`.
2. Any `T` or `nil` may be used as a value of `T | nil`.
3. It is a compile-time error to use a `T | nil` as a `T` without first checking that it is non-nil.

So basically, you should be able to write code like

```go
func Foo(x int32 | nil) {
	if x != nil {
		fmt.Println("X squared is", x*x)
	} else {
		fmt.Println("X is undefined")
	}
}
```

## Motivation

There are loads of examples out there (I can dig up a bunch if it would be helpful) where Go programs need values that can take some "unset" value that is distinct from the zero value of that type. Examples include configuration objects, command line flags, and quite a lot of large structured data types (protobuf used to have every field be a pointer for this reason, though in v3 they got rid of this in their Go implementation because it was so frustrating to use, and now Go programs just can't distinguish unset values from zero values in protobufs).

### Alternatives and why they're bad

There are three common ways I've seen of dealing with this in Go, and they all have serious limitations:

#### Pointers

The variable `var x *int32` allows you to distinguish between `x == nil` and `*x == 0`. However, using a pointer has three major drawbacks:

1. No compile-time protection

   Go does not force developers to put nil guards on pointers, so nothing will stop you from dereferencing `x` without checking whether it's nil. I've lost count of the number of times I've seen code that panics in corner cases because someone assumed that some pointer would always be set.

2. Passed values are mutable

   If I want a function that can take an integer or an unset marker, and I define it as `func foo(x *int32)`, then users of my library just have to trust me that calling `foo(&some_var)` won't change the value of `some_var`; this breaks an important encapsulation boundary.

3. No literals

   You can't take the address of a literal, so instead of simply writing e.g. `x = 3`, you have to either use a temporary variable:

   ```go
   tmp := int32(3)
   x = &tmp
   ```

   or write a helper function that does this for you. Numerous libraries have implemented these helper functions - for example, both https://pkg.go.dev/github.com/openconfig/ygot/ygot and https://pkg.go.dev/github.com/golang/protobuf/proto define helpers named `Bool`, `Float32`, `Float64`, `Int`, `Int32`, etc. just so that you can write

   ```go
   x = proto.Int32(3)
   ```

   But this makes code a lot more cumbersome to read, and also requires the developer to restate the type of `x` every time (as compared to `x = 3`, where Go will infer that `3` means `int32(3)` based on the type of `x`). [Sourcegraph finds more than 10k results for functions that just take an int and return a pointer to it](https://sourcegraph.com/search?q=context:global+/func+Int%5C%28%5Cw%2B+int%5C%29+%5C*int/+lang:go&patternType=keyword&sm=0) so this is coming up A LOT.

#### Sentinel values

Some code uses explicit "impossible" values to indicate unset. For example, a non-negative value might have type `int` and be set to -1 to indicate that it is unset. However, this fails the criterion that the zero value for the type should be unset. It also requires that every function using this value check for -1 before using the value, and the compiler cannot enforce that this check has been made. Furthermore, this requires you to use a broader type than the type you actually care about, which may be impossible (e.g. if the value can be any float) or extremely unwieldy (e.g. if you have to use an integer to represent a bool).
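To make the sentinel pitfalls concrete, here is a minimal sketch in present-day Go. The `Config` type, its `MaxRetries` field, the `unsetRetries` constant, and the `describe` function are all hypothetical names invented for this illustration; they are not taken from any real library.

```go
package main

import "fmt"

// unsetRetries is a hypothetical sentinel meaning "no value was provided".
const unsetRetries = -1

// Config is a hypothetical configuration struct. MaxRetries is meant to be
// non-negative, so -1 is reserved as the "unset" marker.
type Config struct {
	MaxRetries int
}

// describe must remember to compare against the sentinel by hand;
// the compiler would not complain if this check were omitted.
func describe(c Config) string {
	if c.MaxRetries == unsetRetries {
		return "unset"
	}
	return fmt.Sprintf("%d retries", c.MaxRetries)
}

func main() {
	// The zero value of Config has MaxRetries == 0, which reads as an
	// explicit "0 retries" rather than as unset.
	fmt.Println(describe(Config{})) // prints "0 retries"

	// Only a caller who knows about the sentinel can express "unset".
	fmt.Println(describe(Config{MaxRetries: unsetRetries})) // prints "unset"
}
```

The first call shows the zero-value problem described above: an uninitialized `Config` is indistinguishable from one that deliberately asks for zero retries.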
#### An additional bool

You can also approximate this by using a struct like

```go
struct {
	X      int32
	IsZero bool
}
```

The zero value for this struct has `IsZero=false`, so you can use that to determine that `X` is not explicitly 0, but is in fact unset. However, this is confusing (what does it mean if `IsZero` is true but `X` is not 0?) and awkward (you have to remember to set `IsZero` any time you set `X` to 0, and to check `IsZero` any time you want to read `X`), and again, the compiler will not complain if you fail to do either.

An example of using BOTH a sentinel value AND an additional bool in Go itself is: https://github.com/golang/go/blob/68d3a9e417344c11426f158c7a6f3197a0890ff1/src/crypto/x509/x509.go#L724 . The `MaxPathLen` value is considered "unset" if it's set to -1 OR if it's set to 0 and the bool `MaxPathLenZero` is false. This is necessary if you want to be able to mark it unset in a single line (`cert.MaxPathLen=-1`) but also have it be unset on any zero-valued (i.e. uninitialized) certificate. But as a consequence, every single use of `MaxPathLen` has to be guarded by multiple checks, and any attempt to set it has to be careful about setting the zero indicator as well; forgetting to take both possibilities into account would break your handling of X.509 certificates (which could even be a security issue, if you're rolling your own certificate handler instead of using an existing one).

## Nillable Types

If Go supported nillable types, an example like `MaxPathLen` would be written simply as `MaxPathLen uint32 | nil`. The zero value would be nil (unset), setting it to a specific value would be easy (`c.MaxPathLen=5`), and setting it to nil would also be easy (`c.MaxPathLen=nil`). Moreover, the compiler could enforce at compile time that a developer can't forget the possibility of nil.

### Syntax

The syntax would simply be that any type can have `| nil` appended to it to make a new, nillable type.
For the sake of sanity, it would be reasonable to generate syntax errors on redundant constructs like `int | nil | nil`. This syntax is (in my opinion) more readable than some other languages' syntaxes for the same thing (e.g. `int? x` in C#) and more concise than most other languages' (e.g. `x: typing.Optional[int]` in Python or `x :: Maybe Int` in Haskell).

Types would be checked by type assertions:

```go
func foo2(x int | nil) {
	if i, ok := x.(int); ok {
		fmt.Println("x squared is", i*i)
	} else {
		fmt.Println("No value for x")
	}
}
```

As with other type assertions, you may omit `ok` if you're sure that the value will match, but it will panic if you're wrong:

```go
func foo3(x int | nil) {
	if x != nil {
		fmt.Println("x squared is", x.(int)*x.(int))
	} else {
		fmt.Println("No value for x")
	}
}
```

### Nice-to-have: Implicit type guards

Ideally, the compiler would also infer simple type guards so that we wouldn't need intermediate variables or unchecked type assertions:

```go
func foo(x int | nil) {
	if x != nil {
		fmt.Println("x squared is", x*x)
	} else {
		fmt.Println("No value for x")
	}
}
```

i.e. the compiler would infer that `x` cannot be nil inside the `if` block and therefore must be an int. Obviously this is impossible to do for arbitrary expressions, but typecheckers in numerous other languages (e.g. `mypy` and TypeScript) do recognize simple `if x != None` checks as type guards; I have no idea whether it would be difficult to add this to Go.
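For comparison, present-day Go already has one explicit form of narrowing: a type switch on an interface value. The sketch below uses `any` as a rough stand-in for the proposed `int | nil`. That substitution is an assumption made only for illustration: `any` admits every type, so the compiler cannot rule out the `default` case the way it could for a true `int | nil`. The function name `square` is hypothetical.

```go
package main

import "fmt"

// square uses `any` as a stand-in for the proposed `int | nil`.
// Inside each case, v is narrowed to that case's type.
func square(x any) string {
	switch v := x.(type) {
	case nil:
		// A nil interface value plays the role of "unset".
		return "no value for x"
	case int:
		// Here v has type int, so arithmetic needs no further assertion.
		return fmt.Sprintf("x squared is %d", v*v)
	default:
		// Unlike a real `int | nil`, `any` lets other types slip through.
		return fmt.Sprintf("unexpected type %T", v)
	}
}

func main() {
	fmt.Println(square(nil))    // prints "no value for x"
	fmt.Println(square(4))      // prints "x squared is 16"
	fmt.Println(square("four")) // prints "unexpected type string"
}
```

The narrowing here is explicit rather than inferred from an `if x != nil` check, and it costs an interface conversion, which is part of what the proposed syntax aims to avoid.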
### Expressed in terms of pointers

You could think of `T | nil` as being like a `*T`, except with easier syntax for using it and not passed by reference. If you wanted to, you could implement it solely in terms of AST transformations by having the following statements correspond to each other:

- `var x int32 | nil` -> `var _x *int32`
- `x = nil` -> `_x = nil`
- `x = 3` -> `var _tmp int32 = 3; _x = &_tmp`
- `y, ok := x.(int32)` -> `var y int32; var ok bool; if _x == nil { ok = false } else { y = *_x; ok = true }`
- `y := x.(int32)` -> `y := *_x`
- `foo(x)` -> `if _x == nil { foo(nil) } else { _tmp := *_x; foo(&_tmp) }`

I doubt this would be the most efficient way to actually implement this feature (I am not an expert on the inner workings of the Go compiler), but the fact that it *could* be written this way makes me think it would not be difficult to add to the language.

### Other implications

This change would be entirely backward-compatible - no existing code would contain the `T | nil` syntax, so nothing would change in the compilation of any existing code.

I don't think this would make the language any harder to learn - the syntax for using it is the same as the syntax for other type assertions, and the use of `|` for union types is A) pretty common in other languages, and B) under discussion as a more general Go feature (#57644). Moreover, it would make a lot of code more readable: the `x509.go` example from earlier has 16 lines of comments around `MaxPathLen` and `MaxPathLenZero` just to explain how they interact, and additional comments where they're used explaining again how they work; none of that would be necessary if it were a single value-or-nil.

Also, this syntax fits nicely with the proposal for more general sum types (https://github.com/golang/go/issues/57644).
To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/7e4a5552-f761-423c-8dc6-75903529378dn%40googlegroups.com.