Re: [elixir-core:11450] [Proposal] Overload capture operator to support tagged variable captures

Christopher Keele Wed, 28 Jun 2023 20:44:49 -0700

I posted that last reply early. Continued:

Part of the elegance of making $:foo and $"bar" expand to a valid pair, 
right before map expansion handles pairs as {:%{}, [], [...pairs]}, is that 
it *could* easily let us mix tagged variable captures anywhere in the 
existing syntax constructs. This is not true of my prototype today, though; 
it would need more work, depending on how we decide to handle it:


{foo, bar, baz} = {1, 2, 3}

%{$:foo, "fizz" => "buzz", $"bar", fizz: :buzz}
# => %{:fizz => :buzz, :foo => 1, "bar" => 2, "fizz" => "buzz"}

%{$:foo, "fizz" => "buzz", $"bar", fizz: :buzz, $:baz}
# !> ** (SyntaxError) invalid syntax found on iex:12:47:
# !>     ┌─ error: iex:12:47
# !>     │
# !>  12 │ %{$:foo, "fizz" => "buzz", $"bar", fizz: :buzz, $:baz}
# !>     │                                               ^
# !>     │
# !>     unexpected expression after keyword list. Keyword lists must always
# !>     come last in lists and maps. Therefore, this is not allowed:
# !> 
# !>         [some: :value, :another]
# !>         %{some: :value, another => value}
# !> 
# !>     Instead, reorder it to be the last entry:
# !> 
# !>         [:another, some: :value]
# !>         %{another => value, some: :value}
# !> 
# !>     Syntax error after: ',' 
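
For comparison, here is a rough sketch of what the examples above would correspond to if written by hand in today's Elixir. It assumes each capture simply becomes an explicit `key => value` pair, and it uses `quote` to show the `{:%{}, meta, pairs}` shape that map literals already have in the AST. The names `by_hand`, `reordered`, and `pairs` are only for illustration; how the prototype should treat a capture after keyword pairs is still undecided.

```elixir
{foo, bar, baz} = {1, 2, 3}

# Hand-written equivalent of %{$:foo, "fizz" => "buzz", $"bar", fizz: :buzz}.
# Explicit `=>` pairs may appear in any order; keyword pairs still come last.
by_hand = %{:foo => foo, "fizz" => "buzz", "bar" => bar, fizz: :buzz}
IO.inspect(by_hand)

# One possible reading of the failing example (an assumption, not what the
# prototype does): the trailing capture becomes an explicit pair that is
# written before the keyword pairs.
reordered = %{:foo => foo, "fizz" => "buzz", "bar" => bar, :baz => baz, fizz: :buzz}
IO.inspect(reordered)

# A map literal is already a flat list of {key, value} pairs in the AST.
{:%{}, _meta, pairs} = quote(do: %{:foo => foo, "bar" => bar})
IO.inspect(pairs)
```

The last pattern match is the point of interest: once every capture has been rewritten to a two-element tuple, the existing map expansion does not need to know the pair came from a capture.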



On Wednesday, June 28, 2023 at 10:32:20 PM UTC-5 Christopher Keele wrote:

> > Alternatively, the `$` symbol could be used at the beginning of the data 
> structure to indicate that it is performing capture destructuring (e.g., 
> `$%{key1:, key2:}` or `$%{"key1", "key2"}`), but then it starts feeling a 
> little more line-noisy.
>
> I agree that'd be noisy. Also, it might make mixing tagged variable 
> literals, literal => pairs, and trailing keyword pairs even more confusing.
>
> Consider today that we support:
> %{"fizz" => "buzz", foo: :bar}
> # => %{:foo => :bar, "fizz" => "buzz"}
>
> But do not support:
> %{foo: :bar, "fizz" => "buzz"}
> # !> ** (SyntaxError) invalid syntax found on iex:5:12:
> # !>     ┌─ error: iex:5:12
> # !>     │
> # !>   5 │ %{foo: :bar, "fizz" => "buzz"}
> # !>     │            ^
> # !>     │
> # !>     unexpected expression after keyword list. Keyword lists must always
> # !>     come last in lists and maps. Therefore, this is not allowed:
> # !> 
> # !>         [some: :value, :another]
> # !>         %{some: :value, another => value}
> # !> 
> # !>     Instead, reorder it to be the last entry:
> # !> 
> # !>         [:another, some: :value]
> # !>         %{another => value, some: :value}
> # !> 
> # !>     Syntax error after: ','
>
> Supporting $%{key1:, key2:} or $%{"key1", "key2"} obfuscates this 
> situation even further.
> On Wednesday, June 28, 2023 at 10:16:10 PM UTC-5 halos...@gmail.com wrote:
>
>> On Wed, Jun 28, 2023 at 8:41 PM Paul Schoenfelder <
>> paulscho...@fastmail.com> wrote:
>>
>>> I have an almost visceral reaction to the use of capture syntax for this 
>>> though, and I don’t believe any of the languages you mentioned that support 
>>> field punning do so in this fashion. They all use a similar intuitive 
>>> syntax where the variable matches the field name, and they don’t make any 
>>> effort to support string keys.
>>>
>>
>> JavaScript *only* supports string keys. Ruby’s pattern matching, which 
>> can lead to field punning, only supports symbol keys, but since ~2.2 Ruby 
>> can garbage collect symbols, making it *somewhat* less dangerous to do 
>> `JSON.parse(data, symbolize_names: true)` than it was previously.
>>
>> As far as I know, the BEAM does not do any atom garbage collection, and 
>> supporting *only* atoms will lead to a greater chance of atom exhaustion, 
>> because a non-flagged mechanism here that only works on atom keys will lead 
>> to `Jason.decode(data, keys: :atoms)` (and not `Jason.decode(data, keys: 
>> :atoms!)`). I do not think that any destructuring syntax which works on maps 
>> with atom keys but not string keys will be acceptable, although if it is 
>> constrained to *only* work on structs, then it does not matter (as that 
>> appears to be the same restriction that OCaml and Haskell have).
>>
>> I think that either `&:key` / `&"key"` or `$:key` / `$"key"` will work 
>> very nicely for this feature, although it would be nice to have `&key:` or 
>> `$key:` work the same as the former version. Alternatively, the `$` symbol 
>> could be used at the beginning of the data structure to indicate that it is 
>> performing capture destructuring (e.g., `$%{key1:, key2:}` or `$%{"key1", 
>> "key2"}`), but then it starts feeling a little more line-noisy.
>>
>> I think that the proposal here — either using `&` or `$` — is entirely 
>> workable and IMO extends the concept nicely.
>>
>> -a
>>
>> On Wed, Jun 28, 2023, at 7:56 PM, Christopher Keele wrote:
>>>
>>> This is a formalization of my concept here 
>>> <https://groups.google.com/g/elixir-lang-core/c/oFbaOT7rTeU/m/BWF24zoAAgAJ>,
>>>  
>>> as a first-class proposal for explicit discussion/feedback, since I now 
>>> have a working prototype 
>>> <https://github.com/elixir-lang/elixir/compare/main...christhekeele:elixir:tagged-variable-capture>
>>> .
>>>
>>> *Goal*
>>>
>>> The aim of this proposal is to support a commonly-requested feature: 
>>> *short-hand 
>>> construction and pattern matching of key/value pairs of associative data 
>>> structures, based on variable names* in the current scope.
>>>
>>> *Context*
>>>
>>> Similar shorthand syntax sugar exists in many programming languages 
>>> today, known variously as:
>>>
>>>    - Field Punning <https://dev.realworldocaml.org/records.html> — OCaml
>>>    - Record Puns 
>>>    
>>> <https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/record_puns.html> 
>>>    — Haskell
>>>    - Object Property Value Shorthand 
>>>    
>>> <https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Object_initializer#property_definitions>
>>>  
>>>    — ES6 JavaScript
>>>    
>>> This feature has been in discussion for a decade, on this mailing list (
>>> 1 
>>> <https://groups.google.com/g/elixir-lang-core/c/4w9eOeLvt-8/m/WOkoPSMm6kEJ>,
>>>  
>>> 2 
>>> <https://groups.google.com/g/elixir-lang-core/c/NoUo2gqQR3I/m/WTpArTGMKSIJ>,
>>>  
>>> 3 
>>> <https://groups.google.com/g/elixir-lang-core/c/3XrVXEVSixc/m/NHU2M4QFAQAJ>,
>>>  
>>> 4 
>>> <https://groups.google.com/g/elixir-lang-core/c/OvSQkvXxsmk/m/bKKHbBxiCwAJ>,
>>>  
>>> 5 
>>> <https://groups.google.com/g/elixir-lang-core/c/XxnrGgZsyVc/m/1W-d_XAlBgAJ>
>>> , 6 <https://groups.google.com/g/elixir-lang-core/c/oFbaOT7rTeU>) 
>>> and the Elixir forum (1 
>>> <https://elixirforum.com/t/proposal-add-field-puns-map-shorthand-to-elixir/15452>,
>>>  
>>> 2 
>>> <https://elixirforum.com/t/shorthand-for-passing-variables-by-name/30583>, 
>>> 3 
>>> <https://elixirforum.com/t/if-you-could-change-one-thing-in-elixir-language-what-you-would-change/19902/17>,
>>>  
>>> 4 
>>> <https://elixirforum.com/t/has-map-shorthand-syntax-in-other-languages-caused-you-any-problems/15403>,
>>>  
>>> 5 
>>> <https://elixirforum.com/t/es6-ish-property-value-shorthands-for-maps/1524>,
>>>  
>>> 6 
>>> <https://elixirforum.com/t/struct-creation-pattern-matching-short-hand/7544>),
>>>  
>>> and has motivated many libraries (1 
>>> <https://github.com/whatyouhide/short_maps>, 2 
>>> <https://github.com/meyercm/shorter_maps>, 3 
>>> <https://hex.pm/packages/shorthand>, 4 <https://hex.pm/packages/synex>). 
>>> These narrow margins cannot fit the full history of possibilities, 
>>> proposals, and problems with this feature, and I will not attempt to 
>>> summarize them all. For context, I suggest reading this mailing list 
>>> proposal 
>>> <https://groups.google.com/g/elixir-lang-core/c/XxnrGgZsyVc/m/1W-d_XAlBgAJ> 
>>> and this community discussion 
>>> <https://elixirforum.com/t/proposal-add-field-puns-map-shorthand-to-elixir/15452>
>>>  in 
>>> particular.
>>>
>>> However, in summary, this particular proposal tries to solve a couple of 
>>> past sticking points:
>>>
>>>    1. Atom vs String 
>>>    
>>> <https://groups.google.com/g/elixir-lang-core/c/NoUo2gqQR3I/m/IpZQHbZk4xEJ> 
>>>    key support
>>>    2. Visual clarity 
>>>    
>>> <https://groups.google.com/g/elixir-lang-core/c/XxnrGgZsyVc/m/NBkAVto0BAAJ> 
>>>    that atom/string matching is occurring
>>>    3. Limitations of string-based sigil parsing 
>>>    
>>> <https://groups.google.com/g/elixir-lang-core/c/XxnrGgZsyVc/m/TiZw6xM3BAAJ>
>>>    4. Easy confusion 
>>>    
>>> <https://groups.google.com/g/elixir-lang-core/c/XxnrGgZsyVc/m/WRhXxHDfBAAJ> 
>>>    with tuples
>>>    
>>> I have a working fork of Elixir here 
>>> <https://github.com/christhekeele/elixir/tree/tagged-variable-capture> 
>>> where this proposed syntax can be experimented with. Be warned, it is buggy.
>>>
>>> *Proposal: Tagged Variable Captures*
>>>
>>> I propose we overload the unary capture operator (*&*) to accept 
>>> compile-time atoms and strings as arguments, for example *&:foo* and 
>>> *&"bar"*. This would *expand at compile time* into *a tagged tuple with 
>>> the atom/string and a variable reference*. For now, I am calling this a 
>>> *"tagged-variable capture"* to differentiate it from a function capture.
>>>
>>> For the purposes of this proposal, assume:
>>>
>>> {foo, bar} = {1, 2}
>>>
>>> Additionally,
>>>
>>>    - Lines beginning with *# == * indicate what the compiler expands an 
>>>    expression to.
>>>    - Lines beginning with *# => * represent the result of evaluating 
>>>    that expression.
>>>    - Lines beginning with *# !> * represent an exception.
>>>    
>>> *Bare Captures*
>>>
>>> I'm not sure if we should support *bare* tagged-variable capture, but 
>>> it is illustrative for this proposal, so I left it in my prototype. It 
>>> would look like:
>>>
>>> &:foo
>>> *# == **{:foo, foo}*
>>> *# => *{:foo, 1}
>>> &"foo"
>>> *# == **{"foo", foo}*
>>> *# => *{"foo", 1}
>>>
>>> If bare usage is supported, this expansion would work as expected in 
>>> match and guard contexts as well, since it expands before variable 
>>> references are resolved:
>>>
>>> {:foo, baz} = &:foo
>>> *# == {:foo, baz} = {:foo, foo}*
>>> *# => *{:foo, 1}
>>> baz
>>> *# => *1
>>>
>>> *List Captures*
>>>
>>> Since capture expressions are allowed in lists, this can be used to 
>>> construct Keyword lists from the local variable scope elegantly:
>>>
>>> list = [&:foo, &:bar]
>>> *# == **list = [{:foo, foo}, {:bar, bar}]*
>>> *# => *[foo: 1, bar: 2]
>>>
>>> This would work with other list operators like *|*:
>>>
>>> baz = 3
>>> list = [&:baz | list]
>>> *# == **list = [**{:baz, baz} **| **list**]*
>>> *# => *[baz: 3, foo: 1, bar: 2]
>>>
>>> And list destructuring:
>>>
>>> {foo, bar, baz} = {nil, nil, nil}
>>> [&:baz, &:foo, &:bar] = list
>>> *# == [{:baz, baz}, {:foo, foo}, {:bar, bar}] = list*
>>> *# => *[baz: 3, foo: 1, bar: 2]
>>> {foo, bar, baz}
>>> *# => *{1, 2, 3}
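
The list behaviour above can be approximated in today's Elixir without any parser changes. The following is only a sketch under stated assumptions: `TaggedSketch`, its `tagged/1` macro, and `TaggedSketchDemo` are hypothetical names, not part of Elixir or of the prototype. Like the proposed operator, the macro expands at compile time to a two-element tuple of the literal key and a variable of the same name in the caller's scope.

```elixir
defmodule TaggedSketch do
  # Hypothetical stand-in for the proposed `&:foo` / `&"foo"` syntax.
  # Expands at compile time to `{key, variable}`; the variable is read from
  # (or, inside a pattern, bound in) the scope of the calling code.
  defmacro tagged(key) when is_atom(key) or is_binary(key) do
    name = if is_atom(key), do: key, else: String.to_atom(key)
    {key, Macro.var(name, nil)}
  end
end

defmodule TaggedSketchDemo do
  import TaggedSketch

  # Construction: the body expands to [{:foo, foo}, {"bar", bar}].
  def build(foo, bar), do: [tagged(:foo), tagged("bar")]

  # Destructuring: the head expands to [{:foo, foo}, {"bar", bar}],
  # binding `foo` and `bar` for use in the body.
  def take([tagged(:foo), tagged("bar")]), do: {foo, bar}
end

IO.inspect(TaggedSketchDemo.build(1, 2))
IO.inspect(TaggedSketchDemo.take([{:foo, 10}, {"bar", 20}]))
```

A macro call like this cannot appear as a bare element inside `%{...}`, which is why the map form needs the parser change rather than a library.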
>>>
>>> *Map Captures*
>>>
>>> With a small change to the parser, 
>>> <https://github.com/elixir-lang/elixir/commit/0a4f5376c0f9b4db7d71514d05df6b8b6abc96a9>
>>>  
>>> we can allow this expression inside map literals. Because this expression 
>>> individually gets expanded into a tagged-tuple before the map associations 
>>> list as a whole is processed, it allows this syntax to work in all existing 
>>> map/struct constructs, like map construction:
>>>
>>> map = %{&:foo, &"bar"}
>>> *# == %{:foo => foo, "bar" => bar}*
>>> *# => *%{:foo => 1, "bar" => 2}
>>>
>>> Map updates:
>>>
>>> foo = 3
>>> map = %{map | &:foo}
>>> *# == %{map | :foo => foo}*
>>> *# => *%{:foo => 3, "bar" => 2}
>>>
>>> And map destructuring:
>>>
>>> {foo, bar} = {nil, nil}
>>> %{&:foo, &"bar"} = map
>>> *# == %{:foo => foo, "bar" => bar} = map*
>>> *# => *%{:foo => 3, "bar" => 2}
>>> {foo, bar}
>>> *# => *{3, 2}
>>>
>>> *Considerations*
>>>
>>> Though just based on an errant thought 
>>> <https://groups.google.com/g/elixir-lang-core/c/oFbaOT7rTeU/m/BWF24zoAAgAJ> 
>>> that popped into my head yesterday, I'm unreasonably pleased with how well 
>>> this works and reads in practice. I will present my thoughts here, though 
>>> again I encourage you to grab my branch 
>>> <https://github.com/christhekeele/elixir/tree/tagged-variable-capture>, 
>>> compile 
>>> it from source 
>>> <https://github.com/christhekeele/elixir/tree/tagged-variable-capture#compiling-from-source>,
>>>  and 
>>> play with it yourself!
>>>
>>> *Pro: solves existing pain points*
>>>
>>> As mentioned, this solves flaws previous proposals suffer from:
>>>
>>>    1. Atom vs String 
>>>    
>>> <https://groups.google.com/g/elixir-lang-core/c/NoUo2gqQR3I/m/IpZQHbZk4xEJ> 
>>> key 
>>>    support
>>>    This supports both.
>>>    2. Visual clarity 
>>>    
>>> <https://groups.google.com/g/elixir-lang-core/c/XxnrGgZsyVc/m/NBkAVto0BAAJ> 
>>> that 
>>>    atom/string matching is occurring
>>>    This leverages the appropriate literal in question within the syntax 
>>>    sugar.
>>>    3. Limitations of string-based sigil parsing 
>>>    
>>> <https://groups.google.com/g/elixir-lang-core/c/XxnrGgZsyVc/m/TiZw6xM3BAAJ>
>>>    This is compiler-expansion-native.
>>>    4. Easy confusion 
>>>    
>>> <https://groups.google.com/g/elixir-lang-core/c/XxnrGgZsyVc/m/WRhXxHDfBAAJ> 
>>> with 
>>>    tuples
>>>    %{&:foo, &"bar"} is very different from {foo, bar}, instead of 
>>>    1-character different.
>>>    
>>> Additionally, it solves my main complaint with historical proposals: 
>>> syntax to combine a variable identifier with a literal must either obscure 
>>> that we are building an identifier, or obscure the key/string typing of the 
>>> literal.
>>>
>>> I'm proposing overloading the capture operator rather than introducing a 
>>> new operator because the capture operator already has a semantic 
>>> association with messing with variable scope, via the nested integer-based 
>>> positional function argument syntax (ex *& &1*).
>>>
>>> By using the capture operator we indicate that we are messing with an 
>>> identifier in scope, but via a literal atom/string we want to associate 
>>> with, to get the best of both worlds.
>>>
>>> *Pro: works with existing code*
>>>
>>> The capture operator today has well-defined compile-time-error semantics 
>>> if you try to pass it an atom or a string. All compiling Elixir code today 
>>> will continue to compile as before.
>>>
>>> *Pro: works with existing tooling*
>>>
>>> By overloading an existing operator, this approach works seamlessly for 
>>> me with the syntax highlighters I have tried it with so far, and 
>>> reasonably well with the formatter.
>>>
>>> In my experimentation I've found that the formatter wants to rewrite 
>>> *&:baz* to *(&:baz)* pretty often. That's good, because there are several 
>>> edge cases in my prototype where not doing so causes it to behave 
>>> strangely; I'm sure it's resolving ambiguities that would occur in 
>>> function captures and that impact my proposal in ways I have yet to 
>>> fully anticipate.
>>>
>>> *Pro: minimizes surface area of the language*
>>>
>>> By overriding the capture operator instead of introducing a new operator 
>>> or sigil, we are able to keep the surface area of this feature slim.
>>>
>>> *Cons: overloads the capture operator*
>>>
>>> Of course, many of the virtues of this proposal come from overloading 
>>> the capture operator. But it is already a semantically fraught piece of 
>>> syntactic sugar that confuses newcomers, and this would place more 
>>> strain on it.
>>>
>>> We would need more than the meager error message modification 
>>> <https://github.com/elixir-lang/elixir/commit/3d83d21ada860d03cece8c6f90dbcf7bf9e737ec#diff-92b98063d1e86837fae15261896c265ab502b8d556141aaf1c34e67a3ef3717cL199-R207>
>>> in my prototype: it would also need documentation, and we should 
>>> anticipate a new wave of questions from the community upon release.
>>>
>>> This inelegance really shows when considering embedding a tagged 
>>> variable capture inside an anonymous function capture, ex *& &1 = &:foo*. 
>>> In my prototype I've chosen to allow this rather than error on "nested 
>>> captures not allowed" (would probably become: "nested *function* 
>>> captures not allowed"), but I'm not sure I found all the edge-cases of 
>>> mixing them in all possible constructions.
>>>
>>> Additionally, since my proposal now allows the capture operator as an 
>>> associative element inside map literal parsing, the syntax error 
>>> reported for a function capture used as an associative element would be 
>>> generated during expansion rather than during parsing. I am not fluent 
>>> enough in yecc to have updated the parser to preserve the exact old 
>>> error. Serendipitously, what my prototype reports today is pretty good 
>>> regardless, though I prefer the old behaviour:
>>>
>>> Old:
>>> %{& &1}
>>> *# !> **** (SyntaxError) syntax error before '}'*
>>> *# !> * |
>>> *# !> * 1 | %{& &1}
>>> *# !> * | ^
>>> New:
>>> %{& &1}
>>> *# !> error: expected key-value pairs in a map, got: & &1*
>>> *# !> ** (CompileError) cannot compile code (errors have been logged)*
>>>
>>> *Cons: here there be dragons I cannot see*
>>>
>>> I'm quite sure a full implementation would require a lot more knowledge 
>>> of the compiler than I am able to provide. For example, *&:foo = &:foo* 
>>> raises an exception where *(&:foo) = &:foo* behaves as expected. I also 
>>> find the variable/context/binding environment implementation in the 
>>> Erlang part of the compiler during expansion to be impenetrable, and I'm 
>>> sure my prototype fails on edge cases there.
>>>
>>> *Open Question: the pin operator*
>>>
>>> As this feature constructs a variable ref for you, it is not clear 
>>> if/how we should support attempts to pin the generated variable to avoid 
>>> new bindings. In my prototype, I have tried to support the pin operator via 
>>> the *&^:atom *syntax, though I'm pretty sure it's super buggy on bare 
>>> out-of-data-structure cases and I only got it far enough to work in 
>>> function heads for basic function head map pattern matching.
>>>
>>> *Open Question: charlists*
>>>
>>> I did not add support for charlist tagged variable captures in my 
>>> prototype, as it would be more involved to differentiate a capture of a 
>>> list meant to become a tagged tuple from a list representing the AST of 
>>> a function capture. I would not lose a lot of sleep over this.
>>>
>>> *Open Question: allowed contexts*
>>>
>>> Would we even want to allow this syntax construct outside of map 
>>> literals? Or list literals?
>>>
>>> I can certainly see people abusing the 
>>> bare-outside-of-associative-datastructure syntax to make some nigh 
>>> impenetrable code where it's really unclear where assignment and pattern 
>>> matching is occurring, and relatedly this is where I see a lot of odd 
>>> edge-case behaviour in my prototype. I allowed it to speed up the 
>>> implementation, but it merits more discussion.
>>>
>>> On the other hand, this does seem like an... interesting use-case:
>>>
>>> error = "rate limit exceeded"
>>> &:error *# return error tuple*
>>>
>>> *Thanks for reading! What do you think?*
>>>
>>>
>>> --
>>> You received this message because you are subscribed to the Google 
>>> Groups "elixir-lang-core" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to elixir-lang-co...@googlegroups.com.
>>> To view this discussion on the web visit 
>>> https://groups.google.com/d/msgid/elixir-lang-core/ad7e0313-4207-4cb7-a5f3-d824735830abn%40googlegroups.com
>>>  
>>> <https://groups.google.com/d/msgid/elixir-lang-core/ad7e0313-4207-4cb7-a5f3-d824735830abn%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>>
>>>
>>
>>
>> -- 
>> Austin Ziegler • halos...@gmail.com • aus...@halostatue.ca
>> http://www.halostatue.ca/ • http://twitter.com/halostatue
>>
>
