Re: [elixir-core:11445] [Proposal] Overload capture operator to support tagged variable captures

Christopher Keele Wed, 28 Jun 2023 19:30:47 -0700

I've figured out the tokenizer well enough to prototype this as a new 
operator; the working title is the "tagged variable literal" operator (I'm 
not in love with that name). I'm using a dollar sign ($) to represent it.


It has the same issues as before, since I've ungracefully wedged it between 
the capture operator and other precedences, but it is now logically separated 
from the capture operator. Weird stuff still happens in certain contexts 
unless it is wrapped in parens, for example, but I think it's enough to 
continue discussion around this proposal if we want to refocus it around a 
new operator.

I'm happy to refine the branch further and work on a PR, but would need 
much guidance, and so would rather leave it as is for now without more 
feedback on the proposal and related blessings, as I would need more 
core-team support to implement it than I did defguard. Still sounds really 
fun to do.

The source code for this new fork of Elixir is available here 
<https://github.com/elixir-lang/elixir/compare/main...christhekeele:elixir:tagged-variable-literals>
for experimentation. For convenience, here are the examples in this proposal 
reworked to use a dedicated $ operator for compile-time tagged variable 
literals. They all work in iex on my fork, although many obvious usages of 
it do not work without further changes:

*Bare Tagged Variable Literals*

$:foo
# == *{:foo, foo}*
# => {:foo, 1}
$"foo"
# == *{"foo", foo}*
# => {"foo", 1}

If bare usage is supported, this expansion would work as expected in match 
and guard contexts as well, since it expands before variable references are 
resolved:

{:foo, baz} = $:foo
*# == {:foo, baz} = {:foo, foo}*
# => {:foo, 1}
baz
# => 1
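
For anyone who wants to poke at the expansion without building the fork, the 
same rewrite can be roughly approximated in stock Elixir with an ordinary 
macro. This is only a sketch under assumptions: the names TaggedSketch, 
TaggedDemo, and tagged/1 are invented for illustration and exist neither in 
Elixir nor in my branch, and the macro deliberately breaks hygiene with 
Macro.var(name, nil) so that it refers to the caller's variable.

```elixir
defmodule TaggedSketch do
  # Hypothetical stand-in for the proposed `$` operator (not real Elixir syntax).
  # Expands a literal atom or string into {tag, variable} at compile time.
  defmacro tagged(tag) when is_atom(tag) do
    # A two-element tuple is its own quoted form, so this is valid AST.
    {tag, Macro.var(tag, nil)}
  end

  defmacro tagged(tag) when is_binary(tag) do
    # The string is converted to a variable name while the macro expands.
    {tag, Macro.var(String.to_atom(tag), nil)}
  end
end

defmodule TaggedDemo do
  import TaggedSketch

  # Construction: tagged(:foo) expands to {:foo, foo}.
  def build do
    foo = 1
    {tagged(:foo), tagged("foo")}
  end

  # The expansion sits on the right-hand side of a match, as in the example above.
  def match do
    foo = 1
    {:foo, baz} = tagged(:foo)
    baz
  end
end
```

Because the macro only accepts literals, tagged(some_var) fails to compile, 
which mirrors the compile-time-only nature of the proposed operator.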

*Tagged Variable Literals in Lists*

Since tagged variable expressions are allowed in lists, this can be used to 
construct Keyword lists from the local variable scope elegantly:

list = [$:foo, $:bar]
# == *list = [{:foo, foo}, {:bar, bar}]*
# => [foo: 1, bar: 2]

This would work with other list operators like *|*:

baz = 3
list = [$:baz | list]
# == *list = [{:baz, baz} | list]*
# => [baz: 3, foo: 1, bar: 2]

And list destructuring:

{foo, bar, baz} = {nil, nil, nil}
[$:baz, $:foo, $:bar] = list
*# == [{:baz, baz}, {:foo, foo}, {:bar, bar}] = list*
# => [baz: 3, foo: 1, bar: 2]
{foo, bar, baz}
# => {1, 2, 3}
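
The list cases can be imitated the same way in stock Elixir, since macros are 
expanded inside both list literals and patterns. Again, this is a sketch 
rather than the fork's implementation: TaggedListSketch, TaggedListDemo, and 
tagged/1 are hypothetical names, and only atom tags are handled to keep it 
short.

```elixir
defmodule TaggedListSketch do
  # Hypothetical helper emulating `$:name`; expands to {name, name_variable}.
  defmacro tagged(tag) when is_atom(tag), do: {tag, Macro.var(tag, nil)}
end

defmodule TaggedListDemo do
  import TaggedListSketch

  # Builds a keyword list from variables in scope, then prepends with `|`.
  def build do
    {foo, bar, baz} = {1, 2, 3}
    list = [tagged(:foo), tagged(:bar)]
    [tagged(:baz) | list]
  end

  # In a pattern, each tagged/1 call expands to {key, variable} and binds it.
  def destructure do
    [tagged(:baz), tagged(:foo), tagged(:bar)] = build()
    {foo, bar, baz}
  end
end
```

As with ordinary keyword list patterns, the destructuring form only matches 
when the keys appear in exactly this order.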

*Tagged Variable Literals in Maps*

With a small change to the parser 
<https://github.com/elixir-lang/elixir/commit/119bd6da351e8fe2ab94e86a8456ffc521ce865d#diff-7e4167a9de48e2dcae64fae18a5b2ddad1d4aeff8f2dde274eb6f127ef65ac11R615>,
we can allow this expression inside map literals. Because each expression 
is individually expanded into a tagged tuple before the map associations 
list as a whole is processed, this syntax works in all existing 
map/struct constructs, like map construction:

map = %{$:foo, $"bar"}
*# == %{:foo => foo, "bar" => bar}*
# => %{:foo => 1, "bar" => 2}

Map updates:

foo = 3
map = %{map | $:foo}
*# == %{map | :foo => foo}*
# => %{:foo => 3, "bar" => 2}

And map destructuring:

{foo, bar} = {nil, nil}
%{$:foo, $"bar"} = map
*# == %{:foo => foo, "bar" => bar} = map*
# => %{:foo => 3, "bar" => 2}
{foo, bar}
# => {3, 2}
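
Stock Elixir will not parse a bare expression as a map association, which is 
why the parser change is needed. A macro can only approximate the map cases by 
generating the whole map literal. The sketch below assumes hypothetical names 
(TaggedMapSketch, TaggedMapDemo, tagged_map/1) and uses Map.merge/2 as a 
stand-in for the update syntax, so unlike %{map | ...} it does not raise when 
the key is missing.

```elixir
defmodule TaggedMapSketch do
  # Hypothetical helper: tagged_map([:foo, "bar"]) expands at compile time
  # to %{:foo => foo, "bar" => bar}.
  defmacro tagged_map(tags) when is_list(tags) do
    # {:%{}, meta, pairs} is the quoted form of a map literal.
    {:%{}, [], Enum.map(tags, &pair/1)}
  end

  defp pair(tag) when is_atom(tag), do: {tag, Macro.var(tag, nil)}
  defp pair(tag) when is_binary(tag), do: {tag, Macro.var(String.to_atom(tag), nil)}
end

defmodule TaggedMapDemo do
  import TaggedMapSketch

  # Construction from variables in scope.
  def build do
    {foo, bar} = {1, 2}
    tagged_map([:foo, "bar"])
  end

  # Approximates %{map | $:foo}; Map.merge/2 is an assumption of this sketch.
  def update do
    map = build()
    foo = 3
    Map.merge(map, tagged_map([:foo]))
  end

  # Destructuring: the generated map literal is used as a pattern.
  def destructure do
    tagged_map([:foo, "bar"]) = update()
    {foo, bar}
  end
end
```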

On Wednesday, June 28, 2023 at 8:36:15 PM UTC-5 Paul Schoenfelder wrote:

> I do think there is value in proposing the "tagged variable captures" idea 
> separately, but at the same time, your solution for field punning is part 
> of the value proposition there. That said, as you've already noted, it is 
> very easy for the conversation to get bogged down when more than one thing 
> is being discussed at a time.
>
> This is a very salient point. How do you feel about introducing a new 
> operator for this sugar, such as $:foo?
>
>
> The first thing that sticks out to me is that there are a variety of 
> places where atoms starting with `$` occur in practice (particularly around 
> ETS), so I could see things like `$:$$` appearing in code, which is 
> just...no. Of course, an argument could be made that one should just not do 
> that, but it is something to consider. Obviously, you can't get rid of the 
> `:` for the same reason.
>
> But the idea of an operator more generally? I guess it would really depend 
> on the specific choice. I don't like it in principle, but I'd want to cast 
> my vote with a specific syntax in question, such as those you've proposed. 
> As I mentioned in my previous reply, I really think the best path for 
> Elixir with regard to field punning is to solve the syntax ambiguities that 
> prevent the "obvious" syntax for it, e.g. `%{foo, bar} = baz`, and only 
> focus on supporting atom keys. That may not be possible without 
> backwards-incompatible changes to the grammar, in which case it's something 
> to throw on the wishlist of things that could go in an eventual Elixir 2.0.
>
> I think it's important to cast the feature in a broader context, because I 
> think everyone would agree that field punning is a nice-to-have. But is the 
> tradeoff in complexity for the language really worth it? The more explicit 
> syntax is (perhaps) more annoying to write, but I think the vast majority 
> would agree that it is simple, clear, and easy to reason about. When we're 
> arguing for field punning, we're really arguing for a significant benefit 
> when writing code, but only in the "obvious" syntax I gave an example of 
> above do I think one can argue that there is any benefit in terms of 
> readability, and even then it is a small benefit. It adds cognitive 
> overhead, particularly for new Elixir developers, as one must desugar the 
> syntax in their head. I don't think that cognitive overhead is significant, 
> but it is only one thing amongst many that one must carry around in their 
> head when working with Elixir code - we should aim to reduce that overhead 
> rather than add to it.
>
> Anyway, I don't think I'm adding anything new to the arguments that have 
> been made in the past, so I don't want to derail your proposal here, or add 
> to the noise, particularly with regard to the "tagged variable captures" 
> portion, which deserves its own consideration. I will leave it up to the 
> community at large to decide, but just want to say thanks again for putting 
> so much effort into summarizing the current state of the discussion and 
> implementing a prototype of your proposal - it certainly gives it a lot 
> more weight to me.
>
> Paul
>
>
> On Wed, Jun 28, 2023, at 8:45 PM, Christopher Keele wrote:
>
> > My thoughts on the proposal itself aside, I’d just like to say that I 
> think you’ve set a great example of what proposals on this list should look 
> like. Well done!
>
> Much appreciated!
>
> > I have an almost visceral reaction to the use of capture syntax for this 
> though.
>
> > I think calling the `&…` syntax “capture syntax” is actually misleading, 
> and only has that name because it can be used to construct closures by 
> “capturing” a function name, but it is more accurate to consider it closure 
> syntax, in my opinion.
>
> This is a very salient point. How do you feel about introducing a new 
> operator for this sugar, such as $:foo?
> On Wednesday, June 28, 2023 at 7:41:05 PM UTC-5 Paul Schoenfelder wrote:
>
>
> My thoughts on the proposal itself aside, I’d just like to say that I 
> think you’ve set a great example of what proposals on this list should look 
> like. Well done!
>
> I have an almost visceral reaction to the use of capture syntax for this 
> though, and I don’t believe any of the languages you mentioned that support 
> field punning do so in this fashion. They all use a similar intuitive 
> syntax where the variable matches the field name, and they don’t make any 
> effort to support string keys.
>
> If Elixir is to ever support field punning, I strongly believe it should 
> follow their example. However, there are reasons why Elixir cannot do so 
> due to syntax ambiguities (IIRC). In my mind, that makes any effort to 
> introduce this feature a non-starter, because code should be first and 
> foremost easy to read, and I have yet to see a proposal for this that 
> doesn’t make the code harder to read and understand, including this one.
>
> I’d like to have field punning, but by addressing, if possible, the core 
> issue that is blocking it. If that can’t be done, I just don’t think the 
> cost of overloading unrelated syntax is worth it. I think calling the `&…` 
> syntax “capture syntax” is actually misleading, and only has that name 
> because it can be used to construct closures by “capturing” a function 
> name, but it is more accurate to consider it closure syntax, in my opinion. 
> Overloading it to mean capturing things in a more general sense will be 
> confusing for everyone, and would only work in a few restricted forms, 
> which makes it more difficult to teach and learn.
>
> That’s my two cents anyway, I think you did a great job with the proposal, 
> but I’m very solidly against it as the solution to the problem being solved.
>
> Paul
>
>
>
> On Wed, Jun 28, 2023, at 7:56 PM, Christopher Keele wrote:
>
> This is a formalization of my concept here 
> <https://groups.google.com/g/elixir-lang-core/c/oFbaOT7rTeU/m/BWF24zoAAgAJ>, 
> as a first-class proposal for explicit discussion/feedback, since I now 
> have a working prototype 
> <https://github.com/elixir-lang/elixir/compare/main...christhekeele:elixir:tagged-variable-capture>
> .
>
> *Goal*
>
> The aim of this proposal is to support a commonly-requested feature: 
> *short-hand 
> construction and pattern matching of key/value pairs of associative data 
> structures, based on variable names* in the current scope.
>
> *Context*
>
> Similar shorthand syntax sugar exists in many programming languages today, 
> known variously as:
>
>    - Field Punning <https://dev.realworldocaml.org/records.html> — OCaml
>    - Record Puns 
>    <https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/record_puns.html> 
>    — Haskell
>    - Object Property Value Shorthand 
>    <https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Object_initializer#property_definitions>
>    — ES6 Javascript
>    
> This feature has been in discussion for a decade, on this mailing list (1 
> <https://groups.google.com/g/elixir-lang-core/c/4w9eOeLvt-8/m/WOkoPSMm6kEJ>, 
> 2 
> <https://groups.google.com/g/elixir-lang-core/c/NoUo2gqQR3I/m/WTpArTGMKSIJ>, 
> 3 
> <https://groups.google.com/g/elixir-lang-core/c/3XrVXEVSixc/m/NHU2M4QFAQAJ>, 
> 4 
> <https://groups.google.com/g/elixir-lang-core/c/OvSQkvXxsmk/m/bKKHbBxiCwAJ>, 
> 5 
> <https://groups.google.com/g/elixir-lang-core/c/XxnrGgZsyVc/m/1W-d_XAlBgAJ>
> , 6 <https://groups.google.com/g/elixir-lang-core/c/oFbaOT7rTeU>) and the 
> Elixir forum (1 
> <https://elixirforum.com/t/proposal-add-field-puns-map-shorthand-to-elixir/15452>,
>  
> 2 
> <https://elixirforum.com/t/shorthand-for-passing-variables-by-name/30583>, 
> 3 
> <https://elixirforum.com/t/if-you-could-change-one-thing-in-elixir-language-what-you-would-change/19902/17>,
>  
> 4 
> <https://elixirforum.com/t/has-map-shorthand-syntax-in-other-languages-caused-you-any-problems/15403>,
>  
> 5 
> <https://elixirforum.com/t/es6-ish-property-value-shorthands-for-maps/1524>, 
> 6 
> <https://elixirforum.com/t/struct-creation-pattern-matching-short-hand/7544>),
>  
> and has motivated many libraries (1 
> <https://github.com/whatyouhide/short_maps>, 2 
> <https://github.com/meyercm/shorter_maps>, 3 
> <https://hex.pm/packages/shorthand>, 4 <https://hex.pm/packages/synex>). 
> These narrow margins cannot fit the full history of possibilities, 
> proposals, and problems with this feature, and I will not attempt to 
> summarize them all. For context, I suggest reading this mailing list 
> proposal 
> <https://groups.google.com/g/elixir-lang-core/c/XxnrGgZsyVc/m/1W-d_XAlBgAJ> 
> and this community discussion 
> <https://elixirforum.com/t/proposal-add-field-puns-map-shorthand-to-elixir/15452>
>  in 
> particular.
>
> However, in summary, this particular proposal tries to solve a couple of 
> past sticking points:
>
>    1. Atom vs String key support
>    <https://groups.google.com/g/elixir-lang-core/c/NoUo2gqQR3I/m/IpZQHbZk4xEJ>
>    2. Visual clarity that atom/string matching is occurring
>    <https://groups.google.com/g/elixir-lang-core/c/XxnrGgZsyVc/m/NBkAVto0BAAJ>
>    3. Limitations of string-based sigil parsing
>    <https://groups.google.com/g/elixir-lang-core/c/XxnrGgZsyVc/m/TiZw6xM3BAAJ>
>    4. Easy confusion with tuples
>    <https://groups.google.com/g/elixir-lang-core/c/XxnrGgZsyVc/m/WRhXxHDfBAAJ>
>    
> I have a working fork of Elixir here 
> <https://github.com/christhekeele/elixir/tree/tagged-variable-capture> 
> where this proposed syntax can be experimented with. Be warned, it is buggy.
>
> *Proposal: Tagged Variable Captures*
>
> I propose we overload the unary capture operator (*&*) to accept 
> compile-time atoms and strings as arguments, for example *&:foo* and 
> *&"bar"*. This would *expand at compile time* into *a tagged tuple with 
> the atom/string and a variable reference*. For now, I am calling this a 
> *"tagged-variable 
> capture"*  to differentiate it from a function capture.
>
> For the purposes of this proposal, assume:
>
> {foo, bar} = {1, 2}
>
> Additionally,
>
>    - Lines beginning with *# == * indicate what the compiler expands an 
>    expression to.
>    - Lines beginning with *# => * represent the result of evaluating that 
>    expression.
>    - Lines beginning with *# !> * represent an exception.
>    
> *Bare Captures*
>
> I'm not sure if we should support *bare* tagged-variable capture, but it 
> is illustrative for this proposal, so I left it in my prototype. It would 
> look like:
>
> &:foo
> *# == **{:foo, foo}*
> *# => *{:foo, 1}
> &"foo"
> *# == **{"foo", foo}*
> *# => *{"foo", 1}
>
> If bare usage is supported, this expansion would work as expected in match 
> and guard contexts as well, since it expands before variable references are 
> resolved:
>
> {:foo, baz} = &:foo
> *# == {:foo, baz} = {:foo, foo}*
> *# => *{:foo, 1}
> baz
> *# => *1
>
> *List Captures*
>
> Since capture expressions are allowed in lists, this can be used to 
> construct Keyword lists from the local variable scope elegantly:
>
> list = [&:foo, &:bar]
> *# == **list = [{:foo, foo}, {:bar, bar}]*
> *# => *[foo: 1, bar: 2]
>
> This would work with other list operators like *|*:
>
> baz = 3
> list = [&:baz | list]
> *# == list = [{:baz, baz} | list]*
> *# => *[baz: 3, foo: 1, bar: 2]
>
> And list destructuring:
>
> {foo, bar, baz} = {nil, nil, nil}
> [&:baz, &:foo, &:bar] = list
> *# == [{:baz, baz}, {:foo, foo}, {:bar, bar}] = list*
> *# => *[baz: 3, foo: 1, bar: 2]
> {foo, bar, baz}
> *# => *{1, 2, 3}
>
> *Map Captures*
>
> With a small change to the parser 
> <https://github.com/elixir-lang/elixir/commit/0a4f5376c0f9b4db7d71514d05df6b8b6abc96a9>,
> we can allow this expression inside map literals. Because each expression 
> is individually expanded into a tagged tuple before the map associations 
> list as a whole is processed, this syntax works in all existing 
> map/struct constructs, like map construction:
>
> map = %{&:foo, &"bar"}
> *# == %{:foo => foo, "bar" => bar}*
> *# => *%{:foo => 1, "bar" => 2}
>
> Map updates:
>
> foo = 3
> map = %{map | &:foo}
> *# == %{map | :foo => foo}*
> *# => *%{:foo => 3, "bar" => 2}
>
> And map destructuring:
>
> {foo, bar} = {nil, nil}
> %{&:foo, &"bar"} = map
> *# == %{:foo => foo, "bar" => bar} = map*
> *# => *%{:foo => 3, "bar" => 2}
> {foo, bar}
> *# => *{3, 2}
>
> *Considerations*
>
> Though just based on an errant thought 
> <https://groups.google.com/g/elixir-lang-core/c/oFbaOT7rTeU/m/BWF24zoAAgAJ> 
> that popped into my head yesterday, I'm unreasonably pleased with how well 
> this works and reads in practice. I will present my thoughts here, though 
> again I encourage you to grab my branch 
> <https://github.com/christhekeele/elixir/tree/tagged-variable-capture>, 
> compile 
> it from source 
> <https://github.com/christhekeele/elixir/tree/tagged-variable-capture#compiling-from-source>,
>  and 
> play with it yourself!
>
> *Pro: solves existing pain points*
>
> As mentioned, this solves flaws previous proposals suffer from:
>
>    1. Atom vs String key support
>    <https://groups.google.com/g/elixir-lang-core/c/NoUo2gqQR3I/m/IpZQHbZk4xEJ>
>    This supports both.
>    2. Visual clarity that atom/string matching is occurring
>    <https://groups.google.com/g/elixir-lang-core/c/XxnrGgZsyVc/m/NBkAVto0BAAJ>
>    This leverages the appropriate literal in question within the syntax 
>    sugar.
>    3. Limitations of string-based sigil parsing
>    <https://groups.google.com/g/elixir-lang-core/c/XxnrGgZsyVc/m/TiZw6xM3BAAJ>
>    This is compiler-expansion-native.
>    4. Easy confusion with tuples
>    <https://groups.google.com/g/elixir-lang-core/c/XxnrGgZsyVc/m/WRhXxHDfBAAJ>
>    %{&:foo, &"bar"} is very different from {foo, bar}, rather than only 
>    one character different.
>    
> Additionally, it solves my main complaint with historical proposals: 
> syntax to combine a variable identifier with a literal must either obscure 
> that we are building an identifier, or obscure the key/string typing of the 
> literal.
>
> I'm proposing overloading the capture operator rather than introducing a 
> new operator because the capture operator already has a semantic 
> association with messing with variable scope, via the nested integer-based 
> positional function argument syntax (ex *& &1*).
>
> By using the capture operator we indicate that we are messing with an 
> identifier in scope, but via a literal atom/string we want to associate 
> with, to get the best of both worlds.
>
> *Pro: works with existing code*
>
> The capture operator today has well-defined compile-time-error semantics 
> if you try to pass it an atom or a string. All compiling Elixir code today 
> will continue to compile as before.
>
> *Pro: works with existing tooling*
>
> By overloading an existing operator, this approach works seamlessly for me 
> with the syntax highlighters I have tried it with so far, and reasonably 
> well with the formatter.
>
> In my experimentation I've found that the formatter wants to rewrite *&:baz* 
> to *(&:baz)* pretty often. That's good, because there are several edge 
> cases in my prototype where not doing so causes it to behave strangely; I'm 
> sure it's resolving ambiguities that would occur in function captures that 
> impact my proposal in ways I have yet to fully anticipate.
>
> *Pro: minimizes surface area of the language*
>
> By overloading the capture operator instead of introducing a new operator 
> or sigil, we are able to keep the surface area of this feature slim.
>
> *Con: overloads the capture operator*
>
> Of course, many of the virtues of this proposal come from overloading the 
> capture operator. But it is an already semantically fraught syntactic sugar 
> construct that causes confusion to newcomers, and this would place more 
> strain on it.
>
> We would need more than the meager error message modification 
> <https://github.com/elixir-lang/elixir/commit/3d83d21ada860d03cece8c6f90dbcf7bf9e737ec#diff-92b98063d1e86837fae15261896c265ab502b8d556141aaf1c34e67a3ef3717cL199-R207>
> in my prototype: we would also need to update the documentation, and 
> should anticipate a new wave of questions from the community upon release.
>
> This inelegance really shows when considering embedding a tagged variable 
> capture inside an anonymous function capture, ex *& &1 = &:foo*. In my 
> prototype I've chosen to allow this rather than error on "nested captures 
> not allowed" (would probably become: "nested *function* captures not 
> allowed"), but I'm not sure I found all the edge-cases of mixing them in 
> all possible constructions.
>
> Additionally, since my proposal now allows the capture operator as an 
> associative element inside map literal parsing, the syntax error reported 
> when providing a function capture as an associative element would be 
> generated during expansion rather than during parsing. I am not fluent 
> enough in leex to have updated the parser to preserve the exact old error. 
> Serendipitously, what my prototype reports today is pretty good 
> regardless, though I prefer the old behaviour:
>
> Old:
> %{& &1}
> *# !> **** (SyntaxError) syntax error before '}'*
> *# !> * |
> *# !> * 1 | %{& &1}
> *# !> * | ^
> New:
> %{& &1}
> *# => error: expected key-value pairs in a map, got: & &1*
> *# => ** (CompileError) cannot compile code (errors have been logged)*
>
> *Con: here there be dragons I cannot see*
>
> I'm quite sure a full implementation would require a lot more knowledge of 
> the compiler than I am able to provide. For example, *&:foo = &:foo *raises 
> an exception where *(&:foo) = &:foo* behaves as expected. I also find the 
> variable/context/binding environment implementation in the erlang part of 
> the compiler during expansion to be impenetrable, and I'm sure my prototype 
> fails on edge cases there.
>
> *Open Question: the pin operator*
>
> As this feature constructs a variable ref for you, it is not clear if/how 
> we should support attempts to pin the generated variable to avoid new 
> bindings. In my prototype, I have tried to support the pin operator via the 
> *&^:atom *syntax, though I'm pretty sure it's super buggy on bare 
> out-of-data-structure cases and I only got it far enough to work in 
> function heads for basic function head map pattern matching.
>
> *Open Question: charlists*
>
> I did not add support for charlist tagged variable captures in my 
> prototype, as it would be more involved to differentiate a capture of a 
> list meant to become a tagged tuple from a list representing the AST of a 
> function capture. I would not lose a lot of sleep over this.
>
> *Open Question: allowed contexts*
>
> Would we even want to allow this syntax construct outside of map literals? 
> Or list literals?
>
> I can certainly see people abusing the 
> bare-outside-of-associative-datastructure syntax to make some 
> nigh-impenetrable code where it's really unclear where assignment and 
> pattern matching are occurring, and relatedly this is where I see a lot of 
> odd edge-case behaviour in my prototype. I allowed it to speed up the 
> implementation, but it merits more discussion.
>
> On the other hand, this does seem like an... interesting use-case:
>
> error = "rate limit exceeded"
> &:error *# return error tuple*
>
> *Thanks for reading! What do you think?*
>
>
> --
> You received this message because you are subscribed to the Google Groups 
> "elixir-lang-core" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to elixir-lang-co...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/elixir-lang-core/ad7e0313-4207-4cb7-a5f3-d824735830abn%40googlegroups.com
>  
> <https://groups.google.com/d/msgid/elixir-lang-core/ad7e0313-4207-4cb7-a5f3-d824735830abn%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>

-- 
You received this message because you are subscribed to the Google Groups 
"elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elixir-lang-core+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elixir-lang-core/a19f6007-395d-4758-8a41-106bbaf26458n%40googlegroups.com.
