Re: [elixir-core:11447] [Proposal] Overload capture operator to support tagged variable captures

Austin Ziegler Wed, 28 Jun 2023 20:16:15 -0700

On Wed, Jun 28, 2023 at 8:41 PM Paul Schoenfelder <
paulschoenfel...@fastmail.com> wrote:


> I have an almost visceral reaction to the use of capture syntax for this
> though, and I don’t believe any of the languages you mentioned that support
> field punning do so in this fashion. They all use a similar intuitive
> syntax where the variable matches the field name, and they don’t make any
> effort to support string keys.
>

JavaScript *only* supports string keys. Ruby’s pattern matching, which can
lead to field punning, only supports symbol keys, but since ~2.2 Ruby can
garbage collect symbols, making it *somewhat* less dangerous to do
`JSON.parse(data, symbolize_names: true)` than it was previously.

As far as I know, the BEAM does not do any atom garbage collection, and
supporting *only* atoms will lead to a greater chance of atom exhaustion,
because an unflagged mechanism that works only on atom keys will push people
toward `Jason.decode(data, keys: :atoms)` (and not `Jason.decode(data, keys:
:atoms!)`). I do not think that any destructuring syntax which works on maps
with atom keys but not string keys will be acceptable, although if it is
constrained to *only* work on structs, then it does not matter (that appears
to be the same restriction that OCaml and Haskell have).
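
To make the atom-exhaustion concern concrete, here is a minimal sketch in
today's Elixir. The payload and the `to_known_atom` helper are hypothetical
names made up for illustration; only `String.to_existing_atom/1` is a real
standard-library function.

```elixir
# Keys as they might arrive from an external JSON payload (hypothetical data).
payload = %{"name" => "Ada", "unexpected_key_4f2a9c" => true}

# Matching on string keys creates no atoms, whatever the payload contains.
%{"name" => name} = payload

# String.to_atom/1 would create a new, never-collected atom for every distinct
# key. String.to_existing_atom/1 raises ArgumentError for unknown atoms instead.
# `to_known_atom` is a hypothetical helper, not part of any library.
to_known_atom = fn key ->
  try do
    {:ok, String.to_existing_atom(key)}
  rescue
    ArgumentError -> {:error, key}
  end
end

# :ok already exists (it is used in this very file), so this conversion succeeds.
known = to_known_atom.("ok")

# No atom with this name exists, so the conversion is refused.
unknown = to_known_atom.("unexpected_key_4f2a9c")

IO.inspect({name, known, unknown})
```

A shorthand that only handles atom keys nudges people away from the first
style (string keys, no atoms created) and toward converting untrusted keys.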

I think that either `&:key` / `&"key"` or `$:key` / `$"key"` will work very
nicely for this feature, although it would be nice to have `&key:` or
`$key:` work the same as the former version. Alternatively, the `$` symbol
could be used at the beginning of the data structure to indicate that it is
performing capture destructuring (e.g., `$%{key1:, key2:}` or `$%{"key1",
"key2"}`), but then it starts feeling a little more line-noisy.

I think that the proposal here — either using `&` or `$` — is entirely
workable and IMO extends the concept nicely.
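
For reference, a sketch of what each of the spellings above would stand for in
today's Elixir, assuming the expansion described in the proposal quoted below.
The sugar itself appears only in comments, since none of it compiles on a
released Elixir.

```elixir
{key1, key2} = {1, 2}

# Proposed sugar (not valid today): %{&:key1, &:key2} or %{$:key1, $:key2}
atom_map = %{key1: key1, key2: key2}

# Proposed sugar (not valid today): %{&"key1", &"key2"} or %{$"key1", $"key2"}
string_map = %{"key1" => key1, "key2" => key2}

# Destructuring with the same shape rebinds the same-named variables.
# Proposed sugar (not valid today): %{&"key1", &"key2"} = ...
%{"key1" => key1, "key2" => key2} = %{"key1" => 10, "key2" => 20}

IO.inspect({atom_map, string_map, key1, key2})
```

In each spelling the key literal stays visible, so it remains clear whether an
atom or a string is being matched.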

-a

On Wed, Jun 28, 2023, at 7:56 PM, Christopher Keele wrote:
>
> This is a formalization of my concept here
> <https://groups.google.com/g/elixir-lang-core/c/oFbaOT7rTeU/m/BWF24zoAAgAJ>,
> as a first-class proposal for explicit discussion/feedback, since I now
> have a working prototype
> <https://github.com/elixir-lang/elixir/compare/main...christhekeele:elixir:tagged-variable-capture>
> .
>
> *Goal*
>
> The aim of this proposal is to support a commonly requested feature:
> *shorthand construction and pattern matching of key/value pairs of
> associative data structures, based on variable names* in the current scope.
>
> *Context*
>
> Similar shorthand syntax sugar exists in many programming languages today,
> known variously as:
>
>    - Field Punning <https://dev.realworldocaml.org/records.html> (OCaml)
>    - Record Puns
>    <https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/record_puns.html>
>    (Haskell)
>    - Object Property Value Shorthand
>    <https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Object_initializer#property_definitions>
>    (ES6 JavaScript)
>
> This feature has been in discussion for a decade, on this mailing list (1
> <https://groups.google.com/g/elixir-lang-core/c/4w9eOeLvt-8/m/WOkoPSMm6kEJ>,
> 2
> <https://groups.google.com/g/elixir-lang-core/c/NoUo2gqQR3I/m/WTpArTGMKSIJ>,
> 3
> <https://groups.google.com/g/elixir-lang-core/c/3XrVXEVSixc/m/NHU2M4QFAQAJ>,
> 4
> <https://groups.google.com/g/elixir-lang-core/c/OvSQkvXxsmk/m/bKKHbBxiCwAJ>,
> 5
> <https://groups.google.com/g/elixir-lang-core/c/XxnrGgZsyVc/m/1W-d_XAlBgAJ>
> , 6 <https://groups.google.com/g/elixir-lang-core/c/oFbaOT7rTeU>) and the
> Elixir forum (1
> <https://elixirforum.com/t/proposal-add-field-puns-map-shorthand-to-elixir/15452>,
> 2
> <https://elixirforum.com/t/shorthand-for-passing-variables-by-name/30583>,
> 3
> <https://elixirforum.com/t/if-you-could-change-one-thing-in-elixir-language-what-you-would-change/19902/17>,
> 4
> <https://elixirforum.com/t/has-map-shorthand-syntax-in-other-languages-caused-you-any-problems/15403>,
> 5
> <https://elixirforum.com/t/es6-ish-property-value-shorthands-for-maps/1524>,
> 6
> <https://elixirforum.com/t/struct-creation-pattern-matching-short-hand/7544>),
> and has motivated many libraries (1
> <https://github.com/whatyouhide/short_maps>, 2
> <https://github.com/meyercm/shorter_maps>, 3
> <https://hex.pm/packages/shorthand>, 4 <https://hex.pm/packages/synex>).
> These narrow margins cannot fit the full history of possibilities,
> proposals, and problems with this feature, and I will not attempt to
> summarize them all. For context, I suggest reading this mailing list
> proposal
> <https://groups.google.com/g/elixir-lang-core/c/XxnrGgZsyVc/m/1W-d_XAlBgAJ>
> and this community discussion
> <https://elixirforum.com/t/proposal-add-field-puns-map-shorthand-to-elixir/15452>
> in particular.
>
> However, in summary, this particular proposal tries to solve a couple of
> past sticking points:
>
>    1. Atom vs String
>    <https://groups.google.com/g/elixir-lang-core/c/NoUo2gqQR3I/m/IpZQHbZk4xEJ>
>    key support
>    2. Visual clarity
>    <https://groups.google.com/g/elixir-lang-core/c/XxnrGgZsyVc/m/NBkAVto0BAAJ>
>    that atom/string matching is occurring
>    3. Limitations of string-based sigil parsing
>    <https://groups.google.com/g/elixir-lang-core/c/XxnrGgZsyVc/m/TiZw6xM3BAAJ>
>    4. Easy confusion
>    <https://groups.google.com/g/elixir-lang-core/c/XxnrGgZsyVc/m/WRhXxHDfBAAJ>
>    with tuples
>
> I have a working fork of Elixir here
> <https://github.com/christhekeele/elixir/tree/tagged-variable-capture>
> where this proposed syntax can be experimented with. Be warned, it is buggy.
>
> *Proposal: Tagged Variable Captures*
>
> I propose we overload the unary capture operator (*&*) to accept
> compile-time atoms and strings as arguments, for example *&:foo* and
> *&"bar"*. This would *expand at compile time* into *a tagged tuple with
> the atom/string and a variable reference*. For now, I am calling this a
> *"tagged-variable capture"* to differentiate it from a function capture.
>
> For the purposes of this proposal, assume:
>
> {foo, bar} = {1, 2}
>
> Additionally,
>
>    - Lines beginning with *# == * indicate what the compiler expands an
>    expression to.
>    - Lines beginning with *# => * represent the result of evaluating that
>    expression.
>    - Lines beginning with *# !> * represent an exception.
>
> *Bare Captures*
>
> I'm not sure if we should support *bare* tagged-variable capture, but it
> is illustrative for this proposal, so I left it in my prototype. It would
> look like:
>
> &:foo
> # == {:foo, foo}
> # => {:foo, 1}
> &"foo"
> # == {"foo", foo}
> # => {"foo", 1}
>
> If bare usage is supported, this expansion would work as expected in match
> and guard contexts as well, since it expands before variable references are
> resolved:
>
> {:foo, baz} = &:foo
> # == {:foo, baz} = {:foo, foo}
> # => {:foo, 1}
> baz
> # => 1
>
> *List Captures*
>
> Since capture expressions are allowed in lists, this can be used to
> construct Keyword lists from the local variable scope elegantly:
>
> list = [&:foo, &:bar]
> # == list = [{:foo, foo}, {:bar, bar}]
> # => [foo: 1, bar: 2]
>
> This would work with other list operators like *|*:
>
> baz = 3
> list = [&:baz | list]
> # == list = [{:baz, baz} | list]
> # => [baz: 3, foo: 1, bar: 2]
>
> And list destructuring:
>
> {foo, bar, baz} = {nil, nil, nil}
> [&:baz, &:foo, &:bar] = list
> # == [{:baz, baz}, {:foo, foo}, {:bar, bar}] = list
> # => [baz: 3, foo: 1, bar: 2]
> {foo, bar, baz}
> # => {1, 2, 3}
>
> *Map Captures*
>
> With a small change to the parser
> <https://github.com/elixir-lang/elixir/commit/0a4f5376c0f9b4db7d71514d05df6b8b6abc96a9>,
> we can allow this expression inside map literals. Because this expression
> individually gets expanded into a tagged tuple before the map associations
> list as a whole is processed, it allows this syntax to work in all existing
> map/struct constructs, like map construction:
>
> map = %{&:foo, &"bar"}
> # == %{:foo => foo, "bar" => bar}
> # => %{:foo => 1, "bar" => 2}
>
> Map updates:
>
> foo = 3
> map = %{map | &:foo}
> # == %{map | :foo => foo}
> # => %{:foo => 3, "bar" => 2}
>
> And map destructuring:
>
> {foo, bar} = {nil, nil}
> %{&:foo, &"bar"} = map
> # == %{:foo => foo, "bar" => bar} = map
> # => %{:foo => 3, "bar" => 2}
> {foo, bar}
> # => {3, 2}
>
> *Considerations*
>
> Though just based on an errant thought
> <https://groups.google.com/g/elixir-lang-core/c/oFbaOT7rTeU/m/BWF24zoAAgAJ>
> that popped into my head yesterday, I'm unreasonably pleased with how well
> this works and reads in practice. I will present my thoughts here, though
> again I encourage you to grab my branch
> <https://github.com/christhekeele/elixir/tree/tagged-variable-capture>,
> compile it from source
> <https://github.com/christhekeele/elixir/tree/tagged-variable-capture#compiling-from-source>,
> and play with it yourself!
>
> *Pro: solves existing pain points*
>
> As mentioned, this solves flaws previous proposals suffer from:
>
>    1. Atom vs String
>    <https://groups.google.com/g/elixir-lang-core/c/NoUo2gqQR3I/m/IpZQHbZk4xEJ>
>    key support
>    This supports both.
>    2. Visual clarity
>    <https://groups.google.com/g/elixir-lang-core/c/XxnrGgZsyVc/m/NBkAVto0BAAJ>
>    that atom/string matching is occurring
>    This leverages the appropriate literal in question within the syntax
>    sugar.
>    3. Limitations of string-based sigil parsing
>    <https://groups.google.com/g/elixir-lang-core/c/XxnrGgZsyVc/m/TiZw6xM3BAAJ>
>    This is compiler-expansion-native.
>    4. Easy confusion
>    <https://groups.google.com/g/elixir-lang-core/c/XxnrGgZsyVc/m/WRhXxHDfBAAJ>
>    with tuples
>    %{&:foo, &"bar"} is very different from {foo, bar}, rather than
>    differing by a single character.
>
> Additionally, it solves my main complaint with historical proposals:
> syntax to combine a variable identifier with a literal must either obscure
> that we are building an identifier, or obscure the atom/string typing of the
> literal.
>
> I'm proposing overloading the capture operator rather than introducing a
> new operator because the capture operator already has a semantic
> association with messing with variable scope, via the nested integer-based
> positional function argument syntax (ex *& &1*).
>
> By using the capture operator we indicate that we are messing with an
> identifier in scope, but via a literal atom/string we want to associate
> with, to get the best of both worlds.
>
> *Pro: works with existing code*
>
> The capture operator today has well-defined compile-time-error semantics
> if you try to pass it an atom or a string. All compiling Elixir code today
> will continue to compile as before.
>
> *Pro: works with existing tooling*
>
> By overloading an existing operator, this approach works seamlessly for me
> with the syntax highlighters I have tried it with so far, and reasonably
> well with the formatter.
>
> In my experimentation I've found that the formatter wants to rewrite
> *&:baz* to *(&:baz)* pretty often. That's good, because there are several
> edge cases in my prototype where not doing so causes it to behave
> strangely; I'm sure it's resolving ambiguities that would occur in function
> captures that impact my proposal in ways I have yet to fully anticipate.
>
> *Pro: minimizes surface area of the language*
>
> By overloading the capture operator instead of introducing a new operator
> or sigil, we are able to keep the surface area of this feature slim.
>
> *Con: overloads the capture operator*
>
> Of course, many of the virtues of this proposal come from overloading the
> capture operator. But it is already a semantically fraught piece of
> syntactic sugar that causes confusion to newcomers, and this would place
> more strain on it.
>
> We would need more than the meager error message modification
> <https://github.com/elixir-lang/elixir/commit/3d83d21ada860d03cece8c6f90dbcf7bf9e737ec#diff-92b98063d1e86837fae15261896c265ab502b8d556141aaf1c34e67a3ef3717cL199-R207>
> in my prototype: we would also need to update the documentation and
> anticipate a new wave of questions from the community upon release.
>
> This inelegance really shows when considering embedding a tagged variable
> capture inside an anonymous function capture, ex *& &1 = &:foo*. In my
> prototype I've chosen to allow this rather than error on "nested captures
> not allowed" (would probably become: "nested *function* captures not
> allowed"), but I'm not sure I found all the edge-cases of mixing them in
> all possible constructions.
>
> Additionally, since my proposal now allows the capture operator as an
> associative element inside map literal parsing, the syntax error reported
> when a function capture is provided as an associative element would be
> generated during expansion rather than during parsing. I am not fluent
> enough in leex to have updated the parser to preserve the exact old error.
> Serendipitously, what my prototype reports today is pretty good regardless,
> though I prefer the old behaviour:
>
> Old:
> %{& &1}
> # !> ** (SyntaxError) syntax error before '}'
> # !>  |
> # !>  1 | %{& &1}
> # !>  | ^
> New:
> %{& &1}
> # => error: expected key-value pairs in a map, got: & &1
> # => ** (CompileError) cannot compile code (errors have been logged)
>
> *Con: here there be dragons I cannot see*
>
> I'm quite sure a full implementation would require a lot more knowledge of
> the compiler than I am able to provide. For example, *&:foo = &:foo* raises
> an exception where *(&:foo) = &:foo* behaves as expected. I also find the
> variable/context/binding environment implementation in the Erlang part of
> the compiler during expansion to be impenetrable, and I'm sure my prototype
> fails on edge cases there.
>
> *Open Question: the pin operator*
>
> As this feature constructs a variable ref for you, it is not clear if/how
> we should support attempts to pin the generated variable to avoid new
> bindings. In my prototype, I have tried to support the pin operator via the
> *&^:atom* syntax, though I'm pretty sure it's super buggy on bare
> out-of-data-structure cases and I only got it far enough to work in
> function heads for basic function head map pattern matching.
>
> *Open Question: charlists*
>
> I did not add support for charlist tagged variable captures in my
> prototype, as it would be more involved to differentiate a capture of a
> list meant to become a tagged tuple from a list representing the AST of a
> function capture. I would not lose a lot of sleep over this.
>
> *Open Question: allowed contexts*
>
> Would we even want to allow this syntax construct outside of map literals?
> Or list literals?
>
> I can certainly see people abusing the
> bare-outside-of-associative-datastructure syntax to make some
> nigh-impenetrable code where it's really unclear where assignment and
> pattern matching are occurring, and relatedly this is where I see a lot of
> odd edge-case behaviour in my prototype. I allowed it to speed up the
> implementation, but it merits more discussion.
>
> On the other hand, this does seem like an... interesting use-case:
>
> error = "rate limit exceeded"
> &:error # return error tuple
>
> *Thanks for reading! What do you think?*
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "elixir-lang-core" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elixir-lang-core+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elixir-lang-core/ad7e0313-4207-4cb7-a5f3-d824735830abn%40googlegroups.com
> <https://groups.google.com/d/msgid/elixir-lang-core/ad7e0313-4207-4cb7-a5f3-d824735830abn%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>


-- 
Austin Ziegler • halosta...@gmail.com • aus...@halostatue.ca
http://www.halostatue.ca/ • http://twitter.com/halostatue
