Hi Steve,

I agree, part of the work to make that a reality started with the proposal 
to make public some functionality from the Elixir 
formatter(https://groups.google.com/u/1/g/elixir-lang-core/c/-8CPorfVTxg).

>From a previous discussion in this mailing 
list(https://groups.google.com/u/1/g/elixir-lang-core/c/GM0yM5Su1Zc/m/poIKsiEVDQAJ)
 
my conclusion was that comments won't be added as part of the AST and 
having two ASTs(one for macros and one for source code manipulation that 
would have different semantics) would result in confusion, but if at least 
we could have access to a) the internal representation of comments in the 
formatter and b) a function to transform regular ast into an algebra 
document, then half of the work would already be done.

Because such change won't be done, the best place to figure it out would be 
in userland code, all we need is the formatter to "cooperate". I started 
working on the Sourceror library(https://github.com/doorgan/sourceror) to 
help solve this issue without having Elixir itself introduce breaking 
changes.

However, after the last comments-in-ast proposal we now have the 
literal_encoder option for Code.string_to_quoted/2, so maybe it would be 
worth exploring that front too. It may require a significant amount of work 
in the tokenizer, though.

-Dorgan
El domingo, 13 de junio de 2021 a las 13:36:20 UTC-3, [email protected] 
escribió:

>
> RE: And in any case comments are not part of the ast, it's up to the user 
> to figure out how to merge them together.
>
> Making it as easy as possible to preserve comments will aid AST 
> manipulation of files that are human created and then augmented 
> programmatically. 
>
> On Sun, Jun 6, 2021 at 9:20 AM i Dorgan <[email protected]> wrote:
>
>> It wouldn't be needed, with the `preserve_comments` option you already 
>> get the comment lines, so it's easy to figure out it's boundaries.
>> And in any case comments are not part of the ast, it's up to the user to 
>> figure out how to merge them together
>>
>> El sábado, 5 de junio de 2021 a las 21:45:09 UTC-3, [email protected] 
>> escribió:
>>
>>> Will that meta data extend to comments?
>>>
>>> On Fri, Jun 4, 2021 at 5:32 PM i Dorgan <[email protected]> wrote:
>>>
>>>> Sounds good!
>>>> I will send some PRs soon :)
>>>>
>>>> El viernes, 4 de junio de 2021 a las 18:10:41 UTC-3, José Valim 
>>>> escribió:
>>>>
>>>>> Ah, let’s call it :last then and it points to the segment. There is 
>>>>> always one too, so it is always available.
>>>>>
>>>>> On Fri, Jun 4, 2021 at 22:34 i Dorgan <[email protected]> wrote:
>>>>>
>>>>>> > I think for the dot we don't need end_of_expression, we just update 
>>>>>> the outer meta to include the outer identifier.
>>>>>> Sounds good to me
>>>>>>
>>>>>> >  For aliases, I guess we can reuse closing? Or maybe last_dot?
>>>>>> Closing points to the location of the closing pair, which implies 
>>>>>> there is something wrapped in {}, () or [](or end in the case of 
>>>>>> anonymous 
>>>>>> functions), which is why I was leaning towards end_of_expression. The 
>>>>>> only 
>>>>>> issue I see with end_of_expression is that we need to calculate the 
>>>>>> length 
>>>>>> of the segment(because end_of_expression always point at the very end of 
>>>>>> the expression, not just where the last token starts).
>>>>>>
>>>>>> The problem with last_dot is that the last segment may or may not be 
>>>>>> in the same line as the dot, for example:
>>>>>>
>>>>>> Foo.
>>>>>> Bar
>>>>>>
>>>>>> or
>>>>>>
>>>>>> Foo
>>>>>> .
>>>>>>
>>>>>> Bar
>>>>>>
>>>>>> Both of which evaluate to the same ast. So the name should refer to 
>>>>>> the last segment(:Bar in this case). Maybe last_segment? The syntax 
>>>>>> reference docs mention "each segment separated by dot as an argument", 
>>>>>> so 
>>>>>> it would be consistent with that description.
>>>>>>
>>>>>> > And what happens when the alias has no dot? We don't set it?
>>>>>> If the alias has no dot I think we could safely skip the new field, 
>>>>>> especially if we go for last_segment since there is only one segment.
>>>>>> El viernes, 4 de junio de 2021 a las 17:10:56 UTC-3, José Valim 
>>>>>> escribió:
>>>>>>
>>>>>>> I think for the dot we don't need end_of_expression, we just update 
>>>>>>> the outer meta to include the outer identifier. 
>>>>>>>
>>>>>>> For aliases, I guess we can reuse closing? Or maybe last_dot? And 
>>>>>>> what happens when the alias has no dot? We don't set it?
>>>>>>>
>>>>>>> On Fri, Jun 4, 2021 at 10:02 PM i Dorgan <[email protected]> wrote:
>>>>>>>
>>>>>>>> > The dot one is easy, I think we can have the outer meta be the 
>>>>>>>> meta of the call identifier. A PR is welcome.
>>>>>>>> Great! I will prepare a PR soon
>>>>>>>>
>>>>>>>> > One alternative is to have something similar to [end: ...] that 
>>>>>>>> we have for constructs like do-blocks, so we can at least say where 
>>>>>>>> the 
>>>>>>>> whole alias extends to? WDYT? 
>>>>>>>> Sounds reasonable to me. It would also be way less noisy.
>>>>>>>> I think what's most valuable is to be able to tell the boundaries 
>>>>>>>> of a node, not so much what happens in between.
>>>>>>>>
>>>>>>>> Regarding the naming of the fields, do you think end_of_expression 
>>>>>>>> would 
>>>>>>>> be fine for both? It is described as "denotes when the end of 
>>>>>>>> expression 
>>>>>>>> effectively happens", which is what we would be adding here. Moreover, 
>>>>>>>> they 
>>>>>>>> would be the same positions that are already added in such field if 
>>>>>>>> the 
>>>>>>>> expression is part of a block.
>>>>>>>> El viernes, 4 de junio de 2021 a las 16:13:11 UTC-3, José Valim 
>>>>>>>> escribió:
>>>>>>>>
>>>>>>>>> The dot one is easy, I think we can have the outer meta be the 
>>>>>>>>> meta of the call identifier. A PR is welcome.
>>>>>>>>>
>>>>>>>>> For aliases, it is trickier, as you said. One alternative is to 
>>>>>>>>> have something similar to [end: ...] that we have for constructs like 
>>>>>>>>> do-blocks, so we can at least say where the whole alias extends to? 
>>>>>>>>> WDYT?
>>>>>>>>>
>>>>>>>>> On Fri, Jun 4, 2021 at 6:00 PM i Dorgan <[email protected]> 
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> I'm writing a function that takes a quoted expression and 
>>>>>>>>>> calculates the start and end positions of the node in the source 
>>>>>>>>>> code. So 
>>>>>>>>>> for example for this expression:
>>>>>>>>>>
>>>>>>>>>> :"foo#{
>>>>>>>>>>   2
>>>>>>>>>> }bar"
>>>>>>>>>>
>>>>>>>>>> It would tell us that it starts at line: 1, column: 1 and ends 
>>>>>>>>>> at line: 3, column: 6. The idea is that by knowing the 
>>>>>>>>>> boundaries of a node, a refactoring tool can say things like 
>>>>>>>>>> "replace the 
>>>>>>>>>> code between these positions with this other code".
>>>>>>>>>>
>>>>>>>>>> The issue I'm facing is that there are two cases where the AST 
>>>>>>>>>> does not contain enough information to calculate those positions, 
>>>>>>>>>> the first 
>>>>>>>>>> one is qualified identifiers:
>>>>>>>>>>
>>>>>>>>>> foo
>>>>>>>>>> .
>>>>>>>>>> bar
>>>>>>>>>>
>>>>>>>>>> which produces the ast:
>>>>>>>>>>
>>>>>>>>>> {{:., [line: 2, column: 1],
>>>>>>>>>>   [
>>>>>>>>>>     {:foo, [line: 1, column: 1], nil},
>>>>>>>>>>     :bar
>>>>>>>>>>   ]},
>>>>>>>>>>  [no_parens: true, line: 2, column: 1], []}
>>>>>>>>>>
>>>>>>>>>> Note that we don't have any information about the location of 
>>>>>>>>>> :bar, only for the dot. This makes it impossible to accurately 
>>>>>>>>>> calculate the ending location for the expression, and we are forced 
>>>>>>>>>> to 
>>>>>>>>>> assume :bar is at the same line as the dot.
>>>>>>>>>>
>>>>>>>>>> The second case happens with aliases:
>>>>>>>>>>
>>>>>>>>>> Foo.
>>>>>>>>>> Bar
>>>>>>>>>> .Baz
>>>>>>>>>>
>>>>>>>>>> produces:
>>>>>>>>>>
>>>>>>>>>> {:__aliases__, [line: 1, column: 1], [:Foo, :Bar, :Baz]}
>>>>>>>>>>
>>>>>>>>>> Here we have even less information, we know nothing about dots or 
>>>>>>>>>> segments location, and we are forced to assume everything happens at 
>>>>>>>>>> the 
>>>>>>>>>> same line.
>>>>>>>>>>
>>>>>>>>>> I looked into the parser and this information is being discarded 
>>>>>>>>>> in the build_dot function for qualified identifiers and in 
>>>>>>>>>> build_dot_alias for aliases.
>>>>>>>>>>
>>>>>>>>>> My proposal is to keep that information in the ast metadata 
>>>>>>>>>> instead of discarding it when the :token_metadata option is true, 
>>>>>>>>>> similarly to how it is done with do/end, closing and 
>>>>>>>>>> end_of_expression.
>>>>>>>>>>
>>>>>>>>>> The quoted form of the first example would be something like this:
>>>>>>>>>>
>>>>>>>>>> {{:.,
>>>>>>>>>>   [
>>>>>>>>>>     identifier_location: [line: 3, column: 1],
>>>>>>>>>>     line: 2,
>>>>>>>>>>     column: 1
>>>>>>>>>>   ],
>>>>>>>>>>   [
>>>>>>>>>>     {:foo, [line: 1, column: 1], nil},
>>>>>>>>>>     :bar
>>>>>>>>>>   ]},
>>>>>>>>>>  [no_parens: true, line: 2, column: 1], []}
>>>>>>>>>>
>>>>>>>>>> For the aliases it would be a bit more involved, because there 
>>>>>>>>>> are two kind of locations that would need to be preserved: dots and 
>>>>>>>>>> segments. I've considered something like this to keep only the 
>>>>>>>>>> segments:
>>>>>>>>>>
>>>>>>>>>> {:__aliases__,
>>>>>>>>>>  [
>>>>>>>>>>    line: 1,
>>>>>>>>>>    column: 1,
>>>>>>>>>>    alias_segments: [
>>>>>>>>>>      [token: :Foo, line: 1, column: 1],
>>>>>>>>>>      [token: :Bar, line: 2, column: 1],
>>>>>>>>>>      [token: :Baz, line: 4, column: 1]
>>>>>>>>>>    ]
>>>>>>>>>>  ], [:Foo, :Bar, :Baz]}
>>>>>>>>>>
>>>>>>>>>> I already have a working version, so I will gladly submit a PR if 
>>>>>>>>>> you consider this to be viable. I'm still unsure on how to tackle 
>>>>>>>>>> the dots 
>>>>>>>>>> positions in a meaningful way. While just knowing the segments 
>>>>>>>>>> positions is 
>>>>>>>>>> enough for my use cases, I figure dot positions may also need to be 
>>>>>>>>>> preserved for the sake of completeness.
>>>>>>>>>>
>>>>>>>>>> I'd like to know your thoughts!
>>>>>>>>>>
>>>>>>>>>> -- 
>>>>>>>>>> You received this message because you are subscribed to the 
>>>>>>>>>> Google Groups "elixir-lang-core" group.
>>>>>>>>>> To unsubscribe from this group and stop receiving emails from it, 
>>>>>>>>>> send an email to [email protected].
>>>>>>>>>> To view this discussion on the web visit 
>>>>>>>>>> https://groups.google.com/d/msgid/elixir-lang-core/815d3113-dae6-4e99-8427-a873a704c4aan%40googlegroups.com
>>>>>>>>>>  
>>>>>>>>>> <https://groups.google.com/d/msgid/elixir-lang-core/815d3113-dae6-4e99-8427-a873a704c4aan%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>>>>> .
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> -- 
>>>>>>>> You received this message because you are subscribed to the Google 
>>>>>>>> Groups "elixir-lang-core" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it, 
>>>>>>>> send an email to [email protected].
>>>>>>>>
>>>>>>> To view this discussion on the web visit 
>>>>>>>> https://groups.google.com/d/msgid/elixir-lang-core/ffd16955-f791-40b8-bd40-1cf37322995an%40googlegroups.com
>>>>>>>>  
>>>>>>>> <https://groups.google.com/d/msgid/elixir-lang-core/ffd16955-f791-40b8-bd40-1cf37322995an%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>>> .
>>>>>>>>
>>>>>>> -- 
>>>>>> You received this message because you are subscribed to the Google 
>>>>>> Groups "elixir-lang-core" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it, 
>>>>>> send an email to [email protected].
>>>>>>
>>>>> To view this discussion on the web visit 
>>>>>> https://groups.google.com/d/msgid/elixir-lang-core/ca7b5fe2-b767-40fe-899f-ab915989f2c4n%40googlegroups.com
>>>>>>  
>>>>>> <https://groups.google.com/d/msgid/elixir-lang-core/ca7b5fe2-b767-40fe-899f-ab915989f2c4n%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>> .
>>>>>>
>>>>> -- 
>>>> You received this message because you are subscribed to the Google 
>>>> Groups "elixir-lang-core" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>> an email to [email protected].
>>>>
>>> To view this discussion on the web visit 
>>>> https://groups.google.com/d/msgid/elixir-lang-core/7c687cbf-ae58-40e7-87c7-609613751872n%40googlegroups.com
>>>>  
>>>> <https://groups.google.com/d/msgid/elixir-lang-core/7c687cbf-ae58-40e7-87c7-609613751872n%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>> .
>>>>
>>>
>>>
>>> -- 
>>> Steve Morin | Entrepreneur, Engineering Leader, Startup Advisor 
>>> Editor at | https://productivegrowth.substack.com/
>>> twitter.com/SteveMorin | stevemorin.com
>>> *Live the dream start a startup. Make the world ... a better place.*
>>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elixir-lang-core" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>>
> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elixir-lang-core/7b050ed2-c3f0-45e6-acf2-e340982a0d35n%40googlegroups.com
>>  
>> <https://groups.google.com/d/msgid/elixir-lang-core/7b050ed2-c3f0-45e6-acf2-e340982a0d35n%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>>
>
>
> -- 
> Steve Morin | Entrepreneur, Engineering Leader, Startup Advisor 
> Editor at | https://productivegrowth.substack.com/
> twitter.com/SteveMorin | stevemorin.com
> *Live the dream start a startup. Make the world ... a better place.*
>

-- 
You received this message because you are subscribed to the Google Groups 
"elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elixir-lang-core/3e5c5364-e738-4458-a986-7971316bebd8n%40googlegroups.com.

Reply via email to