Hi all,

I'm writing a function that takes a quoted expression and calculates the 
start and end positions of the node in the source code. For example, for 
this expression:

:"foo#{
  2
}bar"

It would tell us that it starts at line: 1, column: 1 and ends at line: 3, 
column: 6. The idea is that by knowing the boundaries of a node, a 
refactoring tool can say things like "replace the code between these 
positions with this other code".
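
To make the "replace the code between these positions" idea concrete, here 
is a minimal sketch of what a tool could do once it has both boundaries. 
The RangeReplace module and its replace/4 function are hypothetical names 
used only for illustration. The sketch assumes 1-based lines and columns, an 
exclusive end column (as in the example above), "\n" line endings, and 
columns counted in graphemes.

```elixir
defmodule RangeReplace do
  # Hypothetical helper, not part of Elixir or any existing tool.
  # Replaces the text between start_pos (inclusive) and end_pos (exclusive)
  # with `new`. Positions are keyword lists shaped like AST metadata.
  def replace(source, [line: sl, column: sc], [line: el, column: ec], new) do
    lines = String.split(source, "\n")

    first = Enum.at(lines, sl - 1)
    last = Enum.at(lines, el - 1)

    # Text kept from the first line, before the start column.
    head = String.slice(first, 0, sc - 1)
    # Text kept from the last line, from the end column onwards.
    tail = String.slice(last, ec - 1, String.length(last))

    untouched_before = Enum.take(lines, sl - 1)
    untouched_after = Enum.drop(lines, el)

    Enum.join(untouched_before ++ [head <> new <> tail] ++ untouched_after, "\n")
  end
end

# RangeReplace.replace("x = foo\n.\nbar\ny", [line: 1, column: 5], [line: 3, column: 4], "baz")
# returns "x = baz\ny"
```

The whole approach only works if the end position is accurate, which is 
what the rest of this message is about.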

The issue I'm facing is that there are two cases where the AST does not 
contain enough information to calculate those positions. The first one is 
qualified identifiers:

foo
.
bar

which produces the AST:

{{:., [line: 2, column: 1],
  [
    {:foo, [line: 1, column: 1], nil},
    :bar
  ]},
 [no_parens: true, line: 2, column: 1], []}

Note that we don't have any location information for :bar, only for the 
dot. This makes it impossible to accurately calculate the end position of 
the expression, and we are forced to assume :bar is on the same line as 
the dot.

The second case happens with aliases:

Foo.
Bar
.Baz

produces:

{:__aliases__, [line: 1, column: 1], [:Foo, :Bar, :Baz]}

Here we have even less information: we know nothing about the locations of 
the dots or the segments, and we are forced to assume everything is on the 
same line.
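
As an illustration of that fallback, here is a rough sketch of the 
same-line estimate. AliasEstimate and estimate_end/1 are hypothetical names, 
and the sketch assumes every segment is a plain atom and that the end column 
is exclusive.

```elixir
defmodule AliasEstimate do
  # Hypothetical helper: guesses the end of an alias node from the only
  # position available, pretending the alias was written as "Foo.Bar.Baz"
  # on a single line.
  def estimate_end({:__aliases__, meta, segments}) do
    line = Keyword.fetch!(meta, :line)
    column = Keyword.fetch!(meta, :column)

    # Width of the segments joined by dots, e.g. "Foo.Bar.Baz" is 11.
    width =
      segments
      |> Enum.map_join(".", &Atom.to_string/1)
      |> String.length()

    [line: line, column: column + width]
  end
end

# AliasEstimate.estimate_end({:__aliases__, [line: 1, column: 1], [:Foo, :Bar, :Baz]})
# returns [line: 1, column: 12]
```

For the three-line source above the estimate lands on line 1, while the 
alias really ends on line 3, so a replacement based on it would cut the 
wrong range.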

I looked into the parser and this information is being discarded in the 
build_dot function for qualified identifiers and in build_dot_alias for 
aliases.

My proposal is that, when the :token_metadata option is true, the parser 
keeps that information in the AST metadata instead of discarding it, 
similar to how it is done with do/end, closing and end_of_expression.

The quoted form of the first example would be something like this:

{{:.,
  [
    identifier_location: [line: 3, column: 1],
    line: 2,
    column: 1
  ],
  [
    {:foo, [line: 1, column: 1], nil},
    :bar
  ]},
 [no_parens: true, line: 2, column: 1], []}
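
With metadata shaped like that, the end of the expression could be derived 
roughly as in the sketch below. Keep in mind that :identifier_location is 
only the key proposed in this message, and DotRange and end_position/1 are 
hypothetical names. The sketch assumes an exclusive end column and falls 
back to the same-line guess when the key is absent.

```elixir
defmodule DotRange do
  # Hypothetical helper for qualified identifiers such as foo.bar.
  # :identifier_location is the metadata key proposed in this message.
  def end_position({{:., dot_meta, [_left, right]}, _call_meta, _args})
      when is_atom(right) do
    width = right |> Atom.to_string() |> String.length()

    case Keyword.fetch(dot_meta, :identifier_location) do
      {:ok, location} ->
        # Accurate: start of the identifier plus its width.
        [line: location[:line], column: location[:column] + width]

      :error ->
        # Guess: the identifier sits right after the dot, on the same line.
        [line: dot_meta[:line], column: dot_meta[:column] + 1 + width]
    end
  end
end

# For the quoted form above this returns [line: 3, column: 4].
# Without :identifier_location it returns [line: 2, column: 5].
```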

For the aliases it would be a bit more involved, because there are two 
kinds of locations that would need to be preserved: dots and segments. I've 
considered something like this to keep only the segments:

{:__aliases__,
 [
   line: 1,
   column: 1,
   alias_segments: [
     [token: :Foo, line: 1, column: 1],
     [token: :Bar, line: 2, column: 1],
     [token: :Baz, line: 3, column: 2]
   ]
 ], [:Foo, :Bar, :Baz]}
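
A consumer could then take the end position from the last segment, along 
the lines of this sketch. Again, :alias_segments is only the key proposed 
here, and AliasSegments and end_position/1 are hypothetical names. The 
example input places :Baz at line 3, column 2, which is where it sits in the 
three-line source shown earlier.

```elixir
defmodule AliasSegments do
  # Hypothetical helper. Reads the proposed :alias_segments metadata and
  # returns the (exclusive) end position of the last segment.
  def end_position({:__aliases__, meta, _segments}) do
    case Keyword.fetch(meta, :alias_segments) do
      {:ok, [_ | _] = segment_locations} ->
        last = List.last(segment_locations)
        width = last[:token] |> Atom.to_string() |> String.length()
        {:ok, [line: last[:line], column: last[:column] + width]}

      _ ->
        # No segment locations: the caller has to fall back to a guess.
        :error
    end
  end
end

# meta = [
#   line: 1,
#   column: 1,
#   alias_segments: [
#     [token: :Foo, line: 1, column: 1],
#     [token: :Bar, line: 2, column: 1],
#     [token: :Baz, line: 3, column: 2]
#   ]
# ]
# AliasSegments.end_position({:__aliases__, meta, [:Foo, :Bar, :Baz]})
# returns {:ok, [line: 3, column: 5]}
```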

I already have a working version, so I will gladly submit a PR if you 
consider this to be viable. I'm still unsure how to handle the dot 
positions in a meaningful way. While knowing just the segment positions is 
enough for my use cases, I figure dot positions may also need to be 
preserved for the sake of completeness.

I'd like to know your thoughts!
