I believe those ideas are promising. There is one issue with
quoted_to_algebra, though: the AST that the formatter works with is a
special one, where the literals are wrapped in blocks so we can store their
metadata. This means that, in order to have quoted_to_algebra, we would
need to change the formatter to also handle “regular AST” or nodes with
limited metadata.
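
To illustrate the difference, here is a minimal sketch. It uses the public `:literal_encoder` option of `Code.string_to_quoted!/2` (available in recent Elixir versions); that this matches the formatter's private internal representation exactly is an assumption made only for illustration.

```elixir
source = ":baz"

# Regular AST: a bare literal is just the value, with nowhere to keep metadata.
regular = Code.string_to_quoted!(source)

# Wrapping each literal in a :__block__ node gives it a metadata slot.
# Assumption: this approximates the special AST the formatter works with.
wrapped =
  Code.string_to_quoted!(source,
    literal_encoder: fn literal, meta -> {:ok, {:__block__, meta, [literal]}} end
  )

IO.inspect(regular, label: "regular")
IO.inspect(wrapped, label: "wrapped")
```

A formatter that accepts only the wrapped form would have to learn to treat the bare `regular` value as well, which is the extra work in question.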

It is doable, but it takes work, and it is a requirement for exposing said
functionality.

About the comments, I like the suggestion, although we should probably move
from a tuple to a map to avoid breaking changes in the future. A PR for
this particular issue is very welcome!
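
A small sketch of why a map is more future-proof than a tuple. The tuple shape is the one from the proposal; the map keys and the extra `:column` field are hypothetical names chosen for illustration, not a committed API.

```elixir
# Tuple shape described in the proposal: {line, {previous_eol, next_eol}, text}
tuple_comment = {2, {1, 1}, "# Some comment"}
{line, {previous_eol, next_eol}, text} = tuple_comment

# Hypothetical map shape carrying the same information.
map_comment = %{line: line, previous_eol: previous_eol, next_eol: next_eol, text: text}

# Adding a field later (a made-up :column here) keeps existing matches working,
# because a map pattern only needs the keys it names.
extended_map = Map.put(map_comment, :column, 3)
%{line: 2, text: "# Some comment"} = extended_map

# Adding an element to the tuple changes its size, so the old pattern no
# longer matches and every caller would have to be updated.
extended_tuple = {2, {1, 1}, "# Some comment", 3}
old_pattern_still_matches? = match?({_, {_, _}, _}, extended_tuple)

IO.inspect(old_pattern_still_matches?, label: "old tuple pattern matches")
```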

Thank you for the proposal and thinking about these problems.

On Wed, May 5, 2021 at 21:16 i Dorgan <[email protected]> wrote:

> Hi all,
>
> The motivation for this proposal is to make it easier for tools to alter
> and format Elixir code while preserving the current behavior of the
> formatter. Most of the functionality is already there, and a small change
> in the APIs would enable a wide variety of new use cases.
>
> If we want to transform a piece of code from one form to another, we can
> modify the AST and then convert it back to a string. For example, a tool
> could detect the usage of `String.to_atom` and not only warn about unsafe
> string to atom conversion, but also give an option to automatically fix the
> issue, replacing it with `String.to_existing_atom`. The first part is
> already covered by tools like Credo, but it seems that manipulating the
> source code itself is difficult, mostly because the AST does not contain
> information about comments and because `Macro.to_string` doesn't produce
> text that complies with the Elixir coding conventions. For example, this
> code:
> ```elixir
> def foo(bar) do
>   # Some comment
>   :baz
> end
> ```
> Would be printed as this:
> ```elixir
> def(foo(bar)) do
>   :baz
> end
> ```
> Tools like https://github.com/KronicDeth/intellij-elixir implement their
> own parsers to circumvent this issue, but I think it would be nice if it
> could be achieved with the existing parser in core Elixir.
>
> I've seen other conversations where it was suggested to change the Elixir
> AST to support comments, either by adding a new comment node (a breaking
> change) or by using the nodes' metadata, but the conclusion was that there
> was no clear preference on how to do this at the AST level. Since the Elixir
> tokenizer allows us to pass a function to process the comments, José
> suggested keeping them in a side data structure instead of in the AST
> itself. This is what the Elixir formatter does.
>
> Currently the `Code.Formatter` module is a private API used by the
> `Code.format_string` function. This means that the only way to format
> Elixir code is by providing a string (or a file) to a function in the `Code`
> module. If we are transforming the code, however, what we have is a quoted
> expression, and we don't have a way to turn it back into a formatted string.
>
> At a high level, `Code.Formatter.to_algebra` does three things:
> 1. It extracts the comments from the source code
> 2. It parses the source code into a quoted expression
> 3. It takes the comments and the quoted expressions and merges them to
> produce an algebra document
>
> What I propose is to split `Code.Formatter.to_algebra` along those steps
> and expose the functionality via the `Code` module. The
> reasoning is that if a user has access to both the AST and the comments,
> they can transform the AST and then hand both the AST and the comments back
> to the formatter to produce an algebra document. *How they implement this
> is up to them*, and Elixir doesn't need to give an opinion on how comments
> should be handled during those manipulations, nor does it need to expose
> the private version of the AST used internally by the formatter. If they
> want to merge comments into the metadata or use custom nodes, it's
> completely up to them; they just need to return a valid quoted
> expression and a list of comments with their metadata.
>
> The other reason I think this should be done by exposing those functions
> is that a great amount of work has been put into the formatter to turn
> quoted expressions into formatted algebra documents, and I think all of
> that could be reused, eliminating the need for custom formatters.
>
> The workflow would be something like this:
> ```elixir
> {:ok, {quoted, comments}} =
>   path |> File.read!() |> Code.string_to_quoted_with_comments()
> quoted = do_some_manipulation(quoted, comments)
> {:ok, doc} = Code.quoted_to_algebra(quoted, comments: comments)
> new_source = Inspect.Algebra.format(doc, 80)
> ```
> I'm not married to those function names and return values, but I hope they
> serve to convey the idea.
>
> I already have some small examples of source code transformations using
> an API like the above. I'm working on reducing and tidying them, but in the
> meantime I would like to hear your opinions on this proposal.
>
> The only downside I can think of is that the `{line, {previous_eol,
> next_eol}, text}` format of the comments may be considered a private data
> structure, but the formatter hasn't changed much since 1.6 (3 years ago) and
> I think it could be considered stable enough to be exposed.
>
> --
> You received this message because you are subscribed to the Google Groups
> "elixir-lang-core" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elixir-lang-core/5f431518-f555-48bb-a999-ec49f6423463n%40googlegroups.com
> <https://groups.google.com/d/msgid/elixir-lang-core/5f431518-f555-48bb-a999-ec49f6423463n%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>

