Looking forward to any progress on this front. If anyone is interested, I was playing with these ideas and you can see some example use cases here: https://github.com/wojtekmach/fix/blob/master/test/fix_test.exs. However, precisely because Code.Formatter is private, I had to vendor it, and the vendored code broke on recent Elixir versions.
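
To make the limitation concrete, here is a minimal sketch (written for this message, not taken from the repository above) that uses only public functions: `Code.string_to_quoted!/1`, `Macro.prewalk/2` and `Macro.to_string/1`. The sample source and the variable names are just assumptions for illustration. It rewrites `String.to_atom` into `String.to_existing_atom` at the AST level, and the printed result shows that the comment does not survive the round trip:

```elixir
source = """
def foo(bar) do
  # Some comment
  String.to_atom(bar)
end
"""

# Parse the source into a quoted expression. The parser discards comments,
# so they are already gone at this point.
quoted = Code.string_to_quoted!(source)

# Walk the AST and replace every remote call to String.to_atom
# with String.to_existing_atom, keeping metadata and arguments as they are.
rewritten =
  Macro.prewalk(quoted, fn
    {{:., dot_meta, [{:__aliases__, alias_meta, [:String]}, :to_atom]}, meta, args} ->
      {{:., dot_meta, [{:__aliases__, alias_meta, [:String]}, :to_existing_atom]}, meta, args}

    other ->
      other
  end)

# Turn the rewritten AST back into text and print it.
new_source = Macro.to_string(rewritten)
IO.puts(new_source)
```

The call is rewritten as intended, but `# Some comment` is missing from the output, and the exact layout depends on the Elixir version, which is why reusing the formatter for this step is attractive.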
On May 5, 2021, "José Valim" <[email protected]> wrote:
> I believe those ideas are promising. There is one issue about
> quoted_to_algebra though: the AST that the formatter works with is a
> special one, where the literals are wrapped in blocks so we can store
> their metadata. This means that, in order to have quoted_to_algebra,
> we would need to change the formatter to also handle “regular AST” or
> nodes with limited metadata.
>
> It is doable, but it is work, and a requirement to expose said
> functionality.
>
> About the comments, I like the suggestion. Although we should probably
> move from a tuple to a map to avoid breaking changes in the future. A
> PR for this particular issue is very welcome!
>
> Thank you for the proposal and thinking about these problems.
>
> On Wed, May 5, 2021 at 21:16 i Dorgan <[email protected]> wrote:
>
> Hi all,
>
> The motivation for this proposal is to make it easier for tools to
> alter and format Elixir code while preserving the current behavior of
> the formatter. Most of the functionality is already there, and a
> small change in the APIs would enable a wide variety of new use
> cases.
>
> If we want to transform a piece of code from one form to another, we
> can modify the AST and then convert it back to a string. For example,
> a tool could detect the usage of `String.to_atom` and not only warn
> about unsafe string to atom conversion, but also give an option to
> automatically fix the issue, replacing it with
> `String.to_existing_atom`. The first part is already covered by tools
> like Credo, but it seems that manipulating the source code itself is
> difficult, mostly because the AST does not contain information about
> comments and because `Macro.to_string` doesn't produce text that
> complies with the Elixir coding conventions.
> For example, this code:
>
> ```elixir
> def foo(bar) do
>   # Some comment
>   :baz
> end
> ```
>
> Would be printed as this:
>
> ```elixir
> def(foo(bar)) do
>   :baz
> end
> ```
>
> Tools like <https://github.com/KronicDeth/intellij-elixir> implement
> their own parsers to circumvent this issue, but I think it would be
> nice if it could be achieved with the existing parser in core Elixir.
>
> I've seen other conversations where it was suggested to change the
> Elixir AST to support comments, either by adding a new comment
> node (a breaking change) or by using the node's metadata, but the
> conclusion was that there was no clear preference on how to do this at
> the AST level. Since the Elixir tokenizer allows us to pass a
> function to process the comments, José suggested keeping them in a
> side data structure instead of in the AST itself. This is what the
> Elixir formatter does.
>
> Currently the `Code.Formatter` module is a private API used by the
> `Code.format_string` function. This means that the only way to format
> Elixir code is by providing a string (or a file) to a function in the
> `Code` module. If we are transforming the code, however, what we have
> is a quoted expression, and we don't have a way to turn it back into
> a formatted string.
>
> At a high level, `Code.Formatter.to_algebra` does three things:
>
> 1. It extracts the comments from the source code
> 2. It parses the source code into a quoted expression
> 3. It takes the comments and the quoted expression and merges them to
>    produce an algebra document
>
> What I propose is to split `Code.Formatter.to_algebra` along
> those steps, and expose the functionality via the `Code` module. The
> reasoning is that if a user has access to both the AST and the
> comments, they can then transform the AST, and return both the
> AST and comments to the formatter to produce an algebra document.
> How they implement this is up to them and Elixir doesn't need to give an
> opinion on how comments should be handled during those manipulations,
> nor does it need to expose the private version of the AST used
> internally by the formatter. If they want to merge comments into the
> metadata or use custom nodes, it's completely up to them; they just
> need to return a valid quoted expression and a list of comments
> with their metadata.
>
> The other reason I think this should be done by exposing those
> functions is that there's a great amount of work put into the
> formatter to turn the quoted expressions into formatted algebra
> documents, and I think all of that could be reused, eliminating the
> need for custom formatters.
>
> The workflow would be something like this:
>
> ```elixir
> {:ok, {quoted, comments}} =
>   File.read!(path) |> Code.string_to_quoted_with_comments()
>
> quoted = do_some_manipulation(quoted, comments)
> {:ok, doc} = Code.quoted_to_algebra(quoted, comments: comments)
> new_source = Inspect.Algebra.format(doc, 80)
> ```
>
> I'm not married to those function names and return values, but I hope
> they serve to convey the idea.
>
> I already have some small examples of source code transformations
> using an API like the above. I'm working on reducing and tidying them,
> but in the meantime I would like to hear your opinions on this
> proposal.
>
> The only downside I can think of is that the `{line, {previous_eol,
> next_eol}, text}` format of the comments may be considered a private
> data structure, but the formatter hasn't changed much since 1.6 (3
> years ago) and I think it could be considered stable enough to be
> exposed.
>
> --
> You received this message because you are subscribed to the Google
> Groups "elixir-lang-core" group.
> To unsubscribe from this group and stop receiving emails from it,
> send an email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elixir-lang-core/5f431518-f555-48bb-a999-ec49f6423463n%40googlegroups.com.
