On Mon, Jul 06, 2020 at 04:18:41PM -0600, Daniele Nicolodi wrote:
> > In particular, I'd like to know if the raw/syntactic directives you
> > imagine coming out of the new Beancount core would be close enough
> > to the book concrete syntax to allow manipulation such as meddling
> > with spacing. Provided that, and a good pretty printer for concrete
> > syntax, a "bean-sed" project with a dedicated manipulation language
> > can probably be created and maintained separately of core.
> 
> I am far from being a parsing expert, but I think having the parser
> emit a syntax tree suitable to reconstruct the input file without
> modifications is going to be very complex: the scanner would need to
> emit many more tokens for input that is now simply ignored (i.e.
> trailing whitespace) and the grammar would need to handle those,
> making it more complex. The representation of the parsing results
> would also be more complex. A lot of work to support a single tool.
> 
> I think that a tool like the one you describe should use the syntax
> tree and the actual file content in combination to rewrite the input
> file: the syntax tree allows to identify which elements need to be
> modified and from these the position in the input files where text
> changes need to happen. Sounds complex, but I believe less complex
> than augmenting the parser.

All good points. It's indeed a bit tricky (I've done it in the past for
an unrelated project) and it boils down to keeping around both a
concrete syntax tree (with all the spacing, for instance) and an
abstract syntax tree. The former is particularly annoying because to
have one that round-trips with textual input you often have to adapt the
lexer too.
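
To make the lexer point concrete, here is a minimal sketch in Python of
a "lossless" tokenizer, i.e. one that emits spacing and comments as
tokens instead of dropping them, so that joining the lexemes gives back
the input byte for byte. The token classes and the names lex, unlex and
significant are made up for illustration; this is not Beancount's
actual lexer, and the real grammar has many more token types.

```python
import re

# Hypothetical token classes, for illustration only (not Beancount's lexer).
# Every character of the input matches exactly one alternative, so nothing
# is ever dropped.
TOKEN_RE = re.compile(r'''
    (?P<COMMENT>;[^\n]*)      # comment running to the end of the line
  | (?P<NEWLINE>\n)
  | (?P<SPACE>[ \t]+)         # spacing is kept as a token, not skipped
  | (?P<STRING>"[^"\n]*")     # double-quoted string on a single line
  | (?P<WORD>[^\s;"]+)        # dates, flags, accounts, numbers, currencies
  | (?P<OTHER>.)              # anything else, one character at a time
''', re.VERBOSE)


def lex(text):
    """Split text into (kind, lexeme) pairs without discarding anything."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(text)]


def unlex(tokens):
    """Rebuild the source text from the concrete token stream."""
    return "".join(lexeme for _, lexeme in tokens)


def significant(tokens):
    """The view an ordinary grammar would see: no spacing, no comments."""
    return [t for t in tokens if t[0] not in ("SPACE", "NEWLINE", "COMMENT")]
```

A grammar can then be written against significant(tokens), while a
rewriting tool edits the full token list and calls unlex to write the
file back, leaving untouched spacing and comments exactly as they were.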

I agree it looks like quite a burden for a single tool --- even though I
think it's a very important one to have, due to the intrinsic nature of
plain text accounting. And also, the alternative looks worse to me:
people just sed or search/replace in their ledgers, messing up spacing or
worse. I don't think the user experience in doing that is great, and
that affects our user base.

There is an alternative though. Define a single canonical way to indent
Beancount textual ledgers and have a tool like Python's Black that
reformats a Beancount ledger (or even isolated directives) that way.
Right now there are some ambiguities, e.g., do you indent metadata
attached to a transaction leg or not? Do you put it on the same line
as the transaction leg or on the line beneath it? Etc. And it gets
tricky with comments (which generally you want to keep as-is), both in
general and even more so when they are mixed with tags. If you have such
an "opinionated" pretty printer you can do all your changes on the AST
and just pretty print your result.
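
As a rough sketch of what I mean, in Python: the Transaction and Posting
classes below are toy stand-ins, not Beancount's data model, and the
layout rules (two-space indent, metadata on its own line one level
deeper than its owner, numbers ending at a fixed column) are arbitrary
picks just to show that each ambiguity gets exactly one answer.
Comments are deliberately left out, since they are the hard part.

```python
from dataclasses import dataclass, field
from decimal import Decimal


# Toy AST for illustration only; these are not Beancount's own classes.
@dataclass
class Posting:
    account: str
    amount: Decimal
    currency: str
    meta: dict = field(default_factory=dict)


@dataclass
class Transaction:
    date: str
    flag: str
    narration: str
    postings: list
    meta: dict = field(default_factory=dict)


INDENT = "  "
AMOUNT_COLUMN = 50  # numbers end at this column; an arbitrary choice


def format_meta(meta, depth):
    """Metadata always goes on its own line, indented `depth` levels."""
    return [f'{INDENT * depth}{key}: "{value}"' for key, value in meta.items()]


def format_transaction(txn):
    """Render a transaction in the one canonical layout."""
    lines = [f'{txn.date} {txn.flag} "{txn.narration}"']
    lines += format_meta(txn.meta, 1)
    for p in txn.postings:
        left = f"{INDENT}{p.account}"
        number = str(p.amount)
        # Right-align the number, keeping at least two spaces after
        # the account name when the account is very long.
        pad = max(2, AMOUNT_COLUMN - len(left) - len(number))
        lines.append(f"{left}{' ' * pad}{number} {p.currency}")
        # Posting metadata sits beneath the leg, one level deeper.
        lines += format_meta(p.meta, 2)
    return "\n".join(lines) + "\n"
```

With something like this, a rewriting tool never touches text at all:
it changes fields on the objects and calls format_transaction, and the
output layout does not depend on how the input happened to be spaced.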

Cheers
-- 
Stefano Zacchiroli . z...@upsilon.cc . upsilon.cc/zack . . o . . . o . o
Computer Science Professor . CTO Software Heritage . . . . . o . . . o o
Former Debian Project Leader & OSI Board Director  . . . o o o . . . o .
« the first rule of tautology club is the first rule of tautology club »
