Here is an attempt to clarify my own confusion around the nested structures in org. In short: each node in the headline tree and the plain list tree can be parsed using the EBNF, but the nesting level cannot, which means that certain useful operations, such as folding, require additional rules beyond the grammar. More inline.

Best!
Tom
> Do you need to? This is valid as an entire Org file, I think:
>
> *** foo
> * bar
> ***** baz
>
> And that can be represented in EBNF. I'm not aware of places where behavior
> is indent-level specific, except inline tasks, and that edge case can be
> represented.

You are correct, and as long as the heading depth doesn't change some interpretation, this is a non-issue. The reason I mentioned it, though, is that it means you cannot determine how to correctly fold an org file from the grammar alone.

To make sure I understand: it is possible to determine the number of leading stars (and thus the level), but I think that it is not possible to identify the end of a section. For example

* a
*** b
** c
* d

You can parse out a 1, b 3, c 2, d 1, but if you want to be able to nest b and c inside a but not nest d inside a, then you need a stack in there somewhere. You can't have a rule such as

section : headline content
content : text | section

because the parse would incorrectly nest sections at the same level. You would have to write

section-level-1 : headline-1 content-1
content-1 : text | section-level-2-n

but since we have an arbitrary number of levels, the grammar would have to be infinite. This only matters if you want your grammar to encode that the content of a section can include other, more deeply nested sections, which in this context we almost certainly do not (as you point out).

> > There is a similar issue with the indentation level in
> > order to correctly interpret plain lists.
>
> list ::= ('+' string newline)+ sublist?
> sublist ::= (indent list)+
>
> I think this captures lists?

Ah yes, I see my mistake here. In order for this to work the parser has to implement significant whitespace, so whitespace cannot be parsed into a single token. I think everything works out after that.

> Definitely not able to be represented in EBNF, unless as you say {name} is a
> limited vocabulary.

Darn those pesky open sets!
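
P.S. To make the "you need a stack in there somewhere" point from the folding example concrete, here is a minimal sketch in Python. It is not how Org itself does it: the names (Section, nest, HEADLINE) are made up for illustration, and it assumes a headline is simply one or more stars at the start of a line followed by a space, ignoring body text, inline tasks, and everything else.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Section:
    """Hypothetical tree node: a headline plus the sections nested under it."""
    level: int
    title: str
    children: list = field(default_factory=list)


# Simplifying assumption: a headline is stars at column 0, a space, then a title.
HEADLINE = re.compile(r"^(\*+) (.*)$")


def nest(lines):
    """Build the section tree from flat lines using an explicit stack."""
    root = Section(0, "<root>")
    stack = [root]  # the chain of currently open sections, outermost first
    for line in lines:
        m = HEADLINE.match(line)
        if m is None:
            continue  # body text is ignored in this sketch
        level, title = len(m.group(1)), m.group(2)
        # A headline closes every open section at the same or a deeper level.
        # The root has level 0, so it is never popped.
        while stack[-1].level >= level:
            stack.pop()
        node = Section(level, title)
        stack[-1].children.append(node)
        stack.append(node)
    return root


tree = nest(["* a", "*** b", "** c", "* d"])
```

For the example above this puts b and c under a and leaves d at the top level. The grammar only has to hand over each headline's level; the comparison against the top of the stack is the extra rule that decides where a section ends.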