It seems I completely failed to express my idea.
Instead of extending Org grammar (syntax), I suggest changing the
behavior of source blocks during export. In addition to the current
:results options, "ast" may be added. Instead of adding text to the
export buffer to be parsed as Org markup, it inserts a branch of the
syntax tree into the original parse results. I admit that during export
it may be necessary to iterate over source blocks one more time at a
later stage.
Such source blocks should return an "Org Syntax Tree", a simplified
variant of org-element. This allows changing implementation details,
e.g. using vectors instead of lists for attributes in org-element. A
converter from Org Syntax Tree to org-element should be implemented.
Certainly such a format may be used directly as src_ost{(code (:class var)
"language")} inline snippets or as
#+begin_src ost
(code nil ("libtree-{sitter}-"
(code (:class var) "\"language\"")
"."
(code (:class var) "ext")))
#+end_src
block-level elements. However, I expect it to be a last-resort option
for cases where the desired construct cannot be expressed in any other way.
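
To make the converter idea a bit more concrete, here is a minimal sketch.
Everything in it is an assumption for illustration: the node shape
(type, attributes, contents) is read off the example above, the target
shape [type, props, child...] only mimics the general layout of
org-element nodes, and the name ost_to_element is hypothetical. Python is
used only to keep the sketch short and runnable; a real converter would
be written in Emacs Lisp.

```python
# Hypothetical Org Syntax Tree node: (type, attrs, contents)
#   attrs    -- dict or None (mirrors "nil" in the sexp form)
#   contents -- a single string, or a list of strings and nested nodes
# Assumed target shape, loosely org-element-like: [type, props, child, ...]


def ost_to_element(node):
    """Convert one Org Syntax Tree node into an org-element-like list."""
    if isinstance(node, str):
        # Plain text is passed through unchanged.
        return node
    node_type, attrs, contents = node
    if isinstance(contents, str):
        # Allow the short form (code (:class var) "language").
        contents = [contents]
    props = dict(attrs or {})
    return [node_type, props] + [ost_to_element(child) for child in contents]


# The block-level example from above, written as Python data.
tree = (
    "code",
    None,
    [
        "libtree-{sitter}-",
        ("code", {"class": "var"}, '"language"'),
        ".",
        ("code", {"class": "var"}, "ext"),
    ],
)
element = ost_to_element(tree)
```

The point of keeping the two shapes separate is that the internal
representation may change later while source blocks keep returning the
same simplified tree.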
I think more convenient org-babel backends may be created to parse
TeX-like (texinfo-like) or SGML-like (XML-like) syntax into an Org Syntax
Tree hierarchy. The essential idea is that the usual lightweight markup
is used outside of source blocks. Inside source blocks, however, only a
few characters are special ([\{}], [@{}], or [&<>], etc.), which reduces
escaping issues for regular text and verbatim-like commands.
Some comments are inline.
On 03/10/2022 11:36, Ihor Radchenko wrote:
Max Nikulin writes:
On 02/10/2022 11:59, Ihor Radchenko wrote:
If you are asking how to represent such construct without introducing
custom elements then (it may be e.g. :type, not :class) parsed AST
should be like
(code nil ("libtree-{sitter}-"
(code (:class var) "\"language\"")
"."
(code (:class var) "ext")))
This is not much different from @name[nil]{<contents>} idea, but
more verbose.
> Also, more importantly, I strongly dislike the need to wrap the text
> into "". You will have to escape \". And it will force third-party
> parsers to re-implement Elisp sexp reader.
By this example I was trying to show how to express @var, @samp, @file
without introducing new custom objects. I do not see any problem with
the verbosity of such a format: it would be used for really special cases
only, while more convenient markup is used for simpler cases.
If there was some syntax for object attributes then simple cases would
be like
[[attr:(:class var)]]~language~
I do not like this idea. It will require non-trivial changes in Org
parser and fontification.
Using dedicated object properties or at least inheriting properties from
:parent is the style we employ more commonly across the code:
@var{language}
or
@code[:class var]{language}
or
@attr[:class var]{~language~}
I do not mind having some "span" object that assigns attributes to its
direct children. I used a link-like prefix object only because a proof of
concept may be tried with no changes in Org; it does not require support
for nested objects. There is no existing syntax for such "span" objects,
but perhaps none is necessary and source blocks should be used instead
for special needs.
I have no idea concerning the particular markup that can be used inside
source blocks. It might be LaTeX-like commands, as discussed in the
sibling subthread, or an HTML (XML) based syntax that is more verbose
than TeX-like notation.
By convention, the dynamic library
for src_alt{\code[class=var]{language}} is
src_alt{\code{libtree-\{sitter\}-\code[class=var]{"language"}.\code[class=var]{ext}}},
where src_alt{\code[class=var]{ext}} is the
system-specific extension for dynamic libraries.
I am against the idea of LaTeX-like commands. It will clash with
latex-fragment object type.
https://orgmode.org/worg/dev/org-syntax.html#LaTeX_Fragments
or
By convention, the dynamic library for
src_alt{<code class="var">language</code>} is
src_alt{<code>libtree-{sitter}-<code
class="var">"language"</code>.<code class="var">ext</code></code>},
where src_alt{<code class="var">ext</code>} is the
system-specific extension for dynamic libraries.
This style will indeed make things easier for the parser. But I find it
too verbose for practical usage. This is why I instead proposed the idea
with variable number of brackets: @code{{can have } inside}}.
Texinfo is TeX with \ replaced by @: a different character simply has
the category that starts a command. The important point is that while
Org markup uses a lot of special characters (*/_+[]...), this flexible
markup should use just a few. I do not see any obstacle to trying
texinfo-like markup. Source blocks allow several languages.
A hypothetical "alt" babel language would have default :results ast
:exports results header arguments to inject the AST, bypassing the Org
markup stage.
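
To illustrate how little machinery a markup with only a few special
characters needs, below is a rough sketch of a parser for a texinfo-like
notation. The concrete syntax is my assumption, not something agreed in
this thread: commands look like @name{...} or @name[key=value]{...}, and
@@, @{ and @} stand for literal characters. The function names are
hypothetical, and Python is used only so that the sketch can be run.

```python
import re

_NAME = re.compile(r"[A-Za-z]+")


def parse(text):
    """Parse texinfo-like text into a list of strings and (type, attrs, children) nodes."""
    nodes, _ = _parse_seq(text, 0, top=True)
    return nodes


def _parse_seq(text, pos, top):
    out, buf = [], []

    def flush():
        # Move accumulated plain text into the output list.
        if buf:
            out.append("".join(buf))
            buf.clear()

    while pos < len(text):
        ch = text[pos]
        if ch == "}":
            if top:
                raise ValueError("unbalanced } at %d" % pos)
            flush()
            return out, pos + 1
        if ch == "{":
            raise ValueError("bare { at %d, write @{ instead" % pos)
        if ch == "@":
            nxt = text[pos + 1:pos + 2]
            if nxt in ("@", "{", "}"):
                # Escaped special character.
                buf.append(nxt)
                pos += 2
                continue
            node, pos = _parse_command(text, pos + 1)
            flush()
            out.append(node)
            continue
        buf.append(ch)
        pos += 1
    if not top:
        raise ValueError("missing closing }")
    flush()
    return out, pos


def _parse_command(text, pos):
    m = _NAME.match(text, pos)
    if not m:
        raise ValueError("expected command name at %d" % pos)
    name, pos = m.group(), m.end()
    attrs = None
    if text.startswith("[", pos):
        end = text.find("]", pos)
        if end < 0:
            raise ValueError("missing ] in attributes of @%s" % name)
        # Space-separated key=value pairs; a malformed pair raises ValueError.
        attrs = dict(item.split("=", 1) for item in text[pos + 1:end].split())
        pos = end + 1
    if not text.startswith("{", pos):
        raise ValueError("expected { after @%s" % name)
    children, pos = _parse_seq(text, pos + 1, top=False)
    return (name, attrs, children), pos


sample = '@code{libtree-@{sitter@}-@code[class=var]{"language"}.@code[class=var]{ext}}'
parsed = parse(sample)
```

Only three characters need escaping here, so quotes and the usual Org
markup characters pass through as ordinary text.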
The problem with src block emitting AST is clashing with the way src
blocks work during export. What `org-export-as' does is replacing/adding
src block output into the actual Org buffer text before the parsing is
done.
Handling direct AST sexps will require a rewrite on how babel
integration with export works.
Yes, it will. I am evaluating the feasibility of such a change as an
alternative to extending Org syntax for custom elements.