Over the past three months, I wrote some parsers under src/Data/Parser:

+ blanks_parser

+ escaped_char_parser

+ identifier_parser

+ inline_comment_parser

+ keyword_parser

+ number_parser

+ operator_parser

+ string_parser (TODO: multi-line support)

On top of these parsers, I rewrote various parts of xxx_language.cpp.

I didn't use a higher-level abstraction like a packrat parser. In my opinion, for now, small low-level parsers written in C++ should work fine.

Most programming languages are similar in syntax, so a simple abstraction is sufficient for syntax highlighting.


The goal of these small parsers with `can_parser` and `do_parser` is to minimize the time needed to implement a parser for a new programming language.
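To make the idea concrete, here is a minimal sketch of what such a small parser could look like. Only the names `can_parser` and `do_parser` come from the text above; the base class `small_parser`, the signatures, and the use of `std::string` are assumptions for illustration, not the actual code under src/Data/Parser.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical base class: signatures and std::string are assumptions.
struct small_parser {
  virtual ~small_parser () = default;
  // True if this parser recognizes something starting at s[pos].
  virtual bool can_parser (const std::string& s, std::size_t pos) const = 0;
  // Consume the recognized text by advancing pos past it.
  virtual void do_parser (const std::string& s, std::size_t& pos) const = 0;
};

// Illustrative blanks parser: consumes a run of spaces and tabs.
struct blanks_parser : small_parser {
  bool can_parser (const std::string& s, std::size_t pos) const override {
    return pos < s.size () && (s[pos] == ' ' || s[pos] == '\t');
  }
  void do_parser (const std::string& s, std::size_t& pos) const override {
    while (can_parser (s, pos)) ++pos;
  }
};

// Illustrative number parser: consumes a run of decimal digits only.
struct number_parser : small_parser {
  bool can_parser (const std::string& s, std::size_t pos) const override {
    return pos < s.size ()
        && std::isdigit (static_cast<unsigned char> (s[pos])) != 0;
  }
  void do_parser (const std::string& s, std::size_t& pos) const override {
    while (can_parser (s, pos)) ++pos;
  }
};
```

With this shape, a language file only has to ask each small parser in turn whether it can handle the current position, and let the first one that agrees consume the input.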


     Below is the detailed progress:

+ dot: new composition of parsers, string_parser, keyword_parser, operator_parser

+ cpp: new composition of parsers, string_parser

+ java/scala/python: old composition of parsers, string_parser

+ others: old composition of parsers


     Composition of parsers: old vs new

Almost all xxx_language.cpp files (except scheme_language.cpp) are derived from mathemagix_language.cpp.

I call this the old-style composition of parsers. It is not efficient enough because we have to re-parse the code in `get_color`.

If you dive into concat_text.cpp:typeset_prog_string, you will find what `get_color` and `advance` actually are.

The new-style composition of parsers reduces the unnecessary parsing in `get_color`.
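A rough sketch of the new-style idea: `advance` remembers the type of the token it has just consumed, so `get_color` can answer without scanning the text a second time. The class `toy_language`, the simplified signatures, and the type names ("number", "identifier", "other") are assumptions made up for this example; the real functions work on TeXmacs trees and strings.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical, simplified language class; only the names advance and
// get_color come from the text, everything else is an assumption.
class toy_language {
  std::string last_type = "none";  // type recorded by the latest advance

  static bool is_digit (char c) {
    return std::isdigit (static_cast<unsigned char> (c)) != 0;
  }
  static bool is_alpha (char c) {
    return std::isalpha (static_cast<unsigned char> (c)) != 0;
  }

public:
  // Consume one token starting at pos and remember its type.
  void advance (const std::string& s, std::size_t& pos) {
    if (pos >= s.size ()) { last_type = "none"; return; }
    if (is_digit (s[pos])) {
      while (pos < s.size () && is_digit (s[pos])) ++pos;
      last_type = "number";
    }
    else if (is_alpha (s[pos])) {
      while (pos < s.size () && (is_alpha (s[pos]) || is_digit (s[pos])))
        ++pos;
      last_type = "identifier";
    }
    else {
      ++pos;  // any other single character
      last_type = "other";
    }
  }

  // New style: no second pass over s[start..end), just report the type
  // that advance already determined.
  std::string get_color (const std::string&, std::size_t,
                         std::size_t) const {
    return last_type;
  }
};
```

In the old style, `get_color` would have to look at the characters between start and end again to decide which kind of token they form; here that decision is made once, in `advance`.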


     Keyword/Operator Parser

The aim is to keep the (type, keyword) mapping in Scheme files. Please refer to `dot-lang.scm`.
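On the C++ side, the parser then only needs a table that is filled from the Scheme definitions. The sketch below assumes a hypothetical `keyword_table` class with `put` and `parse` methods; these names, and any type name passed to `put`, are illustrative and not taken from `dot-lang.scm`.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical table holding the (type, keyword) pairs; in the real
// setup the pairs would come from the Scheme side.
class keyword_table {
  std::unordered_map<std::string, std::string> type_of;  // keyword -> type

public:
  // Register one (type, keyword) pair.
  void put (const std::string& type, const std::string& keyword) {
    type_of[keyword] = type;
  }

  // If the word starting at s[pos] is a known keyword, move pos past it
  // and return its type; otherwise leave pos unchanged and return "".
  // Assumes pos <= s.size ().
  std::string parse (const std::string& s, std::size_t& pos) const {
    std::size_t end = pos;
    while (end < s.size ()
           && (std::isalnum (static_cast<unsigned char> (s[end])) != 0
               || s[end] == '_'))
      ++end;
    auto it = type_of.find (s.substr (pos, end - pos));
    if (it == type_of.end ()) return "";
    pos = end;
    return it->second;
  }
};
```

Because the whole word is looked up, an identifier such as "nodes" is not mistaken for the keyword "node".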


     String Parser

Currently, the string parser only supports single-line (inline) strings. Actually, strings and multi-line comments are the same kind of construct: they both have openings and corresponding closings.

The string parser will eventually support multi-line strings. Once it is ready, the multi-line comment parser can also be implemented shortly afterwards.
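The shared structure of strings and comments can be sketched as one parser parameterized by its opening and closing. The struct `delimited_parser`, its fields, and the boolean return value of `do_parser` are assumptions for illustration; the actual string_parser may be organized differently.

```cpp
#include <cstddef>
#include <string>

// Hypothetical parser for anything with an opening and a closing,
// e.g. "..." strings or /* ... */ comments. Both delimiters are
// assumed to be non-empty.
struct delimited_parser {
  std::string opening;
  std::string closing;
  char escape;  // escape character, or '\0' if there is none

  // True if the opening delimiter starts at s[pos].
  bool can_parser (const std::string& s, std::size_t pos) const {
    return pos <= s.size ()
        && s.compare (pos, opening.size (), opening) == 0;
  }

  // Precondition: can_parser (s, pos). Advances pos past the closing
  // delimiter and returns true; if the construct is not closed inside
  // s, pos ends at s.size () and false is returned.
  bool do_parser (const std::string& s, std::size_t& pos) const {
    pos += opening.size ();
    while (pos < s.size ()) {
      if (escape != '\0' && s[pos] == escape) {
        pos += 2;  // skip the escape character and the escaped one
        continue;
      }
      if (s.compare (pos, closing.size (), closing) == 0) {
        pos += closing.size ();
        return true;
      }
      ++pos;
    }
    pos = s.size ();
    return false;
  }
};
```

For example, `delimited_parser{"\"", "\"", '\\'}` would describe a double-quoted string and `delimited_parser{"/*", "*/", '\0'}` a C-style comment. The `false` result is the case where multi-line support is needed: the caller would have to carry the "still open" state over to the next line.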


     Recent Plans

I will continue developing the newly supported languages (like dot). The goal is to make it extremely easy to support a new language. Coloring schemes are another topic. In the next one or two months, I will continue to work on improving the xyz_parser and abc_language.


Darcy

2020/03/23

_______________________________________________
Texmacs-dev mailing list
Texmacs-dev@gnu.org
https://lists.gnu.org/mailman/listinfo/texmacs-dev
