I'm still a bit confused on 2. The example you give for a single class is one
that has all static members, but what if instead it was just a regular class.
It would instantiate a single instance of said class to do parsing. This is
what relax is doing
(https://github.com/tlc-pack/relax/blob/25cb42133130fb8ec75d2cc021c6c347e99f0b28/python/tvm/script/relax/parser.py#L107).
Instantiating a class would also give the benefit of being able to maintain
some state while parsing which is not possible with either free functions or a
class with static methods. Here is how I would imagine it (this is just an
example):
```python
# To clearly define the interface
class AbstractParser:
def parse(func: ast.Function):
raise NotImplemented()
@register("relax")
class RelaxParser(AbstractParser):
def __init__():
self.tir_parser = TIRParser()
def parse(func: ast.Function):
self.inputs.push(parse_args(func.args))
self.tir_parser.add_variables(self.inputs)
parse_stmt(func.body)
def parse_stmt(stmt):
if stmt == ast.Let:
assert stmt.var in self.inputs
if stmt == ast.Function:
# parse nested functions as TIR
self.tir_parser.parse(stmt)
```
If we want to consider future dialects, this also has the added benefit of
being able to add dialects by subclassing one of the existing parsers.
Another issue I see with the proposed registration method is that there is no
way to handle different parser rules based on context. As an example, what if I
want to treat the first line of a function body differently than every other
line (say it is a metadata dictionary):
```python
def foo():
{"version": "0.1"} # This should become a `MetadataNode`, only strings
allowed as values.
my_dictionary = {"hi", 1} # This should become a `DictLiteralNode`, only
integers allowed as values.
```
With the proposed registration method there is no way to distinguish between
what should be a `DictLiteralNode` and a `MetadataNode`:
```python
@dispatch.register(token="mylang", type_name="Dict")
def visit_dict(self: Parser, node: doc.Dict) -> None:
for key_ast, value_ast in node.items:
key = visit(key_ast)
assert isinstance(key, StringLiteral)
value = visit(value_ast)
assert isinstance(value, StringLiteral) # or wait, should it be
IntLiteral? There's no way to tell if this is the first line or not.
```
(Maybe I am misunderstanding how this approach works though. Can we put
arbitrary state into the the `self: Parser` object? If so then we can fix this
issue. But it would be a complicated and confusing solution as the parser would
have to maintain a stack of rules takes or flags or some such.)
But with the class based approach I proposed above:
```python
@register("mylang")
class MyLangParser:
def parse_function(func):
# assuming func.body is a list of statements, also applies if body is a
recursive set of statements
parse_metadata(func.body[0])
for stmt in func.body[1:]:
parse_statement(stmt)
def parse_metadata(stmt):
for key_ast, value_ast in node.items:
key = parse_string(key_ast)
value = parse_string(value_ast)
def parse_statement(stmt):
for key_ast, value_ast in node.items:
key = parse_string(key_ast)
value = parse_int(value_ast)
```
This issue is caused by registering visitors at the python node ast level. When
visiting each ast node there is no context as to where we may be in the
production rules. If we instead my proposed approach, the parser can control
which production rules can be used at each location.
--
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/79#issuecomment-1199837603
You are receiving this because you are subscribed to this thread.
Message ID: <apache/tvm-rfcs/pull/79/[email protected]>