After playing around with a bunch of alternatives, I think I've come up
with decent action semantics:
@transform <name> begin
<label> = <action>
end
For example, a simple graph grammar might look like:
@grammar nodetest begin
start = +node_def
node_def = node_label + node_name + lbrace + data + rbrace
node_name = string_value + space
data = *(line + semicolon)
line = string_value + space
string_value = r"[_a-zA-Z][_a-zA-Z0-9]*"
lbrace = "{" + space
rbrace = "}" + space
semicolon = ";" + space
node_label = "node" + space
space = r"[ \t\n]*"
end
with its actions to create some data structure:
type MyNode
name
values
function MyNode(name, values)
new(name, values)
end
end
with:
@transform tograph begin
# ignore these
lbrace = nothing
rbrace = nothing
semicolon = nothing
node_label = nothing
space = nothing
# special action so we don't have to define every label
default = children
string_value = node.value
value = node.value
line = children
data = MyNode("", children)
node_def = begin
local name = children[1]
local cnode = children[2]
cnode.name = name
return cnode
end
end
and finally, to apply the transform:
(ast, pos, error) = parse(nodetest, data)
result = apply(tograph, ast)
println(result) # {MyNode("foo",{"a","b"}),MyNode("bar",{"c","d"})}
The magic in '@transform' basically just creates the dictionary like
before, but automatically wraps the expression on the RHS as an anonymous
function (node, children) -> expr.
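For anyone curious how that wrapping can work, here is a rough sketch of a macro along those lines. It is not PEGParser's actual implementation: the macro name `@actions` and the `Leaf` type are made up for illustration, and it is written against current Julia syntax. It walks the `label = expr` lines of the block and turns each right-hand side into `(node, children) -> expr`, keyed by the label name.

```julia
# Hypothetical sketch, not the PEGParser source: `@actions` is an illustrative name.
macro actions(name, block)
    entries = Expr[]
    for stmt in block.args
        # skip line-number nodes and anything that is not `label = expr`
        (stmt isa Expr && stmt.head == :(=)) || continue
        label, rhs = stmt.args
        label isa Symbol || continue
        # wrap the right-hand side as (node, children) -> rhs
        fn = Expr(:->, Expr(:tuple, :node, :children), rhs)
        push!(entries, :($(string(label)) => $fn))
    end
    dict = Expr(:call, :(Dict{String,Function}), entries...)
    # escape everything so `node` and `children` keep their names
    return esc(:($name = $dict))
end

# Hypothetical stand-in for a parse-tree node with a `value` field
struct Leaf
    value::String
end

@actions demo begin
    space = nothing
    string_value = node.value
    line = children
end

demo["string_value"](Leaf("foo"), [])   # returns "foo"
```

The whole expansion is escaped on purpose: without `esc`, macro hygiene would rename `node` and `children`, and the right-hand sides could no longer refer to them.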
I'm currently looking for a better name than 'children', as it's
potentially confusing and misleading. It's actually the values of the child
nodes (as opposed to node.children). Maybe cvalues?
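To make the distinction concrete, here is a minimal sketch of a bottom-up apply step. The `Node` type and the `apply_actions` function are hypothetical, not the package's API, and I'm assuming that child values equal to `nothing` are dropped and that a "default" entry is used for labels without their own action. Each action receives the node itself plus the already-transformed values of its children, which is what the example above calls 'children'.

```julia
# Hypothetical node type; the real parse-tree node may carry more fields.
struct Node
    name::String
    value::String
    children::Vector{Node}
end

function apply_actions(actions::Dict, node::Node)
    # transform the children first, dropping any whose action returned `nothing`
    cvalues = Any[]
    for child in node.children
        v = apply_actions(actions, child)
        v === nothing || push!(cvalues, v)
    end
    # fall back to the "default" action when the label has no entry of its own
    # (this sketch assumes a "default" entry always exists)
    action = get(actions, node.name, actions["default"])
    return action(node, cvalues)
end

actions = Dict{String,Function}(
    "space"        => (node, cvalues) -> nothing,
    "string_value" => (node, cvalues) -> node.value,
    "default"      => (node, cvalues) -> cvalues,
)

leaf(name, value) = Node(name, value, Node[])
line = Node("line", "", [leaf("string_value", "a"), leaf("space", " ")])

apply_actions(actions, line)   # returns Any["a"]
```

Here `node.children` holds two `Node` objects, while the action for `line` only ever sees the single transformed value "a".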
On Sunday, May 25, 2014 10:28:45 PM UTC-4, Abe Schneider wrote:
>
> I wrote a quick PEG Parser for Julia with Packrat capabilities:
>
> https://github.com/abeschneider/PEGParser
>
> It's a first draft and needs a ton of work, testing, etc., but if this is
> of interest to anyone else, here is a quick description.
>
> Grammars can be defined using most of the standard EBNF syntax. For
> example, a simple math grammar can be defined as:
>
> @grammar mathgrammar begin
>
> start = expr
> number = r"([0-9]+)"
> expr = (term + op1 + expr) | term
> term = (factor + op2 + term) | factor
> factor = number | pfactor
> pfactor = ('(' + expr + ')')
> op1 = '+' | '-'
> op2 = '*' | '/'
> end
>
>
>
> To parse a string with the grammar:
>
> (node, pos, error) = parse(mathgrammar, "5*(2-6)")
>
> This will create an AST which can then be transformed to a value.
> Currently this is accomplished by doing:
>
> math = Dict()
>
> math["number"] = (node, children) -> float(node.value)
> math["expr"] = (node, children) ->
> length(children) == 1 ? children : eval(Expr(:call, children[2],
> children[1], children[3]))
> math["factor"] = (node, children) -> children
> math["pfactor"] = (node, children) -> children[2]
> math["term"] = (node, children) ->
> length(children) == 1 ? children : eval(Expr(:call, children[2],
> children[1], children[3]))
> math["op1"] = (node, children) -> symbol(node.value)
> math["op2"] = (node, children) -> symbol(node.value)
>
>
> Ideally, I would like to simplify this by using multiple dispatch on symbols
> (see previous post), but for now this is the easiest way to define actions
> based on node attributes.
>
> Finally, to transform the tree:
>
> result = transform(math, node) # will give the value of -20
>
> Originally I was going to attach the transforms to the rules themselves
> (similar to boost::spirit). However, there were two reasons for not doing
> this:
>
> 1. To implement the packrat part of the parser, I needed to cache the
> results which meant building an AST anyways
> 2. It's nice to be able to get different transforms for the same
> grammar (e.g. you may want to transform the result into HTML, LaTeX, etc.)
>
> The downside of the separation is that it adds some more complexity to the
> process.
>