[julia-users] Re: PEG Parser

Abe Schneider Wed, 04 Jun 2014 20:02:26 -0700

After playing around with a bunch of alternatives, I think I've come up 
with decent action semantics:


@transform <name> begin
 <label> = <action>
end

For example, a simple graph grammar might look like:

@grammar nodetest begin
  start = +node_def
  node_def = node_label + node_name + lbrace + data + rbrace
  node_name = string_value + space

  data = *(line + semicolon)
  line = string_value + space
  string_value = r"[_a-zA-Z][_a-zA-Z0-9]*"

  lbrace = "{" + space
  rbrace = "}" + space
  semicolon = ";" + space
  node_label = "node" + space
  space = r"[ \t\n]*"
end

with its actions to create some data structure:

type MyNode
  name
  values

  function MyNode(name, values)
    new(name, values)
  end
end


with:
@transform tograph begin
  # ignore these
  lbrace = nothing
  rbrace = nothing
  semicolon = nothing
  node_label = nothing
  space = nothing

  # special action so we don't have to define every label
  default = children

  string_value = node.value
  value = node.value
  line = children
  data = MyNode("", children)
  node_def = begin
    local name = children[1]
    local cnode = children[2]
    cnode.name = name
    return cnode
  end
end

and finally, to apply the transform:

(ast, pos, error) = parse(nodetest, data)
result = apply(tograph, ast)
println(result)    # {MyNode("foo",{"a","b"}),MyNode("bar",{"c","d"})}

The magic in '@transform' basically just creates the same kind of 
dictionary as before (label => action), but automatically wraps the 
expression on the RHS as an anonymous function (node, children) -> expr.
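
A rough sketch of that wrapping, for anyone curious. This is not the code in PEGParser: the macro name `@transform_sketch` is made up, it returns the dictionary rather than binding it to a name, and it is written in current Julia (1.x) syntax rather than the syntax used elsewhere in this post.

```julia
# Hypothetical sketch, not PEGParser's actual implementation.
macro transform_sketch(block)
    entries = Expr[]
    for stmt in block.args
        # Skip line-number nodes and anything that is not `label = action`.
        (stmt isa Expr && stmt.head == :(=)) || continue
        label, action = stmt.args
        # Wrap the RHS so that `node` and `children` are visible inside it.
        fn = Expr(:->, Expr(:tuple, :node, :children), action)
        push!(entries, :($(string(label)) => $fn))
    end
    # Escape everything so `node` and `children` are not renamed by hygiene.
    return esc(:(Dict{String,Function}($(entries...))))
end

actions = @transform_sketch begin
    space = nothing
    string_value = node.value
    line = children
end

# A named tuple stands in for a real AST node here.
println(actions["string_value"]((value = "foo",), []))
```

Each entry ends up as an ordinary closure, so an action written as a `begin ... end` block with a `return` in it would behave like a normal function body.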

I'm currently looking for a better name than 'children', as it's 
potentially confusing and misleading. It's actually the values of the child 
nodes (as opposed to node.children). Maybe cvalues?
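
To make the distinction concrete, here is a minimal sketch of a bottom-up apply step, again in current Julia syntax. The names `SketchNode` and `apply_sketch` are hypothetical, and dropping child values that are `nothing` is an assumption inferred from the node_def action above (where children[1] is the name and children[2] is the data), not something taken from the package source.

```julia
# Hypothetical node type; PEGParser's real node type may differ.
struct SketchNode
    name::String
    value::String
    children::Vector{SketchNode}
end

# Apply actions bottom-up: each action receives the node itself plus the
# already-transformed values of its children (not the child nodes).
function apply_sketch(actions::Dict, node::SketchNode)
    cvalues = Any[apply_sketch(actions, c) for c in node.children]
    # Assumption: values that came back as `nothing` are dropped.
    filter!(v -> v !== nothing, cvalues)
    fallback = get(actions, "default", (n, cv) -> cv)
    action = get(actions, node.name, fallback)
    return action(node, cvalues)
end

leaf(name, v) = SketchNode(name, v, SketchNode[])
tree = SketchNode("line", "", [leaf("string_value", "a"), leaf("space", " ")])

actions = Dict(
    "space" => (node, cvalues) -> nothing,
    "string_value" => (node, cvalues) -> node.value,
    "default" => (node, cvalues) -> cvalues,
)

result = apply_sketch(actions, tree)
println(result)
```

Here `tree.children` holds two nodes, while the `line` action only sees the single value "a", which is the gap the name 'children' hides.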

On Sunday, May 25, 2014 10:28:45 PM UTC-4, Abe Schneider wrote:
>
> I wrote a quick PEG Parser for Julia with Packrat capabilities:
>
> https://github.com/abeschneider/PEGParser
>
> It's a first draft and needs a ton of work, testing, etc., but if this is 
> of interest to anyone else, here is a quick description.
>
> Grammars can be defined using most of the standard EBNF syntax. For 
> example, a simple math grammar can be defined as:
>
> @grammar mathgrammar begin
>
>   start = expr
>   number = r"([0-9]+)"
>   expr = (term + op1 + expr) | term
>   term = (factor + op2 + term) | factor
>   factor = number | pfactor
>   pfactor = ('(' + expr + ')')
>   op1 = '+' | '-'
>   op2 = '*' | '/'
> end
>
>
>
> To parse a string with the grammar:
>
> (node, pos, error) = parse(mathgrammar, "5*(2-6)")
>
> This will create an AST which can then be transformed to a value. 
> Currently this is accomplished by doing:
>
> math = Dict()
>
> math["number"] = (node, children) -> float(node.value)
> math["expr"] = (node, children) ->
>     length(children) == 1 ? children :
>         eval(Expr(:call, children[2], children[1], children[3]))
> math["factor"] = (node, children) -> children
> math["pfactor"] = (node, children) -> children[2]
> math["term"] = (node, children) ->
>     length(children) == 1 ? children :
>         eval(Expr(:call, children[2], children[1], children[3]))
> math["op1"] = (node, children) -> symbol(node.value)
> math["op2"] = (node, children) -> symbol(node.value)
>
>
> Ideally, I would like to simplify this to using multi-dispatch on symbols 
> (see previous post), but for now this is the easiest way to define actions 
> based on node attributes.
>
> Finally, to transform the tree:
>
> result = transform(math, node)  # will give the value of -20
>
> Originally I was going to attach the transforms to the rules themselves 
> (similar to boost::spirit). However, there were two reasons for not doing 
> this:
>
>    1. To implement the packrat part of the parser, I needed to cache the 
>    results which meant building an AST anyways
>    2. It's nice to be able to get different transforms for the same 
>    grammar (e.g. you may want to transform the result into HTML, LaTeX, etc.)
>
> The downside of the separation is that it adds some more complexity to the 
> process.
>
