Re: [julia-users] Re: PEG Parser

Jameson Nash Tue, 27 May 2014 17:27:18 -0700

>
> Finally, one thing that I would like to change in the near future is to
> have transforms look something like:
> html(node, children, :bold_open) = "<b>"
> html(node, children, :bold_close) = "</b>"
> html(node, children, :text) = node.value
> html(node, children, :bold_text) = join(children)
>  result = transform(html, node)
> If Julia gets dispatch on value, then this would be trivial to write. One
> possible workaround is to create a type per rule in the grammar. Then the
> functions can be written to dispatch on the type associated with the given
> rule.


(from your package readme)

type Node{T} end
html{T}(node, children, ::Node{T}) = error("No rule to make $T into html")
html(node, children, ::Node{ :bold_open }) = "<b>"
html(node, children, ::Node{ :bold_close }) = "</b>"
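
A slightly fuller sketch of the same trick, in case it helps. It is written in current Julia syntax (`struct` and `where` rather than the `type` and `html{T}` forms above). `ParseNode`, its fields, `leaf`, and this `transform` are hypothetical stand-ins I made up for illustration, not the package's actual API; `Node{T}` is only a tag whose type parameter carries the rule name.

```julia
# Hypothetical parse-tree node; the field names are assumptions for this sketch.
struct ParseNode
    rule::Symbol
    value::String
    children::Vector{ParseNode}
end

# Tag type: the rule name lives in the type parameter, so methods can dispatch on it.
struct Node{T} end

# Fallback for rules without a dedicated method.
html(node, children, ::Node{T}) where {T} = error("No rule to make $T into html")
html(node, children, ::Node{:bold_open}) = "<b>"
html(node, children, ::Node{:bold_close}) = "</b>"
html(node, children, ::Node{:text}) = node.value
html(node, children, ::Node{:bold_text}) = join(children)

# Assumed bottom-up traversal: transform the children first, then build the
# tag from the node's rule name and let dispatch pick the method.
function transform(f, node::ParseNode)
    children = [transform(f, c) for c in node.children]
    return f(node, children, Node{node.rule}())
end

# Small hand-built tree standing in for real parser output.
leaf(rule, value="") = ParseNode(rule, value, ParseNode[])
tree = ParseNode(:bold_text, "", [leaf(:bold_open), leaf(:text, "hi"), leaf(:bold_close)])

result = transform(html, tree)   # "<b>hi</b>"
```

Note that `Node{node.rule}()` is constructed from a runtime value, so each call goes through dynamic dispatch. That is probably fine for a transform pass, but worth measuring if speed matters.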



On Tue, May 27, 2014 at 7:48 PM, Jameson Nash <[email protected]> wrote:

> some possible alternatives to using eval include:
> macroexpand
> apply
> getfield
>
> depending on your use-case
>
> On Tue, May 27, 2014 at 6:58 PM, John Myles White
> <[email protected]> wrote:
> > I'd be really interested to see how this parser compares with DataFrames.
> > There's a bunch of test files in the DataFrames.jl/test directory.
> >
> >  -- John
> >
> > On May 27, 2014, at 3:49 PM, Abe Schneider <[email protected]> wrote:
> >
> > I don't know how the speed of the parser will compare to DataFrames --
> > I've done absolutely no work to date on profiling the code, but I thought
> > writing a CSV parser was a good way to test out the code (and it helped
> > find a bunch of bugs).
> >
> > I've also committed (under examples/) the CSV parser. The grammar (from
> > the RFC) is:
> >
> > @grammar csv begin
> >   start = data
> >   data = record + *(crlf + record)
> >   record = field + *(comma + field)
> >   field = escaped_field | unescaped_field
> >   escaped_field = dquote + *(textdata | comma | cr | lf | dqoute2) + dquote
> >   unescaped_field = textdata
> >   textdata = r"[ !#$%&'()*+\-./0-~]+"
> >   cr = '\r'
> >   lf = '\n'
> >   crlf = cr + lf
> >   dquote = '"'
> >   dqoute2 = "\"\""
> >   comma = ','
> > end
> >
> > and the actions are:
> >
> > tr["crlf"] = (node, children) -> nothing
> > tr["comma"] = (node, children) -> nothing
> >
> > tr["escaped_field"] = (node, children) -> node.children[2].value
> > tr["unescaped_field"] = (node, children) -> node.children[1].value
> > tr["field"] = (node, children) -> children
> > tr["record"] = (node, children) -> unroll(children)
> > tr["data"] = (node, children) -> unroll(children)
> > tr["textdata"] = (node, children) -> node.value
> >
> >
> > Given the data:
> >
> > parse_data = """1,2,3\r\nthis is,a test,of csv\r\n"these","are","quotes ("")""""
> >
> > and running the parser:
> >
> > (node, pos, error) = parse(csv, parse_data)
> > result = transform(tr, node)
> >
> > I get:
> >
> > {{"1","2","3"},{"this is","a test","of csv"},{"these","are","quotes (\"\")"}}
> >
> >
> >
> >
> >
> > On Monday, May 26, 2014 3:41:26 AM UTC-4, harven wrote:
> >>
> >> Nice!
> >>
> >> If you are interested in testing your library on a concrete problem, you
> >> may want to parse comma-separated value (CSV) files. The BNF is in the
> >> specification, RFC 4180: http://tools.ietf.org/html/rfc4180
> >>
> >> AFAIK, the readcsv function provided in Base does not handle quotations
> >> well, whereas the CSV parser in DataFrames is slow, so Julia does not
> >> yet have an efficient native way to parse CSV files.
> >
> >
>
