[go-nuts] Re: GoAWK: an AWK interpreter written in Go

Scott Pakin Mon, 27 Aug 2018 12:40:16 -0700

On Friday, August 24, 2018 at 3:13:25 PM UTC-6, Ben Hoyt wrote:
>
> I recently wrote an AWK interpreter in Go: 
> https://github.com/benhoyt/goawk
>
> It's a pretty simple implementation: hand-rolled lexer, recursive-descent 
> parser, and tree-walk interpreter. It's pretty complete according to the 
> POSIX spec, and it passes the AWK ("one true awk") test suite as well as my 
> own unit tests.
>
> In some quick tests I ran, I/O speed is on a par with or better than AWK, 
> but the interpreter itself is quite slow -- about 5x slower for a lot of 
> things. I hope to add some proper benchmarks soon. I have a pretty good 
> idea of why and how to fix it: variable lookup and assignment are slow, 
> and I'm planning to fix that by resolving more things at parse time.
>
> One thing that's a bit funky about AWK is its type handling: string values 
> can be real strings or "numeric strings" (numbers that came from user 
> input). I'm currently passing the "value" struct (see interp/value.go) 
> around by value. I still need to test if that's a good idea 
> performance-wise or not.
>
> I'd love to hear any comments, bug reports, or -- especially -- code 
> reviews.
>
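
One way "resolving more things at parse time" might look is sketched below. This is not taken from GoAWK's source: the resolver, assign, and run names are hypothetical. The idea is to turn each variable name into a slice index once, while parsing, so that execution indexes a slice rather than looking the name up in a map on every access.

```go
package main

import "fmt"

// resolver hands out one slot index per variable name at parse time.
// (Hypothetical type for illustration; not GoAWK's actual design.)
type resolver struct {
	slots map[string]int
}

// index returns the slot for name, allocating a new slot on first sight.
func (r *resolver) index(name string) int {
	if i, ok := r.slots[name]; ok {
		return i
	}
	i := len(r.slots)
	r.slots[name] = i
	return i
}

// assign is a parsed statement "dst = src + n" whose variable names
// have already been replaced by slot indexes.
type assign struct {
	dst, src int
	n        float64
}

// run executes the program; every variable access is a slice index,
// so no map lookup happens at run time.
func run(prog []assign, numVars int) []float64 {
	vars := make([]float64, numVars)
	for _, a := range prog {
		vars[a.dst] = vars[a.src] + a.n
	}
	return vars
}

func main() {
	r := &resolver{slots: map[string]int{}}
	x, y := r.index("x"), r.index("y")
	// x = x + 2; y = x + 3
	prog := []assign{{dst: x, src: x, n: 2}, {dst: y, src: x, n: 3}}
	vars := run(prog, len(r.slots))
	fmt.Println(vars[x], vars[y])
}
```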


Once you have some proper benchmarks, it might be fun to compare GoAWK's 
performance to that of my awk package <https://github.com/spakin/awk>.  The 
package implements AWK semantics in Go, so a typical program is far more 
verbose than it would be in AWK, but it integrates tightly with Go code (e.g., 
one can use arbitrary Go code within the body of an AWK action).  GoAWK 
seems a lot easier to use when that level of integration is not needed, 
however.

I don't know how much performance difference this makes in practice, but my 
value struct (also in a value.go file) lazily converts among strings, 
floats, and ints and caches the conversions.  I don't keep track of "the" 
type of a value (your typ field), just whether I have a currently valid 
string/float/int representation.
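
A minimal sketch of that lazy-conversion idea follows. It is simplified to strings and floats, and its field and method names are invented for illustration rather than copied from either value.go. Because the cache is filled in on demand, the methods take a pointer receiver, which is a different trade-off from passing the struct around by value.

```go
package main

import (
	"fmt"
	"strconv"
)

// Value caches each representation once it has been computed.
// The flags record which representations are currently valid;
// there is no single "type" field. (Hypothetical layout.)
type Value struct {
	s      string
	f      float64
	hasStr bool // s is valid
	hasNum bool // f is valid
}

// NewString wraps a string; its numeric form is computed on first use.
func NewString(s string) *Value { return &Value{s: s, hasStr: true} }

// NewFloat wraps a number; its string form is computed on first use.
func NewFloat(f float64) *Value { return &Value{f: f, hasNum: true} }

// Float returns the numeric form, converting and caching it if needed.
// Strings that do not parse become 0, a simplification of AWK's rules.
func (v *Value) Float() float64 {
	if !v.hasNum {
		f, err := strconv.ParseFloat(v.s, 64)
		if err != nil {
			f = 0
		}
		v.f, v.hasNum = f, true
	}
	return v.f
}

// String returns the string form, converting and caching it if needed.
// The 'g' format with precision 6 mirrors AWK's default "%.6g".
func (v *Value) String() string {
	if !v.hasStr {
		v.s, v.hasStr = strconv.FormatFloat(v.f, 'g', 6, 64), true
	}
	return v.s
}

func main() {
	v := NewString("42")
	fmt.Println(v.Float() + 1) // parses once; later calls reuse the cache
	fmt.Println(NewFloat(2.5).String())
}
```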

No need to change your lexer/parser at this stage, but I've recently grown 
quite fond of PEG parsers 
<https://en.wikipedia.org/wiki/Parsing_expression_grammar>.  These are a 
lot like hand-rolled, recursive-descent parsers so they're relatively easy 
to wrap your head around and reasonably powerful but require less 
code/effort than actually rolling your own.  For Go, I've used pigeon 
<https://github.com/mna/pigeon> for a few projects (e.g., edif2qmasm 
<https://github.com/lanl/edif2qmasm>, for which a PEG parser is probably 
overkill).  I like pigeon, but I admit I didn't do a thorough analysis of 
all the available PEG parsers for Go before going with that one.
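
To illustrate how closely the two approaches line up, here is a small hand-rolled recursive-descent evaluator in which each function corresponds to one rule of a PEG. The grammar in the comment uses generic PEG notation, not pigeon's exact input syntax, and the toy expression language (integers, +, -, *, parentheses, no whitespace) is an assumption made for the example. A PEG generator would produce the equivalent of these functions from the grammar alone.

```go
package main

import (
	"errors"
	"fmt"
)

// Grammar in generic PEG notation:
//
//	Expr   <- Term (('+' / '-') Term)*
//	Term   <- Factor ('*' Factor)*
//	Factor <- '(' Expr ')' / Number
//	Number <- [0-9]+
type parser struct {
	src string
	pos int
}

// peek returns the next byte, or 0 at end of input.
func (p *parser) peek() byte {
	if p.pos < len(p.src) {
		return p.src[p.pos]
	}
	return 0
}

func (p *parser) expr() (int, error) {
	v, err := p.term()
	if err != nil {
		return 0, err
	}
	for p.peek() == '+' || p.peek() == '-' {
		op := p.peek()
		p.pos++
		r, err := p.term()
		if err != nil {
			return 0, err
		}
		if op == '+' {
			v += r
		} else {
			v -= r
		}
	}
	return v, nil
}

func (p *parser) term() (int, error) {
	v, err := p.factor()
	if err != nil {
		return 0, err
	}
	for p.peek() == '*' {
		p.pos++
		r, err := p.factor()
		if err != nil {
			return 0, err
		}
		v *= r
	}
	return v, nil
}

func (p *parser) factor() (int, error) {
	if p.peek() == '(' {
		p.pos++
		v, err := p.expr()
		if err != nil {
			return 0, err
		}
		if p.peek() != ')' {
			return 0, errors.New("expected ')'")
		}
		p.pos++
		return v, nil
	}
	start, v := p.pos, 0
	for c := p.peek(); c >= '0' && c <= '9'; c = p.peek() {
		v = v*10 + int(c-'0')
		p.pos++
	}
	if p.pos == start {
		return 0, errors.New("expected number")
	}
	return v, nil
}

// Eval parses and evaluates src, rejecting any trailing input.
func Eval(src string) (int, error) {
	p := &parser{src: src}
	v, err := p.expr()
	if err == nil && p.pos != len(p.src) {
		err = errors.New("unexpected trailing input")
	}
	return v, err
}

func main() {
	v, err := Eval("2+3*(4-1)")
	fmt.Println(v, err)
}
```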

Nice work!

— Scott
