On Friday, August 24, 2018 at 3:13:25 PM UTC-6, Ben Hoyt wrote:
>
> I recently wrote an AWK interpreter in Go:
> https://github.com/benhoyt/goawk
>
> It's a pretty simple implementation: hand-rolled lexer, recursive-descent
> parser, and tree-walk interpreter. It's pretty complete according to the
> POSIX spec, and it passes the AWK ("one true awk") test suite as well as my
> own unit tests.
>
> In some quick tests I ran, I/O speed is on a par with or better than AWK, but
> the interpreter itself is quite slow -- about 5x slower for a lot of
> things. I hope to add some proper benchmarks soon. I have a pretty good idea of
> why and how to fix it: variable lookup and assignment are slow, and I'm
> planning to fix that by resolving more things at parse time.
>
> One thing that's a bit funky about AWK is its type handling: string values
> can be real strings or "numeric strings" (numbers that came from user
> input). I'm currently passing the "value" struct (see interp/value.go)
> around by value. I still need to test if that's a good idea
> performance-wise or not.
>
> I'd love to hear any comments, bug reports, or -- especially -- code
> reviews.
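
On "resolving more things at parse time": one common way to do that, sketched below, is to give each variable name a slot number while parsing so that the interpreter indexes into a slice instead of doing a map lookup on every access. This is only an illustration of the general idea, not GoAWK's code; the names resolver, newResolver, and index are hypothetical.

```go
package main

import "fmt"

// resolver (hypothetical) assigns each distinct variable name a slot
// index once, at parse time.
type resolver struct {
	slots map[string]int
}

func newResolver() *resolver {
	return &resolver{slots: make(map[string]int)}
}

// index returns the slot for name, allocating a new slot on first use.
func (r *resolver) index(name string) int {
	if i, ok := r.slots[name]; ok {
		return i
	}
	i := len(r.slots)
	r.slots[name] = i
	return i
}

func main() {
	r := newResolver()

	// "Parse time": every occurrence of a name is turned into an integer.
	x := r.index("x")
	y := r.index("y")
	xAgain := r.index("x") // same name, same slot

	// "Run time": variables live in a slice, so each access is a plain
	// index operation rather than a string-keyed map lookup.
	vars := make([]float64, len(r.slots))
	vars[x] = 3
	vars[y] = vars[xAgain] * 2

	fmt.Println(x, y, xAgain, vars[y]) // prints "0 1 0 6"
}
```

Whether this accounts for much of the 5x gap is something only the benchmarks can answer.
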
Once you have some proper benchmarks, it might be fun to compare GoAWK's performance to that of my awk package <https://github.com/spakin/awk>. The package implements AWK semantics in Go, so a typical program is far more verbose than it would be in AWK, but it integrates tightly with Go code (e.g., one can use arbitrary Go code within the body of an AWK action). GoAWK seems a lot easier to use when that level of integration is not needed, however.

I don't know how much performance difference this makes in practice, but my value struct (also in a value.go file) lazily converts among strings, floats, and ints and caches the conversions. I don't keep track of "the" type of a value (your typ field), just whether I have a currently valid string/float/int representation.

No need to change your lexer/parser at this stage, but I've recently grown quite fond of PEG parsers <https://en.wikipedia.org/wiki/Parsing_expression_grammar>. These are a lot like hand-rolled, recursive-descent parsers, so they're relatively easy to wrap your head around and reasonably powerful, but they require less code and effort than actually rolling your own. For Go, I've used pigeon <https://github.com/mna/pigeon> for a few projects (e.g., edif2qmasm <https://github.com/lanl/edif2qmasm>, for which a PEG parser is probably overkill). I like pigeon, but I admit I didn't do a thorough analysis of all the available PEG parsers for Go before going with that one.

Nice work!

- Scott
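
To make the lazy-conversion idea above concrete, here is a rough sketch of a value that stores no single "type", only flags saying which of its string/float/int representations are currently valid, and that converts on demand and caches the result. It is written from the description in this message, not copied from either package; the names Value, FromString, FromFloat, Float, Int, and String are hypothetical, and the string-to-number rule is deliberately simplified (real AWK accepts a leading numeric prefix).

```go
package main

import (
	"fmt"
	"strconv"
)

// Value (hypothetical) caches up to three representations of one datum.
// The has* flags record which representations are currently valid.
type Value struct {
	s                string
	f                float64
	i                int
	hasS, hasF, hasI bool
}

func FromString(s string) *Value { return &Value{s: s, hasS: true} }
func FromFloat(f float64) *Value { return &Value{f: f, hasF: true} }

// Float converts at most once and then reuses the cached result.
func (v *Value) Float() float64 {
	if !v.hasF {
		switch {
		case v.hasI:
			v.f = float64(v.i)
		case v.hasS:
			// Simplification: a string that fails to parse becomes 0.
			v.f, _ = strconv.ParseFloat(v.s, 64)
		}
		v.hasF = true
	}
	return v.f
}

// Int truncates the float form and caches the result.
func (v *Value) Int() int {
	if !v.hasI {
		v.i = int(v.Float())
		v.hasI = true
	}
	return v.i
}

// String formats whichever numeric form is available and caches it.
func (v *Value) String() string {
	if !v.hasS {
		switch {
		case v.hasI:
			v.s = strconv.Itoa(v.i)
		case v.hasF:
			v.s = strconv.FormatFloat(v.f, 'g', -1, 64)
		}
		v.hasS = true
	}
	return v.s
}

func main() {
	v := FromString("42.5")
	fmt.Println(v.Float() + 1) // parses once: 43.5
	fmt.Println(v.Int())       // reuses the cached float: 42
	fmt.Println(v.String())    // original string, no conversion: 42.5
}
```

Because the cache is filled in place, a value like this has to be passed by pointer (or the cache is lost on every copy), and any assignment must clear the flags for the stale representations. That is a different trade-off from passing a small struct around by value.
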