Doug Ransom <[EMAIL PROTECTED]> wrote,
> Are combinator parsers recursive decent? Do they have any disadvantages
> over generated parsers like that yacc would produce?
Yes, they are recursive decent. As a consequence you have
to left-factorise left-recursive productions in the grammar;
otherwise, the parser won't terminate. Other than that,
there are two disadvantages that I see: Speed and clarity of
error messages during *development* of the parser.
As for speed, Doitase Swierstra et al. came up with a very
clever scheme of computing a parse table in a combinator
parser on-the-fly during parsing, which makes things a lot
faster:
http://www.cs.uu.nl/groups/ST/Software/UU_Parsing/
I provide a variant of Doaitse's parser combinators in the
Compiler Toolkit:
http://www.cse.unsw.edu.au/~chak/haskell/ctk/index.html
This also includes a set of lexer combinators, which are
also self-optimising. (Quite close to a hand written lexer:
<http://www.cse.unsw.edu.au/~chak/papers/Cha99.html>.)
As for the second problem - clarity of error messages -
there has been quite some effort in letting parser
combinators deal gracefully with parse errors; however,
there is another set of errors, which have received a lot
less attention: the errors that the developer of a parser
gets while debugging the parser. This is not so relevant
for the toy parsers most people are writing, but if you are
handling a complete programming language, it starts to get
more serious.
Having said that, let me point out that parser combinators
also have strong advantages when compared to generators.
Some can handle much more complex classes of grammars than
what generator parsers usually over you. Its also easier to
generate grammers on the fly or adapt the parser depending
on the runtime configuration of the program. Moreover, you
don't need an extra tool. And despite the mentioned
disadvantages you can use combinators to produce parsers for
real programming languages. The C front-end of the tool
C->Haskell
http://www.cse.unsw.edu.au/~chak/haskell/c2hs/
is implemented using self-optimising lexer and parser
combinators and I use it on C headers that (after being run
through cpp) have up to 200k.
Cheers,
Manuel