"Neil Cerutti" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > On 2007-11-07, Just Another Victim of the Ambient Morality > <[EMAIL PROTECTED]> wrote: >> "Neil Cerutti" <[EMAIL PROTECTED]> wrote in message >> news:[EMAIL PROTECTED] >>> You might be interested in the Early parsing algorithm. It is >>> more efficient than the naive approach used in your prototype, >>> and still handles ambiguous grammars. >> > > I'll take this opportunity to correct my misspelling. It's > "Earley". > >> I think I might be interested in this algorithm, thank you! >> >>> http://www.cpsc.ucalgary.ca/~aycock/spark/ >> >> You know, I tried this thing but, for the life of me, I >> can't figure out how to use it and the few tutorials out there >> are less than illuminating... > > I'll send you the code I composed. > > The tricky part of Spark is the Token and AST classes you have to > use. A type used as a token class is required to provide a > __cmp__ function that behaves in the following confounding > manner: > > class Token(object): > def __cmp__(self, o): > return cmp(self.type, o) > > If you attempt to use a list, string or a tuple as token, it just > barfs. AST's are required to provide an even weirder interface. > > In effect, you have to write badly designed wrappers around > tuples and lists, respectively to take advantage of the generic > classes. > > Go to the examples directory of the distribution to find working > versions of these stupid classes. > > Once you get over that hurdle it becomes easier. Be sure to > provide your Parser and Scanner classes with an error method to > prevent the library from raising SystemExit(!) on errors. Scanner > classes are also required to override the t_default method to > prevent this mishap. > > In short, it hasn't really evovled into a user-friendly package > yet.
Thank you. How is it that I seem to be the only one in the market for a correct parser? Earley has a runtine of O(n^3) in the worst case and O(n^2) typically. I have trouble believing that everyone else in the world has such intense run-time requirements that they're willing to forego correctness. Why can't I find a pyparsing-esque library with this implementation? I'm tempted to roll my own except that it's a fairly complicated algorithm and I don't really understand how it's any more efficient than the naive approach... -- http://mail.python.org/mailman/listinfo/python-list