On Nov 11, 12:26 am, Thomas Mlynarczyk <[EMAIL PROTECTED] webdesign.de> wrote:
> John Machin schrieb:
>
> >> On the other hand: If all my tokens are "mutually exclusive" then,
> > But they won't *always* be mutually exclusive (another example is
> > relational operators (< vs <=, > vs >=)) and AFAICT there is nothing
> > useful that the lexer can do with an assumption/guess/input that they
> > are mutually exclusive or not.
>
> "<" vs. "<=" can be handled with lookaheads (?=...) / (?!...) in regular
> expressions.

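For reference, a minimal sketch of the lookahead idea quoted above. The token names LT and LE and the helper match_at are hypothetical and not taken from the lexer under discussion. The negative lookahead makes the two patterns mutually exclusive, so the order in which they are tried no longer affects the result:

```python
import re

# Hypothetical token table (names are illustrative only).
# "<(?!=)" matches "<" only when it is NOT followed by "=",
# so it can never fire on the first character of "<=".
TOKENS = [
    ("LT", re.compile(r"<(?!=)")),
    ("LE", re.compile(r"<=")),
]

def match_at(text, pos):
    """Return the names of all token patterns that match at position pos."""
    return [name for name, regex in TOKENS if regex.match(text, pos)]
```

With these patterns, at most one name is returned for any position, whichever order the list is in.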
Single-character tokens like "<" may be more efficiently handled by
doing a dict lookup after failing to find a match in the list of
(name, regex) tuples.

> True, the lexer cannot do anything useful with the
> assumption that all tokens are mutually exclusive. But if they are,
> there will be no ambiguity and I am guaranteed to get always the same
> sequence of tokens from the same input string.

So what? That is useless knowledge. It is the ambiguous cases that you
need to be concerned with.

>
> > Your Lexer class should promise to check the regexes in the order
> > given. Then the users of your lexer can arrange the order to suit
> > themselves.
>
> Yes. So there's no way around a list of tuples instead of dict().

Correct.

>
> > Your code uses dict methods; this forces your callers to *create* a
> > mapping. However (as I said) your code doesn't *use* that mapping --
> > there is no RHS usage of dict[key] or dict.get(key) etc. In fact I'm
> > having difficulty imagining what possible practical use there could be
> > for a mapping from token-name to regex.
>
> Sorry, but I still don't quite get it.
>
> for name, regex in self.tokens.iteritems():
>     # ...
>     self.result.append( ( name, match, self.line ) )
>
> What I do here is take a name and its associated regex and then store a
> tuple (name, match, line). In a simpler version of the lexer, I might
> store only `name` instead of the tuple. Is your point that the lexer
> doesn't care what `name` actually is, but simply passes it through from
> the tokenlist to the result?

No, not at all. The point is that you were not *using* any of the
mapping functionality of the dict object, only ancillary methods like
iteritems -- hence, you should not have been using a dict at all.

--
http://mail.python.org/mailman/listinfo/python-list
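A rough sketch of the arrangement discussed above: regexes kept in a list of (name, regex) tuples and tried in the order given, with a dict lookup for single-character tokens only after every regex has failed. All token names, the TOKEN_SPECS and SINGLE_CHAR tables, and the tokenize function are assumptions made for illustration; this is not the original Lexer class.

```python
import re

# Hypothetical token definitions, tried strictly in the order given.
# Callers control precedence by arranging this list.
TOKEN_SPECS = [
    ("NUMBER", re.compile(r"\d+")),
    ("NAME",   re.compile(r"[A-Za-z_]\w*")),
    ("LE",     re.compile(r"<=")),
    ("GE",     re.compile(r">=")),
]

# Single-character tokens, consulted only after every regex has failed.
# Here the dict IS used as a mapping (character -> token name).
SINGLE_CHAR = {"<": "LT", ">": "GT", "=": "ASSIGN"}

def tokenize(text):
    """Return a list of (name, lexeme) pairs; whitespace is skipped."""
    result = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        for name, regex in TOKEN_SPECS:
            m = regex.match(text, pos)
            if m:
                result.append((name, m.group()))
                pos = m.end()
                break
        else:
            # The else branch of the for loop runs only when no regex matched.
            name = SINGLE_CHAR.get(text[pos])
            if name is None:
                raise ValueError(
                    "unexpected character %r at position %d" % (text[pos], pos)
                )
            result.append((name, text[pos]))
            pos += 1
    return result
```

Because "<=" is matched by a regex before the single-character fallback is reached, "<" and "<=" are told apart by ordering alone, without lookaheads.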