John Machin wrote:

On the other hand: If all my tokens are "mutually exclusive" then,

But they won't *always* be mutually exclusive (another example is
relational operators (< vs <=, > vs >=)) and AFAICT there is nothing
useful that the lexer can do with an assumption/guess/input that they
are mutually exclusive or not.

"<" vs. "<=" can be handled with lookaheads, (?=...) / (?!...), in regular expressions. True, the lexer cannot do anything useful with the assumption that all tokens are mutually exclusive. But if they are, there is no ambiguity, and I am guaranteed to always get the same sequence of tokens from the same input string.

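A minimal sketch of the lookahead idea, assuming two hypothetical token names, LT and LE, that are not taken from the actual lexer. The negative lookahead (?!=) keeps the "<" pattern from matching the first character of "<=", so the two patterns can never match at the same position:

```python
import re

# Hypothetical token patterns, for illustration only.
LT = re.compile(r"<(?!=)")   # "<" that is NOT followed by "="
LE = re.compile(r"<=")

def classify(text):
    """Return the names of all patterns that match at the start of text."""
    return [name for name, regex in (("LT", LT), ("LE", LE)) if regex.match(text)]

print(classify("< 3"))    # ['LT']
print(classify("<= 3"))   # ['LE']
```

Because at most one pattern matches, the result does not depend on the order in which the patterns are tried.
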
Your Lexer class should promise to check the regexes in the order
given. Then the users of your lexer can arrange the order to suit
themselves.

Yes. So there's no way around a list of tuples instead of dict().
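
A rough sketch of what an order-preserving token list could look like. The function name tokenize and the SPECS list are made up for this example and are not the lexer discussed in this thread. The patterns are tried in the order given, so the caller resolves "<" vs. "<=" simply by listing "<=" first:

```python
import re

def tokenize(text, token_specs):
    """Scan text, trying the (name, pattern) pairs in the order given."""
    compiled = [(name, re.compile(pattern)) for name, pattern in token_specs]
    pos, result = 0, []
    while pos < len(text):
        for name, regex in compiled:
            match = regex.match(text, pos)
            if match and match.end() > pos:   # skip empty matches
                result.append((name, match.group()))
                pos = match.end()
                break
        else:
            # No pattern matched at this position.
            raise ValueError("no token matches at position %d: %r"
                             % (pos, text[pos:pos + 10]))
    return result

# Hypothetical token list: order matters, "<=" must come before "<".
SPECS = [
    ("LE", r"<="),
    ("LT", r"<"),
    ("NUMBER", r"\d+"),
    ("SPACE", r"\s+"),
]

print(tokenize("1 <= 2", SPECS))
# [('NUMBER', '1'), ('SPACE', ' '), ('LE', '<='), ('SPACE', ' '), ('NUMBER', '2')]
```

With LT listed before LE, the same input would yield an LT token for "<" and then fail on the stray "=", which is exactly the ordering dependency a dict cannot express reliably.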

Your code uses dict methods; this forces your callers to *create* a
mapping. However (as I said) your code doesn't *use* that mapping --
there is no RHS usage of dict[key] or dict.get(key) etc. In fact I'm
having difficulty imagining what possible practical use there could be
for a mapping from token-name to regex.

Sorry, but I still don't quite get it.

for name, regex in self.tokens.iteritems():
    # ...
    self.result.append( ( name, match, self.line ) )

What I do here is take a name and its associated regex and then store a tuple (name, match, line). In a simpler version of the lexer, I might store only `name` instead of the tuple. Is your point that the lexer doesn't care what `name` actually is, but simply passes it through from the tokenlist to the result?

To *best* see whitespace (e.g. Is that a TAB or multiple spaces?), use
%r.

(Having just modified my code accordingly:) Ah, yes, indeed, that is much better!
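
For illustration, a small sketch of the difference between %s and %r. The helper name visible is invented for this example:

```python
def visible(value):
    """Format value with %r so that whitespace characters become visible."""
    return "%r" % (value,)

with_tab = "a\tb"
with_spaces = "a    b"

print("%s" % with_tab)        # the TAB is printed as raw whitespace
print(visible(with_tab))      # 'a\tb'
print(visible(with_spaces))   # 'a    b'
```

With %s the two strings can look identical on screen, depending on the tab width; with %r the TAB shows up as \t.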

General advice: What you think you see is often not what you've
actually got. repr() is your friend; use it.

Lesson learned :-)

Greetings,
Thomas

--
Ce n'est pas parce qu'ils sont nombreux à avoir tort qu'ils ont raison!
(Coluche)