On Nov 10, 9:33 am, Thomas Mlynarczyk <[EMAIL PROTECTED]> wrote:
> John Machin wrote:
>
> >>> dict.iter<anything>() will return its results in essentially random
> >>> order.
> > A list of somethings does seem indicated.
>
> On the other hand: If all my tokens are "mutually exclusive" then,

But they won't *always* be mutually exclusive (another example is
relational operators (< vs <=, > vs >=)) and AFAICT there is nothing
useful that the lexer can do with an assumption/guess/input that they
are mutually exclusive or not.

> in theory, the order in which they are tried, should not matter, as
> at most one token could match at any given offset. Still, having the
> most frequent tokens being tried first should improve performance.

Your Lexer class should promise to check the regexes in the order
given. Then the users of your lexer can arrange the order to suit
themselves.

>
> > A dict is a hashtable, intended to provide a mapping from keys to
> > values. It's not intended to have order. In any case your code doesn't
> > use the dict as a mapping.
>
> I map token names to regular expressions. Isn't that a mapping?

Your code uses dict methods; this forces your callers to *create* a
mapping. However (as I said) your code doesn't *use* that mapping --
there is no RHS usage of dict[key] or dict.get(key) etc. In fact I'm
having difficulty imagining what possible practical use there could be
for a mapping from token-name to regex.

> >>>> return "\n".join(
> >>>> [ "[L:%s]\t[O:%s]\t[%s]\t'%s'" %
> > The first 3 are %s, the last one is '%s'
>
> I only put the single quotes so I could better "see" whitespace in the
> output.

To *best* see whitespace (e.g. Is that a TAB or multiple spaces?), use
%r.

General advice: What you think you see is often not what you've
actually got. repr() is your friend; use it.

Cheers,
John
--
http://mail.python.org/mailman/listinfo/python-list
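
For illustration only, here is a minimal sketch of the two suggestions above: a lexer that takes its rules as a list of (name, regex) pairs and tries them strictly in the order given, and token output formatted with %r so whitespace is unambiguous. The names OrderedLexer, tokens, and the sample rules are hypothetical; they are not taken from the code under discussion, and the sketch is written for Python 3.

```python
import re


class OrderedLexer:
    """Hypothetical sketch: tries the rules strictly in the order given."""

    def __init__(self, rules):
        # rules is a sequence of (token_name, pattern) pairs rather than a
        # dict, so the caller decides the order in which patterns are tried.
        self.rules = [(name, re.compile(pattern)) for name, pattern in rules]

    def tokens(self, text):
        pos = 0
        while pos < len(text):
            for name, regex in self.rules:
                m = regex.match(text, pos)
                # Ignore empty matches so the loop always makes progress.
                if m and m.end() > pos:
                    yield name, pos, m.group()
                    pos = m.end()
                    break
            else:
                raise ValueError(
                    "no rule matches at offset %d: %r" % (pos, text[pos:pos + 10])
                )


rules = [
    ("LE", r"<="),  # listed before LT, otherwise "<=" lexes as "<" then "="
    ("LT", r"<"),
    ("EQ", r"="),
    ("NAME", r"[A-Za-z_]\w*"),
    ("WS", r"[ \t]+"),
]

lexer = OrderedLexer(rules)
result = list(lexer.tokens("a <=\tb"))

# %r shows the TAB as '\t' instead of an invisible gap in the output.
lines = ["[O:%s]\t[%s]\t%r" % (offset, name, value)
         for name, offset, value in result]
print("\n".join(lines))
```

Because the tokens here are not mutually exclusive ("<" is a prefix of "<="), the caller has to list LE before LT. A lexer that promises to honour the given order makes that possible, which a dict-based interface could not guarantee.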