On 09-03-15 at 15:39, Chris Angelico wrote:
> On Tue, Mar 10, 2015 at 1:34 AM, Antoon Pardon
> <antoon.par...@rece.vub.ac.be> wrote:
>>> There is str.isidentifier, which returns True if something is a valid
>>> identifier name:
>>>
>>>>>> '℮'.isidentifier()
>>> True
>> Which is not very useful in the context of lexical analysis. I don't need to
>> know if a particular string is valid as an identifier; I want to know which
>> parts of a text are identifiers.
> If you're doing lexical analysis, you probably want a lexer. For
> Python, I would recommend parsing to AST and doing your analysis on
> that; I've had pretty good success doing that, and then using the
> line/column info to go back to the original text if I need it. A regex
> is probably not going to be sufficient for that kind of work.

Maybe I am falling behind, but so far the lexers I have used require a regular
expression per kind of token you want to recognize. At least PLY still seems to
work like that. So if an identifier is one such kind of token, I need a regular
expression that matches what an identifier is.
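
To make the question concrete, here is a rough sketch of the kind of pattern I
mean, using only the standard re module. It is an approximation, not the
language's actual definition: \w in re does not cover everything that
str.isidentifier accepts (the '℮' example above is one such case), and it also
accepts some characters that are not allowed in identifiers, hence the extra
filter. The names IDENT_RE and find_identifiers are made up for illustration.

```python
import re

# Approximate identifier pattern: one word character that is not a digit,
# followed by any number of word characters. This is an assumption-laden
# stand-in for the real rules, not the exact definition Python uses.
IDENT_RE = re.compile(r"[^\W\d]\w*")


def find_identifiers(text):
    """Return the substrings of text that look like identifiers.

    Candidates come from the regex; str.isidentifier then rejects the
    ones the approximate pattern let through by mistake.
    """
    return [
        m.group(0)
        for m in IDENT_RE.finditer(text)
        if m.group(0).isidentifier()
    ]


if __name__ == "__main__":
    print(find_identifiers("x1 = größe + 2*foo_bar"))
```

A pattern like this could be handed to a lexer as its identifier rule, but it
would still miss characters such as '℮', which is exactly the gap I am asking
about.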

-- 
Antoon Pardon 

-- 
https://mail.python.org/mailman/listinfo/python-list