Hi Nils and all,

On 3 Mar, 08:25, Nils Bruin <nbr...@sfu.ca> wrote:
> A possibly somewhat heavyhanded approach:
>
> sage: import tokenize,StringIO
> sage: S="QQ['t'], a, a_2, for"
> sage: list((a[0],a[1]) for a in
> tokenize.generate_tokens(StringIO.StringIO(S).readline))
> [(1, 'QQ'), (51, '['), (3, "'t'"), (51, ']'), (51, ','), (1, 'a'),
> (51, ','), (1, 'a_2'), (51, ','), (1, 'for'), (0, '')]

I guess the regular expression defined at
http://docs.python.org/reference/lexical_analysis.html#identifiers
would be a better tool.

> It does do the token splitting according to python rules and marks
> most "special" characters. Surprisingly, the tokenizer doesn't mark a
> reserved word like for yet, though.

Yes, refusing reserved words is a separate issue:

    sage: var('while')
    while
    sage: while
    ------------------------------------------------------------
       File "<ipython console>", line 1
         while
              ^
    SyntaxError: invalid syntax

    sage: globals()['while']
    while

So, using a reserved name for a variable currently does not produce an
error.

I suggest that var(s) does something like the following:

    import re
    import keyword

    # The regular expression should be compiled only once and then
    # stored somewhere. Note that match() anchors only at the start
    # of the string, so \Z is needed to reject names such as 'a b'.
    identifier = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*\Z")

    if not isinstance(s, basestring):
        raise TypeError("Variable name must be a string")
    if not identifier.match(s):
        raise ValueError("'%s' is not a valid identifier" % s)
    if keyword.iskeyword(s):
        raise ValueError("'%s' is a reserved keyword in Python" % s)

Would this work?

If it would work, I think the test above should be implemented as a
function in sage.misc.defaults. In that way, it could also be used by
normalize_variable_names.

Cheers,
Simon
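
P.S. For concreteness, here is a minimal, self-contained sketch of the proposed check written as a function. The name check_variable_name is made up for illustration and is not an existing Sage function. I also assume str instead of basestring so that the sketch runs on Python 3; on Python 2 one would test against basestring. The pattern ends in \Z because match() only anchors at the start of the string, so without it a name like 'a b' would slip through.

```python
import keyword
import re

# Compiled once at import time. \Z anchors the pattern at the very end
# of the string, so trailing characters (including a newline) are rejected.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")


def check_variable_name(s):
    """Return s unchanged if it is usable as a variable name.

    Hypothetical helper: raises TypeError for non-strings and
    ValueError for invalid identifiers or reserved keywords.
    """
    if not isinstance(s, str):
        raise TypeError("Variable name must be a string")
    if not _IDENTIFIER.match(s):
        raise ValueError("'%s' is not a valid identifier" % s)
    if keyword.iskeyword(s):
        raise ValueError("'%s' is a reserved keyword in Python" % s)
    return s


# Try the names from the tokenizer example, plus a reserved word.
for name in ["QQ['t']", "a", "a_2", "for"]:
    try:
        check_variable_name(name)
        print("%r is accepted" % name)
    except ValueError as err:
        print("%r is rejected: %s" % (name, err))
```

Only ASCII letters, digits and underscores are accepted here, as in the regular expression I proposed above. Whether var() should raise or just warn on a reserved word is a separate design question.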