Hi Nils and all,

On 3 Mar., 08:25, Nils Bruin <nbr...@sfu.ca> wrote:
> A possibly somewhat heavyhanded approach:
>
> sage: import tokenize,StringIO
> sage: S="QQ['t'], a, a_2, for"
> sage: list((a[0],a[1]) for a in
> tokenize.generate_tokens(StringIO.StringIO(S).readline))
> [(1, 'QQ'), (51, '['), (3, "'t'"), (51, ']'), (51, ','), (1, 'a'),
> (51, ','), (1, 'a_2'), (51, ','), (1, 'for'), (0, '')]

I guess using the regular expression defined at
http://docs.python.org/reference/lexical_analysis.html#identifiers
would be a better tool.

> It does do the token splitting according to python rules and marks
> most "special" characters. Surprisingly, the tokenizer doesn't mark a
> reserved word like for yet, though.

Yes, refusing reserved words is a separate issue:

sage: var('while')
while
sage: while
------------------------------------------------------------
   File "<ipython console>", line 1
     while
          ^
SyntaxError: invalid syntax

sage: globals()['while']
while

So, using a reserved name for a variable currently does not produce an
error.
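
To make the two observations concrete (the tokenizer reports a reserved word as an ordinary name, and the keyword module can tell the difference), here is a small sketch. It is written for Python 3, so it uses io.StringIO instead of the StringIO module from the session quoted above, and the helper name name_tokens is made up purely for illustration.

```python
import io
import keyword
import token
import tokenize


def name_tokens(source):
    """Return (text, is_keyword) for every NAME token in `source`.

    The tokenizer classifies reserved words such as `for` as NAME,
    so keyword.iskeyword() is used to tell them apart.
    """
    readline = io.StringIO(source).readline
    return [
        (tok.string, keyword.iskeyword(tok.string))
        for tok in tokenize.generate_tokens(readline)
        if tok.type == token.NAME
    ]


print(name_tokens("QQ['t'], a, a_2, for"))
```

For Nils' example string this should list QQ, a, a_2 and for as names, with only for flagged as a keyword.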

I suggest that var(s) does something like the following:

import keyword
import re

# The following regular expression should be compiled
# only once and then stored somewhere. The trailing \Z makes
# match() reject strings such as "a b" or "a-1", which merely
# start with a valid identifier.
identifier = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")

if not isinstance(s, basestring):
    raise TypeError("Variable name must be a string")
if not identifier.match(s):
    raise ValueError("'%s' is not a valid identifier" % s)
if keyword.iskeyword(s):
    raise ValueError("'%s' is a reserved keyword in Python" % s)

Would this work?

If it works, I think the test above should be implemented as a
function in sage.misc.defaults. That way, it could also be used by
normalize_variable_names.
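
Wrapped up as a function, the test could look roughly like the sketch below. The name validate_variable_name is hypothetical (nothing of that name exists in Sage as far as I know), and the sketch is written for Python 3, so it checks against str rather than basestring. The pattern is anchored with \Z so that the whole string has to be an identifier, not just a prefix of it.

```python
import keyword
import re

# Compiled once at import time. \Z anchors the match at the very end
# of the string, so "a b" or "a\n" are rejected.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")


def validate_variable_name(s):
    """Return `s` unchanged if it is usable as a variable name.

    Hypothetical helper: raises TypeError for non-strings and
    ValueError for invalid identifiers or reserved keywords.
    """
    if not isinstance(s, str):
        raise TypeError("Variable name must be a string")
    if not _IDENTIFIER.match(s):
        raise ValueError("'%s' is not a valid identifier" % s)
    if keyword.iskeyword(s):
        raise ValueError("'%s' is a reserved keyword in Python" % s)
    return s
```

With this, validate_variable_name('a_2') returns 'a_2', while 'while', '2a' and 'a b' each raise a ValueError. Both var and normalize_variable_names could then call the same helper instead of duplicating the checks.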

Cheers,
Simon
