Hi,

A while ago I asked a question on the list about a simple eval function, capable of eval'ing simple python constructs (tuples, dicts, lists, strings, numbers etc) in a secure manner:
http://groups.google.com/group/comp.lang.python/browse_thread/thread/58a01273441d445f/

From the answers I got I chose to use simplejson... However, I was also pointed to a simple eval function by Fredrik Lundh:
http://effbot.org/zone/simple-iterator-parser.htm

His solution, using module tokenize, was short and elegant. So I used his code as a starting point for simple evaluation of dicts, tuples, lists, strings, unicode strings, integers, floats, None, True, and False. I've included the code below, together with some basic tests, and profiling... On my computer (winXP, python 2.5), simple eval is about 5 times slower than builtin eval...

Comments, speedups, improvements in general, etc. are appreciated. As this is a contribution to the community, I suggest that any improvements are posted in this thread...

-Tor Erik

Code (tested on 2.5, but should work for versions >= 2.3):

'''
Recursive evaluation of:
    tuples, lists, dicts, strings, unicode strings, ints, floats,
    True, False, and None
'''

import cStringIO, tokenize, itertools

KEYWORDS = {'None': None, 'False': False, 'True': True}

def atom(next, token):
    if token[1] == '(':
        out = []
        token = next()
        while token[1] != ')':
            out.append(atom(next, token))
            token = next()
            if token[1] == ',':
                token = next()
        return tuple(out)
    elif token[1] == '[':
        out = []
        token = next()
        while token[1] != ']':
            out.append(atom(next, token))
            token = next()
            if token[1] == ',':
                token = next()
        return out
    elif token[1] == '{':
        out = {}
        token = next()
        while token[1] != '}':
            key = atom(next, token)
            next() # Skip key-value delimiter
            token = next()
            out[key] = atom(next, token)
            token = next()
            if token[1] == ',':
                token = next()
        return out
    elif token[0] == tokenize.STRING:
        # Assumes a one-character quote (no triple quotes, no raw strings)
        if token[1].startswith('u'):
            # Unicode literal: strip the u prefix and the quotes
            return token[1][2:-1].decode('unicode-escape')
        return token[1][1:-1].decode('string-escape')
    elif token[0] == tokenize.NUMBER:
        try:
            return int(token[1], 0)
        except ValueError:
            return float(token[1])
    elif token[1] in KEYWORDS:
        return KEYWORDS[token[1]]
    raise SyntaxError('malformed expression (%r)' % token[1])

def simple_eval(source):
    src = cStringIO.StringIO(source).readline
    src = tokenize.generate_tokens(src)
    # Drop non-logical newlines so they never reach atom()
    src = itertools.ifilter(lambda x: x[0] != tokenize.NL, src)
    res = atom(src.next, src.next())
    if src.next()[0] != tokenize.ENDMARKER:
        raise SyntaxError("bogus data after expression")
    return res

if __name__ == '__main__':
    expr = (1, 2.3, u'h\xf8h\n', 'h\xc3\xa6', ['a', 1],
            {'list': [], 'tuple': (), 'dict': {}}, False, True, None)
    rexpr = repr(expr)

    a = simple_eval(rexpr)
    b = eval(rexpr)
    assert a == b

    import timeit
    print timeit.Timer('eval(rexpr)',
                       'from __main__ import rexpr').repeat(number=1000)
    print timeit.Timer('simple_eval(rexpr)',
                       'from __main__ import rexpr, simple_eval').repeat(number=1000)
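
For anyone reading this on Python 3, where cStringIO, itertools.ifilter and the 'string-escape' codec no longer exist, the same token-driven idea might be sketched as below. This is a rough port under assumptions, not the code timed above. The helper name items and the IGNORED tuple are made up for illustration. String tokens are decoded by handing the single token to ast.literal_eval, the standard library function that, from Python 2.6 on, evaluates this same family of literals by itself. NEWLINE tokens are filtered because recent Python 3 tokenizers emit one even when the source has no trailing newline.

```python
import ast
import io
import tokenize

KEYWORDS = {'None': None, 'False': False, 'True': True}
# Layout-only tokens that carry no value (an assumption of this sketch).
IGNORED = (tokenize.NL, tokenize.NEWLINE, tokenize.COMMENT)


def items(next_token, closer):
    """Collect comma-separated atoms up to the closing bracket."""
    out = []
    token = next_token()
    while token.string != closer:
        out.append(atom(next_token, token))
        token = next_token()
        if token.string == ',':
            token = next_token()
    return out


def atom(next_token, token):
    text = token.string
    if token.type == tokenize.OP:
        if text == '(':
            return tuple(items(next_token, ')'))
        if text == '[':
            return items(next_token, ']')
        if text == '{':
            out = {}
            token = next_token()
            while token.string != '}':
                key = atom(next_token, token)
                if next_token().string != ':':
                    raise SyntaxError('expected ":" after dict key')
                out[key] = atom(next_token, next_token())
                token = next_token()
                if token.string == ',':
                    token = next_token()
            return out
    elif token.type == tokenize.STRING:
        # One STRING token holds exactly one literal; literal_eval is used
        # here only to strip the quotes and decode the escapes.
        return ast.literal_eval(text)
    elif token.type == tokenize.NUMBER:
        try:
            return int(text, 0)
        except ValueError:
            return float(text)
    elif token.type == tokenize.NAME and text in KEYWORDS:
        return KEYWORDS[text]
    raise SyntaxError('malformed expression (%r)' % text)


def simple_eval(source):
    readline = io.StringIO(source).readline
    tokens = (t for t in tokenize.generate_tokens(readline)
              if t.type not in IGNORED)
    next_token = tokens.__next__
    result = atom(next_token, next_token())
    if next_token().type != tokenize.ENDMARKER:
        raise SyntaxError('bogus data after expression')
    return result


if __name__ == '__main__':
    expr = (1, 2.3, 'h\xf8h\n', ['a', 1],
            {'list': [], 'tuple': (), 'dict': {}}, False, True, None)
    assert simple_eval(repr(expr)) == expr
```

Like the 2.x version, the sketch rejects negative numbers and reads a parenthesised single value such as (1) as a one-element tuple.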