On Mon, 23 Jul 2007 18:13:05 -0300, Matteo <[EMAIL PROTECTED]> wrote:

> I am trying to get Python to extract attributes in full dotted form
> from compiled expression. For instance, if I have the following:
>
> param = compile('a.x + a.y','','single')
>
> then I would like to retrieve the list consisting of ['a.x','a.y'].
>
> The reason I am attempting this is to try and automatically determine
> data dependencies in a user-supplied formula (in order to build a
> dataflow network). I would prefer not to have to write my own parser
> just yet.

If it is an expression, I think you should use "eval" instead of "single"
as the third argument to compile.

> Alternatively, I've looked at the parser module, but I am experiencing
> some difficulties in that the symbol list does not seem to match that
> listed in the python grammar reference (not surprising, since I am
> using python2.5, and the docs seem a bit dated)

Yes, the grammar.txt in the docs is a bit outdated (or perhaps it's a
simplified one), see the Grammar/Grammar file in the Python source
distribution.

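Going back to the compile() mode for a moment, here is a minimal sketch of
what "eval" mode gives you. It assumes a current Python 3 rather than 2.5,
and the names formula and namespace (and the values 1 and 2) are made up
for illustration. It also shows why the code object alone does not help
with dotted names: co_names only holds the individual identifiers.

```python
from types import SimpleNamespace

# Hypothetical formula and namespace, only for illustration.
formula = "a.x + a.y"
namespace = {"a": SimpleNamespace(x=1, y=2)}

# "eval" mode compiles a single expression whose value eval() returns.
code = compile(formula, "<formula>", "eval")
value = eval(code, namespace)

# The code object records the identifiers separately, not as 'a.x' / 'a.y'.
flat_names = code.co_names

print(value)
print(flat_names)
```

So the dotted form has to be recovered from the source text or from a parse
tree, not from the compiled code.
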
> In particular:
>
> >>> import parser
> >>> import pprint
> >>> import symbol
> >>> tl=parser.expr("a.x").tolist()
> >>> pprint.pprint(tl)
>
> [258,
>  [326,
>   [303,
>    [304,
>     [305,
>      [306,
>       [307,
>        [309,
>         [310,
>          [311,
>           [312,
>            [313,
>             [314,
>              [315,
>               [316, [317, [1, 'a']], [321, [23, '.'], [1, 'x']]]]]]]]]]]]]]]],
>  [4, ''],
>  [0, '']]
>
> >>> print symbol.sym_name[316]
> power
>
> Thus, for some reason, 'a.x' seems to be interpreted as a power
> expression, and not an 'attributeref' as I would have anticipated (in
> fact, the symbol module does not seem to contain an 'attributeref'
> symbol)

Using this little helper function to translate symbols and tokens:

    import symbol
    import token

    # Map numeric codes to names: grammar symbols first, then tokens.
    names = symbol.sym_name.copy()
    names.update(token.tok_name)

    def human_readable(lst):
        # Replace the numeric code at the head of each node, in place.
        lst[0] = names[lst[0]]
        for item in lst[1:]:
            if isinstance(item, list):
                human_readable(item)

the same tree becomes:

    ['eval_input',
     ['testlist',
      ['test',
       ['or_test',
        ['and_test',
         ['not_test',
          ['comparison',
           ['expr',
            ['xor_expr',
             ['and_expr',
              ['shift_expr',
               ['arith_expr',
                ['term',
                 ['factor',
                  ['power',
                   ['atom', ['NAME', 'a']],
                   ['trailer', ['DOT', '.'], ['NAME', 'x']]]]]]]]]]]]]]]],
     ['NEWLINE', ''],
     ['ENDMARKER', '']]

which is correct if you look at the symbols in the (right) Grammar file:
there is no 'attributeref' symbol; an attribute access is a 'power' node
made of an 'atom' followed by a 'trailer'.
But if you are only interested in things like a.x, maybe it's a lot simpler
to use the tokenize module, looking for the NAME and OP tokens as they
appear in the source expression.

-- 
Gabriel Genellina

-- 
http://mail.python.org/mailman/listinfo/python-list
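
P.S. A rough sketch of the tokenizer idea, under a few assumptions: it is
written for a current Python 3 (not 2.5), the function name dotted_names is
made up, and it works on the source string of the formula, not on the
compiled code. It chains NAME tokens joined by "." OP tokens, skips
keywords, and ignores attributes that hang off something other than a name
(as in f().x). Plain names such as function names are reported too.

```python
import io
import keyword
import tokenize


def dotted_names(source):
    """Return names such as 'a.x' in order of first appearance (sketch)."""
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

    def is_ident(tok):
        # Keywords (and, not, if, ...) are NAME tokens too, so filter them.
        return tok.type == tokenize.NAME and not keyword.iskeyword(tok.string)

    def is_dot(tok):
        return tok.type == tokenize.OP and tok.string == "."

    found = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        # Start a chain only at a name that is not itself an attribute
        # of some earlier, non-name expression.
        if is_ident(tok) and not (i > 0 and is_dot(tokens[i - 1])):
            parts = [tok.string]
            # Extend the chain while the pattern "." NAME continues.
            while (i + 2 < len(tokens)
                   and is_dot(tokens[i + 1])
                   and is_ident(tokens[i + 2])):
                parts.append(tokens[i + 2].string)
                i += 2
            name = ".".join(parts)
            if name not in found:
                found.append(name)
        i += 1
    return found


print(dotted_names("a.x + a.y"))
```

Text inside string literals is left alone because the tokenizer hands it
over as a single STRING token. If you later need to tell names from calls
or subscripts reliably, a real parse tree is the safer route.
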