On Apr 17, 4:03 am, Clarendon <jine...@hotmail.com> wrote:
> Thank you very much for this information. It seems to point me to the
> right direction. However, I do not fully understand the flatten
> function and its output. Some indices seem to be inaccurate. I tried
> to find this function at nltk.tree.Tree.flatten, but it returns a
> flattened tree, not a tuple.
>
> So your flatten function must be a different one, and it's not one of
> the builtins, either. Could you tell me where I can find the
> documentation about this flatten function?

Right, it is a different one. I don't even have it; we'd have to write
it. The indices weren't included in the flattened tree, but if you're
writing it, they can be. Each row is ( label, word, node object, index
of parent, depth ):

0: ( 'ROOT', None, <object>, None --no parent--, 0 )
1: ( 'S', None, <object>, 0 --parent is 'ROOT'--, 1 )
2: ( 'NP', None, <object>, 1 --parent is 'S'--, 2 )
3: ( 'PRP', 'I', <object>, 2 --parent is 'NP'--, 3 )
4: ( 'VP', None, <object>, 1 --parent is 'S'--, 2 )
5: ( 'VBD', 'came', <object>, 4 --parent is 'VP'--, 2 )

I screwed up the 'depth' field on #5. It should be:

5: ( 'VBD', 'came', <object>, 4 --parent is 'VP'--, **3** )

Otherwise I'm not sure what you mean by 'indices seem to be
inaccurate'. I can't promise the rest is right, though. After all, I
did it by hand, not by program.

If your package comes with a flatten function, it would be a good
place to start. Flatten functions can get hairy. What is its code,
and what is its output?

Here's an example that flattens nested lists:

>>> a= [ 'p', [ [ 'q', 'r' ], 's', 't' ], 'u' ]
>>> a
['p', [['q', 'r'], 's', 't'], 'u']
>>> def flatten( x ):
...     for y in x:
...         if isinstance( y, list ):
...             for z in flatten( y ):
...                 yield z
...         else:
...             yield y
...
>>> list( flatten( a ) )
['p', 'q', 'r', 's', 't', 'u']

--
http://mail.python.org/mailman/listinfo/python-list
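
P.S. Here is a rough sketch of the kind of flatten we'd have to write to get the rows above by program instead of by hand. It is only an illustration under my own assumptions: the tree is stored as nested tuples of the form (label, child, child, ...), where a child is either another tuple or a plain string holding the word. That is not NLTK's Tree class, and the name flatten_tree is made up for this post, so you would need to adapt the child-walking part to whatever your package gives you.

```python
def flatten_tree(node, parent=None, depth=0, rows=None):
    """Return a list of (label, word, node, parent_index, depth) rows.

    Assumption: each node is a tuple (label, child, child, ...), and a
    child is either another such tuple or a plain string (the word).
    """
    if rows is None:
        rows = []
    label, children = node[0], node[1:]
    # A preterminal such as ('PRP', 'I') carries its word directly.
    words = [c for c in children if isinstance(c, str)]
    word = words[0] if words else None
    index = len(rows)  # this row's position in the flat list
    rows.append((label, word, node, parent, depth))
    # Visit subtrees depth-first, passing our own index down as their parent.
    for child in children:
        if not isinstance(child, str):
            flatten_tree(child, index, depth + 1, rows)
    return rows


tree = ('ROOT', ('S', ('NP', ('PRP', 'I')), ('VP', ('VBD', 'came'))))
for i, (label, word, _node, parent, depth) in enumerate(flatten_tree(tree)):
    print(i, label, word, parent, depth)
```

Because each row is appended before its children are visited, a row's index is simply the length of the list at that moment, and that index is what gets handed to the children as their parent. The node object itself is kept in the third field so you can get back to the original subtree from any row.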