"jog" wrote:

> I want to get text out of some nodes of a huge xml file (1,5 GB). The
> architecture of the xml file is something like this
> I want to combine the text out of page:title and page:revision:text for
> every single page element. One by one I want to index these combined
> texts (so for each page one index)

here's one way to do it:

    try:
        import cElementTree as ET
    except ImportError:
        from elementtree import ElementTree as ET

    for event, elem in ET.iterparse(file):
        if elem.tag == "page":
            title = elem.findtext("title")
            revision = elem.findtext("revision/text")
            print title, revision
            elem.clear() # won't need this any more

references:

    http://effbot.org/zone/element-index.htm
    http://effbot.org/zone/celementtree.htm (for best performance)
    http://effbot.org/zone/element-iterparse.htm

</F>
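
for anyone on a current Python, where ElementTree ships in the standard library as xml.etree.ElementTree, the same idea might look roughly like the sketch below. the iter_pages helper and the tiny SAMPLE document are made up for illustration; the sketch assumes un-namespaced "page", "title" and "revision/text" tags as in the question, so a real dump with an XML namespace would need the qualified tag names instead. it also clears the root element after each page rather than only the page itself, so the emptied pages do not pile up under the root while a very large file is being read:

```python
import io
import xml.etree.ElementTree as ET


def iter_pages(source):
    """Yield (title, text) for each <page> element, one page at a time.

    Hypothetical helper, not part of ElementTree itself.
    """
    # Request "start" events as well, so the first event gives us the root.
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)
    for event, elem in context:
        if event == "end" and elem.tag == "page":
            yield elem.findtext("title"), elem.findtext("revision/text")
            # Drop the finished page (and anything else already parsed)
            # from the root, so memory use does not grow with the file.
            root.clear()


# Tiny stand-in document; the layout of the real file is an assumption.
SAMPLE = b"""<mediawiki>
  <page><title>A</title><revision><text>alpha</text></revision></page>
  <page><title>B</title><revision><text>beta</text></revision></page>
</mediawiki>"""

pages = list(iter_pages(io.BytesIO(SAMPLE)))
for title, text in pages:
    print(title, text)
```

to run it on the real file, pass the file name (or an open binary file object) to iter_pages in place of the BytesIO wrapper, and hand each (title, text) pair to whatever indexer you use instead of printing it.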