Hi,

I want to extract text from certain nodes of a huge XML file (1.5 GB). The structure of the file is roughly:

    <parent>
      <page>
        <title>bla</title>
        <id></id>
        <revision>
          <id></id>
          <text>blablabla</text>
        </revision>
      </page>
      <page>
      </page>
      ....
    </parent>

For every single page element I want to combine the text from page:title and page:revision:text. I then want to index these combined texts one by one (so one index for each page).

Which API is the most efficient for this: SAX (I don't think so), DOM, or pulldom? Or should I just use XPath somehow? I don't want to do anything else with this XML file afterwards.

I hope someone can understand what I mean.

Thank you very much,
Jog
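
P.S. To make the task concrete, here is a minimal sketch of one streaming approach, using xml.etree.ElementTree.iterparse from the standard library. It is only an illustration under assumptions, not necessarily the most efficient option: the tag names follow the sample structure above, the file is assumed to have no XML namespace, and iter_page_texts and SAMPLE are hypothetical names made up for this example.

```python
import io
import xml.etree.ElementTree as ET


def iter_page_texts(source):
    """Yield the combined title and revision text for each <page>.

    `source` is a filename or a binary file object. Hypothetical helper:
    the tag names are taken from the sample structure in this post.
    """
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == "page":
            title = elem.findtext("title") or ""
            text = elem.findtext("revision/text") or ""
            yield title + "\n" + text
            # Drop the pages already processed so memory use stays bounded.
            root.clear()


# Tiny stand-in for the real 1.5 GB file (made-up data).
SAMPLE = (
    b"<parent>"
    b"<page><title>bla</title><id>1</id>"
    b"<revision><id>10</id><text>blablabla</text></revision></page>"
    b"<page><title>second</title><id>2</id>"
    b"<revision><id>20</id><text>more text</text></revision></page>"
    b"</parent>"
)

for combined in iter_page_texts(io.BytesIO(SAMPLE)):
    # The real indexing step would go here, one call per page.
    print(repr(combined))
```

For the real file, a filename could be passed in place of the io.BytesIO object. If the file declares a namespace, the tag names would need the "{namespace}page" form instead.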
-- http://mail.python.org/mailman/listinfo/python-list