Hi Stefan,

thank you for your advice. I am running into an issue, though (likely a newbie problem):
My XML file looks like this (I got it from the internet):

    <?xml version="1.0"?>
    <catalog>
       <book id="bk101">
          <author>Gambardella, Matthew</author>
          <title>XML Developer's Guide</title>
          <genre>Computer</genre>
          <price>44.95</price>
          <publish_date>2000-10-01</publish_date>
          <description>An in-depth look at creating applications with XML.</description>
       </book>
    </catalog>

Then I try to access it as:

    from lxml import etree
    from lxml import objectify

    inputfile="../input/books.xml"
    parser = etree.XMLParser(ns_clean=True)
    parser = etree.XMLParser(remove_comments=True)
    root = objectify.parse(inputfile,parser)

I try to access elements by (for example):

    print root.catalog.book.author.text

But it errors out with:

    AttributeError: 'lxml.etree._ElementTree' object has no attribute 'catalog'

So I guess I don't understand why this is. Also, how can I print all the attributes of the root, for example, or obtain its keys?

I really appreciate your help.

thanks
matt

On 2/21/2011 10:43 AM, Stefan Behnel wrote:
> Matt Funk, 21.02.2011 18:30:
>> I want to create a set of xml input files to my code that look as
>> follows:
>> <?xml version="1.0" encoding="UTF-8"?>
>>
>> <!-- Settings for the algorithm to be performed -->
>> <Algorithm>
>>
>>     <!-- The algorithm type. -->
>>     <!-- The supported options are: -->
>>     <!-- - Alg0 -->
>>     <!-- - Alg1 -->
>>     <Type>Alg1</Type>
>>
>>     <!-- the location/path of the input file for this algorithm -->
>>     <path>./Alg1.in</path>
>>
>> </Algorithm>
>>
>>
>> <!-- Relevant information during the processing will be written to a logfile -->
>> <Logfile>
>>
>>     <!-- the location/path of the logfile (i.e. where to put the logfile) -->
>>     <path>c:\tmp</path>
>>
>>     <!-- verbosity level (i.e. how much to print) -->
>>     <!-- The supported options are: -->
>>     <!-- - 0 (nothing printed) -->
>>     <!-- - 1 (print on error) -->
>>     <verbosity>1</verbosity>
>>
>> </Logfile>
>
> That's not XML. XML documents have exactly one root element, i.e. you
> need an enclosing element around these two tags.
>
>
>> So there are comments, whitespace etc ... in it.
>> I would like to be able to put everything into some sort of structure
>
> Including the comments or without them? Note that ElementTree will
> ignore comments.
>
>
>> such that i can access it as:
>> structure['Algorithm']['Type'] == Alg1
>
> Have a look at lxml.objectify. It allows you to write
>
>     alg_type = root.Algorithm.Type.text
>
> and a couple of other niceties.
>
> http://lxml.de/objectify.html
>
>
>> I was wondering if there is something out there that does this.
>> I found and tried a few things:
>> 1)
>> http://code.activestate.com/recipes/534109-xml-to-python-data-structure/
>> It simply doesn't work. I get the following error:
>> raise exception
>> xml.sax._exceptions.SAXParseException:<unknown>:1:2: not well-formed
>> (invalid token)
>
> "not well formed" == "not XML".
>
>
>> But i removed everything from the file except:<?xml version="1.0"
>> encoding="UTF-8"?>
>> and i still got the error.
>
> That's not XML, either.
>
>
>> Anyway, i looked at ElementTree, but that error out with:
>> xml.parsers.expat.ExpatError: junk after document element: line 19,
>> column 0
>
> In any case, ElementTree is preferable over a SAX based solution, both
> for performance and maintainability reasons.
>
> Stefan
> --
> http://mail.python.org/mailman/listinfo/python-list
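
For reference, a minimal sketch of what probably explains the AttributeError above. It makes a few assumptions: lxml is installed, Python 3 syntax is used, and the sample document is inlined through io.BytesIO (instead of reading ../input/books.xml) so the snippet runs on its own. objectify.parse() returns an ElementTree wrapper, so getroot() is needed to reach the root element. That root object already is <catalog>, which means its children are reached as root.book rather than root.catalog.book. The sketch also swaps the two etree.XMLParser(...) assignments (the second silently replaced the first) for a single objectify.makeparser() call, which accepts the same keyword options and keeps objectify's attribute-style access.

```python
import io

from lxml import objectify

# Same sample document as in the message, inlined so the sketch is self-contained.
BOOKS_XML = b"""<?xml version="1.0"?>
<catalog>
   <book id="bk101">
      <author>Gambardella, Matthew</author>
      <title>XML Developer's Guide</title>
      <genre>Computer</genre>
      <price>44.95</price>
      <publish_date>2000-10-01</publish_date>
      <description>An in-depth look at creating applications with XML.</description>
   </book>
</catalog>
"""

# One parser carrying both options. makeparser() takes the keyword
# arguments of etree.XMLParser() and installs objectify's element classes.
parser = objectify.makeparser(ns_clean=True, remove_comments=True)

# parse() returns an ElementTree wrapper, not an element ...
tree = objectify.parse(io.BytesIO(BOOKS_XML), parser)
# ... so fetch the root element explicitly; this object is <catalog> itself.
root = tree.getroot()

# Children of the root are attributes of the root object.
author = root.book.author.text

# XML attributes: .attrib behaves like a dict, .keys() lists attribute names.
root_attribute_names = root.keys()        # <catalog> carries no attributes
book_attributes = dict(root.book.attrib)  # attributes of <book>

# Child element names; iterchildren() is used because plain iteration over an
# objectify element walks same-named siblings rather than its children.
child_tags = [child.tag for child in root.book.iterchildren()]

print(root.tag)
print(author)
print(root_attribute_names)
print(book_attributes)
print(child_tags)
```

With a file on disk, the io.BytesIO(...) argument would simply be replaced by the file path, as in the original call.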