Hi,

Mike Driscoll wrote:
> I got lxml to create a tree by doing the following:
>
> from lxml import etree
> from StringIO import StringIO
>
> parser = etree.HTMLParser()
> tree = etree.parse(filename, parser)
> xml_string = etree.tostring(tree)
> context = etree.iterparse(StringIO(xml_string))

No idea why you need the two steps here. lxml 2.0 supports parsing HTML in
iterparse() directly when you pass the boolean "html" keyword.

> However, when I iterate over the contents of "context", I can't figure
> out how to nab the row's contents:
>
> for action, elem in context:
>     if action == 'end' and elem.tag == 'relationship':
>         # do something...but what!?
>         # this if statement probably isn't even right

I would really encourage you to use the normal parser here instead of
iterparse().

    from lxml import etree
    parser = etree.HTMLParser()

    # parse the HTML/XML melange
    tree = etree.parse(filename, parser)

    # if you want, you can construct a pure XML document
    # (findall() returns a list, so moving rows while looping is safe;
    # the HTML parser may lowercase tag names, in which case use ".//row")
    row_root = etree.Element("newroot")
    for row in tree.findall(".//Row"):
        row_root.append(row)

In your specific case, I'd encourage using lxml.objectify:

http://codespeak.net/lxml/dev/objectify.html

It will allow you to do this (untested):

    from lxml import etree, objectify

    parser = etree.HTMLParser()
    lookup = objectify.ObjectifyElementClassLookup()
    parser.set_element_class_lookup(lookup)

    tree = etree.parse(filename, parser)

    for row in tree.iterfind(".//Row"):
        print row.relationship, row.StartDate, row.Priority * 2.7

Stefan
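
P.S. If you do want to stay with iterparse(), the usual pattern is to wait
for the 'end' event of the enclosing element (the row, not one of its
children) and read the children at that point. Below is a minimal sketch of
that pattern. To keep it runnable without lxml it uses the standard
library's xml.etree.ElementTree, whose iterparse() yields the same
(action, elem) pairs. The sample document and the collect_rows() helper are
made up for illustration; only the element names come from your mail.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data: element names are taken from the thread,
# the values are invented for illustration.
SAMPLE = b"""<root>
  <Row><relationship>Owner</relationship><StartDate>2008-01-01</StartDate><Priority>2</Priority></Row>
  <Row><relationship>Renter</relationship><StartDate>2008-02-15</StartDate><Priority>5</Priority></Row>
</root>"""


def collect_rows(source):
    """Return one dict per <Row>, mapping child tag to child text."""
    rows = []
    for action, elem in ET.iterparse(source, events=("end",)):
        # The 'end' event fires once the element and all of its children
        # have been parsed, so the row's contents are complete here.
        if elem.tag == "Row":
            rows.append(dict((child.tag, child.text) for child in elem))
            elem.clear()  # drop the children we no longer need
    return rows


rows = collect_rows(io.BytesIO(SAMPLE))
```

Each entry in rows is then a plain dict, e.g. rows[0]["relationship"], and
the values are strings, so convert Priority yourself before doing
arithmetic on it.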