Hi, I'm using the TidyHTMLTreeBuilder to generate ElementTree objects from HTML. One by-product is that comments embedded in the HTML are lost. I'm trying to put them back in, but I'm doing something wrong. Here's the code snippet showing how I generate the trees:

    from elementtree import ElementTree as ET
    from elementtidy import TidyHTMLTreeBuilder

    XHTML = "{http://www.w3.org/1999/xhtml}"

    htmfile = os.path.join(self.htmloc,filename)
    fd = open(htmfile)
    tidyTree = TidyHTMLTreeBuilder.TidyHTMLTreeBuilder('utf-8')
    tidyTree.feed(fd.read())
    fd.close()
    try:
        tmp = tidyTree.close()
    except:
        print 'Bad file: %s\nSkipping.' % filename
        continue
    tree = ET.ElementTree(tmp)

And here's the method I use to put the comments back in:

    def addComments(self,tree):
        body = tree.find('./%sbody' % XHTML)
        for elem in body:
            if elem.tag == '%sdiv' % XHTML and elem.get('class'):
                if elem.get('class') == 'remapped':
                    comElem = ET.SubElement(elem,ET.Comment('stopindex'))

    self.addComments(tree)
    filename = os.path.join(self.deliverloc,name)
    self.htmlcontent.write(tree,filename,encoding=self.encoding)

When I try this I get errors from the ElementTree _write method:

    TypeError: cannot concatenate 'str' and 'instance' objects

Thanks for any help!

--Tim Arnold
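
A likely cause, offered as a sketch rather than a confirmed diagnosis: ET.SubElement(parent, tag) expects the tag to be a string, but ET.Comment('stopindex') already returns a complete comment element. Passing it to SubElement creates a child whose tag is an element instance, and the serializer then appears to fail when it tries to build "<" + tag. Appending the comment element directly avoids this. The example below makes some assumptions: it uses the standard-library xml.etree.ElementTree instead of the standalone elementtree package, the function name add_comments is illustrative, and the small hand-built tree is a hypothetical stand-in for what TidyHTMLTreeBuilder would return.

```python
import xml.etree.ElementTree as ET

XHTML = "{http://www.w3.org/1999/xhtml}"


def add_comments(tree):
    """Append a 'stopindex' comment to each <div class="remapped"> directly under <body>."""
    body = tree.find('./%sbody' % XHTML)
    for elem in body:
        if elem.tag == '%sdiv' % XHTML and elem.get('class') == 'remapped':
            # ET.Comment() already returns an element, so append it directly
            # instead of passing it to SubElement() as if it were a tag name.
            elem.append(ET.Comment('stopindex'))


# Hypothetical stand-in for the tree produced by TidyHTMLTreeBuilder.
html = ET.Element('%shtml' % XHTML)
body = ET.SubElement(html, '%sbody' % XHTML)
div = ET.SubElement(body, '%sdiv' % XHTML, {'class': 'remapped'})
div.text = 'some text'
tree = ET.ElementTree(html)

add_comments(tree)

# Serialize to a string to check that the comment survives writing.
serialized = ET.tostring(tree.getroot(), encoding='unicode')
print(serialized)
```

The same one-line change, elem.append(ET.Comment('stopindex')) in place of the ET.SubElement(...) call, should carry over to the original addComments method, since Comment and append are also part of the older elementtree API.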