Tim Arnold wrote:
> Hi, I'm using elementtree and elementtidy to work with some HTML files. For
> some of these files I need to enclose the body content in a new div tag,
> like this:
> <body>
>   <div class="remapped">
>     original contents...
>   </div>
> </body>
>
> I figure there must be a way to do it by creating a 'div' SubElement to the
> 'body' tag and somehow copying the rest of the tree under that SubElement,
> but it's beyond my comprehension.
>
> How can I accomplish this?
> (I know I could put the class on the body tag itself, but that won't satisfy
> the powers-that-be).
>
> thanks,
> --Tim Arnold

You could also try something like this:

    from sgmllib import SGMLParser  # Python 2 only; removed in Python 3


    class IParse(SGMLParser):
        """Copy the input through, wrapping the body content in a div."""

        def __init__(self, verbose=0):
            SGMLParser.__init__(self, verbose)
            self.data = ""

        def _start_tag(self, tag, attrs):
            # attrs is a list of (name, value) pairs
            parts = [tag] + ['%s="%s"' % a for a in attrs]
            return "<%s>" % " ".join(parts)

        def start_body(self, attrs):
            self.data += self._start_tag("body", attrs)
            self.data += '<div class="remapped">'  # begin remapping

        def end_body(self):
            self.data += "</div>"  # end remapping
            self.data += "</body>"

        def handle_data(self, data):
            self.data += data

        def unknown_starttag(self, tag, attrs):
            self.data += self._start_tag(tag, attrs)

        def unknown_endtag(self, tag):
            self.data += "</%s>" % tag


    if __name__ == "__main__":
        i = IParse()
        i.feed('''
    <html>
    <body bgcolor="#fffff">
    original <i>italic</i>
    <b class="test">contents</b>...
    </body>
    </html>''')
        i.close()  # flush any buffered input before reading the result
        print i.data

Just look at the code of sgmllib (in the standard library); it makes writing
a parser like this very easy. The code above could still use some much-needed
refactoring.
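
For comparison, the ElementTree route Tim describes (create a 'div' and move
the body's children under it) might look roughly like the sketch below. It is
only a sketch under assumptions: wrap_body is a made-up helper name, the tree
is assumed to be well formed with 'body' as a direct child of the root, and
the tag names are assumed to carry no namespace prefix (if your tree uses
XHTML-namespaced tags, the find() call would need the qualified name).

```python
import xml.etree.ElementTree as ET


def wrap_body(root, css_class="remapped"):
    """Move everything inside <body> into a new <div class=...>.

    Hypothetical helper: assumes <body> is a direct, un-namespaced
    child of root. Returns root unchanged if no <body> is found.
    """
    body = root.find("body")
    if body is None:
        return root

    div = ET.Element("div", {"class": css_class})

    # Leading text of <body> becomes the leading text of the <div>.
    div.text = body.text
    body.text = None

    # Iterate over a copy, since we modify body while looping.
    # Each child keeps its own .tail, so inter-element text moves with it.
    for child in list(body):
        body.remove(child)
        div.append(child)

    body.append(div)
    return root


if __name__ == "__main__":
    root = ET.fromstring(
        '<html><body bgcolor="#ffffff">original <i>italic</i> '
        '<b class="test">contents</b>...</body></html>'
    )
    wrap_body(root)
    print(ET.tostring(root).decode("ascii"))
```

The attributes on 'body' are left alone; only its text and children are
moved, so nothing is copied and the original element objects are reused.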