On Aug 24, 7:29 pm, Dave Angel <da...@ieee.org> wrote:
> Stefan Behnel wrote:
> > Hi,
> >
> > elsa wrote:
> >> I know how to turn HTML into an ElementTree object
> >
> > I don't. ;)
> >
> > ElementTree doesn't have an HTML parser, so what do you use for parsing?
> >
> >> but I don't know
> >> how to then view the structure of this object. Is there a method or
> >> module that you can give an ElementTree object to, and it returns some
> >> kind of graphical or printed representation of the tree? Otherwise, if
> >> you can't see your tree's structure, how do you know what is a
> >> sensible way of iterating over the tree to access the info you need?
> >
> > ElementTree has a tostring() method that returns a string. To get a pretty
> > printed representation, you can use the indent() function from this recipe:
> >
> > http://effbot.org/zone/element-lib.htm#prettyprint
> >
> > Stefan
>
> Perhaps the OP was referring to XHTML, which should be eligible for
> ElementTree. But could you tell me whether ElementTree is at all
> tolerant of malformed XML? Most HTML and XHTML I encounter in the wild
> is so buggy it's amazing it all works at all.
>
> DaveA
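
On viewing the structure: here is a rough sketch of one way to print an outline of a tree, next to the plain tostring() output. It is not the indent() recipe from the link above. The outline() helper and the XHTML sample string are made up for illustration, and the sample is assumed to be well-formed.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample document (not from the original thread).
XHTML = (
    "<html><head><title>Demo</title></head>"
    "<body><p class='x'>Hi <b>there</b></p></body></html>"
)


def outline(elem, depth=0):
    """Return one line per element; indentation shows nesting depth."""
    attrs = "".join(f' {k}="{v}"' for k, v in elem.attrib.items())
    line = "  " * depth + f"<{elem.tag}{attrs}>"
    text = (elem.text or "").strip()
    if text:
        # Show the element's leading text so content is easy to locate.
        line += " " + repr(text)
    lines = [line]
    for child in elem:
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(XHTML)
print("\n".join(outline(root)))

# tostring() is a module-level function; it serialises the tree as-is,
# without adding any indentation.
print(ET.tostring(root, encoding="unicode"))
```

Once the nesting is visible, it is easier to decide whether to loop over direct children or to search for a tag at any depth with root.iter("p").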
I used elementtidy, also available from effbot.
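
As for tolerance: as far as I know, the standard ElementTree parser is strict and rejects anything that is not well-formed XML, which is why a tidy step is needed first. A small sketch to see this for yourself (try_parse() and both sample strings are made up for illustration):

```python
import xml.etree.ElementTree as ET


def try_parse(markup):
    """Return the root element, or None if the markup is not well-formed XML."""
    try:
        return ET.fromstring(markup)
    except ET.ParseError as err:
        # The error message includes the line and column of the problem.
        print("rejected:", err)
        return None


well_formed = "<p>one<br/>two</p>"
tag_soup = "<p>one<br>two</p>"  # unclosed <br>, typical of HTML in the wild

print(try_parse(well_formed) is not None)
print(try_parse(tag_soup) is not None)
```

The second string fails on the unclosed <br>, so markup like that has to be cleaned up before ElementTree will accept it.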