Greg Aumann wrote:

> In reading the elementtree documentation I found the
> ElementTree.TreeBuilder class which it says can be used to create
> parsers for XML-like languages.

a TreeBuilder is a thing that turns a sequence of start(), data(), and
end() method calls into an Element tree structure.

a Parser is a thing that turns a sequence of feed() method calls into a
stream of start(), data(), and end() method calls on a target object.
the standard parsers all automatically use a TreeBuilder instance as
the default target.

unfortunately, the current ET release also uses classes named
XXXTreeBuilder for the actual parsers, which is a bit confusing.

(the reason for this is historical; the separate TreeBuilder class was
factored out from a couple of format-specific XXXTreeBuilder parsers,
but the naming wasn't fully sorted out).

> Essentially I was trying to implement the following advice from Frederik
> Lundh (Wed, Sep 8 2004 12:54 am):
> > by the way, it's trivial to build trees from arbitrary SAX-style sources.
> > just create an instance of the ElementTree.TreeBuilder class, and call
> > the "start", "end", and "data" methods as appropriate.
> >
> >     builder = ElementTree.TreeBuilder()
> >     builder.start("tag", {})
> >     builder.data("text")
> >     builder.end("tag")
> >     elem = builder.close()

that's the intended use of the TreeBuilder class.

> but in another post he wrote (Wed, May 21 2003 2:56 am):
> > usage:
> >
> >     from elementtree import ElementTree, HTMLTreeBuilder
> >
> >     # file is either a filename or an open stream
> >     tree = ElementTree.parse(file, parser=HTMLTreeBuilder.TreeBuilder())
> >     root = tree.getroot()
> >
> > or
> >
> >     from elementtree import HTMLTreeBuilder
> >
> >     parser = HTMLTreeBuilder.TreeBuilder()
> >     parser.feed(data)
> >     root = parser.close()

and this is the confusing naming; here, the HTMLTreeBuilder.TreeBuilder
class is actually doing the parsing (which uses a TreeBuilder instance
on the inside).

> This second one makes me think I should have implemented a parser class
> using Treebuilder.

that's entirely up to you: the only real advantage of having a parser
class is that you can pass it to any other module that uses the Python
consumer interface:

    http://effbot.org/zone/consumer.htm

but if that's not relevant for your application, feel free to use a
TreeBuilder directly.

> Also when I used return builder.close() in the code below it didn't return
> an ElementTree structure but an _ElementInterface.

an Element, in other words (i.e. the thing returned by the Element factory
in this specific implementation). that's the documented behaviour; if you
want an ElementTree wrapper, you have to wrap it yourself.

</F>
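
To make the parser/TreeBuilder split described above concrete, here is a minimal sketch of a parser class that turns feed() calls into start(), data(), and end() calls on a target object. Several things here are assumptions and not part of the original thread: the sketch uses the standard library's xml.etree.ElementTree module (the descendant of the elementtree package) rather than the elementtree package itself, and the KeyValueParser class, its "key=value" line format, and the "config" root tag are hypothetical names chosen only for illustration.

```python
from xml.etree import ElementTree as ET


class KeyValueParser:
    """Hypothetical parser for lines of the form 'key=value'.

    feed() accepts text chunks; close() returns the root Element.
    """

    def __init__(self, target=None):
        # Like the standard parsers, fall back to a TreeBuilder target.
        self._target = target if target is not None else ET.TreeBuilder()
        self._buffer = ""
        # Assumed root tag; everything fed in becomes a child of it.
        self._target.start("config", {})

    def feed(self, data):
        self._buffer += data
        # Handle complete lines only; keep a trailing partial line buffered.
        *lines, self._buffer = self._buffer.split("\n")
        for line in lines:
            self._handle_line(line)

    def _handle_line(self, line):
        key, sep, value = line.partition("=")
        key = key.strip()
        if not sep or not key:
            return  # ignore blank lines and lines without a usable key
        # One element per line: <key>value</key>
        self._target.start(key, {})
        self._target.data(value.strip())
        self._target.end(key)

    def close(self):
        # Flush a final line that did not end with a newline.
        if self._buffer:
            self._handle_line(self._buffer)
            self._buffer = ""
        self._target.end("config")
        return self._target.close()


parser = KeyValueParser()
parser.feed("name=spam\ncol")   # chunks may split a line anywhere
parser.feed("our=pink\n")
root = parser.close()           # an Element, not an ElementTree
tree = ET.ElementTree(root)     # wrap it yourself if you need the wrapper
```

If the feed()/close() interface is not needed, the three target calls in _handle_line can be made on a TreeBuilder directly, which is the simpler option mentioned in the reply.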