I am trying to write some Python code for a library that reads an XML-like language from a file into ElementTree data structures. I then want to be able to read and/or modify the structure and write it out again, either as XML or in the original format. I really want the API for the XML-like language to be the same as the ElementTree API, to reduce confusion and make it easier to learn.

In reading the ElementTree documentation I found the ElementTree.TreeBuilder class, which it says can be used to create parsers for XML-like languages. So I wrote the code below. The code works, but I am not sure that this is really the intended way to use the ElementTree.TreeBuilder class. Essentially I was trying to implement the following advice from Fredrik Lundh (Wed, Sep 8 2004 12:54 am):

> by the way, it's trivial to build trees from arbitrary SAX-style sources.
> just create an instance of the ElementTree.TreeBuilder class, and call
> the "start", "end", and "data" methods as appropriate.
>
>     builder = ElementTree.TreeBuilder()
>     builder.start("tag", {})
>     builder.data("text")
>     builder.end("tag")
>     elem = builder.close()

but in another post he wrote (Wed, May 21 2003 2:56 am):

> usage:
>
>     from elementtree import ElementTree, HTMLTreeBuilder
>
>     # file is either a filename or an open stream
>     tree = ElementTree.parse(file, parser=HTMLTreeBuilder.TreeBuilder())
>     root = tree.getroot()
>
> or
>
>     from elementtree import HTMLTreeBuilder
>
>     parser = HTMLTreeBuilder.TreeBuilder()
>     parser.feed(data)
>     root = parser.close()

This second one makes me think I should have implemented a parser class using TreeBuilder. Also, when I used "return builder.close()" in the code below, it didn't return an ElementTree structure but an _ElementInterface (the root element), which is why the code wraps the result in ElementTree.ElementTree().

So my question is really about how I should structure the code so that using this XML-like format is as similar as possible to using XML itself in ElementTree.
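
A minimal sketch of the return-type point above, assuming the standard-library xml.etree.ElementTree module (Python 3) rather than the standalone elementtree package used in this post. In that module, as far as I can tell, close() on a TreeBuilder hands back the root element, and the tree-level API only appears once the element is wrapped explicitly:

```python
import xml.etree.ElementTree as ET

# Drive the builder by hand, as in the first quoted snippet.
builder = ET.TreeBuilder()
builder.start("tag", {})
builder.data("text")
builder.end("tag")
root = builder.close()        # the root element, not an ElementTree

# Wrap the root to get getroot(), write(), and so on.
tree = ET.ElementTree(root)
xml_text = ET.tostring(tree.getroot(), encoding="unicode")
```

Here xml_text holds the serialized form of the element that was built by hand.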

from elementtree import ElementTree
from nltk_lite.corpora.shoebox import ShoeboxFile

class Settings(ShoeboxFile):

    def __init__(self):
        super(Settings, self).__init__()

    def parse(self, encoding=None):
        builder = ElementTree.TreeBuilder()
        for mkr, value in self.fields(encoding, unwrap=False):
            # A leading "+" opens a block and a leading "-" closes it;
            # any other marker is a simple field.
            block = mkr[0]
            if block in ("+", "-"):
                mkr = mkr[1:]
            else:
                block = None
            if block == "+":
                builder.start(mkr, {})
                builder.data(value)
            elif block == "-":
                builder.end(mkr)
            else:
                builder.start(mkr, {})
                builder.data(value)
                builder.end(mkr)
        # builder.close() returns the root element, so wrap it in a tree
        return ElementTree.ElementTree(builder.close())
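
The parser-class alternative suggested by the second quoted post could be sketched roughly as follows. This is only an illustration under several assumptions: it uses the standard-library xml.etree.ElementTree (Python 3) instead of the standalone elementtree package, the class name MarkerParser and its line format ("\+block", "\field value", "\-block") are hypothetical simplifications rather than the real Shoebox reader from nltk_lite, and it expects text (str) input. The idea is that any object with feed() and close() methods can be passed as the parser argument of parse().

```python
import io
import xml.etree.ElementTree as ET


class MarkerParser:
    """Hypothetical parser for lines like '\\+block', '\\field value', '\\-block'."""

    def __init__(self):
        self._builder = ET.TreeBuilder()
        self._pending = ""   # trailing partial line from the previous chunk

    def feed(self, data):
        # parse() may deliver the input in several chunks, so keep any
        # incomplete last line until the next call (or until close()).
        lines = (self._pending + data).split("\n")
        self._pending = lines.pop()
        for line in lines:
            self._handle(line)

    def close(self):
        if self._pending:
            self._handle(self._pending)
            self._pending = ""
        return self._builder.close()   # the root element

    def _handle(self, line):
        line = line.strip()
        if not line.startswith("\\"):
            return                     # skip blank or unmarked lines
        marker, _, value = line[1:].partition(" ")
        if marker.startswith("+"):     # open a block
            self._builder.start(marker[1:], {})
            self._builder.data(value)
        elif marker.startswith("-"):   # close a block
            self._builder.end(marker[1:])
        else:                          # simple field
            self._builder.start(marker, {})
            self._builder.data(value)
            self._builder.end(marker)


sample = "\\+settings\n\\name demo\n\\ver 1.0\n\\-settings\n"
tree = ET.parse(io.StringIO(sample), parser=MarkerParser())
root = tree.getroot()
```

With this shape, calling code looks the same as for XML: parse() returns an ElementTree, and getroot(), find() and write() work on it as usual.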