Grzegorz Adam Hankiewicz wrote:

> I'm looking for two specific features in XML libraries. One is to be
> able to tell on which source file line a tag starts and ends. Say, tag
> <para> is located on line 34 column 7, and the matching </para> three
> lines later on column 56.
>
> Another feature is to be able to save the processed XML code in a way
> that unmodified tags preserve the original indentation. Or in the worst
> case, all indentation is lost, but I can control to some degree the
> look of the final XML output.
>
> I have looked at xml.minidom, elementtree and gnosis and haven't found
> any such features. Are there libs providing these?

here's a custom parser that adds a "lineno" attribute to element nodes:

    from elementtree import XMLTreeBuilder

    class MyParser(XMLTreeBuilder.FancyTreeBuilder):
        def start(self, elem):
            elem.lineno = self.lineno

    def parse(file):
        # feed one line at a time, and keep track of the line number
        lineno = 1
        parser = MyParser()
        for line in open(file).readlines():
            parser.lineno = lineno
            parser.feed(line)
            lineno = lineno + 1
        return parser.close()

    for elem in parse("samples/simple.xml").getiterator():
        print elem.tag, elem.lineno

(the FancyTreeBuilder is somewhat broken in 1.2.1 through 1.2.3, at
least if you're using Python 2.3 or later.  or in other words, use
ElementTree 1.2 or 1.2.4 if you want this to work).

the standard elementtree writer may modify the tags, but it preserves
all whitespace around them; depending on what you mean by "indentation",
that may or may not be what you want.

(but if you want to preserve all whitespace in an XML document, you
shouldn't run it through an XML parser...)

</F>
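
for readers on a current Python 3, here is a minimal sketch of the same idea (recording where each tag starts and ends) using the standard library's xml.sax and its document locator instead of the old elementtree package. this is not the approach from the reply above, just one possible alternative; the names PositionHandler and element_positions are made up for illustration, and the column values are whatever the underlying parser reports (with the default expat-based reader they appear to be zero-based).

```python
import xml.sax


class PositionHandler(xml.sax.ContentHandler):
    """Collects (event, tag, line, column) tuples for start and end tags."""

    def __init__(self):
        super().__init__()
        self.locator = None
        self.events = []

    def setDocumentLocator(self, locator):
        # The parser hands us a locator that tracks the current position.
        self.locator = locator

    def _record(self, kind, name):
        self.events.append(
            (kind, name,
             self.locator.getLineNumber(),
             self.locator.getColumnNumber())
        )

    def startElement(self, name, attrs):
        self._record("start", name)

    def endElement(self, name):
        self._record("end", name)


def element_positions(xml_bytes):
    """Parse XML given as bytes and return the recorded tag positions."""
    handler = PositionHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.events


if __name__ == "__main__":
    sample = b"<root>\n  <para>text\n  </para>\n</root>"
    for kind, name, line, column in element_positions(sample):
        print(kind, name, line, column)
```

unlike the line-at-a-time feeding trick, this reports a position for the closing tag as well, which covers the "matching </para> three lines later" part of the question.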