I have a program which produces well-formed XML documents, but takes several hours if not days to do so. It would be useful to be able to take the incomplete output and manipulate it as XML.
Clearly, however, the incomplete output will not be well-formed, so before I can manipulate it I need to make it well-formed. This is essentially fairly simple: I know the output is well-formed up to the point it has reached, so all I need to do is close the unclosed tags. So I've written a short function that does this:

    #!/usr/bin/python
    import sys
    import xml.sax

    tagStack = []      # elements opened but not yet closed
    closingTags = ""   # closing tags needed to complete the document

    class DodgyHandler(xml.sax.ContentHandler):
        def startElement(self, tag, attributes):
            tagStack.append(tag)

        def endElement(self, tag):
            tagStack.pop()

    class DodgyErrorHandler(xml.sax.ErrorHandler):
        def fatalError(self, exception):
            # Reached when the parser hits the truncated end of the input:
            # close every element that is still open, innermost first.
            global closingTags
            tagStack.reverse()
            for tag in tagStack:
                closingTags += "</%s>" % tag
            return closingTags

    def finishXML(text):
        # text is a list of lines; the closing tags are appended to it.
        p = xml.sax.make_parser()
        p.setContentHandler(DodgyHandler())
        p.setErrorHandler(DodgyErrorHandler())
        for line in text:
            p.feed(line)
        p.close()
        text.append(closingTags)

However, while this works for 90% of the cases I need, it fails when my incomplete output stops in the middle of a tag (not to mention some other, more arcane places I don't really care about).

The problem is that when the SAX parser raises an exception, I can't see how to find out why. What I want is for DodgyErrorHandler to do something different depending on where we are in the course of parsing. Is there any way to get that information back from xml.sax (or indeed from any other SAX parser)?

Toby
-- 
Dr. Toby White
Dept. of Earth Sciences, Downing Street, Cambridge CB2 3EQ. UK
Email: <[EMAIL PROTECTED]>
-- 
http://mail.python.org/mailman/listinfo/python-list
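
P.S. To make the failing case concrete, here is a minimal, self-contained sketch of the same tag-stack idea without the globals. The names StackHandler, QuietErrorHandler, probe, naive_repair and is_well_formed are made up for this illustration and are not part of xml.sax, and the sample fragments are invented. It only shows that the stack looks identical whether the output stops between tags or in the middle of one, so appending closing tags repairs the first case but not the second.

```python
import xml.sax


class StackHandler(xml.sax.ContentHandler):
    """Hypothetical helper: track the elements that are currently open."""

    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.stack = []

    def startElement(self, name, attrs):
        self.stack.append(name)

    def endElement(self, name):
        self.stack.pop()


class QuietErrorHandler(xml.sax.ErrorHandler):
    """Hypothetical helper: remember the first fatal error, do not raise."""

    def __init__(self):
        self.error = None

    def fatalError(self, exception):
        if self.error is None:
            self.error = exception


def probe(fragment):
    """Parse a truncated fragment; return (open tags, first fatal error)."""
    handler = StackHandler()
    errors = QuietErrorHandler()
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    parser.setErrorHandler(errors)
    parser.feed(fragment)
    parser.close()  # the truncation is only reported here
    return list(handler.stack), errors.error


def naive_repair(fragment):
    """Append a closing tag for every open element, innermost first."""
    open_tags, _ = probe(fragment)
    return fragment + "".join("</%s>" % tag for tag in reversed(open_tags))


def is_well_formed(text):
    """Return True if a fresh parser accepts text without a fatal error."""
    try:
        xml.sax.parseString(text.encode("utf-8"), xml.sax.ContentHandler())
        return True
    except xml.sax.SAXParseException:
        return False


between_tags = "<a><b>text</b>"    # stops cleanly after a closing tag
inside_tag = "<a><b>text</b><c"    # stops in the middle of a start tag

for fragment in (between_tags, inside_tag):
    stack, error = probe(fragment)
    print(fragment, stack, is_well_formed(naive_repair(fragment)))
```

In both cases the stack holds only the element a, because the half-written start tag never reaches startElement. That is why the handler needs to know more than the stack to decide what to do with the trailing partial tag.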