I decided to use SAX to parse my XML file, but the parser crashes with:

    File "/usr/lib/python2.3/site-packages/_xmlplus/sax/handler.py", line 38, in fatalError
        raise exception
    xml.sax._exceptions.SAXParseException: NCBI_Entrezgene.dtd:8:0: error in processing external entity reference
This is caused by the DOCTYPE declaration:

    <!DOCTYPE Entrezgene-Set PUBLIC "-//NCBI//NCBI Entrezgene/EN" "NCBI_Entrezgene.dtd">

If I remove it, the file parses normally. I've created my parser like this:

    import sys
    from xml.sax import make_parser
    from handler import EntrezGeneHandler

    fopen = open("mouse2.xml", "r")
    ch = EntrezGeneHandler()
    saxparser = make_parser()
    saxparser.setContentHandler(ch)
    saxparser.parse(fopen)

And the handler is:

    from xml.sax import ContentHandler

    class EntrezGeneHandler(ContentHandler):
        """
        A handler to deal with EntrezGene in XML
        """

        def startElement(self, name, attrs):
            print "Start element:", name

So it doesn't do much yet, and still it crashes. How can I tell the parser not to look at the DOCTYPE declaration? A website,

    http://www.devarticles.com/c/a/XML/Parsing-XML-with-SAX-and-Python/1/

states that the SAX parsers are not validating, so this error shouldn't even occur?

Cheers,
Willem
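
For reference, a non-validating parser may still try to read the external DTD named in the DOCTYPE, which would explain the error when NCBI_Entrezgene.dtd cannot be found or processed. One possible way around it, sketched below, is to switch off the standard SAX feature for external general entities before parsing. The names NameCollector and parse_without_external_entities are made up for this illustration and are not part of the original script. The sketch is written for a current Python and assumes the expat-based reader that make_parser() normally returns; the behaviour under the old _xmlplus package shown in the traceback may differ.

```python
import io
from xml.sax import make_parser
from xml.sax.handler import ContentHandler, feature_external_ges


class NameCollector(ContentHandler):
    """Hypothetical handler that records element names instead of printing."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


def parse_without_external_entities(source, handler):
    """Parse `source` (a file name or file-like object) without asking the
    parser to load external general entities such as the DTD file named
    in the DOCTYPE declaration."""
    parser = make_parser()
    # Must be set before parse() is called.
    parser.setFeature(feature_external_ges, False)
    parser.setContentHandler(handler)
    parser.parse(source)


if __name__ == "__main__":
    # Small stand-in document; the DTD file does not need to exist.
    doc = (b'<?xml version="1.0"?>\n'
           b'<!DOCTYPE Entrezgene-Set PUBLIC "-//NCBI//NCBI Entrezgene/EN" '
           b'"NCBI_Entrezgene.dtd">\n'
           b'<Entrezgene-Set><Entrezgene/></Entrezgene-Set>')
    collector = NameCollector()
    parse_without_external_entities(io.BytesIO(doc), collector)
    print(collector.names)
```

In the script above, the equivalent change would be a single setFeature call on saxparser before saxparser.parse(fopen). Note that with the DTD skipped, any entities it defines would not be expanded.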