f = open('words.xml', 'r')
s = f.read()
xml.sax.parseString(s, MyParser())
This produced the following error:
Traceback (most recent call last):
  File "sax5.py", line 87, in ?
    xml.sax.parseString(s, MyParser())
  File "D:\Python\lib\xml\sax\__init__.py", line 49, in parseString
    parser.parse(inpsrc)
  File "D:\Python\lib\xml\sax\expatreader.py", line 107, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "D:\Python\lib\xml\sax\xmlreader.py", line 125, in parse
    self.close()
  File "D:\Python\lib\xml\sax\expatreader.py", line 218, in close
    self._cont_handler.endDocument()
  File "sax5.py", line 81, in endDocument
    f.write(header + self.all + footer)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 745-751: ordinal not in range(128)
The XML declaration should be enough to tell the parser the encoding. Anyway, I read some previous posts and found that the unicode() function might help:
f = open('words.xml', 'r')
s = f.read()
u = unicode(s, "utf-8")
xml.sax.parseString(u, MyParser())
But I just got another error:
Traceback (most recent call last):
  File "sax5.py", line 87, in ?
    xml.sax.parseString(u, MyParser())
  File "D:\Python\lib\xml\sax\__init__.py", line 49, in parseString
    parser.parse(inpsrc)
  File "D:\Python\lib\xml\sax\expatreader.py", line 107, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "D:\Python\lib\xml\sax\xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "D:\Python\lib\xml\sax\expatreader.py", line 211, in feed
    self._err_handler.fatalError(exc)
  File "D:\Python\lib\xml\sax\handler.py", line 38, in fatalError
    raise exception
xml.sax._exceptions.SAXParseException: <unknown>:1:30: encoding specified in XML declaration is incorrect
I see nothing wrong with my XML declaration:
<?xml version="1.0" encoding="utf-8"?>
And the file is indeed in UTF-8 (or I wouldn't be able to open it in IE and FF). I tried removing the BOM, but it didn't help. What else could be wrong?
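
One observation from the first traceback: the exception is raised by f.write(...) inside endDocument, after parsing has already finished, which suggests the parser coped with the UTF-8 input and it is the output file that cannot encode the non-ASCII text. Below is a minimal sketch of that idea, not the actual sax5.py. The handler name CollectingHandler, the sample data and the output file name are all hypothetical, and the code is written for a current Python 3 (on the Python 2 shown in the tracebacks, codecs.open would play the role of io.open). It passes the raw bytes to parseString, as in the first attempt, and opens the output file with an explicit encoding.

```python
import io
import os
import tempfile
import xml.sax


class CollectingHandler(xml.sax.ContentHandler):
    """Hypothetical stand-in for MyParser: gathers text, writes it at the end."""

    def __init__(self, out_path):
        xml.sax.ContentHandler.__init__(self)
        self.out_path = out_path
        self.all = u""

    def characters(self, content):
        # SAX hands over decoded (Unicode) text, whatever the input encoding was.
        self.all += content

    def endDocument(self):
        # Open the output with an explicit encoding so non-ASCII text is
        # written as UTF-8 rather than through a default codec.
        with io.open(self.out_path, "w", encoding="utf-8") as f:
            f.write(self.all)


# Made-up sample document; the parser is given raw UTF-8 bytes,
# not an already-decoded string.
data = u'<?xml version="1.0" encoding="utf-8"?><w>sm\u00f6rg\u00e5s</w>'.encode("utf-8")
out_path = os.path.join(tempfile.mkdtemp(), "out.txt")
xml.sax.parseString(data, CollectingHandler(out_path))

# Read the result back to check that the non-ASCII characters survived.
with io.open(out_path, encoding="utf-8") as f:
    result = f.read()
```

If this reading is right, the unicode() call in the second attempt would not be needed at all, since the declaration in the file already tells the parser how to decode the bytes.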
Gustaf