Edward K. Ream wrote:
>>> Can anyone tell me how the content handler can determine the encoding of
>>> the file? Can sax provide this info?
>
>> there is no encoding on the "inside" of an XML document; it's all
>> Unicode.
>
> True, but sax is reading the file, so sax is producing the unicode, so it
> should (must) be able to determine the encoding.

It is, by reading the xml header.

> Furthermore, xml files
> start with lines like:
>
> <?xml version="1.0" encoding="utf-8"?>
>
> so it would seem reasonable for sax to be able to return 'utf-8' somehow.
> Am I missing something?

Only that sax outputs unicode, which no longer has any encoding associated
with it. The original encoding is therefore pretty much irrelevant
information. It _could_ be retained, but for what purpose?

Diez
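
P.S. A minimal sketch of the point above, assuming Python 3 and the standard-library expat-based parser. The names TextCollector, parse_text and declared_encoding are made up for illustration. The same document stored in two different encodings reaches the content handler as identical unicode text. If the declared encoding really is needed, one possible route (an assumption on my part, not something xml.sax itself hands to the handler) is to drop down to xml.parsers.expat and set its XmlDeclHandler:

```python
import xml.parsers.expat
import xml.sax


class TextCollector(xml.sax.ContentHandler):
    """Hypothetical handler that collects character data."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        # content is already decoded unicode text (str), whatever the
        # encoding of the bytes on disk was.
        self.chunks.append(content)


def parse_text(xml_bytes):
    """Return all character data in the document as one str."""
    handler = TextCollector()
    xml.sax.parseString(xml_bytes, handler)
    return "".join(handler.chunks)


def declared_encoding(xml_bytes):
    """Return the encoding named in the XML declaration, or None.

    Uses expat directly, since the SAX content handler is not told
    the encoding.
    """
    found = {}

    def on_xml_decl(version, encoding, standalone):
        found["encoding"] = encoding

    parser = xml.parsers.expat.ParserCreate()
    parser.XmlDeclHandler = on_xml_decl
    parser.Parse(xml_bytes, True)
    return found.get("encoding")


# The same document, stored in two different encodings.
utf8_doc = (
    '<?xml version="1.0" encoding="utf-8"?><a>caf\u00e9</a>'
).encode("utf-8")
latin1_doc = (
    '<?xml version="1.0" encoding="iso-8859-1"?><a>caf\u00e9</a>'
).encode("iso-8859-1")

same_text = parse_text(utf8_doc) == parse_text(latin1_doc)
```

Here same_text comes out True even though the two byte strings differ, which is why the handler has no real use for the encoding. declared_encoding() only reports what the declaration says; a document without a declaration yields None.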