Abhimanyu Seth wrote:

> Sorry, my mistake. The file was not saved as utf-8. Saving it as utf-8
> solves my problems.
>
> >> f = codecs.open ("c:/test.txt", "r", "utf-8")
> >> dom = minidom.parseString (codecs.encode (f.read(), "utf-8"))
>
> However, I still need to encode the string returned by f.read () before
> passing it to parseString. Otherwise I get an exception.

if the file contains UTF-8 data,

    dom = minidom.parse("c:/test.txt")

should be exactly equivalent to your recoding solution. if it isn't,
post a copy of the sample file.

(if you've double-checked, and are 100% certain that it's not your editor
or your environment that's playing tricks on you, you can also report
this over here: http://sourceforge.net/tracker/?group_id=5470&atid=105470 )

</F>
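
a minimal sketch of the equivalence claimed above, written for a current Python 3 and not taken from the original thread. the helper names (parse_both_ways, demo) and the sample document are made up for illustration, and a temporary file stands in for c:/test.txt. it parses the same UTF-8 file once directly and once via the decode/re-encode route, then compares the serialized results:

```python
import codecs
import os
import tempfile
from xml.dom import minidom


def parse_both_ways(path):
    """Parse a UTF-8 XML file directly and via decode + re-encode."""
    # Route 1: hand the file to the parser and let it decode the bytes.
    dom_direct = minidom.parse(path)

    # Route 2: decode to text first, then re-encode to UTF-8 bytes
    # before passing the data to parseString.
    with codecs.open(path, "r", "utf-8") as f:
        text = f.read()
    dom_recoded = minidom.parseString(text.encode("utf-8"))

    return dom_direct, dom_recoded


def demo():
    # Hypothetical sample document containing a non-ASCII character.
    sample = "<root><name>caf\u00e9</name></root>"
    fd, path = tempfile.mkstemp(suffix=".xml")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(sample.encode("utf-8"))
        direct, recoded = parse_both_ways(path)
        # Both routes should yield the same document.
        return direct.toxml() == recoded.toxml()
    finally:
        os.remove(path)
```

calling demo() should return True when the file really is UTF-8. if the two routes disagree for a given file, that points at the file's actual encoding (or an encoding declaration inside it) and not at the parser.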