On Dec 14, 2008, at 9:21 AM, Daniel Woodhouse wrote:
> Is it possible to re-encode a string to a different character set in
> Python? To be more specific, I want to change a text file encoded in
> windows-1251 to UTF-8.
> I've tried using string.encode, but get the error:
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xce in position 0: ordinal not in range(128)

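Without seeing your code, I can't be sure, but I suspect that first
you need to decode the file to Unicode. In Python 2, calling encode()
on a byte string makes Python decode it implicitly with the ASCII
codec first, and that implicit step is what fails on byte 0xce.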
# Untested (Python 2)
s = open("in.txt", "rb").read()  # raw bytes, still windows-1251
s = s.decode("cp1251")  # "windows-1251" also works; "win-1251" is not a known codec name
assert isinstance(s, unicode)
s = s.encode("utf-8")
open("out.txt", "wb").write(s)
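
If you would rather let the library do the decoding and encoding for you, here is a rough sketch of the same decode-then-encode idea using io.open, which exists in Python 2.6+ and in Python 3. The function name reencode_file and its parameters are made up for illustration; they are not part of any library. It also assumes the whole file fits in memory.

```python
import io


def reencode_file(src_path, dst_path, src_encoding="cp1251", dst_encoding="utf-8"):
    """Hypothetical helper: rewrite a text file in a different encoding."""
    # Reading in text mode decodes the bytes to Unicode text.
    # newline="" leaves the original line endings untouched.
    with io.open(src_path, "r", encoding=src_encoding, newline="") as src:
        text = src.read()
    # Writing in text mode encodes the Unicode text to the target encoding.
    with io.open(dst_path, "w", encoding=dst_encoding, newline="") as dst:
        dst.write(text)
```

Calling reencode_file("in.txt", "out.txt") should then do the windows-1251 to UTF-8 conversion in one step. If the input is not really windows-1251, the result may come out as silently wrong characters or the call may raise a UnicodeDecodeError, so it is worth checking the output.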