On Dec 14, 2008, at 9:21 AM, Daniel Woodhouse wrote:

Is it possible to re-encode a string to a different character set in
Python?  To be more specific, I want to change a text file encoded in
windows-1251 to UTF-8.
I've tried using string.encode, but get the error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xce in position 0:
ordinal not in range(128)

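Without seeing your code, I can't be sure, but I suspect that you first need to decode the file's bytes to Unicode. In Python 2, calling encode() on a plain byte string makes Python implicitly decode it as ASCII first, and that hidden step is what fails on byte 0xce.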

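# Untested, Python 2 --
s = open("in.txt", "rb").read()  # binary mode: read the raw bytes unchanged

s = s.decode("cp1251")  # "windows-1251" also works; "win-1251" is not a known codec name

assert isinstance(s, unicode)

s = s.encode("utf-8")

open("out.txt", "wb").write(s)  # binary mode: write the UTF-8 bytes as-is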
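
For anyone on Python 3, the same decode-then-encode idea might look roughly like the sketch below. The helper names reencode_bytes and reencode_file are made up for illustration, and the cp1251 and utf-8 defaults are just the encodings from the question.

```python
# Python 3 sketch. reencode_bytes and reencode_file are hypothetical
# helper names, not part of any library.

def reencode_bytes(data, source="cp1251", target="utf-8"):
    """Decode raw bytes from `source`, then encode the text as `target`."""
    text = data.decode(source)   # bytes -> str (Unicode text)
    return text.encode(target)   # str -> bytes in the target encoding


def reencode_file(src_path, dst_path, source="cp1251", target="utf-8"):
    """Read src_path as raw bytes and write the re-encoded bytes to dst_path."""
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(reencode_bytes(data, source, target))
```

Something like reencode_file("in.txt", "out.txt") would then do the conversion. This reads the whole file into memory at once, which should be fine for ordinary text files.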


--
http://mail.python.org/mailman/listinfo/python-list