Michael Goerz wrote:
> Hi,
>
> I am writing unicode strings into a special text file that requires
> non-ASCII characters to be written as octal-escaped UTF-8 codes.
>
> For example, the letter "Í" (Latin capital I with acute, code point 205)
> would come out as "\303\215".
>
> I will also have to read back from the file later on and convert the
> escaped characters back into a unicode string.
>
> Does anyone have any suggestions on how to go from "Í" to "\303\215" and
> vice versa?

Perhaps something along the lines of:

>>> def encode(source):
...     # Escape every UTF-8 byte (ASCII included) as backslash + octal.
...     return "".join("\\%o" % ord(c) for c in source.encode('utf8'))
...
>>> def decode(encoded):
...     # Assumes the whole string consists of escapes, as encode() produces.
...     raw = "".join(chr(int(c, 8)) for c in encoded.split('\\')[1:])
...     return raw.decode('utf8')
...
>>> encode(u"Í")
'\\303\\215'
>>> print decode(_)
Í
>>>

HTH
Michael
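
The session above is Python 2 and escapes every byte, ASCII included, so decode() only handles strings made up entirely of escapes. If the file should keep plain ASCII readable and escape only the non-ASCII bytes, a variant might look like the sketch below. It is written for Python 3, and the names encode_octal and decode_octal are made up for illustration. It also assumes that a literal backslash in the text is left as-is, which would be ambiguous if it happened to be followed by three octal digits; how the file format wants that handled is not something I know.

```python
import re


def encode_octal(text):
    """Escape each non-ASCII UTF-8 byte as a backslash plus three octal digits."""
    return "".join(
        chr(b) if b < 0x80 else "\\%03o" % b  # ASCII bytes pass through unchanged
        for b in text.encode("utf-8")
    )


# A backslash followed by three octal digits in the range 000-377.
_ESCAPE = re.compile(r"\\([0-3][0-7]{2})")


def decode_octal(escaped):
    """Turn backslash-octal escapes back into bytes, then decode them as UTF-8."""
    # Each escape becomes the character whose code point equals the byte value.
    raw = _ESCAPE.sub(lambda m: chr(int(m.group(1), 8)), escaped)
    # latin-1 maps code points 0-255 one-to-one onto bytes.
    return raw.encode("latin-1").decode("utf-8")


if __name__ == "__main__":
    escaped = encode_octal("aÍb")
    print(escaped)
    print(decode_octal(escaped))
```

The regular expression is what lets escaped and unescaped characters sit side by side, which the split('\\') approach cannot do.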