On Feb 19, 2:28 pm, Dietrich Bollmann <dir...@web.de> wrote:
> Are there any functions in python to convert between different Japanese
> coding systems?
>
> I would like to convert between (at least) ISO-2022-JP, UTF-8, EUC-JP
> and SJIS. I also need some function to encode / decode base64 encoded
> strings.
>
> Example:
>
> email = '''[...]
> Subject:
> =?UTF-8?Q?romaji=E3=81=B2=E3=82=89=E3=81=8C=E3=81=AA=E3=82=AB=E3=82=BF?=
> =?UTF-8?Q?=E3=82=AB=E3=83=8A=E6=BC=A2=E5=AD=97?=
> [...]
> Content-Type: text/plain; charset=EUC-JP
> [...]
> Content-Transfer-Encoding: base64
> [...]
>
> cm9tYWpppNKk6aSspMqlq6W/paulyrTBu/oNCg0K
>
> '''
>
> from = contentType
> to = 'utf-8'
> contentUtf8 = convertCodingSystem(decodeBase64(content), from, to)
>
> The only problem is that I could not find any standard functionality to
> convert between different Japanese coding systems.
>
> Thanks,
>
> Dietrich Bollmann

import base64

ENCODINGS = ['ISO-2022-JP', 'UTF-8', 'EUC-JP', 'SJIS']

def decodeBase64(content):
    return base64.decodestring(content)

def convertCodingSystem(s, _from, _to):
    unicode = s.decode(_from)
    return unicode.encode(_to)

if __name__ == '__main__':
    content = 'cm9tYWpppNKk6aSspMqlq6W/paulyrTBu/oNCg0K'
    _from = 'EUC-JP'
    for _to in ENCODINGS:
        x = convertCodingSystem(decodeBase64(content), _from, _to)
        print _to, repr(x)

--
http://mail.python.org/mailman/listinfo/python-list
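
For anyone reading this on Python 3: the script above is Python 2 (print statement, str.decode, and base64.decodestring, which was removed in Python 3.9). Below is a rough sketch of the same idea for Python 3, assuming the body has already been pulled out of the message as a base64 string. It also decodes the RFC 2047 Subject line from the question with email.header.decode_header. The helper names convert_coding_system and decode_subject are just illustrative names, not anything from the standard library.

```python
import base64
from email.header import decode_header

# Target encodings named in the question. Python codec lookups are
# case-insensitive, and 'SJIS' is an alias for 'shift_jis'.
ENCODINGS = ['ISO-2022-JP', 'UTF-8', 'EUC-JP', 'SJIS']


def convert_coding_system(data, src, dst):
    """Decode bytes from the src encoding and re-encode them as dst."""
    return data.decode(src).encode(dst)


def decode_subject(raw):
    """Join the RFC 2047 encoded words of a header value into one str."""
    parts = []
    for chunk, charset in decode_header(raw):
        # Encoded words come back as bytes plus a charset name;
        # a header with no encoded words comes back as a plain str.
        if isinstance(chunk, bytes):
            chunk = chunk.decode(charset or 'ascii')
        parts.append(chunk)
    return ''.join(parts)


if __name__ == '__main__':
    # Body from the question: base64-encoded EUC-JP text.
    body = base64.b64decode('cm9tYWpppNKk6aSspMqlq6W/paulyrTBu/oNCg0K')
    for dst in ENCODINGS:
        print(dst, repr(convert_coding_system(body, 'EUC-JP', dst)))

    # Subject from the question: two UTF-8 quoted-printable encoded words.
    subject = (
        '=?UTF-8?Q?romaji=E3=81=B2=E3=82=89=E3=81=8C=E3=81=AA=E3=82=AB=E3=82=BF?= '
        '=?UTF-8?Q?=E3=82=AB=E3=83=8A=E6=BC=A2=E5=AD=97?='
    )
    print(decode_subject(subject))
```

In Python 3 the decoded body is a bytes object, so the decode/encode pair works the same way as in the original. For a whole message rather than a bare string, the email package can probably do the header and transfer-encoding handling for you, but that is outside what this sketch covers.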