Dietrich Bollmann wrote:

> I get the strings (which actually are emails) from a server on the
> internet with:
>
> import urllib
> server = urllib.urlopen(serverURL, parameters)
> email = server.read()
>
> The coding systems are given in the response string:
>
> Example:
>
> email = '''[...]
> Subject:
> =?UTF-8?Q?romaji=E3=81=B2=E3=82=89=E3=81=8C=E3=81=AA=E3=82=AB=E3=82=BF?=
> =?UTF-8?Q?=E3=82=AB=E3=83=8A=E6=BC=A2=E5=AD=97?=
> [...]
> Content-Type: text/plain; charset=EUC-JP
> [...]
> Content-Transfer-Encoding: base64
> [...]
>
> cm9tYWpppNKk6aSspMqlq6W/paulyrTBu/oNCg0K
>
> '''

Is that an email? Maybe you can get it in a format that is supported by
the email package in the standard library.

> The only problem is that I could not find any standard functionality to
> convert between different Japanese coding systems.

Then you didn't look hard enough:

>>> s = "会社概要".decode("utf8") # I have no idea what that means
>>> s.encode("iso-2022-jp")
'\x1b$B2q<R35MW\x1b(B'
>>> s.encode("euc-jp")
'\xb2\xf1\xbc\xd2\xb3\xb5\xcd\xd7'
>>> s.encode("sjis")
'\x89\xef\x8e\xd0\x8aT\x97v'

See also
http://www.amk.ca/python/howto/unicode

Peter
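
For what it's worth, here is a rough sketch of the email-package route applied to the quoted example. It assumes Python 3 (the session above is Python 2), and the RAW message is my own reconstruction from the quoted headers and body with the "[...]" parts simply left out, so treat the names and the sample as illustrative rather than as what the server really returns.

```python
import email
from email.header import decode_header, make_header

# Hypothetical sample: only the headers and body quoted in the post,
# with the elided "[...]" parts omitted.
RAW = (
    b"Subject: =?UTF-8?Q?romaji=E3=81=B2=E3=82=89=E3=81=8C=E3=81=AA=E3=82=AB=E3=82=BF?=\r\n"
    b" =?UTF-8?Q?=E3=82=AB=E3=83=8A=E6=BC=A2=E5=AD=97?=\r\n"
    b"Content-Type: text/plain; charset=EUC-JP\r\n"
    b"Content-Transfer-Encoding: base64\r\n"
    b"\r\n"
    b"cm9tYWpppNKk6aSspMqlq6W/paulyrTBu/oNCg0K\r\n"
)

msg = email.message_from_bytes(RAW)

# Decode the RFC 2047 encoded words in the Subject header to a str.
subject = str(make_header(decode_header(msg["Subject"])))

# Undo the base64 transfer encoding, then decode the bytes using the
# charset declared in the Content-Type header.
charset = msg.get_content_charset()
body = msg.get_payload(decode=True).decode(charset)

# Once the text is a str, converting to another Japanese encoding is
# just a matter of encoding it again.
body_sjis = body.encode("shift_jis")
body_jis = body.encode("iso-2022-jp")

print(subject)
print(body.strip())
```

The point is the same as in the interactive session: decode everything to Unicode first, using whatever charset the message declares, and only then encode to the target coding system.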