On Jun 7, 10:55 pm, higer <higerinbeij...@gmail.com> wrote:
> My file contains such strings :
> \xe6\x97\xa5\xe6\x9c\x9f\xef\xbc\x9a

Are you sure? Does that occupy 9 bytes in your file or 36 bytes?

> I want to read the content of this file and transfer it to the
> corresponding gbk code, a kind of Chinese character encoding.
> Every time I tried to transfer it, it output the same thing no
> matter which method was used.
> It seems that when Python reads it, Python takes '\' as a
> common char, and this string ends up represented as
> "\\xe6\\x97\\xa5\\xe6\\x9c\\x9f\\xef\\xbc\\x9a", so the "\" is
> 'correctly' output, but that's not what I want to get.
>
> Can anyone help me?

Try this:

    utf8_data = your_data.decode('string-escape')
    unicode_data = utf8_data.decode('utf8')
    # unicode derived from your sample looks like this: 日期：
    # is that what you expected?
    gbk_data = unicode_data.encode('gbk')

If that "doesn't work", do three things:

(1) Give us some unambiguous hard evidence about the contents of your
data, e.g.

    # assuming Python 2.x
    your_data = open('your_file.txt', 'rb').read(36)
    print repr(your_data)
    print len(your_data)
    print your_data.count('\\')
    print your_data.count('x')

(2) Show us the source of the script that you used.

(3) Tell us what "doesn't work" means in this case.

Cheers,
John
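
Note for readers on Python 3: the 'string-escape' codec used above exists only in Python 2. The sketch below is one possible equivalent, not something from the original reply. It assumes the file really holds the 36-byte form (literal backslash, 'x', and two hex digits per byte) and contains nothing but ASCII and \xNN escapes. The names `raw` and `escaped_utf8_to_gbk` are made up for illustration.

```python
# Hypothetical stand-in for bytes read with open(path, 'rb').read():
# 36 bytes of literal backslash escapes, not 9 bytes of UTF-8.
raw = b"\\xe6\\x97\\xa5\\xe6\\x9c\\x9f\\xef\\xbc\\x9a"


def escaped_utf8_to_gbk(data):
    """Convert bytes holding literal \\xNN escapes of UTF-8 text to GBK bytes."""
    # Step 1: collapse each literal "\xNN" into the single byte it names.
    # 'unicode_escape' yields code points 0-255 here, and 'latin-1' maps
    # those back to bytes one-to-one. (Assumption: no \uXXXX escapes.)
    utf8_data = data.decode("unicode_escape").encode("latin-1")
    # Step 2: interpret the resulting bytes as UTF-8 text.
    text = utf8_data.decode("utf-8")
    # Step 3: re-encode the text as GBK.
    return text.encode("gbk")


gbk_data = escaped_utf8_to_gbk(raw)
print(len(raw), len(gbk_data))
print(gbk_data.decode("gbk"))
```

If the file instead holds the 9-byte form (real UTF-8 bytes), step 1 is unnecessary and only the decode/encode pair applies, which is why the byte count asked for above matters.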