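import codecs

def read_utf8_txt_file(filename):
    # Decode the file as UTF-8; read() returns a Unicode string.
    fileObj = codecs.open(filename, "r", "utf-8")
    try:
        content = fileObj.read()
    finally:
        fileObj.close()
    # Drop the BOM (U+FEFF) only if the file actually starts with one.
    if content.startswith(u"\ufeff"):
        content = content[1:]
    print(content)

read_utf8_txt_file("e:\\u.txt")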
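
If you would rather not slice the BOM off by hand, the "utf-8-sig" codec drops a leading BOM while decoding, and encoding with the "replace" error handler keeps a character the console encoding lacks from raising UnicodeEncodeError. Here is a rough sketch, assuming Python 2.6 or later (for the io module); the helper names read_text_without_bom and encode_for_output are made up for illustration:

```python
import io
import locale

def read_text_without_bom(filename):
    # "utf-8-sig" decodes UTF-8 and skips a leading BOM if one is present.
    f = io.open(filename, "r", encoding="utf-8-sig")
    try:
        return f.read()
    finally:
        f.close()

def encode_for_output(text, encoding=None):
    # Assumption: fall back to the locale's preferred encoding when none
    # is given. "replace" substitutes "?" for unencodable characters
    # instead of raising UnicodeEncodeError.
    if encoding is None:
        encoding = locale.getpreferredencoding()
    return text.encode(encoding, "replace")
```

encode_for_output returns bytes in the target encoding, so a character missing from that encoding comes out as "?" rather than stopping the program. Whether that loss is acceptable depends on what you do with the output.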

22 Dec 2005 18:12:28 -0800, [EMAIL PROTECTED] < [EMAIL PROTECTED]>:
Hi Friends:

        fileObj = codecs.open( filename, "r", "utf-8" )
        u = fileObj.read() # Returns a Unicode string from the UTF-8 bytes in the file
        print u

It fails with this error:
        UnicodeEncodeError: 'gbk' codec can't encode character u'\ufeff' in position 0: illegal multibyte sequence

I want to know how to read from a UTF-8 file, convert the text to a
specified locale's encoding (defaulting to the current system locale),
and print the string. I would also like the BOM header to be stripped
automatically.

Rgds, David

--
http://mail.python.org/mailman/listinfo/python-list
