import codecs

def read_utf8_txt_file(filename):
    fileObj = codecs.open(filename, "r", "utf-8")
    content = fileObj.read()
    fileObj.close()
    # Strip the BOM (U+FEFF) only if the file actually starts with one
    if content.startswith(u"\ufeff"):
        content = content[1:]
    print(content)

read_utf8_txt_file("e:\\u.txt")
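If you would rather not slice the BOM off by hand, here is a minimal sketch of another way to do it, assuming a Python version that has the "utf-8-sig" codec and io.open (2.6 or later). The helper names read_text_without_bom and encode_for_locale are made up for illustration and do not come from any library. The first lets the codec drop a leading BOM if there is one; the second converts the text to a given encoding, falling back to whatever locale.getpreferredencoding() reports for the system.

```python
import io
import locale

def read_text_without_bom(filename):
    # "utf-8-sig" decodes UTF-8 and skips a leading BOM if one is present
    with io.open(filename, "r", encoding="utf-8-sig") as file_obj:
        return file_obj.read()

def encode_for_locale(text, encoding=None):
    # Hypothetical helper: default to the system's preferred encoding
    if encoding is None:
        encoding = locale.getpreferredencoding()
    # "replace" substitutes "?" for characters the target encoding lacks,
    # instead of raising UnicodeEncodeError
    return text.encode(encoding, "replace")
```

The "replace" error handler is only one possible choice; "ignore" or "xmlcharrefreplace" may suit better, depending on how you want unencodable characters to show up.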
22 Dec 2005 18:12:28 -0800, [EMAIL PROTECTED] <[EMAIL PROTECTED]>:
Hi Friends:
fileObj = codecs.open( filename, "r", "utf-8" )
u = fileObj.read()  # Returns a Unicode string from the UTF-8 bytes in the file
print u
It fails with this error:
UnicodeEncodeError: 'gbk' codec can't encode character u'\ufeff' in position 0: illegal multibyte sequence
I want to know how to read a UTF-8 file, convert its contents to a
specified locale encoding (defaulting to the current system locale), and
print the resulting string. I would also like the BOM header to be
stripped automatically.
Rgds, David
--
http://mail.python.org/mailman/listinfo/python-list