Re: print UTF-8 file with BOM

2005-12-23 Thread Martin v. Löwis
John Bauman wrote:
> UTF-8 shouldn't need a BOM, as it is designed for character streams, and
> there is only one logical ordering of the bytes. Only UTF-16 and greater
> should output a BOM, AFAIK.
Yes and no. Yes, UTF-8 does not need a BOM to identify endianness. No, usage of the BOM with UTF

Re: print UTF-8 file with BOM

2005-12-23 Thread Walter Dörwald
John Bauman wrote:
> UTF-8 shouldn't need a BOM, as it is designed for character streams, and
> there is only one logical ordering of the bytes. Only UTF-16 and greater
> should output a BOM, AFAIK.
However, there's a pending patch (http://bugs.python.org/1177307) for a new encoding named utf-

Re: print UTF-8 file with BOM

2005-12-23 Thread John Bauman
UTF-8 shouldn't need a BOM, as it is designed for character streams, and there is only one logical ordering of the bytes. Only UTF-16 and greater should output a BOM, AFAIK.
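The byte-ordering point can be checked directly. Below is a minimal sketch in Python 3 (the thread itself uses Python 2); `sniff_bom` is a hypothetical helper name, and the BOM constants come from the standard `codecs` module. UTF-32 is left out, so this is an illustration rather than a complete detector.

```python
import codecs

# Known BOM byte sequences from the standard library's codecs module.
# UTF-32 is deliberately left out to keep the sketch short.
_BOMS = (
    ("utf-8", codecs.BOM_UTF8),          # EF BB BF
    ("utf-16-le", codecs.BOM_UTF16_LE),  # FF FE
    ("utf-16-be", codecs.BOM_UTF16_BE),  # FE FF
)

def sniff_bom(raw):
    """Return the encoding implied by a leading BOM, or None (hypothetical helper)."""
    for name, bom in _BOMS:
        if raw.startswith(bom):
            return name
    return None

# The same character has two possible byte orders in UTF-16 but only one in UTF-8.
assert "A".encode("utf-16-le") != "A".encode("utf-16-be")
assert "A".encode("utf-8") == b"A"
```

In UTF-16 the BOM tells the reader which byte order was used; in UTF-8 it can only act as a signature, since there is no byte order to choose.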

Re: print UTF-8 file with BOM

2005-12-23 Thread Carsten Haese
> 2005/12/23, David Xiao <[EMAIL PROTECTED]>:
> Hi Kuan:
>
> Thanks a lot! One more question here: How to write if I want to
> specify locale other than current locale?
>
> For example, running on Korea locale system, and try read a

Re: print UTF-8 file with BOM

2005-12-23 Thread davihigh
FYI: I just received something from a friend; he gave me the following nice example! I have one more question on this: how should I write it if I want to specify a locale other than the current one? For example, a program running on a Korean-locale system that tries to read a UTF-8 file containing Chinese characters.
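One possible way to approach this question, offered as a sketch rather than as the answer given in the thread: name both encodings explicitly instead of relying on the system locale. The function name `reencode` and the default target `gb18030` are assumptions chosen for illustration. The code is Python 3.

```python
def reencode(raw, source="utf-8", target="gb18030"):
    """Decode bytes with an explicit source encoding and re-encode them
    with an explicit target encoding, independent of the current locale.
    (Hypothetical helper; both defaults are illustrative assumptions.)"""
    text = raw.decode(source)
    return text.encode(target)

# Usage: the UTF-8 bytes of a Chinese string, converted to GB18030 bytes.
utf8_bytes = "中文".encode("utf-8")
gb_bytes = reencode(utf8_bytes)
```

The file's encoding is a property of the file, not of the machine reading it, so decoding with `"utf-8"` gives the same text on a Korean-locale system as anywhere else. The locale only matters when the text is written to a terminal or file without an explicit encoding.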

Re: print UTF-8 file with BOM

2005-12-23 Thread Kevin Yuan
Sorry, I'm a newbie in Python. I can't help you further; indeed, I don't know either. :)
2005/12/23, David Xiao <[EMAIL PROTECTED]>:
Hi Kuan:
Thanks a lot! One more question here: How to write if I want to specify locale other than current locale?
For example, running on Korea locale system, and try read a

Re: print UTF-8 file with BOM

2005-12-22 Thread Kevin Yuan
import codecs
def read_utf8_txt_file(filename):
    fileObj = codecs.open(filename, "r", "utf-8")
    content = fileObj.read()
    content = content[1:]  # exclude BOM
    print content
    fileObj.close()
read_utf8_txt_file("e:\\u.txt")
22 Dec 2005 18:12:28 -0800, [EMAIL PROTECTED] < [EMAIL PROTECT
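Dropping the first character unconditionally loses real data when a file has no BOM. A hedged variant in Python 3 strips the BOM only when it is actually present; `decode_utf8_text` is a hypothetical name. Current Python versions also ship a `utf-8-sig` codec in the standard library that skips a leading BOM on decode, shown here for comparison.

```python
import codecs

def decode_utf8_text(raw):
    """Decode UTF-8 bytes, dropping a leading BOM only if one is present.
    (Hypothetical helper for illustration.)"""
    if raw.startswith(codecs.BOM_UTF8):
        raw = raw[len(codecs.BOM_UTF8):]
    return raw.decode("utf-8")

with_bom = codecs.BOM_UTF8 + "abc".encode("utf-8")
without_bom = "abc".encode("utf-8")

# Both inputs yield the same text; no character is lost when the BOM is absent.
assert decode_utf8_text(with_bom) == decode_utf8_text(without_bom)

# The standard "utf-8-sig" codec gives the same result on decode.
assert with_bom.decode("utf-8-sig") == decode_utf8_text(with_bom)
```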

print UTF-8 file with BOM

2005-12-22 Thread davihigh
Hi Friends:
fileObj = codecs.open( filename, "r", "utf-8" )
u = fileObj.read() # Returns a Unicode string from the UTF-8 bytes in the file
print u
It says error:
UnicodeEncodeError: 'gbk' codec can't encode character u'\ufeff' in position 0: illegal multiby
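The error can be reproduced without a file or a console. The sketch below is Python 3 and assumes only the standard codecs; `can_encode` is a hypothetical helper. The plain `utf-8` codec decodes the BOM bytes to the character U+FEFF, and it is that character which the `gbk` codec has no mapping for when `print` encodes the text for output.

```python
def can_encode(text, encoding):
    """Return True if text can be encoded with the given codec (hypothetical helper)."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

# Bytes of a UTF-8 file that starts with a BOM (EF BB BF).
raw = b"\xef\xbb\xbfhello"

# The plain "utf-8" codec keeps the BOM as the first character, U+FEFF.
text = raw.decode("utf-8")
assert text[0] == "\ufeff"

# U+FEFF has no GBK representation, so encoding the whole string fails,
# while the text after the BOM encodes without problems.
assert not can_encode(text, "gbk")
assert can_encode(text[1:], "gbk")
```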