Re: processing a large utf-8 file

2005-05-20 Thread "Martin v. Löwis"
Ivan Voras wrote:
> Since the .encoding attribute of file objects is read-only, what is the
> proper way to process large utf-8 text files?

You should use codecs.open, or codecs.getreader to get a StreamReader for UTF-8.

Regards,
Martin
--
http://mail.python.org/mailman/listinfo/python-list
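For illustration, here is a minimal sketch of the codecs.getreader approach, written for a current Python 3 interpreter. The helper name iter_utf8_blocks and the 1 MiB default block size are assumptions made for this example, not something from the thread. The StreamReader keeps any incomplete trailing byte sequence in an internal buffer until the next read, so a block should never end in the middle of a character.

```python
import codecs


def iter_utf8_blocks(path, block_size=1 << 20):
    """Yield decoded text blocks from a UTF-8 file.

    Hypothetical helper; block_size is only an approximate size hint
    passed to StreamReader.read(), not an exact character count.
    """
    # Open the file in binary mode and let the StreamReader decode it.
    with open(path, "rb") as raw:
        reader = codecs.getreader("utf-8")(raw)
        while True:
            # Bytes of a character split across two reads are held back
            # by the reader and prepended to the next read.
            block = reader.read(block_size)
            if not block:
                break  # an empty string means end of file
            yield block
```

Looping over iter_utf8_blocks(path) then hands each decoded block to whatever per-block processing is needed, without loading the whole file into memory.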

processing a large utf-8 file

2005-05-20 Thread Ivan Voras
Since the .encoding attribute of file objects is read-only, what is the proper way to process large utf-8 text files? I need "bulk" processing (i.e. in blocks, since the file is ~1 GB), but reading it in fixed-size blocks is bound to result in partially-read utf-8 characters at block boundaries.
-- ht