On 3/5/2011 10:21 AM, tkp...@hotmail.com wrote:
Question: how do I use f.tell() to identify if an offset is legal or illegal?
Read backwards in binary mode, byte by byte, until you reach a byte which is, in binary, either 0xxxxxxx or 11xxxxxx. (Bytes of the form 10xxxxxx are UTF-8 continuation bytes, so they never start a character.) You are then at the beginning of an ASCII or UTF-8 character. You can copy the bytes forward from there into an array of bytes, then apply the appropriate codec. This is also what you do if skipping ahead in a UTF-8 file, to get in sync.

Reading the last line or lines is easier. Read backwards in binary until you hit an LF or CR, both of which are encoded the same way in ASCII and UTF-8 and never occur inside a multi-byte UTF-8 sequence. Copy the bytes forward from that point into an array of bytes, then apply the appropriate codec.

John Nagle
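
For illustration, the two procedures described above might look something like this in Python. This is only a rough sketch, assuming the file is UTF-8 and was opened in binary mode ("rb"). The function names (sync_to_char_start, is_legal_offset, read_last_line) are made up for this example and are not part of any library. It seeks one byte at a time for clarity; real code would probably read backwards in larger chunks.

```python
import io


def sync_to_char_start(f, offset):
    """Step backwards from offset to the nearest UTF-8 character start.

    f must be a seekable binary file. Returns the adjusted offset.
    """
    f.seek(0, io.SEEK_END)
    size = f.tell()
    pos = min(offset, size)
    if pos == size:
        return pos  # end of file is always a character boundary
    while pos > 0:
        f.seek(pos)
        byte = f.read(1)[0]
        # 10xxxxxx is a continuation byte; anything else
        # (0xxxxxxx or 11xxxxxx) starts a character.
        if byte & 0xC0 != 0x80:
            break
        pos -= 1
    return pos


def is_legal_offset(f, offset):
    """True if offset already sits on a character boundary."""
    return sync_to_char_start(f, offset) == offset


def read_last_line(f, encoding="utf-8"):
    """Return the last non-empty line of a binary file, decoded.

    Trailing LF/CR bytes at the end of the file are ignored.
    """
    f.seek(0, io.SEEK_END)
    end = f.tell()
    # Skip line terminators at the very end of the file.
    while end > 0:
        f.seek(end - 1)
        if f.read(1) not in (b"\n", b"\r"):
            break
        end -= 1
    # Walk backwards until the previous LF or CR (or start of file).
    start = end
    while start > 0:
        f.seek(start - 1)
        if f.read(1) in (b"\n", b"\r"):
            break
        start -= 1
    f.seek(start)
    return f.read(end - start).decode(encoding)
```

With something like f = open(path, "rb"), is_legal_offset(f, n) answers whether n falls on a character boundary, and sync_to_char_start(f, n) gives the nearest boundary at or before n. This works on raw byte offsets, which is why the file has to be opened in binary mode.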