On Fri, 6 May 2022 21:19:48 +0100, MRAB <pyt...@mrabarnett.plus.com> declaimed the following:
>Is the file UTF-8? That's a variable-width encoding, so are any of the
>characters > U+007F?
>
>Which OS? On Windows, it's common/normal for UTF-8 files to start with a
>BOM/signature, which is 3 bytes/1 codepoint.

Windows also uses <cr><lf> for the EOL marker, but Python's I/O system
condenses that to just <lf> internally (for TEXT mode) -- so using the
length of a string so read to compute a file position may be off-by-one for
each EOL in the string.

https://docs.python.org/3/tutorial/inputoutput.html#reading-and-writing-files
"""
In text mode, the default when reading is to convert platform-specific line
endings (\n on Unix, \r\n on Windows) to just \n. When writing in text
mode, the default is to convert occurrences of \n back to platform-specific
line endings. This behind-the-scenes modification to file data is fine for
text files, but will corrupt binary data like that in JPEG or EXE files. Be
very careful to use binary mode when reading and writing such files.
"""

-- 
Wulfraed                 Dennis Lee Bieber         AF6VN
wlfr...@ix.netcom.com    http://wlfraed.microdiversity.freeddns.org/
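
PS: A minimal sketch of both effects, in case it helps. The helper name
text_vs_byte_length and the sample byte strings are made up for
illustration; they are not from the original question. It writes raw bytes
to a temporary file, reads them back in text mode, and compares the number
of characters read against the number of bytes on disk.

```python
import os
import tempfile


def text_vs_byte_length(raw: bytes, encoding: str = "utf-8"):
    """Return (characters read in text mode, bytes on disk) for raw.

    Hypothetical helper, for illustration only.
    """
    fd, path = tempfile.mkstemp()
    try:
        # Binary mode: the bytes land on disk exactly as given.
        with os.fdopen(fd, "wb") as f:
            f.write(raw)
        # Text mode with the default newline=None: \r\n is read back as \n.
        with open(path, "r", encoding=encoding) as f:
            text = f.read()
        return len(text), os.path.getsize(path)
    finally:
        os.remove(path)


# Two CRLF line endings: each one costs a character relative to the byte count.
crlf_chars, crlf_bytes = text_vs_byte_length(b"one\r\ntwo\r\n")

# A UTF-8 BOM is 3 bytes on disk but decodes to one codepoint (U+FEFF).
bom_chars, bom_bytes = text_vs_byte_length(b"\xef\xbb\xbfabc")

print(crlf_chars, crlf_bytes)
print(bom_chars, bom_bytes)
```

If exact byte offsets matter, opening the file in binary mode ("rb") and
decoding explicitly avoids the mismatch. Reading with encoding="utf-8-sig"
should drop a leading BOM from the decoded text, though the three bytes are
still present on disk.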