On 2022-05-06 20:21, Marco Sulla wrote:
I have a little problem.
I tried to extend the tail function so that it can read lines from the
bottom of a file object opened in text mode.
The problem is that it does not work. It computes a starting position that
is 3 characters lower than expected, so the first line returned contains
only 2 characters, and the last line is missing.
import os

_lf = "\n"
_cr = "\r"
_lf_ord = ord(_lf)

def tail(f, n=10, chunk_size=100):
    n_chunk_size = n * chunk_size
    pos = os.stat(f.fileno()).st_size
    chunk_line_pos = -1
    lines_not_found = n
    binary_mode = "b" in f.mode
    lf = _lf_ord if binary_mode else _lf

    while pos != 0:
        pos -= n_chunk_size

        if pos < 0:
            pos = 0

        f.seek(pos)
        chars = f.read(n_chunk_size)

        for i, char in enumerate(reversed(chars)):
            if char == lf:
                lines_not_found -= 1

                if lines_not_found == 0:
                    chunk_line_pos = len(chars) - i - 1
                    print(chunk_line_pos, i)
                    break

        if lines_not_found == 0:
            break

    line_pos = pos + chunk_line_pos + 1
    f.seek(line_pos)
    res = b"" if binary_mode else ""

    for i in range(n):
        res += f.readline()

    return res
Maybe the problem is 1 char != 1 byte?

Is the file UTF-8? That's a variable-width encoding, so are any of the
characters > U+007F?
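
A quick way to see the effect is to compare character and byte counts for a string containing characters above U+007F. This is only a sketch; the sample string and the variable names are made up for illustration and are not taken from the file in question:

```python
import io

# Hypothetical sample text: 'ï' and 'é' are above U+007F, so UTF-8 encodes
# each of them as two bytes instead of one.
text = "naïve café\n"
data = text.encode("utf-8")

n_chars = len(text)   # number of code points
n_bytes = len(data)   # number of bytes, which is what st_size reports

# In text mode, read(n) returns n characters, which may cover more than n bytes.
stream = io.TextIOWrapper(io.BytesIO(data), encoding="utf-8", newline="")
first_five = stream.read(5)
bytes_consumed = len(first_five.encode("utf-8"))

print(n_chars, n_bytes, repr(first_five), bytes_consumed)
```

If the two counts differ for your file, a position computed from the size in bytes will not line up with offsets counted in characters.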
Which OS? On Windows, it's common/normal for UTF-8 files to start with a
BOM/signature, which is 3 bytes/1 codepoint.
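
To check whether a signature alone could account for a difference of 3, something like the following might help. It is a sketch under assumptions: the file name and contents are invented, and the file is written with the "utf-8-sig" codec to imitate what some Windows editors produce:

```python
import codecs
import os
import tempfile

text = "first\nsecond\nthird\n"   # hypothetical sample content, ASCII only

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")   # hypothetical file name

    # "utf-8-sig" writes the 3-byte signature EF BB BF before the text.
    with open(path, "w", encoding="utf-8-sig", newline="\n") as f:
        f.write(text)

    size_in_bytes = os.stat(path).st_size

    with open(path, "rb") as f:
        starts_with_bom = f.read(3) == codecs.BOM_UTF8

    # Plain "utf-8" keeps the signature as a single code point, U+FEFF.
    with open(path, encoding="utf-8", newline="") as f:
        chars_with_bom = f.read()

    # "utf-8-sig" strips the signature while decoding.
    with open(path, encoding="utf-8-sig", newline="") as f:
        chars_without_bom = f.read()

gap = size_in_bytes - len(chars_without_bom)
print(starts_with_bom, size_in_bytes, len(chars_with_bom), len(chars_without_bom), gap)
```

With ASCII-only content, the whole gap between the size in bytes and the number of characters comes from the signature.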