On Sat, 23 Apr 2022 at 20:59, Chris Angelico <ros...@gmail.com> wrote:
>
> On Sun, 24 Apr 2022 at 04:37, Marco Sulla <marco.sulla.pyt...@gmail.com> wrote:
> >
> > What about introducing a method for text streams that reads the lines
> > from the bottom? Java has also a ReversedLinesFileReader with Apache
> > Commons IO.
>
> It's fundamentally difficult to get precise. In general, there are
> three steps to reading the last N lines of a file:
>
> 1) Find out the size of the file (currently, if it's being grown)
> 2) Seek to the end of the file, minus some threshold that you hope
> will contain a number of lines
> 3) Read from there to the end of the file, split it into lines, and
> keep the last N
>
> Reading the preceding N lines is basically a matter of repeating the
> same exercise, but instead of "end of the file", use the byte position
> of the line you last read.
>
> The problem is, seeking around in a file is done by bytes, not
> characters. So if you know for sure that you can resynchronize
> (possible with UTF-8, not possible with some other encodings), then
> you can do this, but it's probably best to build it yourself (opening
> the file in binary mode).
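
For reference, the three steps quoted above could be sketched roughly like this. It is only a minimal illustration, not a proposed stdlib API: the names tail_lines and threshold are made up here, and it assumes a UTF-8 file with "\n" line endings, opened in binary mode. Splitting on the byte b"\n" is safe for UTF-8 because that byte never occurs inside a multi-byte sequence; this would not hold for something like UTF-16.

```python
import os


def tail_lines(path, n, threshold=4096):
    """Return the last n lines of a UTF-8 file (hypothetical helper)."""
    if n <= 0:
        return []
    with open(path, "rb") as f:
        # Step 1: find the current size of the file.
        size = f.seek(0, os.SEEK_END)
        while True:
            # Step 2: seek to the end minus the threshold, clamped to the start.
            start = max(0, size - threshold)
            f.seek(start)
            # Step 3: read up to the size found in step 1 and split into lines.
            data = f.read(size - start)
            lines = data.split(b"\n")
            # A trailing newline leaves an empty last element; drop it.
            if lines and lines[-1] == b"":
                lines.pop()
            # Unless we started at byte 0, the first element may be a partial
            # line (possibly cut in the middle of a character), so discard it.
            if start > 0:
                lines = lines[1:]
            if len(lines) >= n or start == 0:
                # Only whole lines are decoded, so decoding cannot fail on a
                # character that was split by the seek position.
                return [line.decode("utf-8") for line in lines[-n:]]
            # The guess was too small: double it and try again.
            threshold *= 2
```

Calling tail_lines("app.log", 10) would return the last ten lines as a list of str, where "app.log" is just a placeholder path. Doubling the threshold keeps the number of retries small when the lines turn out to be longer than guessed.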

Well, indeed I have an implementation that does more or less what you
described, for UTF-8 only. The only difference is that I just start
from the end of the file minus 1. I'm just wondering if this would be
useful in the stdlib. I think it's not too difficult to generalise to
every encoding.

> This is quite inefficient in general.

Why inefficient? I think that readlines() will be much slower, not
only more time consuming.

-- 
https://mail.python.org/mailman/listinfo/python-list