subject:"tail"

Re: on a tail-recursive square-and-multiply

2023-11-09 Thread Julieta Shem via Python-list

Julieta Shem writes: [...] > I agree. By the way, I once read or watched an interview with Guido van > Rossum and he was asked why not to tail-call optimize Python and the > answer he gave --- IIRC --- was that tail-call optimization makes it > harder for a beginner to unders

Re: on a tail-recursive square-and-multiply

2023-11-08 Thread Julieta Shem via Python-list

Greg Ewing writes: > On 8/11/23 2:26 pm, Julieta Shem wrote: >> For the first time I'm trying to write a tail-recursive >> square-and-multiply and, even though it /seems/ to work, I'm not happy >> with what I wrote and I don't seem to understand it so well.

Re: on a tail-recursive square-and-multiply

2023-11-07 Thread Greg Ewing via Python-list

On 8/11/23 2:26 pm, Julieta Shem wrote: For the first time I'm trying to write a tail-recursive square-and-multiply and, even though it /seems/ to work, I'm not happy with what I wrote and I don't seem to understand it so well. Stepping back a bit, why do you feel the need to

Re: on a tail-recursive square-and-multiply

2023-11-07 Thread Michael Torrie via Python-list

On 11/7/23 18:26, Julieta Shem via Python-list wrote: > For the first time I'm trying to write a tail-recursive > square-and-multiply and, even though it /seems/ to work, I'm not happy > with what I wrote and I don't seem to understand it so well. > >

on a tail-recursive square-and-multiply

2023-11-07 Thread Julieta Shem via Python-list

For the first time I'm trying to write a tail-recursive square-and-multiply and, even though it /seems/ to work, I'm not happy with what I wrote and I don't seem to understand it so well. --8<---cut here---start->8--- def sam(b, e, m,
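
A minimal sketch of a tail-recursive square-and-multiply for modular exponentiation. The fourth parameter of the original `sam` is cut off in the snippet above, so the accumulator `acc` and the name `sam_sketch` are assumptions for illustration and this is not the poster's code:

```python
def sam_sketch(b, e, m, acc=1):
    """Return (b ** e) % m for e >= 0 and m >= 1.

    acc carries the partial product, so the recursive call is the last
    thing the function does (a tail call).
    """
    if e == 0:
        return acc % m
    if e % 2 == 1:
        # The lowest bit of the exponent is set: fold b into the result.
        acc = (acc * b) % m
    # Square the base and drop the lowest bit of the exponent.
    return sam_sketch((b * b) % m, e // 2, m, acc)


print(sam_sketch(2, 10, 1000))  # 24, since 2**10 == 1024
```

CPython does not eliminate tail calls, but the recursion depth here grows only with the number of bits in `e`, so ordinary exponents stay far below the recursion limit. The built-in `pow(b, e, m)` computes the same value.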

Re: Precision Tail-off?

2023-02-18 Thread Oscar Benjamin

On Sat, 18 Feb 2023 at 11:19, Peter J. Holzer wrote: > > On 2023-02-18 03:52:51 +, Oscar Benjamin wrote: > > On Sat, 18 Feb 2023 at 01:47, Chris Angelico wrote: > > > On Sat, 18 Feb 2023 at 12:41, Greg Ewing via Python-list > > > > To avoid it you would need to use an algorithm that computes

Re: Precision Tail-off?

2023-02-18 Thread Peter J. Holzer

On 2023-02-18 03:52:51 +, Oscar Benjamin wrote: > On Sat, 18 Feb 2023 at 01:47, Chris Angelico wrote: > > On Sat, 18 Feb 2023 at 12:41, Greg Ewing via Python-list > > > To avoid it you would need to use an algorithm that computes nth > > > roots directly rather than raising to the power 1/n. >
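
The suggestion quoted above, computing nth roots directly instead of raising to the power 1/n, can be sketched for integers with Newton's iteration. The function name `integer_nth_root` is hypothetical, and this is one possible approach rather than the algorithm anyone in the thread proposed:

```python
def integer_nth_root(x, n):
    """Return the largest integer r with r**n <= x, for x >= 0 and n >= 1."""
    if x < 0 or n < 1:
        raise ValueError("need x >= 0 and n >= 1")
    if x < 2:
        return x
    # Start from a power of two that is guaranteed to be above the root.
    bits = x.bit_length()
    r = 1 << ((bits + n - 1) // n)
    while True:
        # Newton step for f(r) = r**n - x, using integer arithmetic only.
        t = ((n - 1) * r + x // r ** (n - 1)) // n
        if t >= r:
            return r
        r = t


print(integer_nth_root(10**30, 3))  # 10000000000
```

Because every step uses Python integers, no 1/n exponent is ever rounded to a binary float, and the result is exact for arbitrarily large inputs.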

Re: Precision Tail-off?

2023-02-17 Thread Oscar Benjamin

On Sat, 18 Feb 2023 at 01:47, Chris Angelico wrote: > > On Sat, 18 Feb 2023 at 12:41, Greg Ewing via Python-list > wrote: > > > > On 18/02/23 7:42 am, Richard Damon wrote: > > > On 2/17/23 5:27 AM, Stephen Tucker wrote: > > >> None of the digits in RootNZZZ's string should be different from the >

Re: Precision Tail-off?

2023-02-17 Thread Michael Torrie

On 2/17/23 15:03, Grant Edwards wrote: > Every fall, the groups were again full of a new crop of people who had > just discovered all sorts of bugs in the way > implemented floating point, and pointing them to a nicely written > document that explained it never did any good. But to be fair, Goldb

Re: Precision Tail-off?

2023-02-17 Thread Chris Angelico

On Sat, 18 Feb 2023 at 12:41, Greg Ewing via Python-list wrote: > > On 18/02/23 7:42 am, Richard Damon wrote: > > On 2/17/23 5:27 AM, Stephen Tucker wrote: > >> None of the digits in RootNZZZ's string should be different from the > >> corresponding digits in RootN. > > > > Only if the storage form

Re: Precision Tail-off?

2023-02-17 Thread Greg Ewing via Python-list

On 18/02/23 7:42 am, Richard Damon wrote: On 2/17/23 5:27 AM, Stephen Tucker wrote: None of the digits in RootNZZZ's string should be different from the corresponding digits in RootN. Only if the storage format was DECIMAL. Note that using decimal wouldn't eliminate this particular problem,

Re: Precision Tail-off?

2023-02-17 Thread Grant Edwards

On 2023-02-17, Mats Wichmann wrote: > And... this topic as a whole comes up over and over again, like > everywhere. That's an understatement. I remember it getting rehashed over and over again in various USENET groups 35 years ago when the VAX 11/780 BSD machine on which I read news exchan

Re: Precision Tail-off?

2023-02-17 Thread Mats Wichmann

On 2/17/23 11:42, Richard Damon wrote: On 2/17/23 5:27 AM, Stephen Tucker wrote: The key factor here is IEEE floating point is storing numbers in BINARY, not DECIMAL, so a multiply by 1000 will change the representation of the number, and thus the possible resolution errors. Store you numbe
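
A small illustration of the point about binary storage, using only the standard library. The helper names are hypothetical: multiplying a binary float by 1000 produces a different significand, while scaling a `decimal.Decimal` by a power of ten only moves the exponent.

```python
import math
from decimal import Decimal


def binary_significand(x):
    """Return the significand of a float as reported by math.frexp."""
    return math.frexp(x)[0]


def decimal_digits(d):
    """Return the coefficient digits of a Decimal."""
    return d.as_tuple().digits


# Binary: the significand changes when the value is multiplied by 1000.
print(binary_significand(0.1), binary_significand(0.1 * 1000))

# Decimal: scaling by 10**3 keeps the same digits and shifts the exponent.
tenth = Decimal("0.1")
print(decimal_digits(tenth), decimal_digits(tenth.scaleb(3)))
```

This only shows the effect of scaling; it does not make roots or other irrational results exact in either format.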

Re: Precision Tail-off?

2023-02-17 Thread Grant Edwards

On 2023-02-17, Richard Damon wrote: > [...] > >> Perhaps this observation should be brought to the attention of the IEEE. I >> would like to know their response to it. > > That is why they have developed the Decimal Floating point format, to > handle people with those sorts of problems. > > They

Re: Precision Tail-off?

2023-02-17 Thread Peter J. Holzer

On 2023-02-17 14:39:42 +, Weatherby,Gerard wrote: > IEEE did not define a standard for floating point arithmetics. They > designed multiple standards, including a decimal float point one. > Although decimal floating point (DFP) hardware used to be > manufactured, I couldn’t find any current man

Re: Precision Tail-off?

2023-02-17 Thread Peter J. Holzer

On 2023-02-17 10:27:08 +, Stephen Tucker wrote: > This is a hugely controversial claim, I know, but I would consider this > behaviour to be a serious deficiency in the IEEE standard. > > Consider an integer N consisting of a finitely-long string of digits in > base 10. > > Consider the infini

Re: Precision Tail-off?

2023-02-17 Thread Peter J. Holzer

On 2023-02-17 08:38:58 -0700, Michael Torrie wrote: > On 2/17/23 03:27, Stephen Tucker wrote: > > Thanks, one and all, for your responses. > > > > This is a hugely controversial claim, I know, but I would consider this > > behaviour to be a serious deficiency in the IEEE standard. > > No matter ho

Re: Precision Tail-off?

2023-02-17 Thread Oscar Benjamin

On Fri, 17 Feb 2023 at 10:29, Stephen Tucker wrote: > > Thanks, one and all, for your responses. > > This is a hugely controversial claim, I know, but I would consider this > behaviour to be a serious deficiency in the IEEE standard. [snip] > > Perhaps this observation should be brought to the atte

Re: Precision Tail-off?

2023-02-17 Thread Richard Damon

On 2/17/23 5:27 AM, Stephen Tucker wrote: Thanks, one and all, for your responses. This is a hugely controversial claim, I know, but I would consider this behaviour to be a serious deficiency in the IEEE standard. Consider an integer N consisting of a finitely-long string of digits in base 10.

Re: Precision Tail-off?

2023-02-17 Thread Michael Torrie

On 2/17/23 03:27, Stephen Tucker wrote: > Thanks, one and all, for your responses. > > This is a hugely controversial claim, I know, but I would consider this > behaviour to be a serious deficiency in the IEEE standard. No matter how you do it, there are always tradeoffs and inaccuracies moving fr

Re: Precision Tail-off?

2023-02-17 Thread Peter Pearson

p] >> >> I have just produced the following log in IDLE (admittedly, in Python >> >> 2.7.10 and, yes I know that it has been superseded). >> >> >> >> It appears to show a precision tail-off as the supplied float gets >> bigger. >> [sn

RE: Precision Tail-off?

2023-02-17 Thread avi.e.gross

? -Original Message- From: Python-list On Behalf Of Stephen Tucker Sent: Friday, February 17, 2023 5:27 AM To: python-list@python.org Subject: Re: Precision Tail-off? Thanks, one and all, for your responses. This is a hugely controversial claim, I know, but I would consider this behaviour to be

Re: Precision Tail-off?

2023-02-17 Thread Weatherby,Gerard

until a few years ago, but they seem to have gone dark: https://twitter.com/SilMinds From: Python-list on behalf of Thomas Passin Date: Friday, February 17, 2023 at 9:02 AM To: python-list@python.org Subject: Re: Precision Tail-off? *** Attention: This is an external email. Use caution

Re: Precision Tail-off?

2023-02-17 Thread Thomas Passin

On Tue, 14 Feb 2023 11:17:20 +, Oscar Benjamin wrote: On Tue, 14 Feb 2023 at 07:12, Stephen Tucker wrote: [snip] I have just produced the following log in IDLE (admittedly, in Python 2.7.10 and, yes I know that it has been superseded). It appears to show a precision tail-off as the supp

Re: Precision Tail-off?

2023-02-17 Thread Stephen Tucker

> > Stephen Tucker. > > > On Thu, Feb 16, 2023 at 6:49 PM Peter Pearson > wrote: > >> On Tue, 14 Feb 2023 11:17:20 +, Oscar Benjamin wrote: >> > On Tue, 14 Feb 2023 at 07:12, Stephen Tucker >> wrote: >> [snip] >> >> I have just produced

Re: Precision Tail-off?

2023-02-17 Thread Stephen Tucker

2023 at 07:12, Stephen Tucker > wrote: > [snip] > >> I have just produced the following log in IDLE (admittedly, in Python > >> 2.7.10 and, yes I know that it has been superseded). > >> > >> It appears to show a precision tail-off as the supplied float gets >

Re: Precision Tail-off?

2023-02-16 Thread Peter Pearson

On Tue, 14 Feb 2023 11:17:20 +, Oscar Benjamin wrote: > On Tue, 14 Feb 2023 at 07:12, Stephen Tucker wrote: [snip] >> I have just produced the following log in IDLE (admittedly, in Python >> 2.7.10 and, yes I know that it has been superseded). >> >> It appears to s

Re: Precision Tail-off?

2023-02-15 Thread Weatherby,Gerard

) 8.881784197001252e-16 1E-99 From: Python-list on behalf of Michael Torrie Date: Tuesday, February 14, 2023 at 5:52 PM To: python-list@python.org Subject: Re: Precision Tail-off? *** Attention: This is an external email. Use caution responding, opening attachments or clicking on links. *** On 2

Re: Precision Tail-off?

2023-02-14 Thread Michael Torrie

On 2/14/23 00:09, Stephen Tucker wrote: > I have two questions: > 1. Is there a straightforward explanation for this or is it a bug? To you 1/3 may be an exact fraction, and the definition of raising a number to that power means a cube root which also has an exact answer, but to the computer, 1/3 i
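
The remark that 1/3 is not exact to the computer can be checked with the standard `fractions` module. This is a minimal sketch; the variable names are only illustrative:

```python
from fractions import Fraction

stored = Fraction(1 / 3)   # exact value of the double closest to 1/3
exact = Fraction(1, 3)     # the true rational 1/3

print(stored == exact)         # False
print(stored)                  # a ratio whose denominator is a power of two
print(float(exact - stored))   # the small positive gap between the two
```

Any power computed as `x ** (1 / 3)` therefore uses an exponent slightly below one third before the rounding of the power itself is even considered.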

Re: Precision Tail-off?

2023-02-14 Thread Weatherby,Gerard

Use Python3 Use the decimal module: https://docs.python.org/3/library/decimal.html From: Python-list on behalf of Stephen Tucker Date: Tuesday, February 14, 2023 at 2:11 AM To: Python Subject: Precision Tail-off? *** Attention: This is an external email. Use caution responding, opening

Re: Precision Tail-off?

2023-02-14 Thread Oscar Benjamin

On Tue, 14 Feb 2023 at 07:12, Stephen Tucker wrote: > > Hi, > > I have just produced the following log in IDLE (admittedly, in Python > 2.7.10 and, yes I know that it has been superseded). > > It appears to show a precision tail-off as the supplied float gets bigger. > >

Precision Tail-off?

2023-02-13 Thread Stephen Tucker

Hi, I have just produced the following log in IDLE (admittedly, in Python 2.7.10 and, yes I know that it has been superseded). It appears to show a precision tail-off as the supplied float gets bigger. I have two questions: 1. Is there a straightforward explanation for this or is it a bug? 2

Re: tail

2022-05-19 Thread Cameron Simpson

thon interpreter. Try: >> >> time python3 your-tail-prog.py /home/marco/lorem.txt > >Well, I'll try it, but it's not a bit unfair to compare Python startup with C? Yes it is. But timeit goes the other way and only measures the code. Admittedly I'd expect a C

Re: tail

2022-05-19 Thread Marco Sulla

On Wed, 18 May 2022 at 23:32, Cameron Simpson wrote: > > On 17May2022 22:45, Marco Sulla wrote: > >Well, I've done a benchmark. > >>>> timeit.timeit("tail('/home/marco/small.txt')", globals={"tail":tail}, > >>>>

Re: tail

2022-05-18 Thread Cameron Simpson

On 17May2022 22:45, Marco Sulla wrote: >Well, I've done a benchmark. >>>> timeit.timeit("tail('/home/marco/small.txt')", globals={"tail":tail}, >>>> number=10) >1.5963431186974049 >>>> timeit.timei

Re: tail

2022-05-18 Thread Marco Sulla

Well, I've done a benchmark. >>> timeit.timeit("tail('/home/marco/small.txt')", globals={"tail":tail}, >>> number=10) 1.5963431186974049 >>> timeit.timeit("tail('/home/marco/lorem.txt')", globals={"t

Re: tail

2022-05-16 Thread Marco Sulla

ion" I have ever seen. > > > You're lucky. I've seen much worse (or no one). > > At least with *no* documentation, the source code stands for itself. So I did it well to not put one in the first time. I think that after 100 posts about tail, chunks etc it was clear what th

Re: tail

2022-05-13 Thread 2QdxY4RzWzUUiLuE

On 2022-05-13 at 12:16:57 +0200, Marco Sulla wrote: > On Fri, 13 May 2022 at 00:31, Cameron Simpson wrote: [...] > > This is nearly the worst "specification" I have ever seen. > You're lucky. I've seen much worse (or no one). At least with *no* documentation, the source code stands for itsel

Re: tail

2022-05-13 Thread Marco Sulla

> >""" > >A function that "tails" the file. If you don't know what that means, > >google "man tail" > > > >filepath: the file path of the file to be "tailed" > >n: the numbers of lines "tailed" > >

Re: tail

2022-05-12 Thread Cameron Simpson

On 12May2022 19:48, Marco Sulla wrote: >On Thu, 12 May 2022 at 00:50, Stefan Ram wrote: >> There's no spec/doc, so one can't even test it. > >Excuse me, you're very right. > >""" >A function that "tails" the file. If you don'

Re: tail

2022-05-12 Thread Dennis Lee Bieber

On Thu, 12 May 2022 22:45:42 +0200, Marco Sulla declaimed the following: > >Maybe. Maybe not. What if the file ends with no newline? https://github.com/coreutils/coreutils/blob/master/src/tail.c Lines 567-569 (also lines 550-557 for "bytes_read" determination) -- Wulfraed

Re: tail

2022-05-12 Thread Marco Sulla

Thank you very much. This helped me to improve the function: import os _lf = b"\n" _err_n = "Parameter n must be a positive integer number" _err_chunk_size = "Parameter chunk_size must be a positive integer number" def tail(filepath, n=10, chunk_size=100):
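
The body of the function is cut off in the snippet above. As a rough sketch of the general technique it describes (open in binary mode, read chunks backward from the end, treat only `b"\n"` as a line end), something like the following could work. The name `tail_sketch`, the 4096-byte default, and the choice to return lines without their terminators are all assumptions, not the poster's implementation:

```python
import os


def tail_sketch(filepath, n=10, chunk_size=4096):
    """Return the last n lines of a file as a list of bytes objects."""
    if n <= 0:
        raise ValueError("Parameter n must be a positive integer number")
    if chunk_size <= 0:
        raise ValueError("Parameter chunk_size must be a positive integer number")

    data = b""
    with open(filepath, "rb") as f:
        f.seek(0, os.SEEK_END)
        pos = f.tell()
        # Read backward until more than n newlines are buffered
        # or the start of the file is reached.
        while pos > 0 and data.count(b"\n") <= n:
            step = min(chunk_size, pos)
            pos -= step
            f.seek(pos)
            data = f.read(step) + data

    lines = data.split(b"\n")
    if lines and lines[-1] == b"":
        lines.pop()  # the file ended with a newline
    return lines[-n:]
```

The caller can decode the returned bytes with whatever encoding the file uses. Recounting newlines on every pass keeps the sketch short; a tuned version would count only the newly read chunk.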

Re: tail

2022-05-12 Thread Marco Sulla

On Thu, 12 May 2022 at 00:50, Stefan Ram wrote: > > Marco Sulla writes: > >def tail(filepath, n=10, chunk_size=100): > >if (n <= 0): > >raise ValueError(_err_n) > ... > > There's no spec/doc, so one can't even test it. Excuse me, you

Re: tail

2022-05-11 Thread Avi Gross via Python-list

numpy/pandas in Python often provide functions with names like head or tail as do other languages where data structures with names like data.frame are commonly used. These structures are in some way indexed to make it easy to jump towards the end. Text files are not. Efficiency aside, a 3-year-old

Re: tail

2022-05-11 Thread Avi Gross via Python-list

Just FYI, UNIX had a bunch of utilities that could emulate a vanilla version of tail on a command line. You can use sed, awk and quite a few others to simply show line N to the end of a file or other variations. Of course the way many things were done back then had less focus on efficiency

Re: tail

2022-05-11 Thread Dennis Lee Bieber

On Thu, 12 May 2022 06:07:18 +1000, Chris Angelico declaimed the following: >I don't understand why this wants to be in the standard library. > Especially as any Linux distribution probably includes the compiled "tail" command, so this would only be of use on Wi

Re: tail

2022-05-11 Thread Chris Angelico

other > tests, and, frankly, I don't want to. I don't want to because I'm > quite sure the implementation is fast, since it reads by chunks and > caches them. I'm not sure it's 100% free of bugs, but the concept is > very simple, since it simply mimics the

Re: tail

2022-05-11 Thread Marco Sulla

I'm quite sure the implementation is fast, since it reads by chunks and caches them. I'm not sure it's 100% free of bugs, but the concept is very simple, since it simply mimics the *nix tail, so it should be reliable. > > > I'd very much like to see a CPython implementation

Re: tail

2022-05-11 Thread Chris Angelico

read method). > > I suppose the function is reliable. File is opened in binary mode and only > b"\n" is searched as line end, as *nix tail (and python readline in binary > mode) do. And bytes are returned. The caller can use them as is or convert > them to a string using the

Re: tail

2022-05-11 Thread Marco Sulla

On Mon, 9 May 2022 at 23:15, Dennis Lee Bieber wrote: > > On Mon, 9 May 2022 21:11:23 +0200, Marco Sulla > declaimed the following: > > >Nevertheless, tail is a fundamental tool in *nix. It's fast and > >reliable. Also the tail command can't handle different

Re: tail

2022-05-09 Thread Alan Bawden

Marco Sulla writes: On Mon, 9 May 2022 at 19:53, Chris Angelico wrote: ... Nevertheless, tail is a fundamental tool in *nix. It's fast and reliable. Also the tail command can't handle different encodings? It definitely can't. It works for UTF-8, and all the ASCII co

Re: tail

2022-05-09 Thread Dennis Lee Bieber

On Mon, 9 May 2022 21:11:23 +0200, Marco Sulla declaimed the following: >Nevertheless, tail is a fundamental tool in *nix. It's fast and >reliable. Also the tail command can't handle different encodings? Based upon https://github.com/coreutils/coreutils/blob/master/src

Re: tail

2022-05-09 Thread Chris Angelico

On Tue, 10 May 2022 at 07:07, Barry wrote: > POSIX tail just prints the bytes to the output that it finds between \n bytes. > At no time does it need to care about encodings as that is a problem solved > by the terminal software. I would not expect utf-16 to work with tail on > l

Re: tail

2022-05-09 Thread Barry

he middle of some character. And there are encodings >>>> where you cannot inspect the data to find a character boundary in the >>>> byte stream. >>> >>> Ooook, now I understand what you and Barry mean. I suppose there's no >>> reliable way to

Re: tail

2022-05-09 Thread Barry

> On 9 May 2022, at 17:41, r...@zedat.fu-berlin.de wrote: > > Barry Scott writes: >> Why use tiny chunks? You can read 4KiB as fast as 100 bytes > > When optimizing code, it helps to be aware of the orders of > magnitude That is true and well known to me; now show how what I said is wrong.

Re: tail

2022-05-09 Thread Chris Angelico

in the middle of some character. And there are encodings > > > > where you cannot inspect the data to find a character boundary in the > > > > byte stream. > > > > > > Ooook, now I understand what you and Barry mean. I suppose there's no > > >

Re: tail

2022-05-09 Thread Marco Sulla

to find a character boundary in the > > > byte stream. > > > > Ooook, now I understand what you and Barry mean. I suppose there's no > > reliable way to tail a big file opened in text mode with a decent > > performance. > > > > Anyway, the previous-previo

Re: tail

2022-05-09 Thread 2QdxY4RzWzUUiLuE

On 2022-05-08 at 18:52:42 +, Stefan Ram wrote: > Remember how recently people here talked about how you cannot copy > text from a video? Then, how did I do it? Turns out, for my > operating system, there's a screen OCR program! So I did this OCR > and then manually corrected a few wro

Re: tail

2022-05-09 Thread Chris Angelico

sized characters. _If_ you did a seek to an arbitrary number > > you can end up in the middle of some character. And there are encodings > > where you cannot inspect the data to find a character boundary in the > > byte stream. > > Ooook, now I understand what you and Barry m

Re: tail

2022-05-09 Thread Marco Sulla

p in the middle of some character. And there are encodings > where you cannot inspect the data to find a character boundary in the > byte stream. Ooook, now I understand what you and Barry mean. I suppose there's no reliable way to tail a big file opened in text mode with a decent performa

Re: tail

2022-05-09 Thread Dennis Lee Bieber

On Sun, 8 May 2022 22:48:32 +0200, Marco Sulla declaimed the following: > >Emh. I re-quote > >seek(offset, whence=SEEK_SET) >Change the stream position to the given byte offset. > >And so on. No mention of differences between text and binary mode. You ignore that, underneath, Python is j

Re: tail

2022-05-09 Thread Greg Ewing

On 9/05/22 7:47 am, Marco Sulla wrote: It will fail if the contents is not ASCII. Why? For some encodings, if you seek to an arbitrary byte position and then read, it may *appear* to succeed but give you complete gibberish. Your method might work for a certain subset of encodings (those that
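
A short demonstration of the failure mode described above, assuming UTF-16-LE and UTF-8 as the example encodings. The helper `decode_slice` is hypothetical and stands in for a seek to an arbitrary byte offset followed by a read:

```python
def decode_slice(data, start, size, encoding):
    """Decode size bytes of data beginning at byte offset start."""
    return data[start:start + size].decode(encoding)


raw = "ab".encode("utf-16-le")          # b'a\x00b\x00'

# Aligned read: the expected text comes back.
print(decode_slice(raw, 0, 2, "utf-16-le"))   # a

# Misaligned read: decoding succeeds but yields an unrelated character.
print(repr(decode_slice(raw, 1, 2, "utf-16-le")))

# UTF-8 at least detects an offset that lands inside a character.
try:
    decode_slice("é".encode("utf-8"), 1, 1, "utf-8")
except UnicodeDecodeError as exc:
    print("UTF-8 rejected the offset:", exc.reason)
```

The misaligned UTF-16 read raises no error at all, which is why such a mistake may appear to succeed.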

Re: tail

2022-05-08 Thread Cameron Simpson

On 08May2022 22:48, Marco Sulla wrote: >On Sun, 8 May 2022 at 22:34, Barry wrote: >> >> In text mode you can only seek to a value return from f.tell() >> >> otherwise the behaviour is undefined. >> > >> > Why? I don't see any recommendation about it in the docs: >> > https://docs.python.org/3/li

Re: tail

2022-05-08 Thread Marco Sulla

On Sun, 8 May 2022 at 22:34, Barry wrote: > > > On 8 May 2022, at 20:48, Marco Sulla wrote: > > > > On Sun, 8 May 2022 at 20:31, Barry Scott wrote: > >> > >>>> On 8 May 2022, at 17:05, Marco Sulla > >>>> wrote: > >>> >

Re: tail

2022-05-08 Thread Barry

> On 8 May 2022, at 20:48, Marco Sulla wrote: > > On Sun, 8 May 2022 at 20:31, Barry Scott wrote: >> >>>> On 8 May 2022, at 17:05, Marco Sulla wrote: >>> >>> def tail(filepath, n=10, newline=None, encoding=None, chunk_size=100): >>>

Re: tail

2022-05-08 Thread Marco Sulla

On Sun, 8 May 2022 at 22:02, Chris Angelico wrote: > > Absolutely not. As has been stated multiple times in this thread, a > fully general approach is extremely complicated, horrifically > unreliable, and hopelessly inefficient. Well, my implementation is quite general now. It's not complicated a

Re: tail

2022-05-08 Thread Chris Angelico

On Mon, 9 May 2022 at 05:49, Marco Sulla wrote: > Anyway, apart from my implementation, I'm curious if you think a tail > method is worth it to be a method of the builtin file objects in > CPython. Absolutely not. As has been stated multiple times in this thread, a fully gener

Re: tail

2022-05-08 Thread Marco Sulla

On Sun, 8 May 2022 at 20:31, Barry Scott wrote: > > > On 8 May 2022, at 17:05, Marco Sulla wrote: > > > > def tail(filepath, n=10, newline=None, encoding=None, chunk_size=100): > >n_chunk_size = n * chunk_size > > Why use tiny chunks? You can read 4KiB as f

Re: tail

2022-05-08 Thread MRAB

On 2022-05-08 19:15, Barry Scott wrote: On 7 May 2022, at 22:31, Chris Angelico wrote: On Sun, 8 May 2022 at 07:19, Stefan Ram wrote: MRAB writes: On 2022-05-07 19:47, Stefan Ram wrote: ... def encoding( name ): path = pathlib.Path( name ) for encoding in( "utf_8", "latin_1", "cp1

Re: tail

2022-05-08 Thread Barry Scott

> On 8 May 2022, at 17:05, Marco Sulla wrote: > > I think I've _almost_ found a simpler, general way: > > import os > > _lf = "\n" > _cr = "\r" > > def tail(filepath, n=10, newline=None, encoding=None, chunk_size=100): >n_chun

Re: tail

2022-05-08 Thread Chris Angelico

On Mon, 9 May 2022 at 04:15, Barry Scott wrote: > > > > > On 7 May 2022, at 22:31, Chris Angelico wrote: > > > > On Sun, 8 May 2022 at 07:19, Stefan Ram wrote: > >> > >> MRAB writes: > >>> On 2022-05-07 19:47, Stefan Ram wrote: > >> ... > def encoding( name ): > path = pathlib.Path(

Re: tail

2022-05-08 Thread Barry Scott

> On 7 May 2022, at 22:31, Chris Angelico wrote: > > On Sun, 8 May 2022 at 07:19, Stefan Ram wrote: >> >> MRAB writes: >>> On 2022-05-07 19:47, Stefan Ram wrote: >> ... def encoding( name ): path = pathlib.Path( name ) for encoding in( "utf_8", "latin_1", "cp1252" ):

Re: tail

2022-05-08 Thread Barry Scott

> On 7 May 2022, at 14:40, Stefan Ram wrote: > > Marco Sulla writes: >> So there's no way to reliably read lines in reverse in text mode using >> seek and read, but the only option is readlines? > > I think, CPython is based on C. I don't know whether > Python's seek function directly call

Re: tail

2022-05-08 Thread Marco Sulla

I think I've _almost_ found a simpler, general way: import os _lf = "\n" _cr = "\r" def tail(filepath, n=10, newline=None, encoding=None, chunk_size=100): n_chunk_size = n * chunk_size pos = os.stat(filepath).st_size chunk_line_pos = -1 lines_not

Re: tail

2022-05-08 Thread Barry

> On 7 May 2022, at 17:29, Marco Sulla wrote: > > On Sat, 7 May 2022 at 16:08, Barry wrote: >> You need to handle the file in bin mode and do the handling of line endings >> and encodings yourself. It’s not that hard for the cases you wanted. > "\n".encode("utf-16") > b'\xff\xfe\n\x00'

Re: tail

2022-05-07 Thread Chris Angelico

On Sun, 8 May 2022 at 07:19, Stefan Ram wrote: > > MRAB writes: > >On 2022-05-07 19:47, Stefan Ram wrote: > ... > >>def encoding( name ): > >>path = pathlib.Path( name ) > >>for encoding in( "utf_8", "latin_1", "cp1252" ): > >>try: > >>with path.open( encoding=encoding

Re: tail

2022-05-07 Thread Chris Angelico

On Sun, 8 May 2022 at 04:37, Marco Sulla wrote: > > On Sat, 7 May 2022 at 19:02, MRAB wrote: > > > > On 2022-05-07 17:28, Marco Sulla wrote: > > > On Sat, 7 May 2022 at 16:08, Barry wrote: > > >> You need to handle the file in bin mode and do the handling of line > > >> endings and encodings yo

Re: tail

2022-05-07 Thread MRAB

On 2022-05-07 19:47, Stefan Ram wrote: Marco Sulla writes: Well, ok, but I need a generic method to get LF and CR for any encoding an user can input. "LF" and "CR" come from US-ASCII. It is theoretically possible that there might be some encodings out there (not for Unicode) that are

Re: tail

2022-05-07 Thread MRAB

On 2022-05-07 19:35, Marco Sulla wrote: On Sat, 7 May 2022 at 19:02, MRAB wrote: > > On 2022-05-07 17:28, Marco Sulla wrote: > > On Sat, 7 May 2022 at 16:08, Barry wrote: > >> You need to handle the file in bin mode and do the handling of line endings and encodings yourself. It’s not that hard

Re: tail

2022-05-07 Thread Dennis Lee Bieber

On Sat, 7 May 2022 20:35:34 +0200, Marco Sulla declaimed the following: >Well, ok, but I need a generic method to get LF and CR for any >encoding an user can input. Other than EBCDIC, <LF> and <CR> AS BYTES should appear as x0A and x0D in any of the 8-bit encodings (ASCII, ISO-8859-x, CP, UT

Re: tail

2022-05-07 Thread Marco Sulla

On Sat, 7 May 2022 at 19:02, MRAB wrote: > > On 2022-05-07 17:28, Marco Sulla wrote: > > On Sat, 7 May 2022 at 16:08, Barry wrote: > >> You need to handle the file in bin mode and do the handling of line > >> endings and encodings yourself. It’s not that hard for the cases you > >> wanted. > >

Re: tail

2022-05-07 Thread MRAB

On 2022-05-07 17:28, Marco Sulla wrote: On Sat, 7 May 2022 at 16:08, Barry wrote: You need to handle the file in bin mode and do the handling of line endings and encodings yourself. It’s not that hard for the cases you wanted. "\n".encode("utf-16") b'\xff\xfe\n\x00' "".encode("utf-16") b

Re: tail

2022-05-07 Thread Dan Stromberg

I believe I'd do something like: #!/usr/local/cpython-3.10/bin/python3 """ Output the last 10 lines of a potentially-huge file. O(n). But technically so is scanning backward from the EOF. It'd be faster to use a dict, but this has the advantage of working for huge num_lines. """ import d

Re: tail

2022-05-07 Thread Marco Sulla

On Sat, 7 May 2022 at 16:08, Barry wrote: > You need to handle the file in bin mode and do the handling of line endings > and encodings yourself. It’s not that hard for the cases you wanted. >>> "\n".encode("utf-16") b'\xff\xfe\n\x00' >>> "".encode("utf-16") b'\xff\xfe' >>> "a\nb".encode("utf-16
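
The interpreter output quoted above shows the generic `utf-16` codec prepending a byte-order mark. A small sketch, with the hypothetical helper `newline_bytes`, of how the byte pattern for a line feed differs between codecs, and how naming the byte order explicitly avoids the mark:

```python
import codecs


def newline_bytes(encoding):
    """Return the bytes that encode a single line feed in the given codec."""
    return "\n".encode(encoding)


for name in ("utf-8", "latin-1", "utf-16-le", "utf-16-be", "utf-32-le"):
    print(name, newline_bytes(name))

# The generic codec adds a BOM in front of the encoded character.
with_bom = newline_bytes("utf-16")
print(with_bom[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE))  # True
```

A backward scan in binary mode would have to search for the codec-specific pattern, and at the right alignment, rather than for a single `b"\n"` byte.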

Re: tail

2022-05-07 Thread Barry

> On 7 May 2022, at 14:24, Marco Sulla wrote: > > On Sat, 7 May 2022 at 01:03, Dennis Lee Bieber wrote: >> >>Windows also uses <CR><LF> for the EOL marker, but Python's I/O system >> condenses that to just <LF> internally (for TEXT mode) -- so using the >> length of a string so read to compute a

Re: tail

2022-05-07 Thread Avi Gross via Python-list

general purpose tool, internationalization from ASCII has created a challenge for lots of such tools. -Original Message- From: Marco Sulla To: Dennis Lee Bieber Cc: python-list@python.org Sent: Sat, May 7, 2022 9:21 am Subject: Re: tail On Sat, 7 May 2022 at 01:03, Dennis Lee Bieber wrote: > >

Re: tail

2022-05-07 Thread Marco Sulla

On Sat, 7 May 2022 at 01:03, Dennis Lee Bieber wrote: > > Windows also uses <CR><LF> for the EOL marker, but Python's I/O system > condenses that to just <LF> internally (for TEXT mode) -- so using the > length of a string so read to compute a file position may be off-by-one for > each EOL in the stri

Re: tail

2022-05-06 Thread Dennis Lee Bieber

On Fri, 6 May 2022 21:19:48 +0100, MRAB declaimed the following: >Is the file UTF-8? That's a variable-width encoding, so are any of the >characters > U+007F? > >Which OS? On Windows, it's common/normal for UTF-8 files to start with a >BOM/signature, which is 3 bytes/1 codepoint. Windo

Re: tail

2022-05-06 Thread MRAB

On 2022-05-06 20:21, Marco Sulla wrote: I have a little problem. I tried to extend the tail function, so it can read lines from the bottom of a file object opened in text mode. The problem is it does not work. It gets a starting position that is lower than the expected by 3 characters. So the

Re: tail

2022-05-06 Thread Marco Sulla

I have a little problem. I tried to extend the tail function, so it can read lines from the bottom of a file object opened in text mode. The problem is it does not work. It gets a starting position that is lower than the expected by 3 characters. So the first line is read only for 2 chars, and

Re: tail

2022-05-02 Thread Marco Sulla

On Mon, 2 May 2022 at 00:20, Cameron Simpson wrote: > > On 01May2022 18:55, Marco Sulla wrote: > >Something like this is OK? > [...] > >def tail(f): > >chunk_size = 100 > >size = os.stat(f.fileno()).st_size > > I think you want

Re: tail

2022-05-02 Thread Marco Sulla

Ok, I suppose \n and \r are enough: readline(size=-1, /) Read and return one line from the stream. If size is specified, at most size bytes will be read. The line terminator is always b'\n' for binary files; for text files, the newline argument to open() can be used to select the line

Re: tail

2022-05-02 Thread Chris Angelico

On Tue, 3 May 2022 at 04:38, Marco Sulla wrote: > > On Mon, 2 May 2022 at 18:31, Stefan Ram wrote: > > > > |The Unicode standard defines a number of characters that > > |conforming applications should recognize as line terminators:[7] > > | > > |LF:Line Feed, U+000A > > |VT:Vertical Tab,

Re: tail

2022-05-02 Thread Marco Sulla

On Mon, 2 May 2022 at 18:31, Stefan Ram wrote: > > |The Unicode standard defines a number of characters that > |conforming applications should recognize as line terminators:[7] > | > |LF:Line Feed, U+000A > |VT:Vertical Tab, U+000B > |FF:Form Feed, U+000C > |CR:Carriage Return, U+0

Re: tail

2022-05-01 Thread Chris Angelico

On Mon, 2 May 2022 at 11:54, Cameron Simpson wrote: > > On 01May2022 23:30, Stefan Ram wrote: > >Dan Stromberg writes: > >>But what about Unicode? Are all 10 bytes newlines in Unicode encodings? > > It seems in UTF-8, when a value is above U+007F, it will be > > encoded with bytes that always

Re: tail

2022-05-01 Thread Cameron Simpson

On 01May2022 23:30, Stefan Ram wrote: >Dan Stromberg writes: >>But what about Unicode? Are all 10 bytes newlines in Unicode encodings? > It seems in UTF-8, when a value is above U+007F, it will be > encoded with bytes that always have their high bit set. Aye. Design feature enabling easy resy

Re: tail

2022-05-01 Thread Chris Angelico

On Mon, 2 May 2022 at 09:19, Dan Stromberg wrote: > > On Sun, May 1, 2022 at 3:19 PM Cameron Simpson wrote: > > > On 01May2022 18:55, Marco Sulla wrote: > > >Something like this is OK? > > > > Scanning backward for a byte == 10 in ASCII or ISO-8859 seems fine. > > But what about Unicode? Are al

Re: tail

2022-05-01 Thread Dan Stromberg

On Sun, May 1, 2022 at 3:19 PM Cameron Simpson wrote: > On 01May2022 18:55, Marco Sulla wrote: > >Something like this is OK? > Scanning backward for a byte == 10 in ASCII or ISO-8859 seems fine. But what about Unicode? Are all 10 bytes newlines in Unicode encodings? If not, and you have a hu
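
The question of whether every byte with value 10 is a newline can be tested directly. In this sketch the helper `has_stray_newline_byte` is hypothetical; it reports whether a character other than a line feed contains the byte 0x0A in its encoded form:

```python
def has_stray_newline_byte(ch, encoding):
    """Return True if ch is not a line feed but its encoding contains 0x0A."""
    return ch != "\n" and b"\n" in ch.encode(encoding)


sample = "\u010a"  # LATIN CAPITAL LETTER C WITH DOT ABOVE

print(has_stray_newline_byte(sample, "utf-8"))      # False
print(has_stray_newline_byte(sample, "utf-16-le"))  # True
print(has_stray_newline_byte(sample, "utf-32-le"))  # True
```

In UTF-8 every byte of a non-ASCII character has its high bit set, so a byte equal to 10 there is always a line feed. In UTF-16 and UTF-32 the same byte value can be half of an unrelated character, so a byte-wise backward scan is only safe for ASCII-compatible encodings.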

Re: tail

2022-05-01 Thread Cameron Simpson

On 01May2022 18:55, Marco Sulla wrote: >Something like this is OK? [...] >def tail(f): >chunk_size = 100 >size = os.stat(f.fileno()).st_size I think you want os.fstat(). >positions = iter(range(size, -1, -chunk_size)) >next(positions) I was wondering about

Re: tail

2022-05-01 Thread Marco Sulla

Something like this is OK? import os def tail(f): chunk_size = 100 size = os.stat(f.fileno()).st_size positions = iter(range(size, -1, -chunk_size)) next(positions) chunk_line_pos = -1 pos = 0 for pos in positions: f.seek(pos) chars = f.read


1 - 100 of 513 matches
