> On 19 Mar 2022, at 03:07, Greg Ewing <greg....@canterbury.ac.nz> wrote: > > On 19/03/22 9:40 am, Rathmann wrote: >> The other challenge/question would be to see if there is a way to implement >> this basic approach with lower overhead. > > My original implementation of yield-from didn't have this problem, > because it didn't enter any of the intermediate frames -- it just > ran down a chain of C pointers to find the currently active one. > > At some point this was changed, I think so that all the frames > would show up in the traceback in the event of an exception. > This was evidently seen as more important than having efficient > resumption of nested generators.
So that sounds like the original CPython implementation had an O(depth) component, but with a lower constant factor than the current version? (Of course, since Python limits recursion depth, speaking of O(n) is not quite right.) Prioritizing stack traces over performance seems consistent with the well known blog post on tail recursion elimination. This is one of the reasons I was motivated to try to attack this at the library/application level -- when the programmer invokes a transformation like I am proposing, they know that it is going to affect the stack trace. I am still hoping to find a way to implement the transformed function/driver combination with lower overhead. This is a little tricky since the performance of the approaches is very dependent on the Python implementation. Below are some timings for 2.7 and 3.8 CPython and PyPy. For 2.7, for the untransformed version I used a loop of yield statements, and timed with time.clock(), for 3.8, it is "yield from" and time.perf_counter(). All times are for traversing a balanced tree of 8 million nodes on X86_64 Linux. 3.8.10 (default, Nov 26 2021, 20:14:08) [GCC 9.3.0] For depth= 22 nodes= 8388607 iters= 1 Elapsed time (yield from): 6.840768865999053 driver: 6.702329244999419 3.6.9 (7.3.1+dfsg-4, Apr 22 2020, 05:15:29) [PyPy 7.3.1 with GCC 9.3.0] For depth= 22 nodes= 8388607 iters= 1 Elapsed time (yield from): 4.037276787999872 driver: 2.3724582960003318 2.7.18 (default, Mar 8 2021, 13:02:45) [GCC 9.3.0] For depth= 22 nodes= 8388607 iters= 1 Elapsed time (yield loop): 8.863244 driver: 13.788312 2.7.13 (7.3.1+dfsg-2, Apr 21 2020, 05:05:41) [PyPy 7.3.1 with GCC 9.3.0] For depth= 22 nodes= 8388607 iters= 1 Elapsed time (yield loop): 8.802712562 driver: 2.034388533 The most extreme variation was 2.7 pypy, where the driver was 4 times faster. (Of course, 2.7 is less used these days.) -- https://mail.python.org/mailman/listinfo/python-list