Re: [Python-ideas] PEP draft: context variables
On 13 October 2017 at 10:56, Guido van Rossum wrote:
> I'm out of energy to debate every point (Steve said it well -- that
> decimal/generator example is too contrived), but I found one nit in Nick's
> email that I wish to correct.
>
> On Wed, Oct 11, 2017 at 1:28 AM, Nick Coghlan wrote:
>>
>> As a less-contrived example, consider context managers implemented as
>> generators.
>>
>> We want those to run with the execution context that's active when
>> they're used in a with statement, not the one that's active when they're
>> created (the fact that generator-based context managers can only be used
>> once mitigates the risk of creation time context capture causing problems,
>> but the implications would still be weird enough to be worth avoiding).
>>
>
> Here I think we're in agreement about the desired semantics, but IMO all
> this requires is some special casing for @contextlib.contextmanager. To me
> this is the exception, not the rule -- in most *other* places I would want
> the yield to switch away from the caller's context.
>
>> For native coroutines, we want them to run with the execution context
>> that's active when they're awaited or when they're prepared for submission
>> to an event loop, not the one that's active when they're created.
>>
>
> This caught my eye as wrong. Considering that asyncio's tasks (as well as
> curio's and trio's) *are* native coroutines, we want complete isolation
> between the context active when `await` is called and the context active
> inside the `async def` function.
The rationale for this behaviour *does* arise from a refactoring argument:

    async def original_async_function():
        with some_context():
            do_some_setup()
            raw_data = await some_operation()
            data = do_some_postprocessing(raw_data)

Refactored:

    async def async_helper_function():
        do_some_setup()
        raw_data = await some_operation()
        return do_some_postprocessing(raw_data)

    async def refactored_async_function():
        with some_context():
            data = await async_helper_function()

However, considering that coroutines are almost always instantiated at the
point where they're awaited, I do concede that creation time context capture
would likely also work out OK for the coroutine case, which would leave
contextlib.contextmanager as the only special case (and it would turn off
both creation-time context capture *and* context isolation).

Cheers,
Nick.

--
Nick Coghlan | [email protected] | Brisbane, Australia
___
Python-ideas mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Add time.time_ns(): system clock with nanosecond resolution
Hi,

I would like to add new functions to return time as a number of
nanoseconds (Python int), especially time.time_ns(). It would enhance
the time.time() clock resolution. In my experience, it decreases the
minimum non-zero delta between two clock reads by about 3 times, new
"ns" clock versus current clock: 84 ns (2.8x better) vs 239 ns on
Linux, and 318 us (2.8x better) vs 894 us on Windows, measured in
Python.

The question of this long email is whether it's worth it to add more
"_ns" time functions than just time.time_ns(). I would like to add:

* time.time_ns()
* time.monotonic_ns()
* time.perf_counter_ns()
* time.clock_gettime_ns()
* time.clock_settime_ns()

time(), monotonic() and perf_counter() are the 3 most common clocks
and users use them to get the best available clock resolution.
clock_gettime/settime() are the generic UNIX API to access these
clocks and so should also be enhanced to get nanosecond resolution.

== Nanosecond resolution ==

More and more clocks have a frequency in MHz, up to GHz for the "TSC"
CPU clock, and so clock resolutions are getting closer to 1 nanosecond
(or even better than 1 ns for the TSC clock!).

The problem is that Python returns time as a floating point number
which is usually a 64-bit binary floating point number (in the IEEE
754 format). This type starts to lose nanoseconds after 104 days.

Conversion from nanoseconds (int) to seconds (float) and then back to
nanoseconds (int) to check if the conversions lose precision:

    # no precision loss
    >>> x = 2**52 + 1; int(float(x * 1e-9) * 1e9) - x
    0

    # precision loss! (1 nanosecond)
    >>> x = 2**53 + 1; int(float(x * 1e-9) * 1e9) - x
    -1

    >>> print(datetime.timedelta(seconds=2**53 / 1e9))
    104 days, 5:59:59.254741

While a system administrator can be proud to have an uptime longer
than 104 days, the problem also exists for the time.time() clock which
returns the number of seconds since the UNIX epoch (1970-01-01).
This clock started to lose nanoseconds in mid-April 1970 (47 years ago):

    >>> import datetime
    >>> print(datetime.datetime(1970, 1, 1) + datetime.timedelta(seconds=2**53 / 1e9))
    1970-04-15 05:59:59.254741

== PEP 410 ==

Five years ago, I proposed a large and complex change to all Python
functions returning time, to support nanosecond resolution using the
decimal.Decimal type:

    https://www.python.org/dev/peps/pep-0410/

The PEP was rejected for different reasons:

* it wasn't clear if hardware clocks really had a resolution of 1
  nanosecond, especially when the clock is read from Python, since
  reading a clock in Python also takes time...

* Guido van Rossum rejected the idea of adding a new optional
  parameter to change the result type: it's an uncommon programming
  practice (bad design in Python)

* decimal.Decimal is not widely used; users might be surprised to get
  such a type

== CPython enhancements of the last 5 years ==

Since this PEP was rejected:

* os.stat_result got 3 fields for timestamps as nanoseconds (Python
  int): st_atime_ns, st_ctime_ns, st_mtime_ns

* Python 3.3 got 3 new clocks: time.monotonic(), time.perf_counter()
  and time.process_time()

* I enhanced the private C API of Python handling time (the API called
  "pytime") to store all timings as the new _PyTime_t type, which is a
  simple 64-bit signed integer. The unit of _PyTime_t is not part of
  the API; it's an implementation detail. The unit is currently 1
  nanosecond.

This week, I converted one of the last clocks to the new _PyTime_t
format: time.perf_counter() now internally has a resolution of 1
nanosecond, instead of using the C double type. XXX technically
https://github.com/python/cpython/pull/3983 is not merged yet :-)

== Clocks resolution in Python ==

I implemented time.time_ns(), time.monotonic_ns() and
time.perf_counter_ns(), which are similar to the functions without the
"_ns" suffix, but return time as nanoseconds (Python int).
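The 2**53 boundary quoted above can be checked directly; this is a small
self-contained demonstration of the same round-trip arithmetic as the
doctest session above:

```python
import datetime

def roundtrip_error(ns):
    # nanoseconds (int) -> seconds (float) -> nanoseconds (int)
    return int(float(ns * 1e-9) * 1e9) - ns

# An IEEE 754 double has a 53-bit significand, so integers above 2**53
# can no longer all be represented exactly.
assert roundtrip_error(2**52 + 1) == 0   # no precision loss
assert roundtrip_error(2**53 + 1) == -1  # 1 nanosecond lost

# 2**53 nanoseconds is roughly 104 days:
print(datetime.timedelta(seconds=2**53 / 1e9))
```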
I computed the smallest difference between two clock reads (ignoring
differences of zero):

Linux:

* time_ns(): 84 ns <=== !!!
* time(): 239 ns <=== !!!
* perf_counter_ns(): 84 ns
* perf_counter(): 82 ns
* monotonic_ns(): 84 ns
* monotonic(): 81 ns

Windows:

* time_ns(): 318000 ns <=== !!!
* time(): 894070 ns <=== !!!
* perf_counter_ns(): 100 ns
* perf_counter(): 100 ns
* monotonic_ns(): 1500 ns
* monotonic(): 1500 ns

The difference on time.time() is significant: 84 ns (2.8x better) vs
239 ns on Linux and 318 us (2.8x better) vs 894 us on Windows. The
difference will only get larger over the coming years, since every day
adds 86,400,000,000,000 nanoseconds to the system clock :-) (please
don't bug me with leap seconds! you got my point)

The differences on the perf_counter and monotonic clocks are not
visible in this quick script, since my script runs for less than 1
minute, my computer's uptime is less than 1 week, ... and Python
internally starts these clocks at zero *to reduce the precision
loss*! Using an uptime larger than 104 days, you would probably see a
significant difference (at least +/- 1 nanosecond) between the regular
(seconds a
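The numbers above come from measuring the smallest non-zero delta between
two consecutive clock reads. A loop along these lines reproduces the
measurement for the existing float clocks (a sketch only; Victor's actual
script may differ, and the proposed *_ns functions are not yet available):

```python
import math
import time

def min_delta(clock, samples=100_000):
    """Smallest non-zero difference between two consecutive clock reads."""
    best = math.inf
    for _ in range(samples):
        t1 = clock()
        t2 = clock()
        delta = t2 - t1
        if 0 < delta < best:
            best = delta
    return best

# Float clocks return seconds; scale to nanoseconds for comparison.
print("time():         %.0f ns" % (min_delta(time.time) * 1e9))
print("perf_counter(): %.0f ns" % (min_delta(time.perf_counter) * 1e9))
```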
Re: [Python-ideas] Add time.time_ns(): system clock with nanosecond resolution
Victor Stinner schrieb am 13.10.2017 um 16:12:
> I would like to add new functions to return time as a number of
> nanosecond (Python int), especially time.time_ns().

I might have missed it while skipping through your post, but could you
quickly explain why improving the precision of time.time() itself
wouldn't help already? Would double FP precision not be accurate enough
here?

Stefan
Re: [Python-ideas] Add time.time_ns(): system clock with nanosecond resolution
2017-10-13 16:57 GMT+02:00 Stefan Behnel:
> I might have missed it while skipping through your post, but could you
> quickly explain why improving the precision of time.time() itself wouldn't
> help already? Would double FP precision not be accurate enough here?
80-bit binary floats ("long double") are not portable. Since SSE, Intel
CPUs don't use them anymore, no?
Modifying the Python float type would be a large change.
Victor
Re: [Python-ideas] Add time.time_ns(): system clock with nanosecond resolution
On Fri, 13 Oct 2017 16:57:28 +0200 Stefan Behnel wrote:
> Victor Stinner schrieb am 13.10.2017 um 16:12:
> > I would like to add new functions to return time as a number of
> > nanosecond (Python int), especially time.time_ns().
>
> I might have missed it while skipping through your post, but could you
> quickly explain why improving the precision of time.time() itself wouldn't
> help already? Would double FP precision not be accurate enough here?

To quote Victor's message:

« The problem is that Python returns time as a floating point number
which is usually a 64-bit binary floating point number (in the IEEE 754
format). This type starts to lose nanoseconds after 104 days. »

Regards

Antoine.
Re: [Python-ideas] PEP draft: context variables
On Fri, Oct 13, 2017 at 3:25 AM, Nick Coghlan wrote:
[..]
> The rationale for this behaviour *does* arise from a refactoring argument:
>
>    async def original_async_function():
>        with some_context():
>            do_some_setup()
>            raw_data = await some_operation()
>            data = do_some_postprocessing(raw_data)
>
> Refactored:
>
>    async def async_helper_function():
>        do_some_setup()
>        raw_data = await some_operation()
>        return do_some_postprocessing(raw_data)
>
>    async def refactored_async_function():
>        with some_context():
>            data = await async_helper_function()
>
> However, considering that coroutines are almost always instantiated at the
> point where they're awaited,

"almost always" is an incorrect assumption. "usually" would be the
correct one.

> I do concede that creation time context capture
> would likely also work out OK for the coroutine case, which would leave
> contextlib.contextmanager as the only special case (and it would turn off
> both creation-time context capture *and* context isolation).

I still believe that both versions of PEP 550 (v1 & latest) got this right:

* Coroutines on their own don't capture context;
* Tasks manage context for coroutines they wrap.

Yury
[Python-ideas] (PEP 555 subtopic) Propagation of context in async code
This is a continuation of the PEP 555 discussion in
https://mail.python.org/pipermail/python-ideas/2017-September/046916.html
And this month in
https://mail.python.org/pipermail/python-ideas/2017-October/047279.html
If you are new to the discussion, the best point to start reading this
might be at my second full paragraph below ("The status quo...").
On Fri, Oct 13, 2017 at 10:25 AM, Nick Coghlan wrote:
> On 13 October 2017 at 10:56, Guido van Rossum wrote:
>
>> I'm out of energy to debate every point (Steve said it well -- that
>> decimal/generator example is too contrived), but I found one nit in Nick's
>> email that I wish to correct.
>>
>> On Wed, Oct 11, 2017 at 1:28 AM, Nick Coghlan wrote:
>>>
>>> As a less-contrived example, consider context managers implemented as
>>> generators.
>>>
>>> We want those to run with the execution context that's active when
>>> they're used in a with statement, not the one that's active when they're
>>> created (the fact that generator-based context managers can only be used
>>> once mitigates the risk of creation time context capture causing problems,
>>> but the implications would still be weird enough to be worth avoiding).
>>>
>>
>> Here I think we're in agreement about the desired semantics, but IMO all
>> this requires is some special casing for @contextlib.contextmanager. To me
>> this is the exception, not the rule -- in most *other* places I would want
>> the yield to switch away from the caller's context.
>>
>>
>>> For native coroutines, we want them to run with the execution context
>>> that's active when they're awaited or when they're prepared for submission
>>> to an event loop, not the one that's active when they're created.
>>>
>>
>> This caught my eye as wrong. Considering that asyncio's tasks (as well as
>> curio's and trio's) *are* native coroutines, we want complete isolation
>> between the context active when `await` is called and the context active
>> inside the `async def` function.
>>
>
> The rationale for this behaviour *does* arise from a refactoring argument:
>
>async def original_async_function():
> with some_context():
> do_some_setup()
> raw_data = await some_operation()
> data = do_some_postprocessing(raw_data)
>
> Refactored:
>
>async def async_helper_function():
> do_some_setup()
> raw_data = await some_operation()
> return do_some_postprocessing(raw_data)
>
>async def refactored_async_function():
> with some_context():
> data = await async_helper_function()
>
>
*This* type of refactoring argument I *do* subscribe to.
> However, considering that coroutines are almost always instantiated at the
> point where they're awaited, I do concede that creation time context
> capture would likely also work out OK for the coroutine case, which would
> leave contextlib.contextmanager as the only special case (and it would turn
> off both creation-time context capture *and* context isolation).
>
The difference between context propagation through coroutine function
calls and awaits comes up when you need help from "the" event loop, which
means things like creating new tasks from coroutines. However, we cannot
even assume that the loop is the only one. So far, it makes no difference
where you call the coroutine function. It is only when you await it or
schedule it for execution in a loop that something can actually happen.
The status quo is that there's nothing that prevents you from calling a
coroutine function from within one event loop and then awaiting it in
another. So if we want an event loop to be able to pass information down
the call chain in such a way that the information is available *throughout
the whole task that it is driving*, then the context needs to at least
propagate through `await`s.
This was my starting point 2.5 years ago, when Yury was drafting this
status quo (PEP 492). It looked a lot like PEP 492 was inevitable, but
that there would be a problem, where each API that uses "blocking IO"
somewhere under the hood would need a duplicate version for asyncio (and
one for each third-party async framework!). I felt it was necessary to
think about a solution before PEP 492 was accepted, and this became a
fairly short-lived
thread here on python-ideas:
https://mail.python.org/pipermail/python-ideas/2015-May/033267.html
This year, the discussion on Yury's PEP 550 somehow ended up with a very
similar need before I got involved, apparently for independent reasons.
A design for solving this need (and others) is also found in my first draft
of PEP 555, found at
https://mail.python.org/pipermail/python-ideas/2017-September/046916.html
Essentially, it's a way of *passing information down the call chain* when
it's inconvenient or impossible to pass the information as normal function
arguments. I now call the concept "context arguments".
More recently, I put some focus on the direct needs of normal users (as
opposed direct nee
Re: [Python-ideas] (PEP 555 subtopic) Propagation of context in async code
On Fri, Oct 13, 2017 at 11:49 AM, Koos Zevenhoven wrote:
[..]
> This was my starting point 2.5 years ago, when Yury was drafting this
> status quo (PEP 492). It looked a lot like PEP 492 was inevitable, but
> that there would be a problem, where each API that uses "blocking IO"
> somewhere under the hood would need a duplicate version for asyncio
> (and one for each third-party async framework!). I felt it was
> necessary to think about a solution before PEP 492 was accepted, and
> this became a fairly short-lived thread here on python-ideas:

Well, it's obvious why the thread was "short-lived". Don't mix
non-blocking and blocking code and don't nest asyncio loops. But I
believe this new subtopic is a distraction. You should start a new
thread on Python-ideas if you want to discuss the acceptance of PEP 492
2.5 years ago.

[..]
> The bigger question is, what should happen when a coroutine awaits on
> another coroutine directly, without giving the framework a chance to
> interfere:
>
>     async def inner():
>         do_context_aware_stuff()
>
>     async def outer():
>         with first_context():
>             coro = inner()
>
>         with second_context():
>             await coro
>
> The big question is: In the above, which context should the coroutine
> be run in?

The real big question is how people usually write code. And the answer
is that they *don't write it like that* at all. Many context managers in
many frameworks (aiohttp, tornado, and even asyncio) require you to wrap
your await expressions in them. Not coroutine instantiation.

A more important point is that existing context solutions for async
frameworks can only support a with statement around an await expression.
And people that use such solutions know that 'with ...: coro = inner()'
isn't going to work at all.

Therefore wrapping coroutine instantiation in a 'with' statement is not
a pattern. It can only become a pattern, if whatever execution context
PEP accepted in Python 3.7 encouraged people to use it.

[..]
> Both of these would have their own stack of (argument, value) assignment
> pairs, explained in the implementation part of the first PEP 555 draft.
> While this is a complication, the performance overhead of these is so
> small, that doubling the overhead should not be a performance concern.

Please stop handwaving performance. Using big O notation:

PEP 555, worst complexity for uncached lookup: O(N), where 'N' is the
total number of all context values for all context keys for the current
frame stack. For a recursive function you can easily have a situation
where cache is invalidated often, and code starts to run slower and
slower.

PEP 550 v1, worst complexity for uncached lookup: O(1), see [1].

PEP 550 v2+, worst complexity for uncached lookup: O(k), where 'k' is
the number of nested generators for the current frame. Usually k=1.

While caching will mitigate PEP 555's bad performance characteristics in
*tight loops*, the performance of the uncached path must not be ignored.

Yury

[1] https://www.python.org/dev/peps/pep-0550/#appendix-hamt-performance-analysis
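For readers following this thread later: the inner/outer question being
debated can be checked concretely against the contextvars module that
Python 3.7 eventually shipped (PEP 567, which postdates this thread). A
directly awaited coroutine runs in the context active at await time, not
at creation time. A minimal sketch, with illustrative names:

```python
import asyncio
import contextvars

var = contextvars.ContextVar("var", default="unset")

async def inner():
    # Sees whatever context is active when the coroutine is *awaited*.
    return var.get()

async def outer():
    var.set("first")
    coro = inner()      # coroutine object created while var == "first"
    var.set("second")
    return await coro   # awaited while var == "second"

print(asyncio.run(outer()))  # -> second
```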
Re: [Python-ideas] PEP draft: context variables
On Fri, Oct 13, 2017 at 3:25 AM, Nick Coghlan wrote:
[..]
> However, considering that coroutines are almost always instantiated at the
> point where they're awaited, I do concede that creation time context capture
> would likely also work out OK for the coroutine case, which would leave
> contextlib.contextmanager as the only special case (and it would turn off
> both creation-time context capture *and* context isolation).

Actually, capturing context at the moment of coroutine creation (in PEP
550 v1 semantics) will not work at all. Async context managers will
break.

    class AC:
        async def __aenter__(self):
            pass

^ If the context is captured when coroutines are instantiated,
__aenter__ won't be able to set context variables and thus affect the
code it wraps. That's why coroutines shouldn't capture context when
created, nor should they isolate context. It's the job of the async
Task.

Yury
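The behaviour Yury describes is what the contextvars module in Python
3.7 (PEP 567, later than this thread) ended up implementing: because
awaiting `__aenter__()` runs it in the caller's context rather than a
context captured at creation time, the values it sets are visible to the
wrapped code. A sketch, with a hypothetical context manager name:

```python
import asyncio
import contextvars

timeout = contextvars.ContextVar("timeout", default=None)

class WithTimeout:
    # Hypothetical async context manager whose __aenter__ sets a
    # context variable. This works only because awaiting __aenter__()
    # runs it in the caller's context, not one captured at creation.
    async def __aenter__(self):
        timeout.set(10)
        return self

    async def __aexit__(self, *exc):
        timeout.set(None)

async def main():
    async with WithTimeout():
        return timeout.get()  # the wrapped code sees the value

print(asyncio.run(main()))  # -> 10
```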
Re: [Python-ideas] PEP draft: context variables
On 13Oct2017 0941, Yury Selivanov wrote:
> On Fri, Oct 13, 2017 at 3:25 AM, Nick Coghlan wrote:
> [..]
>> However, considering that coroutines are almost always instantiated
>> at the point where they're awaited, I do concede that creation time
>> context capture would likely also work out OK for the coroutine case,
>> which would leave contextlib.contextmanager as the only special case
>> (and it would turn off both creation-time context capture *and*
>> context isolation).
>
> Actually, capturing context at the moment of coroutine creation (in
> PEP 550 v1 semantics) will not work at all. Async context managers
> will break.
>
>     class AC:
>         async def __aenter__(self):
>             pass
>
> ^ If the context is captured when coroutines are instantiated,
> __aenter__ won't be able to set context variables and thus affect the
> code it wraps. That's why coroutines shouldn't capture context when
> created, nor they should isolate context. It's a job of async Task.

Then make __aenter__/__aexit__ when called by "async with" an exception
to the normal semantics?

It seems simpler to have one specially named and specially called
function be special, rather than make the semantics more complicated for
all functions.

Cheers,
Steve
Re: [Python-ideas] PEP draft: context variables
On Fri, Oct 13, 2017 at 2:07 AM, Stefan Krah wrote:
[..]
> So the decimal examples can be helpful for understanding, but (except
> for the performance issues) shouldn't be the centerpiece of the
> discussion.
>
> Speaking of performance, I haven't seen that addressed in Koos' PEP at
> all. Perhaps I missed something.

Indeed I do mention performance here and there in the PEP 555 draft.
Lookups can be made fast and O(1) in most cases. Even with the simplest
unoptimized implementation, the worst-case lookup complexity would be
O(n), where n is the number of assignment contexts entered after the one
which is being looked up from (or in other words, nested inside the one
that is being looked up from). This means that for use cases where the
relevant context is entered as the innermost context level, the lookups
are O(1) even without any optimizations.

It is perfectly reasonable to make an implementation where lookups are
*always* O(1). Still, it might make more sense to implement a half-way
solution with "often O(1)", because that has somewhat less overhead in
case the callees end up not doing any lookups.

For synchronous code that does not use context arguments and that does
not involve generators, there is absolutely *zero* overhead. For code
that uses generators, but does not use context arguments, there is
virtually no overhead either. I explain this in terms of C code in

https://mail.python.org/pipermail/python-ideas/2017-October/047292.html

In fact, I might want to add another Py_INCREF and Py_DECREF per each
call to next/send, because the hack to defer (and often avoid) the
Py_INCREF of the outer stack would not be worth the performance gain.
But that's it.

––Koos

--
+ Koos Zevenhoven + http://twitter.com/k7hoven +
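The "stack of (argument, value) assignment pairs" idea that Koos keeps
referring to can be illustrated with a toy model (purely illustrative,
not the PEP 555 reference implementation; all names are made up):

```python
# Toy model of PEP 555 "context arguments": a single stack of
# (key, value) assignment pairs, searched from the innermost end.
_stack = []

class Assignment:
    def __init__(self, key, value):
        self._pair = (key, value)

    def __enter__(self):
        _stack.append(self._pair)
        return self

    def __exit__(self, *exc):
        _stack.pop()

def lookup(key, default=None):
    # O(n) in the number of assignments nested *inside* the one being
    # looked up; O(1) when the key was assigned innermost.
    for k, v in reversed(_stack):
        if k == key:
            return v
    return default

with Assignment("decimal_precision", 42):
    with Assignment("locale", "en_US"):
        assert lookup("decimal_precision") == 42  # one step outward
        assert lookup("locale") == "en_US"        # innermost: first hit
assert lookup("decimal_precision") is None        # unwound on exit
```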
Re: [Python-ideas] PEP draft: context variables
On 10/13/2017 09:48 AM, Steve Dower wrote:
> On 13Oct2017 0941, Yury Selivanov wrote:
>> Actually, capturing context at the moment of coroutine creation (in
>> PEP 550 v1 semantics) will not work at all. Async context managers
>> will break.
>>
>>     class AC:
>>         async def __aenter__(self):
>>             pass
>>
>> ^ If the context is captured when coroutines are instantiated,
>> __aenter__ won't be able to set context variables and thus affect the
>> code it wraps. That's why coroutines shouldn't capture context when
>> created, nor they should isolate context. It's a job of async Task.
>
> Then make __aenter__/__aexit__ when called by "async with" an
> exception to the normal semantics?
>
> It seems simpler to have one specially named and specially called
> function be special, rather than make the semantics more complicated
> for all functions.

+1. I think that would make it much more usable by those of us who are
not experts.

--
~Ethan~
Re: [Python-ideas] (PEP 555 subtopic) Propagation of context in async code
On Fri, Oct 13, 2017 at 7:38 PM, Yury Selivanov wrote:
> On Fri, Oct 13, 2017 at 11:49 AM, Koos Zevenhoven wrote:
> [..]
>> This was my starting point 2.5 years ago, when Yury was drafting this
>> status quo (PEP 492). It looked a lot like PEP 492 was inevitable, but
>> that there would be a problem, where each API that uses "blocking IO"
>> somewhere under the hood would need a duplicate version for asyncio
>> (and one for each third-party async framework!). I felt it was
>> necessary to think about a solution before PEP 492 was accepted, and
>> this became a fairly short-lived thread here on python-ideas:
>
> Well, it's obvious why the thread was "short-lived". Don't mix
> non-blocking and blocking code and don't nest asyncio loops. But I
> believe this new subtopic is a distraction.

Nesting is not the only way to have interaction between two event loops.
But whenever anyone *does* want to nest two loops, they are perhaps more
likely to be loops of different frameworks. You believe that the
semantics in async code is a distraction?

> You should start a new
> thread on Python-ideas if you want to discuss the acceptance of PEP
> 492 2.5 years ago.

I'm definitely not interested in discussing the acceptance of PEP 492.

[..]
>> The bigger question is, what should happen when a coroutine awaits on
>> another coroutine directly, without giving the framework a chance to
>> interfere:
>>
>>     async def inner():
>>         do_context_aware_stuff()
>>
>>     async def outer():
>>         with first_context():
>>             coro = inner()
>>
>>         with second_context():
>>             await coro
>>
>> The big question is: In the above, which context should the coroutine
>> be run in?
>
> The real big question is how people usually write code. And the
> answer is that they *don't write it like that* at all. Many context
> managers in many frameworks (aiohttp, tornado, and even asyncio)
> require you to wrap your await expressions in them. Not coroutine
> instantiation.

You know very well that I've been talking about how people usually write
code etc. But we still need to handle the corner cases too.

> A more important point is that existing context solutions for async
> frameworks can only support a with statement around an await
> expression. And people that use such solutions know that 'with ...:
> coro = inner()' isn't going to work at all.
>
> Therefore wrapping coroutine instantiation in a 'with' statement is
> not a pattern. It can only become a pattern, if whatever execution
> context PEP accepted in Python 3.7 encouraged people to use it.

The code is to illustrate semantics, not an example of real code. The
point is to highlight that the context has changed between when the
coroutine function was called and when it is awaited. That's certainly a
thing that can happen in real code, even if it is not the most typical
case. I do mention this in my previous email.

[..]
>> Both of these would have their own stack of (argument, value)
>> assignment pairs, explained in the implementation part of the first
>> PEP 555 draft. While this is a complication, the performance overhead
>> of these is so small, that doubling the overhead should not be a
>> performance concern.
>
> Please stop handwaving performance. Using big O notation:

There is discussion on performance elsewhere, now also in this other
subthread:

https://mail.python.org/pipermail/python-ideas/2017-October/047327.html

> PEP 555, worst complexity for uncached lookup: O(N), where 'N' is the
> total number of all context values for all context keys for the
> current frame stack.

Not true. See the above link. Lookups are fast (*and* O(1), if we want
them to be). PEP 555 stacks are independent of frames, BTW.

> For a recursive function you can easily have a
> situation where cache is invalidated often, and code starts to run
> slower and slower.

Not true either. The lookups are O(1) in a recursive function, with and
without nested contexts.

I started this thread for discussion about semantics in an async
context. Stefan asked about performance in the other thread, so I posted
there.

––Koos

> PEP 550 v1, worst complexity for uncached lookup: O(1), see [1].
>
> PEP 550 v2+, worst complexity for uncached lookup: O(k), where 'k' is
> the number of nested generators for the current frame. Usually k=1.
>
> While caching will mitigate PEP 555' bad performance characteristics
> in *tight loops*, the performance of uncached path must not be
> ignored.
>
> Yury
>
> [1] https://www.python.org/dev/peps/pep-0550/#appendix-hamt-performance-analysis

--
+ Koos Zevenhoven + http://twitter.com/k7hoven +
Re: [Python-ideas] PEP draft: context variables
On Fri, Oct 13, 2017 at 8:45 PM, Ethan Furman wrote:
> On 10/13/2017 09:48 AM, Steve Dower wrote:
>> On 13Oct2017 0941, Yury Selivanov wrote:
>>> Actually, capturing context at the moment of coroutine creation (in
>>> PEP 550 v1 semantics) will not work at all. Async context managers
>>> will break.
>>>
>>>     class AC:
>>>         async def __aenter__(self):
>>>             pass
>>>
>>> ^ If the context is captured when coroutines are instantiated,
>>> __aenter__ won't be able to set context variables and thus affect the
>>> code it wraps. That's why coroutines shouldn't capture context when
>>> created, nor they should isolate context. It's a job of async Task.
>>
>> Then make __aenter__/__aexit__ when called by "async with" an
>> exception to the normal semantics?
>>
>> It seems simpler to have one specially named and specially called
>> function be special, rather than make the semantics more complicated
>> for all functions.
>
> +1. I think that would make it much more usable by those of us who are
> not experts.

The semantics are not really dependent on __aenter__ and __aexit__. They
can be used together with both semantic variants that I'm describing for
PEP 555, and without any special casing. IOW, this is independent of any
remaining concerns in PEP 555.

––Koos

--
+ Koos Zevenhoven + http://twitter.com/k7hoven +
[Python-ideas] Add a module itertools.recipes
Hello,

A very useful part of the itertools module's documentation is the
section "Recipes", giving utility functions that use itertools
iterators. But when you want to use one of these functions, you have to
copy it into your source code (or use external PyPI modules like
iterutils).

Can we consider making itertools a package and adding a module
itertools.recipes that implements all these utility functions?

Regards.

--
Antoine Rozo
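For context, the recipes in question are short, stdlib-only helpers that
currently have to be copied from the documentation; pairwise(), taken
verbatim from the itertools docs, is typical:

```python
from itertools import tee

def pairwise(iterable):
    "s -> (s0, s1), (s1, s2), (s2, s3), ..."
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)

print(list(pairwise("ABCD")))  # -> [('A', 'B'), ('B', 'C'), ('C', 'D')]
```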
Re: [Python-ideas] PEP draft: context variables
On Fri, Oct 13, 2017 at 1:45 PM, Ethan Furman wrote:
> On 10/13/2017 09:48 AM, Steve Dower wrote:
>> On 13Oct2017 0941, Yury Selivanov wrote:
>>> Actually, capturing context at the moment of coroutine creation (in PEP 550 v1 semantics) will not work at all. Async context managers will break.
>>>
>>> class AC:
>>>     async def __aenter__(self):
>>>         pass
>>>
>>> ^ If the context is captured when coroutines are instantiated, __aenter__ won't be able to set context variables and thus affect the code it wraps. That's why coroutines shouldn't capture context when created, nor they should isolate context. It's a job of async Task.
>>
>> Then make __aenter__/__aexit__ when called by "async with" an exception to the normal semantics?
>>
>> It seems simpler to have one specially named and specially called function be special, rather than make the semantics more complicated for all functions.

It's not possible to special case __aenter__ and __aexit__ reliably (supporting wrappers, decorators, and possible side effects).

> +1. I think that would make it much more usable by those of us who are not experts.

I still don't understand what Steve means by "more usable", to be honest. Yury ___ Python-ideas mailing list [email protected] https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
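[Editor's note: Yury's objection can be made concrete with the contextvars API that eventually shipped in Python 3.7 (PEP 567), which grew out of this discussion. Within a single Task, __aenter__ runs in the caller's context, so values it sets are visible to the code the manager wraps -- exactly the behaviour that creation-time capture would break. A minimal sketch:]

```python
# Sketch using contextvars as it later landed in Python 3.7 (PEP 567).
# __aenter__ runs in the same context as the code it wraps, so the
# value it sets is visible inside the `async with` body.
import asyncio
import contextvars

var = contextvars.ContextVar('var', default='outer')

class AC:
    async def __aenter__(self):
        var.set('inner')    # must be visible to the wrapped code
        return self

    async def __aexit__(self, *exc):
        return False

async def main():
    async with AC():
        return var.get()

print(asyncio.run(main()))  # inner
```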
Re: [Python-ideas] Add a module itertools.recipes
On Sat, Oct 14, 2017 at 5:16 AM, Antoine Rozo wrote: > Hello, > > A very useful part of the itertools module's documentation is the section > "Recipes", giving utility functions that use itertools iterators. > But when you want to use one of theese functions, you have to copy it in > your source code (or use external PyPI modules like iterutils). > > Can we consider making itertools a package and adding a module > itertools.recipes that implements all these utilility functions? Check out more-itertools on PyPI - maybe that's what you want? ChrisA ___ Python-ideas mailing list [email protected] https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] (PEP 555 subtopic) Propagation of context in async code
On Fri, Oct 13, 2017 at 1:46 PM, Koos Zevenhoven wrote: > On Fri, Oct 13, 2017 at 7:38 PM, Yury Selivanov > wrote: >> >> On Fri, Oct 13, 2017 at 11:49 AM, Koos Zevenhoven >> wrote: >> [..] >> > This was my starting point 2.5 years ago, when Yury was drafting this >> > status >> > quo (PEP 492). It looked a lot of PEP 492 was inevitable, but that there >> > will be a problem, where each API that uses "blocking IO" somewhere >> > under >> > the hood would need a duplicate version for asyncio (and one for each >> > third-party async framework!). I felt it was necessary to think about a >> > solution before PEP 492 is accepted, and this became a fairly >> > short-lived >> > thread here on python-ideas: >> >> Well, it's obvious why the thread was "short-lived". Don't mix >> non-blocking and blocking code and don't nest asyncio loops. But I >> believe this new subtopic is a distraction. > > > Nesting is not the only way to have interaction between two event loops. > But whenever anyone *does* want to nest two loops, they are perhaps more > likely to be loops of different frameworks. > > You believe that the semantics in async code is a distraction? Discussing blocking calls and/or nested event loops in async code is certainly a distraction :) [..] >> The real big question is how people usually write code. And the >> answer is that they *don't write it like that* at all. Many context >> managers in many frameworks (aiohttp, tornado, and even asyncio) >> require you to wrap your await expressions in them. Not coroutine >> instantiation. > > > You know very well that I've been talking about how people usually write > code etc. But we still need to handle the corner cases too. [..] > The code is to illustrate semantics, not an example of real code. The point > is to highlight that the context has changed between when the coroutine > function was called and when it is awaited. That's certainly a thing that > can happen in real code, even if it is not the most typical case. 
I do > mention this in my previous email. I understand the point and what you're trying to illustrate. I'm saying that people don't write 'with smth: c = coro()' because it's currently pointless. And unless you tell them they should, they won't. > >> >> [..] >> > Both of these would have their own stack of (argument, value) assignment >> > pairs, explained in the implementation part of the first PEP 555 draft. >> > While this is a complication, the performance overhead of these is so >> > small, >> > that doubling the overhead should not be a performance concern. >> >> Please stop handwaving performance. Using big O notation: >> > > There is discussion on perfomance elsewhere, now also in this other > subthread: > > https://mail.python.org/pipermail/python-ideas/2017-October/047327.html > >> PEP 555, worst complexity for uncached lookup: O(N), where 'N' is the >> total number of all context values for all context keys for the >> current frame stack. Quoting you from that link: "Indeed I do mention performance here and there in the PEP 555 draft. Lookups can be made fast and O(1) in most cases. Even with the simplest unoptimized implementation, the worst-case lookup complexity would be O(n), where n is the number of assignment contexts entered after the one which is being looked up from (or in other words, nested inside the one that is being looked up from). This means that for use cases where the relevant context is entered as the innermost context level, the lookups are O(1) even without any optimizations. It is perfectly reasonable to make an implementation where lookups are *always* O(1). Still, it might make more sense to implement a half-way solution with "often O(1)", because that has somewhat less overhead in case the callees end up not doing any lookups." So where's the actual explanation of how you can make *uncached* lookups O(1) in your best implementation? I only see you claiming that you know how to do that. 
And since you're using a stack of values instead of hash tables, your explanation can make a big impact on the CS field :) It's perfectly reasonable to say that "cached lookups in my optimization is O(1)". Saying that "in most cases it's O(1)" isn't how the big O notation should be used. BTW, what's the big O for capturing the entire context in PEP 555 (get_execution_context() in PEP 550)? How will that operation be implemented? A shallow copy of the stack? Also, if I had this: with c.assign(o1): with c.assign(o2): with c.assign(o3): ctx = capture_context() will ctx have references to o1, o2, and o3? > Not true. See the above link. Lookups are fast (*and* O(1), if we want them > to be). > > PEP 555 stacks are independent of frames, BTW. > > >> >> For a recursive function you can easily have a >> situation where cache is invalidated often, and code starts to run >> slower and slower. > > > Not true either. The lookups are O(1) in a recursive function, with and > without nested conte
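[Editor's sketch, not PEP 555's actual implementation: the stack-of-assignments model being debated above, showing why an unoptimized lookup is O(n) in the number of assignment contexts entered after (nested inside) the one being looked up, and O(1) when the relevant assignment is innermost.]

```python
# Hypothetical minimal model of PEP-555-style assignment contexts:
# a stack of (variable, value) pairs, innermost assignment last.
# Lookup scans from the innermost assignment outward.
from contextlib import contextmanager

_stack = []  # list of (var, value) pairs

@contextmanager
def assign(var, value):
    _stack.append((var, value))
    try:
        yield
    finally:
        _stack.pop()

def lookup(var, default=None):
    # O(n) in the number of assignments nested inside the match;
    # O(1) when the match is the innermost assignment.
    for v, value in reversed(_stack):
        if v is var:
            return value
    return default

with assign('x', 1):
    with assign('y', 2):
        print(lookup('x'))  # 1 -- found after skipping one inner assignment
```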
Re: [Python-ideas] PEP draft: context variables
On 13 October 2017 at 19:32, Yury Selivanov wrote: >>> It seems simpler to have one specially named and specially called function >>> be special, rather than make the semantics >>> more complicated for all functions. >> > > It's not possible to special case __aenter__ and __aexit__ reliably > (supporting wrappers, decorators, and possible side effects). > >> +1. I think that would make it much more usable by those of us who are not >> experts. > > I still don't understand what Steve means by "more usable", to be honest. I'd consider myself a "non-expert" in async. Essentially, I ignore it - I don't write the sort of applications that would benefit significantly from it, and I don't see any way to just do "a little bit" of async, so I never use it. But I *do* see value in the context variable proposals here - if only in terms of them being a way to write my code to respond to external settings in an async-friendly way. I don't follow the underlying justification (which is based in "we need this to let things work with async/coroutines) at all, but I'm completely OK with the basic idea (if I want to have a setting that behaves "naturally", like I'd expect decimal contexts to do, it needs a certain amount of language support, so the proposal is to add that). I'd expect to be able to write context variables that my code could respond to using a relatively simple pattern, and have things "just work". Much like I can write a context manager using @contextmanager and yield, and not need to understand all the intricacies of __enter__ and __exit__. (BTW, apologies if I'm mangling the terminology here - write it off as part of me being "not an expert" :-)) What I'm getting from this discussion is that even if I *do* have a simple way of writing context variables, they'll still behave in ways that seem mildly weird to me (as a non-async user). Specifically, my head hurts when I try to understand what that decimal context example "should do". 
My instincts say that the current behaviour is wrong - but I'm not sure I can explain why. So on that example, I'd ask the following of any proposal:

1. Users trying to write a context variable[1] shouldn't have to jump through hoops to get "natural" behaviour. That means that suggestions that the complexity be pushed onto decimal.context aren't OK unless it's also accepted that the current behaviour is wrong, and the only reason decimal.context needs to be replicated is for backward compatibility (and new code can ignore the problem).

2. The proposal should clearly establish what it views as "natural" behaviour, and why. I'm not happy with "it's how decimal.context has always behaved" as an explanation. Sure, people asking to break backward compatibility should have a good justification, but equally, people arguing to *preserve* an unintuitive current behaviour in new code should be prepared to explain why it's not a bug. To put it another way, context variables aren't required to be bug-compatible with thread local storage.

[1] I'm assuming here that "settings that affect how a library behaves" is a common requirement, and the PEP is intended as the "one obvious way" to implement them.

Nick's other async refactoring example is different. If the two forms he showed don't behave identically in all contexts, then I'd consider that to be a major problem. Saying that "coroutines are special" just reads to me as "coroutines/async are sufficiently weird that I can't expect my normal patterns of reasoning to work with them". (Apologies if I'm conflating coroutines and async incorrectly - as a non-expert, they are essentially indistinguishable to me). I sincerely hope that isn't the message I should be getting - async is already more inaccessible than I'd like for the average user. The fact that Nick's async example immediately devolved into a discussion that I can't follow at all is fine - to an extent.
I don't mind the experts debating implementation details that I don't need to know about. But if you make writing context variables harder, just to fix Nick's example, or if you make *using* async code like (either of) Nick's forms harder, then I do object, because that's affecting the end user experience. In that context, I take Steve's comment as meaning "fiddling about with how __aenter__ and __aexit__ work is fine, as that's internals that non-experts like me don't care about - but making context variables behave oddly because of this is *not* fine". Apologies if the above is unhelpful. I've been lurking but not commenting here, precisely because I *am* a non-expert, and I trust the experts to build something that works. But when non-experts were explicitly mentioned, I thought my input might be useful. The following quote from the Zen seems particularly relevant here: If the implementation is hard to explain, it's a bad idea. (although the one about needing to be Dutch to understand why something is obvious might well trump it ;-)) Paul __
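[Editor's note: the decimal/generator behaviour Paul puzzles over can be reproduced with today's thread-local decimal context. A generator's body runs under whatever context is active at each resumption, not the one active when the generator object was created -- which is precisely the "is this natural or a bug?" question under discussion:]

```python
# Today's (thread-local) decimal behaviour: each resumption of the
# generator divides under whatever context is active *at that moment*,
# not the context active when the generator was created.
from decimal import Decimal, localcontext

def sevenths():
    while True:
        yield Decimal(1) / Decimal(7)  # uses the currently active context

g = sevenths()                 # created under the default 28-digit context
with localcontext() as ctx:
    ctx.prec = 5
    inside = next(g)           # resumed inside the block: 5 digits
outside = next(g)              # resumed outside: back to 28 digits

print(inside)   # 0.14286
print(outside)  # 0.1428571428571428571428571429
```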
Re: [Python-ideas] PEP draft: context variables
On Fri, Oct 13, 2017 at 4:29 PM, Paul Moore wrote: [..] > Nick's other async refactoring example is different. If the two forms > he showed don't behave identically in all contexts, then I'd consider > that to be a major problem. Saying that "coroutines are special" just > reads to me as "coroutines/async are sufficiently weird that I can't > expect my normal patterns of reasoning to work with them". (Apologies > if I'm conflating coroutines and async incorrectly - as a non-expert, > they are essentially indistinguishable to me). I sincerely hope that > isn't the message I should be getting - async is already more > inaccessible than I'd like for the average user. Nick's idea that coroutines can isolate context was actually explored before in PEP 550 v3, and then, rather quickly, it became apparent that it wouldn't work. Steve's comments were about a specific example about generators, not coroutines. We can't special case __aenter__, we simply can not. __aenter__ can be a chain of coroutines -- its own separate call stack, we can't say that this whole call stack is behaving differently from all other code with respect to execution context. At this time, we have so many conflicted examples and tangled discussions on these topics, that I myself just lost what everybody is implying by "this semantics isn't obvious to *me*". Which semantics? It's hard to tell. At this point of time, there's just one place which describes one well defined semantics: PEP 550 latest version. Paul, if you have time/interest, please take a look at it, and say what's confusing there. Yury ___ Python-ideas mailing list [email protected] https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
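[Editor's note: for readers following along, the "it's a job of async Task" semantics Yury describes is how this eventually shipped in Python 3.7's contextvars (PEP 567): each Task runs in a copy of the current context, so awaited coroutines share their Task's context while sibling tasks are isolated from each other and from their parent. A sketch:]

```python
# Task-level context isolation as it landed in PEP 567: gather() wraps
# each coroutine in a Task, and each Task gets a copy of main's context.
import asyncio
import contextvars

var = contextvars.ContextVar('var', default='default')

async def child(value):
    var.set(value)             # visible only within this Task's context
    await asyncio.sleep(0)
    return var.get()

async def main():
    var.set('main')
    results = await asyncio.gather(child('a'), child('b'))
    return results, var.get()  # the children's writes don't leak back

results, after = asyncio.run(main())
print(results)  # ['a', 'b']
print(after)    # main
```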
Re: [Python-ideas] PEP draft: context variables
On 13Oct2017 1132, Yury Selivanov wrote: On Fri, Oct 13, 2017 at 1:45 PM, Ethan Furman wrote: On 10/13/2017 09:48 AM, Steve Dower wrote: On 13Oct2017 0941, Yury Selivanov wrote: Actually, capturing context at the moment of coroutine creation (in PEP 550 v1 semantics) will not work at all. Async context managers will break. class AC: async def __aenter__(self): pass ^ If the context is captured when coroutines are instantiated, __aenter__ won't be able to set context variables and thus affect the code it wraps. That's why coroutines shouldn't capture context when created, nor they should isolate context. It's a job of async Task. Then make __aenter__/__aexit__ when called by "async with" an exception to the normal semantics? It seems simpler to have one specially named and specially called function be special, rather than make the semantics more complicated for all functions. It's not possible to special case __aenter__ and __aexit__ reliably (supporting wrappers, decorators, and possible side effects). Why not? Can you not add a decorator that sets a flag on the code object that means "do not create a new context when called", and then it doesn't matter where the call comes from - these functions will always read and write to the caller's context. That seems generally useful anyway, and then you just say that __aenter__ and __aexit__ are special and always have that flag set. +1. I think that would make it much more usable by those of us who are not experts. I still don't understand what Steve means by "more usable", to be honest. I don't know that I said "more usable", but it would certainly be easier to explain. The Zen has something to say about that... Cheers, Steve ___ Python-ideas mailing list [email protected] https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Add a module itertools.recipes
On Fri, Oct 13, 2017 at 11:35 AM, Chris Angelico wrote: > On Sat, Oct 14, 2017 at 5:16 AM, Antoine Rozo > wrote: > [...] > Can we consider making itertools a package and adding a module > > itertools.recipes that implements all these utilility functions? > > Check out more-itertools on PyPI - maybe that's what you want? > toolz is another good collection of itertools-related recipes: http://toolz.readthedocs.io/en/latest/api.html - Lucas ___ Python-ideas mailing list [email protected] https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] PEP draft: context variables
On 10/13/2017 03:30 PM, Yury Selivanov wrote:
> At this time, we have so many conflicted examples and tangled discussions on these topics, that I myself just lost what everybody is implying by "this semantics isn't obvious to *me*". Which semantics? It's hard to tell.

For me, it's not apparent why __aenter__ and __aexit__ cannot be special-cased. I would be grateful for a small code-snippet illustrating the danger. -- ~Ethan~ ___ Python-ideas mailing list [email protected] https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
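[Editor's illustration, not from the thread itself, of the "wrappers and call chains" objection Yury raised earlier: __aenter__ is often just the top frame of a coroutine call chain, so any rule keyed on the method's name would not reach the helper coroutines that do the actual work.]

```python
# The work of __aenter__ happens in a helper coroutine one frame down;
# special-casing the *name* "__aenter__" would not cover that frame.
import asyncio

async def _do_setup():
    # In real code this might set context variables, open connections,
    # etc.  It is an ordinary coroutine with no special name.
    return 'configured'

class AC:
    async def __aenter__(self):
        return await _do_setup()   # delegates to the helper

    async def __aexit__(self, *exc):
        return False

async def main():
    async with AC() as state:
        return state

print(asyncio.run(main()))  # configured
```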
Re: [Python-ideas] PEP draft: context variables
I really like what Paul Moore wrote here, as it matches a *LOT* of what I have been feeling as I have been reading this whole discussion; specifically:

- I find the example, and discussion, really hard to follow.
- I also don't understand async, but I do understand generators very well (like Paul Moore)
- A lot of this doesn't seem natural (generators & context variable syntax)
- And in particular: "If the implementation is hard to explain, it's a bad idea."

I've spent a lot of time thinking about this, and what the issues are. I think they are multi-fold:

- I really use generators a lot -- and find them wonderful; they are one of the joys of python. They are super useful. However, as I am going to, hopefully, demonstrate here, they are not initially intuitive (to a beginner).
- Generators are not really functions; but they appear to be functions, and this was very confusing to me when I started working with generators.
- Now, I'm used to it -- BUT, we really need to consider new people -- and I suggest making this easier.
- I find the proposed context syntax very confusing (and slow). I think contexts are super-important & instead need to be better integrated into the language (like nonlocal is)
- People keep writing they want a real example -- so this is a very real example from real code I am writing (a python parser), showing how I use contexts (obviously they are not part of the language yet, so I have emulated them) & how they interact with generators.
The full example, which took me a few hours to write, is available here (it's a very, very reduced example from a real parser of the python language written in python):

- https://github.com/AmitGreen/Gem/blob/emerald_6/work/demo.py

Here is the result of running the code -- which reads & executes demo1.py (importing & executing demo2.py twice). [Note: by "executing", I mean the code is running its own parser to execute it & its own code to emulate an 'import' -- thus showing nested contexts.]

It creates two input files for testing -- demo1.py:

print 1
print 8 - 2 * 3
import demo2
print 9 - sqrt(16)
print 10 / (8 - 2 * 3)
import demo2
print 2 * 2 * 2 + 3 - 4

And it also creates demo2.py:

print 3 * (2 - 1)
error
print 4

There are two syntax errors (on purpose) in the files, but since demo2.py is imported twice, this will show three syntax errors. Running the code produces the following:

demo1.py#1: expression '1' evaluates to 1
demo1.py#2: expression '8 - 2 * 3' evaluates to 2
demo1.py#3: importing module demo2
demo2.py#1: expression '3 * (3 - 2)' evaluates to 3
demo2.py#2: UNKNOWN STATEMENT: 'error'
demo2.py#3: expression '4' evaluates to 4
demo1.py#4: UNKNOWN ATOM: ' sqrt(16)'
demo1.py#5: expression '10 / (8 - 2 * 3)' evaluates to 5
demo1.py#6: importing module demo2
demo2.py#1: expression '3 * (3 - 2)' evaluates to 3
demo2.py#2: UNKNOWN STATEMENT: 'error'
demo2.py#3: expression '4' evaluates to 4
demo1.py#7: expression '2 * 2 * 2 + 3 - 4' evaluates to 7

This code demonstrates all of the following:

- Nested contexts
- Using contexts 'naturally' -- i.e. directly as variables, without a 'context.' prefix -- which I would find much harder to read & also slower.
- Using a generator that is deliberately broken up into three parts: start, next & stop.
- Handling errors & how they interact with both the generator & 'context'
- Actually parsing the input -- which creates a deeply nested stack (due to recursive calls during expression parsing) -- thus a perfect example for contexts.
So given all of the above, I'd first like to focus on the generator:

- Currently we can write generators as either: (1) functions; or (2) classes with a __next__ method. However, this is very confusing to a beginner.
- Given a generator like the following (actually in the code):

def __iter__(self):
    while not self.finished:
        self.loop += 1
        yield self

- What I found so surprising when I started working with generators is that calling the generator does *NOT* actually start the function.
- Instead, the actual code does not get called until the first __next__ method is called.
- This is quite counter-intuitive.

I therefore suggest the following:

- Give generators their own first-class language syntax.
- This syntax would have good entry points, to allow their interaction with context variables.

Here is the generator in my code sample:

#
# Here is our generator to walk over a file.
#
# This generator has three sections:
#
#   generator_start - Always run when the generator is started.
#                     This opens the file & reads it.
#
#   generator_next  - Run each time the generator needs to retrieve
#                     the next value.
#
#   generator_stop  - Called when the
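[Editor's note: the lazy-start behaviour described above -- that calling a generator function does not run its body -- can be seen in a minimal snippet:]

```python
# Calling a generator function only creates the generator object;
# the body does not run until the first __next__ call.
events = []

def gen():
    events.append("body started")  # side effect marks when the body runs
    yield 1

g = gen()               # no side effect yet -- the body was not entered
assert events == []     # nothing has run
value = next(g)         # the body runs only now
print(events)           # ['body started']
print(value)            # 1
```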
Re: [Python-ideas] Add a module itertools.recipes
I am not searching for an external library (as I pointed, there are some on PyPI like iterutils or more-itertools). My point was that recipes are documented in itertools module, but not implemented in standard library, and it would be useful to have them available. 2017-10-13 20:35 GMT+02:00 Chris Angelico : > On Sat, Oct 14, 2017 at 5:16 AM, Antoine Rozo > wrote: > > Hello, > > > > A very useful part of the itertools module's documentation is the section > > "Recipes", giving utility functions that use itertools iterators. > > But when you want to use one of theese functions, you have to copy it in > > your source code (or use external PyPI modules like iterutils). > > > > Can we consider making itertools a package and adding a module > > itertools.recipes that implements all these utilility functions? > > Check out more-itertools on PyPI - maybe that's what you want? > > ChrisA > ___ > Python-ideas mailing list > [email protected] > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Antoine Rozo ___ Python-ideas mailing list [email protected] https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
