Spencer Nelson <swnel...@uw.edu> added the comment:
Josh,

> Making literally every await equivalent to:
>
>     await asyncio.sleep(0)
>
> followed by the actual await (which is effectively what you're proposing when
> you expect all await to be preemptible) means adding non-trivial overhead to
> all async operations (asyncio is based on system calls of the
> select/poll/epoll/kpoll variety, which add meaningful overhead when we're
> talking about an operation that is otherwise equivalent to an extremely cheap
> simple collections.deque.append call).

A few things:

First, I don't think I proposed that. I was simply saying that my expectations about the behavior were incorrect, which points towards documentation.

Second, I don't think making a point "preemptible" is the same as actually executing a cooperative-style yield to the scheduler. I just expected that yielding would always be in the cards - that every await would be a potential point where I could get scheduled away.

Third, I don't think that `await asyncio.sleep(0)` triggers a syscall, but I certainly could be mistaken. From my reading of the source, it looks like a zero-second sleep is special-cased in asyncio. Again - I could be wrong.

Fourth, I think the idea of non-cooperative, preemptive scheduling is not nearly as bizarre as you make it sound. There's plenty of prior art on preemptive schedulers out there. Go uses a form of partial preemption at function call sites *because* it's a particularly efficient way to do things. But anyway - I didn't really want to get into this. As I said above, it's obviously a far bigger design discussion than my specific issue.

> It also breaks many reasonable uses of asyncio.wait and asyncio.as_completed,
> where the caller can reasonably expect to be able to await the known-complete
> tasks without being preempted (if you know the coroutine is actually done, it
> could be quite surprising/problematic when you await it and get preempted,
> potentially requiring synchronization that wouldn't be necessary otherwise).

I think this cuts both ways. Without reading the source code of asyncio.Queue, I don't see how it's possible to know whether its put method yields. Because of this, I tend to assume synchronization is necessary everywhere. The way I'm supposed to know for sure that a call can complete without yielding is that it isn't an `async` function, right? That's why asyncio.Queue.put_nowait exists and isn't asynchronous.

> In real life, if whatever you're feeding the queue with is infinite and
> requires no awaiting to produce each value, you should probably just avoid
> the queue and have the consumer consume the iterable directly.

The stuff I'm feeding the queue doesn't require awaiting, but I *wish* it did. It's just a case of not yet having asynchronous libraries on the source side. I was hoping the queue would let me pace the work in a way that would let me do more of it concurrently.

> Or just apply a maximum size to the queue; since the source of data to put is
> infinite and not-awaitable, there's no benefit to an unbounded queue, you may
> as well use a bound roughly fitted to the number of consumers, because any
> further items are just wasting memory well ahead of when it's needed.

The problem isn't really that put doesn't yield for unbounded queues - it's that put doesn't yield *unless the queue is full*. That means that, if I use a very high maximum size, I'll still spend a big chunk of time filling up the queue, and only then will the consumers start doing work.
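
To make the behavior concrete, here's a minimal sketch of what I'm describing (the bound, the item count, and the function names are all made up purely for illustration):

    import asyncio

    async def producer(queue: asyncio.Queue, n: int) -> None:
        for i in range(n):
            # When the queue is not full, put() returns without yielding to
            # the event loop, so this whole loop runs to completion before
            # any consumer task gets a chance to start.
            await queue.put(i)
            print(f"produced {i}")

    async def consumer(name: str, queue: asyncio.Queue) -> None:
        while True:
            item = await queue.get()
            print(f"{name} consumed {item}")
            queue.task_done()

    async def main() -> None:
        # A generous bound behaves exactly like an unbounded queue
        # until it actually fills up.
        queue = asyncio.Queue(maxsize=1000)
        workers = [asyncio.create_task(consumer(f"c{i}", queue)) for i in range(4)]
        await producer(queue, 100)  # prints all 100 "produced" lines first
        await queue.join()          # only now do the "consumed" lines appear
        for w in workers:
            w.cancel()
        await asyncio.gather(*workers, return_exceptions=True)

    asyncio.run(main())

As far as I can tell, the only way to get the interleaving I expected is to add an explicit `await asyncio.sleep(0)` after each put - which is exactly the scheduling point I had assumed `await queue.put(...)` already was.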

I could pick a small queue bound, but then I'm more likely to waste time doing nothing when the consumers are slower than the producer - I'll just sit there with a full-but-tiny queue. Work units in the queue can take wildly different amounts of time, so the consumers will often be briefly slow and the producer will race ahead - until it hits its tiny limit. But then new work units arrive that the consumers get through quickly - and they're soon starved for work, because the producer never got to build up a decent backlog. So the problem remains whenever work takes an uncertain amount of time, which would seem to be the common reason for using a queue in the first place.

> Point is, regular queue puts only block (and potentially release the GIL
> early) when they're full or, as a necessary consequence of threading being
> less predictable than asyncio, when there is contention on the lock
> protecting the queue internals (which is usually resolved quickly); why would
> asyncio queues go out of their way to block when they don't need to?

I think you have it backwards. For unbounded queues, asyncio.Queue.put never yields, so it *always* blocks other coroutines from running. Why should it always block? If I wanted that, I wouldn't use anything in asyncio.Queue at all - I'd just use a collections.deque.

----------
_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue43119>
_______________________________________