I have a suggestion about the implementation of asyncio queues that
could improve performance. I may be missing something, however, as I am
fairly new to Python. Below is a short description of the problem I am
facing.

I wrote a daemon in Python 3 (running on Linux) which tests many devices
at the same time, to be used in a factory environment. This daemon
includes multiple communication events to a back-end running in another
country. I use one class for each device I test, and asyncio is embedded
in that class. Due to the application itself and the number of devices
tested simultaneously, I soon ran out of file descriptors. I increased
the file descriptor limit for the application, and then I started
running into problems like “ValueError: filedescriptor out of range in
select()”. I guess this problem is related to a package called
serial_asyncio, and of course that could be corrected. However, I became
curious about the number of open file descriptors: why so many?

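
A quick way to investigate where the descriptors go is to count them
before and after creating a batch of asyncio queues inside a running
event loop. This is only a minimal sketch: it assumes Linux with
/proc mounted, and count_open_fds and measure are hypothetical helper
names, not part of asyncio.

```python
import asyncio
import os


def count_open_fds():
    """Count this process's open file descriptors (hypothetical helper)."""
    # Assumption: /proc/self/fd exists (Linux); /dev/fd is used as a fallback.
    fd_dir = "/proc/self/fd" if os.path.isdir("/proc/self/fd") else "/dev/fd"
    return len(os.listdir(fd_dir))


async def measure(n_queues=100):
    # The event loop is already running here, so its own descriptors
    # are included in both counts.
    before = count_open_fds()
    queues = [asyncio.Queue() for _ in range(n_queues)]
    # Push one item through every queue so each one is actually used.
    for i, q in enumerate(queues):
        q.put_nowait(i)
    items = [await q.get() for q in queues]
    after = count_open_fds()
    return before, after, items


before, after, items = asyncio.run(measure())
print("open descriptors before:", before, "after:", after)
```

If the two counts match, the queues themselves hold no descriptors, and
it would be worth counting event loops instead, since each loop keeps a
selector and a wake-up socket pair open.
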
Apparently asyncio Queues use a Linux pipe, and each queue requires 2
file descriptors. Am I correct? So I asked myself: if an asyncio queue is
just a mechanism for passing information between two asyncio tasks,
which should never run at the same time, why do I need the operating
system in the middle of that? Isn’t the whole idea of asyncio that the
operating system is avoided whenever possible?

No one can put anything into a queue while asyncio is blocked in epoll,
because some Python code has to be running to push things into the
queue. If there is nothing in a particular queue, nothing will show up
in it while asyncio is waiting for a file descriptor event.

So, if I am correct, it would be more efficient to put a queue on a
ready-queue list whenever something is pushed into it. Then, just before
asyncio calls epoll (or select), it would check that ready-queue list
and process it before the epoll call. I mean that epoll would not be
called until all the queues have been properly processed. Queues would
be implemented in a much simpler way, using local memory: a simple array
may be enough to do the job. With that, the OS would be avoided, and far
fewer file descriptors would be necessary.
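
The proposal above can be sketched as a toy loop that drains an
in-memory ready list before it ever polls the OS. This is only an
illustration of the idea, not asyncio's actual implementation; ToyLoop,
MemoryQueue, run_until_idle and poll_calls are hypothetical names made
up for this example.

```python
import selectors
from collections import deque


class ToyLoop:
    """Hypothetical miniature event loop, not asyncio's real one."""

    def __init__(self):
        self.selector = selectors.DefaultSelector()
        self.ready = deque()   # callbacks that can run without the OS
        self.poll_calls = 0    # how many times the selector was polled

    def call_soon(self, callback, *args):
        self.ready.append((callback, args))

    def run_until_idle(self):
        while self.ready or len(self.selector.get_map()) > 0:
            # Drain everything that is already runnable first.
            while self.ready:
                callback, args = self.ready.popleft()
                callback(*args)
            # Only with nothing left to run, wait on file descriptors.
            if len(self.selector.get_map()) > 0:
                self.poll_calls += 1
                for key, _events in self.selector.select():
                    self.call_soon(key.data, key.fileobj)


class MemoryQueue:
    """Queue kept entirely in process memory: no pipe, no descriptors."""

    def __init__(self, loop):
        self.loop = loop
        self.items = deque()
        self.waiters = deque()  # consumer callbacks waiting for an item

    def put(self, item):
        if self.waiters:
            # A consumer is waiting: schedule it on the ready list.
            self.loop.call_soon(self.waiters.popleft(), item)
        else:
            self.items.append(item)

    def get(self, callback):
        if self.items:
            self.loop.call_soon(callback, self.items.popleft())
        else:
            self.waiters.append(callback)


loop = ToyLoop()
queue = MemoryQueue(loop)
received = []
queue.get(received.append)          # the consumer starts waiting first
loop.call_soon(queue.put, "hello")  # the producer runs later
loop.run_until_idle()
print(received, "polls:", loop.poll_calls)
```

In this sketch the item reaches the consumer through the ready list
alone, and the selector is only polled when some file descriptor is
actually registered.
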
--
https://mail.python.org/mailman/listinfo/python-list