Maxim Veksler <[EMAIL PROTECTED]> wrote:
   ...
> Thank you. I'm attaching the full code so far for reference, sadly it
> still doesn't work. It seems that select.select gets its count of
> fd's not from the amount passed to it by the sub_list but from the
> kernel (or whatever) count for the process; The main issue here is

It's not a problem of the COUNT of FDs, i.e., how many you're passing to
select; the problem is the value of the _highest_ FD number you can pass.
It's an API-level limitation, not an issue with Python per se: the select
API takes a "bit vector" of N bits, representing a set of FDs in that way,
and N (FD_SETSIZE) is fixed at compile time (normally to 1024). The poll
system call does not have this particular limitation, which is why
select.poll may be better for you.

Moreover, your code has other performance problems:

> while 1:
>     for select_cap_sockets in slice_by_fd_limit(all_sockets):
>         ready_to_read, ready_to_write, in_error = select.select(select_cap_sockets, [], [], 0)
>         for nb_active_socket in all_sockets:
>             if nb_active_socket in ready_to_read:

A small issue is with the last two lines -- instead of looping directly on
the small "ready-to-read" list, you're looping on the large all_sockets
one and looking each socket up in the small list. That turns a cheap pass
over the few ready sockets into a list search for every socket you own:
it throws performance out of the window, and adds complexity, for no
benefit whatsoever.

The big issue is that you are "ceaselessly polling". If no socket is ready
to read, you force select to return immediately anyway, and basically call
select again at once. You churn on the CPU without surcease, using 100% of
it, hogging it for this "busy wait", possibly to the point of crowding out
the kernel from some of the CPU time it needs to do useful work in the
TCP/IP stack.

Busy-wait is a bad thing... never call select with a timeout of 0 in a
tight loop. This recommendation also applies to the polling object that
you can build with select.poll, and to any other situation where you're
waiting for another thread or process to deliver some data. Ideally you
should wait in a blocking way; if that's unfeasible, at least make sure
you're letting some time pass between such calls, by using a small but
non-0 timeout (or even by inserting calls to time.sleep if that's what it
takes).

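
To make the two points above concrete, here is a minimal sketch of a read loop built on select.poll: it blocks with a non-0 timeout rather than spinning, and it walks only the events poll reports instead of scanning every socket. The names serve_ready, handle, timeout_ms and max_rounds are mine, purely for illustration; they come neither from your code nor from any library. It also assumes a platform where select.poll exists (not Windows), so take it as a starting point rather than a finished server.

```python
import select
import socket


def serve_ready(sockets, handle, timeout_ms=1000, max_rounds=None):
    """Wait in poll() for readable sockets and call handle(sock) on each.

    handle (a hypothetical callback) returns True to keep watching the
    socket and False to stop watching it. Returns the number of poll rounds.
    """
    poller = select.poll()
    by_fd = {}  # file descriptor -> socket object, for lookup by event
    for sock in sockets:
        by_fd[sock.fileno()] = sock
        poller.register(sock, select.POLLIN)

    rounds = 0
    while by_fd and (max_rounds is None or rounds < max_rounds):
        rounds += 1
        # Non-zero timeout (milliseconds): the process sleeps in the kernel
        # until something is readable, instead of burning CPU in a busy wait.
        for fd, event in poller.poll(timeout_ms):
            sock = by_fd[fd]
            # Loop over the ready events only, never over all sockets.
            keep = handle(sock) if event & select.POLLIN else False
            if not keep:
                # Hang-up, error, or the handler asked to stop watching.
                poller.unregister(fd)
                del by_fd[fd]
    return rounds
```

A handler would typically call sock.recv and return False once it gets an empty result (peer closed). Passing timeout_ms=None to poll would block indefinitely; the finite default here just gives the loop a chance to do periodic housekeeping, which is a design choice of this sketch, not a requirement.
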
The risk of such "antipatterns" is a good reason why it would be better to
use a well-designed, well-coded, well-debugged existing framework, such as
Twisted, rather than roll your own, btw. With Twisted, you can choose
among many appropriate implementations of "reactor" (the key design
pattern for async programming) and activate the one that is most suitable
for your needs (including, e.g., one based on epoll, which gives better
performance than poll on suitable operating systems).

If you're adamant about "rolling your own", though, you can find a Python
epoll module at <http://cheeseshop.python.org/pypi/pyepoll/0.2> (it's said
to be in alpha status, though; I believe there are other such modules
around, but pyepoll seems to be the only one on Cheese Shop).


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list