Hi all, I found a few bugs in Cygwin w.r.t. creating large numbers of sockets. For example, Cygwin will gladly let you create up to RLIMIT_NOFILE sockets (examples in Python, where I found this problem):

>>> import resource
>>> import socket
>>> resource.getrlimit(resource.RLIMIT_NOFILE)
(256, 3200)
>>> resource.setrlimit(resource.RLIMIT_NOFILE, (3200, 3200))
>>> socks = [socket.socket() for _ in range(3000)]  # A bit fewer than the max, but it doesn't matter

However, if I try to do anything interesting with those sockets, such as poll on them, I get a rather unexpected error:

>>> import select
>>> poll = select.poll()
>>> for sock in socks:
...     poll.register(sock, select.POLLOUT)
...
>>> poll.poll()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OSError: [Errno 14] Bad address

After some playing around I found that I could make up to exactly 1365 sockets and use them without error. At 1366 I get the error. A very strange and arbitrary number. It turns out this is limited in Cygwin by the array in fhandler_socket.cc:

    /* Maximum number of concurrently opened sockets from all Cygwin processes
       per session.  Note that shared sockets (through dup/fork/exec) are
       counted as one socket. */
    #define NUM_SOCKS (32768 / sizeof (wsa_event))
    ...
    static wsa_event wsa_events[NUM_SOCKS] __attribute__((section (".cygwin_dll_common"), shared));

This choice for NUM_SOCKS is still seemingly small and pretty arbitrary, but at least it's a choice, and probably well-motivated. However, I think it's a problem that it's defined in terms of sizeof(wsa_event). On 32-bit Cygwin this is 16, so NUM_SOCKS is 2048 (a less strange number), whereas on 64-bit Cygwin sizeof(wsa_event) == 24 (due to sizeof(long) == 8, plus alignment), so we are limited to...1365 sockets. If we have to set a limit I would just hard-code it to 2048 exactly.

I understand that the overhead associated with sockets in Cygwin probably limits us from having 10s of thousands (much less millions) and that's OK--I'm not trying to run some kind of C10K challenge on Cygwin :)

The other problem, then, seems to be a bug in fhandler_socket::init_events().
It doesn't check the return value of search_wsa_event_slot(), which returns NULL if the wsa_events array is full (and the socket is not a shared socket). There's not a great choice here for error code, but setting ENOBUFS seems like the best option. Please see attached patch.

Best,
Erik
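
P.S. For anyone who wants to double-check the arithmetic behind the 1365 figure, here is a minimal Python sketch. It only reproduces the integer division in the NUM_SOCKS macro; the sizes 16 and 24 are the values quoted in this message, not read from the Cygwin headers, and the names SHARED_BYTES, WSA_EVENT_SIZE and num_socks are made up for illustration.

```python
# Byte budget used in the NUM_SOCKS macro: 32768 / sizeof (wsa_event)
SHARED_BYTES = 32768

# sizeof(wsa_event) as reported in this message (assumed, not measured here)
WSA_EVENT_SIZE = {"32-bit": 16, "64-bit": 24}


def num_socks(event_size):
    """Mirror the C integer division in NUM_SOCKS (truncates toward zero)."""
    return SHARED_BYTES // event_size


limits = {arch: num_socks(size) for arch, size in WSA_EVENT_SIZE.items()}

for arch, limit in limits.items():
    print(f"{arch} Cygwin: NUM_SOCKS = {limit}")
```

The 64-bit result is not a round number because 32768 is not a multiple of 24, so the division truncates.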
0001-Fix-two-bugs-in-the-limit-of-large-numbers-of-socket.patch