On Sunday 16 October 2011 18:18:39 Vikash Badal wrote:
> Greetings,
>
> Can someone point me in the correct direction please.
>
> I have a threaded socket application that has a problem with select()
> returning -1.
> The select() and accept() are taken care of in one thread. The worker
> threads deal with client requests after the new client connection is
> pushed to a queue.
>
> The logged error is:
> select() failed (Bad file descriptor) getdtablesize = 65536
>
> Sysctls at the moment are:
> kern.maxfiles: 65536
> kern.maxfilesperproc: 65536
>
>
> <code>
> void client_accept(int listen_socket)
> {
>    ...
>    while ( loop )
>    {
>       FD_ZERO(&socket_set);
>       FD_SET(listen_socket, &socket_set);
>       timeout.tv_sec = 1;
>       timeout.tv_usec = 0;
>
>       rcode = select(listen_socket + 1, &socket_set, NULL, NULL, &timeout);
>
>       if ( rcode < 0 )
>       {
>          Log(DEBUG_0, "ERROR: select() failed (%s) getdtablesize = %d",
>              strerror(errno), getdtablesize());
>          loop = 0;
>          sleep(30);
>          fcloseall();
>          assert(1==0);
>       }
>
>       if ( rcode > 0 )
>       {
>          remotelen = sizeof(remote);
>          client_sock = accept(listen_socket, .....
>
>          if (client_sock != -1 )
>          {
>             // Allocate memory for request
>             request = malloc(sizeof(struct requests));
>             // test for malloc etc ...
>             // set request values ...
>             //
>             // Push request to a queue.
>          }
>       }
>
>    }
>    ...
> }
> void* tcpworker(void* arg)
> {
>    // initialise stuff
>
>    while ( loop )
>    {
>       // pop request from queue
>
>       if ( request != NULL )
>       {
>          // deal with request
>          free(request);
>       }
>    }
> }
>
> </code>
> When the problem occurs, i have between 1000 and 1400 clients
> connected.
>
> Questions:
> 1. do i need to FD_CLR(client_sock, &socket_set) before i push to a
> queue ?
> 2. do i need to FD_CLR(client_sock, &socket_set) when this client
> request closes in the tcpworker() function ?
> 3. would setting kern.maxfilesperproc and kern.maxfiles to higher
> values solve the problem or just take longer for the problem to
> re-appear.
> 4. should i replace select() with kqueue() as from google-ing it
> seems select() is not that great.

The size of an fd_set is limited by FD_SETSIZE, which is 1024 by default. If you pass a descriptor equal to or larger than that to FD_SET() or select(), you have a buffer overflow, and memory beyond the fd_set can become corrupted.

You can define FD_SETSIZE to a larger value before including <sys/select.h>, but you should still verify that a descriptor is less than FD_SETSIZE before using it with select() or any of the fd_set macros, and return an error if it is not. kqueue doesn't have this problem, but it's not as portable as select().
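
As a rough illustration of the range check described above, here is a minimal sketch in C. The helper names checked_fd_set() and wait_readable() are made up for this example and do not come from the original code, and reporting the failure as EINVAL is just one reasonable choice.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper: add fd to set only if it fits in an fd_set.
 * Returns 0 on success, or -1 with errno set to EINVAL when the
 * descriptor is negative or not below FD_SETSIZE. */
int checked_fd_set(int fd, fd_set *set)
{
    assert(set != NULL); /* a NULL set is a caller bug */

    if (fd < 0 || fd >= FD_SETSIZE) {
        errno = EINVAL;
        return -1;
    }
    FD_SET(fd, set);
    return 0;
}

/* Hypothetical helper: wait up to timeout_sec seconds for fd to become
 * readable. Returns whatever select() returns, or -1 with errno set to
 * EINVAL if fd cannot safely be used with select() at all. */
int wait_readable(int fd, long timeout_sec)
{
    fd_set readable;
    struct timeval timeout;

    FD_ZERO(&readable);
    if (checked_fd_set(fd, &readable) != 0)
        return -1; /* refused before any out-of-bounds write happens */

    timeout.tv_sec = timeout_sec;
    timeout.tv_usec = 0;
    return select(fd + 1, &readable, NULL, NULL, &timeout);
}
```

In the accept loop from the question, the bare FD_SET(listen_socket, &socket_set) could be replaced with a call such as checked_fd_set(listen_socket, &socket_set), logging and bailing out when it returns -1 instead of letting the macro write past the end of the set.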